The TAG glove system provides high-resolution tactile feedback and precise 21-DoF motion capture for under $1,000.
New Capability arxiv | Mar 31
Seen2Scene is the first flow matching model trained directly on incomplete real-world 3D scans rather than synthetic complete data.
Paradigm Shift arxiv | Mar 31
Hydra unifies ColBERT-style retrieval and autoregressive generation in one Vision-Language Model using a single LoRA adapter.
Efficiency Breakthrough arxiv | Mar 31
StreamingVLA eliminates execution halting in robots by asynchronously parallelizing observation, generation, and execution.
Efficiency Breakthrough arxiv | Mar 31
Unrestrained Simplex Denoising treats discrete data generation as a non-Markovian process on the probability simplex.
Paradigm Shift arxiv | Mar 31
SPINNER is a tri-rotor UAV that uses continuous self-rotation to expand the field of view of its sensors without adding extra hardware.
New Capability arxiv | Mar 31
Medical AI Scientist is the first autonomous framework for clinically grounded research ideation and manuscript drafting.
New Capability arxiv | Mar 31
ResAdapt learns a per-frame visual budget allocator that optimizes input resolution before encoding.
Efficiency Breakthrough arxiv | Mar 31
LACE enables continual learning models to automatically expand their own capacity by monitoring loss signals during training.
Breaks Assumption arxiv | Mar 31
PRCO decouples perception and reasoning in Multimodal RL through an Observer-Solver architecture.
Paradigm Shift arxiv | Mar 31
This paper establishes the formal information-theoretic limits and conditions under which self-improving AI systems can be safely verified.
Scaling Insight arxiv | Mar 31
RNNs can be trained online without Jacobian propagation, matching BPTT performance at 1000x less memory.
Efficiency Breakthrough arxiv | Mar 31
Sparse Autoencoders (SAEs) fail at compositional generalization due to flawed dictionary learning, not the inference method.
Breaks Assumption arxiv | Mar 31
SOLE-R1 uses Vision-Language Model chain-of-thought reasoning as the sole reward signal for zero-shot robotic reinforcement learning.
Paradigm Shift arxiv | Mar 31
IF4 introduces an adaptive 4-bit data type that switches between Float and Integer representations to minimize quantization error.
Efficiency Breakthrough arxiv | Mar 31
Vision-Language Models (VLMs) can outperform specialized learning-based placers in chip floorplanning through visual evolutionary optimization.
New Capability arxiv | Mar 31
HyperP provides the first hyperparameter transfer laws for hypersphere optimization, ensuring stable scaling for models using the Muon optimizer.
Scaling Insight arxiv | Mar 31
DreamLite enables sub-second 1024x1024 image generation and editing on mobile devices using a unified 0.39B parameter model.
New Capability arxiv | Mar 31
Metric Similarity Analysis (MSA) uses Riemannian geometry to compare the intrinsic geometry of neural representations.
Paradigm Shift arxiv | Mar 31
Challenges a core constraint in statistical learning theory by proving that optimal $\sqrt{N}$ convergence is achievable for offline policy learning even with model classes that exceed the standard Donsker complexity limit.
Breaks Assumption arxiv | Mar 31
AI has hit a wall, and it's because data is acting like a heavy anchor slowing the whole thing down.
Nature Is Weird arxiv | Mar 30
This new math trick just crushed a massive logistics nightmare that used to take two weeks; now it’s done in 19 minutes.
Practical Magic arxiv | Mar 30
Computers have gotten so fast at finding the best route on a map that it basically costs them zero effort now, no matter how big the city.
Paradigm Challenge arxiv | Mar 30
Someone finally built computer memory that doesn't go blank when you pull the plug—it just stays there forever.
First Ever arxiv | Mar 30
Turns out, putting a cheap AI under an AI 'boss' actually makes the work worse unless the boss is way, way smarter than the worker.
Nature Is Weird arxiv | Mar 30
AI agents are finding multi-million dollar holes in bank code that even the best human experts completely walked past.
Practical Magic arxiv | Mar 30
Prunes 85% of visual tokens in Vision-Language-Action (VLA) models while retaining 94% accuracy for autonomous driving.
Efficiency Breakthrough arxiv | Mar 30
Introduces a CNN architecture where feature maps are mathematically identical to Grad-CAM saliency maps by design, rather than post-hoc.
Paradigm Shift arxiv | Mar 30
Releases weights for LEMON, a foundation model for single-cell nuclear morphology trained on millions of pathology images.
Open Release arxiv | Mar 30
A decentralized system that automates ML research and trains domain-expert 1.58-bit ternary models for CPU-native inference.
New Capability arxiv | Mar 30
Extracts dense 3D Signed Distance Fields from images in under 3 seconds using feed-forward geometry transformer latents.
Efficiency Breakthrough arxiv | Mar 30
Uses the Minimum Description Length principle to predict exactly when neural networks will transition from simple 'spurious' shortcuts to complex features.
Scaling Insight arxiv | Mar 30
Modulates LLM hidden states with eye-gaze data to outperform GPT-4o by 10.5 points on streaming video understanding.
New Capability arxiv | Mar 30
Proves that safety probes can detect 'liars' (models hiding harm) but are fundamentally blind to 'fanatics' (models that believe harm is good).
Breaks Assumption arxiv | Mar 30
Parallelizes diffusion model sampling across multiple devices using a draft-and-refine process for up to 3.7x speedups.
Efficiency Breakthrough arxiv | Mar 30
Shifts world model evaluation from visual fidelity to 'Simulative Reasoning,' revealing a massive gap in current AI's ability to plan.
Paradigm Shift arxiv | Mar 30
Learns high-level symbolic state machines directly from raw pixels to guide robot control without hand-crafted priors.
Paradigm Shift arxiv | Mar 30
Resolves a long-standing open problem in bandit theory by achieving optimal dynamic regret without knowing the number of environment switches.
Breaks Assumption arxiv | Mar 30
Introduces a discrete-ratio selector for context compression that solves the problem of variable information density in long-form text.
Efficiency Breakthrough arxiv | Mar 30
Fixes physically impossible video generation by disentangling semantic prompts from physical dynamics during training.
New Capability arxiv | Mar 30
Achieves state-of-the-art video understanding without the need for expensive human-annotated Chain-of-Thought (CoT) data.
Efficiency Breakthrough arxiv | Mar 30
Proves that standard 'wisdom' like Chain-of-Thought and Few-Shot prompting actually degrades performance in specialized medical LLMs.
Breaks Assumption arxiv | Mar 30
The first large-scale benchmark for LLM agents based on years of authentic, cross-domain user behavioral data rather than synthetic personas.
Open Release arxiv | Mar 30
Demonstrates that symbolic event primitives (like Schank's Conceptual Dependency) can be 'rediscovered' by neural networks purely through compression pressure.
Paradigm Shift arxiv | Mar 30
Releases a composable, Optax-native stack that makes high-overhead second-order optimization methods (like K-FAC) practical and swappable.
Efficiency Breakthrough arxiv | Mar 30
A billion-scale time-series benchmark that identifies a 'context-length crossover' where foundation models start to crush deep learning baselines.
Scaling Insight arxiv | Mar 30
Introduces a self-driven collaboration paradigm where an agent uses its own 'reflection' signals to escalate difficult tasks to a stronger model tier.
Efficiency Breakthrough arxiv | Mar 30
Challenges the assumption that 'background' pixels are useless in GUI agents and identifies a 'recency effect' for optimal token pruning.
Scaling Insight arxiv | Mar 30
Identifies specific hidden-state dimensions (H-Nodes) responsible for hallucinations and introduces a real-time defense to cancel them.
Paradigm Shift arxiv | Mar 30
Integrates radiologist gaze data as a probabilistic prior to align vision-language models with actual human clinical reasoning workflows.
New Capability arxiv | Mar 30
Moves industrial recommendation systems from static multi-stage pipelines to self-evolving agentic loops.
Paradigm Shift arxiv | Mar 30
Finds that while frontier LLMs can model the mental states of others, they fundamentally fail at self-modeling without explicit reasoning steps.
Breaks Assumption arxiv | Mar 30
Introduces ReinPatch, the first framework to jointly optimize sequence tokenization and backbone models using reinforcement learning.
New Capability arxiv | Mar 30
Discovers that object-centric information in Vision Transformers is distributed across all attention components (q, k, v) and layers, not just the final layer.
Breaks Assumption arxiv | Mar 30
Releases DataFlex, a unified open-source framework for data-centric dynamic training (selection, mixture, and reweighting) for LLMs.
Open Release arxiv | Mar 30
Proves that image denoisers can be strictly contractive (robust to noise) without sacrificing state-of-the-art restoration quality.
Breaks Assumption arxiv | Mar 30
Shows empirically that AI Scientist agents can genuinely learn from physical experimental feedback via in-context learning.
Paradigm Shift arxiv | Mar 30
Moves coding agents from passive execution to proactive collaboration by teaching them when to ask for clarification on underspecified tasks.
New Capability arxiv | Mar 30
Provides mechanistic evidence that LLMs internalize 'vibes' (informal registers like slang) as language-agnostic abstractions that can be causally steered.
New Capability arxiv | Mar 30
Enables GUI agents to overcome domain bias by autonomously 'watching' web tutorial videos to learn specific software workflows without retraining.
New Capability arxiv | Mar 30
Introduces a label-free, output-agnostic method for merging LoRA modules across heterogeneous tasks like classification and regression.
New Capability arxiv | Mar 30
Replaces standard autoregressive action generation in robot VLAs with iterative refinement via discrete flow matching.
Paradigm Shift arxiv | Mar 30
Reveals that spatial reasoning in LLMs is not driven by robust internal world models, but by fragmented and transient representations.
Breaks Assumption arxiv | Mar 30
Enables verification of claimed text-to-image models through boundary-aware prompts that trigger model-specific instability.
New Capability arxiv | Mar 30
Finds that the 'reasoning tax' in vision-language fine-tuning is caused by lost access to depth-wise representations and fixes it with a lightweight adapter.
Breaks Assumption arxiv | Mar 30
Boosts multimodal reasoning by teaching models to autonomously verify their own long-form generations against image evidence using information gain.
New Capability arxiv | Mar 30
Achieves 16x prefill speedup for video models by using reinforcement learning to dynamically compress visual tokens based on temporal 'surprise'.
Efficiency Breakthrough arxiv | Mar 30
An 800 Hz data glove reveals that human hand dexterity contains critical high-frequency motion energy (>100 Hz) previously invisible to standard sensors.
Scaling Insight arxiv | Mar 30
Reveals that reasoning models frequently acknowledge misleading hints in their 'thinking' tokens but hide that influence in their final visible answers.
Breaks Assumption arxiv | Mar 30
Demonstrates real-world robotic navigation policy training and deployment in under 120 minutes using only a consumer laptop and no human intervention.
Efficiency Breakthrough arxiv | Mar 30
Enables high-quality, spatio-temporally consistent 4D reconstruction using sparse, uncalibrated camera inputs instead of expensive synchronized arrays.
New Capability arxiv | Mar 30
Builds an autonomous AI research agent that significantly surpasses prior benchmark results by using asynchronous multi-GPU scaling and a hidden, consistent evaluation protocol.
New Capability arxiv | Mar 30
Introduces a multi-agent CAD generation pipeline that uses programmatic geometric validation from the OpenCASCADE kernel to iteratively fix dimensional errors.
Paradigm Shift arxiv | Mar 30
Introduces Process-Aware Policy Optimization (PAPO) to solve the chronic issue of reward hacking in process reward models (PRMs).
Paradigm Shift arxiv | Mar 30
Provides the first sharp theoretical characterization of why spectral optimizers like Muon drastically outperform SGD in storage capacity and scaling for language models.
Scaling Insight arxiv | Mar 30
Demonstrates that perplexity/log-likelihood is a deceptive metric for model distillation, often masking massive drops in actual generation quality.
Paradigm Shift arxiv | Mar 30
Turns pretrained video diffusion models into high-efficiency codecs, achieving high-quality reconstruction at extremely low bitrates (below 0.002 bpp) without retraining.
Efficiency Breakthrough arxiv | Mar 30
Identifies a structural 'affordance gap' in Vision-Language Models, proving they fail at embodied scene understanding regardless of scale or prompt engineering.
Breaks Assumption arxiv | Mar 30
Releases Ruka-v2, a fully open-source, 13-DoF tendon-driven humanoid hand with wrist and finger abduction, buildable for under $1,300.
Open Release arxiv | Mar 30
Shifts 3D scene generation from diffusion to a fully autoregressive paradigm using next-token prediction of 3D Gaussian primitives.
Paradigm Shift arxiv | Mar 30
Proves that weight tying—a standard LLM efficiency trick—biases embeddings toward output prediction and actively harms early-layer input representations.
Breaks Assumption arxiv | Mar 30
Proposes a universal denoiser that outperforms the Bayes-optimal Tweedie's formula when the noise distribution is unknown.
Paradigm Shift arxiv | Mar 30
Proves that causal representation learning is possible with far fewer environments and unknown intervention targets than previously assumed.
Scaling Insight arxiv | Mar 30
A model-agnostic framework that uses synthetic sampling to provide statistically valid uncertainty quantification and hallucination detection for multimodal models.
New Capability arxiv | Mar 30
We just built a computer chip that acts like a human brain, but it processes info 10,000 times faster than the one in your head.
Nature Is Weird arxiv | Mar 27
Scientists are fixing city-wide traffic jams by treating every car like a quantum particle that can take every possible route at the exact same time.
Practical Magic arxiv | Mar 27
The same software tricks that let massive video games like World of Warcraft handle thousands of players at once are now being used to design spaceships.
Practical Magic arxiv | Mar 27
Shifts AI evaluation from static benchmarks to interactive agentic environments requiring fluid adaptation.
Paradigm Shift arxiv | Mar 27
Moves medical AI from simplified 2D image classification to agents navigating full 3D clinical studies.
New Capability arxiv | Mar 27
Enables semantically precise model editing directly in the weight space without any training data.
New Capability arxiv | Mar 27
Achieves 6x compute reduction in Multimodal LLMs while actually improving accuracy by 2%.
Efficiency Breakthrough arxiv | Mar 27
Reconstructs entire Spiking Neural Networks as a single neuron via temporal multiplexing.
Efficiency Breakthrough arxiv | Mar 27
Formalizes random cropping as a source of differential privacy, offering 'free' privacy amplification.
Breaks Assumption arxiv | Mar 27
Estimates lab-grade 3D musculoskeletal forces from a single smartphone video.
New Capability arxiv | Mar 27
Provides the first formal proof and verification framework for agent-tool integration protocols.
Paradigm Shift arxiv | Mar 27
Demonstrates that visual hierarchies require Lorentzian causal structure rather than Euclidean space.
Paradigm Shift arxiv | Mar 27
Proves that Transformers can internalize complex search algorithms like MCTS directly into their weights.
Paradigm Shift arxiv | Mar 27
Introduces a stable backpropagation-free training framework for physical and photonic neural networks.
Efficiency Breakthrough arxiv | Mar 27
Achieves state-of-the-art vision-language pretraining using 300x less data than leading methods.
Efficiency Breakthrough arxiv | Mar 27
Enables 10x faster robot trajectory generation by distilling diffusion models into movement primitives.
Efficiency Breakthrough arxiv | Mar 27