Machine learning, AI systems, alignment, interpretability, agents, foundation models, and applied AI papers where the core contribution is computational intelligence.
Efficiency Breakthrough
Demonstrates real-world robotic navigation policy training and deployment in under 120 minutes using only a consumer laptop and no human intervention.
New Capability
Enables high-quality, spatio-temporally consistent 4D reconstruction using sparse, uncalibrated camera inputs instead of expensive synchronized arrays.
New Capability
Builds an autonomous AI research agent that significantly surpasses previous benchmarks by using asynchronous multi-GPU scaling and a hidden, consistent evaluation protocol.
Paradigm Shift
Introduces a multi-agent CAD generation pipeline that uses programmatic geometric validation from the OpenCASCADE kernel to iteratively fix dimensional errors.
Paradigm Shift
Introduces Process-Aware Policy Optimization (PAPO) to solve the chronic issue of reward hacking in process reward models (PRMs).
Scaling Insight
Provides the first sharp theoretical characterization of why spectral optimizers like Muon drastically outperform SGD in storage capacity and scaling for language models.
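Spectral optimizers in the Muon family replace a raw gradient with an approximately orthogonal matrix before applying the update. A minimal sketch of the classic Newton-Schulz orthogonalization step (the coefficients and normalization here are the textbook ones, not necessarily the paper's):

```python
import numpy as np

def newton_schulz_orth(G, steps=30):
    """Approximate the orthogonal polar factor of G (the U V^T from its SVD)
    via the classic Newton-Schulz iteration on a normalized matrix."""
    X = G / (np.linalg.norm(G) + 1e-7)  # Frobenius norm bounds all singular values by 1
    for _ in range(steps):
        X = 1.5 * X - 0.5 * X @ X.T @ X  # drives every singular value toward 1
    return X

rng = np.random.default_rng(0)
G = rng.standard_normal((4, 4))
O = newton_schulz_orth(G)
# O @ O.T should now be close to the identity matrix
```

Each iteration maps a singular value s to 1.5s - 0.5s^3, whose fixed point at 1 is attracting, so the output keeps the gradient's singular directions but equalizes their scales.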
Paradigm Shift
Demonstrates that perplexity/log-likelihood is a deceptive metric for model distillation, often masking massive drops in actual generation quality.
Efficiency Breakthrough
Turns pretrained video diffusion models into high-efficiency codecs, achieving high-quality reconstruction at extremely low bitrates (below 0.002 bpp) without retraining.
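For scale, bits-per-pixel is just total bits divided by total pixels, so a 0.002 bpp budget is strikingly small (illustrative arithmetic, not the paper's figures):

```python
def bits_per_pixel(total_bits, width, height, frames):
    """bpp = total compressed bits / total pixels across all frames."""
    return total_bits / (width * height * frames)

# Budget for a single 1920x1080 frame at 0.002 bpp:
budget_bits = 0.002 * 1920 * 1080   # ~4147 bits
budget_bytes = budget_bits / 8      # ~518 bytes for an entire HD frame
```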
Breaks Assumption
Identifies a structural 'affordance gap' in Vision-Language Models, proving they fail at embodied scene understanding regardless of scale or prompt engineering.
Open Release
Releases Ruka-v2, a fully open-source, 13-DOF tendon-driven humanoid hand with wrist and finger abduction buildable for under $1,300.
Paradigm Shift
Shifts 3D scene generation from diffusion to a fully autoregressive paradigm using next-token prediction of 3D Gaussian primitives.
Breaks Assumption
Proves that weight tying—a standard LLM efficiency trick—biases embeddings toward output prediction and actively harms early-layer input representations.
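Weight tying means the input embedding matrix doubles as the output projection, so one matrix serves two competing roles. A toy numpy illustration of the shared-matrix setup (not the paper's probing methodology):

```python
import numpy as np

rng = np.random.default_rng(0)
vocab, d = 10, 4
E = rng.standard_normal((vocab, d)) * 0.1  # the single shared matrix

def embed(token_id):
    return E[token_id]       # role 1: map a token id to an input vector

def logits(hidden):
    return hidden @ E.T      # role 2: score the hidden state against every token

# Any training pressure on the output side reshapes E, and therefore
# simultaneously reshapes the input representations of every token.
scores = logits(embed(3))
```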
Paradigm Shift
Proposes a universal denoiser that outperforms the Bayes-optimal Tweedie's formula when the noise distribution is unknown.
Scaling Insight
Proves that causal representation learning is possible with far fewer environments and unknown intervention targets than previously assumed.
New Capability
A model-agnostic framework that uses synthetic sampling to provide statistically valid uncertainty quantification and hallucination detection for multimodal models.
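Model-agnostic uncertainty quantification of this flavor often builds on the split-conformal recipe; a minimal sketch of that generic recipe (standard conformal prediction, not the paper's synthetic-sampling method):

```python
import numpy as np

def conformal_threshold(cal_scores, alpha=0.1):
    """Split conformal: the quantile of calibration nonconformity scores that
    gives (1 - alpha) coverage on exchangeable test points."""
    n = len(cal_scores)
    q = np.ceil((n + 1) * (1 - alpha)) / n   # finite-sample corrected level
    return np.quantile(cal_scores, min(q, 1.0), method="higher")

rng = np.random.default_rng(0)
cal = np.abs(rng.standard_normal(500))       # |residual| as a nonconformity score
tau = conformal_threshold(cal, alpha=0.1)
# Any test point whose score is <= tau lies inside the 90% prediction set.
```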
Nature Is Weird
We just built a computer chip that acts like a human brain, but it processes info 10,000 times faster than the one in your head.
Practical Magic
Scientists are fixing city-wide traffic jams by treating every car like a quantum particle that can take every possible route at the exact same time.
Practical Magic
The same software tricks that let massive video games like World of Warcraft handle thousands of players at once are now being used to design spaceships.
Paradigm Shift
Shifts AI evaluation from static benchmarks to interactive agentic environments requiring fluid adaptation.
New Capability
Moves medical AI from simplified 2D image classification to agents navigating full 3D clinical studies.
New Capability
Enables semantically precise model editing directly in the weight space without any training data.
Efficiency Breakthrough
Achieves 6x compute reduction in Multimodal LLMs while actually improving accuracy by 2%.
Efficiency Breakthrough
Collapses entire Spiking Neural Networks into a single neuron via temporal multiplexing.
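Temporal multiplexing in this spirit time-shares one neuron's dynamics across many virtual units. A toy leaky integrate-and-fire sketch, with all dynamics and parameters invented for illustration:

```python
import numpy as np

def multiplexed_lif(inputs, leak=0.9, threshold=1.0):
    """Simulate N virtual LIF neurons with one update rule by interleaving
    their membrane states. inputs: (timesteps, N) array of input currents."""
    T, N = inputs.shape
    v = np.zeros(N)              # one membrane slot per virtual neuron
    spikes = np.zeros((T, N))
    for t in range(T):
        for n in range(N):       # the single "physical" neuron visits each slot in turn
            v[n] = leak * v[n] + inputs[t, n]
            if v[n] >= threshold:
                spikes[t, n] = 1.0
                v[n] = 0.0       # reset after a spike
    return spikes

x = np.zeros((5, 2))
x[:, 0] = 0.6                    # drive only virtual neuron 0
s = multiplexed_lif(x)
```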
Breaks Assumption
Formalizes random cropping as a source of differential privacy, offering 'free' privacy amplification.
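The standard amplification-by-subsampling bound gives a feel for the claim: under Poisson sampling with rate q, an epsilon-DP mechanism becomes log(1 + q(e^eps - 1))-DP. This is the generic subsampling formula, not necessarily the paper's cropping-specific analysis:

```python
import math

def amplified_epsilon(eps, q):
    """Privacy amplification by Poisson subsampling with sampling rate q."""
    return math.log(1 + q * (math.exp(eps) - 1))

# A crop that exposes only 10% of the input behaves like q = 0.1:
tighter = amplified_epsilon(1.0, 0.1)   # strictly less than the original eps = 1.0
```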
New Capability
Estimates lab-grade 3D musculoskeletal forces from a single smartphone video.
Paradigm Shift
Provides the first formal proof and verification framework for agent-tool integration protocols.
Paradigm Shift
Demonstrates that visual hierarchies require Lorentzian causal structure rather than Euclidean space.
Paradigm Shift
Proves that Transformers can internalize complex search algorithms like MCTS directly into their weights.
Efficiency Breakthrough
Introduces a stable backpropagation-free training framework for physical and photonic neural networks.
Efficiency Breakthrough
Achieves state-of-the-art vision-language pretraining using 300x less data than leading methods.
Efficiency Breakthrough
Enables 10x faster robot trajectory generation by distilling diffusion models into movement primitives.
Scaling Insight
Reveals that synthetic rewriting is a quality multiplier for high-grade data, but fails to fix low-quality source data.
Breaks Assumption
Proves that stereo matching can reach state-of-the-art performance without the computationally heavy cost volumes used by almost all modern methods.
Efficiency Breakthrough
Speeds up RL-based reasoning training by 1.7x using an online quality head to prune failing rollouts mid-generation.
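Mid-generation pruning of this kind can be sketched as a loop that periodically scores partial rollouts and drops the low scorers; the quality head here is a stand-in callable, not the paper's learned model:

```python
import random

def generate_with_pruning(rollouts, step_fn, quality_fn,
                          prune_every=8, keep_frac=0.5, max_steps=64):
    """Advance many rollouts; every `prune_every` steps, keep only the top
    `keep_frac` by estimated quality, freeing compute for the survivors."""
    for step in range(1, max_steps + 1):
        rollouts = [step_fn(r) for r in rollouts]
        if step % prune_every == 0 and len(rollouts) > 1:
            rollouts.sort(key=quality_fn, reverse=True)
            rollouts = rollouts[: max(1, int(len(rollouts) * keep_frac))]
    return rollouts

# Toy rollouts: each is a list of partial scores; "quality" is just their sum.
rng = random.Random(0)
start = [[rng.random()] for _ in range(16)]
out = generate_with_pruning(start, lambda r: r + [rng.random()], sum, max_steps=32)
```

With 16 rollouts, pruning at steps 8, 16, 24, and 32 halves the pool each time, so only one rollout pays for all 32 generation steps.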
Paradigm Shift
Introduces a multi-answer RL objective that trains models to represent a distribution of valid answers in a single forward pass.
Breaks Assumption
Proves platform-determinism is necessary for trustworthy AI and implements an integer-only engine for bitwise identical inference across ARM and x86.
New Capability
Quantifies near-verbatim data extraction risk in LLMs at 1/5000th the computational cost of standard Monte Carlo methods.
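The expensive baseline being sped up here is plain Monte Carlo: sample completions and count near-verbatim matches. A sketch with a toy stand-in sampler in place of an actual LLM call:

```python
import random

def extraction_risk_mc(sample_fn, target, n_samples=1000,
                       match=lambda a, b: a == b):
    """Estimate P(model emits `target`) by brute-force sampling.
    sample_fn draws one completion; here it stands in for an LLM call."""
    hits = sum(match(sample_fn(), target) for _ in range(n_samples))
    return hits / n_samples

rng = random.Random(0)
# Toy "model" that emits the secret string 5% of the time:
toy_model = lambda: "secret" if rng.random() < 0.05 else "other"
p = extraction_risk_mc(toy_model, "secret", n_samples=10_000)
```

The cost scales inversely with the event probability, which is why rare extraction events make naive Monte Carlo so expensive.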
New Capability
Enables graph-based retrieval and reranking for RAG without the maintenance overhead of a knowledge graph.
Breaks Assumption
Reduces visual tokens in robot policies by 78% by using inter-layer rank consistency instead of simple attention magnitude.
Breaks Assumption
Demonstrates that the order of training examples alone can encode information not present in any individual example, allowing models to bypass established sample complexity bounds.
Scaling Insight
A systematic study reveals that grokking is not an architectural property of Transformers but an interaction between weight decay and optimization stability.
Paradigm Shift
The 'Reasoning Contamination Effect' shows that Chain-of-Thought (CoT) reasoning actually disrupts a model's internal confidence signal, leading to poorer calibration.
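Calibration in this setting is typically measured with expected calibration error (ECE); a minimal binned-ECE sketch using the standard formula, not necessarily the paper's exact protocol:

```python
import numpy as np

def expected_calibration_error(confidences, correct, n_bins=10):
    """Binned ECE: the weighted average of |accuracy - confidence|
    over equal-width confidence bins."""
    confidences = np.asarray(confidences)
    correct = np.asarray(correct, dtype=float)
    bins = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(bins[:-1], bins[1:]):
        mask = (confidences > lo) & (confidences <= hi)
        if mask.any():
            ece += mask.mean() * abs(correct[mask].mean() - confidences[mask].mean())
    return ece

# Overconfident toy case: the model says 90% but is right half the time.
ece = expected_calibration_error([0.9] * 4, [1, 0, 1, 0])
```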
Breaks Assumption
Large Language Models process instructions as social acts rather than technical specifications, making 'imperative mood' prompts behave inconsistently across different languages.
New Capability
GeoNDC introduces a queryable neural data cube that compresses 20 years of planetary satellite data by 95x while allowing on-demand continuous-time reconstruction.
Efficiency Breakthrough
Sparton is a specialized Triton kernel that solves the massive memory bottleneck of Learned Sparse Retrieval (LSR) models like Splade.
New Capability
Intern-S1-Pro is the first trillion-parameter scientific multimodal foundation model, outperforming proprietary models on specialized scientific reasoning.
New Capability
AirVLA successfully transfers manipulation-trained Vision-Language-Action (VLA) models to underactuated aerial robots using a payload-aware guidance mechanism.
Paradigm Shift
R1Sim applies the 'Reasoning-RL' paradigm (popularized by DeepSeek-R1) to traffic simulation, achieving superior safety and diversity in multi-agent behaviors.
Paradigm Shift
SIGMA resolves 'trajectory divergence' in molecular string generation by enforcing geometric symmetry recognition through contrastive learning.
Efficiency Breakthrough
A fully differentiable agent-based traffic simulator enables calibration and control of million-vehicle networks 173x faster than real-time.