Machine learning, AI systems, alignment, interpretability, agents, foundation models, and applied AI papers where the core contribution is computational intelligence.
Efficiency Breakthrough
Demonstrates real-world robotic navigation policy training and deployment in under 120 minutes using only a consumer laptop and no human intervention.
New Capability
Enables high-quality, spatio-temporally consistent 4D reconstruction using sparse, uncalibrated camera inputs instead of expensive synchronized arrays.
New Capability
Builds an autonomous AI research agent that significantly surpasses previous benchmarks by using asynchronous multi-GPU scaling and a hidden, consistent evaluation protocol.
Paradigm Shift
Introduces a multi-agent CAD generation pipeline that uses programmatic geometric validation from the OpenCASCADE kernel to iteratively fix dimensional errors.
Paradigm Shift
Introduces Process-Aware Policy Optimization (PAPO) to solve the chronic issue of reward hacking in process reward models (PRMs).
Scaling Insight
Provides the first sharp theoretical characterization of why spectral optimizers like Muon drastically outperform SGD in storage capacity and scaling for language models.
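Spectral optimizers in the Muon family replace a raw gradient with an approximately orthogonal matrix before applying the update. A minimal sketch of the classic Newton-Schulz orthogonalization step (the coefficients and normalization here are the textbook ones, not necessarily the paper's):

```python
import numpy as np

def newton_schulz_orth(G, steps=30):
    """Approximate the orthogonal polar factor of G (the U V^T from its SVD)
    via the classic Newton-Schulz iteration on a normalized matrix."""
    X = G / (np.linalg.norm(G) + 1e-7)  # Frobenius norm bounds all singular values by 1
    for _ in range(steps):
        X = 1.5 * X - 0.5 * X @ X.T @ X  # drives every singular value toward 1
    return X

rng = np.random.default_rng(0)
G = rng.standard_normal((4, 4))
O = newton_schulz_orth(G)
# O @ O.T should now be close to the identity matrix
```

Each iteration maps a singular value s to 1.5s - 0.5s^3, whose fixed point at 1 is attracting, so the output keeps the gradient's singular directions but equalizes their scales.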
Paradigm Shift
Demonstrates that perplexity/log-likelihood is a deceptive metric for model distillation, often masking massive drops in actual generation quality.
Efficiency Breakthrough
Turns pretrained video diffusion models into high-efficiency codecs, achieving high-quality reconstruction at extremely low bitrates (below 0.002 bpp) without retraining.
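For scale, bits-per-pixel is just total bits divided by total pixels, so a 0.002 bpp budget is strikingly small (illustrative arithmetic, not the paper's figures):

```python
def bits_per_pixel(total_bits, width, height, frames):
    """bpp = total compressed bits / total pixels across all frames."""
    return total_bits / (width * height * frames)

# Budget for a single 1920x1080 frame at 0.002 bpp:
budget_bits = 0.002 * 1920 * 1080   # ~4147 bits
budget_bytes = budget_bits / 8      # ~518 bytes for an entire HD frame
```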
Breaks Assumption
Identifies a structural 'affordance gap' in Vision-Language Models, proving they fail at embodied scene understanding regardless of scale or prompt engineering.
Open Release
Releases Ruka-v2, a fully open-source, 13-DOF tendon-driven humanoid hand with wrist and finger abduction buildable for under $1,300.
Paradigm Shift
Shifts 3D scene generation from diffusion to a fully autoregressive paradigm using next-token prediction of 3D Gaussian primitives.
Breaks Assumption
Proves that weight tying—a standard LLM efficiency trick—biases embeddings toward output prediction and actively harms early-layer input representations.
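Weight tying means the input embedding matrix doubles as the output projection, so one matrix serves two competing roles. A toy numpy illustration of the shared-matrix setup (not the paper's probing methodology):

```python
import numpy as np

rng = np.random.default_rng(0)
vocab, d = 10, 4
E = rng.standard_normal((vocab, d)) * 0.1  # the single shared matrix

def embed(token_id):
    return E[token_id]       # role 1: map a token id to an input vector

def logits(hidden):
    return hidden @ E.T      # role 2: score the hidden state against every token

# Any training pressure on the output side reshapes E, and therefore
# simultaneously reshapes the input representations of every token.
scores = logits(embed(3))
```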
Paradigm Shift
Proposes a universal denoiser that outperforms the Bayes-optimal Tweedie's formula when the noise distribution is unknown.
Scaling Insight
Proves that causal representation learning is possible with far fewer environments and unknown intervention targets than previously assumed.
New Capability
A model-agnostic framework that uses synthetic sampling to provide statistically valid uncertainty quantification and hallucination detection for multimodal models.
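Model-agnostic uncertainty quantification of this flavor often builds on the split-conformal recipe; a minimal sketch of that generic recipe (standard conformal prediction, not the paper's synthetic-sampling method):

```python
import numpy as np

def conformal_threshold(cal_scores, alpha=0.1):
    """Split conformal: the quantile of calibration nonconformity scores that
    gives (1 - alpha) coverage on exchangeable test points."""
    n = len(cal_scores)
    q = np.ceil((n + 1) * (1 - alpha)) / n   # finite-sample corrected level
    return np.quantile(cal_scores, min(q, 1.0), method="higher")

rng = np.random.default_rng(0)
cal = np.abs(rng.standard_normal(500))       # |residual| as a nonconformity score
tau = conformal_threshold(cal, alpha=0.1)
# Any test point whose score is <= tau lies inside the 90% prediction set.
```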
Nature Is Weird
We just built a computer chip that acts like a human brain, but it processes info 10,000 times faster than the one in your head.
Practical Magic
Scientists are fixing city-wide traffic jams by treating every car like a quantum particle that can take every possible route at the exact same time.
Practical Magic
The same software tricks that let massive video games like World of Warcraft handle thousands of players at once are now being used to design spaceships.
Paradigm Shift
Shifts AI evaluation from static benchmarks to interactive agentic environments requiring fluid adaptation.
New Capability
Moves medical AI from simplified 2D image classification to agents navigating full 3D clinical studies.
New Capability
Enables semantically precise model editing directly in the weight space without any training data.
Efficiency Breakthrough
Achieves 6x compute reduction in Multimodal LLMs while actually improving accuracy by 2%.
Efficiency Breakthrough
Collapses entire Spiking Neural Networks into a single neuron via temporal multiplexing.
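Temporal multiplexing in this spirit time-shares one neuron's dynamics across many virtual units. A toy leaky integrate-and-fire sketch, with all dynamics and parameters invented for illustration:

```python
import numpy as np

def multiplexed_lif(inputs, leak=0.9, threshold=1.0):
    """Simulate N virtual LIF neurons with one update rule by interleaving
    their membrane states. inputs: (timesteps, N) array of input currents."""
    T, N = inputs.shape
    v = np.zeros(N)              # one membrane slot per virtual neuron
    spikes = np.zeros((T, N))
    for t in range(T):
        for n in range(N):       # the single "physical" neuron visits each slot in turn
            v[n] = leak * v[n] + inputs[t, n]
            if v[n] >= threshold:
                spikes[t, n] = 1.0
                v[n] = 0.0       # reset after a spike
    return spikes

x = np.zeros((5, 2))
x[:, 0] = 0.6                    # drive only virtual neuron 0
s = multiplexed_lif(x)
```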
Breaks Assumption
Formalizes random cropping as a source of differential privacy, offering 'free' privacy amplification.
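The standard amplification-by-subsampling bound gives a feel for the claim: under Poisson sampling with rate q, an epsilon-DP mechanism becomes log(1 + q(e^eps - 1))-DP. This is the generic subsampling formula, not necessarily the paper's cropping-specific analysis:

```python
import math

def amplified_epsilon(eps, q):
    """Privacy amplification by Poisson subsampling with sampling rate q."""
    return math.log(1 + q * (math.exp(eps) - 1))

# A crop that exposes only 10% of the input behaves like q = 0.1:
tighter = amplified_epsilon(1.0, 0.1)   # strictly less than the original eps = 1.0
```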
New Capability
Estimates lab-grade 3D musculoskeletal forces from a single smartphone video.
Paradigm Shift
Provides the first formal proof and verification framework for agent-tool integration protocols.
Paradigm Shift
Demonstrates that visual hierarchies require Lorentzian causal structure rather than Euclidean space.
Paradigm Shift
Proves that Transformers can internalize complex search algorithms like MCTS directly into their weights.
Efficiency Breakthrough
Introduces a stable backpropagation-free training framework for physical and photonic neural networks.
Efficiency Breakthrough
Achieves state-of-the-art vision-language pretraining using 300x less data than leading methods.
Efficiency Breakthrough
Enables 10x faster robot trajectory generation by distilling diffusion models into movement primitives.
Scaling Insight
Reveals that synthetic rewriting is a quality multiplier for high-grade data, but fails to fix low-quality source data.
Breaks Assumption
Proves that stereo matching can reach state-of-the-art performance without the computationally heavy cost volumes used by almost all modern methods.
Efficiency Breakthrough
Speeds up RL-based reasoning training by 1.7x using an online quality head to prune failing rollouts mid-generation.
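Mid-generation pruning of this kind can be sketched as a loop that periodically scores partial rollouts and drops the low scorers; the quality head here is a stand-in callable, not the paper's learned model:

```python
import random

def generate_with_pruning(rollouts, step_fn, quality_fn,
                          prune_every=8, keep_frac=0.5, max_steps=64):
    """Advance many rollouts; every `prune_every` steps, keep only the top
    `keep_frac` by estimated quality, freeing compute for the survivors."""
    for step in range(1, max_steps + 1):
        rollouts = [step_fn(r) for r in rollouts]
        if step % prune_every == 0 and len(rollouts) > 1:
            rollouts.sort(key=quality_fn, reverse=True)
            rollouts = rollouts[: max(1, int(len(rollouts) * keep_frac))]
    return rollouts

# Toy rollouts: each is a list of partial scores; "quality" is just their sum.
rng = random.Random(0)
start = [[rng.random()] for _ in range(16)]
out = generate_with_pruning(start, lambda r: r + [rng.random()], sum, max_steps=32)
```

With 16 rollouts, pruning at steps 8, 16, 24, and 32 halves the pool each time, so only one rollout pays for all 32 generation steps.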
Paradigm Shift
Introduces a multi-answer RL objective that trains models to represent a distribution of valid answers in a single forward pass.
Breaks Assumption
Proves platform-determinism is necessary for trustworthy AI and implements an integer-only engine for bitwise identical inference across ARM and x86.
New Capability
Quantifies near-verbatim data extraction risk in LLMs at 1/5000th the computational cost of standard Monte Carlo methods.
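The expensive baseline being sped up here is plain Monte Carlo: sample completions and count near-verbatim matches. A sketch with a toy stand-in sampler in place of an actual LLM call:

```python
import random

def extraction_risk_mc(sample_fn, target, n_samples=1000,
                       match=lambda a, b: a == b):
    """Estimate P(model emits `target`) by brute-force sampling.
    sample_fn draws one completion; here it stands in for an LLM call."""
    hits = sum(match(sample_fn(), target) for _ in range(n_samples))
    return hits / n_samples

rng = random.Random(0)
# Toy "model" that emits the secret string 5% of the time:
toy_model = lambda: "secret" if rng.random() < 0.05 else "other"
p = extraction_risk_mc(toy_model, "secret", n_samples=10_000)
```

The cost scales inversely with the event probability, which is why rare extraction events make naive Monte Carlo so expensive.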
New Capability
Enables graph-based retrieval and reranking for RAG without the maintenance overhead of a knowledge graph.
Breaks Assumption
Reduces visual tokens in robot policies by 78% by using inter-layer rank consistency instead of simple attention magnitude.
Breaks Assumption
Demonstrates that the order of training examples alone can encode information not present in any individual example, allowing models to bypass established sample complexity bounds.
Scaling Insight
A systematic study reveals that grokking is not an architectural property of Transformers but an interaction between weight decay and optimization stability.
Paradigm Shift
The 'Reasoning Contamination Effect' shows that Chain-of-Thought (CoT) reasoning actually disrupts a model's internal confidence signal, leading to poorer calibration.
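Calibration in this setting is typically measured with expected calibration error (ECE); a minimal binned-ECE sketch using the standard formula, not necessarily the paper's exact protocol:

```python
import numpy as np

def expected_calibration_error(confidences, correct, n_bins=10):
    """Binned ECE: the weighted average of |accuracy - confidence|
    over equal-width confidence bins."""
    confidences = np.asarray(confidences)
    correct = np.asarray(correct, dtype=float)
    bins = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(bins[:-1], bins[1:]):
        mask = (confidences > lo) & (confidences <= hi)
        if mask.any():
            ece += mask.mean() * abs(correct[mask].mean() - confidences[mask].mean())
    return ece

# Overconfident toy case: the model says 90% but is right half the time.
ece = expected_calibration_error([0.9] * 4, [1, 0, 1, 0])
```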
Breaks Assumption
Large Language Models process instructions as social acts rather than technical specifications, making 'imperative mood' prompts behave inconsistently across different languages.
New Capability
GeoNDC introduces a queryable neural data cube that compresses 20 years of planetary satellite data by 95x while allowing on-demand continuous-time reconstruction.
Efficiency Breakthrough
Sparton is a specialized Triton kernel that solves the massive memory bottleneck of Learned Sparse Retrieval (LSR) models like Splade.
New Capability
Intern-S1-Pro is the first trillion-parameter scientific multimodal foundation model, outperforming proprietary models on specialized scientific reasoning.
New Capability
AirVLA successfully transfers manipulation-trained Vision-Language-Action (VLA) models to underactuated aerial robots using a payload-aware guidance mechanism.
Paradigm Shift
R1Sim applies the 'Reasoning-RL' paradigm (popularized by DeepSeek-R1) to traffic simulation, achieving superior safety and diversity in multi-agent behaviors.
Paradigm Shift
SIGMA resolves 'trajectory divergence' in molecular string generation by enforcing geometric symmetry recognition through contrastive learning.
Efficiency Breakthrough
A fully differentiable agent-based traffic simulator enables calibration and control of million-vehicle networks 173x faster than real-time.