SeriesFusion
Science, curated & edited by AI

AI & Machine Learning

2,371 papers  ·  Page 19 of 48

Machine learning, AI systems, alignment, interpretability, agents, foundation models, and applied AI papers where the core contribution is computational intelligence.

Paradigm Shift
Achieves 'zero forgetting' in continual learning by stacking frozen domain-specific MoE-LoRA adapters with a meta-router.
Apr 2
New Capability
First humanoid robot system to achieve consecutive ping-pong strikes using only onboard egocentric vision and whole-body coordination.
Apr 2
Breaks Assumption
Reveals a 'Reasoning Shift' where increased context length silently causes models to skip self-verification and shorten their reasoning traces by up to 50%.
Apr 2
Efficiency Breakthrough
Introduces S0 tuning for hybrid RNN-attention models, outperforming LoRA by 10.8% with zero inference overhead.
Apr 2
Efficiency Breakthrough
Reduces the compute cost of LLM test-time scaling by up to 67% using conformal prediction to calibrate reasoning paths.
Apr 2
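The entry above names conformal prediction as the calibration tool. As background only (the paper's actual procedure is not reproduced here; the helper name `conformal_threshold` and the toy scores are illustrative), a minimal split-conformal sketch shows how a held-out calibration set yields a coverage-guaranteed threshold that could gate whether further reasoning paths need to be sampled:

```python
import numpy as np

def conformal_threshold(cal_scores, alpha=0.1):
    """Split conformal prediction: given nonconformity scores from a
    held-out calibration set, return the threshold that covers fresh
    examples with probability >= 1 - alpha (finite-sample corrected)."""
    n = len(cal_scores)
    # finite-sample quantile level: ceil((n + 1) * (1 - alpha)) / n
    q = np.ceil((n + 1) * (1 - alpha)) / n
    return np.quantile(cal_scores, min(q, 1.0), method="higher")

# toy calibration scores, e.g. 1 - model confidence on held-out answers
rng = np.random.default_rng(0)
cal = rng.uniform(0.0, 1.0, size=500)
tau = conformal_threshold(cal, alpha=0.1)
# a new sample whose nonconformity score is <= tau can be "accepted"
# early, i.e. no additional reasoning paths need to be generated
```

The guarantee is distribution-free: as long as calibration and test scores are exchangeable, at least 90% of new examples fall at or below `tau`, which is what makes early stopping statistically safe.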
Paradigm Shift
Replaces standard relative Softmax attention with 'Multiscreening' to allow absolute query-key relevance, yielding 3.2x faster inference at 100K context.
Apr 2
Scaling Insight
Simple Self-Distillation (SSD) improves LLM code generation (e.g., Qwen3-30B) by 13% Pass@1 without any external verifiers or teacher models.
Apr 2
Breaks Assumption
Provides causal evidence that reasoning models often decide on an action (like a tool call) before they even start generating their 'Chain-of-Thought'.
Apr 2
Efficiency Breakthrough
Combines the YOCO architecture with recursive computation to scale representational depth without inflating the KV cache.
Apr 2
Efficiency Breakthrough
Solves the long-standing trade-off in low-rank matrix recovery by achieving both optimal sample complexity and fast convergence.
Apr 2
Breaks Assumption
Provides a theoretical explanation for why Transformers often fail compared to linear models in financial time series forecasting.
Apr 2
Efficiency Breakthrough
Enables Gaussian Processes to scale on modern parallel hardware by removing the need for Cholesky decompositions.
Apr 2
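As background on the bottleneck being removed: below is the textbook Cholesky-based exact GP posterior mean, where the O(n³), hard-to-parallelize factorization is exactly the step such papers eliminate. The kernel choice, data, and helper names here are illustrative, not taken from the paper:

```python
import numpy as np

def rbf_kernel(a, b, lengthscale=1.0):
    """Squared-exponential kernel matrix between 1-D point sets a and b."""
    d2 = (a[:, None] - b[None, :]) ** 2
    return np.exp(-0.5 * d2 / lengthscale**2)

# training data: samples of sin(x) on a grid
x = np.linspace(0.0, 2.0 * np.pi, 50)
y = np.sin(x)

# conventional exact GP regression: the Cholesky factorization below is
# the O(n^3), inherently sequential step that limits GPU scaling
K = rbf_kernel(x, x) + 1e-6 * np.eye(len(x))   # jitter for stability
L = np.linalg.cholesky(K)
alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))

x_test = np.array([np.pi / 2])
mu = rbf_kernel(x_test, x) @ alpha             # posterior mean at x_test
```

Iterative alternatives (e.g. conjugate-gradient solves) replace the factorization with matrix-vector products, which map far better onto parallel hardware.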
New Capability
Introduces 'deconfounding scores' to enable reliable causal effect estimation even when treatment and control groups have very little overlap.
Apr 2
Open Release
Delivers a state-of-the-art universal phone recognition model across 100+ languages with full open-source release.
Apr 2
Cosmic Scale
Researchers have designed a new internet protocol specifically for a 10-node colony network spanning Earth, the Moon, and Mars.
Apr 1
Practical Magic
Everyday 5G cell towers can be repurposed as a massive radar system capable of tracking drones hidden in urban noise.
Apr 1
Nature Is Weird
AI voice assistants can be tricked, with near-perfect success rates, into 'hearing' voices and events that never actually happened.
Apr 1
Practical Magic
Future wireless signals could be boosted by walls that physically shift and morph their shape to bounce waves toward your phone.
Apr 1
Paradigm Challenge
Researchers have mapped out all 19.3 million chords the human hand can play on a piano to reveal why some sound 'clear' and others 'muddy.'
Apr 1
New Capability
Interfaces LLMs with Wikidata-scale graphs for multi-hop reasoning without any retraining of the model or the query executor.
Apr 1
Open Release
A unified, open-source framework that converts complex post-training quantization workflows into a single-line, hardware-aware pipeline.
Apr 1
Efficiency Breakthrough
Decouples data mixture ratio selection from continual pre-training by optimizing distribution vectors post-hoc with 15-35x lower compute cost.
Apr 1
New Capability
Achieves an 80x improvement in stable generation length for occupancy world models, enabling 4km+ autonomous driving simulations from a single frame.
Apr 1
Paradigm Shift
Replaces the heuristic constant momentum (0.9) with a parameter-free, physics-inspired schedule that speeds up convergence by nearly 2x.
Apr 1
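For context on the baseline being replaced: the classic heavy-ball (Polyak momentum) update with the conventional constant beta = 0.9. The paper's physics-inspired schedule itself is not reproduced here; the toy quadratic objective is purely illustrative:

```python
import numpy as np

def heavy_ball_step(w, v, grad, lr=0.1, beta=0.9):
    """One heavy-ball update with the conventional constant beta = 0.9,
    the heuristic the paper replaces with a parameter-free schedule.
    Returns the updated (weights, velocity) pair."""
    v = beta * v - lr * grad
    return w + v, v

# minimize f(w) = 0.5 * w^2 (gradient is w), starting from w = 5.0
w, v = 5.0, 0.0
for _ in range(200):
    w, v = heavy_ball_step(w, v, grad=w)
# w has converged toward the minimum at 0
```

The fixed 0.9 is a pure heuristic: the optimal momentum depends on the local curvature, which is what motivates replacing the constant with a schedule derived from the optimization dynamics.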
New Capability
Leverages model reprogramming as an 'active signal amplifier' to proactively audit privacy leakage in LLMs and Diffusion models.
Apr 1
Efficiency Breakthrough
Combines differentiable optimization with exact ILP solvers to achieve a 10x performance gain in solving NP-hard combinatorial scheduling problems.
Apr 1
Paradigm Shift
Proposes a mathematical framework where 'spectral gaps' in parameter updates control phase transitions like grokking and loss plateaus.
Apr 1
Breaks Assumption
Large-scale experiments reveal that self-organizing LLM agents spontaneously outperform manually designed hierarchical structures by 14%.
Apr 1
Efficiency Breakthrough
A fabricated 16 nm SoC performs real-time 3D occupancy mapping at under 6 mW, reducing query energy by over 80%.
Apr 1
Paradigm Shift
Proposes a neuroscience-grounded memory architecture that makes interactions cheaper and more accurate with experience, rather than relying on expanding context windows.
Apr 1
Breaks Assumption
Reveals that parallel translated data is surprisingly unnecessary for creating aligned multilingual representations in LLMs.
Apr 1
Breaks Assumption
Discovers that pretraining Implicit Neural Representations (INRs) on structured $1/f^\alpha$ noise performs as well as data-driven initialization.
Apr 1
Paradigm Shift
Introduces DASES, a framework that replaces passive validation with active 'falsification' to ensure scientific models learn actual mechanisms rather than just winning benchmarks.
Apr 1
Efficiency Breakthrough
Generates complete, simulatable analog circuits in milliseconds, outperforming search-based methods by over 600x.
Apr 1
Breaks Assumption
Demonstrates that integer multiplication is not a long-range dependency problem, and that current architectures like Transformers and Mamba are fundamentally using the wrong 'computational spacetime.'
Apr 1
Efficiency Breakthrough
Introduces PolarQuant, a quantization method that uses Hadamard rotation to make LLM weights near-lossless at 5-bit without calibration data.
Apr 1
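PolarQuant's exact pipeline is not described in the blurb. As a generic illustration of the named Hadamard-rotation idea (helper names and the toy weight vector are illustrative), rotating weights with an orthogonal Hadamard matrix spreads a single outlier across all coordinates, shrinking the dynamic range a symmetric quantizer must cover:

```python
import numpy as np

def hadamard(n):
    """Sylvester-construction Hadamard matrix (n a power of 2),
    normalized so that H @ H.T = I (an exact orthogonal rotation)."""
    H = np.array([[1.0]])
    while H.shape[0] < n:
        H = np.block([[H, H], [H, -H]])
    return H / np.sqrt(n)

def quantize(x, bits=5):
    """Symmetric uniform quantization at the given bit width; the scale
    is set by the largest-magnitude entry."""
    scale = np.abs(x).max() / (2 ** (bits - 1) - 1)
    return np.round(x / scale) * scale

# a weight vector with one large outlier that dominates the scale:
# quantized directly, every small weight rounds to zero
w = np.array([0.01, -0.02, 0.03, 0.015, -0.01, 0.02, -0.025, 4.0])

H = hadamard(8)
# quantize in the rotated basis, then rotate back: H.T @ Q(H @ w);
# the rotation shrinks max|H @ w| well below max|w| = 4.0
w_hat = H.T @ quantize(H @ w)
```

Because the rotation is orthogonal, it can be folded into adjacent layers at no inference cost, which is why this trick enables low-bit quantization without calibration data.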
Breaks Assumption
Demonstrates that the 'modality gap' in CLIP-style models is a feature that can be exploited to increase robustness without retraining.
Apr 1
New Capability
Achieves a +48pp accuracy gain in agents using a non-parametric online learning framework that reuses procedural plans without updating model weights.
Apr 1
Efficiency Breakthrough
Scales curvature-aware bilevel optimization to BERT-sized models using KFAC, significantly outperforming standard gradient unrolling.
Apr 1
Paradigm Shift
Switches the training objective from hard Next-Token Prediction to predicting 'concepts' (sets of semantically related tokens).
Apr 1
Breaks Assumption
Challenges the assumption that architecture and loss are the primary levers for neural simulators by proving the 'carried state' design is the dominant bottleneck.
Apr 1
Paradigm Shift
Proves that LLM agent capability (pass@1) and reliability (consistency) diverge systematically, with frontier models often having the highest 'meltdown' rates.
Apr 1
New Capability
Introduces a way for diffusion models to generate a single, sharp 'mental average' of a concept rather than blurry pixel-wise averages.
Apr 1
Open Release
A massive multimodal release for 10 low-resource African languages, reducing state-of-the-art Word Error Rates (WER) by up to 61% relative.
Apr 1
Efficiency Breakthrough
Enables infinite-length video understanding on a single consumer GPU (RTX 3090) through a training-free visual memory mechanism.
Apr 1
Paradigm Shift
Learns stable, interpretable Koopman generators for nonlinear PDEs from trajectory data alone without any physics supervision.
Apr 1
Open Release
A massive 270K-sample multi-view video corpus specifically for embodied AI agents in complex retail environments.
Apr 1
New Capability
Introduces a scalable reinforcement learning framework that enables high-fidelity control of a whole-body human musculoskeletal system with over 700 muscles.
Apr 1
New Capability
Proposes 'Nomad', an exploration-first agent architecture that autonomously discovers insights in data without being limited by human prompts or questions.
Apr 1
Breaks Assumption
Reveals that many massive LLM benchmarks provide highly redundant information, with major leaderboards often containing only ~2 independent axes of measurement.
Apr 1