Papers on machine learning, AI systems, alignment, interpretability, agents, and foundation models, plus applied AI work whose core contribution is computational intelligence.
Breaks Assumption
LLM-based user simulators create an 'easy mode' for agents that fails to capture real human frustration, ambiguity, and feedback nuances.
Breaks Assumption
Machine unlearning in LLMs is often a 'mirage' that can be bypassed using simple multi-hop reasoning or entity aliasing.
Efficiency Breakthrough
InstantHDR achieves high-quality 3D HDR reconstruction 700x faster than current optimization-based methods.
Paradigm Shift
Theoretical analysis proves that Langevin dynamics is fundamentally non-robust to score function errors, justifying the shift to Diffusion Models.
Paradigm Shift
HAPO resolves the advantage collapse problem in sparse-reward RL for reasoning models using a Thompson-sampled hindsight mechanism.
Scaling Insight
Adversarial prompt injection causes jailbreak success rates to transition from polynomial to exponential scaling with inference-time samples.
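The intuition behind the scaling transition can be sketched with a simple best-of-n calculation. Assuming (purely for illustration, not as the paper's model) that each inference-time sample jailbreaks independently with a fixed rate q, the attacker's success probability approaches 1 exponentially in the number of samples:

```python
def any_success(q, n):
    # Probability that at least one of n independent samples succeeds,
    # given a fixed per-sample jailbreak rate q. Independence is an
    # illustrative simplification, not the paper's actual model.
    return 1 - (1 - q) ** n

# The failure probability (1 - q)**n = exp(n * log(1 - q)) decays
# exponentially in n, so even a tiny per-sample rate becomes
# near-certain success with enough inference-time samples.
print(any_success(0.01, 1))    # per-sample rate
print(any_success(0.01, 500))  # best-of-500
```

This is why defenses calibrated against single-sample attack rates can look far stronger than they are under repeated sampling.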
New Capability
RewardHackingAgents establishes a benchmark for evaluating whether ML-engineering agents are actually solving tasks or just tampering with the evaluation code.
Efficiency Breakthrough
TimeSqueeze achieves 20x faster convergence and 8x higher data efficiency for time-series foundation models by using dynamic, content-aware patching.
Breaks Assumption
MirrorDrift demonstrates a successful SLAM-targeted attack on production-grade 'secure' LiDARs using simple actuated mirrors rather than complex signal injection.
Breaks Assumption
An evaluation of 17 LLMs reveals a 'conversation tax' where multi-turn interactions consistently degrade diagnostic reasoning compared to single-shot prompts.
Paradigm Shift
Introduces Finsler geometry to manifold learning, capturing asymmetric data relationships, such as density hierarchies, that Riemannian methods ignore.
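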
Breaks Assumption
Re-evaluating high-profile medical AI safety claims reveals that reported triage failures were artifacts of the 'exam-style' evaluation format rather than model incapacity.
Efficiency Breakthrough
DART enables real-time multi-class detection for open-vocabulary models like SAM3, achieving up to 25x speedup without any weight modifications.
Breaks Assumption
Softmax normalization mathematically mandates the creation of attention sinks to serve as 'null states' when models need to ignore input.
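The mechanism is easy to see numerically: softmax forces every attention row to sum to 1, so a head cannot output "nothing." A minimal sketch (the sink token here is hypothetical, standing in for whatever position the model learns to use, often BOS):

```python
import math

def softmax(scores):
    # Numerically stable softmax: outputs always sum to exactly 1.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# A head that scores every token low still emits a full unit of
# attention, spread uniformly -- there is no "ignore everything" option.
no_sink = softmax([-9.0, -9.0, -9.0, -9.0])

# Appending a sink position the head can score highly gives that
# mandatory probability mass somewhere harmless to go.
with_sink = softmax([-9.0, -9.0, -9.0, -9.0, 4.0])
```

With the sink present, nearly all attention mass collapses onto it, which is exactly the "null state" behavior the paper argues softmax mandates.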
Efficiency Breakthrough
LongFlow provides an 11x throughput boost for reasoning models by specifically optimizing KV cache for long-output (vs long-input) scenarios.
Paradigm Shift
Manifold-Optimal Guidance reformulates Classifier-Free Guidance (CFG) as a Riemannian control problem, eliminating the artifacts and saturation typical of high guidance scales.
Open Release
Tiny Aya is a 3.35B parameter multilingual model that achieves state-of-the-art results across 70 languages, challenging the need for massive scale in global AI.
Breaks Assumption
An empirical study reveals that models under 7B parameters have a fundamental utilization bottleneck that prevents them from using retrieved context effectively.
Efficiency Breakthrough
Mobile-GS achieves real-time Gaussian Splatting on mobile devices by replacing the sorting-based alpha-blending bottleneck with depth-aware order-independent rendering.
Paradigm Shift
Expert Threshold Routing (ET) replaces standard top-k token-choice with an independent thresholding mechanism, achieving 1.6x faster training convergence.
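The contrast between the two routing schemes can be sketched as follows. The threshold value and gating details below are illustrative assumptions, not the paper's specification:

```python
def topk_route(gate_logits, k=2):
    # Standard token-choice MoE routing: each token activates its k
    # highest-scoring experts, so the expert count per token is fixed.
    ranked = sorted(range(len(gate_logits)),
                    key=lambda i: gate_logits[i], reverse=True)
    return sorted(ranked[:k])

def threshold_route(gate_probs, tau=0.3):
    # Hypothetical ET-style alternative: each expert fires independently
    # whenever its gate probability clears the threshold tau, so easy
    # tokens can engage fewer experts and hard tokens more.
    return [i for i, p in enumerate(gate_probs) if p >= tau]
```

For example, `threshold_route([0.5, 0.1, 0.4, 0.2])` activates experts 0 and 2, while a token with all-low gate probabilities activates none, which top-k routing cannot express.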
New Capability
RoboClaw introduces 'Entangled Action Pairs' to allow robots to autonomously collect data by learning to reset their own environment.
Breaks Assumption
The discovery of 'Helicoid Dynamics' identifies a critical safety failure where frontier LLMs accurately name their reasoning errors but are structurally unable to stop repeating them.
Efficiency Breakthrough
Achieves 99.5% performance on Needle-In-A-Haystack benchmarks while retaining only 3% of the KV cache budget.
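A 3% KV budget implies aggressive cache pruning. A minimal sketch of score-based selection, keeping only the entries with the highest cumulative attention (the paper's actual criterion is not specified here; this is an assumed stand-in):

```python
def prune_kv(attn_scores, budget_frac=0.03):
    # Hypothetical score-based KV cache pruning: rank cached positions
    # by cumulative attention received and keep only the top budget_frac
    # share, returned in original position order.
    n = len(attn_scores)
    keep = max(1, int(n * budget_frac))
    ranked = sorted(range(n), key=lambda i: attn_scores[i], reverse=True)
    return sorted(ranked[:keep])
```

For instance, `prune_kv([0.1, 0.9, 0.05, 0.7], budget_frac=0.5)` keeps positions 1 and 3, the two most-attended entries.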
Scaling Insight
Applying Rotary Positional Embeddings (RoPE) to only 10% of hidden dimensions is sufficient for full model convergence, enabling 10x memory savings in positional caches.
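Restricting rotation to a slice of the hidden dimensions is straightforward to sketch. The pairing and frequency schedule below follow standard RoPE; applying it to only 10% of dimensions is the paper's finding, while the remaining details are assumed:

```python
import math

def rope_partial(x, pos, rotary_frac=0.1, base=10000.0):
    # Rotate only the first rotary_frac share of dimensions (in pairs,
    # standard RoPE frequency schedule); pass the rest through untouched.
    # Only the rotated slice needs cached sin/cos tables, which is where
    # the positional-cache memory saving comes from.
    d = len(x)
    d_rot = int(d * rotary_frac)
    d_rot -= d_rot % 2                      # dimensions rotate in pairs
    out = list(x)
    for i in range(0, d_rot, 2):
        theta = pos * base ** (-i / max(d_rot, 1))
        c, s = math.cos(theta), math.sin(theta)
        out[i] = x[i] * c - x[i + 1] * s
        out[i + 1] = x[i] * s + x[i + 1] * c
    return out
```

For a 64-dimensional vector this rotates only the first 6 dimensions, so the positional cache shrinks by roughly the same 10x factor the summary cites.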
Efficiency Breakthrough
Distills high-fidelity joint audio-visual generation into a real-time streaming model capable of 25 FPS on a single GPU.
Breaks Assumption
Shows that simple sequential fine-tuning with LoRA outperforms complex algorithms for continual reinforcement learning in VLA models.
Breaks Assumption
Proves that policy gradient algorithms naturally collapse entropy and provides a mathematical fix to preserve exploration and diversity.
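The collapse mechanism can be illustrated with the standard entropy-bonus mitigation. Note this is the textbook fix shown for context, not the paper's own mathematical correction:

```python
import math

def entropy(probs):
    # Shannon entropy of a discrete policy distribution.
    return -sum(p * math.log(p) for p in probs if p > 0)

def pg_loss(log_prob_action, advantage, probs, beta=0.01):
    # REINFORCE-style surrogate loss. Descending the first term alone
    # keeps sharpening the policy toward the highest-advantage action
    # until entropy collapses; the beta-weighted bonus rewards keeping
    # the distribution spread out, preserving exploration.
    return -(log_prob_action * advantage) - beta * entropy(probs)
```

With `beta > 0`, a uniform policy yields a strictly lower loss than the same action probabilities without the bonus, so the optimizer retains pressure toward diversity.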
Efficiency Breakthrough
Achieves hour-scale real-time human animation by solving the unbounded memory growth and inconsistent noise states in autoregressive diffusion.
Paradigm Shift
Introduces the Compression-Consistency Principle, arguing that LLMs prefer truth only when false alternatives are structurally harder to compress.
New Capability
Replaces unstructured LLM debates with 'Deliberative Collective Intelligence,' producing formal decision packets with minority reports and accountability trails.
Scaling Insight
Provides a learning-theoretic characterization of model collapse, proving exactly when replaying past outputs destroys model diversity.
Paradigm Shift
Enables agents to autonomously discover the group structure of their environments to learn disentangled representations without human priors.
Efficiency Breakthrough
Unifies leading membership inference attacks into a single framework and uses Bayesian variance inference to enable privacy auditing with 10x less compute.
New Capability
Automates the entire robotic data generation loop, including a self-resetting mechanism that restores unstructured workspaces without human intervention.
New Capability
Bridges the gap between parametric CAD and direct B-Rep synthesis using LLMs and primitive grounding.
Paradigm Shift
Eliminates lookahead bias in financial backtesting through a series of yearly-partitioned pretrained LLMs.
Efficiency Breakthrough
Recovers hidden ODE parameters from sparse data with a 487x speedup over gradient-based methods.
Efficiency Breakthrough
Eliminates the 2.5x latency penalty of dynamic adapters in LLMs via pre-gating and fused CUDA kernels.
New Capability
Enables concurrent perception and reasoning for continuous video streams in Multimodal Large Language Models.
Efficiency Breakthrough
Fits promptable visual segmentation (SAM) into a 1.3M parameter model for real-time in-sensor execution.
New Capability
First framework for interpreting 4D molecular trajectories into natural language explanations.
Scaling Insight
Exhaustive circuit mapping of a biological foundation model reveals massive redundancy and annotation bias.
Paradigm Shift
Solves GNN over-squashing by using global effective resistance to identify and rewire structural bottlenecks.
New Capability
Cross-domain sensor model that handles variable signal lengths and resolutions without retraining.
Efficiency Breakthrough
Achieves high-fidelity one-step (1 NFE) 3D robotic manipulation using training-time drifting fields.
Open Release
Introduces the first billion-scale SAR vision foundation model and a massive unified benchmark for all-weather geospatial semantic segmentation.
Breaks Assumption
Demonstrates that simply using XML tags during translation outperforms complex pipelines for cross-lingual label projection while actually improving translation quality.
Efficiency Breakthrough
Achieves up to 14.4x higher decoding throughput in long-context LLMs via a training-free framework that reuses sparse memory at semantic boundaries.
New Capability
Enables multimodal agents to continually improve from accumulated experience and skills, without any parameter updates, through a dual-stream visual grounding framework.
New Capability
A 3D vision-language pipeline that grounds medical diagnosis in longitudinal brain MRI via regional volumetric assessments to eliminate VLM hallucinations.