Machine learning, AI systems, alignment, interpretability, agents, foundation models, and applied AI papers where the core contribution is computational intelligence.
Efficiency Breakthrough
A cross-graph, tuning-free prompting framework for GNNs that achieves substantial gains on unseen graphs without retraining.
Paradigm Shift
Proposes a decision-centric architecture that separates signal estimation from control policy to make LLM system decisions explicit and inspectable.
New Capability
Enables zero-shot humanoid navigation in unseen environments using only 5 hours of human walking data and no robot-specific data.
New Capability
A white-box membership inference attack based on 'gradient-induced feature drift' that outperforms all existing confidence-based methods.
Efficiency Breakthrough
Self-Routing removes the need for learned routers in Mixture-of-Experts (MoE) by using hidden states directly for expert assignment.
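The summary does not pin down the assignment rule; the minimal PyTorch sketch below assumes routing by cosine similarity between each token's hidden state and fixed per-expert prototype vectors (the prototypes, top-k choice, and weighting are illustrative assumptions, not the paper's exact design).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SelfRoutedMoE(nn.Module):
    """MoE layer with no learned router: experts are selected by cosine
    similarity between the hidden state and fixed expert prototypes
    (an assumed mechanism; the paper's rule may differ)."""

    def __init__(self, d_model=256, n_experts=8, top_k=2):
        super().__init__()
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, 4 * d_model),
                          nn.GELU(),
                          nn.Linear(4 * d_model, d_model))
            for _ in range(n_experts)
        )
        # Fixed (non-trained) prototypes: one unit direction per expert.
        self.register_buffer(
            "prototypes", F.normalize(torch.randn(n_experts, d_model), dim=-1))
        self.top_k = top_k

    def forward(self, h):                                   # h: (batch, seq, d)
        sims = F.normalize(h, dim=-1) @ self.prototypes.T   # (batch, seq, E)
        weights, idx = sims.topk(self.top_k, dim=-1)        # nearest experts
        weights = weights.softmax(dim=-1)
        out = torch.zeros_like(h)
        for k in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = idx[..., k] == e                     # tokens routed to e
                if mask.any():
                    w = weights[..., k][mask].unsqueeze(-1)
                    out[mask] += w * expert(h[mask])
        return out
```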
Efficiency Breakthrough
Improves Qwen2.5-7B performance on AIME2024 by 137% through test-time iterative rethinking and majority-voted pseudo-labels.
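Only the voting mechanism is named above; here is a minimal sketch of one rethinking round, assuming self-consistency-style sampling, with a stub solver standing in for the LLM.

```python
import random
from collections import Counter

def rethink_round(solve, problem, n_samples=16):
    """One test-time rethinking round: sample several solutions and
    majority-vote the answers into a pseudo-label for the next round
    (loop structure assumed; only the voting is given in the summary)."""
    answers = [solve(problem) for _ in range(n_samples)]
    pseudo_label, votes = Counter(answers).most_common(1)[0]
    return pseudo_label, votes / n_samples

# Stub solver standing in for sampled LLM answers on one problem.
random.seed(0)
noisy_solve = lambda p: random.choice([42, 42, 42, 41, 43])  # mostly correct
answer, agreement = rethink_round(noisy_solve, "AIME-style question")
print(f"pseudo-label: {answer} (agreement {agreement:.0%})")
```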
Efficiency Breakthrough
Automates mathematical optimization modeling using reinforcement learning with solver-derived rewards instead of human process supervision.
Breaks Assumption
Reveals that many 'polysemantic' neurons in LLMs are actually firing for shared word forms (lexical) rather than compressed semantic concepts.
Paradigm Shift
Truth Anchoring (TAC) provides a post-hoc calibration method to align LLM uncertainty metrics with actual factual correctness.
Scaling Insight
Demonstrates that LLM judge panels follow power-law discovery curves, where panel size and persona diversity are critical for uncovering edge-case failures.
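As an illustration of what a power-law discovery curve looks like, here is a fit of failures ≈ a·n^b on synthetic panel data (the numbers are made up; only the fitting recipe is general).

```python
import numpy as np

# Hypothetical data: unique failures found by judge panels of size n.
panel_sizes = np.array([1, 2, 4, 8, 16, 32])
unique_failures = np.array([12, 19, 31, 48, 77, 120])

# Fit a power law  failures ≈ a * n^b  in log-log space.
b, log_a = np.polyfit(np.log(panel_sizes), np.log(unique_failures), deg=1)
a = np.exp(log_a)
print(f"fit: failures ≈ {a:.1f} * n^{b:.2f}")

# Extrapolate: expected discoveries for a 64-judge panel.
print(f"predicted at n=64: {a * 64 ** b:.0f}")
```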
Paradigm Shift
Identifies 'diversity collapse' in the popular GRPO reinforcement learning method and introduces MUPO to maintain broad reasoning paths.
New Capability
Introduces the first auto-regressive framework for Gaussian Splatting, enabling parallel, progressive next-scale 3D generation.
Efficiency Breakthrough
Optimizes LLM inference scheduling by treating output length as a heavy-tailed distribution rather than a point estimate.
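A small simulation of the scheduling intuition: under a heavy-tailed length distribution (lognormal assumed here; the paper's exact distributional choice is not given in the summary), a mean-based reservation fails a large fraction of requests, while a tail quantile does not.

```python
import numpy as np

rng = np.random.default_rng(0)

# Model output lengths as heavy-tailed rather than a point estimate.
lengths = rng.lognormal(mean=5.0, sigma=1.0, size=100_000)

point_estimate = lengths.mean()      # what a point-estimate scheduler reserves
p95 = np.quantile(lengths, 0.95)     # a tail-aware reservation

print(f"mean length:     {point_estimate:8.0f} tokens")
print(f"95th percentile: {p95:8.0f} tokens")

# Fraction of requests a mean-based reservation would truncate or preempt:
print(f"requests exceeding mean: {(lengths > point_estimate).mean():.1%}")
```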
Efficiency Breakthrough
Introduces negative early exit and adaptive boosting to make Monte Carlo Tree Search (MCTS) practical for real-time LLM inference.
Efficiency Breakthrough
Pushes dataset distillation to 60% accuracy on ImageNet-1K using only a handful of synthetic images.
Efficiency Breakthrough
Enables 'Elastic Inference' where a single trained model can be converted to multiple lower-precision formats on-the-fly without retraining.
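One common retraining-free conversion is post-hoc per-channel int8 quantization; the sketch below shows that mechanism as an assumed stand-in for the paper's on-the-fly format changes.

```python
import numpy as np

def to_int8(w_fp32):
    """Per-output-channel symmetric int8 quantization of a trained
    weight matrix, applied after training (no fine-tuning)."""
    scale = np.abs(w_fp32).max(axis=1, keepdims=True) / 127.0
    q = np.round(w_fp32 / scale).astype(np.int8)
    return q, scale

def from_int8(q, scale):
    return q.astype(np.float32) * scale

w = np.random.randn(4, 8).astype(np.float32)   # stand-in trained weight
q, s = to_int8(w)
w_hat = from_int8(q, s)
print("max abs error:", np.abs(w - w_hat).max())
```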
New Capability
Proposes a parameter-efficient LLM adaptation method that enables rapid specialization on non-stationary streams while preventing catastrophic forgetting.
Paradigm Shift
Replaces manual rubric-tuning for synthetic data with an automated gradient-guided optimization framework based on influence estimation.
New Capability
Rebuilds the Agent-Computer Interaction (ACI) stack for scientific discovery, addressing the fragility of JSON tool calling and execution sandboxes.
Efficiency Breakthrough
Scales imitation learning data efficiency by generating synthetic 'multi-view' demonstrations from a single expert trajectory.
New Capability
Introduces SIGN, a framework capable of discovering governing symbolic equations for networked systems with over 100,000 nodes.
Breaks Assumption
Discovers 'Quality Corruption,' an adversarial failure mode in which accuracy collapses while detection counts remain stable, showing that robustness is substrate-dependent.
Efficiency Breakthrough
Proposes Physical Imitation Learning (PIL) to offload up to 87% of a control policy's mechanical power to passive robotic joints.
Open Release
OmniVoice is an open-source TTS model scaling to over 600 languages using a novel diffusion language model architecture.
New Capability
TTA-Vid enables video reasoning models to adapt to new domains at test-time using label-free reinforcement learning on a single sample.
Paradigm Shift
Introduces HiLL, a framework that jointly trains a 'hinter' and 'reasoner' to prevent advantage collapse in reinforcement learning for hard tasks.
Scaling Insight
Establishes a three-dimensional scaling law for RAG-pretraining, modeling the optimal data budget allocation between model parameters, tokens, and retrieval store size.
Efficiency Breakthrough
CircuitProbe identifies reasoning circuits in Transformers 1000x faster than brute-force methods and predicts the efficacy of layer duplication.
Paradigm Shift
LangMARL introduces agent-level credit assignment and policy gradient evolution directly in the natural language space for multi-agent coordination.
Breaks Assumption
Provides the first controlled study of Silent Data Corruption (SDC) in GPUs and its catastrophic impact on LLM pretraining stability.
Efficiency Breakthrough
Spectral Compact Training (SCT) enables training 70B-parameter architectures on consumer hardware like the Steam Deck (8GB RAM) via permanent SVD factors.
Paradigm Shift
Stochastic Attention achieves a global receptive field in O(log n) layers by using randomized routing inspired by the fruit fly connectome.
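The O(log n) claim is the standard expander-graph mixing argument; a quick simulation, assuming each token attends to k fresh random tokens per layer (the paper's routing scheme may differ), shows the receptive field saturating in roughly log2(n) layers.

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 512, 4                     # tokens, random attention targets per token

reach = np.eye(n, dtype=bool)     # reach[i, j]: j's info has reached token i
layers = 0
while not reach.all():
    layers += 1
    targets = rng.integers(0, n, size=(n, k))   # fresh random routing per layer
    new_reach = reach.copy()
    for i in range(n):
        for j in targets[i]:
            new_reach[i] |= reach[j]            # i aggregates what j has seen
    reach = new_reach

print(f"full receptive field after {layers} layers (log2(n) ≈ {np.log2(n):.1f})")
```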
New Capability
ThoughtSteer demonstrates the first successful backdoor attack on continuous latent reasoning models that leave no token-based audit trail.
Breaks Assumption
Mechanistic analysis reveals that LLMs fail at character counting not because they lack the information, but because 'negative circuits' in the final layers actively suppress the correct answer.
Efficiency Breakthrough
Achieves O(1) classification complexity for multimillion-class problems by leveraging predefined vector systems in the latent space.
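One way classification cost can become independent of the class count is to decode a predefined compositional code digit by digit; the sketch below assumes base-K positional codes (the paper's 'predefined vector systems' may well differ).

```python
import numpy as np

N_CLASSES, BASE, DIGITS = 1_000_000, 100, 3     # 100^3 >= 1e6 codes

def encode(label):
    """Class index -> fixed base-BASE digit code (predefined, not learned)."""
    return [(label // BASE**d) % BASE for d in range(DIGITS)]

def decode(digit_logits):
    """Per-digit argmax -> class index. Cost depends on BASE * DIGITS,
    not on N_CLASSES."""
    digits = [int(np.argmax(l)) for l in digit_logits]
    return sum(d * BASE**i for i, d in enumerate(digits))

label = 423_517
# Stand-in 'model outputs': logits peaked at the true digits.
logits = [np.eye(BASE)[d] for d in encode(label)]
assert decode(logits) == label
print("decoded:", decode(logits))
```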
Paradigm Shift
Routing-Free MoE replaces centralized routing with individual expert-level activation, eliminating the need for Softmax and Top-K load balancing.
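A minimal sketch of the idea as read from the summary: each expert carries its own sigmoid gate and fires independently, with no shared softmax and no top-k (the gate form and threshold are assumptions).

```python
import torch
import torch.nn as nn

class ExpertGatedLayer(nn.Module):
    """Each expert decides independently (sigmoid gate) whether to fire;
    no centralized router (assumed reading of the summary)."""

    def __init__(self, d_model=256, n_experts=8, threshold=0.5):
        super().__init__()
        self.experts = nn.ModuleList(
            nn.Linear(d_model, d_model) for _ in range(n_experts))
        self.gates = nn.ModuleList(
            nn.Linear(d_model, 1) for _ in range(n_experts))
        self.threshold = threshold

    def forward(self, h):                            # h: (batch, d_model)
        out = torch.zeros_like(h)
        for gate, expert in zip(self.gates, self.experts):
            g = torch.sigmoid(gate(h))               # (batch, 1), per-expert
            active = (g > self.threshold).squeeze(-1)
            if active.any():                         # sparse: skip idle experts
                out[active] += g[active] * expert(h[active])
        return out
```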
Efficiency Breakthrough
Molecular Memory allows MoE systems to recover previously learned domain expertise 9-11x faster by utilizing cost-penalized fitness metrics that preserve dormant experts.
Efficiency Breakthrough
OBD-LLM uses second-order Hessian information to achieve 20-40% better low-rank decomposition accuracy than the current state-of-the-art SVD-LLM.
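For context, the activation-aware SVD baseline (as in SVD-LLM) minimizes output error rather than weight error by whitening with the input second-moment matrix; the paper layers Hessian information on top of this idea. A sketch of that baseline:

```python
import numpy as np

def weighted_low_rank(W, X, rank):
    """Low-rank factorization of W minimizing ||W X - W_hat X|| instead of
    plain weight error. W: (out, in), X: (in, samples)."""
    C = X @ X.T / X.shape[1]                       # input second-moment matrix
    L = np.linalg.cholesky(C + 1e-6 * np.eye(C.shape[0]))
    U, S, Vt = np.linalg.svd(W @ L, full_matrices=False)
    # Truncated SVD of the whitened weight, then undo the whitening.
    return (U[:, :rank] * S[:rank]) @ Vt[:rank] @ np.linalg.inv(L)

rng = np.random.default_rng(0)
W = rng.standard_normal((64, 128))
X = rng.standard_normal((128, 1000))
W_hat = weighted_low_rank(W, X, rank=32)
err = np.linalg.norm((W - W_hat) @ X) / np.linalg.norm(W @ X)
print(f"relative output error at rank 32: {err:.3f}")
```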
Paradigm Shift
Policy Improvement Reinforcement Learning (PIRL) shifts the training objective from reward maximization to explicit maximization of policy progress across iterations.
Efficiency Breakthrough
PixelPrune identifies and removes pixel-level redundancy before the Vision Transformer encoder, delivering up to 4.2x inference speedup for high-resolution VLM tasks.
New Capability
An autonomous research pipeline discovered a lifelong multimodal memory framework by diagnosing and fixing its own architectural bugs and data pipeline issues.
Efficiency Breakthrough
EmbedPart achieves a 100x speedup over Metis for graph partitioning by clustering node embeddings rather than operating on raw graph structures.
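The mechanism reduces partitioning to clustering; a minimal sketch with k-means over stand-in embeddings (the embedding method and any balance constraint are left open by the summary).

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

# Stand-in node embeddings (in practice: spectral, DeepWalk, GNN, ...;
# the summary does not say which embedding the paper uses).
n_nodes, dim, n_parts = 10_000, 64, 8
emb = rng.standard_normal((n_nodes, dim)).astype(np.float32)

# Partition = cluster the embeddings; no traversal of raw edges needed.
parts = KMeans(n_clusters=n_parts, n_init=4, random_state=0).fit_predict(emb)

sizes = np.bincount(parts, minlength=n_parts)
print("partition sizes:", sizes)   # balance would need an extra constraint
```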
Efficiency Breakthrough
A lightweight probing method predicts LLM downstream task performance from internal representations during training, reducing evaluation latency from one hour to three minutes.
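A toy version of the probing idea: fit a cheap linear probe from pooled hidden states to the downstream score each checkpoint eventually achieves (data synthetic; the probe and features here are assumptions).

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Stand-ins: pooled hidden states from 200 training checkpoints, plus
# the downstream score each checkpoint later obtained.
n_ckpts, d = 200, 512
H = rng.standard_normal((n_ckpts, d))
true_w = rng.standard_normal(d) / np.sqrt(d)
scores = H @ true_w + 0.1 * rng.standard_normal(n_ckpts)

H_tr, H_te, y_tr, y_te = train_test_split(H, scores, random_state=0)
probe = Ridge(alpha=1.0).fit(H_tr, y_tr)   # milliseconds, not a full eval run
print(f"held-out R^2: {probe.score(H_te, y_te):.2f}")
```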
Efficiency Breakthrough
Canonical Correlation Analysis (CCA) can reduce image representation dimensionality by 75% while improving downstream performance through cross-model agreement.
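A runnable sketch of the recipe with scikit-learn's CCA on synthetic paired representations, keeping only the 64 directions (of 256) on which two models agree.

```python
import numpy as np
from sklearn.cross_decomposition import CCA

rng = np.random.default_rng(0)

# Representations of the same 1,000 images from two different models
# (synthetic here, sharing a low-dimensional common signal).
n, d = 1_000, 256
shared = rng.standard_normal((n, 64))
X = shared @ rng.standard_normal((64, d)) + 0.5 * rng.standard_normal((n, d))
Y = shared @ rng.standard_normal((64, d)) + 0.5 * rng.standard_normal((n, d))

# Keep only directions the two models agree on: 256 -> 64 dims (75% cut).
cca = CCA(n_components=64, max_iter=1000).fit(X, Y)
X_c, Y_c = cca.transform(X, Y)
print("reduced shape:", X_c.shape)
```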
New Capability
WARP provides provably guaranteed repairs for the inner layers of Transformers, overcoming a limitation of previous methods, which were restricted to the final layer.
Paradigm Shift
Proposes dense point trajectories as universal 'visual tokens' for behavior that generalize across different species and non-rigid objects.
Open Release
Releases the GPT-NL Public Corpus, the largest permissively licensed (CC-BY) Dutch-first dataset for LLM pre-training.
Efficiency Breakthrough
Decouples weather forecasting from spatial resolution by using Flow Matching to super-resolve coarse trajectories as a post-processing step.
New Capability
Solves highly intractable (#P-hard) multi-objective optimization problems with tight approximation guarantees using a novel SAT-oracle approach.
New Capability
Demonstrates that covert collusion between multi-agent LLM systems can be detected zero-shot using internal model activations.