SeriesFusion
Science, curated & edited by AI

AI & Machine Learning

2,371 papers  ·  Page 33 of 48

Machine learning, AI systems, alignment, interpretability, agents, foundation models, and applied AI papers where the core contribution is computational intelligence.

Efficiency Breakthrough
PivotRL identifies 'pivot' turns in agent trajectories where actions matter most, enabling compute-efficient reinforcement learning that matches end-to-end RL at 4x lower cost.
Mar 24
Scaling Insight
Discovers 'silent commitment failure,' where some model architectures produce confident, incorrect outputs with zero detectable warning signals before execution.
Mar 24
Scaling Insight
Provides a causal explanation for 'embedding collapse' in Transformers, attributing it to semantic shift rather than text length alone.
Mar 24
Efficiency Breakthrough
KG-Hopper enables 7B-parameter models to outperform 70B systems on complex Knowledge Graph reasoning by embedding the entire multi-hop process into a single 'thinking' stage.
Mar 24
Breaks Assumption
Introduces Cross-Context Verification (CCV) to detect benchmark contamination, finding that contamination is binary: models either recall solutions perfectly or lack reasoning entirely.
Mar 24
Paradigm Shift
DSPA performs preference alignment at inference time by steering Sparse Autoencoder (SAE) features, bypassing the need for expensive weight-update training.
Mar 24
New Capability
DRTriton uses large-scale synthetic data and curriculum RL to automatically generate highly optimized Triton kernels, significantly outperforming top-tier LLMs.
Mar 24
New Capability
Introduces git-inspired primitives to enable truly asynchronous and non-interfering multi-agent software engineering collaboration.
Mar 24
Breaks Assumption
Demonstrates that learning systems can stably converge to incorrect solutions when feedback reliability is unobservable.
Mar 24
Efficiency Breakthrough
Achieves state-of-the-art open-vocabulary segmentation using a training-free, purely geometric projection and propagation method.
Mar 24
Breaks Assumption
Reveals that 'erasing' concepts from video diffusion models only suppresses output rather than removing the underlying representations.
Mar 24
New Capability
Solves the 'recursive drift' problem in self-improving LLMs by using symbolic verification to gate training data quality.
Mar 24
Paradigm Shift
Introduces a counterfactual framework for precise individual credit assignment in collaborative multi-agent LLM systems.
Mar 24
Paradigm Shift
Provides the first unified theoretical formalism for hierarchical memory systems used by long-context language agents.
Mar 24
Breaks Assumption
Proves an information-theoretic lower bound showing that embedding hidden payloads in LLM text must increase its Kolmogorov complexity.
Mar 24
New Capability
Transitions MLLMs from reactive planning to 'mental navigation' by requiring them to construct hierarchical cognitive maps from egocentric video.
Mar 24
Efficiency Breakthrough
Enables merging independently trained specialist models (e.g., Vision-LLM and Audio-LLM) into a single multimodal model without any paired training data.
Mar 24
Breaks Assumption
Standard entropy-based uncertainty quantification (UQ) fails in RAG because the 'induction heads' that copy correct answers also trigger 'entropy neurons', causing false uncertainty signals.
Mar 24
Paradigm Shift
Rule-State Inference (RSI) inverts the standard ML paradigm by treating known regulatory rules as priors and inferring the latent state of compliance and drift, rather than approximating rules from noisy data.
Mar 24
Paradigm Shift
GSB-PPO lifts proximal policy optimization from discrete action steps to full generation trajectories by framing it as a Generalized Schrödinger Bridge.
Mar 24
Breaks Assumption
Auditing 'Silicon Bureaucracy' reveals that LLM benchmark scores are often inflated by contamination-related memory reactivation rather than genuine generalization.
Mar 24
Efficiency Breakthrough
SparseVoxelDet is the first fully sparse object detector for event cameras that never instantiates a dense tensor, achieving 858x GPU memory compression.
Mar 24
New Capability
HumanOmni-Speaker achieves end-to-end speaker diarization and lip-reading by compressing high-frequency motion residuals into just 6 tokens per frame.
Mar 24
Paradigm Shift
PRM-as-a-Judge shifts robotic evaluation from binary success/failure to a dense, potential-based progress metric system.
Mar 24
Scaling Insight
Depth-Recurrent Transformers decouple computational depth from parameter count, revealing a 'computational frontier' where performance on reasoning tasks snaps from zero to perfect based on iteration steps.
Mar 24
Breaks Assumption
The 'Mirage' study demonstrates that frontier MLLMs generate detailed reasoning traces and clinical findings for images they were never actually shown.
Mar 24
Efficiency Breakthrough
Confidence-Evidence Bayesian Gain (CEBaG) provides deterministic hallucination detection for medical VQA without requiring 10-20 stochastic generations.
Mar 24
Paradigm Shift
FIM-Merging provides a theoretical framework for layer-adaptive model merging using the Fisher Information Matrix to bound merging error.
Mar 24
Breaks Assumption
Challenges the gold standard of Upper Confidence Bound (UCB) exploration in diversity-aware bandit tasks.
Mar 24
Scaling Insight
Identifies structured table data as a primary driver for scaling long-context reasoning in LLMs.
Mar 24
New Capability
Achieves zero-shot, zero-training collaborative navigation between humanoid and quadruped robots.
Mar 24
Efficiency Breakthrough
Enables high-performance Zeroth-Order (ZO) fine-tuning of LLMs by leveraging online curvature signals.
Mar 24
Efficiency Breakthrough
Reduces token consumption in interleaved multimodal reasoning by over 72% using dynamic visual thoughts.
Mar 24
New Capability
Introduces a training-free method to visualize and validate the invariances of any feature extractor using diffusion priors.
Mar 24
Paradigm Shift
Hypothesizes and demonstrates a unified Gaussian latent geometry connecting vision encoders and generative models.
Mar 24
Efficiency Breakthrough
Eliminates the need for strictly aligned image pairs in infrared and visible image fusion.
Mar 24
Paradigm Shift
Solves the structural redundancy problem in symbolic regression by collapsing expression DAG isomorphisms.
Mar 24
Efficiency Breakthrough
Reduces human annotation requirements for NLP model testing by up to 95%.
Mar 24
New Capability
Reveals that frozen LLMs contain person-specific 'neural signatures' that can predict individual brain activity.
Mar 24
Scaling Insight
Introduces a robust framework for optimal Mixture-of-Experts (MoE) architecture design across six orders of magnitude in compute.
Mar 24
Paradigm Shift
Synergizes prompt optimization with policy optimization to overcome the 'sparse reward' problem in complex reasoning tasks.
Mar 24
Breaks Assumption
Demonstrates that the two standard mathematical interpretations of Temporal Difference (TD) error diverge in deep reinforcement learning.
Mar 24
Paradigm Shift
Identifies the 'golden subspace' for test-time adaptation, enabling highly efficient online model updates.
Mar 24
New Capability
Uses the chronological visitation order of medical scans as a self-supervised signal for disease progression modeling.
Mar 24
Efficiency Breakthrough
Achieves a 50x reduction in visual tokens for Video-LLMs while preserving over 90% of baseline performance.
Mar 24
Open Release
Open-sources a high-fidelity foundation model that jointly generates synchronized video and audio using a unified single-stream Transformer.
Mar 24
Efficiency Breakthrough
Introduces a learnable bridge between GELU and ReLU activations to enable deployment-friendly piecewise-linear networks.
Mar 24
Efficiency Breakthrough
Achieves a 75x parameter reduction in 3D medical image segmentation by hybridizing Mamba and Transformer modules.
Mar 24
Paradigm Shift
Decouples high-level reasoning from low-level motor control in robotics using a visual prompting interface.
Mar 24
Open Release
Releases the first large-scale family of learned sparse retrieval (LSR) models specialized for code (up to 8B parameters).
Mar 24