Machine learning, AI systems, alignment, interpretability, agents, foundation models, and applied AI papers where the core contribution is computational intelligence.
Efficiency Breakthrough
A cross-graph, tuning-free prompting framework for GNNs that achieves substantial gains on unseen graphs without retraining.
Paradigm Shift
Proposes a decision-centric architecture that separates signal estimation from control policy to make LLM system decisions explicit and inspectable.
New Capability
Enables zero-shot humanoid navigation in unseen environments using only 5 hours of human walking data and no robot-specific data.
New Capability
A white-box membership inference attack based on 'gradient-induced feature drift' that outperforms all existing confidence-based methods.
Efficiency Breakthrough
Self-Routing removes the need for learned routers in Mixture-of-Experts (MoE) by using hidden states directly for expert assignment.
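The summary does not pin down the assignment rule; the minimal PyTorch sketch below assumes routing by cosine similarity between each token's hidden state and fixed per-expert prototype vectors (the prototypes, top-k choice, and weighting are illustrative assumptions, not the paper's exact design).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SelfRoutedMoE(nn.Module):
    """MoE layer with no learned router: experts are selected by cosine
    similarity between the hidden state and fixed expert prototypes
    (an assumed mechanism; the paper's rule may differ)."""

    def __init__(self, d_model=256, n_experts=8, top_k=2):
        super().__init__()
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, 4 * d_model),
                          nn.GELU(),
                          nn.Linear(4 * d_model, d_model))
            for _ in range(n_experts)
        )
        # Fixed (non-trained) prototypes: one unit direction per expert.
        self.register_buffer(
            "prototypes", F.normalize(torch.randn(n_experts, d_model), dim=-1))
        self.top_k = top_k

    def forward(self, h):                                   # h: (batch, seq, d)
        sims = F.normalize(h, dim=-1) @ self.prototypes.T   # (batch, seq, E)
        weights, idx = sims.topk(self.top_k, dim=-1)        # nearest experts
        weights = weights.softmax(dim=-1)
        out = torch.zeros_like(h)
        for k in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = idx[..., k] == e                     # tokens routed to e
                if mask.any():
                    w = weights[..., k][mask].unsqueeze(-1)
                    out[mask] += w * expert(h[mask])
        return out
```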
Efficiency Breakthrough
Improves Qwen2.5-7B performance on AIME2024 by 137% through test-time iterative rethinking and majority-voted pseudo-labels.
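Only the voting mechanism is named above; here is a minimal sketch of one rethinking round, assuming self-consistency-style sampling, with a stub solver standing in for the LLM.

```python
import random
from collections import Counter

def rethink_round(solve, problem, n_samples=16):
    """One test-time rethinking round: sample several solutions and
    majority-vote the answers into a pseudo-label for the next round
    (loop structure assumed; only the voting is given in the summary)."""
    answers = [solve(problem) for _ in range(n_samples)]
    pseudo_label, votes = Counter(answers).most_common(1)[0]
    return pseudo_label, votes / n_samples

# Stub solver standing in for sampled LLM answers on one problem.
random.seed(0)
noisy_solve = lambda p: random.choice([42, 42, 42, 41, 43])  # mostly correct
answer, agreement = rethink_round(noisy_solve, "AIME-style question")
print(f"pseudo-label: {answer} (agreement {agreement:.0%})")
```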
Efficiency Breakthrough
Automates mathematical optimization modeling using reinforcement learning with solver-derived rewards instead of human process supervision.
Breaks Assumption
Reveals that many 'polysemantic' neurons in LLMs are actually firing for shared word forms (lexical) rather than compressed semantic concepts.
Paradigm Shift
Truth Anchoring (TAC) provides a post-hoc calibration method to align LLM uncertainty metrics with actual factual correctness.
Scaling Insight
Demonstrates that LLM judge panels follow power-law discovery curves, where panel size and persona diversity are critical for uncovering edge-case failures.
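As an illustration of what a power-law discovery curve looks like, here is a fit of failures ≈ a·n^b on synthetic panel data (the numbers are made up; only the fitting recipe is general).

```python
import numpy as np

# Hypothetical data: unique failures found by judge panels of size n.
panel_sizes = np.array([1, 2, 4, 8, 16, 32])
unique_failures = np.array([12, 19, 31, 48, 77, 120])

# Fit a power law  failures ≈ a * n^b  in log-log space.
b, log_a = np.polyfit(np.log(panel_sizes), np.log(unique_failures), deg=1)
a = np.exp(log_a)
print(f"fit: failures ≈ {a:.1f} * n^{b:.2f}")

# Extrapolate: expected discoveries for a 64-judge panel.
print(f"predicted at n=64: {a * 64 ** b:.0f}")
```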
Paradigm Shift
Identifies 'diversity collapse' in the popular GRPO reinforcement learning method and introduces MUPO to maintain broad reasoning paths.
New Capability
Introduces the first auto-regressive framework for Gaussian Splatting, enabling parallel, progressive next-scale 3D generation.
Efficiency Breakthrough
Optimizes LLM inference scheduling by treating output length as a heavy-tailed distribution rather than a point estimate.
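A small simulation of the scheduling intuition: under a heavy-tailed length distribution (lognormal assumed here; the paper's exact distributional choice is not given in the summary), a mean-based reservation fails a large fraction of requests, while a tail quantile does not.

```python
import numpy as np

rng = np.random.default_rng(0)

# Model output lengths as heavy-tailed rather than a point estimate.
lengths = rng.lognormal(mean=5.0, sigma=1.0, size=100_000)

point_estimate = lengths.mean()      # what a point-estimate scheduler reserves
p95 = np.quantile(lengths, 0.95)     # a tail-aware reservation

print(f"mean length:     {point_estimate:8.0f} tokens")
print(f"95th percentile: {p95:8.0f} tokens")

# Fraction of requests a mean-based reservation would truncate or preempt:
print(f"requests exceeding mean: {(lengths > point_estimate).mean():.1%}")
```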
Efficiency Breakthrough
Introduces negative early exit and adaptive boosting to make Monte Carlo Tree Search (MCTS) practical for real-time LLM inference.
Efficiency Breakthrough
Pushes dataset distillation to 60% accuracy on ImageNet-1K using only a handful of synthetic images.
Efficiency Breakthrough
Enables 'Elastic Inference' where a single trained model can be converted to multiple lower-precision formats on-the-fly without retraining.
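One common retraining-free conversion is post-hoc per-channel int8 quantization; the sketch below shows that mechanism as an assumed stand-in for the paper's on-the-fly format changes.

```python
import numpy as np

def to_int8(w_fp32):
    """Per-output-channel symmetric int8 quantization of a trained
    weight matrix, applied after training (no fine-tuning)."""
    scale = np.abs(w_fp32).max(axis=1, keepdims=True) / 127.0
    q = np.round(w_fp32 / scale).astype(np.int8)
    return q, scale

def from_int8(q, scale):
    return q.astype(np.float32) * scale

w = np.random.randn(4, 8).astype(np.float32)   # stand-in trained weight
q, s = to_int8(w)
w_hat = from_int8(q, s)
print("max abs error:", np.abs(w - w_hat).max())
```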
New Capability
Proposes a parameter-efficient LLM adaptation method that enables rapid specialization on non-stationary streams while preventing catastrophic forgetting.
Paradigm Shift
Replaces manual rubric-tuning for synthetic data with an automated gradient-guided optimization framework based on influence estimation.
New Capability
Rebuilds the Agent-Computer Interaction (ACI) stack for scientific discovery, addressing the fragility of JSON tool calling and execution sandboxes.
Efficiency Breakthrough
Scales imitation learning data efficiency by generating synthetic 'multi-view' demonstrations from a single expert trajectory.
New Capability
Introduces SIGN, a framework capable of discovering governing symbolic equations for networked systems with over 100,000 nodes.
Breaks Assumption
Discovers 'Quality Corruption,' an adversarial failure mode in which accuracy collapses while detection counts remain stable, showing that robustness is substrate-dependent.
Efficiency Breakthrough
Proposes Physical Imitation Learning (PIL) to offload up to 87% of a control policy's mechanical power to passive robotic joints.
Open Release
OmniVoice is an open-source TTS model scaling to over 600 languages using a novel diffusion language model architecture.
New Capability
TTA-Vid enables video reasoning models to adapt to new domains at test-time using label-free reinforcement learning on a single sample.
Paradigm Shift
Introduces HiLL, a framework that jointly trains a 'hinter' and 'reasoner' to prevent advantage collapse in reinforcement learning for hard tasks.
Scaling Insight
Establishes a three-dimensional scaling law for RAG-pretraining, modeling the optimal data budget allocation between model parameters, tokens, and retrieval store size.
Efficiency Breakthrough
CircuitProbe identifies reasoning circuits in Transformers 1000x faster than brute-force methods and predicts the efficacy of layer duplication.
Paradigm Shift
LangMARL introduces agent-level credit assignment and policy gradient evolution directly in the natural language space for multi-agent coordination.
Breaks Assumption
Provides the first controlled study of Silent Data Corruption (SDC) in GPUs and its catastrophic impact on LLM pretraining stability.
Efficiency Breakthrough
Spectral Compact Training (SCT) enables training 70B-parameter architectures on consumer hardware like the Steam Deck (8GB RAM) via permanent SVD factors.
Paradigm Shift
Stochastic Attention achieves a global receptive field in O(log n) layers by using randomized routing inspired by the fruit fly connectome.
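The O(log n) claim is the standard expander-graph mixing argument; a quick simulation, assuming each token attends to k fresh random tokens per layer (the paper's routing scheme may differ), shows the receptive field saturating in roughly log2(n) layers.

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 512, 4                     # tokens, random attention targets per token

reach = np.eye(n, dtype=bool)     # reach[i, j]: j's info has reached token i
layers = 0
while not reach.all():
    layers += 1
    targets = rng.integers(0, n, size=(n, k))   # fresh random routing per layer
    new_reach = reach.copy()
    for i in range(n):
        for j in targets[i]:
            new_reach[i] |= reach[j]            # i aggregates what j has seen
    reach = new_reach

print(f"full receptive field after {layers} layers (log2(n) ≈ {np.log2(n):.1f})")
```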
New Capability
ThoughtSteer demonstrates the first successful backdoor attack on continuous latent reasoning models that leave no token-based audit trail.
Breaks Assumption
Mechanistic analysis reveals that LLMs fail at character counting not because they lack the information, but because 'negative circuits' in the final layers actively suppress the correct answer.
Efficiency Breakthrough
Achieves O(1) classification complexity for multimillion-class problems by leveraging predefined vector systems in the latent space.
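One way classification cost can become independent of the class count is to decode a predefined compositional code digit by digit; the sketch below assumes base-K positional codes (the paper's 'predefined vector systems' may well differ).

```python
import numpy as np

N_CLASSES, BASE, DIGITS = 1_000_000, 100, 3     # 100^3 >= 1e6 codes

def encode(label):
    """Class index -> fixed base-BASE digit code (predefined, not learned)."""
    return [(label // BASE**d) % BASE for d in range(DIGITS)]

def decode(digit_logits):
    """Per-digit argmax -> class index. Cost depends on BASE * DIGITS,
    not on N_CLASSES."""
    digits = [int(np.argmax(l)) for l in digit_logits]
    return sum(d * BASE**i for i, d in enumerate(digits))

label = 423_517
# Stand-in 'model outputs': logits peaked at the true digits.
logits = [np.eye(BASE)[d] for d in encode(label)]
assert decode(logits) == label
print("decoded:", decode(logits))
```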
Paradigm Shift
Routing-Free MoE replaces centralized routing with individual expert-level activation, eliminating the need for Softmax and Top-K load balancing.
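A minimal sketch of the idea as read from the summary: each expert carries its own sigmoid gate and fires independently, with no shared softmax and no top-k (the gate form and threshold are assumptions).

```python
import torch
import torch.nn as nn

class ExpertGatedLayer(nn.Module):
    """Each expert decides independently (sigmoid gate) whether to fire;
    no centralized router (assumed reading of the summary)."""

    def __init__(self, d_model=256, n_experts=8, threshold=0.5):
        super().__init__()
        self.experts = nn.ModuleList(
            nn.Linear(d_model, d_model) for _ in range(n_experts))
        self.gates = nn.ModuleList(
            nn.Linear(d_model, 1) for _ in range(n_experts))
        self.threshold = threshold

    def forward(self, h):                            # h: (batch, d_model)
        out = torch.zeros_like(h)
        for gate, expert in zip(self.gates, self.experts):
            g = torch.sigmoid(gate(h))               # (batch, 1), per-expert
            active = (g > self.threshold).squeeze(-1)
            if active.any():                         # sparse: skip idle experts
                out[active] += g[active] * expert(h[active])
        return out
```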
Efficiency Breakthrough
Molecular Memory allows MoE systems to recover previously learned domain expertise 9-11x faster by utilizing cost-penalized fitness metrics that preserve dormant experts.
Efficiency Breakthrough
OBD-LLM uses second-order Hessian information to achieve 20-40% better low-rank decomposition accuracy than the current state-of-the-art SVD-LLM.
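For context, the activation-aware SVD baseline (as in SVD-LLM) minimizes output error rather than weight error by whitening with the input second-moment matrix; the paper layers Hessian information on top of this idea. A sketch of that baseline:

```python
import numpy as np

def weighted_low_rank(W, X, rank):
    """Low-rank factorization of W minimizing ||W X - W_hat X|| instead of
    plain weight error. W: (out, in), X: (in, samples)."""
    C = X @ X.T / X.shape[1]                       # input second-moment matrix
    L = np.linalg.cholesky(C + 1e-6 * np.eye(C.shape[0]))
    U, S, Vt = np.linalg.svd(W @ L, full_matrices=False)
    # Truncated SVD of the whitened weight, then undo the whitening.
    return (U[:, :rank] * S[:rank]) @ Vt[:rank] @ np.linalg.inv(L)

rng = np.random.default_rng(0)
W = rng.standard_normal((64, 128))
X = rng.standard_normal((128, 1000))
W_hat = weighted_low_rank(W, X, rank=32)
err = np.linalg.norm((W - W_hat) @ X) / np.linalg.norm(W @ X)
print(f"relative output error at rank 32: {err:.3f}")
```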
Paradigm Shift
Policy Improvement Reinforcement Learning (PIRL) shifts the training objective from reward maximization to explicit maximization of policy progress across iterations.
Efficiency Breakthrough
PixelPrune identifies and removes pixel-level redundancy before the Vision Transformer encoder, delivering up to 4.2x inference speedup for high-resolution VLM tasks.
New Capability
An autonomous research pipeline discovered a lifelong multimodal memory framework by diagnosing and fixing its own architectural bugs and data pipeline issues.
Efficiency Breakthrough
EmbedPart achieves a 100x speedup over Metis for graph partitioning by clustering node embeddings rather than operating on raw graph structures.
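The mechanism reduces partitioning to clustering; a minimal sketch with k-means over stand-in embeddings (the embedding method and any balance constraint are left open by the summary).

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

# Stand-in node embeddings (in practice: spectral, DeepWalk, GNN, ...;
# the summary does not say which embedding the paper uses).
n_nodes, dim, n_parts = 10_000, 64, 8
emb = rng.standard_normal((n_nodes, dim)).astype(np.float32)

# Partition = cluster the embeddings; no traversal of raw edges needed.
parts = KMeans(n_clusters=n_parts, n_init=4, random_state=0).fit_predict(emb)

sizes = np.bincount(parts, minlength=n_parts)
print("partition sizes:", sizes)   # balance would need an extra constraint
```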
Efficiency Breakthrough
A lightweight probing method predicts LLM downstream task performance from internal representations during training, reducing evaluation latency from one hour to three minutes.
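A toy version of the probing idea: fit a cheap linear probe from pooled hidden states to the downstream score each checkpoint eventually achieves (data synthetic; the probe and features here are assumptions).

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Stand-ins: pooled hidden states from 200 training checkpoints, plus
# the downstream score each checkpoint later obtained.
n_ckpts, d = 200, 512
H = rng.standard_normal((n_ckpts, d))
true_w = rng.standard_normal(d) / np.sqrt(d)
scores = H @ true_w + 0.1 * rng.standard_normal(n_ckpts)

H_tr, H_te, y_tr, y_te = train_test_split(H, scores, random_state=0)
probe = Ridge(alpha=1.0).fit(H_tr, y_tr)   # milliseconds, not a full eval run
print(f"held-out R^2: {probe.score(H_te, y_te):.2f}")
```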
Efficiency Breakthrough
Canonical Correlation Analysis (CCA) can reduce image representation dimensionality by 75% while improving downstream performance through cross-model agreement.
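A runnable sketch of the recipe with scikit-learn's CCA on synthetic paired representations, keeping only the 64 directions (of 256) on which two models agree.

```python
import numpy as np
from sklearn.cross_decomposition import CCA

rng = np.random.default_rng(0)

# Representations of the same 1,000 images from two different models
# (synthetic here, sharing a low-dimensional common signal).
n, d = 1_000, 256
shared = rng.standard_normal((n, 64))
X = shared @ rng.standard_normal((64, d)) + 0.5 * rng.standard_normal((n, d))
Y = shared @ rng.standard_normal((64, d)) + 0.5 * rng.standard_normal((n, d))

# Keep only directions the two models agree on: 256 -> 64 dims (75% cut).
cca = CCA(n_components=64, max_iter=1000).fit(X, Y)
X_c, Y_c = cca.transform(X, Y)
print("reduced shape:", X_c.shape)
```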
New Capability
WARP provides provably guaranteed repairs for the inner layers of Transformers, overcoming a limitation of previous methods, which were restricted to the final layer.
Paradigm Shift
Proposes dense point trajectories as universal 'visual tokens' for behavior that generalize across different species and non-rigid objects.
Open Release
Releases the GPT-NL Public Corpus, the largest permissively licensed (CC-BY) Dutch-first dataset for LLM pre-training.
Efficiency Breakthrough
Decouples weather forecasting from spatial resolution by using Flow Matching to super-resolve coarse trajectories as a post-processing step.
New Capability
Solves highly intractable (#P-hard) multi-objective optimization problems with tight approximation guarantees using a novel SAT-oracle approach.
New Capability
Demonstrates that covert collusion between multi-agent LLM systems can be detected zero-shot using internal model activations.