Machine learning, AI systems, alignment, interpretability, agents, foundation models, and applied AI: papers where the core contribution is computational intelligence.
New Capability
Quantifies near-verbatim data extraction risk in LLMs at 1/5000th the computational cost of standard Monte Carlo methods.
New Capability
Enables graph-based retrieval and reranking for RAG without the maintenance overhead of a knowledge graph.
Breaks Assumption
Reduces visual tokens in robot policies by 78% by using inter-layer rank consistency instead of simple attention magnitude.
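One plausible reading of "inter-layer rank consistency" can be sketched in a few lines: rank every visual token within each layer by its attention score, then keep tokens that are both highly and stably ranked across layers. The function name, scoring rule, and keep ratio below are illustrative assumptions, not the paper's implementation.

```python
def prune_by_rank_consistency(layer_scores, keep_ratio=0.22):
    """layer_scores[l][t] = attention received by token t at layer l.
    Returns indices of kept tokens (hypothetical sketch, not the paper's rule)."""
    num_layers, num_tokens = len(layer_scores), len(layer_scores[0])
    # Rank of each token within each layer (0 = least attended).
    ranks = [[0] * num_tokens for _ in range(num_layers)]
    for l, scores in enumerate(layer_scores):
        order = sorted(range(num_tokens), key=lambda t: scores[t])
        for rank, t in enumerate(order):
            ranks[l][t] = rank

    def mean(xs):
        return sum(xs) / len(xs)

    stats = []
    for t in range(num_tokens):
        rs = [ranks[l][t] for l in range(num_layers)]
        m = mean(rs)
        std = mean([(r - m) ** 2 for r in rs]) ** 0.5
        # Favor tokens that are highly *and* stably ranked across layers,
        # rather than trusting a single layer's attention magnitude.
        stats.append((m - std, t))
    k = max(1, int(keep_ratio * num_tokens))
    return sorted(t for _, t in sorted(stats)[-k:])
```

With the default ratio of 0.22, this drops 78% of the tokens, matching the reduction claimed in the summary.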
Breaks Assumption
This paper demonstrates that the order of training examples alone can encode information not present in any individual example, allowing models to bypass established sample complexity bounds.
Scaling Insight
A systematic study reveals that grokking is not an architectural property of Transformers but an interaction between weight decay and optimization stability.
Paradigm Shift
The 'Reasoning Contamination Effect' shows that Chain-of-Thought (CoT) reasoning actually disrupts a model's internal confidence signal, leading to poorer calibration.
Breaks Assumption
Large Language Models process instructions as social acts rather than technical specifications, making 'imperative mood' prompts behave inconsistently across different languages.
New Capability
GeoNDC introduces a queryable neural data cube that compresses 20 years of planetary satellite data by 95x while allowing on-demand continuous-time reconstruction.
Efficiency Breakthrough
Sparton is a specialized Triton kernel that solves the memory bottleneck of Learned Sparse Retrieval (LSR) models like SPLADE.
New Capability
Intern-S1-Pro is the first trillion-parameter scientific multimodal foundation model, outperforming proprietary models on specialized scientific reasoning.
New Capability
AirVLA successfully transfers manipulation-trained Vision-Language-Action (VLA) models to underactuated aerial robots using a payload-aware guidance mechanism.
Paradigm Shift
R1Sim applies the 'Reasoning-RL' paradigm (popularized by DeepSeek-R1) to traffic simulation, achieving superior safety and diversity in multi-agent behaviors.
Paradigm Shift
SIGMA resolves 'trajectory divergence' in molecular string generation by enforcing geometric symmetry recognition through contrastive learning.
Efficiency Breakthrough
A fully differentiable agent-based traffic simulator enables calibration and control of million-vehicle networks 173x faster than real-time.
Efficiency Breakthrough
GIFT is a training-free frame selection framework that uses 'Directed Diversity' to boost Video-LLM performance by up to 12.5%.
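One way to read "Directed Diversity" is a greedy, training-free selection that balances each frame's relevance to the query against its redundancy with frames already chosen. This is an assumed interpretation; GIFT's actual criterion, names, and weighting may differ.

```python
def select_frames(frame_embs, query_emb, k=4, lam=0.5):
    """Greedily pick k frames, trading query relevance against redundancy.
    A hypothetical reading of 'Directed Diversity', not the paper's exact rule."""
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

    selected = []
    remaining = list(range(len(frame_embs)))
    while remaining and len(selected) < k:
        def score(i):
            relevance = dot(frame_embs[i], query_emb)
            # Diversity term: distance to the closest already-selected frame.
            diversity = min((dist(frame_embs[i], frame_embs[j]) for j in selected),
                            default=0.0)
            return relevance + lam * diversity
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected
```

With a strong diversity weight, a near-duplicate of an already-selected frame loses to a less relevant but novel one, which is the intuition behind avoiding redundant frames in a Video-LLM's context window.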
New Capability
Z-Erase introduces the first concept erasure method for single-stream diffusion transformers, preventing generation collapse in new unified architectures.
Breaks Assumption
This paper demonstrates that Sparse Autoencoder (SAE) features in multimodal models are not modular, challenging the core assumption of intervention-based steering.
Paradigm Shift
Pixelis shifts VLM reasoning from static description to a 'reasoning in pixels' agentic paradigm that learns via an executable tool grammar.
Paradigm Shift
The AE4E paradigm proposes a 'Social Contract' for multi-agent economies, replacing individual model alignment with an institutional 'Separation of Power'.
Scaling Insight
MSRL scales multimodal reward modeling by transferring reasoning capabilities from text to vision-language tasks without requiring new multimodal preference data.
New Capability
SEVerA enables the synthesis of self-evolving agents with formal guarantees by combining LLM planning with first-order logic rejection samplers.
Paradigm Shift
Using Signal Detection Theory, this work proves that LLM calibration and 'metacognitive efficiency' (knowing what you know) are distinct, dissociable capacities.
Efficiency Breakthrough
Photon enables efficient 3D medical volume understanding through adaptive token scheduling and a novel 'gradient restoration' backpropagation rule.
Paradigm Shift
Vision Hopfield Memory Networks (V-HMN) present a brain-inspired alternative to Transformers and Mamba using hierarchical associative memory mechanisms.
New Capability
Trace2Skill distills lessons from across a 'parallel fleet' of execution trajectories into a unified, conflict-free skill directory for LLM agents.
Efficiency Breakthrough
Pruning low-utility prompts before RL rollouts allows for 10x more efficient training of large reasoning models.
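A minimal sketch of the idea, under one common assumption about "utility": prompts the model already always solves (or never solves) produce near-zero advantage signal in group-relative RL, so their rollouts can be skipped. The thresholds and the utility estimate below are illustrative, not the paper's method.

```python
def prune_prompts(pass_rates, low=0.0, high=1.0):
    """Keep only prompts whose current pass rate is strictly between low and
    high. Degenerate prompts (always or never solved) contribute near-zero
    policy-gradient signal, so skipping their rollouts saves compute.
    Generic utility-pruning sketch; the paper's utility measure may differ."""
    return [i for i, p in enumerate(pass_rates) if low < p < high]

# Prompts 0 (never solved) and 2 (always solved) are pruned before rollouts.
keep = prune_prompts([0.0, 0.3, 1.0, 0.7])
```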
Breaks Assumption
Safety alignment does not have to be a 'tax' on performance; it can actually improve mathematical reasoning accuracy.
New Capability
Enables long video generation from short-video diffusion models without any additional training or fine-tuning.
New Capability
Achieves training-free 6D pose estimation for unseen surgical instruments using only a CAD model as prior knowledge.
New Capability
Offline Decision Transformers can now synthesize strategies that surpass the classical heuristics they were trained on for the Traveling Salesman Problem.
Efficiency Breakthrough
Simple image sharpening serves as a surrogate-free, zero-cost preemptive defense against adversarial attacks.
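The preprocessing step itself is a few lines. Below is a generic unsharp-mask sharpener for a grayscale image, sketched here only to show what "simple image sharpening" means as an input filter; the paper's exact kernel and strength are not given in the summary.

```python
def sharpen(img, alpha=1.0):
    """Unsharp-mask sharpening as a zero-cost input preprocessor.
    img: 2-D list of floats (grayscale). Amplifying high-frequency detail
    is what the summary claims blunts low-amplitude adversarial noise."""
    h, w = len(img), len(img[0])

    def blur_at(y, x):
        # 3x3 mean filter, clipped at the image border.
        vals = [img[yy][xx]
                for yy in range(max(0, y - 1), min(h, y + 2))
                for xx in range(max(0, x - 1), min(w, x + 2))]
        return sum(vals) / len(vals)

    # out = img + alpha * (img - blur): boosts edges, leaves flat regions alone.
    return [[img[y][x] + alpha * (img[y][x] - blur_at(y, x)) for x in range(w)]
            for y in range(h)]
```

"Surrogate-free" and "preemptive" in the summary mean the defense needs no access to the attacker's model and is applied before inference, unconditionally, to every input.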
Paradigm Shift
Representing GPS trajectories as hyperspectral images enables multi-month dense anomaly detection that was previously computationally intractable.
New Capability
A foundation model for gait transforms 3D skeletal motion into a systemic biosignal for multi-system health monitoring.
Efficiency Breakthrough
A new tokenization architecture reduces the 'Token Tax' for complex non-Latin scripts by over 60%.
Breaks Assumption
Sparse Autoencoder analysis reveals that weight pruning counter-intuitively preserves rare features better than frequent ones.
New Capability
LLMs can be fine-tuned to act as their own 'Z-token' compressors, achieving 18x text reduction without losing reconstruction fidelity.
Efficiency Breakthrough
GlowQ introduces group-shared low-rank approximations to speed up quantized LLM inference by up to 37%.
New Capability
Defines 'Reasoning Safety' as a new security dimension and introduces a real-time monitor to detect logic-chain hijackings.
Breaks Assumption
Cross-model disagreement (CMP/CME) provides a highly effective, label-free signal for detecting confident hallucinations.
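The core signal can be illustrated with a toy disagreement rate: sample paired answers from two different models and flag claims on which they diverge. This is a simplified stand-in for the paper's CMP/CME metrics, whose exact definitions are not given in the summary.

```python
def disagreement_score(answers_a, answers_b):
    """Label-free hallucination signal: fraction of paired samples on which two
    models give different answers. Simplified stand-in for CMP/CME."""
    pairs = list(zip(answers_a, answers_b))
    disagree = sum(1 for a, b in pairs if a.strip().lower() != b.strip().lower())
    return disagree / len(pairs)

# A well-grounded fact is reproduced identically by both models, while a
# confidently hallucinated one tends to differ across them.
flag = disagreement_score(["Paris", "Paris"], ["Paris", "Lyon"])
```

The appeal is that neither model needs labels or calibration: disagreement alone separates shared knowledge from model-specific confabulation.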
New Capability
Introduces a training-free pipeline for pixel-level video anomaly detection that achieves a 5x improvement in object-level accuracy.
New Capability
A model-agnostic framework to extract the model-implied causal structure from any trained temporal predictor.
Efficiency Breakthrough
Reduces LLM inference energy by 40% (and up to 81%) using a distillation-based router to skip unnecessary reasoning steps.
New Capability
Detects when object detectors fail to see safety-critical objects by measuring semantic misalignment with foundation model embeddings.
Breaks Assumption
Challenges the 'Golden Data' requirement for video generation by showing that imbalanced data can outperform high-quality data through timestep-aware training.
Efficiency Breakthrough
Unlocks full-body musculoskeletal humanoid training by achieving order-of-magnitude speedups via massively parallel GPU simulation.
Paradigm Shift
Fixes the inherent instability of on-policy distillation in LLMs using local support matching and top-p rollout sampling.
Efficiency Breakthrough
Achieves 45% performance gains in robotics using 5-10x fewer real-world demonstrations through high-dimensional factorization.
Paradigm Shift
Enables LMMs to 'think' using compact latent visual representations rather than verbalizing everything into text.
New Capability
Translates a single natural language sentence into a validated, hardware-specific computational imaging system design.
Efficiency Breakthrough
Achieves up to 4.7x speedup for Diffusion LLMs using a training-free self-speculative decoding framework.