Papers on machine learning, AI systems, alignment, interpretability, agents, foundation models, and applied AI, where the core contribution is in computational intelligence.
New Capability
Introduces a framework for LLMs to self-improve reasoning in specific domains by autonomously mining and constructing training environments directly from the open web.
Paradigm Shift
Establishes a formal mathematical equivalence between Classifier-Free Guidance (CFG) and alignment-based objectives, allowing for CFG-like quality without inference-time overhead.
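For context, the inference-time overhead in question comes from the standard CFG combination, which needs two forward passes (conditional and unconditional) per denoising step. A minimal sketch of that baseline quantity, as our own illustration rather than the paper's formulation:

```python
import numpy as np

def cfg_noise(eps_cond: np.ndarray, eps_uncond: np.ndarray, w: float) -> np.ndarray:
    """Standard classifier-free guidance: extrapolate the conditional noise
    prediction away from the unconditional one by guidance scale w.
    Computing both eps_cond and eps_uncond is the 2x inference cost that an
    alignment-style training objective could bake into a single pass."""
    return eps_uncond + w * (eps_cond - eps_uncond)

eps_c = np.array([1.0, 2.0])
eps_u = np.zeros(2)
print(cfg_noise(eps_c, eps_u, 1.0))  # w = 1 recovers the plain conditional prediction
print(cfg_noise(eps_c, eps_u, 3.0))  # w > 1 amplifies the conditioning signal
```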
Efficiency Breakthrough
Proposes an agentic architecture that achieves O(1) token complexity relative to dataset size by strictly separating intent parsing from deterministic data execution.
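The O(1) claim follows from the data never entering the context window: the model only emits a small structured plan, and a deterministic engine executes it. A toy sketch of the pattern (the `parse_intent` stub is a hypothetical stand-in for the LLM call, not the paper's API):

```python
def parse_intent(question: str) -> dict:
    """Stand-in for an LLM call that emits a structured query plan.
    A real system would prompt a model to produce this JSON."""
    return {"op": "sum", "column": "price"}

def answer(question: str, table: list) -> float:
    """Token cost is fixed by the intent-parsing step alone and is
    independent of len(table): the rows are processed deterministically,
    never serialized into the model's context."""
    plan = parse_intent(question)                # LLM call: O(1) tokens
    vals = [row[plan["column"]] for row in table]  # deterministic execution
    return {"sum": sum, "max": max, "min": min}[plan["op"]](vals)

print(answer("What is the total price?", [{"price": 3.0}, {"price": 4.5}]))  # 7.5
```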
Efficiency Breakthrough
Achieves high-fidelity diffusion generation in just 3 steps by distilling layer-wise time embeddings from reference trajectories.
Breaks Assumption
Finds that standard instruction tuning with LoRA often fails to improve (and can even degrade) verifiable instruction-following, despite gains on broader benchmarks.
Paradigm Shift
Shifts symbolic regression from discrete genetic search to a continuous, embedding-driven optimization paradigm.
Scaling Insight
Reveals that RLVR-driven reasoning improvements in LLMs are the result of highly sparse changes to a tiny fraction of 'critical' token distributions.
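One simple way to quantify this kind of sparsity is to measure, position by position, how far the tuned model's next-token distribution has moved from the base model's, and count the fraction that moved at all. A hedged sketch of that measurement (our illustration; the threshold `tau` and the exact metric are assumptions, not the paper's):

```python
import numpy as np

def per_token_kl(p_base: np.ndarray, p_rl: np.ndarray) -> np.ndarray:
    """KL(p_rl || p_base) at each position; each row is a next-token
    distribution over the vocabulary (assumed strictly positive)."""
    return np.sum(p_rl * (np.log(p_rl) - np.log(p_base)), axis=-1)

def sparsity(p_base: np.ndarray, p_rl: np.ndarray, tau: float = 0.05) -> float:
    """Fraction of positions whose distribution moved more than tau.
    The finding is that this fraction is tiny: a few 'critical' tokens."""
    return float(np.mean(per_token_kl(p_base, p_rl) > tau))

p_base = np.array([[0.5, 0.5], [0.5, 0.5], [0.5, 0.5]])
p_rl   = np.array([[0.5, 0.5], [0.5, 0.5], [0.9, 0.1]])  # only the last position moved
print(sparsity(p_base, p_rl))
```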
Breaks Assumption
Identifies that the full source code (skill body) of a tool is the primary signal for LLM tool selection, far outweighing the importance of descriptions or metadata.
Paradigm Shift
Replaces standard autoregressive document OCR with a parallel diffusion-based denoising framework.
Efficiency Breakthrough
Introduces a verifier that operates directly on the latent hidden states of Diffusion Transformers, avoiding the need for costly pixel-space decoding during inference-time scaling.
Paradigm Shift
Demonstrates that Hebbian plasticity can induce emergent attractor dynamics in robot controllers, enabling rapid adaptation without backpropagation.
Breaks Assumption
Uncovers that neural operator digital twins are acutely vulnerable to sparse adversarial perturbations on boundary conditions that bypass standard anomaly detection.
New Capability
Leverages unstructured clinical notes during training to boost the performance of models that are deployed using only structured EHR data.
Scaling Insight
Shows that the mass of bipedal robots scales with the square of leg length, rather than cubically as in biological systems.
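The practical gap between the two exponents is easy to see numerically: doubling leg length quadruples a robot's predicted mass under the quadratic law, versus an eightfold increase under biological cubic scaling. A quick worked example (the numbers are illustrative, not from the paper):

```python
def scaled_mass(m0: float, L0: float, L: float, exponent: float) -> float:
    """Mass under a power-law scaling m ∝ L**exponent, anchored at (m0, L0)."""
    return m0 * (L / L0) ** exponent

# A 10 kg biped with 0.5 m legs, scaled to 1.0 m legs:
print(scaled_mass(10.0, 0.5, 1.0, 2))  # 40.0 kg under robotic (quadratic) scaling
print(scaled_mass(10.0, 0.5, 1.0, 3))  # 80.0 kg under biological (cubic) scaling
```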
New Capability
CanViT is the first task-agnostic active-vision foundation model that reconstructs scenes using low-resolution 'glimpses' with 19.5x fewer FLOPs than existing models.
Breaks Assumption
A large-scale study of 12 reasoning models reveals that internal 'thinking' processes frequently recognize deceptive hints while the final output remains sycophantic.
Paradigm Shift
Instead of using top-activating examples, this method steers Sparse Autoencoder (SAE) features in Vision-Language Models to let the model describe its own internal visual features.
Paradigm Shift
DeIllusionLLM introduces task-level autoregressive reasoning to prevent LLMs from hallucinating answers to ill-posed or faulty scientific questions.
New Capability
CAM3R is a camera-agnostic 3D reconstruction model that handles fisheye, panoramic, and pinhole imagery without requiring prior calibration.
Paradigm Shift
Inter-Layer Structural Encoders (ILSE) use Cayley graphs to aggregate features from all internal LLM layers, improving accuracy by up to 44% over final-layer-only predictions.
Open Release
Introduces the first high-performing open-source metric for per-sample AI music quality evaluation.
Open Release
Provides a massive 2.5M image-to-TikZ dataset and the first instruction-augmented dataset for geometric visual reasoning.
New Capability
A new statistical test that reliably detects whether a dataset was NOT used in an LLM's training corpus.
Paradigm Shift
Introduces Dual Q-DM, the first non-adversarial imitation learning method theoretically guaranteed to eliminate compounding errors.
Scaling Insight
A quantitative model that predicts the performance gain of merging independent LLM specialists before committing compute.
Breaks Assumption
Proves that logic and lookup-table (LUT) based neural networks are structurally more resilient to hardware bit-flips than standard architectures.
Scaling Insight
Identifies the 'Caterpillar Tree' as the theoretically optimal structure for test-time computation and backtracking in LLMs.
New Capability
ABSTRAL automates the design of multi-agent systems by treating architectures as evolving, inspectable natural-language documents.
Breaks Assumption
Frontier models' reasoning steps are largely 'decorative' and do not causally determine the final answer in most tasks.
Paradigm Shift
Moving beyond coarse reward signals, this paper introduces token-level policy optimization for multimodal reasoning.
New Capability
UniQueR reconstructs full 3D scenes (including occluded areas) from unposed images in a single forward pass.
Scaling Insight
Persistent structural memory in neural networks is fundamentally limited by the instability of jointly-learned coordinate systems.
New Capability
Deep semi-parametric models allow for the instant deletion of training data from a model without retraining or parameter updates.
Efficiency Breakthrough
A 0.26M parameter model using continuous dynamics outperforms 27M parameter recursive models on complex logic tasks like Sudoku-Extreme.
Breaks Assumption
Standard confidence calibration is structurally biased when ground truth labels are ambiguous or annotators disagree.
Efficiency Breakthrough
Agile-VLA enables high-frequency robot control on edge devices by decoupling perception from action through implicit affordance anchoring.
Efficiency Breakthrough
EchoKV introduces a reversible KV cache compression scheme that allows LLMs to switch back to full-precision inference on-demand.
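The key property is reversibility: a low-precision cache serves the fast path, while enough information is retained to restore full precision on demand. A toy sketch of one way to get that property, keeping an int8 approximation plus its float residual (our own illustration of reversibility, not EchoKV's scheme; in practice the residual would live in cheaper, slower memory):

```python
import numpy as np

def compress(kv: np.ndarray, scale: float = 127.0):
    """Hypothetical reversible compression: an int8 fast-path approximation
    plus the residual needed to reconstruct the original values."""
    q = np.clip(np.round(kv * scale), -127, 127).astype(np.int8)
    residual = kv - q.astype(np.float32) / scale
    return q, residual

def restore(q: np.ndarray, residual: np.ndarray, scale: float = 127.0) -> np.ndarray:
    """Switch back to full precision on demand."""
    return q.astype(np.float32) / scale + residual
```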
Efficiency Breakthrough
ForestPrune achieves up to 90% token reduction in video MLLMs with minimal accuracy loss using a training-free spatial-temporal forest modeling approach.
Scaling Insight
Theoretical analysis reveals that the efficiency benefits of low-dimensional data structures for diffusion models diminish significantly when the data manifold is non-linear.
Paradigm Shift
This paper moves LLMs from point predictions to set-valued predictions with rigorous statistical coverage guarantees.
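Set-valued prediction with coverage guarantees is the territory of conformal prediction. A minimal split-conformal sketch for a classifier (a generic textbook recipe, assumed rather than taken from the paper): calibrate a threshold from held-out scores, then return every label that clears it, which covers the true label with probability at least 1 − α.

```python
import numpy as np

def conformal_threshold(cal_probs: np.ndarray, cal_labels: np.ndarray,
                        alpha: float = 0.1) -> float:
    """Split-conformal calibration: nonconformity is 1 - p(true label);
    the threshold is the ceil((n+1)(1-alpha))/n empirical quantile."""
    n = len(cal_labels)
    scores = 1.0 - cal_probs[np.arange(n), cal_labels]
    q = np.ceil((n + 1) * (1 - alpha)) / n
    return float(np.quantile(scores, min(q, 1.0), method="higher"))

def prediction_set(probs: np.ndarray, qhat: float) -> list:
    """All labels whose nonconformity 1 - p stays below the threshold;
    such sets contain the true label with probability >= 1 - alpha."""
    return [i for i, p in enumerate(probs) if 1.0 - p <= qhat]
```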
New Capability
WorldMesh generates consistent, large-scale 3D worlds by populating a geometric mesh scaffold with image diffusion-derived content.
Breaks Assumption
Graph Foundation Models (GFMs) are shown to fail with fixed architectural backbones, motivating inference-time architectural adaptivity instead.
Scaling Insight
Access to conversational memory allows an 8B model to outperform a 235B model on user-specific queries while reducing inference costs by 96%.
Breaks Assumption
A rigorous evaluation shows that simple Probabilistic Circuits often outperform complex diffusion-based models for tabular data generation at a fraction of the cost.
Efficiency Breakthrough
Optimizing autoregressive image models with Group Relative Policy Optimization (GRPO) achieves competitive quality without the 2x inference cost of Classifier-Free Guidance.
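The "Group Relative" part of GRPO refers to computing advantages by standardizing rewards within the group of samples drawn for a single prompt, so no learned value model is needed. A minimal sketch of that advantage computation (the standard GRPO recipe; its application to image models here is the paper's contribution, not shown):

```python
import numpy as np

def grpo_advantages(rewards: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    """GRPO advantages: rewards standardized within one prompt's group of
    samples. Above-average samples get positive advantage, below-average
    negative, with no critic network required."""
    return (rewards - rewards.mean()) / (rewards.std() + eps)

adv = grpo_advantages(np.array([1.0, 0.0, 0.5, 0.5]))
print(adv)  # group-centered: the advantages sum to ~0
```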
New Capability
Identifies that MLLMs fail to perceive visual illusions due to a high-frequency attention bias and provides a plug-and-play fix that boosts accuracy from 13% to 84%.
New Capability
Polaris introduces a 'Gödel Agent' framework that allows 7B-parameter models to recursively improve their own policies through auditable code patches.
Efficiency Breakthrough
DILLO enables 14x faster safety-critical agent steering by predicting action consequences from latent states instead of heavy visual simulations.
Breaks Assumption
Exposes a major flaw in medical super-resolution research where models trained on downsampled data fail to recover actual lost structures in real low-resolution scans.
Paradigm Shift
Connects stochastic optimal control to the Schrödinger equation, enabling analytic solutions for long-horizon problems that previously scaled exponentially with difficulty.