Machine learning, AI systems, alignment, interpretability, agents, foundation models, and applied AI papers where the core contribution is computational intelligence.
Paradigm Shift
Quantifies an emergent 'self' in robots as an invariant subnetwork that persists across continual learning of variable tasks.
New Capability
Applies reinforcement learning with a cycle-consistency reward to drastically improve natural language to Lean4 autoformalization.
Efficiency Breakthrough
A 5M-parameter OCR model that rivals billion-parameter vision-language models, proving data-centric curation can beat raw parameter scale.
New Capability
Reformulates molecular discovery as an autonomous MCTS planning problem over executable chemical operations rather than just similarity-based prediction.
Scaling Insight
Identifies a 'critical threshold' in human-AI symbiosis beyond which human capability collapses abruptly and irreversibly due to over-delegation.
Paradigm Shift
Moves automated research from stateless linear pipelines to a persistent Research World Model that maintains a self-correcting knowledge graph of gaps and methods.
Efficiency Breakthrough
Achieves high-fidelity sub-seasonal weather forecasting with a 276M parameter model that matches 1.6B parameter baselines in accuracy and speed.
Open Release
Releases 55 hours of continuous 30fps expert human computer-use videos to address the 'missing ingredient' for desktop automation agents.
Paradigm Shift
Introduces a 'sorry-driven' formal decomposition that allows LLM agents to solve complex proofs by isolating and independently verifying subgoals.
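The decomposition idea can be illustrated with a short Lean sketch (illustrative only, not the paper's system): each `sorry` names an isolated subgoal that a sub-agent can attack and verify independently while the proof skeleton already type-checks.

```lean
-- Sorry-driven decomposition sketch: the skeleton compiles now;
-- each `sorry` is a subgoal handed off for independent verification.
theorem add_mul_self (a b : Nat) :
    (a + b) * (a + b) = a * a + 2 * (a * b) + b * b := by
  have expand : (a + b) * (a + b) = a * a + a * b + (b * a + b * b) := by
    sorry  -- subgoal 1: distribute the product
  have comm : b * a = a * b := by
    sorry  -- subgoal 2: commutativity of multiplication
  sorry    -- subgoal 3: combine the verified pieces by rewriting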
Breaks Assumption
Reveals that self-distillation degrades out-of-distribution reasoning by suppressing 'epistemic verbalization' (the model's expression of uncertainty).
Paradigm Shift
Enforces hard incompressibility constraints in neural operators using spectral Leray projection, ensuring physically admissible fluid simulations.
New Capability
An autonomous agentic pipeline discovered novel white-box adversarial attacks that outperform existing methods by up to 300%.
Efficiency Breakthrough
Agentic Variation Operators (AVO) replace fixed evolutionary heuristics with coding agents to discover GPU kernels that outperform FlashAttention-4 by 10.5%.
New Capability
UI-Voyager achieves an 81.0% success rate on AndroidWorld, exceeding human-level performance in mobile GUI automation.
Paradigm Shift
LensWalk introduces a 'reason-plan-observe' loop that allows agents to dynamically control the temporal sampling and density of the videos they analyze.
Paradigm Shift
The Free-Market Algorithm (FMA) is a zero-parameter metaheuristic that discovers complex pathways in chemistry and economics through emergent supply-and-demand dynamics.
Open Release
VFIG enables high-fidelity conversion of rasterized technical figures into editable, scalable SVGs using a new 66K-pair dataset.
Paradigm Shift
MARCH eliminates 'LLM-as-a-judge' confirmation bias by using information asymmetry to force verification agents to check claims without seeing the original response.
Efficiency Breakthrough
DreamerAD accelerates imagination-based training for autonomous driving by 80x, compressing 100-step diffusion sampling down to a single step.
Efficiency Breakthrough
The Multilevel Euler-Maruyama (ML-EM) method allows diffusion models to perform sampling at the computational cost of a single model evaluation.
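The multilevel estimator is not reproduced here, but the base discretization it builds on is the plain Euler-Maruyama scheme, whose per-step cost versus bias trade-off is exactly what multilevel methods exploit. A minimal sketch on a test process with a known mean:

```python
import numpy as np

def euler_maruyama(x0, drift, diffusion, t_end, n_steps, rng):
    """Plain Euler-Maruyama discretization of dX = f(X) dt + g(X) dW."""
    dt = t_end / n_steps
    x = np.asarray(x0, dtype=float).copy()
    for _ in range(n_steps):
        dw = np.sqrt(dt) * rng.standard_normal(x.shape)
        x = x + drift(x) * dt + diffusion(x) * dw
    return x

# Ornstein-Uhlenbeck test process: E[X_T] = x0 * exp(-theta * T)
theta, sigma = 1.0, 0.5
rng = np.random.default_rng(1)
paths = euler_maruyama(np.ones(20000), lambda x: -theta * x,
                       lambda x: sigma, t_end=1.0, n_steps=64, rng=rng)
```

Halving `n_steps` halves the cost but grows the discretization bias; the multilevel trick combines coarse and fine levels so most samples are drawn at the cheap level.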
New Capability
Wasserstein Parallel Transport provides a formal framework for counterfactual prediction in evolving probability distributions.
Paradigm Challenge
An AI research agent fact-checks published mathematics papers, pinpointing the specific steps where professional mathematicians' proofs go wrong.
Paradigm Challenge
Finds that only about 10% of the code released with Nature papers runs successfully when independently re-executed.
Practical Magic
Shows that a printed drink coaster acting as an adversarial patch can make a robot hand a person a knife instead of an apple.
Nature Is Weird
Demonstrates that an AI assistant's memory can be covertly manipulated through ordinary emails in the user's inbox, changing how it treats the user without any visible sign.
Practical Magic
Demonstrates wireless links that match wired-cable performance, adding no latency regardless of how many devices the signal traverses.
Breaks Assumption
Effective semantic alignment for low-resource languages can be achieved with only 10,000 noisy synthetic pairs, matching the performance of models trained on 1 million samples.
Paradigm Shift
Mechanistic interpretability reveals that LLMs possess 'affect reception' circuits that detect emotional content even when explicit keywords are removed.
Efficiency Breakthrough
Sparse Feature Attention (SFA) reduces attention costs from quadratic in sequence length and linear in dimension to a fraction based on feature sparsity, enabling 2.5x speedups.
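The cost argument is easy to demonstrate in miniature (this is illustrative, not the paper's SFA kernel): when queries and keys are nonzero only on a small set of feature dimensions, the logits can be computed over just those columns.

```python
import numpy as np

def sparse_attention_logits(Q, K, active):
    """Attention logits restricted to the 'active' feature dimensions.
    Cost is O(n^2 * |active|) instead of O(n^2 * d)."""
    return Q[:, active] @ K[:, active].T

rng = np.random.default_rng(0)
n, d, active = 8, 64, [3, 17, 42]        # only 3 of 64 features carry signal
Q = np.zeros((n, d))
K = np.zeros((n, d))
Q[:, active] = rng.standard_normal((n, len(active)))
K[:, active] = rng.standard_normal((n, len(active)))
dense = Q @ K.T                          # full O(n^2 * d) computation
sparse = sparse_attention_logits(Q, K, active)
```

When the sparsity pattern holds exactly, the restricted product matches the dense one while touching a fraction of the dimension.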
Scaling Insight
Hidden states in LLMs occupy a Riemannian submanifold where tokens are Voronoi regions, revealing a universal 'hourglass' intrinsic dimension profile across all tested models.
Breaks Assumption
Forcing AI agents to use human-comprehensible language causes a 50% efficiency drop compared to their own 'inscrutable' communication protocols.
Efficiency Breakthrough
Standard quantization destroys the small parameter 'deltas' that encode post-training knowledge; Delta-Aware Quantization (DAQ) fixes this by optimizing for sign preservation.
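A toy sketch of the sign-preservation idea (this is a stand-in, not the paper's DAQ objective): quantize the fine-tuning delta symmetrically, but never let rounding flip an entry's sign, since the sign of a small delta often carries the post-training signal.

```python
import numpy as np

def quantize_delta(delta, bits=4):
    """Symmetric uniform quantization of a fine-tuning delta that
    refuses to flip any entry's sign (zeros stay zero)."""
    qmax = 2 ** (bits - 1) - 1
    scale = np.max(np.abs(delta)) / qmax
    if scale == 0:
        return np.zeros_like(delta)
    q = np.round(delta / scale)
    q = np.where((delta > 0) & (q < 1), 1, q)    # tiny positives stay positive
    q = np.where((delta < 0) & (q > -1), -1, q)  # tiny negatives stay negative
    return q * scale
```

Plain round-to-nearest would map tiny deltas to zero or across zero; the two `where` clamps are the minimal fix that keeps every sign intact.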
Efficiency Breakthrough
Hybrid Associative Memory (HAM) layers allow the KV cache to grow dynamically based only on information that an internal RNN cannot predict.
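The gating idea can be sketched with a surprise test against a cheap recurrent predictor (here a simple EMA stands in for the internal RNN; the class and threshold are illustrative assumptions, not the paper's HAM layer):

```python
import numpy as np

class SurpriseGatedCache:
    """Append a token's KV entry only when a cheap recurrent predictor
    fails to anticipate it, so predictable tokens never grow the cache."""
    def __init__(self, dim, threshold=1.0, alpha=0.9):
        self.state = np.zeros(dim)     # EMA stand-in for the internal RNN state
        self.threshold = threshold
        self.alpha = alpha
        self.cache = []

    def step(self, kv):
        surprise = np.linalg.norm(kv - self.state)
        if surprise > self.threshold:  # unpredicted content: worth caching
            self.cache.append(kv)
        self.state = self.alpha * self.state + (1 - self.alpha) * kv
```

Feeding the same vector repeatedly, the predictor locks on and caching stops, so the cache grows sublinearly in sequence length for redundant input.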
New Capability
Small adapters can provide frozen decoder-only LLMs with persistent latent-space memory that survives across separate sessions.
Scaling Insight
The standard 'Chinchilla Approach 2' for fitting scaling laws is systematically biased, potentially leading to millions of dollars in wasted compute at frontier scales.
Paradigm Shift
Gradient boosting exhibits a 'first-mover bias' where correlated features selected early in the tree sequence gain an artificial, self-reinforcing importance in SHAP rankings.
New Capability
Introduces a framework for LLMs to self-improve reasoning in specific domains by autonomously mining and constructing training environments directly from the open web.
Paradigm Shift
Establishes a formal mathematical equivalence between Classifier-Free Guidance (CFG) and alignment-based objectives, allowing for CFG-like quality without inference-time overhead.
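For reference, the inference-time recipe the paper aims to make redundant is the standard CFG combination, which requires two forward passes (conditional and unconditional) per sampling step:

```python
import numpy as np

def cfg_combine(eps_uncond, eps_cond, w):
    """Classifier-free guidance: extrapolate the conditional prediction
    past the unconditional one by guidance scale w."""
    return eps_uncond + w * (eps_cond - eps_uncond)
```

At w = 1 this reduces to the plain conditional prediction; w > 1 amplifies the conditioning signal. Folding the equivalent objective into training removes the second forward pass per step.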
Efficiency Breakthrough
Proposes an agentic architecture that achieves O(1) token complexity relative to dataset size by strictly separating intent parsing from deterministic data execution.
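The separation can be sketched as follows (the plan schema below is hypothetical, invented for illustration): the LLM only emits a short structured plan, and plain code executes it over the data, so prompt size stays O(1) in dataset size.

```python
import json

def execute_plan(plan_json, rows):
    """Deterministically run a structured query plan over the dataset.
    Only the short JSON plan ever transits the LLM."""
    plan = json.loads(plan_json)
    col, op, val = plan["filter"]
    ops = {"eq": lambda a, b: a == b, "gt": lambda a, b: a > b}
    hits = [r for r in rows if ops[op](r[col], val)]
    if plan["agg"] == "count":
        return len(hits)
    return sum(r[plan["target"]] for r in hits)

rows = [{"city": "NYC", "sales": 3},
        {"city": "SF", "sales": 5},
        {"city": "NYC", "sales": 2}]
# An intent parser (the LLM) would emit something like:
plan = '{"filter": ["city", "eq", "NYC"], "agg": "sum", "target": "sales"}'
```

Whether `rows` holds three records or three million, the tokens the model sees are just the user question and this plan.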
Efficiency Breakthrough
Achieves high-fidelity diffusion generation in just 3 steps by distilling layer-wise time embeddings from reference trajectories.
Breaks Assumption
Finds that nominal instruction-tuning with LoRA often fails to improve (and can even degrade) verifiable instruction-following despite improvements on broader benchmarks.
Paradigm Shift
Shifts symbolic regression from discrete genetic search to a continuous, embedding-driven optimization paradigm.
Scaling Insight
Reveals that RLVR-driven reasoning improvements in LLMs are the result of highly sparse changes to a tiny fraction of 'critical' token distributions.
Breaks Assumption
Identifies that the full source code (skill body) of a tool is the primary signal for LLM tool selection, far outweighing the importance of descriptions or metadata.
Paradigm Shift
Replaces standard autoregressive document OCR with a parallel diffusion-based denoising framework.
Efficiency Breakthrough
Introduces a verifier that operates directly on the latent hidden states of Diffusion Transformers, avoiding the need for costly pixel-space decoding during inference-time scaling.
Paradigm Shift
Demonstrates that Hebbian plasticity can induce emergent attractor dynamics in robot controllers, enabling rapid adaptation without backpropagation.
Breaks Assumption
Uncovers that neural operator digital twins are acutely vulnerable to sparse adversarial perturbations on boundary conditions that bypass standard anomaly detection.
New Capability
Leverages unstructured clinical notes during training to boost the performance of models that are deployed using only structured EHR data.
Scaling Insight
Robotic bipedal mass scales with the square of leg length rather than the cubic scaling found in biological systems.