SeriesFusion
Science, curated & edited by AI

AI & Machine Learning

2,557 papers  ·  Page 47 of 52

Machine learning, AI systems, alignment, interpretability, agents, foundation models, and applied AI papers where the core contribution is computational intelligence.

Breaks Assumption
Identifies that extended reasoning in multimodal LLMs causes 'attention dispersion': models progressively lose focus on visual inputs as the reasoning chain lengthens.
Mar 17
Efficiency Breakthrough
Enables model adaptation on edge devices and non-differentiable (quantized) models using a purely backpropagation-free optimization framework.
Mar 17
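The paper's exact optimizer isn't reproduced in this summary, but the standard building block for backprop-free adaptation is a two-point zeroth-order gradient estimate. A minimal NumPy sketch (the name `zo_grad` and the quadratic test function are illustrative, not from the paper):

```python
import numpy as np

def zo_grad(f, theta, eps=1e-3, rng=None):
    """Two-point zeroth-order (SPSA-style) gradient estimate.

    Uses only forward evaluations of the loss f, so it applies to
    quantized or otherwise non-differentiable models on edge devices.
    """
    if rng is None:
        rng = np.random.default_rng(0)
    z = rng.standard_normal(theta.shape)           # random probe direction
    delta = f(theta + eps * z) - f(theta - eps * z)
    return delta / (2 * eps) * z                   # directional estimate along z

# Toy check: on f(x) = ||x||^2 the estimate points along the true gradient.
theta = np.ones(5)
est = zo_grad(lambda x: float(np.sum(x ** 2)), theta)
```

In expectation over probe directions the estimate recovers the true gradient; practical methods average many probes per step to reduce variance.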
Breaks Assumption
Discovers that frozen video diffusion models already encode physical plausibility in their features, allowing for cost-effective inference-time physics filtering.
Mar 17
New Capability
Introduces a decentralized, multi-agent framework for scientific discovery that uses an 'ArtifactReactor' for planner-free coordination and full computational lineage.
Mar 17
Scaling Insight
Proposes spectral clipping to stabilize LLM training by addressing 'spectral spikes' in stochastic gradient noise that adaptive optimizers like AdamW fail to handle.
Mar 17
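The paper's precise recipe isn't given in this listing; one plausible reading of 'spectral clipping', shown only as an illustration, caps the singular values of a gradient matrix so a single spectral spike cannot dominate the update (`spectral_clip` and `sigma_max` are names invented for this sketch):

```python
import numpy as np

def spectral_clip(grad, sigma_max=1.0):
    """Clip the singular values of a gradient matrix at sigma_max.

    A spectral spike (one very large singular value) is flattened
    while the rest of the update direction is left untouched.
    """
    u, s, vt = np.linalg.svd(grad, full_matrices=False)
    return u @ np.diag(np.minimum(s, sigma_max)) @ vt

# A rank-1 spike added to small noise: one singular value dominates.
rng = np.random.default_rng(0)
spike = 50.0 * np.outer(rng.standard_normal(8), rng.standard_normal(8))
grad = 0.01 * rng.standard_normal((8, 8)) + spike

clipped = spectral_clip(grad, sigma_max=1.0)
```

Unlike element-wise gradient clipping (what AdamW's setups typically pair with), this operates on the matrix spectrum, which is the failure mode the summary describes.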
Efficiency Breakthrough
Achieves real-time, low-latency talking avatar generation at 34ms per frame using a one-step streaming diffusion framework.
Mar 17
Scaling Insight
Introduces Matrix-to-Matrix RNNs (M²RNN) with matrix-valued hidden states that outperform hybrid Transformers while using 3x smaller state sizes.
Mar 17
Paradigm Shift
Proposes the 'Theory Compiler,' a system that automatically translates formal domain specifications into neural architectures with built-in physical or logical constraints.
Mar 17
New Capability
Introduces 'Visual Chronometer' to estimate physical frame rates directly from visual dynamics, addressing the 'chronometric hallucinations' common in generative video models.
Mar 17
New Capability
Segment Anything Reasoner (StAR) introduces parallel test-time scaling to visual segmentation tasks, eliciting latent reasoning capabilities from base models.
Mar 17
Breaks Assumption
Argues that probability gradients are superior to standard log-probability gradients for RL training, proposing a new optimization method (DGPO) to solve divergence in soft clipping.
Mar 17
Paradigm Shift
Presents DataEvolve, a framework that enables AI to autonomously evolve and iteratively optimize pretraining data curation strategies.
Mar 17
Efficiency Breakthrough
Introduces ZoomUI, a trainless method for GUI grounding that uses inference-time scaling to anchor natural language instructions to interface elements.
Mar 17
Efficiency Breakthrough
FLORE achieves 1000x error reduction in linear sketching while being 100x faster than previous learning-based solutions.
Mar 17
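FLORE's learned sketch isn't public in this listing; for context, the classical baseline such learning-based methods improve on is a random linear sketch like CountSketch. A minimal sketch (function name and parameters are illustrative):

```python
import numpy as np

def count_sketch(x, m=64, seed=0):
    """Classical CountSketch: a random linear map from R^n to R^m.

    Each coordinate is hashed to one of m buckets with a random sign.
    The map is linear, so sketches of sums equal sums of sketches.
    """
    rng = np.random.default_rng(seed)       # fixed seed = fixed sketch matrix
    bucket = rng.integers(0, m, size=x.shape[0])
    sign = rng.choice([-1.0, 1.0], size=x.shape[0])
    s = np.zeros(m)
    np.add.at(s, bucket, sign * x)          # scatter-add into buckets
    return s

x, y = np.arange(10.0), np.ones(10)
sx, sy = count_sketch(x), count_sketch(y)
```

A learned sketch replaces the random hashing with data-dependent structure; the linearity property demonstrated here is what any drop-in replacement must preserve.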
New Capability
V-JEPA 2.1 unlocks dense, spatially structured features in video self-supervised learning, yielding massive gains in robotic manipulation and navigation.
Mar 17
Paradigm Shift
Provides a new identifiability theorem for causal representation learning that uncovers physical system parameters from raw data without predefined libraries.
Mar 17
Scaling Insight
The Infinite Problem Generator (IPG) uses executable code to synthesize and verify 100% accurate physics reasoning data, overcoming LLM hallucination in data scaling.
Mar 17
Breaks Assumption
Simple regularization and data-hybrid training are shown to be sufficient to prevent catastrophic forgetting in MLLMs, challenging the need for complex anti-forgetting architectures.
Mar 17
Efficiency Breakthrough
SleepGate introduces a biologically inspired 'sleep cycle' for the KV cache to resolve proactive interference in long-context LLMs.
Mar 17
New Capability
One-Policy-Fits-All (OPFA) learns a single manipulation policy across 11 different embodiments, including grippers and dexterous hands, using geometry-aware action latents.
Mar 17
New Capability
Interp3R is the first method to estimate depth and camera poses at arbitrary time instants by interpolating pointmaps using asynchronous event data.
Mar 17
Breaks Assumption
Distilled VAE encoders are found to perform significantly better on higher, unseen resolutions than on their native training resolution.
Mar 17
Efficiency Breakthrough
ASAP reduces LVLM computational FLOPs by ~80% with virtually no loss in performance using a training-free KV-Cache pruning recipe.
Mar 17
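ASAP's actual recipe isn't spelled out in this summary; a generic training-free KV-cache pruning sketch keeps only the cached tokens with the most accumulated attention mass (all names and the `keep_ratio` heuristic are assumptions for illustration):

```python
import numpy as np

def prune_kv_cache(keys, values, attn_history, keep_ratio=0.2):
    """Training-free KV-cache pruning sketch.

    attn_history holds the accumulated attention weight each cached
    token has received from recent queries; tokens with the least
    mass are dropped, shrinking attention FLOPs proportionally.
    """
    n = keys.shape[0]
    k = max(1, int(n * keep_ratio))
    keep = np.argsort(attn_history)[-k:]    # indices of the top-k tokens
    keep.sort()                             # preserve sequence order
    return keys[keep], values[keep]

rng = np.random.default_rng(1)
keys = rng.standard_normal((100, 64))
values = rng.standard_normal((100, 64))
attn = rng.random(100)

k2, v2 = prune_kv_cache(keys, values, attn, keep_ratio=0.2)
```

With `keep_ratio=0.2` the cache (and the per-step attention cost over it) shrinks by 80%, matching the order of savings the summary reports.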
New Capability
MorFiC achieves zero-shot locomotion transfer across quadrupeds of different sizes and masses with up to 5x speed gains over standard baselines.
Mar 17
Paradigm Shift
Top-b sampling introduces entropy-aware adaptive bandwidth for LLM decoding, effectively approximating a self-regulating control system for generation.
Mar 17
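The summary says the bandwidth adapts to the model's uncertainty; one plausible reading, used purely as an illustration, ties the number of candidate tokens to the distribution's perplexity exp(H), so a confident step samples from few tokens and a flat step widens automatically:

```python
import numpy as np

def top_b_sample(logits, rng, scale=1.0):
    """Entropy-aware adaptive-bandwidth sampling sketch.

    Bandwidth b tracks the effective vocabulary size exp(H) of the
    next-token distribution, approximating a self-regulating
    controller: low entropy -> narrow, high entropy -> wide.
    """
    p = np.exp(logits - logits.max())
    p /= p.sum()
    entropy = -(p * np.log(p + 1e-12)).sum()
    b = max(1, int(round(scale * np.exp(entropy))))  # adaptive bandwidth
    top = np.argsort(p)[-b:]                         # keep the b best tokens
    q = p[top] / p[top].sum()
    return rng.choice(top, p=q)

rng = np.random.default_rng(0)
peaked = np.array([10.0, 0.0, 0.0, 0.0])   # near-deterministic step -> b = 1
flat = np.zeros(4)                          # uniform step -> b = 4
```

Contrast with fixed top-k, which uses the same bandwidth at both kinds of step.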
Paradigm Shift
SuperLocalMemory V3 establishes information-geometric foundations for agent memory, enabling high-accuracy retrieval without cloud-based LLM dependency.
Mar 17
Efficiency Breakthrough
FlashHead is a drop-in replacement for the LM classification head that provides 1.75x inference speedup by treating vocabulary selection as a retrieval problem.
Mar 17
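FlashHead's internals aren't given here; the 'vocabulary selection as retrieval' idea can be sketched as a two-stage head that scores a few cluster centroids first, then computes logits only for tokens in the best-matching clusters (this clustered brute force stands in for whatever index FlashHead actually uses, and all names are hypothetical):

```python
import numpy as np

def retrieval_head_topk(hidden, vocab_emb, centroids, assign, k=5, n_probe=2):
    """LM head as retrieval: instead of a full hidden @ vocab_emb.T
    over the whole vocabulary, score a small set of cluster centroids,
    then rank only the tokens assigned to the best clusters."""
    c_scores = centroids @ hidden
    probe = np.argsort(c_scores)[-n_probe:]          # best-matching clusters
    cand = np.flatnonzero(np.isin(assign, probe))    # their member tokens
    logits = vocab_emb[cand] @ hidden
    return cand[np.argsort(logits)[-k:][::-1]]       # top-k token ids

rng = np.random.default_rng(2)
V, d, C = 1000, 32, 16
vocab_emb = rng.standard_normal((V, d))
centroids = rng.standard_normal((C, d))
assign = np.argmax(vocab_emb @ centroids.T, axis=1)  # highest-scoring centroid
hidden = rng.standard_normal(d)
top_ids = retrieval_head_topk(hidden, vocab_emb, centroids, assign, k=5)
```

The speedup comes from scoring C + (V * n_probe / C) vectors instead of all V; the result is approximate, since the true top token could sit in an unprobed cluster.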
Paradigm Shift
Introduces 'Delight', a policy-gradient weighting scheme that scales updates by the product of advantage and action surprisal to fix pathologies in RL training.
Mar 17
Scaling Insight
Determines the optimal compute distribution for retrieval agents, showing that re-ranking depth is far more critical than query expansion strength.
Mar 17
Paradigm Shift
Proposes the Spectrum Matching Hypothesis to explain why some VAE latents are 'undiffusable' and introduces techniques to align power spectral densities for superior image generation.
Mar 17
New Capability
Discovers interpretable 'atoms' of model behavior by decomposing training gradients, enabling unsupervised discovery and steering of complex behaviors like refusal or arithmetic.
Mar 17
Paradigm Shift
Introduces RenderMem, a spatial memory system that treats rendering as a query interface for embodied agents to reason about 3D geometry and occlusion.
Mar 17
Breaks Assumption
Reveals that larger language models are significantly better at concealing knowledge during audits, with detection traces vanishing beyond 70 billion parameters.
Mar 17
New Capability
Achieves pose-free 3D Gaussian Splatting using only event streams, enabling reconstruction in extreme lighting and high-speed motion scenarios.
Mar 17
Efficiency Breakthrough
Reformulates diffusion sampling as a graph-theoretic planning problem that dynamically allocates compute to the most difficult denoising stages.
Mar 17
Breaks Assumption
Formalizes the 'Visual Confused Deputy' attack, where agents are tricked into authorizing privileged actions via slight visual screen manipulations.
Mar 17
Efficiency Breakthrough
Generates novel, structurally plausible protein sequences from small alignments using a training-free stochastic attention mechanism on a standard laptop.
Mar 17
Breaks Assumption
Shows that explicit identity framing is unnecessary, and may even be inferior, for low-data LoRA safety fine-tuning.
Mar 17
Paradigm Shift
Gauge-equivariant neural operators enable discretization-invariant and geometry-consistent solving of complex PDEs.
Mar 17
Efficiency Breakthrough
Adaptive computation for multimodal LLMs drastically reduces compute waste on easy cases while focusing on hard ones.
Mar 17
Breaks Assumption
BrainBench exposes a significant gap between LLM benchmark performance and genuine commonsense reasoning.
Mar 17
New Capability
A training-free operator for streaming 3D reconstruction reduces geometric drift using Grassmannian manifolds.
Mar 17
Paradigm Shift
POLCA uses LLMs as stochastic optimizers with theoretical convergence guarantees for complex system-level tasks.
Mar 17
New Capability
DynaAvatar achieves zero-shot 3D human reconstruction from a single image with motion-dependent cloth dynamics.
Mar 17
Efficiency Breakthrough
HO-SFL enables backprop-free fine-tuning on edge devices without the convergence penalty typical of zeroth-order methods.
Mar 17
Paradigm Shift
Agent architectures require an explicit epistemic control layer to route questions between incompatible reasoning frameworks.
Mar 17
Efficiency Breakthrough
RAZOR provides a lightweight, targeted unlearning framework for Transformers and Diffusion models without retraining.
Mar 17
Breaks Assumption
Demonstrates that safety and utility in LVLMs are not inherently antagonistic and can be simultaneously improved through inference-time projection.
Mar 17
Scaling Insight
Provides the first theoretical proof that dataset distillation efficiently encodes the low-dimensional structure of non-linear tasks.
Mar 17
Breaks Assumption
Proves a fundamental expressivity limit where Message-Passing Graph Neural Networks are infinitely weaker than standard Color Refinement algorithms.
Mar 17