SeriesFusion
Science, curated & edited by AI

AI & Machine Learning

2,557 papers  ·  Page 38 of 52

Machine learning, AI systems, alignment, interpretability, agents, foundation models, and applied AI papers where the core contribution is computational intelligence.

New Capability
MemDLM embeds a simulated denoising process into training to create 'Parametric Memory,' narrowing the train-inference gap for Diffusion Language Models.
Mar 24
Open Release
An open foundation suite for universal dexterous robot control trained on over 50k trajectories across eight different robotic hand architectures.
Mar 24
Paradigm Shift
Bypasses Reinforcement Learning during the exploration phase by using uncertainty-guided tree search to discover informative data.
Mar 24
Efficiency Breakthrough
Enables high-rank (r=384) DoRA training on single GPUs through factored norms and fused Triton kernels.
Mar 24
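For background on the DoRA decomposition this entry builds on: DoRA splits a weight update into a learned per-column magnitude and a unit direction formed from the frozen weight plus a low-rank adapter. A minimal NumPy sketch of that standard reparameterization (not the paper's factored-norm or Triton-fused variant; all shapes and names here are illustrative):

```python
import numpy as np

def dora_forward(x, W0, A, B, m):
    """Standard DoRA forward pass: W = m * (W0 + B@A) / ||W0 + B@A||_col,
    where m is a learned per-column magnitude and B@A is the low-rank update."""
    V = W0 + B @ A                                        # direction, (d_out, d_in)
    col_norm = np.linalg.norm(V, axis=0, keepdims=True)   # per-column norms
    W = m * (V / col_norm)                                # rescale columns to magnitude m
    return x @ W.T

# Toy shapes: d_out=8, d_in=16, rank r=4
rng = np.random.default_rng(0)
W0 = rng.normal(size=(8, 16))
A = rng.normal(size=(4, 16)) * 0.01
B = np.zeros((8, 4))                                      # B=0 at init
m = np.linalg.norm(W0, axis=0, keepdims=True)             # m initialized to W0's column norms
x = rng.normal(size=(2, 16))
y = dora_forward(x, W0, A, B, m)
```

With B zeroed and m set to W0's column norms, the adapted weight reduces to W0 itself, which is the standard DoRA initialization.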
Efficiency Breakthrough
Introduces a parallel reasoning mechanism for Vision-Language-Action (VLA) models that eliminates the latency bottleneck of autoregressive Chain-of-Thought.
Mar 24
Paradigm Shift
UNITE enables single-stage joint training of the tokenizer and the diffusion model from scratch, removing the need for frozen VAEs.
Mar 24
Efficiency Breakthrough
A training-free feature caching framework that achieves 2.3x speedup for video world models while maintaining 99.4% quality.
Mar 24
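The core idea behind training-free feature caching is that features in adjacent denoising steps change slowly, so an expensive block can be skipped when its input has barely drifted. A generic sketch of that pattern (an illustrative assumption, not the paper's framework or thresholds):

```python
import numpy as np

def cached_denoise(steps, compute_block, threshold=0.05):
    """Recompute an expensive block only when its input drifts past a
    relative threshold from the input that produced the cached output;
    otherwise reuse the cached result. No training involved."""
    cache_in, cache_out = None, None
    recomputes = 0
    outputs = []
    for x in steps:
        if cache_in is None or \
           np.linalg.norm(x - cache_in) / np.linalg.norm(cache_in) > threshold:
            cache_in, cache_out = x, compute_block(x)
            recomputes += 1
        outputs.append(cache_out)
    return outputs, recomputes

# Toy trajectory: inputs drift slowly, so almost every step hits the cache
steps = [np.ones(4) * (1.0 + 0.001 * t) for t in range(50)]
outs, n = cached_denoise(steps, compute_block=lambda x: x * 2.0)
```

In this toy run the input drifts by only 0.1% per step, so the block is computed once and reused for the remaining 49 steps; real caching frameworks apply the same trade-off per layer or per feature map.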
New Capability
A transformer-based meta-amortized framework that allows simulation-based inference to remain valid across different model structures without retraining.
Mar 24
Paradigm Shift
LassoFlexNet matches or beats leading tree-based models on tabular data while maintaining Lasso-like interpretability through per-feature embeddings and a group Lasso mechanism.
Mar 24
Breaks Assumption
Proves that rotation-invariant algorithms like standard Gradient Descent are fundamentally suboptimal for sparse targets when trained on hard labels.
Mar 24
New Capability
A grid-free probabilistic framework for nonrigid registration of high-dimensional vector-valued functions on irregular manifolds.
Mar 24
Efficiency Breakthrough
A unified discrete diffusion framework that outperforms autoregressive models on large-scale discrete generation tasks for the first time.
Mar 24
Paradigm Challenge
The math we've used for 50 years to figure out how fast the internet should be is actually missing a giant piece of the puzzle.
Mar 23
Nature Is Weird
You can get a whole crowd to agree on something even if everyone only knows what the person right next to them is thinking.
Mar 23
Nature Is Weird
Over 10% of new medical papers are being written by AI now—three years ago, that number was zero.
Mar 23
Practical Magic
We can now spot Alzheimer's early by looking at the brain like a building that’s literally buckling under the weight of toxic sludge.
Mar 23
Nature Is Weird
Massive wealth gaps might just be a math problem: if you always pick the better of two random options, inequality is basically guaranteed.
Mar 23
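The "better of two random options" dynamic is easy to simulate. A toy version (an illustrative assumption about the setup, not the paper's exact model): start everyone equal, repeatedly sample two agents, and award one wealth unit to the currently richer of the pair. Inequality, measured by the Gini coefficient, emerges from the comparison alone:

```python
import random

def gini(w):
    """Gini coefficient of a wealth list (0 = perfect equality)."""
    w = sorted(w)
    n, total = len(w), sum(w)
    cum = sum((i + 1) * x for i, x in enumerate(w))
    return (2 * cum) / (n * total) - (n + 1) / n

random.seed(1)
wealth = [1.0] * 100          # 100 agents, identical starting wealth
g0 = gini(wealth)             # exactly 0: everyone equal
for _ in range(20000):
    a, b = random.randrange(100), random.randrange(100)
    winner = a if wealth[a] >= wealth[b] else b   # richer of the pair wins
    wealth[winner] += 1.0
g1 = gini(wealth)             # substantially above 0: rich get richer
```

No agent is intrinsically better than any other; the pairwise "pick the better one" rule is enough to concentrate wealth.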
Paradigm Shift
Introduces a statistical alternative to the standard frequency-based BPE tokenization used in nearly all modern LLMs.
Mar 23
Scaling Insight
Discovers a multiplicative scaling law governing how LLMs revise their beliefs during iterative reasoning (CoT, reflection).
Mar 23
Efficiency Breakthrough
Achieves state-of-the-art LLM distillation using 10-25% of the data required by standard fine-tuning.
Mar 23
Paradigm Shift
Formally proves that a causal Transformer is mathematically equivalent to a stateless Differentiable Neural Computer.
Mar 23
Efficiency Breakthrough
Accelerates MoE inference by speculating future experts to overlap CPU-GPU memory transfers with computation.
Mar 23
New Capability
A self-improvement framework (MIPO) that improves LLM personalization and reasoning with zero additional data or human labels.
Mar 23
Efficiency Breakthrough
Achieves 97% of oracle reward performance using only 20% of the training labels for complex LLM reasoning.
Mar 23
Efficiency Breakthrough
The first Joint Embedding Predictive Architecture (JEPA) to train stably end-to-end from raw pixels with massive planning speedups.
Mar 23
Paradigm Shift
Solves the compositional generalization failure of neural networks (0% to 100% accuracy) by embedding algebraic semiring constraints.
Mar 23
Scaling Insight
A massive controlled study reveals that post-training algorithm rankings (DPO, SimPO, etc.) completely invert as models scale.
Mar 23
Efficiency Breakthrough
DAPA speeds up GELU computation by 16x and reduces hardware DSP utilization by 16x for on-device Transformer deployment.
Mar 23
Efficiency Breakthrough
Spectral Tempering achieves near-oracle embedding compression for dense retrieval without requiring any labeled data or grid searching.
Mar 23
Paradigm Shift
Challenges the 80-year-old assumption that neurons must use weighted summation as their primary aggregation mechanism.
Mar 23
Efficiency Breakthrough
Empirically demonstrates that most Transformer layers are redundant, enabling a 54% training cost reduction through non-uniform budget allocation.
Mar 23
Efficiency Breakthrough
Warm-Start Flow Matching provides a guaranteed speedup for image/text generation by using lightweight models as initial priors.
Mar 23
New Capability
VAMPO optimizes visual dynamics in video models using policy gradients to fix precision-critical errors in robotic manipulation.
Mar 23
Breaks Assumption
Debunks recent 'evaluation awareness' findings in LLMs by showing that linear probes are actually just tracking formatting artifacts.
Mar 23
Paradigm Shift
Introduces Hyperagents: self-referential systems where the meta-level modification logic is itself an editable program.
Mar 23
Efficiency Breakthrough
Adaptive Layerwise Perturbation (ALP) solves the training-inference mismatch and importance ratio blowup in LLM reinforcement learning.
Mar 23
Paradigm Shift
Fine-tunes Large Vision Language Models for medical tasks using only image-description pairs, bypassing the need for expensive expert-curated instructions.
Mar 23
New Capability
Introduces Any-Subgroup Equivariant Networks (ASEN), a single model that can adapt to multiple different symmetry groups via input modulation.
Mar 23
New Capability
ICLAD enables unified, in-context anomaly detection for tabular data across unsupervised, semi-supervised, and one-class regimes without weight updates.
Mar 23
New Capability
Expands formal reasoning beyond proof construction to the generation and formal verification of counterexamples in Lean 4.
Mar 23
Efficiency Breakthrough
EvidenceRL uses reinforcement learning (GRPO) to explicitly optimize for evidence adherence, reducing hallucinations in high-stakes RAG pipelines.
Mar 23
Breaks Assumption
MoCA3D predicts 3D bounding boxes from monocular images without requiring any camera intrinsics at inference time.
Mar 23
Breaks Assumption
Reveals that complex reasoning strategies like Chain-of-Thought (CoT) and Tree-of-Thought (ToT) provide negligible or even negative gains for text classification tasks.
Mar 23
Paradigm Shift
Formalizes the 'Neural Uncertainty Principle,' linking adversarial vulnerability in vision and hallucinations in LLMs to a shared geometric and information-theoretic origin.
Mar 23
Efficiency Breakthrough
Accelerates diffusion-based image decoders by an order of magnitude using multi-scale sampling and one-step distillation.
Mar 23
New Capability
CurveStream implements a curvature-aware hierarchical memory to handle streaming video in MLLMs without Out-of-Memory (OOM) errors.
Mar 23
Breaks Assumption
Proves the Key-Value (KV) cache is entirely redundant and can be bit-identically recomputed from the residual stream.
Mar 23
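The observation rests on determinism: in a standard pre-norm Transformer, K and V are fixed functions of the residual-stream activations (K = norm(h) @ Wk, V = norm(h) @ Wv), so storing h alone suffices and K/V can be regenerated exactly on demand. A minimal NumPy illustration of that determinism (a sketch of the claim's premise, not the paper's system; RMSNorm and the shapes are assumptions):

```python
import numpy as np

def rmsnorm(h, g, eps=1e-6):
    """Pre-attention RMSNorm applied to residual-stream activations."""
    return h / np.sqrt((h * h).mean(-1, keepdims=True) + eps) * g

rng = np.random.default_rng(0)
d, seq = 16, 5
h = rng.normal(size=(seq, d))                  # residual stream, one row per token
g = np.ones(d)
Wk, Wv = rng.normal(size=(d, d)), rng.normal(size=(d, d))

# Conventional KV cache: materialize K, V at prefill time and store them.
K_cached, V_cached = rmsnorm(h, g) @ Wk, rmsnorm(h, g) @ Wv

# Residual-stream cache: store only h, recompute K, V when attention needs them.
K_recomp, V_recomp = rmsnorm(h, g) @ Wk, rmsnorm(h, g) @ Wv

# Same inputs, same operations, same order: bitwise-identical results.
bit_identical = (K_cached == K_recomp).all() and (V_cached == V_recomp).all()
```

The practical trade-off is recomputing two matrix products per layer at decode time in exchange for caching one activation tensor instead of two.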
Efficiency Breakthrough
Reduces covariance tracking error by 30x by reformulating the problem as rigid-body motion on Lie groups.
Mar 23
Paradigm Shift
A massive field study (9,000+ users) proves that algorithmic shifts can reduce affective polarization without sacrificing user engagement.
Mar 23
Efficiency Breakthrough
Achieves a 19x reduction in inference cost and 16x in latency for agentic workflows by evolving hybrid LLM-and-code pipelines.
Mar 23