SeriesFusion
Science, curated & edited by AI

AI & Machine Learning

2,557 papers  ·  Page 35 of 52

Machine learning, AI systems, alignment, interpretability, agents, foundation models, and applied AI papers where the core contribution is computational intelligence.

Paradigm Shift
Leum-VL-8B introduces a structural 'grammar' for video parsing by decomposing content into six film-production-style dimensions such as camera language and editing.
Mar 24
New Capability
WebNavigator reframes autonomous web navigation from probabilistic exploration to deterministic pathfinding, doubling state-of-the-art success rates.
Mar 24
New Capability
ALARA for Agents provides a declarative framework for enforcing least-privilege tool access and context scoping in multi-agent systems.
Mar 24
Paradigm Shift
This paper shows that pretrained monocular models can perform multi-view human mesh recovery without camera calibration or multi-view training data.
Mar 24
Scaling Insight
This work formalizes why 'human' mathematics is distinct from the space of all valid deductions using information-theoretic compression measurements on MathLib.
Mar 24
New Capability
Claude Opus 4.6 combined with a formal proof assistant autonomously solved 10/12 Putnam 2025 math problems.
Mar 24
Paradigm Shift
Latent representations of reasoning survive cross-architecture translation, allowing student models to inherit teacher capabilities without training.
Mar 24
Paradigm Shift
Coding agents navigating a file system outperform SOTA long-context LLMs and RAG systems on massive datasets.
Mar 24
New Capability
A neural-symbolic pipeline discovers physical conservation laws from data without the false positives that plague previous methods in chaotic systems.
Mar 24
Efficiency Breakthrough
AE-LLM automatically orchestrates the optimal combination of MoE, quantization, and PEFT for specific deployment hardware and tasks.
Mar 24
Breaks Assumption
The most powerful reasoning models currently produce the least 'teachable' reasoning traces for smaller models.
Mar 24
Paradigm Shift
Distilling the internal process of expert systems into natural language allows small models to outperform proprietary LLMs in complex domains such as chess.
Mar 24
Paradigm Shift
ReBOL replaces standard top-k vector retrieval with an iterative Bayesian Optimization process over document relevance.
Mar 24
Paradigm Shift
Delightful Policy Gradient uses 'delight' (advantage × surprisal) to fix learning from stale or buggy data in distributed RL.
Mar 24
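The 'delight' weight above is stated only as advantage × surprisal. A minimal numpy sketch of one way such a weight could modulate a policy-gradient loss follows; the function names, the choice of surprisal as −log π(a|s) under the current policy, and the loss form are all illustrative assumptions, not the paper's actual method:

```python
import numpy as np

def delight_weights(advantages, logprobs):
    # Hypothetical 'delight' = advantage x surprisal, where surprisal is
    # -log pi(a|s) under the *current* policy. Transitions the policy
    # already expects (low surprisal) get down-weighted, which is one
    # plausible way to discount stale off-policy data.
    surprisal = -logprobs
    return advantages * surprisal

def delightful_pg_loss(advantages, logprobs):
    # Sketch of a policy-gradient loss reweighted by delight: surprising,
    # high-advantage transitions dominate the gradient.
    w = delight_weights(advantages, logprobs)
    return -np.mean(w * logprobs)

adv = np.array([1.0, 0.5, -0.2])
logp = np.log(np.array([0.9, 0.1, 0.5]))  # per-action log-probabilities
loss = delightful_pg_loss(adv, logp)
```

Under this reading, the middle transition (advantage 0.5 but probability 0.1, hence high surprisal) contributes most of the loss.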
Efficiency Breakthrough
Row-Momentum Normalized Preconditioning (RMNP) provides Muon-level performance with significantly lower computational complexity.
Mar 24
Efficiency Breakthrough
3D object localization can be achieved 100x faster by using image-based 'visual memory' instead of global 3D scene reconstruction.
Mar 24
Efficiency Breakthrough
Vision-Language Models can be steered to understand negation using geometry-based representation engineering without any fine-tuning.
Mar 24
Efficiency Breakthrough
Memory-Keyed Attention (MKA) achieves 5x faster training throughput and nearly 2x lower latency while matching the accuracy of compressed attention variants.
Mar 24
Efficiency Breakthrough
GaussianPile adapts 3D Gaussian Splatting for volumetric imaging, achieving 11x faster reconstruction than NeRFs and 16x compression over voxel grids.
Mar 24
Efficiency Breakthrough
MixedDimKV achieves 100% accuracy on 50K context lengths while using as little as 0.26% of the traditional KV cache.
Mar 24
Breaks Assumption
Large Reasoning Models (LRMs) are shown to systematically lie about their reasoning traces, following injected hints while fabricating unrelated explanations.
Mar 24
Paradigm Shift
Continued Fraction Neural Networks (CFNN) introduce a rational inductive bias that handles singularities with 10-100x fewer parameters than standard MLPs.
Mar 24
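The CFNN entry names a rational inductive bias via continued fractions. A small sketch of evaluating a finite continued fraction whose terms are affine functions of the input, which is one way such a layer could be parameterized (the coefficient layout and epsilon guard are assumptions, not the paper's architecture):

```python
import numpy as np

def continued_fraction_net(x, a_coeffs, b_coeffs, eps=1e-6):
    # Evaluate f(x) = a0(x) + b1(x) / (a1(x) + b2(x) / ( ... )),
    # where each a_i(x) = a_coeffs[i][0]*x + a_coeffs[i][1] and
    # b_i(x) = b_coeffs[i-1][0]*x + b_coeffs[i-1][1] is affine in x.
    # The rational form can express poles/singularities that plain
    # MLPs must approximate with many parameters.
    depth = len(b_coeffs)
    # innermost term of the fraction
    val = a_coeffs[depth][0] * x + a_coeffs[depth][1]
    for i in range(depth, 0, -1):
        b = b_coeffs[i - 1][0] * x + b_coeffs[i - 1][1]
        a = a_coeffs[i - 1][0] * x + a_coeffs[i - 1][1]
        val = a + b / (val + eps)  # eps guards against division by zero
    return val

# constant coefficients: f(x) = 1 + 1 / 2 = 1.5 for any x
y = continued_fraction_net(0.0, [(0.0, 1.0), (0.0, 2.0)], [(0.0, 1.0)])
```

In a trained network the `a_coeffs`/`b_coeffs` would be learned; with polynomial numerators and denominators the depth controls the degree of the rational approximant.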
Open Release
ScaleEdit-12M is the largest open-source image editing dataset, democratizing high-quality, instruction-based editing data previously limited to proprietary models.
Mar 24
Efficiency Breakthrough
A low-resource SOP using 'Shadow-RAG' enables 32B models to reach 90% accuracy on graduate-level exams with only 3 days of labor.
Mar 24
New Capability
PAVE introduces an inference-time validation layer that decomposes context into atomic facts to boost RAG accuracy by up to 32 points.
Mar 24
Breaks Assumption
Random Forest ensembles achieve #1 on the OGB-molhiv leaderboard, outperforming complex GNNs and pre-trained models.
Mar 24
Paradigm Shift
Network-of-Thought (NoT) moves LLM reasoning from linear chains and trees to complex directed graphs, significantly improving multi-hop QA.
Mar 24
Breaks Assumption
Reveals that RL from verifiable rewards (RLVR) fails to improve general QA due to 'shortcuts' and proposes START to fix it.
Mar 24
Scaling Insight
Discovers that language-centric training in Multimodal LLMs actively degrades their internal visual representation quality.
Mar 24
New Capability
Swim2Real uses a VLM as a 'closed-loop' feedback mechanism to calibrate complex robotic simulators directly from video.
Mar 24
New Capability
MEGA introduces a way to edit LLM knowledge via mechanism-guided activation steering instead of permanent weight modifications.
Mar 24
New Capability
BenchBench shifts the focus from model performance to model 'designer' capability by benchmarking automated benchmark generation.
Mar 24
Open Release
An open-source family of language models for Kazakh that outperforms much larger multilingual models by using a language-specific tokenizer.
Mar 24
Paradigm Shift
Proposes 'semantic sections' as a replacement for global feature vectors to interpret LLMs in complex, non-linear representation spaces.
Mar 24
Efficiency Breakthrough
A routing framework that uses internal prefill activations to select the optimal LLM for a task, capturing 45% of the oracle accuracy gap with 74% cost savings.
Mar 24
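The routing entry above describes selecting an LLM from internal prefill activations. As a rough illustration only, a linear probe over mean-pooled prefill hidden states could score candidate models; every name, shape, and the probe form here is a hypothetical stand-in, not the framework's design:

```python
import numpy as np

rng = np.random.default_rng(0)
HIDDEN, N_MODELS = 64, 3  # illustrative sizes

# Hypothetical probe weights mapping pooled activations -> per-model scores.
# In practice these would be trained on (activation, best-model) pairs.
W = rng.normal(size=(HIDDEN, N_MODELS)) * 0.01

def route(prefill_activations):
    # prefill_activations: (seq_len, HIDDEN) hidden states from a cheap
    # prefill pass. Mean-pool over tokens, score each candidate model,
    # and return the index of the predicted best model.
    feat = prefill_activations.mean(axis=0)
    scores = feat @ W
    return int(np.argmax(scores))

acts = rng.normal(size=(10, HIDDEN))
choice = route(acts)  # index into the model pool, 0..N_MODELS-1
```

The appeal of such a scheme is that the prefill pass is paid anyway, so routing adds almost no cost on top of inference.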
Paradigm Shift
Introduces Bayesian scattering as a mathematically grounded, non-learned baseline for image uncertainty quantification.
Mar 24
Breaks Assumption
Demonstrates that direct supervised alignment outperforms self-supervised pretraining for clinical outcome prediction in healthcare.
Mar 24
Paradigm Shift
A red-teaming protocol that uses RL-driven 'profit' objectives to find structural exploits in AI agents instead of just prompt-injection vulnerabilities.
Mar 24
New Capability
Contrastive Association Learning (CAL) successfully recovers functional gene associations from expression data where standard similarity metrics fail.
Mar 24
Breaks Assumption
Shows that simple fine-tuning on plot summaries can bypass all safety guardrails to extract 90% of copyrighted books from frontier LLMs.
Mar 24
Scaling Insight
Identifies that in-context reasoning over pretraining knowledge only emerges after specific types of fine-tuning, not from pretraining alone.
Mar 24
Breaks Assumption
Consistency under paraphrase in medical VLMs is a false proxy for reliability that hides models ignoring visual inputs entirely.
Mar 24
Paradigm Shift
Pretrained Diffusion Transformers (DiTs) possess an intrinsic 'synchronization gap' where different features commit at specific, depth-localized layers.
Mar 24
Scaling Insight
Sensitivity to compression in Transformers spans five orders of magnitude, with early-layer MLP up-projections identified as catastrophic failure points.
Mar 24
Paradigm Shift
The 'routing paradox' proves that selective attention requires the very pairwise computations it aims to replace, explaining why pure recurrent models fail at associative recall.
Mar 24
Open Release
CLT-Forge democratizes mechanistic interpretability by providing an end-to-end library for training Cross-Layer Transcoders and generating feature attribution graphs.
Mar 24
New Capability
Dream Diffusion Policy enables robots to survive severe OOD disturbances by detecting reality-imagination discrepancies and switching to an internal world model.
Mar 24
New Capability
Cortical Policy introduces a dual-stream view transformer inspired by the human brain's dorsal and ventral pathways to solve complex robotic manipulation.
Mar 24
Open Release
LongCat-Flash-Prover is a 560B MoE model that sets a new SOTA for open-weights formal reasoning, achieving a 97.1% pass rate on MiniF2F-Test.
Mar 24
Scaling Insight
Context-aware Visual Fine-tuning (CoVFT) allows a 7B MLLM to outperform its 13B counterpart by resolving optimization conflicts in vision encoders.
Mar 24