SeriesFusion
Science, curated & edited by AI

Efficiency Breakthrough

375 papers
AI
Achieves an 80% reduction in Chain-of-Thought (CoT) tokens while slightly increasing reasoning accuracy.
Mar 19
AI
Extends LLM context from 32K to 128K by teaching models to selectively skip global attention for ~80% of tokens.
Mar 19
AI
Knowledge-Aware Active Learning (KA2L) uses latent space probing to identify what an LLM doesn't know and generates targeted synthetic questions.
Mar 19
AI
S-VGGT introduces structure-aware subscene decomposition to break the quadratic scaling bottleneck of 3D foundation models.
Mar 19
AI
DSS-GAN is the first generative adversarial network to use a Mamba (State Space Model) backbone for high-quality image synthesis.
Mar 19
AI
Synthetic videos of simple geometric shapes are more effective than massive real-world datasets for teaching video-language models fundamental temporal reasoning.
Mar 19
AI
Anomaly detection can be performed directly using a primary model's internal neuron output ranges, eliminating the need for expensive external AD models.
Mar 19
AI
Truncated backpropagation for video decoding reduces the memory cost of fine-tuning video diffusion models from linear to constant.
Mar 19
AI
ProbeFlow achieves 14.8x faster action decoding in Vision-Language-Action (VLA) models without any retraining.
Mar 19
AI
Parallel multi-token prediction can be achieved in standard LLMs without training auxiliary models or modifying weights.
Mar 19
AI
CARE provides a recipe for converting standard GQA models into high-efficiency Multi-head Latent Attention (MLA) architectures.
Mar 19
AI
VideoAtlas enables navigation and reasoning over long-form video using compute that scales only logarithmically with video length.
Mar 19
AI
MUD provides a faster, lower-overhead alternative to Muon for transformer training, achieving up to 2.6x higher throughput.
Mar 19
AI
LoST introduces a semantic-first 3D tokenizer that reduces the token count for 3D shape generation by up to 99.9%.
Mar 19
AI
RSM achieves 20x faster training for recursive reasoning models and enables test-time scaling for up to 20,000 refinement steps.
Mar 18
AI
Reduces high-quality 3D head avatar creation time from over 24 hours to 0.5 seconds per frame.
Mar 18
AI
Fuses categorical sampling into the LM-head matmul to eliminate logit materialization and speed up LLM decoding by up to 19%.
Mar 18
AI
Achieves microsecond-level kinodynamic motion planning for high-DOF robots by using differential flatness to solve boundary value problems analytically.
Mar 18
AI
Demonstrates that masked diffusion language models can be 21.8x more compute-efficient than traditional autoregressive models when scaled correctly.
Mar 18
AI
Introduces Helium, a serving framework that treats agentic workflows as data query plans to optimize redundant LLM calls and KV caches.
Mar 18
AI
Presents ZipCal, a model-agnostic calibration data selection strategy for pruning and quantization that is 240x faster than model-based methods.
Mar 18
AI
VQKV uses Vector Quantization to achieve over 80% KV cache compression with almost zero loss in model performance.
Mar 18
AI
FEAT is a linear-complexity foundation model designed specifically for extremely large-scale structured (tabular) data.
Mar 18
AI
Enables stable 4-bit microscaling (MXFP4) quantization for Multi-modal LLMs, which previously suffered from performance collapse.
Mar 18
AI
Low-precision optimizer states cause 'state staleness', where updates round back to stored values, but scheduled resets can fully recover the lost performance.
Mar 18
AI
GIST achieves O(N) complexity for Graph Transformers while maintaining gauge invariance, enabling scaling to meshes with 750K nodes.
Mar 18
AI
Pretrained 3D generative models can be repurposed for high-quality part segmentation using less than 1% of the typical labeled data.
Mar 18
AI
Truncated-Reasoning Self-Distillation (TRSD) allows models to maintain accuracy even when their chain-of-thought traces are heavily shortened.
Mar 17
AI
The ICaRus architecture allows multiple different models to share a single, frozen KV cache for the same prompt.
Mar 17
AI
Using parallel associative scans achieves a 44x speedup in training continuous-time Spiking Neural Networks (SNNs).
Mar 17
AI
RelayCaching eliminates redundant prefill computation in multi-agent systems by reusing the decoding-phase KV cache from previous agents.
Mar 17
AI
Pretrained Transformers exhibit a pervasive inter-head linear structure where many attention heads can be reconstructed from a small set of peer heads.
Mar 17
AI
FineRMoE extends MoE granularity to both intermediate and output dimensions, achieving a 136x increase in decoding throughput.
Mar 17
AI
Distribution-Conditioned Diffusion Decoding enables high-fidelity image generation from pre-trained VLMs without expensive full-model retraining.
Mar 17
AI
Qianfan-OCR introduces 'Layout-as-Thought,' enabling a 4B model to outperform 235B models on complex document parsing and layout analysis.
Mar 17
AI
Achieves significant tool-selection accuracy gains in LLM semantic routers with zero added serving-time latency or cost.
Mar 17
AI
A training-free acceleration method for diffusion language models that achieves a 4x speedup in image generation.
Mar 17
AI
Implements bio-inspired 'mental-state dynamics' to achieve O(N) complexity in Vision Transformers.
Mar 17
AI
Reduces the number of real-world robot rollouts needed for policy comparison by up to 70% using safe, anytime-valid inference.
Mar 17
AI
Outperforms fine-tuned baselines in code optimization by using semantics-preserving transformations as a generative intermediate representation.
Mar 17
AI
A 140M-parameter networking foundation model (PLUME) that outperforms frontier LLMs on protocol analysis by learning from native packet structures.
Mar 17
AI
Replaces the quadratic cost of self-attention in Diffusion Transformers with a convection-diffusion PDE solved in the Fourier domain.
Mar 17
AI
Implicit Maximum Likelihood Estimation (IMLE) achieves multimodal trajectory planning performance comparable to diffusion models while being 100x faster.
Mar 17
AI
Greedy Information Projection (GIP) provides a fast, geometrically principled method for selecting training data that balances quality and diversity, achieving full-data performance with a fraction of the examples.
Mar 17
AI
Traditional Spiking Neural Network (SNN) sparsity is a performance 'illusion' on GPUs; temporal aggregation is required for actual 13x speedups.
Mar 17
AI
Enables training of CNNs from scratch in true 4-bit precision on commodity CPUs with virtually no loss in accuracy.
Mar 17
AI
Introduces the FLUX preprocessing pipeline, which reduces LLM training compute by 34% by maximizing high-quality token retention.
Mar 17
AI
Reduces the RAM requirement for speech neuroprosthesis CTC decoding from 320 GB to 10 GB without sacrificing accuracy.
Mar 17
AI
Reveals that Graph-RAG performance is limited by reasoning failure rather than retrieval, and shows how to make an 8B model match a 70B baseline.
Mar 17
AI
Amortizes iterative diffusion into a one-step trajectory policy for robotics using a novel 'Keyed Drift Field' objective.
Mar 17