SeriesFusion
Science, curated & edited by AI

Efficiency Breakthrough

375 papers
AI
Proposes a temporal mixed-precision framework for diffusion models that adaptively assigns bitwidths across different denoising timesteps.
Mar 17
AI
Accelerates LLM inference by up to 1.8x using a training-free sparse pattern predictor based on SVD truncation of FFN gate matrices.
Mar 17
AI
Unifies KV cache compression and sparse attention into a single 1-bit indexing structure, eliminating the need for external metadata or predictors.
Mar 17
AI
Detects diffusion-generated images 126x faster than reconstruction-based methods by using Gaussian noise disturbance to exploit the statistical 'ease' of fitting synthetic data.
Mar 17
AI
Enables model adaptation on edge devices and non-differentiable (quantized) models using a purely backpropagation-free optimization framework.
Mar 17
AI
Achieves real-time, low-latency talking avatar generation at 34ms per frame using a one-step streaming diffusion framework.
Mar 17
AI
Introduces ZoomUI, a training-free method for GUI grounding that uses inference-time scaling to anchor natural language instructions to interface elements.
Mar 17
AI
FLORE achieves 1000x error reduction in linear sketching while being 100x faster than previous learning-based solutions.
Mar 17
AI
SleepGate introduces a biologically inspired 'sleep cycle' for the KV cache to resolve proactive interference in long-context LLMs.
Mar 17
AI
ASAP reduces LVLM FLOPs by ~80% with virtually no loss in performance using a training-free KV cache pruning recipe.
Mar 17
AI
FlashHead is a drop-in replacement for the LM classification head that provides 1.75x inference speedup by treating vocabulary selection as a retrieval problem.
Mar 17
AI
Reformulates diffusion sampling as a graph-theoretic planning problem that dynamically allocates compute to the most difficult denoising stages.
Mar 17
AI
Generates novel, structurally plausible protein sequences from small alignments using a training-free stochastic attention mechanism on a standard laptop.
Mar 17
AI
Adaptive computation for multimodal LLMs cuts wasted compute on easy cases and concentrates it on hard ones.
Mar 17
AI
HO-SFL enables backprop-free fine-tuning on edge devices without the convergence penalty typical of zeroth-order methods.
Mar 17
AI
RAZOR provides a lightweight, targeted unlearning framework for Transformers and Diffusion models without retraining.
Mar 17
AI
Introduces an asynchronous Mixture-of-Transformers architecture for autonomous driving that decouples slow reasoning from fast action execution.
Mar 17
AI
Achieves over 80% of full-resolution VLM performance while using only 1% of the original pixel budget through bio-inspired foveated sampling.
Mar 17
AI
A unified graph propagation library achieves 35,000x speedups, enabling full simulations on billion-edge graphs in seconds.
Mar 17
AI
AdaAnchor enables LLMs to perform multi-step reasoning entirely in latent space with an adaptive halting mechanism to optimize compute.
Mar 17
AI
AnoleVLA replaces the standard Transformer backbone in robotic Vision-Language-Action models with Deep State Space Models for a 3x speedup.
Mar 17
AI
Writer-R1-4B outperforms 100B+ parameter models in creative writing by utilizing memory-augmented self-reflection and fine-grained criteria generation.
Mar 17
AI
Ultra-low-bitrate image compression achieves 50% bitrate savings by treating decoding as a 'next-frame' video prediction task using diffusion priors.
Mar 17
AI
HapticVLA achieves tactile-aware robotic manipulation at an 86.7% success rate without requiring any physical tactile sensors at inference time.
Mar 17
AI
IConE enables stable self-supervised learning even at batch size 1, overcoming the memory bottlenecks of high-dimensional scientific and medical data.
Mar 17
AI
FlashU is the first framework to accelerate unified multimodal models by exploiting the distinct neuron sets used for generation vs. understanding.
Mar 17
AI
MeMix is a training-free, plug-and-play module that reduces 3D reconstruction error by up to 40% in long sequences by mitigating state drift.
Mar 17
AI
PrismMirror is the first monocular human frontal view synthesis model to achieve real-time inference (24 FPS) without external geometric models.
Mar 17
AI
A 4B-parameter model matches a 120B-parameter model in program verification through a rigorous data curation pipeline.
Mar 17
AI
Bridges the gap between generative (MAE) and predictive (I-JEPA) self-supervised learning, achieving a 10% performance boost.
Mar 17
AI
Accelerates state-of-the-art 3D human mesh recovery by over 10x, enabling real-time vision-only humanoid teleoperation.
Mar 17
AI
Introduces Mixture-of-Depths Attention (MoDA) to solve signal degradation in deep LLMs with a hardware-efficient implementation.
Mar 17
AI
Achieves 1,000x speedups in Bayesian inverse problems by replacing repeated MCMC sampling with one-step preconditioned generative transport.
Mar 17
AI
ActTail achieves 80% activation sparsity in LLMs with significantly lower perplexity degradation than uniform methods by using Heavy-Tailed Self-Regularization theory.
Mar 16
AI
ReBalance is a training-free framework that dynamically modulates 'thinking' length in reasoning models to prune redundancy during overthinking and promote exploration during underthinking.
Mar 16
AI
Achieves 100x speedup in robotic action generation by distilling iterative flow/diffusion models into a one-step policy without a pre-trained teacher.
Mar 16
AI
Reduces Chain-of-Thought (CoT) compute costs by 14-55% by learning the optimal 'early-exit' points for Large Reasoning Models.
Mar 16
AI
Accelerates Diffusion Transformers (DiTs) by 2x using a training-free framework that selectively reduces computation in non-aesthetic image regions.
Mar 16
AI
Introduces a training-free framework that allows LLM agents to dynamically scale their reasoning depth based on a pre-defined token/tool budget.
Mar 16
AI
Achieves a 98x speedup in LLM routing on AMD hardware using Flash Attention and prompt compression, enabling high-context classification without a dedicated GPU.
Mar 16
AI
Modality-level disaggregation enables cost-optimal MLLM serving across heterogeneous GPUs over commodity PCIe, bypassing the need for expensive NVLink interconnects.
Mar 16
AI
A hardware-algorithm co-design for Spiking Neural Networks achieves up to 69x energy efficiency gains using an SRAM-based Compute-in-Memory accelerator.
Mar 16
AI
Achieves 4x visual token compression and 80% lower training cost while unifying multimodal comprehension and generation.
Mar 16
AI
Adaptive VLM Routing reduces inference costs for Computer Use Agents by up to 78% with negligible accuracy loss.
Mar 16
AI
Distills a 2B Vision-Language Retriever into a 70M text-only encoder for visual document retrieval with 50x lower latency.
Mar 16
AI
CleanSight provides a training-free, test-time defense for backdoored vision-language models by detecting and pruning 'attention stealing' visual tokens.
Mar 16
AI
Structured distillation for personalized agent memory achieves an 11x reduction in token count while preserving 96% of the retrieval quality of verbatim history.
Mar 16
AI
Induces pretrained video models to perform SOTA image restoration using less than 2% of the training data required by specialized architectures.
Mar 16
AI
Achieves 'zero-hyperparameter' circuit analysis by using a foundation model to perform in-context regression, bypassing hours of manual tuning.
Mar 16
AI
Introduces Bilateral Context Conditioning to DeepSeek's GRPO, allowing models to cross-reference successful and failed reasoning traces during optimization.
Mar 16