SeriesFusion
Science, curated & edited by AI

Efficiency Breakthrough

375 papers
AI
Integrates fast scalar rewards with slow generative CoT reasoning to reduce reward model token consumption by 20%.
Mar 24
AI
Enables precise prompt routing by predicting the expected reward of a model before any response is generated.
Mar 24
AI
Reduces Tree of Thought (ToT) computational overhead by up to 75% using plug-and-play predictors for pruning.
Mar 24
AI
STAC achieves a 10x memory reduction and 4x speedup for real-time streaming 3D reconstruction using spatio-temporal cache compression.
Mar 24
AI
DiffMark enables multi-bit watermarking that is transferable across different frozen diffusion models with a 45x speedup over current methods.
Mar 24
AI
VGS-Decoding is a training-free method to mitigate medical VLM hallucinations by reweighting token probabilities based on their visual dependency.
Mar 24
AI
GEM is the first native graph-based index for multi-vector (ColBERT-style) retrieval, achieving up to 16x speedups over existing single-vector index adaptations.
Mar 24
AI
AE-LLM automatically orchestrates the optimal combination of MoE, quantization, and PEFT for specific deployment hardware and tasks.
Mar 24
AI
Row-Momentum Normalized Preconditioning (RMNP) provides Muon-level performance with significantly lower computational complexity.
Mar 24
AI
3D object localization can be achieved 100x faster by using image-based 'visual memory' instead of global 3D scene reconstruction.
Mar 24
AI
Vision-Language Models can be steered to understand negation using geometry-based representation engineering without any fine-tuning.
Mar 24
AI
Memory-Keyed Attention (MKA) achieves 5x faster training throughput and nearly 2x lower latency while matching the accuracy of compressed attention variants.
Mar 24
AI
GaussianPile adapts 3D Gaussian Splatting for volumetric imaging, achieving 11x faster reconstruction than NeRFs and 16x compression over voxel grids.
Mar 24
AI
MixedDimKV achieves 100% accuracy at a 50K context length while using as little as 0.26% of the traditional KV cache.
Mar 24
AI
A low-resource standard operating procedure (SOP) using 'Shadow-RAG' enables 32B models to reach 90% accuracy on graduate-level exams with only 3 days of labor.
Mar 24
AI
A routing framework that uses internal prefill activations to select the optimal LLM for a task, capturing 45% of the oracle accuracy gap with 74% cost savings.
Mar 24
AI
A training-free visual token pruning framework for Large Vision-Language Models that preserves geometric structure through subspace reconstruction.
Mar 24
AI
Free Sinewich enables parameter-efficient multi-task learning using frequency-based weight modulation with near-zero overhead.
Mar 24
AI
Prompt Replay speeds up GRPO training by selectively reusing 'medium difficulty' prompts to maximize learning signal in RL rollouts.
Mar 24
AI
Breaks the massive compute barrier for medium-range weather forecasting by training on a single consumer-grade GPU.
Mar 24
AI
An autonomous agent loop that optimizes GPU kernels to outperform human-expert and compiler-generated baselines.
Mar 24
AI
Introduces AgentHER, a framework that salvages 'failed' agent trajectories by relabeling them as successful demonstrations for alternative goals.
Mar 24
AI
TIDE is a post-training early-exit system that allows individual tokens to skip unnecessary layers, improving throughput by up to 8% with minimal calibration.
Mar 24
AI
PivotRL identifies 'pivot' turns in agent trajectories where actions matter most, enabling compute-efficient reinforcement learning that matches end-to-end RL at 4x lower cost.
Mar 24
AI
KG-Hopper enables 7B-parameter models to outperform 70B systems on complex Knowledge Graph reasoning by embedding the entire multi-hop process into a single 'thinking' stage.
Mar 24
AI
Achieves state-of-the-art open-vocabulary segmentation using a training-free, purely geometric projection and propagation method.
Mar 24
AI
Enables merging independently trained specialist models (e.g., Vision-LLM and Audio-LLM) into a single multimodal model without any paired training data.
Mar 24
AI
SparseVoxelDet is the first fully sparse object detector for event cameras that never instantiates a dense tensor, achieving 858x GPU memory compression.
Mar 24
AI
Confidence-Evidence Bayesian Gain (CEBaG) provides deterministic hallucination detection for medical VQA without requiring 10-20 stochastic generations.
Mar 24
AI
Enables high-performance Zeroth-Order (ZO) fine-tuning of LLMs by leveraging online curvature signals.
Mar 24
AI
Reduces token consumption in interleaved multimodal reasoning by over 72% using dynamic visual thoughts.
Mar 24
AI
Eliminates the need for strictly aligned image pairs in infrared and visible image fusion.
Mar 24
AI
Reduces human annotation requirements for NLP model testing by up to 95%.
Mar 24
AI
Achieves a 50x reduction in visual tokens for Video-LLMs while preserving over 90% of baseline performance.
Mar 24
AI
Introduces a learnable bridge between GELU and ReLU activations to enable deployment-friendly piecewise-linear networks.
Mar 24
AI
Achieves a 75x parameter reduction in 3D medical image segmentation by hybridizing Mamba and Transformer modules.
Mar 24
AI
Introduces a streaming detection head that stops Large Reasoning Models (LRMs) from 'overthinking' redundant steps.
Mar 24
AI
Reduces the token count of Stable Diffusion 3.5 by 4x for high-resolution generation with minimal fine-tuning.
Mar 24
AI
A predictive scheduling system for multi-agent workflows that optimizes serving across heterogeneous LLM clusters (mixing large and small models).
Mar 24
AI
Enables high-rank (r=384) DoRA training on single GPUs through factored norms and fused Triton kernels.
Mar 24
AI
Introduces a parallel reasoning mechanism for Vision-Language-Action (VLA) models that eliminates the latency bottleneck of autoregressive Chain-of-Thought.
Mar 24
AI
A training-free feature caching framework that achieves 2.3x speedup for video world models while maintaining 99.4% quality.
Mar 24
AI
A unified discrete diffusion framework that outperforms autoregressive models on large-scale discrete generation tasks for the first time.
Mar 24
AI
Achieves state-of-the-art LLM distillation using 10-25% of the data required by standard fine-tuning.
Mar 23
AI
Accelerates MoE inference by speculating future experts to overlap CPU-GPU memory transfers with computation.
Mar 23
AI
Achieves 97% of oracle reward performance using only 20% of the training labels for complex LLM reasoning.
Mar 23
AI
The first Joint Embedding Predictive Architecture (JEPA) to train stably end-to-end from raw pixels with massive planning speedups.
Mar 23
AI
DAPA speeds up GELU computation by 16x and reduces hardware DSP utilization by 16x for on-device Transformer deployment.
Mar 23
AI
Spectral Tempering achieves near-oracle embedding compression for dense retrieval without requiring any labeled data or grid search.
Mar 23
AI
Shows empirically that most Transformer layers are redundant, enabling a 54% training cost reduction through non-uniform budget allocation.
Mar 23