AI
Integrates fast scalar rewards with slow generative CoT reasoning to reduce reward-model token consumption by 20%.
AI
Enables precise prompt routing by predicting the expected reward of a model before any response is generated.
AI
Reduces Tree of Thought (ToT) computational overhead by up to 75% using plug-and-play predictors for pruning.
AI
STAC achieves a 10x memory reduction and 4x speedup for real-time streaming 3D reconstruction using spatio-temporal cache compression.
AI
DiffMark enables multi-bit watermarking that is transferable across different frozen diffusion models with a 45x speedup over current methods.
AI
VGS-Decoding is a training-free method to mitigate medical VLM hallucinations by reweighting token probabilities based on their visual dependency.
AI
GEM is the first native graph-based index for multi-vector (ColBERT-style) retrieval, achieving up to 16x speedups over existing single-vector index adaptations.
AI
AE-LLM automatically orchestrates the optimal combination of MoE, quantization, and PEFT for specific deployment hardware and tasks.
AI
Row-Momentum Normalized Preconditioning (RMNP) provides Muon-level performance with significantly lower computational complexity.
AI
3D object localization can be achieved 100x faster by using image-based 'visual memory' instead of global 3D scene reconstruction.
AI
Vision-Language Models can be steered to understand negation using geometry-based representation engineering without any fine-tuning.
AI
Memory-Keyed Attention (MKA) achieves 5x faster training throughput and nearly 2x lower latency while matching the accuracy of compressed attention variants.
AI
GaussianPile adapts 3D Gaussian Splatting for volumetric imaging, achieving 11x faster reconstruction than NeRFs and 16x compression over voxel grids.
AI
MixedDimKV achieves 100% accuracy at context lengths of 50K while using as little as 0.26% of the traditional KV cache.
AI
A low-resource standard operating procedure (SOP) using 'Shadow-RAG' enables 32B models to reach 90% accuracy on graduate-level exams with only 3 days of labor.
AI
A routing framework that uses internal prefill activations to select the optimal LLM for a task, capturing 45% of the oracle accuracy gap with 74% cost savings.
AI
A training-free visual token pruning framework for Large Vision-Language Models that preserves geometric structure through subspace reconstruction.
AI
Free Sinewich enables parameter-efficient multi-task learning using frequency-based weight modulation with near-zero overhead.
AI
Prompt Replay speeds up GRPO training by selectively reusing 'medium difficulty' prompts to maximize learning signal in RL rollouts.
AI
Breaks the massive compute barrier for medium-range weather forecasting by training on a single consumer-grade GPU.
AI
An autonomous agent loop that optimizes GPU kernels to outperform human-expert and compiler-generated baselines.
AI
Introduces AgentHER, a framework that salvages 'failed' agent trajectories by relabeling them as successful demonstrations for alternative goals.
AI
TIDE is a post-training early-exit system that allows individual tokens to skip unnecessary layers, improving throughput by up to 8% with minimal calibration.
AI
PivotRL identifies 'pivot' turns in agent trajectories where actions matter most, enabling compute-efficient reinforcement learning that matches end-to-end RL at 4x lower cost.
AI
KG-Hopper enables 7B-parameter models to outperform 70B systems on complex Knowledge Graph reasoning by embedding the entire multi-hop process into a single 'thinking' stage.
AI
Achieves state-of-the-art open-vocabulary segmentation using a training-free, purely geometric projection and propagation method.
AI
Enables merging independently trained specialist models (e.g., Vision-LLM and Audio-LLM) into a single multimodal model without any paired training data.
AI
SparseVoxelDet is the first fully sparse object detector for event cameras that never instantiates a dense tensor, achieving 858x GPU memory compression.
AI
Confidence-Evidence Bayesian Gain (CEBaG) provides deterministic hallucination detection for medical VQA without requiring 10-20 stochastic generations.
AI
Enables high-performance Zeroth-Order (ZO) fine-tuning of LLMs by leveraging online curvature signals.
AI
Reduces token consumption in interleaved multimodal reasoning by over 72% using dynamic visual thoughts.
AI
Eliminates the need for strictly aligned image pairs in infrared and visible image fusion.
AI
Reduces human annotation requirements for NLP model testing by up to 95%.
AI
Achieves a 50x reduction in visual tokens for Video-LLMs while preserving over 90% of baseline performance.
AI
Introduces a learnable bridge between GELU and ReLU activations to enable deployment-friendly piecewise-linear networks.
AI
Achieves a 75x parameter reduction in 3D medical image segmentation by hybridizing Mamba and Transformer modules.
AI
Introduces a streaming detection head that stops Large Reasoning Models (LRMs) from 'overthinking' redundant steps.
AI
Reduces the token count of Stable Diffusion 3.5 by 4x for high-resolution generation with minimal fine-tuning.
AI
A predictive scheduling system for multi-agent workflows that optimizes serving across heterogeneous LLM clusters (mixing large and small models).
AI
Enables high-rank (r=384) DoRA training on single GPUs through factored norms and fused Triton kernels.
AI
Introduces a parallel reasoning mechanism for Vision-Language-Action (VLA) models that eliminates the latency bottleneck of autoregressive Chain-of-Thought.
AI
A training-free feature caching framework that achieves 2.3x speedup for video world models while maintaining 99.4% quality.
AI
A unified discrete diffusion framework that outperforms autoregressive models on large-scale discrete generation tasks for the first time.
AI
Achieves state-of-the-art LLM distillation using 10-25% of the data required by standard fine-tuning.
AI
Accelerates MoE inference by speculating future experts to overlap CPU-GPU memory transfers with computation.
AI
Achieves 97% of oracle reward performance using only 20% of the training labels for complex LLM reasoning.
AI
The first Joint Embedding Predictive Architecture (JEPA) to train stably end-to-end from raw pixels with massive planning speedups.
AI
DAPA speeds up GELU computation by 16x and reduces hardware DSP utilization by 16x for on-device Transformer deployment.
AI
Spectral Tempering achieves near-oracle embedding compression for dense retrieval without requiring any labeled data or grid search.
AI
Shows empirically that most Transformer layers are redundant, enabling a 54% training cost reduction through non-uniform budget allocation.