New Capability

333 papers

Papers where something becomes possible that previously was not. New techniques, new instruments, new model behaviors, new measurements at a frontier.

New Capability / Category lead

Interfaces LLMs with Wikidata-scale graphs for multi-hop reasoning without any retraining of the model or the query executor.

It enables retrieval-augmented generation on massive knowledge graphs (1.6B relations) using off-the-shelf components. This drastically lowers the barrier for practitioners to ground LLMs in structured, large-scale factual data without expensive fine-tuning.

By SeriesFusion Editorial Board · April 1, 2026

Enables reinforcement learning for long-horizon robots across diverse tasks without requiring manual reward engineering.

First generative model capable of synthesizing physically consistent 'raw' camera sensor data from text prompts or sRGB images.

A production-ready adaptive router for LLM portfolios that manages cost-quality trade-offs in real time under strict dollar budgets.

High-quality oversight of massive proprietary LLM agents can be achieved by small, open-source 'critics' that intervene in real time within the same interaction.

Reduces multimodal jailbreak success rates by 97% using a simple conditional decoding strategy without task-specific fine-tuning.

Reconstructs authentic LiDAR point clouds under jamming attacks with a 92% success rate by exploiting raw full-waveform representations.

Enables zero-shot humanoid navigation in unseen environments using only 5 hours of human walking data and no robot-specific data.

A white-box membership inference attack using 'gradient-induced feature drift' to outperform all existing confidence-based methods.

Introduces the first auto-regressive framework for Gaussian Splatting, enabling parallel, progressive next-scale 3D generation.

Proposes a parameter-efficient LLM adaptation method that enables rapid specialization on non-stationary streams while preventing catastrophic forgetting.

Rebuilds the Agent-Computer Interaction (ACI) stack for scientific discovery, solving the fragility of JSON tool-calling and execution sandboxes.

Introduces SIGN, a framework capable of discovering governing symbolic equations for networked systems with over 100,000 nodes.

TTA-Vid enables video reasoning models to adapt to new domains at test time using label-free reinforcement learning on a single sample.

ThoughtSteer demonstrates the first successful backdoor attack on continuous latent reasoning models, which leave no token-based audit trail.

An autonomous research pipeline discovered a lifelong multimodal memory framework by diagnosing and fixing its own architectural bugs and data pipeline issues.

WARP provides provable, guaranteed repairs for inner layers of Transformers, overcoming the limitation of previous methods restricted to the final layer.

Solves #P-hard multi-objective optimization problems with tight approximation guarantees using a novel SAT-oracle approach.

Demonstrates that covert collusion between multi-agent LLM systems can be detected zero-shot using internal model activations.

First humanoid robot system to achieve consecutive ping-pong strikes using only onboard egocentric vision and whole-body coordination.

Introduces 'deconfounding scores' to enable reliable causal effect estimation even when treatment and control groups have very little overlap.

Achieves an 80x improvement in stable generation length for occupancy world models, enabling 4 km+ autonomous driving simulations from a single frame.

Leverages model reprogramming as an 'active signal amplifier' to proactively audit privacy leakage in LLMs and Diffusion models.

Achieves a 48-percentage-point accuracy gain in agents using a non-parametric online learning framework that reuses procedural plans without updating model weights.

Introduces a way for diffusion models to generate a single, sharp 'mental average' of a concept rather than blurry pixel-wise averages.

Introduces a scalable reinforcement learning framework that enables high-fidelity control of a whole-body human musculoskeletal system with over 700 muscles.

Proposes 'Nomad', an exploration-first agent architecture that autonomously discovers insights in data without being limited by human prompts or questions.

Provides a robust solution for anti-aliasing in Feed-forward Gaussian Splatting, enabling high-fidelity rendering across varying sampling rates and resolutions.

Enables precise Camera-LiDAR extrinsic calibration even under massive initial misalignments that typically break automated calibration systems.

The first prior-fitted foundation model for survival analysis that enables zero-shot time-to-event predictions on tabular data.

Provides a closed-form safety law for Dynamic Movement Primitives, enabling provably safe robot control without real-time optimization.

A novel approach to upcycle multiple dense expert models into a unified Mixture-of-Experts model without any additional training.

Introduces a GUI-native agent system that operates complex scientific instruments through their existing visual interfaces rather than requiring proprietary APIs.

Shifts multimodal LLMs from static image prefixes to an active, sequential 'Visual Chain-of-Thought' that explores images based on saliency.

The first training-free framework for high-fidelity appearance transfer specifically designed for Diffusion Transformers (DiTs).

LLMs used for financial forecasting are often 'cheating' by memorizing training data, a bias this framework detects and filters out to improve Sharpe ratios by 49%.

A unified L0-gating mechanism that enables comparable sparsification and pruning across graphs, text, and tabular data.

Enables vision models to learn online from human corrections at inference time, reducing redundant manual effort in video segmentation by up to 34%.

Enables zero-shot monocular metric depth estimation across any camera type (fisheye, 360°, ERP) using a single unified model.

Reframes LLM-assisted research as a scientific forecasting problem, training models to generate proposals that align with future (held-out) research directions.

Enables precise, physically plausible control over light position, color, and intensity in single images without a 3D model.

IP-SAM allows the Segment Anything Model (SAM) to perform automatic, prompt-free segmentation by generating its own 'intrinsic prompts'.

Moves autonomous driving from 'predict-then-plan' to an interleaved VLA model where future frames and ego-actions are generated step-by-step.

A non-Turing-complete DSL that compiles high-level LLM routing and agent policies directly into verified infrastructure artifacts like Kubernetes NetworkPolicies.

A production-grade framework that converts LLM/RAG evaluation into a deployment decision workflow using Pareto frontiers and CI gates.

Enables Active Learning for tabular data without model retraining by iteratively optimizing the 'labeled context' of foundation models.

Lie Generator Networks enable linear system identification with guaranteed physical stability and dissipation by construction rather than through loss penalties.

Achieves high-quality 3D reconstruction and camera pose estimation from sparse views without any pre-trained priors or ground-truth annotations.

Introduces 'Hidden Ads,' a new class of semantic backdoor attacks that inject promotional content into VLM responses based on natural user behavior.

Achieves zero-shot, prompt-free object removal in diffusion models purely through self-attention manipulation.

VoxAnchor uses mmWave radar to authenticate speech by matching acoustics to physical throat vibrations.