New Capability
127 papers
A new method for training axis-aligned decision trees using gradient descent and backpropagation, allowing trees to be integrated into end-to-end neural networks.
AI & ML arxiv | Mar 13
SoLA introduces the first reversible model editing framework that allows precise revocation of specific knowledge updates.
AI & ML arxiv | Mar 13
RewardHackingAgents establishes a benchmark for evaluating whether ML-engineering agents are actually solving tasks or just tampering with the evaluation code.
AI & ML arxiv | Mar 13
RoboClaw introduces 'Entangled Action Pairs' to allow robots to autonomously collect data by learning to reset their own environment.
AI & ML arxiv | Mar 13
Replaces unstructured LLM debates with 'Deliberative Collective Intelligence,' producing formal decision packets with minority reports and accountability trails.
AI & ML arxiv | Mar 13
Automates the entire robotic data generation loop, including a self-resetting mechanism that restores unstructured workspaces without human intervention.
AI & ML arxiv | Mar 13
Bridges the gap between parametric CAD and direct B-Rep synthesis using LLMs and primitive grounding.
AI & ML arxiv | Mar 13
Enables concurrent perception and reasoning for continuous video streams in Multimodal Large Language Models.
AI & ML arxiv | Mar 13
First framework for interpreting 4D molecular trajectories into natural language explanations.
AI & ML arxiv | Mar 13
Cross-domain sensor model that handles variable signal lengths and resolutions without retraining.
AI & ML arxiv | Mar 13
Enables multimodal agents to continually improve from experience and skills without any parameter updates through a dual-stream visual grounding framework.
AI & ML arxiv | Mar 13
A 3D vision-language pipeline that grounds medical diagnosis in longitudinal brain MRI via regional volumetric assessments to eliminate VLM hallucinations.
AI & ML arxiv | Mar 13
Integrates Neural ODEs with NeRFs to enable continuous-time scene dynamics that can extrapolate far beyond the original training sequence.
AI & ML arxiv | Mar 13
Integrates Chain-of-Thought reasoning directly into the Diffusion Transformer denoising process to solve complex spatial and logical tasks.
AI & ML arxiv | Mar 13
Enables VideoLLMs to perform complex logical reasoning simultaneously with video playback without incurring the latency of standard test-time scaling.
AI & ML arxiv | Mar 13
A unified streaming visual backbone that performs perception, 3D reconstruction, and robotic action simultaneously from a continuous video stream.
AI & ML arxiv | Mar 13
Enables training-free infinite video generation (hour-scale) by using evolving memory tokens to solve identity drift and motion stagnation.
AI & ML arxiv | Mar 16
Unlocks Maximum Entropy RL for high-dimensional humanoid control, matching or doubling the performance of dominant deterministic baselines.
AI & ML arxiv | Mar 16
A retrosynthesis model that explicitly learns strategic bond-disconnection reasoning via reinforcement learning with a round-trip accuracy reward.
AI & ML arxiv | Mar 16
A new system enables humanoid robots to play competitive tennis rallies with humans by learning from imperfect, fragmented motion data.
AI & ML arxiv | Mar 16
SciDesignBench provides a massive simulator-grounded environment for scientific inverse design, revealing that current LLMs struggle significantly with iterative refinement.
AI & ML arxiv | Mar 16
A self-supervised robotic system detects novel objects by training bespoke detectors on-the-fly from human video demonstrations, bypassing language-based prompts.
AI & ML arxiv | Mar 16
AIM enables post-training modulation of large models to change utility levels or focus features without any retraining or additional data.
AI & ML arxiv | Mar 16
First training-free method for debiasing reward models using Sparse Autoencoder (SAE) interventions.
AI & ML arxiv | Mar 16
A flow-based navigation policy that achieves zero-shot sim-to-real transfer across wheeled, quadrupedal, and humanoid platforms.
AI & ML arxiv | Mar 16
MotionAnymesh automatically transforms static 3D meshes into simulation-ready, articulated digital twins for robotics using vision-language models grounded in physical priors.
AI & ML arxiv | Mar 16
Multimodal OCR (MOCR) treats charts, diagrams, and tables as code-level targets (e.g., TikZ, SVG) rather than just cropping them as pixels.
AI & ML arxiv | Mar 16
Optimizes diffusion models via Direct Preference Optimization (DPO) to generate human motion that is inherently executable by real humanoid robots.
AI & ML arxiv | Mar 16
Prism prevents 'diversity collapse' in self-evolving reasoning systems by using semantic partitioning to guide the generation of new problems.
AI & ML arxiv | Mar 17
Safety fine-tuning causes representational collapse in the residual stream, leading to 'false refusals' of benign queries.
AI & ML arxiv | Mar 17
By fine-tuning on categorical refusal tokens, researchers can extract steerable directions to control fine-grained refusal behavior during inference.
AI & ML arxiv | Mar 17
Latent Entropy-Aware Decoding (LEAD) mitigates hallucinations by switching between discrete token and continuous probability-weighted embeddings based on real-time uncertainty.
AI & ML arxiv | Mar 17
Introduces event-gated sampling to eliminate interaction hallucinations in video generation, such as objects drifting after placement.
AI & ML arxiv | Mar 17
Uses generative world models to synthesize photorealistic, counterfactual failure data for training robot recovery behaviors.
AI & ML arxiv | Mar 17
Introduces StatePlane, a model-agnostic memory architecture that enables long-horizon AI reasoning without expanding the context window or KV cache.
AI & ML arxiv | Mar 17
KoopmanFlow uses a Koopman-inspired structural bias to decouple global steady-state motions from high-frequency local corrections in robotic control policies.
AI & ML arxiv | Mar 17
GradMem replaces the massive KV-cache with a compact memory state updated via test-time gradient descent.
AI & ML arxiv | Mar 17
Proposes URDF-Anything+, an autoregressive framework that generates fully executable articulated 3D models from raw visual observations.
AI & ML arxiv | Mar 17
Introduces the first system capable of imaging high-speed, non-rigid objects through strong atmospheric turbulence at 16,000 pixels per second.
AI & ML arxiv | Mar 17
Enables online, incremental 3D Gaussian Splatting for thousands of frames by replacing global reprocessing with a causal, streaming update framework.
AI & ML arxiv | Mar 17
Introduces a decentralized, multi-agent framework for scientific discovery that uses an 'ArtifactReactor' for plannerless coordination and full computational lineage.
AI & ML arxiv | Mar 17
Introduces 'Visual Chronometer' to estimate physical frame rates directly from visual dynamics, addressing the 'chronometric hallucinations' common in generative video models.
AI & ML arxiv | Mar 17
Segment Anything Reasoner (StAR) introduces parallel test-time scaling to visual segmentation tasks, eliciting latent reasoning capabilities from base models.
AI & ML arxiv | Mar 17
V-JEPA 2.1 unlocks dense, spatially structured features in video self-supervised learning, yielding massive gains in robotic manipulation and navigation.
AI & ML arxiv | Mar 17
One-Policy-Fits-All (OPFA) learns a single manipulation policy across 11 different embodiments, including grippers and dexterous hands, using geometry-aware action latents.
AI & ML arxiv | Mar 17
Interp3R is the first method to estimate depth and camera poses at arbitrary time instants by interpolating pointmaps using asynchronous event data.
AI & ML arxiv | Mar 17
MorFiC achieves zero-shot locomotion transfer across quadrupeds of different sizes and masses with up to 5x speed gains over standard baselines.
AI & ML arxiv | Mar 17
Discovers interpretable 'atoms' of model behavior by decomposing training gradients, enabling unsupervised discovery and steering of complex behaviors like refusal or arithmetic.
AI & ML arxiv | Mar 17
Achieves pose-free 3D Gaussian Splatting using only event streams, enabling reconstruction in extreme lighting and high-speed motion scenarios.
AI & ML arxiv | Mar 17
A training-free operator for streaming 3D reconstruction reduces geometric drift using Grassmannian manifolds.
AI & ML arxiv | Mar 17
DynaAvatar achieves zero-shot 3D human reconstruction from a single image with motion-dependent cloth dynamics.
AI & ML arxiv | Mar 17
Euler Characteristic Surfaces achieve 98% accuracy on time-series classification with O(n) complexity, far ahead of previous topological methods, which reached only 62%.
AI & ML arxiv | Mar 17
ForceVLA2 introduces explicit force awareness and hybrid control to Vision-Language-Action models, enabling stable contact-rich manipulation.
AI & ML arxiv | Mar 17
SCAN enables reliable sequential knowledge editing in LLMs for up to 3,000 edits without the catastrophic forgetting or model collapse seen in current methods.
AI & ML arxiv | Mar 17
This physics-informed VLM framework improves physics-grounded anomaly detection AUROC from 66.9% to 96.7%.
AI & ML arxiv | Mar 17
FuXiWeather2 is a unified end-to-end neural framework for weather assimilation and forecasting that outperforms global operational systems.
AI & ML arxiv | Mar 17
Incorporating PDE residuals into fine-tuning allows pre-trained physics foundation models to adapt to new tasks without requiring ground-truth solutions.
AI & ML arxiv | Mar 17
Mamba-3 introduces MIMO formulations and complex-valued updates to solve the state-tracking failures of previous linear models.
AI & ML arxiv | Mar 17
Uses Sparse Autoencoders (SAEs) to mechanistically repair 'moral indifference' in LLM latent representations.
AI & ML arxiv | Mar 17
A benchmark for unsolved math problems with automated verification, enabling the measurement of true mathematical discovery.
AI & ML arxiv | Mar 17
Enables Bayesian model selection and joint posterior inference over combinatorial spaces of up to billions of simulator model instantiations.
AI & ML arxiv | Mar 17
Dynamic Representational Circuit Breaking (DRCB) introduces an architectural defense against steganographic collusion in multi-agent RL by monitoring and shuffling latent communication bottlenecks.
AI & ML arxiv | Mar 18
Latent Posterior Factors (LPF) bridge neural representations with structured probabilistic reasoning by converting VAE posteriors into factors for Sum-Product Networks.
AI & ML arxiv | Mar 18
Demonstrates a complete AI-assisted mathematical research loop where a mathematician wrote zero lines of formal code to verify complex physics equilibria.
AI & ML arxiv | Mar 18
Integrates LLM agents with the industry-standard Rosetta software to automate physics-based protein design for non-canonical amino acids.
AI & ML arxiv | Mar 18
Enables the prediction of an adapter's task, performance, and attributes directly from its LoRA weights without any inference or data access.
AI & ML arxiv | Mar 18
Introduces ARISE, a hierarchical reinforcement learning framework that allows LLMs to evolve and reuse a tiered library of reasoning skills rather than treating every math problem in isolation.
AI & ML arxiv | Mar 18
Proposes the Vision-Sound-Language-Action (VSLA) paradigm, enabling robots to respond to real-time environmental acoustics during task execution.
AI & ML arxiv | Mar 18
Successfully trains a 0.9B parameter pure Spiking Neural Network (SNN) from scratch for language modeling, achieving performance without Transformer distillation.
AI & ML arxiv | Mar 18
Localizes reinforcement learning updates for code generation by using execution traces to identify the exact point of semantic failure.
AI & ML arxiv | Mar 18
Uses an asymmetric Draft-Verify-Recover pipeline to enable high-quality personalized AI assistants without compromising user privacy.
AI & ML arxiv | Mar 18
A self-supervised RLVR method that escapes the 'spurious majority' trap by using a temporary unlearning process for exploration.
AI & ML arxiv | Mar 18
Omnilingual MT scales machine translation to over 1,600 languages, an 8x increase in coverage over previous state-of-the-art systems.
AI & ML arxiv | Mar 18
This paper demonstrates precise behavioral steering of agentic traits in a 35B parameter MoE model using Sparse Autoencoder (SAE) decoded probe vectors.
AI & ML arxiv | Mar 18
Introduces a method to give frozen LLMs persistent memory in their continuous latent space, bypassing the need for text-level RAG or retraining.
AI & ML arxiv | Mar 18
Capability-Guided Compression uses Sparse Autoencoders (SAEs) to prevent 'capability loss' during model pruning and quantization.
AI & ML arxiv | Mar 18
Detects and mitigates Vision-Language Model hallucinations at inference time by analyzing visual attention entropy rather than text outputs.
AI & ML arxiv | Mar 18
Introduces a way to train Reward Models that generate 'transferable rubrics'—explicit scoring criteria that improve performance across different tasks and models.
AI & ML arxiv | Mar 18
OmniSONAR scales cross-lingual sentence embeddings to over 1,500 languages across text, speech, code, and math in a single semantic space.
AI & ML arxiv | Mar 18
Fine-tuning language models on journal publication records allows them to match or exceed human experts in judging 'scientific taste'—the ability to identify which research ideas are worth pursuing.
AI & ML arxiv | Mar 18
This method non-rigidly aligns inconsistent video diffusion frames into globally-consistent 3D pointclouds to enable high-quality environment reconstruction.
AI & ML arxiv | Mar 18
pADAM is a unified generative framework that learns shared priors across heterogeneous multi-physics families (e.g., scalar diffusion to Navier-Stokes).
AI & ML arxiv | Mar 18
SOMA provides a unified, differentiable layer that bridges incompatible human body models like SMPL and SMPL-X in a single closed-form pass.
AI & ML arxiv | Mar 18
LEAFE allows LLM agents to internalize feedback as actionable experience, enabling them to backtrack and recover from failures autonomously.
AI & ML arxiv | Mar 18
Minimum-Action Learning achieves a 10,000x reduction in noise variance for symbolic physical law identification from observational data.
AI & ML arxiv | Mar 19
Learns task-specific dense reward functions directly from images using vision foundation models, without requiring privileged simulator states.
AI & ML arxiv | Mar 19
Introduces HopChain, a framework for synthesizing multi-hop vision-language reasoning data that yields generalizable gains across 20+ diverse benchmarks.
AI & ML arxiv | Mar 19
Leverages cross-lingual inconsistencies to pinpoint exactly which experts in a Mixture-of-Experts (MoE) model store specific factual knowledge.
AI & ML arxiv | Mar 19
Proposes REAL, a Reinforcement Learning framework tailored for regression and ordinal scoring rather than simple binary accuracy.
AI & ML arxiv | Mar 19
Introduces a framework for LLM agents to autonomously evolve their policies and skill libraries during system idle time without retraining downtime.
AI & ML arxiv | Mar 19
Automates the generation of synthetic machine learning challenges to train agents that can genuinely learn research skills by doing.
AI & ML arxiv | Mar 19
Enables reliable, training-free emotion steering in speech-generative audio models via direct manipulation of specific emotion-sensitive neurons.
AI & ML arxiv | Mar 19
A framework to quantify and fix 'task steerability,' the common failure of robots to respond to new instructions while mid-task.
AI & ML arxiv | Mar 19
Proposes a world model that jointly generates appearance and binocular geometry using an epipolar-aware attention mechanism.
AI & ML arxiv | Mar 19
Introduces a paradigm for vision-language navigation that uses ubiquitously available semantic floor plans as global spatial priors.
AI & ML arxiv | Mar 19
Embeds invisible, agent-specific 'watermarks' into token distributions to enable forensic attribution and topology reconstruction in multi-agent systems.
AI & ML arxiv | Mar 19
Reduces hallucinations by teaching models 'epistemological humility'—the ability to admit they don't know something—using synthetic non-existent terms.
AI & ML arxiv | Mar 19
Introduces a Prompt-Free Universal Region Proposal Network (PF-RPN) that identifies objects in any domain without needing text or image exemplars.
AI & ML arxiv | Mar 19
FrescoDiffusion enables coherent, 4K image-to-video generation using a training-free, tiled diffusion method with precomputed latent priors.
AI & ML arxiv | Mar 19
Introduces a framework to generate complex, non-linear environments with mathematically guaranteed ground-truth optimal policies for RL benchmarking.
AI & ML arxiv | Mar 19
VectorWorld enables stable, real-time 1km+ closed-loop world model rollouts for autonomous driving using diffusion flow on vector graphs.
AI & ML arxiv | Mar 19
REAL achieves extreme quadruped parkour agility that is robust even to a 1-meter visual blind zone.
AI & ML arxiv | Mar 19
Lifting 2D features into a volumetric representation for robot manipulation policies yields a 14.8% success rate improvement by solving the 2D-3D spatial reasoning mismatch.
AI & ML arxiv | Mar 19
DebugLM allows developers to trace an LLM's specific behaviors back to individual training data sources.
AI & ML arxiv | Mar 19
Enforces formal safety and Signal Temporal Logic (STL) constraints on robotics foundation models without retraining.
AI & ML arxiv | Mar 19
SkeletonLLM allows frozen Multimodal LLMs to reason about human motion by rendering skeleton sequences into their native visual modality.
AI & ML arxiv | Mar 19
Motion-MLLM integrates IMU egomotion data into Video-LLMs to solve the fundamental scale and spatial reasoning ambiguities of purely visual models.
AI & ML arxiv | Mar 19
Engineered modularity via per-layer supervision solves the 'Hydra effect,' allowing for the surgical control of specific model behaviors.
AI & ML arxiv | Mar 20
NANOZK enables verifiable LLM inference with 70x smaller proofs and 24ms verification time using a novel layerwise decomposition.
AI & ML arxiv | Mar 20
Solves the problem of 'co-firing' conflicts in probabilistic ML routing systems using temperature-scaled softmax partitioning.
AI & ML arxiv | Mar 20
MemArchitect introduces a governance layer that decouples memory lifecycle management from LLM weights to prevent 'zombie memories.'
AI & ML arxiv | Mar 20
LLM agents can now autonomously re-identify anonymous individuals by combining sparse, non-identifying cues with public data.
AI & ML arxiv | Mar 20
VISTA decouples hypothesis generation from prompt rewriting to escape the local optima and black-box nature of current automatic prompt optimizers.
AI & ML arxiv | Mar 20
TARo introduces a learnable token-level router that steers frozen LLMs toward structured reasoning at test-time without retraining.
AI & ML arxiv | Mar 20
AcceRL introduces a fully asynchronous, decoupled RL framework for Vision-Language-Action (VLA) models that integrates a plug-and-play world model.
AI & ML arxiv | Mar 20
Generative 3D world models are used to scale Sim-to-Real reinforcement learning for robot Vision-Language-Action (VLA) models.
AI & ML arxiv | Mar 20
Learning to Self-Evolve (LSE) trains LLMs to explicitly improve their own context at test-time via reinforcement learning.
AI & ML arxiv | Mar 20
AFS-Search introduces a training-free closed-loop framework to solve spatial grounding errors in diffusion models like FLUX.1.
AI & ML arxiv | Mar 20
Introduces Action Applicability Policy Optimization to train MLLMs to strategically construct and update visual aids to solve geometry problems.
AI & ML arxiv | Mar 20
Introduces explicit spatial tokens (segmentation/depth) into the autoregressive sequence of LVLMs to enable precise 3D/2D grounding.
AI & ML arxiv | Mar 20
Automates the entire robot training pipeline by using video generation models as motion priors to synthesize both simulation environments and expert trajectories.
AI & ML arxiv | Mar 20
Enables privacy-preserving cross-model inference by using homomorphic encryption and linear alignment to map representations between independently trained LLMs.
AI & ML arxiv | Mar 20
A black-box monitoring system that uses behavioral 'fingerprints' to detect silent updates or identity shifts in LLM API endpoints.
AI & ML arxiv | Mar 20
Provides the first rigorous error certification for Physics-Informed Neural Networks (PINNs), bridging the gap between empirical residual loss and actual solution guarantees.
AI & ML arxiv | Mar 20
Uses Sparse Autoencoders (SAEs) to prove that Vision-Language-Action models learn steerable motion primitives rather than just memorized sequences.
AI & ML arxiv | Mar 20
Introduces the first discrete generation model capable of handling high-dimensional (768-1024 dims) representation tokens.
AI & ML arxiv | Mar 20
Enables continuous Level of Detail (LoD) for 3D Gaussian Splatting without the typical trade-off in full-capacity rendering quality.
AI & ML arxiv | Mar 20