New Capability

127 papers

A new method for training axis-aligned decision trees using gradient descent and backpropagation, allowing trees to be integrated into end-to-end neural networks.

AI & ML arxiv | Mar 13

SoLA introduces the first reversible model editing framework that allows precise revocation of specific knowledge updates.

AI & ML arxiv | Mar 13

RewardHackingAgents establishes a benchmark for evaluating whether ML-engineering agents are actually solving tasks or just tampering with the evaluation code.

AI & ML arxiv | Mar 13

RoboClaw introduces 'Entangled Action Pairs' to allow robots to autonomously collect data by learning to reset their own environment.

AI & ML arxiv | Mar 13

Replaces unstructured LLM debates with 'Deliberative Collective Intelligence,' producing formal decision packets with minority reports and accountability trails.

AI & ML arxiv | Mar 13

Automates the entire robotic data generation loop, including a self-resetting mechanism that restores unstructured workspaces without human intervention.

AI & ML arxiv | Mar 13

Bridges the gap between parametric CAD and direct B-Rep synthesis using LLMs and primitive grounding.

AI & ML arxiv | Mar 13

Enables concurrent perception and reasoning for continuous video streams in Multimodal Large Language Models.

AI & ML arxiv | Mar 13

First framework for interpreting 4D molecular trajectories into natural language explanations.

AI & ML arxiv | Mar 13

Cross-domain sensor model that handles variable signal lengths and resolutions without retraining.

AI & ML arxiv | Mar 13

Enables multimodal agents to continually improve from experience and skills without any parameter updates through a dual-stream visual grounding framework.

AI & ML arxiv | Mar 13

A 3D vision-language pipeline that grounds medical diagnosis in longitudinal brain MRI via regional volumetric assessments to eliminate VLM hallucinations.

AI & ML arxiv | Mar 13

Integrates Neural ODEs with NeRFs to enable continuous-time scene dynamics that can extrapolate far beyond the original training sequence.

AI & ML arxiv | Mar 13

Integrates Chain-of-Thought reasoning directly into the Diffusion Transformer denoising process to solve complex spatial and logical tasks.

AI & ML arxiv | Mar 13

Enables VideoLLMs to perform complex logical reasoning simultaneously with video playback without incurring the latency of standard test-time scaling.

AI & ML arxiv | Mar 13

A unified streaming visual backbone that performs perception, 3D reconstruction, and robotic action simultaneously from a continuous video stream.

AI & ML arxiv | Mar 13

Enables training-free infinite video generation (hour-scale) by using evolving memory tokens to solve identity drift and motion stagnation.

AI & ML arxiv | Mar 16

Unlocks Maximum Entropy RL for high-dimensional humanoid control, matching or doubling the performance of dominant deterministic baselines.

AI & ML arxiv | Mar 16

A retrosynthesis model that explicitly learns strategic bond-disconnection reasoning via reinforcement learning with a round-trip accuracy reward.

AI & ML arxiv | Mar 16

A new system enables humanoid robots to play competitive tennis rallies with humans by learning from imperfect, fragmented motion data.

AI & ML arxiv | Mar 16

SciDesignBench provides a massive simulator-grounded environment for scientific inverse design, revealing that current LLMs struggle significantly with iterative refinement.

AI & ML arxiv | Mar 16

A self-supervised robotic system detects novel objects by training bespoke detectors on-the-fly from human video demonstrations, bypassing language-based prompts.

AI & ML arxiv | Mar 16

AIM enables post-training modulation of large models to change utility levels or focus features without any retraining or additional data.

AI & ML arxiv | Mar 16

First training-free method for debiasing reward models using Sparse Autoencoder (SAE) interventions.

AI & ML arxiv | Mar 16

A flow-based navigation policy that achieves zero-shot sim-to-real transfer across wheeled, quadrupedal, and humanoid platforms.

AI & ML arxiv | Mar 16

MotionAnymesh automatically transforms static 3D meshes into simulation-ready, articulated digital twins for robotics using vision-language models grounded in physical priors.

AI & ML arxiv | Mar 16

Multimodal OCR (MOCR) treats charts, diagrams, and tables as code-level targets (e.g., TikZ, SVG) rather than just cropping them as pixels.

AI & ML arxiv | Mar 16

Optimizes diffusion models via Direct Preference Optimization (DPO) to generate human motion that is inherently executable by real humanoid robots.

AI & ML arxiv | Mar 16

Prism prevents 'diversity collapse' in self-evolving reasoning systems by using semantic partitioning to guide the generation of new problems.

AI & ML arxiv | Mar 17

Safety fine-tuning causes representational collapse in the residual stream, leading to 'false refusals' of benign queries.

AI & ML arxiv | Mar 17

By fine-tuning on categorical refusal tokens, researchers can extract steerable directions to control fine-grained refusal behavior during inference.

AI & ML arxiv | Mar 17

Latent Entropy-Aware Decoding (LEAD) mitigates hallucinations by switching between discrete token and continuous probability-weighted embeddings based on real-time uncertainty.

AI & ML arxiv | Mar 17

Introduces event-gated sampling to eliminate interaction hallucinations in video generation, such as objects drifting after placement.

AI & ML arxiv | Mar 17

Uses generative world models to synthesize photorealistic, counterfactual failure data for training robot recovery behaviors.

AI & ML arxiv | Mar 17

Introduces StatePlane, a model-agnostic memory architecture that enables long-horizon AI reasoning without expanding the context window or KV cache.

AI & ML arxiv | Mar 17

KoopmanFlow uses a Koopman-inspired structural bias to decouple global steady-state motions from high-frequency local corrections in robotic control policies.

AI & ML arxiv | Mar 17

GradMem replaces the massive KV-cache with a compact memory state updated via test-time gradient descent.

AI & ML arxiv | Mar 17

Proposes URDF-Anything+, an autoregressive framework that generates fully executable articulated 3D models from raw visual observations.

AI & ML arxiv | Mar 17

Introduces the first system capable of imaging high-speed, non-rigid objects through strong atmospheric turbulence at 16,000 pixels per second.

AI & ML arxiv | Mar 17

Enables online, incremental 3D Gaussian Splatting for thousands of frames by replacing global reprocessing with a causal, streaming update framework.

AI & ML arxiv | Mar 17

Introduces a decentralized, multi-agent framework for scientific discovery that uses an 'ArtifactReactor' for plannerless coordination and full computational lineage.

AI & ML arxiv | Mar 17

Introduces 'Visual Chronometer' to estimate physical frame rates directly from visual dynamics, addressing the 'chronometric hallucinations' common in generative video models.

AI & ML arxiv | Mar 17

Segment Anything Reasoner (StAR) successfully introduces parallel test-time scaling to visual segmentation tasks, eliciting latent reasoning capabilities from base models.

AI & ML arxiv | Mar 17

V-JEPA 2.1 unlocks dense, spatially structured features in video self-supervised learning, yielding massive gains in robotic manipulation and navigation.

AI & ML arxiv | Mar 17

One-Policy-Fits-All (OPFA) learns a single manipulation policy across 11 different embodiments, including grippers and dexterous hands, using geometry-aware action latents.

AI & ML arxiv | Mar 17

Interp3R is the first method to estimate depth and camera poses at arbitrary time instants by interpolating pointmaps using asynchronous event data.

AI & ML arxiv | Mar 17

MorFiC achieves zero-shot locomotion transfer across quadrupeds of different sizes and masses with up to 5x speed gains over standard baselines.

AI & ML arxiv | Mar 17

Discovers interpretable 'atoms' of model behavior by decomposing training gradients, enabling unsupervised discovery and steering of complex behaviors like refusal or arithmetic.

AI & ML arxiv | Mar 17

Achieves pose-free 3D Gaussian Splatting using only event streams, enabling reconstruction in extreme lighting and high-speed motion scenarios.

AI & ML arxiv | Mar 17

A training-free operator for streaming 3D reconstruction reduces geometric drift using Grassmannian manifolds.

AI & ML arxiv | Mar 17

DynaAvatar achieves zero-shot 3D human reconstruction from a single image with motion-dependent cloth dynamics.

AI & ML arxiv | Mar 17

Euler Characteristic Surfaces achieve 98% accuracy on time-series classification with O(n) complexity, compared with 62% for previous topological methods.

AI & ML arxiv | Mar 17

ForceVLA2 introduces explicit force awareness and hybrid control to Vision-Language-Action models, enabling stable contact-rich manipulation.

AI & ML arxiv | Mar 17

SCAN enables reliable sequential knowledge editing in LLMs for up to 3,000 edits without the catastrophic forgetting or model collapse seen in current methods.

AI & ML arxiv | Mar 17

This physics-informed VLM framework improves physics-grounded anomaly detection AUROC from 66.9% to 96.7%.

AI & ML arxiv | Mar 17

FuXiWeather2 is a unified end-to-end neural framework for weather assimilation and forecasting that outperforms global operational systems.

AI & ML arxiv | Mar 17

Incorporating PDE residuals into fine-tuning allows pre-trained physics foundation models to adapt to new tasks without requiring ground-truth solutions.

AI & ML arxiv | Mar 17

Mamba-3 introduces MIMO formulations and complex-valued updates to solve the state-tracking failures of previous linear models.

AI & ML arxiv | Mar 17

Uses Sparse Autoencoders (SAEs) to mechanistically repair 'moral indifference' in LLM latent representations.

AI & ML arxiv | Mar 17

A benchmark for unsolved math problems with automated verification, enabling the measurement of true mathematical discovery.

AI & ML arxiv | Mar 17

Enables Bayesian model selection and joint posterior inference over combinatorial spaces of up to billions of simulator model instantiations.

AI & ML arxiv | Mar 17

Dynamic Representational Circuit Breaking (DRCB) introduces an architectural defense against steganographic collusion in multi-agent RL by monitoring and shuffling latent communication bottlenecks.

AI & ML arxiv | Mar 18

Latent Posterior Factors (LPF) bridge neural representations with structured probabilistic reasoning by converting VAE posteriors into factors for Sum-Product Networks.

AI & ML arxiv | Mar 18

Demonstrates a complete AI-assisted mathematical research loop where a mathematician wrote zero lines of formal code to verify complex physics equilibria.

AI & ML arxiv | Mar 18

Integrates LLM agents with the industry-standard Rosetta software to automate physics-based protein design for non-canonical amino acids.

AI & ML arxiv | Mar 18

Enables the prediction of an adapter's task, performance, and attributes directly from its LoRA weights without any inference or data access.

AI & ML arxiv | Mar 18

Introduces ARISE, a hierarchical reinforcement learning framework that allows LLMs to evolve and reuse a tiered library of reasoning skills rather than treating every math problem in isolation.

AI & ML arxiv | Mar 18

Proposes the Vision-Sound-Language-Action (VSLA) paradigm, enabling robots to respond to real-time environmental acoustics during task execution.

AI & ML arxiv | Mar 18

Successfully trains a 0.9B parameter pure Spiking Neural Network (SNN) from scratch for language modeling, achieving performance without Transformer distillation.

AI & ML arxiv | Mar 18

Localizes reinforcement learning updates for code generation by using execution traces to identify the exact point of semantic failure.

AI & ML arxiv | Mar 18

Uses an asymmetric Draft-Verify-Recover pipeline to enable high-quality personalized AI assistants without compromising user privacy.

AI & ML arxiv | Mar 18

A self-supervised RLVR method that escapes the 'spurious majority' trap by using a temporary unlearning process for exploration.

AI & ML arxiv | Mar 18

Omnilingual MT scales machine translation to over 1,600 languages, an 8x increase in coverage over previous state-of-the-art systems.

AI & ML arxiv | Mar 18

This paper demonstrates precise behavioral steering of agentic traits in a 35B parameter MoE model using Sparse Autoencoder (SAE) decoded probe vectors.

AI & ML arxiv | Mar 18

Introduces a method to give frozen LLMs persistent memory in their continuous latent space, bypassing the need for text-level RAG or retraining.

AI & ML arxiv | Mar 18

Capability-Guided Compression uses Sparse Autoencoders (SAEs) to prevent 'capability loss' during model pruning and quantization.

AI & ML arxiv | Mar 18

Detects and mitigates Vision-Language Model hallucinations at inference time by analyzing visual attention entropy rather than text outputs.

AI & ML arxiv | Mar 18
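
For readers unfamiliar with the quantity named in the entry above, the entropy of an attention distribution can be sketched as follows. This is a generic Shannon-entropy illustration, not the paper's detection method; the function name `attention_entropy` and the sample weights are hypothetical.

```python
import math

def attention_entropy(weights):
    """Shannon entropy (in nats) of a non-negative attention weight vector.

    The weights are normalized to sum to 1 first. Low entropy means the
    attention is concentrated on a few positions; high entropy means it
    is spread out across many positions.
    """
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    probs = [w / total for w in weights]
    # Zero-probability entries contribute nothing to the entropy.
    return -sum(p * math.log(p) for p in probs if p > 0)

# Hypothetical attention over four image patches.
focused = attention_entropy([0.97, 0.01, 0.01, 0.01])
diffuse = attention_entropy([0.25, 0.25, 0.25, 0.25])
print(focused < diffuse)
```

How such a score would be thresholded or used for mitigation is specific to the paper and is not shown here.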

Introduces a way to train Reward Models that generate 'transferable rubrics'—explicit scoring criteria that improve performance across different tasks and models.

AI & ML arxiv | Mar 18

OmniSONAR scales cross-lingual sentence embeddings to over 1,500 languages across text, speech, code, and math in a single semantic space.

AI & ML arxiv | Mar 18

Fine-tuning language models on journal publication records allows them to match or exceed human experts in judging 'scientific taste'—the ability to identify which research ideas are worth pursuing.

AI & ML arxiv | Mar 18

This method non-rigidly aligns inconsistent video diffusion frames into globally-consistent 3D pointclouds to enable high-quality environment reconstruction.

AI & ML arxiv | Mar 18

pADAM is a unified generative framework that learns shared priors across heterogeneous multi-physics families (e.g., scalar diffusion to Navier-Stokes).

AI & ML arxiv | Mar 18

SOMA provides a unified, differentiable layer that bridges incompatible human body models like SMPL and SMPL-X in a single closed-form pass.

AI & ML arxiv | Mar 18

LEAFE allows LLM agents to internalize feedback as actionable experience, enabling them to backtrack and recover from failures autonomously.

AI & ML arxiv | Mar 18

Minimum-Action Learning achieves a 10,000x reduction in noise variance for symbolic physical law identification from observational data.

AI & ML arxiv | Mar 19

Learns task-specific dense reward functions directly from images using vision foundation models, without requiring privileged simulator states.

AI & ML arxiv | Mar 19

Introduces HopChain, a framework for synthesizing multi-hop vision-language reasoning data that yields generalizable gains across 20+ diverse benchmarks.

AI & ML arxiv | Mar 19

Leverages cross-lingual inconsistencies to pinpoint exactly which experts in a Mixture-of-Experts (MoE) model store specific factual knowledge.

AI & ML arxiv | Mar 19

Proposes REAL, a Reinforcement Learning framework tailored for regression and ordinal scoring rather than simple binary accuracy.

AI & ML arxiv | Mar 19

Introduces a framework for LLM agents to autonomously evolve their policies and skill libraries during system idle time without retraining downtime.

AI & ML arxiv | Mar 19

Automates the generation of synthetic machine learning challenges to train agents that can genuinely learn research skills from doing.

AI & ML arxiv | Mar 19

Enables reliable, training-free emotion steering in speech-generative audio models via direct manipulation of specific emotion-sensitive neurons.

AI & ML arxiv | Mar 19

A framework to quantify and fix 'task steerability,' the common failure of robots to respond to new instructions while mid-task.

AI & ML arxiv | Mar 19

Proposes a world model that jointly generates appearance and binocular geometry using an epipolar-aware attention mechanism.

AI & ML arxiv | Mar 19

Introduces a paradigm for vision-language navigation that uses ubiquitously available semantic floor plans as global spatial priors.

AI & ML arxiv | Mar 19

Embeds invisible, agent-specific 'watermarks' into token distributions to enable forensic attribution and topology reconstruction in multi-agent systems.

AI & ML arxiv | Mar 19

Reduces hallucinations by teaching models 'epistemological humility'—the ability to admit they don't know something—using synthetic non-existent terms.

AI & ML arxiv | Mar 19

Introduces a Prompt-Free Universal Region Proposal Network (PF-RPN) that identifies objects in any domain without needing text or image exemplars.

AI & ML arxiv | Mar 19

FrescoDiffusion enables coherent, 4K image-to-video generation using a training-free, tiled diffusion method with precomputed latent priors.

AI & ML arxiv | Mar 19

Introduces a framework to generate complex, non-linear environments with mathematically guaranteed ground-truth optimal policies for RL benchmarking.

AI & ML arxiv | Mar 19

VectorWorld enables stable, real-time 1km+ closed-loop world model rollouts for autonomous driving using diffusion flow on vector graphs.

AI & ML arxiv | Mar 19

REAL achieves extreme quadruped parkour agility that is robust even to a 1-meter visual blind zone.

AI & ML arxiv | Mar 19

Lifting 2D features into a volumetric representation for robot manipulation policies yields a 14.8% success rate improvement by solving the 2D-3D spatial reasoning mismatch.

AI & ML arxiv | Mar 19

DebugLM allows developers to trace an LLM's specific behaviors back to individual training data sources.

AI & ML arxiv | Mar 19

Enforces formal safety and Signal Temporal Logic (STL) constraints on robotics foundation models without retraining.

AI & ML arxiv | Mar 19

SkeletonLLM allows frozen Multimodal LLMs to reason about human motion by rendering skeleton sequences into their native visual modality.

AI & ML arxiv | Mar 19

Motion-MLLM integrates IMU egomotion data into Video-LLMs to solve the fundamental scale and spatial reasoning ambiguities of purely visual models.

AI & ML arxiv | Mar 19

Engineered modularity via per-layer supervision solves the 'Hydra effect,' allowing for the surgical control of specific model behaviors.

AI & ML arxiv | Mar 20

NANOZK enables verifiable LLM inference with 70x smaller proofs and 24ms verification time using a novel layerwise decomposition.

AI & ML arxiv | Mar 20

Solves the problem of 'co-firing' conflicts in probabilistic ML routing systems using temperature-scaled softmax partitioning.

AI & ML arxiv | Mar 20
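
As background for the entry above, temperature-scaled softmax can be sketched in a few lines. This is a generic illustration of how temperature sharpens or flattens a routing distribution, not the paper's partitioning scheme; the function name `softmax_with_temperature` and the example scores are hypothetical.

```python
import math

def softmax_with_temperature(scores, temperature=1.0):
    """Convert raw scores to probabilities, scaled by a temperature.

    Temperatures below 1 sharpen the distribution (one route dominates);
    temperatures above 1 flatten it (routes share probability mass).
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [s / temperature for s in scores]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Two routes with nearly equal scores and one weaker route.
scores = [2.0, 1.9, 0.5]
soft = softmax_with_temperature(scores, temperature=1.0)
sharp = softmax_with_temperature(scores, temperature=0.1)
print(soft[0] < sharp[0])
```

With nearly tied scores, lowering the temperature pushes more probability onto the top-scoring route, which is one plausible way to reduce cases where two routes fire with similar confidence.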

MemArchitect introduces a governance layer that decouples memory lifecycle management from LLM weights to prevent 'zombie memories.'

AI & ML arxiv | Mar 20

LLM agents can now autonomously re-identify anonymous individuals by combining sparse, non-identifying cues with public data.

AI & ML arxiv | Mar 20

VISTA decouples hypothesis generation from prompt rewriting to escape the local optima and black-box nature of current automatic prompt optimizers.

AI & ML arxiv | Mar 20

TARo introduces a learnable token-level router that steers frozen LLMs toward structured reasoning at test-time without retraining.

AI & ML arxiv | Mar 20

AcceRL introduces a fully asynchronous, decoupled RL framework for Vision-Language-Action (VLA) models that integrates a plug-and-play world model.

AI & ML arxiv | Mar 20

Generative 3D world models are used to scale Sim-to-Real reinforcement learning for robot Vision-Language-Action (VLA) models.

AI & ML arxiv | Mar 20

Learning to Self-Evolve (LSE) trains LLMs to explicitly improve their own context at test-time via reinforcement learning.

AI & ML arxiv | Mar 20

AFS-Search introduces a training-free closed-loop framework to solve spatial grounding errors in diffusion models like FLUX.1.

AI & ML arxiv | Mar 20

Introduces Action Applicability Policy Optimization to train MLLMs to strategically construct and update visual aids to solve geometry problems.

AI & ML arxiv | Mar 20

Introduces explicit spatial tokens (segmentation/depth) into the autoregressive sequence of LVLMs to enable precise 3D/2D grounding.

AI & ML arxiv | Mar 20

Automates the entire robot training pipeline by using video generation models as motion priors to synthesize both simulation environments and expert trajectories.

AI & ML arxiv | Mar 20

Enables privacy-preserving cross-model inference by using homomorphic encryption and linear alignment to map representations between independently trained LLMs.

AI & ML arxiv | Mar 20

A black-box monitoring system that uses behavioral 'fingerprints' to detect silent updates or identity shifts in LLM API endpoints.

AI & ML arxiv | Mar 20

Provides the first rigorous error certification for Physics-Informed Neural Networks (PINNs), bridging the gap between empirical residual loss and actual solution guarantees.

AI & ML arxiv | Mar 20

Uses Sparse Autoencoders (SAEs) to prove that Vision-Language-Action models learn steerable motion primitives rather than just memorized sequences.

AI & ML arxiv | Mar 20

Introduces the first discrete generation model capable of handling high-dimensional (768-1024 dims) representation tokens.

AI & ML arxiv | Mar 20

Enables continuous Level of Detail (LoD) for 3D Gaussian Splatting without the typical trade-off in full-capacity rendering quality.

AI & ML arxiv | Mar 20