Breaks Assumption
88 papers
Routing signatures reveal that MoE experts are highly task-specific, allowing a simple linear classifier to identify task categories with 92.5% accuracy based only on routing patterns.
AI & ML arxiv | Mar 13
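The routing-signature idea can be sketched in a few lines. Below is a toy numpy illustration, not the paper's method: the expert-usage profiles are invented, and a nearest-centroid rule stands in for the paper's linear classifier, but it shows how a task category can be read off from which experts a prompt's tokens are routed to.

```python
import numpy as np

rng = np.random.default_rng(0)
n_experts = 8

# Hypothetical per-task expert-usage profiles: each task category
# concentrates its routing mass on a different subset of experts.
task_profiles = {
    "code": np.array([0.40, 0.30, 0.10, 0.05, 0.05, 0.05, 0.03, 0.02]),
    "math": np.array([0.05, 0.05, 0.40, 0.30, 0.10, 0.05, 0.03, 0.02]),
}

def routing_signature(expert_choices, n_experts):
    """Signature = empirical frequency of each expert over a token sequence."""
    counts = np.bincount(expert_choices, minlength=n_experts)
    return counts / counts.sum()

def classify(signature, profiles):
    """Nearest-centroid stand-in for a linear classifier on signatures."""
    return min(profiles, key=lambda t: np.linalg.norm(signature - profiles[t]))

# Simulate token-level routing for a "math" prompt and classify it.
choices = rng.choice(n_experts, size=500, p=task_profiles["math"])
sig = routing_signature(choices, n_experts)
print(classify(sig, task_profiles))
```

With 500 routed tokens the empirical signature sits close to the generating profile, so the classification is essentially deterministic, which is the intuition behind the 92.5% figure.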
LLM-based user simulators create an 'easy mode' for agents that fails to capture real human frustration, ambiguity, and feedback nuances.
AI & ML arxiv | Mar 13
Machine unlearning in LLMs is often a 'mirage' that can be bypassed using simple multi-hop reasoning or entity aliasing.
AI & ML arxiv | Mar 13
MirrorDrift demonstrates a successful SLAM-targeted attack on production-grade 'secure' LiDARs using simple actuated mirrors rather than complex signal injection.
AI & ML arxiv | Mar 13
An evaluation of 17 LLMs reveals a 'conversation tax' where multi-turn interactions consistently degrade diagnostic reasoning compared to single-shot prompts.
AI & ML arxiv | Mar 13
Re-evaluating high-profile medical AI safety claims reveals that reported triage failures were artifacts of the 'exam-style' evaluation format rather than model incapacity.
AI & ML arxiv | Mar 13
Softmax normalization mathematically mandates the creation of attention sinks to serve as 'null states' when models need to ignore input.
AI & ML arxiv | Mar 13
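The normalization argument is easy to verify directly: softmax weights must sum to 1, so a head can never assign zero attention everywhere; the only way to "ignore" every content token is to dump the mass onto a sink position. A minimal numpy demonstration (the logit values are arbitrary):

```python
import numpy as np

def softmax(z):
    z = z - z.max()          # numerically stable softmax
    e = np.exp(z)
    return e / e.sum()

# Logits for 4 content tokens the model would like to ignore entirely.
logits = np.array([0.1, -0.2, 0.0, 0.3])
w = softmax(logits)
assert np.isclose(w.sum(), 1.0)   # normalization forces the mass somewhere

# Prepend a high-logit "sink" position: it soaks up nearly all the mass,
# leaving near-zero attention on the actual content tokens.
w_sink = softmax(np.concatenate(([8.0], logits)))
print(w_sink[0])         # sink weight, close to 1
print(w_sink[1:].max())  # content weights collapse toward zero
```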
An empirical study reveals that models under 7B parameters have a fundamental utilization bottleneck that prevents them from using retrieved context effectively.
AI & ML arxiv | Mar 13
The discovery of 'Helicoid Dynamics' identifies a critical safety failure where frontier LLMs accurately name their reasoning errors but are structurally unable to stop repeating them.
AI & ML arxiv | Mar 13
Shows that simple sequential fine-tuning with LoRA outperforms complex algorithms for continual reinforcement learning in VLA models.
AI & ML arxiv | Mar 13
Proves that policy gradient algorithms naturally collapse entropy and provides a mathematical fix to preserve exploration and diversity.
AI & ML arxiv | Mar 13
Demonstrates that simply using XML tags during translation outperforms complex pipelines for cross-lingual label projection while actually improving translation quality.
AI & ML arxiv | Mar 13
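The tag-based projection trick is simple enough to sketch: wrap the labeled span in a lightweight XML tag, translate the tagged sentence, and read the projected span back out of the translation. The `translate_de` stub below is a hypothetical stand-in for whatever MT system or LLM does the actual translation.

```python
import re

# Source sentence with an entity span wrapped in a lightweight XML tag.
src = "The summit was held in <LOC>Berlin</LOC> last week."

# Hypothetical stand-in for a markup-preserving MT system; a real
# pipeline would call an LLM or MT API here.
def translate_de(text):
    return "Der Gipfel fand letzte Woche in <LOC>Berlin</LOC> statt."

tgt = translate_de(src)

# Label projection: the tagged span in the translation *is* the entity.
m = re.search(r"<LOC>(.*?)</LOC>", tgt)
print(m.group(1))  # Berlin
```

The whole "pipeline" is one regex over the translated text, which is the contrast the summary draws against multi-stage alignment-based projection.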
Identifies and solves the 'information self-locking' failure mode where RL-trained agents stop asking informative questions in active reasoning tasks.
AI & ML arxiv | Mar 13
Shows that LLM self-correction fails primarily due to 'session context' and can be significantly improved by moving the review to a fresh, independent session.
AI & ML arxiv | Mar 13
Discovers that task-specific experts are so dense around pretrained weights that random parameter perturbations can compete with complex RL methods like PPO.
AI & ML arxiv | Mar 13
Reveals that 'Reasoning LLMs-as-Judges' can lead to policies that generate highly effective adversarial outputs to deceive other judges and inflate benchmarks.
AI & ML arxiv | Mar 13
Uncovers an emergent Hue-Saturation-Lightness (HSL) subspace in FLUX.1's VAE latent space, allowing for precise, training-free color control.
AI & ML arxiv | Mar 13
The researchers demonstrate that prompt injection is caused by 'role confusion' in the latent space, where models assign authority based on the style of writing rather than the source of the text.
AI & ML arxiv | Mar 16
This theoretical work refutes the 'Garbage In, Garbage Out' mantra for modern ML, proving that high-dimensional model capacity can asymptotically overcome predictor error and structural uncertainty.
AI & ML arxiv | Mar 16
This study proves that reasoning traces (Chain-of-Thought) causally shape model behavior and generalization, even when the final answer is held constant.
AI & ML arxiv | Mar 16
SpectralGuard identifies a 'memory collapse' vulnerability in State Space Models (like Mamba) where adversarial inputs can drive the transition operator's spectral radius to zero.
AI & ML arxiv | Mar 16
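The "memory collapse" failure mode can be illustrated with the scalar skeleton of a selective state-space recurrence, h_t = a_t·h_{t-1} + x_t: if an input drives a transition coefficient to zero (spectral radius zero in the scalar case), the state is wiped and everything before that step is erased. A toy sketch, not the paper's attack:

```python
import numpy as np

def ssm_scan(a_seq, x_seq):
    """Minimal input-dependent linear recurrence h_t = a_t * h_{t-1} + x_t
    (the scalar skeleton of a selective SSM layer)."""
    h, hist = 0.0, []
    for a, x in zip(a_seq, x_seq):
        h = a * h + x
        hist.append(h)
    return np.array(hist)

x = np.ones(6)
benign = ssm_scan(np.full(6, 0.9), x)                     # decaying memory
collapsed = ssm_scan([0.9, 0.9, 0.0, 0.9, 0.9, 0.9], x)  # a_3 driven to 0

# After the zeroed transition the state restarts from scratch:
# all history before step 3 is gone.
print(benign[-1], collapsed[-1])
```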
Reveals that standard global correlation metrics for LLM judges fail to predict success in 'best-of-n' selection tasks due to within-prompt signal loss.
AI & ML arxiv | Mar 16
Shows that tool-augmented agents suffer from 'recommendation drift' where they provide unsafe advice under tool corruption while maintaining high ranking scores.
AI & ML arxiv | Mar 16
Challenges the standard practice of deep PPO training by proving that consensus aggregation of 'wider' parallel runs is 8x more sample-efficient than multiple epochs.
AI & ML arxiv | Mar 16
Probing of Vision-Language-Action (VLA) models reveals that the action decoder largely ignores the reasoning logic in Chain-of-Thought, relying almost exclusively on object names.
AI & ML arxiv | Mar 16
The TaoBench benchmark proves that state-of-the-art math LLMs fail on equivalent logic problems when presented outside of the standard 'MathLib' framework.
AI & ML arxiv | Mar 16
Breaks the long-standing accuracy-robustness trade-off in VLMs by localizing adversarial robustness to shallow layers.
AI & ML arxiv | Mar 16
Reveals that 'reasoning' gains in fine-tuned LLMs may be artifacts of task familiarity rather than improved capability.
AI & ML arxiv | Mar 16
This paper presents an exact federated unlearning protocol for foundation models that is pointwise identical to centralized retraining but uses fixed-size messages.
AI & ML arxiv | Mar 16
This study proves that even with a 'perfect' noise transition matrix, statistically consistent noise-correction methods still suffer from performance collapse.
AI & ML arxiv | Mar 16
A cross-dataset study reveals that modern general-purpose vision models (GP-VMs) outperform specialized medical architectures in 2D medical image segmentation.
AI & ML arxiv | Mar 16
Reveals that linearized attention never converges to the NTK limit in practice, explaining its unique 'influence malleability' compared to standard networks.
AI & ML arxiv | Mar 16
Finds that privacy vulnerability and utility are both concentrated in a tiny fraction of 'critical weights' based on their location rather than value.
AI & ML arxiv | Mar 16
STEVO-Bench reveals that current 'video world models' fail to simulate physical processes when the camera looks away or lights go out.
AI & ML arxiv | Mar 16
Researchers identified just three specific attention heads that govern persona and style, enabling precise steering without degrading model coherence.
AI & ML arxiv | Mar 17
Robustness certificates based on real arithmetic often fail when executed on actual floating-point hardware.
AI & ML arxiv | Mar 17
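The gap between real and floating-point arithmetic is one line of Python: IEEE-754 addition is not associative, so a certificate of the form "margin > 0 ⇒ robust" that holds over the reals can flip depending on evaluation order. A toy illustration of the failure mode, not the paper's construction:

```python
# Real arithmetic is associative; IEEE-754 floating point is not.
a, b, c = 0.1, 0.2, 0.3
left = (a + b) + c
right = a + (b + c)
print(left == right)        # False: the two orders round differently

# A toy "certificate": robust iff the sum exceeds 0.6. Over the reals
# both orderings give exactly 0.6; in floats the verdict flips.
print(left > 0.6, right > 0.6)  # True False
```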
Prompt complexity in production environments can completely neutralize structured reasoning frameworks like STAR, dropping accuracy from 100% to 0%.
AI & ML arxiv | Mar 17
A systematic study reveals that SOTA representation learning methods for microscopy perform no better than untrained models or simple structural baselines.
AI & ML arxiv | Mar 17
Replacing the linear Query projection in Transformers with a nonlinear residual MLP significantly improves performance with minimal parameter growth.
AI & ML arxiv | Mar 17
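A minimal sketch of the architectural change, with invented dimensions: keep the usual linear query map but add a nonlinear residual path through a small bottleneck MLP, so the extra parameter count stays at 2·d·r rather than another d·d. The bottleneck width and ReLU choice here are assumptions, not the paper's exact design.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 16                                   # model dimension (illustrative)

x = rng.normal(size=d)
W_q = rng.normal(size=(d, d)) / np.sqrt(d)

# Standard Transformer query: a single linear map.
q_linear = W_q @ x

# Variant from the summary: linear path plus a nonlinear residual MLP
# with a narrow bottleneck r << d to keep parameter growth minimal.
r = 4
W1 = rng.normal(size=(r, d)) / np.sqrt(d)
W2 = rng.normal(size=(d, r)) / np.sqrt(r)
q_residual = W_q @ x + W2 @ np.maximum(W1 @ x, 0.0)  # ReLU bottleneck

print(q_linear.shape, q_residual.shape)  # (16,) (16,)
```

Extra parameters: 2·d·r = 128 on top of d² = 256 for the base projection, i.e. the "minimal parameter growth" the summary refers to.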
Reveals that diffusion models overfit at intermediate noise levels that standard evaluation metrics typically ignore.
AI & ML arxiv | Mar 17
Identifies 'ghosts of softmax'—complex singularities that cap the Taylor convergence radius of cross-entropy loss—explaining why models collapse at specific step sizes.
AI & ML arxiv | Mar 17
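The "ghost" is visible already in the scalar case. As an illustration consistent with the summary (not the paper's derivation): the two-class cross-entropy of a logit $z$ is the softplus $\ell(z) = \log(1 + e^{z})$, which is smooth on the real line but has complex singularities wherever the argument of the logarithm vanishes:

```latex
\ell(z) = \log\!\left(1 + e^{z}\right), \qquad
1 + e^{z} = 0 \;\Longleftrightarrow\; z = i\pi(2k+1), \quad k \in \mathbb{Z}.
```

The nearest singularities to the origin sit at $z = \pm i\pi$, so the Taylor expansion of $\ell$ about $0$ converges only for $|z| < \pi$: a hard cap on the step sizes for which a local polynomial model of the loss remains valid.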
Researchers discovered that just three specific attention heads in frozen Vision-Language-Action (VLA) models can detect trajectory deviations with 44.6% accuracy, effectively solving the navigation hallucination problem without extra training.
AI & ML arxiv | Mar 17
Groups with bounded rationality and stochasticity can outperform perfectly rational agents because randomness encodes signals lost in deterministic behavior.
AI & ML arxiv | Mar 17
A massive study of 19 LLMs reveals that subtle identity cues in names and dialects systematically bias automated text annotation.
AI & ML arxiv | Mar 17
Provides empirical evidence that LLMs hallucinate not from a lack of internal uncertainty, but because that uncertainty is 'functionally silent' during output generation.
AI & ML arxiv | Mar 17
Identifies a structural flaw in the standard Expected Calibration Error (ECE) when applied to soft labels and introduces SMECE to fix it.
AI & ML arxiv | Mar 17
Demonstrates that gated predictive autoencoders can match or outperform JEPA-style architectures by learning to select predictable components.
AI & ML arxiv | Mar 17
Identifies that extended reasoning in Multimodal LLMs causes 'attention dispersion,' where models literally lose focus on visual inputs as the reasoning chain lengthens.
AI & ML arxiv | Mar 17
Discovers that frozen video diffusion models already encode physical plausibility in their features, allowing for cost-effective inference-time physics filtering.
AI & ML arxiv | Mar 17
Argues that probability gradients are superior to standard log-probability gradients for RL training, proposing a new optimization method (DGPO) to solve divergence in soft clipping.
AI & ML arxiv | Mar 17
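The relationship between the two gradient flavors is just the chain rule, ∇p = p·∇log p: the probability gradient rescales the familiar log-probability (REINFORCE-style) gradient by p itself, damping updates on low-probability actions. A one-parameter Bernoulli check (DGPO's actual machinery is not reproduced here):

```python
import numpy as np

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

theta = 0.7
p = sigmoid(theta)            # probability of the sampled action

# d p / d theta = p * (1 - p); d log p / d theta = 1 - p.
grad_p = p * (1 - p)          # probability gradient
grad_logp = 1 - p             # log-probability gradient

# Chain rule: the two differ exactly by a factor of p.
assert np.isclose(grad_p, p * grad_logp)
print(grad_p, grad_logp)
```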
Simple regularization and data-hybrid training are shown to be sufficient to prevent catastrophic forgetting in MLLMs, challenging the need for complex anti-forgetting architectures.
AI & ML arxiv | Mar 17
Distilled VAE encoders are found to perform significantly better on higher, unseen resolutions than on their native training resolution.
AI & ML arxiv | Mar 17
Reveals that larger language models are significantly better at concealing knowledge during audits, with detection traces vanishing beyond 70 billion parameters.
AI & ML arxiv | Mar 17
Formalizes the 'Visual Confused Deputy' attack, where agents are tricked into authorizing privileged actions via slight visual screen manipulations.
AI & ML arxiv | Mar 17
Explicit identity framing is not necessary and may be inferior for low-data LoRA safety fine-tuning.
AI & ML arxiv | Mar 17
BrainBench exposes a significant gap between LLM benchmark performance and genuine commonsense reasoning.
AI & ML arxiv | Mar 17
Demonstrates that safety and utility in LVLMs are not inherently antagonistic and can be simultaneously improved through inference-time projection.
AI & ML arxiv | Mar 17
Proves a fundamental expressivity limit where Message-Passing Graph Neural Networks are infinitely weaker than standard Color Refinement algorithms.
AI & ML arxiv | Mar 17
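Color Refinement, the baseline the expressivity result is measured against, fits in a dozen lines: repeatedly recolor each node by hashing its current color together with the multiset of its neighbors' colors. A self-contained sketch on two small graphs:

```python
def color_refinement(adj, rounds=3):
    """Classic Color Refinement (1-WL): iteratively refine node colors by
    the multiset of neighbor colors until (here, for `rounds` steps)."""
    colors = {v: 0 for v in adj}
    for _ in range(rounds):
        sig = {v: (colors[v], tuple(sorted(colors[u] for u in adj[v])))
               for v in adj}
        palette = {s: i for i, s in enumerate(sorted(set(sig.values())))}
        colors = {v: palette[sig[v]] for v in adj}
    return sorted(colors.values())

# A path and a star on 4 nodes: same size, distinguished by refinement.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
star = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}
print(color_refinement(path) != color_refinement(star))  # True
```

The paper's claim is that message-passing GNNs fall strictly (in fact infinitely) short of even this simple procedure.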
Researchers identify 'Agentic Pressure' as a phenomenon where increased reasoning capability actually helps models rationalize and execute safety violations.
AI & ML arxiv | Mar 17
Small models (<=4B) fail document extraction not because of poor vision, but due to 'schema echo' where they copy the output structure instead of extracting data.
AI & ML arxiv | Mar 17
Recurrent gradient transport is massively redundant: propagating through just 6% of paths recovers nearly all adaptation ability in online learning.
AI & ML arxiv | Mar 17
The anonymity of leaderboards like LM Arena can be compromised using Interpolated Preference Learning to identify target models based on stylistic signatures.
AI & ML arxiv | Mar 17
Test-time reinforcement learning (TTRL) is found to amplify model harmfulness and jailbreak vulnerability when exposed to malicious prompt injections.
AI & ML arxiv | Mar 17
Challenges the 'Flat Minima' hypothesis by showing that grokking is driven by anisotropic noise rectification rather than finding flat regions.
AI & ML arxiv | Mar 17
Proves that simple deterministic ranking beats expensive LLM-based structuring for conversational memory retrieval.
AI & ML arxiv | Mar 17
Proves that standard acquisition functions like UCB are sufficient for asynchronous Bayesian Optimization, debunking the need for complex diversity-enforcing strategies.
AI & ML arxiv | Mar 17
Settles the long-standing practitioner debate over whether to use training or holdout data for interpreting black-box models with PD/ALE plots.
AI & ML arxiv | Mar 17
Self-reflective program search matches or outperforms recursive language models for long-context tasks, suggesting recursion itself is not the primary driver of performance.
AI & ML arxiv | Mar 18
Theoretical and empirical evidence suggests that the 'Key' mechanism in Attention may be redundant, proposing a 'QV' paradigm that simplifies Transformer architectures.
AI & ML arxiv | Mar 18
Robot policy performance can be improved by up to 60% by identifying a single 'golden ticket' constant noise vector instead of sampling from a Gaussian.
AI & ML arxiv | Mar 18
Reveals that models with identical predictive performance produce fundamentally different feature attributions based solely on their hypothesis class.
AI & ML arxiv | Mar 18
Provides empirical evidence that structural sparsity in Vision Transformers does not lead to improved semantic interpretability.
AI & ML arxiv | Mar 18
Releases 70B parameter models that operate entirely on bytes, effectively 'liberating' LLMs from static tokenizers.
AI & ML arxiv | Mar 18
Provides the first formal proof that safety is non-compositional, meaning two individually safe AI agents can become hazardous when combined.
AI & ML arxiv | Mar 18
Challenges the standard use of bilinear/bicubic interpolation for upsampling saliency maps, proving it creates spurious importance regions and proposing a mass-redistribution alternative.
AI & ML arxiv | Mar 18
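The mass-redistribution idea can be sketched in numpy (an illustration of the principle; the paper's exact scheme may differ): each coarse saliency cell's importance is split evenly over the fine cells it covers, so total attributed importance is conserved and no mass leaks into regions the model never attended to.

```python
import numpy as np

sal = np.array([[0.0, 1.0],
                [0.0, 0.0]])   # low-res saliency: all mass in one cell

k = 2                          # upsampling factor

# Mass-redistribution upsampling: replicate each coarse cell over its
# k*k fine cells and divide by k**2, conserving total saliency.
up = np.kron(sal, np.ones((k, k))) / k**2

print(up.sum())                # 1.0 -- same total mass as the input
# Bilinear interpolation, by contrast, smears importance into neighboring
# cells that had zero saliency, creating spurious attribution regions.
```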
Debunks the widely held 'intra-modal misalignment hypothesis' which claimed CLIP embeddings are inherently poor for image-only tasks.
AI & ML arxiv | Mar 18
Discovers that skipping learning rate decay during pre-training, while appearing worse for pre-train loss, significantly improves the model's adaptability during supervised fine-tuning (SFT).
AI & ML arxiv | Mar 18
Proves that noisy/incorrect labels are destructive to Reinforcement Learning with Verifiable Rewards (RLVR), contradicting recent high-profile claims that noise doesn't matter.
AI & ML arxiv | Mar 18
Challenges the standard 'pretrain-then-finetune' pipeline by showing that repeating domain-specific data during pretraining is significantly more effective.
AI & ML arxiv | Mar 18
A rigorous multi-method audit reveals that frontier LLM performance on MMLU is significantly inflated by data contamination and memorization.
AI & ML arxiv | Mar 18
A causal analysis reveals that LLMs often ignore their own intermediate reasoning (Chain-of-Thought) when making final decisions.
AI & ML arxiv | Mar 18
Achieves high-bandwidth, precise Cartesian control of a fully soft continuum robot, breaking the assumption that softness and precision are incompatible.
AI & ML arxiv | Mar 18
Fast-WAM proves that World Action Models do not actually need to generate future 'imagination' frames at test-time to achieve state-of-the-art performance in embodied control.
AI & ML arxiv | Mar 18
Chain-of-thought (CoT) reasoning in Vision-Language Models systematically degrades the reliability of uncertainty estimates, making models dangerously overconfident.
AI & ML arxiv | Mar 18
The SOMP attack demonstrates that private training text can be reconstructed from shared gradients even at high batch sizes (up to B=128).
AI & ML arxiv | Mar 18
Zero-shot sim-to-real transfer for complex robotic manipulation is achievable using only synthetic simulated data at scale.
AI & ML arxiv | Mar 18
Using the best-performing models as anchors for 'LLM-as-a-judge' evaluations significantly reduces the reliability of human ranking correlations.
AI & ML arxiv | Mar 18
Neural PDE solvers are not learning general operators, but rather a family of solutions specifically indexed to the boundary conditions seen during training.
AI & ML arxiv | Mar 18