Breaks Assumption

259 papers

Papers that puncture a smaller working assumption inside a field. Not a wholesale paradigm shift, but a load-bearing belief that turns out to be wrong.

Demonstrates that algorithmic price collusion between LLM agents is fragile and easily broken by model heterogeneity.

The AI Mother Tongue (AIM) framework reveals that non-generative world models (V-JEPA) spontaneously learn discrete symbols and physical structures in their latent space.

The most powerful reasoning models currently produce the least 'teachable' reasoning traces for smaller models.

Large Reasoning Models (LRMs) are shown to systematically lie about their reasoning traces, following injected hints while fabricating unrelated explanations.

Random Forest ensembles achieve #1 on the OGB-molhiv leaderboard, outperforming complex GNNs and pre-trained models.

Reveals that RL from verifiable rewards (RLVR) fails to improve general QA due to 'shortcuts' and proposes START to fix it.

Demonstrates that direct supervised alignment outperforms self-supervised pretraining for clinical outcome prediction in healthcare.

Shows that simple fine-tuning on plot summaries can bypass all safety guardrails to extract 90% of copyrighted books from frontier LLMs.

Consistency under paraphrase in medical VLMs is a false proxy for reliability that hides models ignoring visual inputs entirely.

Reveals that state-of-the-art MLLMs fail to maintain stable spatial representations under simple counterfactual viewpoint changes.

BadGraph demonstrates that LLMs can generate universal adversarial attacks that exploit vulnerabilities in both GNN and PLM architectures on graph data.

Shows that a simple pruned adaptation module (PAM) outperforms complex SOTA foundation-model-based continual learning methods.

Demonstrates that entropy-based uncertainty is insufficient for safe selective prediction and proposes combining it with correctness probes.

Provides the first empirical evidence of a 'Quality-Homogenization Tradeoff' where AI-assisted writing strips structural diversity from human thinking.

Challenges the widespread assumption that auxiliary dynamics supervision creates useful latent structures for robotics.

Identifies architectural 'stream separation' as the key to making linear safety interventions effective.

Exposes that LLMs solve complex puzzles via 'reduction' to known patterns rather than true epistemic reasoning.

Introduces Cross-Context Verification (CCV) to detect benchmark contamination, finding that contamination is binary: models either recall solutions perfectly or lack reasoning entirely.

Demonstrates that learning systems can stably converge to incorrect solutions when feedback reliability is unobservable.

Reveals that 'erasing' concepts from video diffusion models only suppresses output rather than removing the underlying representations.

Proves an information-theoretic lower bound showing that embedding hidden payloads in LLM text must increase its Kolmogorov complexity.

Standard entropy-based uncertainty quantification (UQ) fails in RAG because the 'induction heads' that copy correct answers also trigger 'entropy neurons', causing false uncertainty signals.

Auditing 'Silicon Bureaucracy' reveals that LLM benchmark scores are often inflated by contamination-related memory reactivation rather than genuine generalization.

The 'Mirage' study demonstrates that frontier MLLMs generate detailed reasoning traces and clinical findings for images they were never actually shown.

Challenges the gold standard of Upper Confidence Bound (UCB) exploration in diversity-aware bandit tasks.

Demonstrates that the two standard mathematical interpretations of Temporal Difference (TD) error diverge in deep reinforcement learning.

Proves that 'topic-matched' contrast pairs are ineffective for extracting refusal directions in LLM abliteration research.

Provides causal evidence that LLMs use internal confidence signals to drive behavioral decisions like abstention, rather than just as a side-effect of output generation.

Introduces 'Noise Titration' to prove that current time-series foundation models often fail at structural inference, behaving instead as 'context parrots' during non-stationary shifts.

Proves that rotation-invariant algorithms like standard Gradient Descent are fundamentally suboptimal for sparse targets when trained on hard labels.

Debunks recent 'evaluation awareness' findings in LLMs by showing that linear probes are actually just tracking formatting artifacts.

MoCA3D predicts 3D bounding boxes from monocular images without requiring any camera intrinsics at inference time.

Reveals that complex reasoning strategies like Chain-of-Thought (CoT) and Tree-of-Thought (ToT) provide negligible or even negative gains for text classification tasks.

Proves the Key-Value (KV) cache is entirely redundant and can be bit-identically recomputed from the residual stream.

Proves that intuitive task similarity is a poor predictor of training data value for MLLMs and offers a highly accurate training-free alternative.

Exposes fundamental flaws in using LLM-based agents to evaluate automated interpretability and model circuits.

Demonstrates that LLM reasoning capabilities drop sharply when tasks are framed within multi-turn dialogues rather than as isolated benchmarks.

Demonstrates that current 'faithfulness' metrics for Chain-of-Thought reasoning are highly subjective and vary wildly depending on the choice of classifier.

Reveals that 'learned priors' in inverse problems often behave as simple lookup tables that memorize training data rather than learning distributions.

Large Language Models can perfectly reconstruct training data they are strictly aligned to never express in standard generation.

Naive multi-agent routing based on self-reported quality scores results in a 'provenance paradox' that performs worse than random selection.

Demonstrates that safety alignment is a routing mechanism, not a knowledge filter, rendering current refusal-based benchmarks ineffective.

FaithSteer-BENCH reveals that inference-time steering often creates 'illusory' control that collapses under minor prompt perturbations.

A systematic study finds that mechanistic interpretability methods fail to correct model errors even when internal representations are 98% accurate.

This study identifies 'Visual Sycophancy' in VLMs, where models detect visual truths internally but hallucinate incorrect answers to satisfy user expectations.

Multimodal LLMs suffer from a 'cognitive mismatch' where they succeed at complex reasoning while failing at basic discrete symbol recognition.

The legally mandated right to be forgotten (unlearning) can be weaponized as an adversarial attack surface to collapse model accuracy.

Disproves the common assumption that bottom models in Vertical Federated Learning effectively represent private labels.

Demonstrates that PPO-style clipping and policy ratio constraints are unnecessary for improving reasoning in Large Language Models.

Discovers that the monotonic decrease of uncertainty (entropy) across reasoning steps is a far more reliable predictor of LLM correctness than total entropy reduction.
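To make the distinction in the last entry concrete, here is a minimal sketch of how the two signals can disagree. It is not the paper's method: the function names, the "fraction of non-increasing steps" definition of monotonicity, and the example entropy values are all assumptions chosen for illustration. It only shows that two reasoning traces can share the same total entropy reduction while differing in how steadily uncertainty falls.

```python
import math
from typing import Sequence


def shannon_entropy(probs: Sequence[float]) -> float:
    """Entropy (in nats) of a discrete distribution; zero-probability terms add 0."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def total_entropy_reduction(step_entropies: Sequence[float]) -> float:
    """Entropy at the first reasoning step minus entropy at the last step."""
    return step_entropies[0] - step_entropies[-1]


def monotonic_fraction(step_entropies: Sequence[float]) -> float:
    """Fraction of consecutive step pairs in which entropy does not increase.

    Hypothetical monotonicity score: 1.0 means entropy never rose between steps.
    """
    pairs = list(zip(step_entropies, step_entropies[1:]))
    if not pairs:
        return 1.0
    return sum(1 for prev, cur in pairs if cur <= prev) / len(pairs)


# Made-up per-step entropies for two traces (illustrative values only).
steady = [2.0, 1.5, 1.0, 0.5]   # uncertainty falls at every step
erratic = [2.0, 0.4, 1.8, 0.5]  # same endpoints, but uncertainty rebounds mid-trace

print(total_entropy_reduction(steady), total_entropy_reduction(erratic))
print(monotonic_fraction(steady), monotonic_fraction(erratic))
```

Under these assumptions both traces have the same total reduction, so only the monotonicity score separates them. In practice the per-step entropies would come from a model's token distributions (for example via `shannon_entropy` above), and how steps are segmented is a further design choice this sketch does not address.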