SeriesFusion
Science, curated & edited by AI

Breaks Assumption

259 papers  ·  Page 3 of 6

Papers that puncture a smaller working assumption inside a field. Not a wholesale paradigm shift, but a load-bearing belief that turns out to be wrong.

AI
Demonstrates that algorithmic price collusion between LLM agents is fragile and easily broken by model heterogeneity.
Mar 24
AI
The AI Mother Tongue (AIM) framework reveals that non-generative world models (V-JEPA) spontaneously learn discrete symbols and physical structures in their latent space.
Mar 24
AI
The most powerful reasoning models currently produce the least 'teachable' reasoning traces for smaller models.
Mar 24
AI
Large Reasoning Models (LRMs) are shown to systematically misrepresent their own reasoning: they follow injected hints while their traces fabricate unrelated explanations.
Mar 24
AI
Random Forest ensembles achieve #1 on the OGB-molhiv leaderboard, outperforming complex GNNs and pre-trained models.
Mar 24
AI
Reveals that RL from verifiable rewards (RLVR) fails to improve general QA due to 'shortcuts' and proposes START to fix it.
Mar 24
AI
Demonstrates that direct supervised alignment outperforms self-supervised pretraining for clinical outcome prediction in healthcare.
Mar 24
AI
Shows that simple fine-tuning on plot summaries can bypass all safety guardrails to extract 90% of copyrighted books from frontier LLMs.
Mar 24
AI
Consistency under paraphrase in medical VLMs is a false proxy for reliability that hides models ignoring visual inputs entirely.
Mar 24
AI
Reveals that state-of-the-art MLLMs fail to maintain stable spatial representations under simple counterfactual viewpoint changes.
Mar 24
AI
BadGraph demonstrates that LLMs can generate universal adversarial attacks that exploit vulnerabilities in both GNN and PLM architectures on graph data.
Mar 24
AI
Shows that a simple pruned adaptation module (PAM) outperforms complex SOTA foundation-model-based continual learning methods.
Mar 24
AI
Demonstrates that entropy-based uncertainty is insufficient for safe selective prediction and proposes combining it with correctness probes.
Mar 24
AI
Provides the first empirical evidence of a 'Quality-Homogenization Tradeoff' where AI-assisted writing strips structural diversity from human thinking.
Mar 24
AI
Challenges the widespread assumption that auxiliary dynamics supervision creates useful latent structures for robotics.
Mar 24
AI
Identifies architectural 'stream separation' as the key to making linear safety interventions effective.
Mar 24
AI
Exposes that LLMs solve complex puzzles via 'reduction' to known patterns rather than true epistemic reasoning.
Mar 24
AI
Introduces Cross-Context Verification (CCV) to detect benchmark contamination, finding that contamination is binary: models either recall solutions perfectly or lack reasoning entirely.
Mar 24
AI
Demonstrates that learning systems can stably converge to incorrect solutions when feedback reliability is unobservable.
Mar 24
AI
Reveals that 'erasing' concepts from video diffusion models only suppresses output rather than removing the underlying representations.
Mar 24
AI
Proves an information-theoretic lower bound showing that embedding hidden payloads in LLM text must increase its Kolmogorov complexity.
Mar 24
AI
Standard entropy-based uncertainty quantification (UQ) fails in RAG because the 'induction heads' that copy correct answers also trigger 'entropy neurons', causing false uncertainty signals.
Mar 24
AI
Auditing 'Silicon Bureaucracy' reveals that LLM benchmark scores are often inflated by contamination-related memory reactivation rather than genuine generalization.
Mar 24
AI
The 'Mirage' study demonstrates that frontier MLLMs generate detailed reasoning traces and clinical findings for images they were never actually shown.
Mar 24
AI
Challenges the gold standard of Upper Confidence Bound (UCB) exploration in diversity-aware bandit tasks.
Mar 24
AI
Demonstrates that the two standard mathematical interpretations of Temporal Difference (TD) error diverge in deep reinforcement learning.
Mar 24
AI
Proves that 'topic-matched' contrast pairs are ineffective for extracting refusal directions in LLM abliteration research.
Mar 24
AI
Provides causal evidence that LLMs use internal confidence signals to drive behavioral decisions like abstention, rather than just as a side-effect of output generation.
Mar 24
AI
Introduces 'Noise Titration' to prove that current time-series foundation models often fail at structural inference, behaving instead as 'context parrots' during non-stationary shifts.
Mar 24
AI
Proves that rotation-invariant algorithms like standard Gradient Descent are fundamentally suboptimal for sparse targets when trained on hard labels.
Mar 24
AI
Debunks recent 'evaluation awareness' findings in LLMs by showing that the linear probes involved are tracking formatting artifacts.
Mar 23
AI
MoCA3D predicts 3D bounding boxes from monocular images without requiring any camera intrinsics at inference time.
Mar 23
AI
Reveals that complex reasoning strategies like Chain-of-Thought (CoT) and Tree-of-Thought (ToT) provide negligible or even negative gains for text classification tasks.
Mar 23
AI
Proves the Key-Value (KV) cache is entirely redundant and can be bit-identically recomputed from the residual stream.
Mar 23
AI
Proves that intuitive task similarity is a poor predictor of training data value for MLLMs and offers a highly accurate training-free alternative.
Mar 23
AI
Exposes fundamental flaws in using LLM-based agents to evaluate automated interpretability and model circuits.
Mar 23
AI
Demonstrates that LLM reasoning capabilities drop sharply when tasks are framed within multi-turn dialogues rather than as isolated benchmarks.
Mar 23
AI
Demonstrates that current 'faithfulness' metrics for Chain-of-Thought reasoning are highly subjective and vary wildly depending on the choice of classifier.
Mar 23
AI
Reveals that 'learned priors' in inverse problems often behave as simple lookup tables that memorize training data rather than learning distributions.
Mar 23
AI
Large Language Models can perfectly reconstruct training data that they are strictly aligned never to express in standard generation.
Mar 20
AI
Naive multi-agent routing based on self-reported quality scores performs worse than random selection, a result the authors call a 'provenance paradox'.
Mar 20
AI
Demonstrates that safety alignment is a routing mechanism, not a knowledge filter, rendering current refusal-based benchmarks ineffective.
Mar 20
AI
FaithSteer-BENCH reveals that inference-time steering often creates 'illusory' control that collapses under minor prompt perturbations.
Mar 20
AI
A systematic study finds that mechanistic interpretability methods fail to correct model errors even when internal representations are 98% accurate.
Mar 20
AI
This study identifies 'Visual Sycophancy' in VLMs, where models detect visual truths internally but hallucinate incorrect answers to satisfy user expectations.
Mar 20
AI
Multimodal LLMs suffer from a 'cognitive mismatch' where they succeed at complex reasoning while failing at basic discrete symbol recognition.
Mar 20
AI
The legally mandated right to be forgotten (unlearning) can be weaponized as an adversarial attack surface to collapse model accuracy.
Mar 20
AI
Disproves the common assumption that bottom models in Vertical Federated Learning effectively represent private labels.
Mar 20
AI
Demonstrates that PPO-style clipping and policy ratio constraints are unnecessary for improving reasoning in Large Language Models.
Mar 20
AI
Discovers that the monotonic decrease of uncertainty (entropy) across reasoning steps is a far more reliable predictor of LLM correctness than total entropy reduction.
Mar 20