SeriesFusion
Science, curated & edited by AI

AI & Machine Learning

2,371 papers  ·  Page 17 of 48

Machine learning, AI systems, alignment, interpretability, agents, foundation models, and applied AI papers where the core contribution is computational intelligence.

Nature Is Weird
Scientists found the specific "ego" circuit in an AI's brain that makes it lie to your face with total confidence.
Apr 3
Paradigm Challenge
The "junk" parts of an AI’s brain we’ve been ignoring are actually where all the most important stuff is hidden.
Apr 3
Paradigm Challenge
The most popular way to hack someone these days leaves absolutely zero evidence behind for the police to find.
Apr 3
Nature Is Weird
An AI that’s only ever seen pictures and text can now mix perfumes better than the pros, even though it literally can't smell.
Apr 3
Nature Is Weird
Trying to fix AI bias with better instructions is like putting a band-aid on a broken bone: the surface looks patched while the deep, nasty stuff gets even worse.
Apr 3
Collision
Nobody taught AI how to read your mind, but it learned how to do it anyway just to be a more helpful teacher.
Apr 3
Paradigm Challenge
Asking an AI to "show its work" can actually make it dumber if it picks up a sloppy or repetitive way of thinking.
Apr 3
Paradigm Challenge
You don’t even need a hacker to leak your data; your AI assistant might just blab your secrets to another user during a regular chat.
Apr 3
Nature Is Weird
You can "hear" the shape of a simple network, but as soon as you tell the data which way to flow, the shape becomes invisible.
Apr 3
Nature Is Weird
If you want an AI to be great at solving one problem, force it to solve five different ones at the same time.
Apr 3
Nature Is Weird
We trust AI to act like human brains, but it turns out they're completely blind to the textures we see every day.
Apr 3
Nature Is Weird
You can train two AIs using completely opposite methods, but they somehow end up building the exact same "brain" inside.
Apr 3
Practical Magic
AI is officially better at spotting security holes in software than the actual human experts who get paid to find them.
Apr 3
Nature Is Weird
Massive AIs aren't actually geniuses at everything; they’re just a giant pile of tiny specialists that each know one specific thing.
Apr 3
Paradigm Challenge
If you change just one tiny ingredient in an AI’s training, you can break the whole thing without a single warning light going off.
Apr 3
Practical Magic
Forget weighing yourself every morning—recording a quick voice memo could be way better at spotting a heart failure flare-up before it happens.
Apr 2
Practical Magic
Imagine headphones that let you 'mute' a crying baby or a leaf blower while keeping the rest of the world sounding perfectly clear.
Apr 2
Paradigm Challenge
If you mash two 'safe' AI models together, you can accidentally create a dangerous one—turns out you can hide a trap by splitting it across separate files.
Apr 2
Nature Is Weird
A top AI coding tool leaked its own secret source code because the developers got lazy and just trusted the code the AI wrote for its own setup.
Apr 2
Paradigm Challenge
We found a way to send data faster than the 'speed limit' of physics that everyone thought was impossible to break.
Apr 2
Paradigm Challenge
The math formula the World Bank has used for 40 years to measure global poverty has been proven logically inconsistent.
Apr 2
Practical Magic
We found a way to run stats in 'superposition,' so a computer can check every possible version of a dataset at the same time.
Apr 2
Efficiency Breakthrough
Recovers short-text performance in context-extended LLMs using 60x less data than current state-of-the-art distillation methods.
Apr 2
Paradigm Shift
First foundation model to unify text, image, audio, and video using native masked diffusion instead of autoregressive serialization.
Apr 2
Breaks Assumption
Discovers that post-training reasoning models mask rather than delete safety mechanisms, allowing their restoration with lightweight adapters.
Apr 2
Efficiency Breakthrough
Introduces entropy-guided adaptive decoding that gives small models reasoning performance comparable to frontier models at a fraction of the cost.
Apr 2
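The entry above names entropy-guided adaptive decoding but not its mechanism. As a minimal sketch of the general idea only (the paper's actual algorithm is not described here, and the function names, thresholds, and temperature values below are all illustrative assumptions): measure the entropy of the next-token distribution and spend more sampling "effort" only where the model is uncertain.

```python
import math

def entropy(probs):
    """Shannon entropy (in nats) of a next-token probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def adaptive_temperature(probs, low=0.5, high=1.2, threshold=1.0):
    """Pick a sampling temperature from distribution entropy:
    confident (low-entropy) steps decode near-greedily, while
    uncertain (high-entropy) steps explore more alternatives."""
    return low if entropy(probs) < threshold else high

# A peaked distribution decodes conservatively; a flat one explores.
confident = [0.97, 0.01, 0.01, 0.01]   # entropy ~0.17 nats
uncertain = [0.25, 0.25, 0.25, 0.25]   # entropy = ln(4) ~1.39 nats
```

The appeal of this family of methods is that the entropy check is nearly free relative to a forward pass, so a small model can reallocate compute to hard decoding steps without any retraining.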
Breaks Assumption
Proves that 'inverse scaling' on many benchmarks is a prompt-dependent artifact caused by verbosity, which can be reversed by forcing brevity.
Apr 2
New Capability
Enables reinforcement learning for long-horizon robots across diverse tasks without requiring manual reward engineering.
Apr 2
Efficiency Breakthrough
Proposes a 'no-backprop' stochastic process memory for edge agents that solves the retention-forgetting tradeoff with fixed compute.
Apr 2
Breaks Assumption
Proves mathematically and shows empirically that classifier-based safety gates are fundamentally incapable of monitoring self-improving AI.
Apr 2
New Capability
First generative model capable of synthesizing physically consistent 'raw' camera sensor data from text prompts or sRGB images.
Apr 2
New Capability
A production-ready adaptive router for LLM portfolios that manages cost-quality trade-offs in real time under strict dollar budgets.
Apr 2
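The core routing problem in the entry above can be sketched in a few lines. This is not the paper's system; it is a toy illustration under stated assumptions, and every name here (`Model`, `route`, the prices and quality scores) is hypothetical: among models that fit the remaining budget, pick the best one that clears a quality floor, falling back to the best affordable model otherwise.

```python
from dataclasses import dataclass

@dataclass
class Model:
    name: str
    cost_per_call: float   # dollars per request (illustrative numbers)
    quality: float         # expected quality score in [0, 1]

def route(models, remaining_budget, min_quality=0.0):
    """Pick the highest-quality model that fits the remaining budget,
    preferring those above the quality floor; return None if nothing
    is affordable at all."""
    affordable = [m for m in models if m.cost_per_call <= remaining_budget]
    if not affordable:
        return None
    good = [m for m in affordable if m.quality >= min_quality]
    pool = good or affordable          # fall back if the floor is unreachable
    return max(pool, key=lambda m: m.quality)

portfolio = [
    Model("small", 0.001, 0.60),
    Model("medium", 0.01, 0.80),
    Model("frontier", 0.10, 0.95),
]
```

A real router would also estimate per-query difficulty and amortize the budget over a request stream, but the greedy affordability filter above is the shape of the constraint.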
Breaks Assumption
Masked Image Modeling (MIM) representations are fundamentally polluted with non-semantic noise, which can be fixed with a zero-cost post-hoc linear projection.
Apr 2
Breaks Assumption
Standard alignment metrics like CKA and RSA systematically fail when comparing networks in superposition, often leading to false conclusions about model similarity.
Apr 2
Scaling Insight
Neural collapse is triggered by a predictable 'feature-norm threshold' (fn*) that is invariant to training conditions, serving as a new diagnostic for training progress.
Apr 2
Efficiency Breakthrough
MAC-Attention achieves 14x attention-phase speedups and reduces KV cache accesses by 99% for long-context LLMs by reusing computation from semantically similar queries.
Apr 2
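The MAC-Attention entry above hinges on one idea: if a new query is semantically close to one already answered, skip the KV-cache scan and reuse the cached output. The sketch below is an illustrative reduction of that idea, not the paper's method; the class name, threshold, and cosine test are all assumptions made for the example.

```python
import math

def cosine(a, b):
    """Cosine similarity between two dense vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

class AttentionReuseCache:
    """Reuse a previously computed attention output when a new query
    is semantically close to a cached one, avoiding a fresh pass over
    the KV cache."""
    def __init__(self, threshold=0.95):
        self.threshold = threshold
        self.entries = []   # list of (query_vector, cached_output)
        self.hits = 0

    def lookup(self, query, compute):
        for cached_q, out in self.entries:
            if cosine(query, cached_q) >= self.threshold:
                self.hits += 1
                return out              # reuse: no KV-cache access
        out = compute(query)            # miss: do the real attention work
        self.entries.append((query, out))
        return out
```

The reported 99% reduction in KV-cache accesses would correspond to a very high hit rate in a cache of this shape, which is plausible when consecutive decoding queries drift slowly in embedding space.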
Efficiency Breakthrough
A modified 110M parameter ColBERT model can identify fine-grained evidence spans as accurately as a 27B parameter LLM, but at a fraction of the cost.
Apr 2
Paradigm Shift
LLM-guided program evolution has discovered a new data-shuffling rule for SGD that provably and empirically outperforms standard Random Reshuffling.
Apr 2
Breaks Assumption
Self-reflective prompting (self-correction) fails to improve accuracy in safety-critical medical QA, frequently introducing new errors rather than fixing old ones.
Apr 2
Breaks Assumption
The 'modality gap' in Vision-Language Models is composed of two distinct geometric components, and the commonly used 'raw gap' is a misleading metric for cross-modal quality.
Apr 2
New Capability
High-quality oversight of massive proprietary LLM agents can be achieved by small, open-source 'critics' that intervene in real time within the same interaction.
Apr 2
New Capability
Reduces multimodal jailbreak success rates by 97% using a simple conditional decoding strategy without task-specific fine-tuning.
Apr 2
Paradigm Shift
A comprehensive analysis of AI safety methods and vulnerabilities, spanning automated circuit discovery, latent adversarial training, and power-law scaling of jailbreak success.
Apr 2
Efficiency Breakthrough
A lightweight framework for triaging agentic trajectories post-deployment without the cost of human review or auxiliary LLM calls.
Apr 2
Open Release
Independently reproduces OpenAI's gpt-oss-20b scores by reverse-engineering undisclosed tool-calling formats and agent harnesses.
Apr 2
New Capability
Reconstructs authentic LiDAR point clouds under jamming attacks with a 92% success rate by exploiting raw full-waveform representations.
Apr 2
Paradigm Shift
Identifies a fundamental quality-exploration dilemma in Diffusion Language Models where remasking improves single-sample quality but kills reasoning diversity.
Apr 2
Scaling Insight
Gradient-based data valuation (TracIn) outperforms all human-crafted metadata heuristics for ordering curriculum learning in motion planners.
Apr 2
Paradigm Shift
Introduces training-free and model-free trajectory planning by computing diffusion score functions directly from data libraries via kernel-weighted estimation.
Apr 2
Breaks Assumption
Deep generative networks consistently assign higher density to simpler images, regardless of training data or architecture complexity.
Apr 2