
The Week the Threshold Moved
This week, a language model improved its own performance on competitive coding problems by thirteen percentage points. It had no teacher, no verifier, no reward signal. It sampled its own outputs, fine-tuned on them, and emerged measurably more capable than before. Elsewhere, a graduate mathematics textbook was translated into machine-verifiable proof. The Artemis II crew crossed the halfway point to the Moon carrying modified iPhones. A man published plans to disassemble a planet. And a temporal model began simulating how human cells age across a lifetime. These are not separate events. They are coordinates on the same curve.
A language model taught itself to code better by studying its own output. On April 1, researchers at Apple published "Embarrassingly Simple Self-Distillation Improves Code Generation," a paper whose title is precise: the method is embarrassingly simple. Take Qwen3-30B-Instruct, which scores 42.4% on LiveCodeBench v6. Have it sample its own solutions. Fine-tune on those samples using standard supervised learning. No verifier. No teacher model. No reward signal. No reinforcement learning. No execution environment. The result: 55.3%. A gain of 12.9 percentage points. The gains were not uniform. Easy problems improved by 6.5 points. Medium problems by 14.2. Hard problems by 15.3. The model improved most where improvement mattered most, and it learned from no one but itself. Ruixiang Zhang and colleagues traced the mechanism to what they call the precision-exploration conflict: self-distillation suppresses distractor tokens at syntactic checkpoints while preserving diversity at creative branching points. The model does not become more cautious. It becomes more precise where precision matters and more exploratory where exploration matters. The method works across Qwen and Llama architectures at scales from 4 billion to 30 billion parameters.
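The recipe reduces to a loop simple enough to sketch. The `ToyModel`, `generate`, and `sft` names below are illustrative stand-ins, not an API from the paper; the point is the shape of the loop: sample, then plain supervised fine-tuning, with nothing else in between.

```python
class ToyModel:
    """Stand-in for a code LLM. A real run would pair an
    inference server with a standard SFT trainer."""

    def generate(self, problem):
        # Illustrative: a real model samples a candidate solution here.
        return f"solution-for-{problem}"

    def sft(self, dataset):
        # Illustrative: ordinary cross-entropy fine-tuning on
        # (problem, sampled solution) pairs. No verifier, no
        # execution environment, no reward model touches the data.
        self.last_dataset_size = len(dataset)


def self_distillation_round(model, problems, n_samples=4):
    """One round of the 'embarrassingly simple' recipe: the model
    samples its own solutions, then fine-tunes on them with
    standard supervised learning."""
    dataset = [(p, model.generate(p))
               for p in problems
               for _ in range(n_samples)]
    model.sft(dataset)
    return model
```

Everything that usually makes self-improvement pipelines complicated — the verifier, the reward signal, the sandbox — is conspicuously absent from the loop, which is the paper's whole point.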
Mathematics is becoming a compute job. Researchers at Meta FAIR — Gloeckle, Rammal, Arnal, Munos, Cabannes, Synnaeve, and Hayat — built RepoProver, a multi-agent system that translated Darij Grinberg's graduate textbook "Algebraic Combinatorics" into Lean, the formal proof language where compilation equals correctness. The architecture is a division of labor: sketcher agents translate definitions and theorem statements, prover agents fill in the proofs, maintainer agents handle infrastructure, reviewer agents enforce quality through pull requests on a shared git repository with a merge queue that ensures the main branch always builds. What was once the work of a mathematician's career — formalizing a single textbook — has become a parallelizable pipeline. The proofs do not merely look correct. They compile. In Lean, that is the same thing.
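The merge-queue guarantee rests on a property worth seeing concretely: in Lean, a proof that elaborates is a checked proof. A toy example (not from the RepoProver repository or the Grinberg formalization):

```lean
-- If this file compiles, the statement is proved. There is no
-- separate review step in which the proof could "look correct"
-- but be wrong; the type checker is the referee.
theorem toy_comm (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b
```

This is why "the main branch always builds" is a mathematical claim and not just an engineering one: a green build means every theorem merged so far is verified.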
The Artemis II crew is now closer to the Moon than to Earth, and they are carrying telephones. As of April 4, the Orion spacecraft was more than 160,000 miles from home, traveling at 2,540 miles per hour toward a lunar flyby scheduled for Monday at 2:45 PM EDT. Commander Reid Wiseman captured a photograph NASA titled "Hello, World" — the full Earth with its night side illuminated by moonlight, twin aurorae at the poles, zodiacal light in the frame — using a Nikon D5 at ISO 51,200. But each of the four crew members also carries a modified iPhone 17 Pro Max with internet and Bluetooth disabled, secured by Velcro to the capsule walls. It is the first time NASA has qualified an iPhone for use beyond low Earth orbit. Pilot Victor Glover is the first person of color to travel beyond low Earth orbit. Jeremy Hansen is the first Canadian in lunar space. Christina Koch is the first woman on a lunar mission. On Monday they will pass within 4,066 miles of the surface and exceed Apollo 13's distance record by 4,102 miles. The device in their flight suit pockets is a descendant of the computers that filled rooms when the Apollo program began. It now fits in a pocket and it is going farther.
Roko Mijic is best known for a thought experiment about superintelligent AI that was considered disturbing enough to be censored from the rationalist forum where it was posted. This week he published something more tangible. MercurialDyson, a GitHub repository created on April 3, contains what he describes as "vibe coded physical and engineering analysis using various LLM-based AIs, with the author acting as guide and sanity check." The subject is the disassembly of planet Mercury to construct a Dyson Swarm — not a solid sphere enclosing the Sun, but a constellation of orbiting solar collectors that collectively capture most of its energy output. Mercury is the logical candidate: metal-rich, low gravity, close to the Sun, minimal transit distance for the finished components. The proposed method: land a seed factory of a few hundred tons, deploy self-replicating von Neumann machines with a doubling time of one to two years, and let exponential growth do the rest. Two units after the first year. A billion after thirty. Complete planetary disassembly in forty to sixty years. The physics holds. The engineering does not yet exist. But the calculations have been done, and they suggest that a single planet could power a civilization for as long as its star burns.
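The growth arithmetic behind those figures is a one-liner. The sketch below assumes the optimistic end of the stated doubling time (one year) and ignores every engineering constraint, which is exactly the spirit of the original estimate.

```python
def replicator_count(years, doubling_time_years=1.0):
    """Population of self-replicating units after `years`,
    starting from a single seed factory, under idealized
    exponential growth (every unit doubles on schedule)."""
    return 2 ** (years / doubling_time_years)


# One seed doubles to two units after the first year,
print(replicator_count(1))   # 2.0
# and thirty doublings put the count just past a billion.
print(replicator_count(30))  # 1073741824.0
```

At the pessimistic end (a two-year doubling time), the billion-unit mark arrives at year sixty instead of year thirty, which is where the forty-to-sixty-year disassembly window comes from.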
Christina Theodoris has spent years teaching machines to read genomes the way language models read text. As an Assistant Investigator at the Gladstone Institutes and Assistant Professor at UC San Francisco, she created Geneformer, a foundation model for single-cell biology. On April 1, she and her colleagues published MaxToki, a temporal model trained on nearly a trillion gene tokens. The distinction matters. Current models consider one cell state at a time — a snapshot. MaxToki models trajectories: how cells change across the human lifespan, not as isolated frames but as unfolding sequences. It learned to predict the drivers of aging. It generalized to trajectories it had never seen through in-context learning. It identified novel age-modulating targets that were then verified experimentally in living organisms. The model does not merely describe what a cell looks like at seventy. It traces the path from thirty to seventy and identifies where that path could be redirected. For the first time, the aging process has a map that a machine can read.
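The snapshot-versus-trajectory distinction can be made concrete. Geneformer-style models rank-encode one cell's expression profile into a token sequence; a temporal model instead consumes a sequence of such snapshots taken over time. The encoding below is deliberately simplified, the gene names are illustrative, and the `<t>` separator is invented for this sketch; none of it is MaxToki's actual tokenization.

```python
def rank_encode(expression):
    """One snapshot: gene tokens ordered by expression level,
    highest first (a simplified rank-value encoding in the
    Geneformer style)."""
    return [gene for gene, _ in
            sorted(expression.items(), key=lambda kv: -kv[1])]


def trajectory_tokens(snapshots):
    """A trajectory: the same cell sampled at successive ages,
    flattened into one sequence a temporal model can attend
    across. '<t>' is an invented separator, for illustration."""
    seq = []
    for cell in snapshots:
        seq += rank_encode(cell) + ["<t>"]
    return seq
```

The difference in input is the difference in capability: a snapshot model can say what a cell looks like; only a model that sees the whole sequence can learn which early tokens predict the late ones.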