@fitzroy4910's Timeline: Top 10 Tweets — Monday, May 18, 2026

The 10 most-viewed and most-engaged tweets from your X/Twitter timeline over the past 24 hours, ranked by reach and engagement.

#1 — Ethan Mollick skewers the "transformer paper just dropped" genre

@emollick · 2,134 likes · 176 retweets · 290,692 views

Wharton professor Ethan Mollick posted a deadpan parody of the breathless "BREAKING: paper just dropped 🚨" format — applied to the 2017 Transformer paper. The tweet opens with: "Sorry, after seeing so many of these, could not resist: 🚨 BREAKING: Google just dropped a NEW paper that completely deletes RNNs from existence." It then walks through "Attention Is All You Need" in the voice of a hype account, complete with "RNNs are cooked 💀." 1

Why it traveled: Mollick's 354K followers include many ML practitioners who've seen the "destroyer of benchmarks" format copied to death. The joke lands precisely because it's accurate, and because it comes from someone who studies AI diffusion rather than just hypes it.

#2 — Anthropic's Natural Language Abstractions translate Claude's "inner thoughts"

@paramiao (aggregating AI news) · 550 views, threading multiple sources

A widely shared digest highlighted Anthropic's release of Natural Language Abstractions (NLAs) research: a technique for converting neural activations inside Claude into human-readable English descriptions. The work is positioned as a step toward cracking the AI interpretability problem. 2

Why it mattered in NLP circles: The NLP community has long argued that interpretability is inseparable from alignment. NLAs offer a possible bridge: if the model's "thoughts" can be read in natural language, you can audit them. The obvious counter-question — whether these translations introduce their own hallucinations — generated a second wave of replies.

#3 — Anthropic gains 300 MW of Colossus compute from SpaceX

@paramiao (AI daily digest) · thread with 550 views

The same digest flagged the Anthropic-SpaceX compute deal: Anthropic secured access to the entire 300-megawatt Colossus 1 data center (roughly 220,000 NVIDIA GPUs). Direct effect: Claude Code rate limits doubled, Pro/Max peak-hour caps lifted. 2

The odd angle: Colossus, the merged SpaceX and xAI infrastructure, is now rented out to Anthropic, a direct Musk competitor. As one commenter put it, Musk is renting out xAI's old flagship to a rival. The coopetition framing became its own thread.

#4 — Tuna-2: encoder-free multimodal model beats encoder-based counterpart at scale

@drawais_ai · 3 bookmarks · 99 views

A concise breakdown of the Tuna-2 paper from Luke Zettlemoyer's group at Meta AI (with HKU and Waterloo). Tuna-2 removes both the VAE and the vision encoder (previously SigLIP-2 So400M), passing raw pixels directly through a patchify layer into a transformer decoder trained jointly for both understanding and generation. The result: state-of-the-art among 7B unified multimodal models on GQA, MMVet, and MMMU. On generation benchmarks like GenEval, it matches the 12B-parameter FLUX.1-dev. 3
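For readers unfamiliar with the term, a patchify step can be sketched as follows. This is the generic ViT-style operation (cut the image into non-overlapping squares and flatten each one), written with NumPy; it is an assumption about what "patchify" means here, not Tuna-2's actual layer, and the function name and sizes are illustrative.

```python
import numpy as np


def patchify(image: np.ndarray, patch_size: int) -> np.ndarray:
    """Split an (H, W, C) image into flattened, non-overlapping patches.

    Returns an array of shape (num_patches, patch_size * patch_size * C).
    """
    height, width, channels = image.shape
    if height % patch_size or width % patch_size:
        raise ValueError("image sides must be divisible by patch_size")
    rows, cols = height // patch_size, width // patch_size
    # Expose the patch grid as separate axes, then bring the grid axes to the front.
    patches = image.reshape(rows, patch_size, cols, patch_size, channels)
    patches = patches.transpose(0, 2, 1, 3, 4)
    return patches.reshape(rows * cols, patch_size * patch_size * channels)


# Hypothetical 8x8 RGB image, cut into 4x4 patches.
image = np.arange(8 * 8 * 3, dtype=np.float32).reshape(8, 8, 3)
tokens = patchify(image, patch_size=4)
print(tokens.shape)  # (4, 48)
```

In a real model each row would typically pass through a learned linear projection before entering the transformer; that part is omitted here.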

The NLP/vision implication: Pretrained vision encoders have been the unquestioned backbone of every VLM since CLIP. Tuna-2 shows that at scale, removing the encoder improves fine-grained visual perception. As the tweet notes: "The same lesson the GPT scaling-laws paper taught NLP: at scale, monolithic end-to-end beats modular pipelines. Vision is finally getting there."

#5 — Hybrid models beat both transformers and linear RNNs on formal expressivity — with 49% fewer training tokens

@eigentopology (co-founder, Thesis Labs, prev. Google/NVIDIA/Stanford) · 125 views

A sharp summary of an Ai2 paper showing that hybrid architectures (interleaved attention + recurrence) can formally express tasks that neither transformers nor linear RNNs can solve alone. The paper fits Chinchilla-style scaling laws across six model sizes and finds the hybrid model extracts more signal per training token, reaching the same MMLU accuracy as OLMo 3 with 49% fewer tokens. 4

Practical upshot: If you're training a new model from scratch, hybrid architectures may offer a meaningful data-efficiency advantage. The theoretical explanation is clean: more expressive models learn a larger fraction of the discrete subtasks baked into pretraining corpora.

#6 — Cancer AI trial: atomic fact-checking takes physician trust from 27% to 67%

@socialwithaayan (68K followers) · 25 likes · 9 retweets · 3,964 views · 13 bookmarks

A Technical University of Munich RCT with 356 oncologists tested five transparency methods for AI treatment recommendations across seven cancer types. Traditional explainability (explanations + citations) raised trust with effect sizes of 0.25–0.50. The fifth approach, "atomic fact-checking," which breaks every AI claim into a sentence linked directly to its source passage, hit a 0.94 effect size, shifting trust from 26.9% to 66.5%. The number needed to treat: 2.53 doctors. 5
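The number needed to treat follows directly from the two trust rates quoted above: it is the reciprocal of the absolute difference between them. A minimal sketch of that arithmetic (the function name is illustrative and not taken from the study):

```python
def number_needed_to_treat(baseline_rate: float, treated_rate: float) -> float:
    """Reciprocal of the absolute rate difference between two groups."""
    absolute_difference = treated_rate - baseline_rate
    if absolute_difference <= 0:
        raise ValueError("treated_rate must exceed baseline_rate")
    return 1.0 / absolute_difference


# Trust rates as reported in the tweet: 26.9% before, 66.5% with atomic fact-checking.
nnt = number_needed_to_treat(0.269, 0.665)
print(f"{nnt:.2f}")  # prints 2.53
```

Read loosely, about one additional oncologist trusts the recommendation for every 2.53 who see the atomic fact-checked version.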

The distinction that generated most debate: Traditional explainability asks you to trust the AI's reasoning. Atomic fact-checking asks you to verify the AI's claims. One delegates judgment; the other restores it. Replies drew connections to RLHF debates about where human oversight should actually sit.

#7 — OpenAI Codex Chrome extension: browser control comes to agents

@paramiao (AI daily digest) · 550 views

OpenAI released a Chrome extension for Codex enabling autonomous browser control on macOS and Windows — including parallel execution across background tabs without occupying the user's active session. 2

Community reaction: Replies focused on three things: what this means for existing browser-control products, whether Chromium-based browsers beyond Chrome are supported (apparently yes), and the fact that the extension currently requires an official subscription rather than a third-party API key.

#8 — Subquadratic launches SubQ with a 12M-token context window — community is skeptical

@Sherry83044277 (Ex-Meta, solo founder) · 125 views

A detailed analysis of Subquadratic's $29M seed launch at a $500M valuation. The company claims SubQ is the first frontier LLM built on fully sub-quadratic sparse attention: 52× faster than FlashAttention at 1M tokens, ~1,000× less attention compute at 12M tokens, and $1.50/M tokens against ~$15 for competitors. The author calls out the gap between the marketing (12M context) and the production model (SubQ 1M-Preview), notes MRCR v2 at 1M tokens sits below Opus 4.6 and GPT-5.5, and flags that no 12M-token benchmark has been published. 6

The credibility test the community is running: open weights + public paper + reproducible benchmarks vs. early-access gating and a 50M-token announcement made before the current claims have been independently verified. This tension is older than Subquadratic; the analysis is careful enough to be worth reading in full.

#9 — April model roundup: DeepSeek-V4 open-sources, Chinese labs release across the board

@ZhihuFrontier (4,891 followers) · 20 likes · 6 bookmarks · 1 retweet

A comprehensive April open-source model recap from Zhihu contributor 刘聪 NLP, aggregated by Zhihu Frontier: GLM5.1 (744B total / 40B activated), Kimi K2.6 (1T total / 32B activated), Qwen3.6-27B and 35B-A3B, Tencent HY-3.0-preview (295B), and the long-awaited DeepSeek-V4 all went open-source in April. GPT-Image-2 redefined visual authenticity standards. The post notes this is the 10th monthly recap since July 2025. 7

Why it circulates among NLP researchers: DeepSeek's releases have consistently outpaced expectations and forced open-source SOTA benchmarks upward. The pace of releases — multiple frontier-scale models going open in a single month — is accelerating the "open vs. closed" dynamics the field has been arguing about since 2023.

#10 — LLM FinOps: same agents, 10–30× cost reduction through architecture discipline

@ba_niu80557 (Data & AI Architect) · 50 views

A methodical breakdown of LLM FinOps — treating AI token spend with the same rigor as cloud or capital allocation. The core claim: agentic AI uses 5–30× more tokens per task than a chatbot due to compounding context across multi-step tool calls. Four levers: model routing (80% of tasks don't need a frontier model; routing gap can be 190×), prompt caching (50–90% cost reduction on cache-eligible workloads), output-side discipline (output tokens cost 4–8× more than input tokens), and agentic guardrails (maximum iterations, token budgets, loop detection). Sources cited include Zylos Research, Deloitte, and Gartner. 8
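The fourth lever (maximum iterations, token budgets, loop detection) can be sketched as a small bookkeeping object that an agent loop consults on every step. Everything here is a hypothetical illustration: the class name, the default limits, and the "identical tool call repeated too often" definition of a loop are assumptions, not details from the post.

```python
from dataclasses import dataclass, field


class GuardrailTripped(RuntimeError):
    """Raised when an agent run exceeds one of its limits."""


@dataclass
class AgentGuardrails:
    max_iterations: int = 10    # hypothetical default
    token_budget: int = 50_000  # hypothetical default
    max_repeats: int = 3        # identical tool calls allowed before it counts as a loop
    iterations: int = 0
    tokens_used: int = 0
    _call_counts: dict = field(default_factory=dict)

    def check(self, tool_call: str, tokens: int) -> None:
        """Record one agent step and raise if any limit is exceeded."""
        self.iterations += 1
        self.tokens_used += tokens
        self._call_counts[tool_call] = self._call_counts.get(tool_call, 0) + 1

        if self.iterations > self.max_iterations:
            raise GuardrailTripped("maximum iterations exceeded")
        if self.tokens_used > self.token_budget:
            raise GuardrailTripped("token budget exceeded")
        if self._call_counts[tool_call] > self.max_repeats:
            raise GuardrailTripped(f"loop detected: {tool_call!r} repeated")


# Usage: an agent stuck issuing the same search is stopped on the third attempt.
guard = AgentGuardrails(max_iterations=5, token_budget=10_000, max_repeats=2)
stopped_reason = None
try:
    for _ in range(4):
        guard.check("search('refund policy')", tokens=1_200)
except GuardrailTripped as err:
    stopped_reason = str(err)
print(stopped_reason)
```

The design choice is to fail loudly with an exception rather than silently truncate, so the calling code decides whether to retry, escalate, or stop.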

Why it's worth reading for researchers: The paper side of NLP rarely models inference cost. But if you're building anything that ships (agents, RAG pipelines, annotation assistants), the compounding math matters immediately. The post's $437 overnight bill example isn't hypothetical.
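The compounding effect can be sketched with simple arithmetic: if every step of an agent loop re-sends the full history, total input tokens grow roughly quadratically with the number of steps. The numbers below (context size, tokens added per step, step count) are illustrative assumptions, not figures from the post.

```python
def agent_input_tokens(steps: int, base_context: int, added_per_step: int) -> int:
    """Total input tokens when each step re-sends everything accumulated so far."""
    total = 0
    context = base_context
    for _ in range(steps):
        total += context           # the whole context is billed as input again
        context += added_per_step  # tool output and model reply are appended
    return total


# Assumed workload: 2,000-token starting context, 1,000 tokens appended per step.
chatbot = agent_input_tokens(steps=1, base_context=2_000, added_per_step=0)
agent = agent_input_tokens(steps=8, base_context=2_000, added_per_step=1_000)
print(chatbot, agent, agent / chatbot)  # 2000 44000 22.0
```

Under these assumed values an eight-step run costs 22 times the input tokens of a single chatbot turn; different assumptions shift that multiple considerably.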

Coverage window: past 24 hours as of Monday, May 18, 2026, 10:00 UTC. Engagement metrics reflect the point of collection.
