LLMs Research
We Built TurboAngle: Near-Lossless KV Cache Compression Without Calibration
14.8× less perplexity degradation than Google's TurboQuant at fewer bits, by quantizing angles instead of coordinates
Apr 5
•
LLMs Research
51
2
4
February 2026
Recursive Language Models: Treating the Prompt as Code
Recursive Language Models process millions of tokens by storing prompts as REPL variables. No architecture changes, no fine-tuning. 0% baseline to 91…
Feb 23
•
LLMs Research
12
1
How GLM Went From Fill-in-the-Blank (2021) to 744 Billion Parameters in 2026
GLM architecture evolution from blank infilling to 744B MoE: what worked at each stage, from Tsinghua lab to Zhipu AI's $19B IPO on Huawei Ascend chips.
Feb 14
•
LLMs Research
5
2
Your 70-Billion-Parameter Model Might Be 40% Wasted
February 1–6, 2026: Three papers converge on a decade-old suspicion. Most transformer layers aren't building knowledge. They're averaging noise.
Feb 11
•
LLMs Research
34
4
2
12:24
Fixing Reasoning from Three Directions at Once
How reasoning is being debugged across training, memory, and inference. Covering 17 papers from arXiv Feb 1–6, 2026. LLMs Research Podcast.
Feb 7
•
LLMs Research
7
1
12:49
Mamba's Memory Problem
Three ICLR 2026 papers, same bottleneck, different fixes
Feb 2
•
LLMs Research
8
5
7:18
January 2026
What ICLR 2026 Taught Us About Multi-Agent Failures
14 ICLR 2026 papers on why multi-agent systems break: slow pipelines, high costs, cascading errors, brittle graphs, opaque coordination.
Jan 31
•
LLMs Research
5
3
2
What ICLR 2026 Taught Us About Multi-Agent Failures
Lessons on why agent systems break and what solutions these papers provide
Jan 31
•
LLMs Research
2
1
17:13
Jan 17–23, 2026: The Rise of the Action Layer
How multimodal AI is shifting from passive perception to active control
Jan 25
•
LLMs Research
3
1
The Evolution of Long-Context LLMs: From 512 to 10M Tokens
How transformers learned to process millions of tokens
Jan 24
•
LLMs Research
6
15:42
The Evolution of Long-Context LLMs: From 512 to 10M Tokens
How LLMs evolved from 512 to 2M token context windows through sparse attention, FlashAttention, RoPE, and new architectures like Mamba and Ring…
Jan 24
•
LLMs Research
4
2
1
Jan 17–23, 2026: The Rise of the Action Layer
How multimodal AI is shifting from passive perception to active control
Jan 24
•
LLMs Research
4
1
15:57