The week ahead: 20251208
-
[News] Trump launches the Genesis Mission, a government-wide effort to use federal scientific datasets to train models.
-
[Tweet] Super long Gemini 3 Pro system instruction, reported to have a significant impact on various agentic benchmarks.
-
[Company] Conductor - Run parallel coding agents in isolated environments. YC company page.
-
[Tweet] DeepSeek-V3.2 & DeepSeek-V3.2-Speciale.
-
[Paper] In RL, penalizing mistakes seems to contribute more than rewarding successes.
-
[Paper] Imitation-based interactions at the individual level can be reinterpreted as collective RL at the group level. In other words, bandit RL can be implemented as a collection of very simple parts.
-
[Blog] Feedback descent: use textual feedback to learn instead of a single score. When model weights are too entangled, this allows learning to continue in semantic space. Relevant paper.
-
[Blog] Best practices for AI code reviews.
-
[Blog] What does it mean to ship a project?
-
[News] Claude is dead.
-
[Column] The PMF of email tools.
Have a great week!