Industry

Research

  • [Website] Tau bench - function calling and tool use.
  • [Tweet] BullshitBench
    • push back on nonsensical prompts.
  • [Report] OpenAI: the next era of knowledge work. Knowledge work as defined by search + coordination + approval. Related paper on the productivity paradox.
  • [Paper] If LLMs Have Human-Like Attributes, Then So Does Age of Empires II.

Other

  • [Blog] Recursive self-improment - When AI builds itself.

Have a great week!

Updated:

Comments