Tom Spencer · Episode: Ep 8 - Kimi2, Is RAG still a thing? and the coming SaaS bloodbath. · Category: points_of_view
Evaluating a model based on a handful of online demos is misleading: different tasks reveal different behaviors, so no single demo represents general performance.