12 hours ago, Deep Research was unveiled, and I’ve tested it thoroughly, including against DeepSeek R1 with search, Gemini Deep Research and even R1 in Perplexity. It’s a notable step forward, with one big caveat. I’ll go through all the benchmark figures, my initial impression of the o3 model within, and much more.

AI Insiders ($9!) - my Patreon with exclusive vids: https://www.patreon.com/AIExplained

Deep Research: https://openai.com/index/introducing-deep-research/
https://www.youtube.com/watch?v=YkCDVn3_wiw
GAIA Bench: https://openreview.net/forum?id=fibxvahvs3
https://openreview.net/pdf?id=fibxvahvs3
CodeELO: https://arxiv.org/pdf/2501.01257
CamelCamel: https://uk.camelcamelcamel.com/
DeepSeek R1 with search: https://chat.deepseek.com/
https://arxiv.org/pdf/2501.12948
HaluBench: https://arxiv.org/pdf/2407.08488

Chapters:
00:00 - Introduction
01:06 - Powered by o3, Humanity’s Last Exam, GAIA
03:55 - Simple Tests
06:00 - Good News vs DeepSeek R1 and Gemini Deep Research
09:32 - Bad News on Hallucinations
14:14 - What Can’t it Browse?
14:42 - For Shopping?
16:40 - Final thoughts

Non-hype Newsletter: https://signaltonoise.beehiiv.com/
Podcast: https://aiexplainedopodcast.buzzsprout.com/