Throughout 2024, we hosted a bi-weekly AI meeting to provide a forum for TheGP’s engineers, founders, and technical leaders in our network to debate the future of artificial intelligence. While plenty of time was spent on recent releases and benchmark results from the usual suspects, the more interesting moments came when the conversation turned speculative. Here's a look at some of the most thought-provoking ideas that emerged:
January to March Highlights
OpenAI's trust problem: Following the GPT Store launch in early January, the group debated whether OpenAI had earned the developer trust needed for an “app store” to work. The prevailing view was “no”: most participants believed any third-party developer success from launching GPTs (or the earlier plugins) would likely be overshadowed by concerns that OpenAI would use the data for its own ends.
Fighting AI with AI: The Justice Department’s February appointment of Princeton professor Jonathan Mayer as its first Chief AI Officer spurred a conversation around a Red Queen dynamic, in which both law enforcement and criminals adopt AI, nullifying each other’s advantages. Participants highlighted how both sides could end up sprinting just to stay in the same place.
Do we want malleable interfaces?: Vercel’s March release of AI SDK 3.0 introduced generative UI, allowing React components to stream dynamically from large language models (LLMs). Designers in the group highlighted the inherent difficulty of creating high-quality UI, noting that delegating this to LLMs could exacerbate existing challenges.
April to June Highlights
Humans demoted to copilot: In April, we discussed Cognition Labs’ launch of its coding agent, Devin. After 2023, a year dominated by AI copilots, we debated whether agentic approaches like Devin would deliver autonomy faster than the gradual evolution of human-controlled tools like Cursor.
Life after AI Receptionists: The Winter Y Combinator batch was bursting with voice AI platforms. After discussing the immediate interoperability benefits of AI receptionists—which simplify business integrations—we speculated on the longer term: how will business-to-customer interactions change when each side has an AI proxy pre-exploring, negotiating, or coordinating on its behalf?
Traveling through the dream browser: The conversation took a surreal turn when we covered Websim’s AI-simulated browser, which lets users explore a “hallucinated internet.” While still more of an art project at the time, one person in our network noted: “Websim reminds me of seeing a digital picture for the first time and that ‘aha’ moment where you can kind of imagine everything around you bending into that new format.”
July to September Highlights
Engineering features, not prompts: One of our most spirited debates centered on Anthropic's Scaling Monosemanticity research. Will perfectly engineered prompts eventually give way to the direct amplification or suppression of interpretable features in models? While many saw potential, one AI researcher pointed out that this approach will “hurt the model’s brain” — the equivalent of poking and prodding a human brain and expecting to effect predictable improvements.
Agents earning their own promotions: Asana’s AI Teammates launch inspired discussions about evolving productivity tools to accommodate AI contributors. We liked Asana’s “assist, act, adapt” progression, which doles out more responsibility to AI agents as they get better and gain user trust.
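One way to picture an “assist, act, adapt” progression is as a permission ladder that unlocks actions as an agent earns trust. The sketch below is purely illustrative — the level names echo Asana’s framing, but the actions and gating logic are our own assumptions, not Asana’s implementation:

```python
# Illustrative sketch of a trust ladder for AI agents: permitted actions
# expand as the agent's trust level rises. Not Asana's actual design.
from enum import IntEnum


class TrustLevel(IntEnum):
    ASSIST = 1  # suggest only; a human applies every change
    ACT = 2     # execute routine tasks; a human reviews afterward
    ADAPT = 3   # adjust its own workflows within guardrails


# Hypothetical actions mapped to the minimum trust level they require.
PERMITTED = {
    "suggest_task": TrustLevel.ASSIST,
    "complete_task": TrustLevel.ACT,
    "reassign_workflow": TrustLevel.ADAPT,
}


def allowed(action: str, level: TrustLevel) -> bool:
    """An action is allowed once the agent's level meets its requirement."""
    return level >= PERMITTED[action]


print(allowed("complete_task", TrustLevel.ASSIST))  # False
print(allowed("complete_task", TrustLevel.ACT))     # True
```

The appeal of this shape is that a “promotion” is just a one-line change to the agent’s level, with the permission table serving as an audit trail of what autonomy was granted when.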
“Straight to Pixels” AI experiences: We took a deep dive into Google’s GameNGen and the potential for neural models as gaming engines. Echoing Alex Atallah’s question on X and the earlier conversations around Websim, we explored other categories where neural networks can help us “skip physics and go straight to the pixels.”
October to December Highlights
Lending a hand to reasoner models: OpenAI’s o1-preview launch in September sparked conversations around test-time compute scaling and “reasoner” models. Aidan McLaughlin’s critique — that these models excel in domains with clear verification paths like math or coding — led us to explore how adjacent fields like legal contracting or business process automation might adapt to make AI completions more testable.
Building Pro-Human Systems: We spent an entire meeting discussing Altera’s Project Sid — 1,000+ autonomous agents collaborating in a virtual world — which provoked questions around how we can build pro-human agents and systems as we reach the next levels in AI.
AI outsourcing to humans: Seed-stage startup Payman is building a platform for AI agents to outsource tasks to humans. While one of our engineers compared this approach to “mounting a machine gun on a dog,” there was general agreement that human contributions will be a vital function call for agents.
These bi-weekly conversations are always a highlight at TheGP. We want to thank all our guest attendees, including Hanlin Tang, Sami Torbey, Mike Adams, Uri Merhav, Nitai Dean, Pierre Brunelle, Niall O’Higgins, Kate Cook, John Smart, Sasha Aickin and Manasi Vartak for joining. If you're a founder or builder working on AI and would like to join these debates in the new year, please reach out.
And thank you to our team at TheGP for always keeping these timely debates lively and thoughtful: Alec Flett, David Watson, Mikey Wakerly, Stas Baranov, Justin Rosenthal, Lilian Caylee Wang, Marcus Gosling, Ted Mao, Greg Harezlak and Kris Kostelecky.