AI Revolution – April 23, 2026

Daily AI briefing — frontier models, research, and infrastructure.

Episode Summary

Today's episode covers 9 stories across 5 topic areas, including: Google unveils 8th-gen TPUs, agent platform, and Workspace AI layer at Cloud Next '26; NVIDIA and Google infrastructure cuts AI inference costs; AI Agent Designs a RISC-V CPU Core From Scratch.

Stories Covered

• Infrastructure

Google unveils 8th-gen TPUs, agent platform, and Workspace AI layer at Cloud Next '26

The Decoder · Apr 22 · Relevance: █████████░ 9/10

Why it matters: Google's 8th-gen TPU split into dedicated training and inference chips signals a major architectural shift in custom AI silicon, directly challenging NVIDIA's dominance. The dual-chip strategy paired with the rebranded Gemini Enterprise Agent Platform represents Google's most aggressive cloud AI infrastructure play to date.

Google unveiled two separate 8th-generation TPUs: one optimized for training and one for inference
The Vertex AI platform has been rebranded as Gemini Enterprise Agent Platform targeting agentic workloads
A new AI layer for Google Workspace was announced alongside the hardware


NVIDIA and Google infrastructure cuts AI inference costs

AI News · Apr 23 · Relevance: ████████░░ 8/10

Why it matters: The new A5X bare-metal instances running on NVIDIA Vera Rubin NVL72 rack-scale systems promise up to 10x lower inference costs, which could fundamentally change the economics of deploying large models at scale. This joint hardware-software codesign approach addresses the biggest bottleneck for enterprise AI adoption.

New A5X bare-metal instances will run on NVIDIA Vera Rubin NVL72 rack-scale systems
Architecture aims to deliver up to 10x lower inference costs through hardware-software codesign
Announced at Google Cloud Next as part of a joint NVIDIA-Google roadmap


• Research

AI Agent Designs a RISC-V CPU Core From Scratch

IEEE Spectrum AI · Apr 22 · Relevance: ████████░░ 8/10

Why it matters: An agentic AI system designing a complete, functional RISC-V CPU core at 1.5 GHz represents a meaningful milestone in AI-driven hardware design, moving well beyond prior work that only generated circuit fragments or basic logic. This suggests AI could dramatically compress chip design timelines.

Verkor.io's agentic AI system designed a complete RISC-V CPU core called VerCore from scratch
The CPU runs at 1.5 GHz with performance comparable to a 2011-era laptop processor
The key advance is using an end-to-end agentic approach rather than specialized AI for individual design subtasks


AI Designs Thermoelectric Generators 10,000 Times Faster Than We Can

IEEE Spectrum AI · Apr 23 · Relevance: ███████░░░ 7/10

Why it matters: Japanese researchers demonstrating AI-designed thermoelectric generators that match state-of-the-art performance while accelerating the design process by four orders of magnitude is a compelling example of AI for scientific discovery with near-term practical applications in energy harvesting.

AI tool designs thermoelectric generators 10,000x faster than conventional simulation-based approaches
Prototypes built from AI recommendations matched performance of current leading designs
Research conducted in Japan, targeting waste heat recovery applications


• Applications

OpenAI says its new ChatGPT for Clinicians outperforms doctors on clinical tasks even when they have unlimited time and web access

The Decoder · Apr 23 · Relevance: ████████░░ 8/10

Why it matters: OpenAI launching a dedicated clinical AI product backed by benchmark claims of surpassing physician performance — even with unlimited time and internet — marks a significant step toward domain-specific AI deployment in healthcare. The free pricing model suggests a strategic play for institutional adoption.

ChatGPT for Clinicians is a free, dedicated medical version of ChatGPT for healthcare professionals
GPT-5.4 reportedly outperforms human doctors on clinical benchmarks even when doctors have unlimited time and web access
The product targets medical professionals rather than general consumers


OpenAI now lets teams make custom bots that can do work on their own

The Verge · Apr 22 · Relevance: ███████░░░ 7/10

Why it matters: OpenAI's workspace agents represent a significant product evolution from chatbot to autonomous task-execution platform, directly competing with Microsoft Copilot and Google's enterprise agent offerings. The shift to persistent, cloud-based agents that run unattended marks a new phase in enterprise AI deployment.

Workspace agents are available to Business, Enterprise, Edu, and Teachers plan users
Agents are cloud-based and can perform autonomous business tasks like web monitoring and Slack reporting
Powered by Codex, these agents evolve the custom GPTs concept into persistent automation workflows


• Industry

Exclusive: Google deepens Thinking Machines Lab ties with new multibillion-dollar deal

TechCrunch AI · Apr 22 · Relevance: ████████░░ 8/10

Why it matters: Mira Murati's Thinking Machines Lab securing a multibillion-dollar Google Cloud deal for NVIDIA GB300-powered infrastructure validates both the startup's frontier ambitions and the continued pattern of hyperscalers bankrolling AI labs through compute deals. This further concentrates frontier AI development among a small number of well-funded players.

Thinking Machines Lab (founded by ex-OpenAI CTO Mira Murati) signed a multibillion-dollar infrastructure deal with Google Cloud
The deal provides access to NVIDIA's latest GB300 chips
This deepens an existing relationship between Murati's lab and Google


Musk bets Tesla's AI future on Intel node that isn't finished yet

The Register AI · Apr 23 · Relevance: ███████░░░ 7/10

Why it matters: Tesla committing to Intel's unfinished 14A process for custom AI chips at its Terafab represents a high-stakes vertical integration bet. If it works, Tesla gains supply chain independence from NVIDIA; if the node slips, Tesla's autonomous driving and robotics timelines face cascading delays.

Tesla plans to build custom AI chips on Intel's 14A process, which is still in development
The chips are intended for Tesla's Terafab manufacturing facility
Musk stated Tesla needs to build its own silicon for its AI workloads


• Policy

Unauthorized users breach Anthropic's restricted Mythos AI model

The Decoder · Apr 22 · Relevance: ███████░░░ 7/10

Why it matters: The breach of Anthropic's access-restricted Mythos model — which was withheld from public release due to its cybersecurity capabilities — raises serious questions about whether voluntary access controls on dangerous models are sufficient. This is a real-world test case for the frontier model security debate.

Unauthorized users gained access to Anthropic's restricted Mythos model, per Bloomberg
Mythos was deliberately restricted from public access due to its reported vulnerability-finding capabilities
Early independent analysis suggests the model's dangerous capabilities may be overstated


Full Transcript


Sam: Google just split its TPU into two separate chips — one for training, one for inference — and that architectural decision tells you everything about where AI infrastructure is heading.

Priya: Welcome to AI Revolution for Thursday, April 23rd, 2026. I'm Priya Nair, here with Sam Kim. Today we're deep in infrastructure week — Google Cloud Next delivered a lot to unpack, from eighth-gen TPUs to a major NVIDIA joint announcement. We've also got an AI agent that designed a working CPU from scratch, OpenAI making a serious move into clinical medicine, and a real security breach at Anthropic involving a model they deliberately kept locked away. Let's get into it.

Sam: So let's start with that TPU split, because it's genuinely interesting from a systems design perspective. For seven generations, Google built a single TPU that had to do everything: training runs, inference serving, fine-tuning. But training and inference demand fundamentally different trade-offs. Training is about maximizing sustained throughput on massive matrix multiplications, with high memory bandwidth and a tolerance for latency, because you're going to be running the job for weeks. Inference is almost the opposite: you want low latency, high concurrency, efficient batching of variable-length requests, and you're extremely cost-sensitive because you're serving millions of requests continuously.

Priya: And when you try to optimize a single chip for both, you end up compromising on both. What's interesting architecturally is that this mirrors what we've seen in the CPU world — where workload specialization eventually wins over general-purpose designs when you're operating at sufficient scale. Google is operating at that scale.

Sam: Exactly. And pairing this with the NVIDIA announcement makes the picture clearer. The A5X bare-metal instances running on NVIDIA's Vera Rubin NVL72 rack-scale systems are targeting that same inference cost problem from a different angle. The NVL72 connects 72 Blackwell-successor GPUs in a rack-scale configuration with NVLink, so you're essentially treating the entire rack as a single logical accelerator. The claim is up to ten times lower inference costs through hardware-software codesign. That's not just faster hardware; it means rethinking how inference workloads are scheduled, batched, and distributed across the fabric.

Priya: The codesign part is the key word there. You can't just drop new hardware in and get a ten-x improvement. The software stack has to be built around the memory hierarchy and interconnect topology of that specific system. When Google and NVIDIA are co-designing both sides of that equation together, that's a real architectural advantage. For engineers deploying large models in production today, inference cost is the main bottleneck. A ten-x reduction meaningfully changes what's economically viable to deploy.

Sam: And then there's the Vertex AI rebrand to Gemini Enterprise Agent Platform, which is less a technical story and more a product positioning one. Google is making a clear bet that the primary enterprise use case going forward is agentic workloads — multi-step, autonomous task execution — not just inference on individual prompts. The Workspace AI layer fits that same narrative.

Priya: Which leads us directly to OpenAI's workspace agents announcement. They're rolling out cloud-based agents for Business, Enterprise, Edu, and Teachers plans, powered by Codex, that can run persistently and autonomously: web monitoring, Slack reporting, sales workflows. These are agents that run unattended in the cloud, not assistants you're chatting with.

Sam: The architectural shift here is meaningful. Earlier custom GPTs were essentially stateless — you gave them a system prompt and some tools, and each conversation started fresh. Persistent cloud agents maintain state across sessions, can be scheduled, and integrate into existing business workflows as first-class automation actors. That's a qualitatively different product category. And the timing is notable — Google just rebranded their entire platform around agents, and OpenAI is shipping persistent agents the same week.
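
The stateless-versus-persistent distinction Sam describes can be illustrated with a minimal Python sketch. This is a toy under stated assumptions, not OpenAI's implementation or API: the names `stateless_reply`, `PersistentAgent`, and `run_scheduled_check`, and the JSON state file, are all hypothetical and exist only for illustration.

```python
import json
from pathlib import Path


def stateless_reply(system_prompt: str, message: str) -> str:
    """Hypothetical stateless bot: nothing is remembered between calls."""
    return f"[{system_prompt}] handled: {message}"


class PersistentAgent:
    """Hypothetical persistent agent that keeps state across sessions.

    State lives in a JSON file here purely for illustration; a hosted
    agent platform would presumably use its own storage.
    """

    def __init__(self, state_path: Path):
        self.state_path = state_path
        if state_path.exists():
            # Resume from whatever a previous session saved.
            self.state = json.loads(state_path.read_text())
        else:
            self.state = {"runs": 0, "seen": []}

    def run_scheduled_check(self, observed_items):
        """Report only items not seen in any earlier run, then save state."""
        new_items = [i for i in observed_items if i not in self.state["seen"]]
        self.state["seen"].extend(new_items)
        self.state["runs"] += 1
        self.state_path.write_text(json.dumps(self.state))
        return new_items
```

In this sketch, two separate `PersistentAgent` instances pointed at the same state file behave like one long-lived agent: the second run only reports what changed since the first, which is the property a web-monitoring workflow relies on.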

Priya: Let's talk about the AI-designed CPU, because this is one of those milestones that's easy to misread in both directions. Verkor.io built an agentic system that designed a complete RISC-V CPU core called VerCore, running at 1.5 gigahertz — performance roughly comparable to a laptop processor from 2011.

Sam: The 2011 comparison is important context. This isn't competitive with modern chip design. But the significance isn't the performance number — it's the end-to-end agentic approach. Prior work in AI-assisted chip design handled specific subtasks: maybe AI helps place logic cells, or optimizes a routing step. VerCore was designed by an agent handling the full pipeline — RTL generation, verification, timing closure, the whole flow — without human engineers driving each stage.

Priya: The progression here is worth laying out. 2020: GPT-2 fine-tuned to generate logic circuit fragments. 2023: GPT-4 helping design an 8-bit processor with human guidance. 2024: LLMs producing basic functional circuits, often with bugs. 2026: an agentic system producing a complete, functional CPU core end-to-end. Each step isn't just a performance improvement — it's a different level of autonomy in the design process.

Sam: And if the design timeline for a processor core compresses from months to days, that changes what's feasible to explore. You could run thousands of architectural variants and test them. The constraint on hardware innovation shifts from design time to fabrication time.

Priya: Now let's talk about the OpenAI clinical AI story, because this one has a lot of signal in it. ChatGPT for Clinicians is a free, dedicated medical version targeted at healthcare professionals, backed by benchmark results claiming GPT-5.4 outperforms doctors on clinical tasks — even when those doctors have unlimited time and web access.

Sam: The unlimited time and web access condition is specifically designed to control for the argument that AI has an unfair advantage because it's fast. If a doctor can take as long as they want and look anything up, and the model still outperforms them on these benchmarks, that's a stronger claim. Now, clinical benchmarks are not the same as clinical practice — they're testing structured question-answering under controlled conditions, not the full complexity of a patient encounter. That caveat matters.

Priya: It does. But the free pricing model is a strategic signal. This isn't being positioned as a premium product. OpenAI is trying to get broad institutional adoption in healthcare, which means building a distribution moat before competitors do. The capability claim gets attention; the free pricing is the actual go-to-market move.

Sam: Let's get to the Anthropic Mythos breach, which is genuinely important for the frontier model safety conversation. Anthropic had decided not to publicly release Mythos because of its reported vulnerability-finding capabilities. A small group of unauthorized users got access anyway. The early analysis suggests the actual dangerous capabilities may be less severe than feared, but that almost doesn't matter for the policy question.

Priya: Right. The breach itself is the stress test. Anthropic's approach — and this is representative of a broader voluntary access control framework — assumes that restricting model access through API policies and deployment controls is sufficient to prevent misuse of a highly capable model. This breach suggests those controls have real limits.

Sam: The interesting technical question is how the breach happened. We don't have full details, but unauthorized access to a model that's not publicly deployed suggests either a credential or access control failure, not someone jailbreaking a public model. Those are very different threat models with very different mitigations.

Priya: Two quick industry items. Thinking Machines Lab — Mira Murati's post-OpenAI startup — signed a multibillion-dollar infrastructure deal with Google Cloud, getting access to NVIDIA GB300 chips. This is the second major funding signal from Murati's lab, and it fits the pattern we keep seeing: hyperscalers financing frontier AI labs through compute deals rather than pure equity investment. Google gets a committed customer and strategic alignment; the lab gets the infrastructure to compete.

Sam: And Tesla is betting on Intel's 14A process node for custom AI chips at its Terafab facility. The 14A node isn't finished yet — Musk is committing to silicon that doesn't exist in production form. The upside is supply chain independence from NVIDIA, which matters enormously for a company whose autonomous driving and robotics roadmap depends on AI inference at scale. The downside is that process node delays are common, and any slip in Intel's 14A timeline cascades directly into Tesla's product schedule.

Priya: One more research story worth spending a minute on. Japanese researchers built an AI tool that designs thermoelectric generators — solid-state devices that convert heat differentials directly into electricity — ten thousand times faster than conventional simulation-based approaches. And the prototypes built from those designs matched current state-of-the-art performance.

Sam: The four-orders-of-magnitude speedup is remarkable. The traditional approach involves iterating through candidate materials using physics simulations, which are computationally expensive. What AI enables here is essentially learning the mapping from material properties to device performance well enough to search that space intelligently rather than exhaustively. This is the scientific discovery pattern we've been watching develop — AI doesn't replace the physics, it makes the search tractable.
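
The "search intelligently rather than exhaustively" idea can be sketched in a few lines of Python. This is an illustrative toy, not the researchers' method: the one-dimensional design space, the stand-in `CountingSimulator`, and the linear-interpolation surrogate are all assumptions made for the example. The pattern it shows is to run the expensive simulation on a few designs, fit a cheap model to those results, rank every candidate with the cheap model, and simulate only the shortlist.

```python
class CountingSimulator:
    """Stand-in for an expensive physics simulation (hypothetical toy).

    Counts calls so the cost of a search strategy can be compared.
    """

    def __init__(self):
        self.calls = 0

    def __call__(self, x: float) -> float:
        self.calls += 1
        return -(x - 0.62) ** 2  # toy performance curve, best at x = 0.62


def fit_surrogate(samples):
    """Cheap model: linear interpolation between simulated (x, score) points."""
    pts = sorted(samples)

    def predict(x: float) -> float:
        for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
            if x0 <= x <= x1:
                t = (x - x0) / (x1 - x0)
                return y0 + t * (y1 - y0)
        raise ValueError("x is outside the sampled range")

    return predict


def surrogate_search(simulate, candidates, coarse_points, top_k):
    """Simulate a coarse grid, rank all candidates cheaply, verify the top_k."""
    samples = [(x, simulate(x)) for x in coarse_points]
    predict = fit_surrogate(samples)
    shortlist = sorted(candidates, key=predict, reverse=True)[:top_k]
    verified = [(simulate(x), x) for x in shortlist]
    best_score, best_x = max(verified)
    return best_x, best_score


simulator = CountingSimulator()
candidates = [i / 1000 for i in range(1001)]  # 1,001 candidate designs
coarse_grid = [i / 10 for i in range(11)]     # 11 designs simulated up front
best_x, best_score = surrogate_search(simulator, candidates, coarse_grid, top_k=20)
```

Here the search spends 31 simulator calls (11 coarse plus 20 verified) instead of the 1,001 an exhaustive sweep would need, and the physics simulation still has the final say on the shortlisted designs.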

Priya: So let's look ahead. The through-line in today's news is the infrastructure layer consolidating fast. Google just made moves on training silicon, inference silicon, platform positioning, and a major lab partnership — all in one week. The gap between organizations with access to frontier infrastructure and those without is widening.

Sam: The agent platform convergence is the other thing I'm watching closely. Google, OpenAI, and Microsoft are all shipping persistent autonomous agents into enterprise environments this month. The technical capability to run agents that take real actions at scale is arriving before we have good answers to some basic questions: how do you audit what an agent did and why, how do you scope permissions appropriately, and what happens when agents from different vendors start interacting in the same workflow.

Priya: And the Mythos breach puts the model access control question back on the table right as capability benchmarks are claiming doctors can be outperformed on clinical tasks. The policy frameworks haven't kept pace with either the capabilities or the deployment velocity.

Sam: A lot to watch. That's it for today's episode of AI Revolution. If you found this useful, share it with someone in your orbit who's trying to keep up with where this is all going.

Priya: We're back tomorrow. Thanks for listening.

AI Revolution is an automated daily podcast covering AI advancements. Generated 2026-04-23.

Sources: MIT Technology Review, VentureBeat AI, The Verge, Wired, TechCrunch AI, Ars Technica, IEEE Spectrum, The Decoder, The Gradient, Hugging Face Blog, Google AI Blog, AI News, SemiAnalysis, and The Register.
