AI on IBM z17, Meta’s Llama 4 and Google Cloud Next 2025

Watch the episode
Episode 50: AI on IBM z17, Meta’s Llama 4 and Google Cloud Next 2025

IBM z17™ is here! In episode 50 of Mixture of Experts, host Tim Hwang is joined by Kate Soule, Shobhit Varshney and Hillery Hunter to debrief the launch of a new mainframe with robust AI infrastructure. Next, Meta dropped Llama 4 over the weekend—how is it going? Then, Shobhit, recording live from Google Cloud Next in Las Vegas, covers the most exciting announcements, including Gemini 2.5 Pro. Finally, a Pew Research Center report examines perceptions of AI—how does this impact the industry? All that and more on today’s 50th episode of Mixture of Experts.

Key takeaways:

  • 00:00 – Intro  
  • 00:55 – IBM z17
  • 11:42 – Llama 4
  • 25:02 – Google Cloud Next 2025 
  • 34:29 – Pew’s research on perception of AI

The opinions expressed in this podcast are solely those of the participants and do not necessarily reflect the views of IBM or any other organization or entity.

Listen on Apple Podcasts, Spotify, YouTube and Casted.

Episode transcript

Tim Hwang: What percentage of enterprise data is unstructured data? Kate Soule is Director of Technical Product Management for Granite. Kate, welcome back to the show. What’s your estimate?

Kate Soule: This feels like a trap. Without any data, just a wild guess, I’m gonna say 40%.

Tim Hwang: Shobhit Varshney is Head of Data and AI for the Americas. Shobhit, tuning in live from Vegas. What do you think?

Shobhit Varshney: 200%. Have you seen the quality of structured data in companies?

Tim Hwang: Alright, great. And last but not least, joining us for the very first time is Hillery Hunter, IBM Fellow and CTO of IBM Infrastructure. You’ve got an advantage on this question, but I don’t know if you wanna offer your guess.

Hillery Hunter: Yeah, I’ll take the midpoint there—not exactly the midpoint, but I’ll go with 80%.

Tim Hwang: Okay, great. So the answer is 90%. We’re gonna talk about that today, and all that and more, on the very 50th episode of Mixture of Experts!

Tim Hwang: 50th episode! Crazy!

Kate Soule: Woo-hoo!

Tim Hwang: Woo-hoo! I’m Tim Hwang, and welcome to Mixture of Experts. Each week, MoE brings together a talented and just lovely group of researchers, product leaders, and more to discuss and debate the week’s top headlines in artificial intelligence.

As always, there’s a ton to cover. We’re gonna talk about the Llama 4 release; Shobhit’s in Vegas, so he’s gonna tell us all about Google Cloud Next; and some really super interesting research coming out of Pew Research.

But today, because Hillery is on the line with us, we want to take the opportunity to talk about IBM z17, a new launch that came out, I believe, on Tuesday. It’s a mainframe launch. So I guess, Hillery, do you wanna just start—for listeners who are less familiar with the sector, what is a mainframe anyways, and why is it important?

Hillery Hunter: Yeah. I think first, a fun fact is that “z” stands for zero downtime, and mathematically, that’s an interesting conversation. We talk about the system now having eight nines of reliability. The way you count those nines is you say it’s 99 point... and then six more nines. So that’s how you get to it—it’s a lot of nines.

Tim Hwang: Nines of resiliency, yeah.

Hillery Hunter: Yeah. But it means just a couple hundred milliseconds a year of downtime on average. When I talk to family members or meet someone socially, I kind of say we work on building the computers that you don’t see and that you just assume are there and never think about. What that means is, this is where most of the world’s financial transaction volume—everything from things in the market to your personal credit card transactions—go through in the back end. You hopefully never think about whether that computer’s gonna work or if your credit card transaction will go through. These are systems we all assume are up all the time. So it’s really at the core of the global economy, to be honest; that’s really not an exaggeration.
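A quick back-of-the-envelope calculation (an editorial illustration, not an official IBM figure) shows how eight nines of availability translates into downtime per year:

```python
# Back-of-the-envelope check: what does 99.999999% ("eight nines")
# availability allow in downtime per year?
SECONDS_PER_YEAR = 365 * 24 * 60 * 60          # 31,536,000 seconds
availability = 0.99999999                      # eight nines
downtime_fraction = 1 - availability           # about 1e-8
downtime_seconds = SECONDS_PER_YEAR * downtime_fraction
print(f"{downtime_seconds * 1000:.0f} ms of downtime per year")  # ~315 ms
```

That works out to roughly 315 milliseconds a year, in the ballpark of the “couple hundred milliseconds” Hillery mentions.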

Tim Hwang: Yeah. What I love about this is you work on arguably some of the highest-stakes computing. One of the most interesting things about the launch is that AI is a big part of it. I know there’s the z17, which is the mainframe, and then there’s the “z” software, which sounds like IBM pushing into the idea that these eight-nines-of-reliability computers are really gonna get integrated into the overall AI revolution.

We’ve talked on the show before about how AI is not always production-ready; it sometimes messes up, it’s stochastic, it has all sorts of randomness. So I’m curious to hear more about what’s getting launched on the software side, and how you get AI to work at such a high level of reliability that most software developers never even need to think about as they’re vibe coding or whatever.

Hillery Hunter: Yeah, it’s a pretty different space, but it’s equally fascinating, I think, as that whole vibe coding space that a lot of folks interact with daily. From a technical perspective, getting things done in transaction processing means having millisecond-level AI. That means super, super fast, tightly integrated, being able to handle billions of transactions a day, and being able to score things at line speed.

An anecdotal example: if you’re talking about fraud and analytics in the credit card transaction processing space, if I, as a consumer, am buying something online, it’s okay if there are minutes to hours before the thing gets shipped out; fraud can happen offline. But if it’s in a store and somebody’s trying to rip you off and buy an expensive phone at Best Buy, you wanna make sure that instantaneously, the moment the transaction goes through, it’s detected as fraudulent. So there’s actual real economic and consumer value to being able to score every transaction in real time.

The interesting thing we’re now talking about being possible on this next generation of mainframe is multi-model AI. So a really small, fast, compact model running right on the processor, dealing with massive transaction throughput. Maybe occasionally it has low confidence in the scoring it provided and needs to be backed up by a more robust, complicated model. So we’re putting extra AI cards, called Spyre cards, into the system to enhance not just the super-fast processing on the processor itself, but also do fast processing one step slightly removed on a PCIe-attached set of cards. We’ve just multiplied the AI capacity and throughput for the system.
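To make the multi-model idea concrete, here is a minimal sketch of a confidence-threshold cascade: a small on-processor model scores every transaction, and only low-confidence cases escalate to a larger model on an attached accelerator. The function names and threshold are hypothetical illustrations, not IBM APIs.

```python
# Illustrative two-tier scoring cascade (hypothetical interfaces).
# A small, fast model scores every transaction in-line; only low-confidence
# results are escalated to a larger model on an attached accelerator card.

CONFIDENCE_THRESHOLD = 0.90  # hypothetical cut-off for escalation

def score_transaction(txn, small_model, large_model):
    # First pass: tiny model running next to the transaction path.
    fraud_prob, confidence = small_model.score(txn)   # assumed interface
    if confidence >= CONFIDENCE_THRESHOLD:
        return fraud_prob                              # fast path, sub-millisecond
    # Second pass: larger model one hop away (e.g., a PCIe-attached card).
    return large_model.score(txn)[0]                   # slower but more robust
```

The design point is that the slow path is taken rarely, so average latency stays close to the fast model’s.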

Also, from the perspective of the total system experience on the software side, we now have something called Operations Unite, which is an AIOps-driven, AI chat-driven interface to everything going on in the system—observing, remediating issues, all happening in a totally modern interface. So it’s pervasive; once you put the AI capability in, it’s not just about the workloads running in the system, but also how people use, operate, and keep the whole thing stable and healthy.

Tim Hwang: Yeah, that’s awesome.

So, Shobhit, I’d love to bring you in. I launched this episode with a question about how much unstructured data enterprises are sitting on, and I’m sure this is a problem you deal with and talk about with customers day in, day out. I know that’s a component of this launch, but I’m curious if you want to opine on how the world is evolving there and how the Z launch fits into those questions.

Shobhit Varshney: I’m a big fan of the Z Series. I grew up in a cloud-first, AI-first world, and I have so much respect for understanding the right balance between where mainframes should play versus where the clouds are. As an example, working with a very large bank, we leveraged cloud environments with many different GPUs and compute to train the models. But once you have fine-tuned the models to enterprise data, you wanna bring them where the transactions are happening. These are sub-millisecond transactions, happening very quickly; you’re doing billions of them every hour. So you want to bring the AI inference as close as possible to where the transaction is happening.

In the first wave of unstructured content analysis, you might have a large language model that summarizes a call recording or does some knowledge search. Now, in the next wave, once we’ve proven the technology works, you wanna do this in more mission-critical workflows. For example, in fraud detection, like Hillery mentioned, there are many patterns to look for. It’s not just that one transaction; you need to look at how that transaction happened. At that point, in sub-milliseconds, larger models have a lot of latency. You obviously can’t afford to have that data go out to the cloud and come back—security issues, latency, and other things.

So we are in a world where we see many of our larger Fortune 100 companies move from experimenting with large frontier models via API calls to fine-tuning smaller open models and bringing them close to the compute. I think the Z series works incredibly well in this space. We also have the brand permission with Z—what, Hillery, 90% of all credit card transactions happen on Z, and 90% of the Fortune 50 banks rely on us, and so on? Airlines, retailers... So you’re in the mission-critical workflows. This is no longer, “Hey, let me ask the prompt a different way.” You’re not experimenting; you are doing this in critical workflows.

Hillery Hunter: You know, I love that you went to latency. I think one thing related to leaving the system is the data security model, data sovereignty—all those other hot topics. Bringing AI to where that data is, where that mission-critical data is, where that valuable and sensitive consumer and personal information is, is a big part of this conversation.

Another thing, in addition to latency and data protection, is energy. We’ve greatly increased the AI capability and overall capability of the system, but dropped the power consumption for this whole system, generation to generation, by 17%. The team has measured that it’s about 5x more efficient to do that AI in place where the data is than, to your point, calling out to some external system. These days, everybody’s running out of power, looking to build out more data centers, and all that stuff. Being able to do AI so efficiently is a really exciting step forward.

Shobhit Varshney: And Hillery, just about a month back, I was with one of the largest top three credit card companies, and we were discussing concerns around fraud detection. We can obviously do a lot of LLM work to understand patterns; it’s not just a spot in time. Even a month back, we struggled to bring LLM models into real-time transactions because it’s just sub-millisecond. I was just so proud that this past week, we’ve been able to go after use cases we couldn’t touch even a few weeks back.

So we’re coming to a point where clients understand they’ve proven inside their enterprises that they can use LLMs and have trained them in a particular way, but latency was in the way. A lot of our clients—huge kudos to your team—once you bring enough AI close to the data, to your point, the creativity just explodes. Every developer in this core enterprise space is now thinking, “Oh, that’s now for me too”—it’s no longer just for people elsewhere in different environments. It’s now insurance claims processing, even medical image assessment. There are all kinds of amazing things going on on that core data. AI is also for those people, for that data, and for that context. That’s super exciting.

Tim Hwang: So, Hillery, before we move on, what comes next for you all?

Hillery Hunter: The capabilities with Spyre come out in the fourth quarter. There’s a rolling set of announcements on different software enhancements. The way to think about it is we’re making these systems AI through and through. Starting back in z/OS 3.1, the last release, we started putting AI inside the system—looking toward self-healing and automating the management of system efficiency. What we’ve stated about z/OS 3.2, coming out, is even more integration of that smartness into the core of how the system operates, how operations teams experience it, and going all the way out into our support staffing.

If you call IBM for help, we are also using watsonx technology to help those agents helping you with your mainframe. We started that project in our Technology Lifecycle Services organization with our storage products, and we’ve announced this week we’re bringing that to mainframe support. So that whole experience end-to-end—how the system runs, what you can do on it, what you understand about it, and how somebody helps support you—is all gonna be AI-enabled. That end-to-end, full-stack story is really exciting. This is us living what we’ve been talking about with the power of AI.

Tim Hwang: This is awesome. We’d love to have you back on the show as things unfold. It’s a segment of AI we haven’t talked much about, but I love it personally because it’s this high-stakes thing you really gotta get right. It’s a kind of AI engineering you don’t see in a lot of other places, which is really exciting.

I’m gonna move us to our next topic. Meta has released Llama 4, a long-awaited release in the open-source space. There are three models they’ve talked about, two of which have actually been released: the Scout model and the Maverick model, with the Behemoth model previewed but not yet out. It follows a pattern we’ve seen elsewhere, launching both smaller and bigger models for different applications.

Kate, maybe I’ll start with you. I don’t know if you’ve played with the models yet, but curious about your early impressions, your vibe check on this release.

Kate Soule: It’s been a busy week, so I haven’t played with them directly, but it’s really exciting. I’ve been reading up on them. With the release of their largest model—over 400 billion parameters, I believe, mixture of experts—and Scout, which I think is around 100 billion parameters, they’re starting to take on larger tasks and create powerful models in the open-source ecosystem. With the announcement of their Behemoth model, which is 2 trillion parameters... that’s big, Tim. That’s pretty big.

They’re talking about, on earlier trained versions and checkpoints, it’s cracking GPT-4o on tasks like science. So they’re putting themselves out there as a frontier model provider. Doing that in the open only continues to put more pressure on closed labs to release their work, and it helps the community. That’s really interesting.

There’s a lot to be said about the mixture of experts architecture. DeepSeek made this famous with their big update back in December. It’s an architecture used more broadly even before that. I’m hopeful this release will get broader community support behind mixture of experts. There are tons of interesting things about it: very training-efficient, inference-efficient, particularly at low batch size. You only use the experts you need at inference time, which, if running one or two tasks, can be very efficient. You start to lose that at larger batch sizes because you have to load all experts into memory. Most people don’t realize that about mixture of experts.
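As a rough illustration of the routing behavior Kate describes—a toy sketch, not the Llama 4 or Granite implementation—a mixture-of-experts layer activates only the top-k experts per token, even though every expert’s weights must stay loaded:

```python
import numpy as np

def moe_layer(x, experts, router_weights, k=2):
    """Toy mixture-of-experts layer: route one token to its top-k experts.

    x: (d,) token activation; experts: list of callables; router_weights: (d, n_experts).
    Only k experts run per token, which is why inference can be cheap at small
    batch sizes -- but all experts must still be resident in memory, and at
    large batch sizes every expert ends up getting hit anyway.
    """
    logits = x @ router_weights                      # score each expert for this token
    top_k = np.argsort(logits)[-k:]                  # pick the k best experts
    gates = np.exp(logits[top_k])
    gates /= gates.sum()                             # normalize the gate weights
    return sum(g * experts[i](x) for g, i in zip(gates, top_k))

# Tiny usage example with random "experts"
d, n_experts = 8, 4
rng = np.random.default_rng(0)
experts = [lambda v, W=rng.normal(size=(d, d)): v @ W for _ in range(n_experts)]
router = rng.normal(size=(d, n_experts))
out = moe_layer(rng.normal(size=d), experts, router, k=2)
```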

Either way, really excited to see another powerhouse model released—in this case, two powerhouse models—into the open.

Tim Hwang: Yeah, for sure. Can you go into that a bit more for our listeners? I mean, it’s the namesake of the show, so I have to fight for it. Has mixture of experts been a little uncool as of late? It sounds like you’re implying these models might make it a focus of the community again. I’m curious how that’s developed.

Kate Soule: Well, even with the Z system, we’re talking about inference efficiency, running things quickly. A lot of what enables that is the community building open-source software and platforms to host and run these models as fast as possible. Because the most popular open-source models to date, including prior Llama generations, have been dense architecture models, a lot of existing support for hosting and running them locally or on platforms like vLLM is predominantly based on those dense architectures.

There will need to be a groundswell movement of the community building out support. I think we’ve seen a lot of that already with Llama 4’s release. I’m excited to get more open-source developers interested in mixture of experts as an architecture and continue building out tooling and ways to work with these models more broadly.

Tim Hwang: Shobhit, maybe I’ll bring you in here. I think a less interesting way this discussion goes is, “Okay, Meta did this release; who’s ahead in this race?” But that’s often the wrong way to think about it, especially as the space gets more complex. How should we read this launch in terms of Meta’s strategy and how it’s trying to fill a niche? Rather than “DeepSeek is ahead” or “Meta is ahead,” how are the strategies evolving? I’m curious what you read into this launch.

Shobhit Varshney: Absolutely. Let’s start by acknowledging the consequential impact Llama has had on the industry. As of March 18th, Llama models have been downloaded a billion times. Let that sink in—a billion times we’ve downloaded a model and made different versions, adapted it, and so forth.

Many enterprises we work with are focused on adapting a model to their specific domain, their data, and how they want models to behave. That adaptation only comes when you’re really open. Certain frontier models can be adapted via fine-tuning, but then you’re sending proprietary data to the cloud—a no-go. So usually, open-weight models are fine for that space where you can tune them. Our own Granite models, models from Mistral, DeepSeek, and others are open-weight, open models.

But it takes quite a bit to create a good mechanism to assess output quality. For many clients, we have to build end-to-end LLM benchmarking mechanisms to evaluate output on specific documents. Public benchmark results are a good starting point for a directional check—“Yeah, it’s worth looking at; Llama 4 did X better”—but none of my clients jump up and down over 0.2 points higher. People have other criteria to judge which LLM to leverage.
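For a sense of what an end-to-end LLM benchmarking mechanism looks like in its simplest form, here is a minimal sketch; `call_model` and `grade` are placeholders for whatever model gateway and scoring rule (exact match, rubric, LLM-as-judge) a given client actually uses, not a specific IBM tool:

```python
# Minimal sketch of an enterprise eval harness: run several candidate models
# over the same in-house documents/questions and tabulate average scores.

def evaluate(models, eval_set, call_model, grade):
    results = {}
    for model_id in models:
        scores = []
        for example in eval_set:                       # {"prompt": ..., "reference": ...}
            answer = call_model(model_id, example["prompt"])
            scores.append(grade(answer, example["reference"]))
        results[model_id] = sum(scores) / len(scores)  # average score per model
    return results
```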

It starts with IP: who can own the IP? Then data gravity—AI models follow data gravity. There are commitments to specific cloud vendors. Then there’s the question, “Can I adapt this to my own environment?” And finally return on investment, the overall ROI of running these models.

You’ll see a trend: every six months, the next size smaller model gets smart enough to outcompete the previous one from six months back. We’re seeing a constant trend where we’re getting a really good performance-to-cost ratio. I think that’s the sweet spot. Llama has done a really good job. I anticipate we’ll continue this trajectory of a billion downloads, with different adapted versions of Llama available for enterprises. That’s the right frame versus, “Oh my God, this crushed the numbers on this task.” Other models constantly innovate with new methodologies. DeepSeek did a phenomenal job with their papers; our Granite models have nice tricks; we give back to the community.

I’m super pumped about the community coming together, open source getting to a point you can adapt it to the enterprise, and very focused on intelligence divided by price—that kind of metric.

Tim Hwang: Hillery, maybe I’ll bring you in on the Behemoth model. I know it wasn’t released, but it’s shockingly large. It’s cool on one level—“Wow, it’s really big.” But from your point of view, to what degree are these practical models people will use in the wild? The infra needed to serve and use a model at that scale... is this more marketing than practical reality? Is there room for open source on the mega-scale model, or does it limit the set of people who can practically use it?

Hillery Hunter: Yeah, I have similar thoughts to Shobhit. Within IBM Infrastructure, we also handle creating cloud infrastructure for watsonx and deployment of all these infra services. The other part of my brain is looking at how we bring more powerful accelerators into that cloud environment to do whatever watsonx needs. If customers need really big models, I’m not gonna say no; we’ll provide the infrastructure. We’re advancing with NVIDIA, Intel, and AMD, putting new GPUs out there to enable people to play with models as large as they find useful.

On the practical side, we see a lot of experimentation or attempts to use these things, maybe for teaching. But when scaling deployments, almost all customers engage with us on how to customize smaller things. You sort of have to know where things are on the large side and what it might do for you. You may use that to inform what the solution looks like or create additional tuning data to get the characteristic you need out of something affordable to scale.

Like Shobhit said, most of our customers work largely in the B2B space. IBM works with large enterprises with millions to hundreds of millions of clients. When you want to engage with all of them and run at business scale of billions of interactions, affordability kicks in, and people start looking at customization of smaller things for real scale-out of deployments.

Kate Soule: If I can make a prediction based on what Hillery just said and what Shobhit mentioned about small LLMs increasingly doing more... The Llama 4 models released so far are all very big; even the smallest is quite big at around 100 billion parameters. So my prediction is they’ll be used most by the community to fine-tune older, smaller Llama 3 models. If we look at what can run on a laptop, what you can easily train and customize, you’re talking one to 10 billion parameters, dense architecture, because there’s a lot of tuning support for that.

I think the most immediate uses of these biggest models will be to continue the trend of making smaller models more performant—using bigger models to teach, generate data, augment enterprise data, and pack that down into smaller models like older Llama generations or our Granite generation, playing in the single-digit billion parameter frame.
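A minimal sketch of the “big model teaches the small one” pattern Kate describes: generate synthetic training pairs with a large teacher, then fine-tune a small dense student on them. The helper functions here are placeholders for whatever generation API and tuning stack you use, not a specific IBM or Meta pipeline.

```python
# Schematic teacher/student distillation loop (placeholders, not a real pipeline).
# 1) A large teacher model (e.g., a Llama 4-class model) answers seed prompts.
# 2) The prompt/response pairs become synthetic training data.
# 3) A small dense student (single-digit-billion parameters) is fine-tuned on them.

def build_synthetic_dataset(teacher_generate, seed_prompts):
    # teacher_generate(prompt) -> text; assumed to wrap the big model's API
    return [{"prompt": p, "response": teacher_generate(p)} for p in seed_prompts]

def distill(student_finetune, teacher_generate, seed_prompts):
    dataset = build_synthetic_dataset(teacher_generate, seed_prompts)
    # student_finetune is assumed to wrap whatever tuning method you use
    # (LoRA, full fine-tuning, etc.) for the small dense model.
    return student_finetune(dataset)
```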

Shobhit Varshney: I totally agree, Kate. One other factoid—I’m sure you’ve talked about this—it’s estimated only about 1% of enterprise data, or 1% of what an enterprise needs a model to use, is contained in publicly available models. So an enterprise must customize something. The question is, what is that something, and is it affordable enough to scale?

Hillery Hunter: Yeah, and on the size of the model and the context window side—a 10-million-token context window! What a world; I can dump in a bunch of data and talk against it. But it takes a lot to host these models. Different vendors offer inference infrastructure for the same model, and it’s complex to host it right. Each vendor offers different context windows; not everybody can pull off 10 million. How you fine-tune it, and so forth.

Even companies doing third-party analysis, like Artificial Analysis, took a few turns to get the inference infrastructure right to match what Meta claimed in their papers. It takes a few rounds. This speaks to the complexity of larger models and the difference you see when the same prompt is sent to three or seven vendors hosting the model—you get noticeably different responses.

I think we’ll get to a point where derivatives of Llama 4, synthetic data from Llama 4, and new techniques they released will make their way into smaller models, and those will scale across companies.

Shobhit Varshney: I’m generally very excited about these big releases. Model companies are still sticking to open-weight models. There are still restrictions with the Meta license—not quite Apache or MIT—but overall, clients have loved that we can now outcompete each other in the AI space. All clients win when you have great AI labs working on this together.

Tim Hwang: I’m gonna move us to our next topic: Google Cloud Next. Shobhit, you’re dialing in straight from Vegas, so I’ll kick it to you. You’ve been there all week. What are the big things we should know about coming out of the show?

Shobhit Varshney: It’s lovely to be with developers and clients hacking through and using it. 500 customer logos on screen—that’s where Google Cloud is today, a great testament to how far they’ve come from two or three years back. They’ve done quite a bit to serve enterprises and have more data. Cloud is growing, profitable, etc.

Looking at how they’re bringing AI across the entire platform, exposing internal strengths—for example, they have amazing TPUs to train their own models for use cases like YouTube, Gemini across mobile apps, etc. They’re bringing TPUs out to enterprises, constantly innovating on chips. The latest release, Ironwood, shows amazing progress on their own chips.

Then, things Google does to support billions of users, like their own wide-area network of fiber—millions of miles—now exposed to enterprise users. They’re making a concerted effort to make their secret sauce available to enterprises.

Overall, they spent a lot of time on media creation versus use cases like coding or data. They’re the only cloud that can do media creation end-to-end across modalities. I was privileged to be part of the Sphere experience on Day Zero, where they showed “The Wizard of Oz” and what they’re doing on a mega scale. It’s a great experience to see AI leveraging the best techniques to create an immersive experience on the Sphere.

There’s a lot in the media space, but not many enterprise clients jump on media topics. Marketing is great, some media creation, but the bigger focuses are call centers, code development processes, messy data, etc. They made quite a few announcements here. They’ve been announcing new models for weeks. It’s amazing—10 days before your annual event, you release Gemini 2.5. In this AI race, you can’t wait 10 days; you need Gemini 2.5 out before Llama 4. Good to see progress going fast.

Performance and intelligence per dollar—Gemini Flash has been doing really well there. Their Gemini 2.5 Pro model is number one across benchmarks, including the MMLU, for now. A huge focus on that.

Shifting to the agents space: we had MCP from Anthropic, allowing an LLM to access backend systems in a structured way with a standard protocol. To complement that, Google created its own Agent-to-Agent protocol, allowing one agent to talk to another not as a tool but as an equal citizen, a peer. They can talk, say, “I found this error; what do you want me to do?” or “Go talk to a human if needed.” This is asynchronous; it can handle long-running tasks, and the agents talk back and forth.
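To illustrate the peer-to-peer pattern Shobhit describes—a sketch of the general shape only; the endpoints and field names below are invented for illustration and are not the official Agent2Agent schema—one agent hands a long-running task to another and checks back asynchronously:

```python
import json
import urllib.request

# Illustrative peer-to-peer exchange: agent A delegates a task to agent B
# and later checks on it. Paths and message fields are made up for
# illustration; consult the Agent2Agent spec for the real schema.

PEER_AGENT_URL = "http://localhost:8001"  # hypothetical peer agent

def post_json(url, payload):
    req = urllib.request.Request(
        url, data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

def get_json(url):
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)

# 1) Delegate a long-running task to the peer agent.
task = post_json(f"{PEER_AGENT_URL}/tasks", {
    "goal": "Reconcile last night's settlement file and flag anomalies",
    "reply_to": "http://localhost:8000/updates",   # where async updates go
})

# 2) Later, ask the peer how it's going; it may also push questions back to
#    reply_to, e.g. "I found an error -- what do you want me to do?"
status = get_json(f"{PEER_AGENT_URL}/tasks/{task['id']}/status")
print(status)
```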

I’m pumped when people coalesce around specific standards. Google had 50-plus partners working on Agent-to-Agent. Within IBM Consulting, we have a good agentic workflow platform, IBM Consulting Advantage; we already have MCP integrated and are working on Agent-to-Agent. We’re excited about making this an open ecosystem and working side by side.

Those were my highlights. Just pumped about clients talking specifics—not just a 30-second video, but half-hour sessions deep-diving into challenges, their journey, which models they used, etc. It’s very good to work with product teams and customers at these events.

Tim Hwang: That’s great. Hillery, an avalanche of announcements from many directions. As you look at Google Cloud and their announcements, any trends, thoughts, hot takes from Google Cloud Next?

Hillery Hunter: One thing that caught my eye that Shobhit didn’t mention—so I can grab it—is that they also talked about AI on-premises, offering those capabilities. That’s exciting; it affirms what we’ve been thinking—clients need to run AI in an air-gapped environment. We keep saying AI is a platform conversation, and AI and hybrid cloud are two sides of the same coin. That statement goes back to everything we talked about at the beginning: there is data in important places that needs to be secured, sometimes adhering to sovereignty concerns.

Bringing AI to the data—and the fact that one of their announcements affirms that—is something they also see as important. It’s a good affirmation of what we see in the enterprise space: you gotta bring AI to the data. AI is a decision about how flexibly you can deploy it in all the locations where you have data and customers, not just a decision about which model you pick or where it runs.

Tim Hwang: Any final takes, Kate, on Google Cloud Next? It’s remarkable—every time Shobhit comes back from a show, there’s a voluminous list of announcements that I have trouble parsing and need him to decompose. All these tech conferences going on—it’s great.

Kate Soule: From my perspective, I’m most interested in things like the Gemini 2.5 Pro release, which has been really impressive. Great vibe checks from that model. Really exciting to see them take center stage with a strong release. More great models out there only improves what the field can accomplish. From that perspective, really excited to see them push the boundaries.

Shobhit Varshney: One last parting thought: Google is flexing its B2C learnings. They can train models on so much content—I’m not getting into where the content comes from or indemnification, just commenting on the fact that they can train on so much more real-world information from the B2C space. No one else has access to so much B2C data.

Video generation, for example—the videos they create are very cinematic. It seems they’ve looked at all the YouTube videos from good creators; the quality is really good, and it translates into the voice experience too. Getting voice right is becoming critical for clients. They have an unfair advantage in that they can provide nice audio experiences.

A small example: if I have Google Docs, I can ask an agent to create a workflow, do research, and create a long research paper. It creates a three-page paper on why margins are dropping though revenues are up, with competitive analysis. I can click a button and create an audio podcast out of it. Corporate enterprise content that’s difficult to consume now gets a nice audio layer—I can listen on my drive to work. Their unfair advantage on audio and experience gives them advantages on the enterprise side that some peers don’t have.

Tim Hwang: With these podcasts going on YouTube, maybe Kate, you’ll get the digital twin of Shobhit you’ve been wishing for.

Kate Soule: Exactly. As long as he gets some royalties from it.

Tim Hwang: Yeah, that’s right. Ad dollars there. The future of educational entertainment is funny—convert all my emails into a Netflix series I watch when I get home. I think we’ll enter strange worlds.

Shobhit Varshney: Here’s the kicker, man—I’ll absolutely close on this. I wanna live in a world where I can insert myself, Shobhit, inside a movie scene. If Iron Man comes to a bar and orders a drink, I wanna be the bartender. If you have celebrities on screen, I wanna be part of that; I could be the driver. I want to immerse myself as part of the video. This was not possible till today. Looking at how far we’ve come with video creation, I think we’re at a point we’ll have super personalized movies cracking jokes I do daily.

Tim Hwang: I’m gonna move us to our final topic. I’d be remiss not to mention this, though we only have a few minutes. I encourage listeners to check out a super interesting report from Pew Research—a survey of American perceptions around AI and how people use it daily.

We only have time for hot takes, but one interesting takeaway was how divergent AI experts’ views are from those of people using or experiencing AI daily, or even just hearing about it. One result: experts tend to say jobs won’t be heavily impacted by AI, while the public feels jobs will be impacted; experts are generally more positive than the general public.

Do you feel this impacts AI’s prospects going forward? Kate, your quick take in the minutes we have.

Kate Soule: There’s a lot that’s interesting in the Pew report—not enough time to cover it all now. I think it speaks to researchers’ optimism, which is great; we need optimistic people inventing and pushing technology forward. But I think it also speaks to representation in technology; we still have work to do so that the people building this technology better reflect the world.

They broke down men’s versus women’s perceptions of technology’s impact; men’s views closely matched those of the AI experts. No surprise—most AI experts, and most AI research, are still predominantly men. I think it reflects the diversity, different opinions, and broader perspectives we need to grow and bring into AI research as a discipline.

Tim Hwang: A great note to end on. Hopefully, a good sell to check out the report; lots of data worth parsing. I agree; it points to the need for greater diversity efforts.

As usual—I say this every episode, like saying “agent”—we’ve had more to cover than time. But Shobhit, Kate, Hillery, thanks for guiding us through our 50th episode. Thanks for joining us. If you enjoyed this, you can find us on Apple Podcasts, Spotify, and podcast platforms everywhere. We’ll see you next week on Mixture of Experts.

IBM z17 makes more possible
 

A full stack AI solution with IBM z17

Learn how IBM z17 processes up to 5 million inference operations per second with less than 1 millisecond response time.

Transforming and simplifying the mainframe for greater productivity and efficiency with AI on IBM z17

Find out why 88% of IT execs say that app modernization is key, and 78% see mainframes as central to transformation. Learn how IBM helps clients boost value, AI productivity, and efficiency across key systems.

IBM® watsonx Assistant™ for Z

Unlock new levels of productivity on the IBM Z platform with a generative AI assistant.

Learn more about AI

What is artificial intelligence (AI)?

Applications and devices equipped with AI can see and identify objects. They can understand and respond to human language. They can learn from new information and experience. But what is AI?

What is fine-tuning?

It has become a fundamental deep learning technique, particularly in the training process of foundation models used for generative AI. But what is fine-tuning and how does it work?

How to build an AI-powered multimodal RAG system with Docling and Granite?

In this tutorial, you will use IBM’s Docling and open-source IBM® Granite® vision, text-based embeddings and generative AI models to create a retrieval augmented generation (RAG) system.

Stay on top of the AI news with our experts

Follow us on Apple Podcasts and Spotify.

Subscribe to our playlist on YouTube