Is OpenAI about to release their biggest AI project? In Episode 16 of Mixture of Experts, host Tim Hwang is joined by Nathalie Baracaldo, Kate Soule and Shobhit Varshney. Today, the experts chat about IBM's 2024 Cost of a Data Breach Report and analyze how gen AI might reduce the cost of cyberthreats. Next, rumors are circulating on the internet about OpenAI dropping "Project Strawberry," which they internally refer to as a "level 2" model. Are the rumors true? Tune in for more.
The opinions that are expressed in this podcast are solely those of the participants and do not necessarily reflect the views of IBM or any other organization or entity.
Tim Hwang: Is AI going to save computer security?
Nathalie Baracaldo: I think there’s a balance. So while new tools are helping a lot, on the other side, we are also seeing new risks that arise with AI.
Tim Hwang: There is no evidence that Strawberry is anything at all.
Shobhit Varshney: OpenAI does need something that is significantly better than where they are right now. So I do believe that they have to release something mega pretty soon.
Tim Hwang: I’m Tim Hwang, and I’m joined today, as I am every Friday, by a tremendous panel of researchers, engineers, and others to hash out the week’s news in AI. Today: Nathalie Baracaldo, who’s a senior research scientist and master inventor; Kate Soule, who’s a program director in generative AI research; and Shobhit Varshney, senior partner consulting on AI for the US, Canada, and Latin America.
So before we get into this segment, I want to do our usual around-the-horn question. I think it's a really simple one, but it tees up this topic really well. The question is simply this: data breaches are very expensive today. In about five years, do we think the cost of an average data breach will have gone up or down? Will the damage be greater or less than what we see nowadays? Shobhit?
Shobhit Varshney: More.
Tim Hwang: Kate, how about you?
Kate Soule: I think down.
Tim Hwang: All right, great. Going down. Okay, well, we've got some disagreement, so let's get into this segment.
So we've got a couple of news stories to focus on today. The first one comes right out of IBM. A few weeks back, IBM released the latest edition of its annual "Cost of a Data Breach" report, which estimates the costs of data breaches, and it has some fascinating implications for AI and cybersecurity. It finds that the average cost of a data breach is rising: a 10 percent increase over last year, to about $4.88 million. But one of the most interesting findings is an average $2.22 million cost savings from the use of security AI and automation. That's a huge, crazy difference.
I want to get into the discussion with Nathalie. Bringing you in first: that’s like a 50 percent difference, right? I’m kind of curious how you think about the use of AI in the security space, how these two worlds intersect, and the implications for AI in cybersecurity.
Nathalie Baracaldo: Thank you, Tim. So, I read the report, and I'm very, very happy to see that gen AI, and AI in general, really reduce the cost of incidents and help security teams a lot. I think there's a balance: while the new tools are helping a lot, on the other side we are also seeing new risks that arise with AI. The benefits we get from these new tools are fantastic, so I'm very excited that we're heading in the right direction. But we cannot forget that we need to protect those tools against adversarial attacks throughout the entire pipeline of the system. Overall, I'm excited to see the whole community moving this way, and using AI for automated verification and for helping humans is definitely making a difference. So yeah, those are my thoughts.
Tim Hwang: Yeah, for sure. That's really helpful. And Shobhit, when you talk to clients (you work with them on a wide range of AI implementations, and the security space is something we haven't covered much on this show), I'm curious: in the market, do you see more and more enterprises wanting this and thinking about this intersection? And are there particular use cases that come to mind where you think, "Wow, that's really making a difference in reducing the impact of data breaches, or preventing them in the first place"? Just curious what you're seeing out there.
Shobhit Varshney: Yeah, absolutely. So it’s a very, very hot topic for all of our clients, and it’s a two-way street. There is AI that’s helping you drive better security, like pattern recognition, to secure things. But there’s also the reverse, where the security teams are doing a better job at protecting the AI as well. So it’s both directions.
We are learning quite a bit. We’ve gotten much closer to our security services within consulting. There are a few things you do in security: there is prevention, making sure you’re detecting fast enough, investigating what happened, and being able to respond—the whole lifecycle of it. Across the whole platform, from a tooling perspective, you’re doing things like managing the attack surface, red teaming, posture management, things of that nature. There are quite a few areas where Gen AI, or AI in general, has been able to make a meaningful difference.
The report we're talking about is a massive study. To give you a sense of the scale: it looked at more than 600 organizations that had data breaches in the last year, across 17 industries, and interviewed close to 4,000 senior security officials who dealt with those breaches. We looked at the entire spectrum of where AI is getting involved.
The number one cause of breaches was human error, the kind of thing better training could prevent. Take something small like social engineering: I can use a generative AI model to create a very plausible email that people will be tempted to click. So the same content-generation ability that makes things clickable has been applied to social engineering attacks.
Tim Hwang: Right, like using it for red teaming is what you’re talking about now, right?
Shobhit Varshney: So red teaming is a great use case. The second one: I’m working with a large Latin American bank on cybersecurity pattern detection. We’re saying, “Here’s a set of things that happen. Can you create an early alert based on the pattern you’re seeing?” And then the same information needs to be assimilated at different levels and sent out as alerts. So we’re able to automate parts of what a human would have otherwise done in managing the whole lifecycle from detection to managing the incident. On these SWAT calls—you join a call that’s been running for six hours, executives jump in and say, “Hey, can somebody recap?” That’s a very easy one for us. So now we generate recaps of what has happened so far, actions people have committed to. So those things show up on the side; anybody who joins the SWAT call knows exactly where we are.
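To make the recap use case concrete, here is a minimal sketch of what that kind of automation could look like: hand the rolling call transcript to a chat model and ask for a summary plus committed actions. The prompt, model name, and use of the OpenAI chat-completions client are illustrative assumptions, not the actual system Shobhit describes.

```python
# Hypothetical sketch of the incident-call recap idea (not the actual
# system described in the episode; prompt and model choice are assumptions).
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

RECAP_INSTRUCTIONS = (
    "You are assisting an incident-response bridge call. From the transcript "
    "below, produce: (1) a short recap of what has happened so far, and "
    "(2) a bullet list of actions people have committed to, with owners."
)

def recap_incident_call(transcript: str) -> str:
    """Summarize a running SWAT-call transcript for anyone joining late."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder; any capable chat model works
        messages=[
            {"role": "system", "content": RECAP_INSTRUCTIONS},
            {"role": "user", "content": transcript},
        ],
    )
    return response.choices[0].message.content

# Usage: regenerate the recap whenever someone new joins the call.
# print(recap_incident_call(rolling_transcript))
```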
Tim Hwang: That’s really cool. I never really thought about that. I think that’s the funny thing: when you think about AI and security, you imagine a hyper-intelligent machine system that will just defend against hackers. But a lot of what you’re talking about is optimizing the human team that’s doing a lot of this, which is really important.
Okay, maybe a final question to bring Kate in. I’d love to get the researchers’ view on this. Shobhit talked about a big piece being defending AI systems against subversion or manipulation, which is a huge issue. I was joking with a friend that there’s probably a whole product you could build just around manipulating the chatbots people have on their websites.
From a technical perspective on defending AI systems, I'm curious whether you have any thoughts on where we are. Is the state of the art getting to the point where we feel we can actually handle some of these attacks when we release these systems into the wild?
Kate Soule: Yeah, well, I want to make sure we give Nathalie a chance to jump in, because she's doing some really exciting work specifically in that space. My perspective, where I've seen some really interesting research, is actually on the data itself: not just protecting the lifecycle, but imbuing the data with protections so that if it is leaked, maybe it's not as big a deal. There's interesting work, for example, with financial institutions on creating privacy-protected versions of data. We create a synthetic version of customer bank transaction records, extract and remove all PII so you could never identify the individual, and use that dataset to drive decisions. That way, if the information is leaked, sure, some business knowledge leaks, but no actual customer information. So there's a whole area of research around synthetic data and making data private that I think is going to be a really powerful tool. But Nathalie, what are your thoughts? You're so ingrained in this space.
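As a toy illustration of the idea Kate outlines: drop the direct identifiers from transaction records, keep a pseudonymous key plus the analytic fields, and work from that protected copy. The field names below are hypothetical, and real programs use far stronger techniques (differential privacy, vetted synthetic data generation) than this simple substitution.

```python
# Toy sketch of stripping PII from transaction records. Field names are
# hypothetical, and hashing a low-entropy identifier like this is still
# brute-forceable; production systems use rigorous anonymization
# (differential privacy, vetted synthetic data), not simple substitution.
import hashlib

def scrub_record(record: dict) -> dict:
    """Drop direct identifiers, keep analytic fields, pseudonymize the key."""
    # A stable pseudonym keeps one customer's transactions grouped together
    # without exposing the real account number.
    pseudo_id = hashlib.sha256(record["account_number"].encode()).hexdigest()[:12]
    return {
        "customer": pseudo_id,              # pseudonymous join key
        "amount": record["amount"],         # analytic signal we keep
        "merchant_category": record["merchant_category"],
        "timestamp": record["timestamp"],
        # name and account_number are deliberately dropped
    }

raw = {
    "name": "Jane Doe",
    "account_number": "4111-0000-1234",
    "amount": 42.50,
    "merchant_category": "grocery",
    "timestamp": "2024-08-16T10:32:00Z",
}
print(scrub_record(raw))
```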
Nathalie Baracaldo: Yeah, I really like this question because it touches on the entire lifecycle of the model. From my perspective, risk exists throughout the whole system. Right now I'm working on something really interesting: the concept of unlearning. A lot of people find it surprising that it's not learning; we're actually removing knowledge from a model.
Tim Hwang: It’s like machine unlearning. You’re doing the opposite.
Nathalie Baracaldo: Yeah. It's like a Yoda saying: you always need to unlearn. The reality is that we arrive at these large models by feeding them lots and lots of data. As Kate mentioned, we try to control what data goes in. However, because the data is so huge, it's really difficult to filter everything. So at some point, even after we apply defenses like filtering and aligning the model, we may realize the model is spilling out data it shouldn't. This will happen; just like in any other area of security, we often find out well after the fact.
Now, what do we do? Option number one is cry. No, I'm kidding. Option one is to retrain the model, but that doesn't really solve the problem, because training these models takes so long and costs so much. So the idea of unlearning is: rather than retraining, can we find a way to manipulate the model so that it forgets specific information retrospectively? That's one of the things that has me really excited, because it's a new angle on security and on lifecycle management of the model. I think it's going to be the future. Tim, you asked the first question about the future; I see us having not only guardrails and filtering, but also this way of going back into the model and modifying it to make it better. We don't need to foresee every single thing that will go wrong if we can do this. So that's one of the things I think is very trendy. Nobody knows how to fully solve it yet, but we're getting there, and it has me really excited.
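For a rough sense of what removing knowledge can mean mechanically, one common baseline in the unlearning literature is gradient ascent on the data to be forgotten, usually alternated with ordinary training on retained data so the model keeps its general capability. A simplified PyTorch sketch of that baseline (not the specific method discussed here):

```python
# Simplified sketch of one unlearning baseline: gradient *ascent* on the
# "forget" set. Real methods add retain-set regularization and careful
# stopping criteria; this is illustrative only.
import torch
import torch.nn.functional as F

def unlearn_step(model, batch, optimizer):
    """One ascent step that increases loss on data to be forgotten."""
    inputs, labels = batch
    optimizer.zero_grad()
    loss = F.cross_entropy(model(inputs), labels)
    (-loss).backward()  # negated loss: the optimizer now pushes loss *up*
    optimizer.step()
    return loss.item()

# Usage sketch: alternate ascent steps on the forget set with ordinary
# descent steps on a retain set, and monitor both losses.
# for forget_batch, retain_batch in zip(forget_loader, retain_loader):
#     unlearn_step(model, forget_batch, optimizer)
#     ...ordinary training step on retain_batch...
```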
Tim Hwang: That’s so cool. You hear it here first, listeners: unlearning is the new hotness in machine learning.
Nathalie Baracaldo: I call it the new black.
Tim Hwang: This week, and late last week, rumors have been swirling around a thing called "Strawberry." If you are terminally online like me, you've seen a lot of discourse about this potential model from OpenAI, which promises a substantial increase in capabilities and reasoning ability. Everybody's saying it might be the model that finally brings the company to "Level 2" in its internal tiering: models with much more powerful reasoning.
This is a bizarre story because OpenAI has not disclosed anything publicly. Most of the discussion is driven by a completely weird anonymous account that showed up a few weeks ago and goes by the handle "I Rule the World Moe," which the algorithm just promotes into everybody's feeds. It promised that today, the day of recording, would be the day we see this godlike model emerge. This account has promised a lot, and many have called it out for not providing real detail and just adding to the AI hype.
So I think there are two questions here. The first one is simple: this is just hype, right? Shobhit, do we have any reason to believe OpenAI is going to release anything today at all?
Shobhit Varshney: Yeah, so earlier he said it was coming out Tuesday at 10 PT, and he's been moving it around ever since. There are all kinds of conspiracy theories, like whether this Twitter account is a shadow account for Sam Altman to build excitement. There's so much fan fiction in this space; I can't deal with it. I'm just trying to do machine learning here.
I think the arc of reasoning capabilities is improving; it's not anywhere close to human, but it is starting to get better. I'm very encouraged by how enterprise-friendly features are being added: things like function calling, structured outputs, observability. So we're moving in the right direction. OpenAI does need something significantly better than where they are now; they have enough competitors nibbling at all the benchmarks. So I do believe they have to release something mega pretty soon. The rumors about Strawberry are very encouraging, but we've never seen any benchmarks. The models showing up in shadow mode on LMSYS and other leaderboards were revealed to be the new o1 model. You still haven't seen any actual validation. It's like predicting Apple will come up with the next iPhone: of course that's going to happen.
Tim Hwang: I like that. A prediction: OpenAI will release something big at some point. Yeah, that makes sense.
Shobhit Varshney: And Tim, from an enterprise perspective, our clients are no longer jumping up and down at the latest model releases. We're at a point where, for enterprise value, there's so much to be done before and after the LLM call. There are so many other, non-functional considerations: Where is my data? What's the security? What's the licensing agreement? Can I commercially use this model? How have I adapted it to my own data? A million things happen before and after. My team's focus is on creating end-to-end workflows, with the right evaluations, to unlock business value. The model itself we keep swapping out regularly. Our clients aren't texting me like, "This beat the benchmark by 0.1! What's up with Strawberry? Can I get Strawberry?"
Tim Hwang: That's very interesting on the business side: there's so much hype on social media, but day-to-day, clients aren't asking about it. Kate, Nathalie, I'd love to bring you in on the research side. In my experience working with researchers, a lot of this Twitter hype doesn't affect the day-to-day; people know about it but aren't really paying attention. Is that your sense? How do you view this whole weird news cycle we're in this week?
Kate Soule: Okay, thanks. I haven’t been paying too much attention to it. You know, it’s a waste of time. We’ve got more interesting problems to solve than figuring out the meaning behind Strawberry. But I don’t know, Nathalie, what are your thoughts?
Nathalie Baracaldo: Yeah, first of all, I was very curious about "Project Q*," which seems to be the same thing as Project Strawberry. But working with these models day to day, my first reaction is: they're saying we're moving to the next level of AI when we can't yet fully measure the performance of the current chat-based models. So I meet it with skepticism. It may be great at answering certain questions in certain scenarios, but if you dig deeper and change the context a bit, it may not work. The reason is that we're not very good at measuring model performance. There are tons of benchmarks, but if you throw the model into the wild, you see different behavior. I'm pretty sure it's going to be great, but the other thing I wonder is: how do you know what's behind it? The fact that it's closed-door makes me ask: is it really intelligence, or are there rules on top of a model, tailored to beat specific benchmarks? So we'll see. That's my take.
Tim Hwang: That would be a very interesting outcome: OpenAI drops a big new model, but because our evaluations are so crude, it's unclear how much of an improvement it really is. That's a potentially pretty funny result.
Shobhit Varshney: I push back a bit on that, Tim.
Tim Hwang: Okay, you think it’ll be obvious?
Shobhit Varshney: When they release it, the improvement is going to be evident. We do this every day with our clients. Everybody has some sort of knowledge-search use case. We create our own benchmarks, golden records, ground truth, and we compare against those. We'll do human evaluation and use an LLM as a judge. We see a meaningful difference when applying an OpenAI GPT-4o model versus a smaller model: a better, crisper response. We've seen quality improvements over the last 18 months to two years. I'm generally impressed with how well the models work, as long as you do the "before and after" ridiculously well. If you form the question right and get the data in, the answers get better with model upgrades. I still don't think smaller models can come close to what OpenAI's models are doing. There are bespoke use cases, like COBOL-to-Java conversion, where IBM's model, with our first-party data and talent, will obviously outperform a general model. But for knowledge-article use cases, or understanding the nuances of an IT ticket with 15 updates to find the root cause, the bigger models have better reasoning capabilities and do an exceptionally good job at finding the needle in the haystack, which smaller models can't.
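For readers unfamiliar with the LLM-as-a-judge setup Shobhit mentions, here is a bare-bones sketch of the pattern: a second model scores each candidate answer against a golden reference. The rubric, prompt, and OpenAI client are assumed plumbing for illustration, not his team's actual harness.

```python
# Bare-bones LLM-as-a-judge sketch. The rubric and 1-5 scale are invented
# for illustration; real harnesses calibrate judge scores against human
# ratings before trusting them.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

JUDGE_PROMPT = (
    "Score the CANDIDATE answer against the GOLDEN reference on a 1-5 scale "
    "for factual agreement. Reply with the number only.\n\n"
    "QUESTION: {q}\nGOLDEN: {gold}\nCANDIDATE: {cand}"
)

def judge(question: str, golden: str, candidate: str) -> int:
    """Ask a strong model to grade a candidate answer against ground truth."""
    response = client.chat.completions.create(
        model="gpt-4o",  # placeholder judge model
        messages=[{
            "role": "user",
            "content": JUDGE_PROMPT.format(q=question, gold=golden, cand=candidate),
        }],
    )
    return int(response.choices[0].message.content.strip())

# Usage: run the same golden set through two candidate models and compare
# mean judge scores before swapping one for the other in production.
```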
Tim Hwang: But Shobhit, do you think we’re at the point where a 0.01 increase in MMLU translates into, “This will improve my accuracy and reduce my cost by X”?
Shobhit Varshney: I see different weight classes. If you’re in the top league of frontier models, you won’t see that much difference from a small benchmark bump because other techniques have a higher impact. But for the same use case, if I go from Gemini to OpenAI to Claude, I do see meaningful changes in how they interpret and respond to data. However, once you pick a model, the way you ask the question, the embeddings, etc., have to be tied to it. You can’t just swap the model out and expect it to behave better. It’s not very plug-and-play right now. But if you find a model and adapt the “before and after” to it, you see a fairly decent quality bump. Again, different weight classes give different results.
Nathalie Baracaldo: Yeah, hearing Shobhit, I totally agree that large language models have substantially improved over smaller models. My comment was really about how we measure those big models; I think we still have more research to do to measure their performance well. And I agree with Kate: a higher MMLU score does not guarantee the model will perform well in a given use case. So yeah, lots of interesting challenges to address there.
Tim Hwang: We are unfortunately at time. So Nathalie, Kate, Shobhit, thank you for joining us as always. And for all you listeners, if you enjoyed what you heard, you can get us on Apple Podcasts, Spotify, and other podcast platforms everywhere. We’ll see you next week.