
Robin Sutara on Responsible AI, Governance, Diversity, and People Behind Data

In this episode, we talk to Robin Sutara, Chief Data Strategy Officer at Databricks. From Apache helicopters in the U.S. Army to leading global data strategy, Robin brings a truly unique perspective. She shares sharp insights on enterprise AI adoption, regulation’s impact on innovation, and why responsible AI starts with people. Robin unpacks the real-world use of RAG, compound AI systems, and the crucial role of diversity and workforce enablement in the AI era.

About Speaker

From repairing Apache helicopters near the Korean DMZ to the corporate battlefield, Robin has demonstrated success in navigating the high-stress, and sometimes combative, complexities of data-led transformations. She has consulted with hundreds of organizations on data strategy, data culture, and building diverse data teams. Robin has had an eclectic career path in technical and business functions, with more than two decades in tech companies including Microsoft and Databricks. She has also earned multiple academic credentials, from a juris doctorate to a master’s in law to engineering leadership. From her first technical role as an entry-level consumer support engineer to her current role in the C-suite, Robin has supported creating an inclusive workplace. She is currently Chair of Women in Data US, as well as an advisor to several early-stage startups. In 2023 she was also recognized as one of the Top 20 Women in Data and Tech, named to the DataIQ 100 Most Influential People in Data, and selected as a WomenTech Speaker of the Year Finalist.

Transcript

Chapter 1 — 00:00 | Introduction & Robin’s Journey — From Apache Helicopters to Databricks 

Robin traces her unlikely path from repairing Apache AH-64 helicopters near the Korean DMZ to 20+ years at Microsoft to her current role at Databricks — with Excel as the unexpected bridge between the two worlds. 

Raja Iqbal: Hello, everyone, and welcome to Future of Data and AI. I’m your host, Raja Iqbal. My guest today is Robin Sutara, Chief Data Strategy Officer at Databricks. Robin has previously worked in key roles like Chief Data Officer for Microsoft UK and Chief Operating Officer of Azure Data Engineering. It is my pleasure to have Robin on the show. Welcome. 

Robin Sutara: Thank you, Raja. So glad to be here. 

Raja Iqbal: So I was looking at your background — you started your career as a technician for Apache Helicopters and now you’re in a very important role at Databricks. From Apache Helicopters to Databricks — tell us about that journey. 

Robin Sutara: I think it maybe isn’t unusual these days for people to come from differing backgrounds into data. I am a bit eclectic in that I didn’t start with a traditional STEM undergraduate background — with the caveat that I actually did start university studying computer engineering. But I ran out of funding after two years, and I ended up enlisting in the US Army. Based on testing, they placed me into repairing Apache AH-64 helicopters — electrical and weapons systems. I was stationed in Korea on the DMZ for a while. Loading Hellfire missiles, loading the 50-millimeter machine guns. I was really just focused on how I was going to serve my time in the military so I could use my GI Bill and eventually go back to school for technology. 

Interestingly, while I was stationed at Fort Campbell, Kentucky, Microsoft came out with Excel. It was still relatively new, and we were trying to figure out how to use it to track maintenance records and Apache helicopter parts. Because it was a computer typing job, they thought it fell to the girl. I think now they all sort of regret that decision. But it was wonderful for me — that was my first footstep into data. I started thinking: how do we optimize these processes? How do we make sure the right parts are in the right place at the right time? 

So I told everybody when I got out of the Army that I was going to work for Microsoft. They never thought it would happen. When I got out, I went to night school while doing computer hardware repair during the day, and I was fortunate to get an opportunity to interview at Microsoft — coming in to do Windows 3.1 support. I got hired in the late 90s and had a fabulous 20-plus-year career there, moving back and forth between business and technical roles. 

My last two roles at Microsoft were really about helping the company think about its internal transformation — how to use data and AI to deliver on the goals Satya Nadella came in with. I was Chief Operating Officer for Azure Data Engineering, which owned everything from SQL Server on-prem up through visualization — all the databases, warehouse, ingestion tools, governance, Purview. Then I was asked to move to London to serve as Chief Data Officer for Microsoft UK, which was half internal facing and half representing our platform capabilities to customers and feeding their feedback back into the product group. 

And then the last two and a half years I’ve been at Databricks as Field CTO, which means I get to travel the world and advise organizations on how to use our data platform as the foundation for their data and AI transformations — not just the technology, but the people, the process, the organizational design, and operating models. 

Raja Iqbal: I love that — from Excel to Data Lakehouse. And you’ve gone full circle. Apache helicopter to Apache Spark. 

Robin Sutara: I always tell people: Apache to Apache. 

 

Chapter 2 — 13:00 | EU vs. US: Does Regulation Actually Slow Down AI Adoption? 

Robin challenges the conventional wisdom that EU regulation slows innovation — and argues that the absence of clear legislation in the US is actually creating more hesitation, not less, particularly in regulated industries like healthcare and financial services. 

Raja Iqbal: You’ve worked on both sides of the Atlantic — as CDO for Microsoft UK and now in the US. What differences do you see in enterprise AI adoption between Europe and the United States? 

Robin Sutara: My experience is primarily with developed countries, so my point of view may be narrow in that respect. But when I think about the UK, Canada, the US, and Mexico — the problems organizations are facing are actually relatively similar. The regulatory requirements may differ slightly, but most of them are talking about the same fundamentals: explainability, transparency, lineage, impact on consumers, the right to be forgotten. Those things are consistent across developed countries, regardless of where you reside. 

What I find is the biggest difference for most organizations is the expectations of their employees and consumers around the data and AI products and services being delivered. Let me give you an example. When I moved from the US to the UK for Microsoft — which was during the pandemic, one week after Brexit — I landed in London and immediately went into lockdown. There were very few products on the shelves because lorries were being stopped at the border. And I started thinking: despite the fact that I now live in the UK, where I have much broader consumer protection, how much personal and health information was I willing to give up to the NHS or to Tesco just to get my groceries delivered? The situational requirements of that moment changed what I was willing to disclose. 

Raja Iqbal: What I hear from you is that the absence of regulation is actually slowing things down in the US — because EU organizations are clear on what standards they need to meet, and that clarity is actually allowing them to move forward. 

Robin Sutara: In broad strokes, yes. Every organization has to decide what their risk-versus-innovation balance looks like. Databricks has put together an entire security framework covering 68 attributes of AI that organizations should come to agreement on across legal, compliance, business, and IT. For EU organizations, because they have clear standards they’ll be held accountable to — with significant monetary implications for non-compliance — that almost creates a foundation for them to build from. Whereas in the US, the executive order is no longer in effect, there’s a lot of regulatory uncertainty, and organizations are trying to balance innovation with whatever legislation might eventually come. 

In fact, for regulated industries in the US — financial services, healthcare, public sector — I see a slower pace of innovation on customer-facing applications than I was seeing 18 months ago in the UK or the EU. The fundamentals of good governance — traceability, explainability, transparency — are the same regardless of jurisdiction. Organizations with a higher risk appetite are still building with those foundations in place. I don’t see either side of the pond completely slowing down on innovation. 

 

Chapter 3 — 21:00 | Culture, Risk Appetite & What Really Holds Companies Back 

More than regulation, more than technology, Robin identifies organizational culture as the single biggest barrier to AI adoption — and points to change management failures as the reason most transformations stall. 

Raja Iqbal: So is it fair to say that more than the regulatory environment, it’s the industry, the company culture, the company size, and the risk appetite that actually dictates how quickly they adopt? 

Robin Sutara: Yes, and I’d add one more factor: the culture of the organization itself. I’ve worked with companies that created amazing innovation but couldn’t get their business users to actually use the data products or AI they built internally. For many of them, it’s because they forgot about the people. They forgot about the change management required to bring the organization along on the evolution they were trying to execute. It’s a complex formula, and I honestly don’t see any particular country or region moving significantly slower than any other. It really comes down to the organization. 

 

Chapter 4 — 26:00 | Databricks’ Evolution — From Lakehouse to Data Intelligence Platform 

Robin explains how a company founded by five PhD students out of UC Berkeley to solve big data and ML problems evolved through the Lakehouse era and is now rethinking itself entirely for the GenAI age — including how it would disrupt itself if it were starting today. 

Raja Iqbal: Databricks started as a machine learning company, became the data platform of choice, and now is going full circle back to AI. How is Databricks adapting? 

Robin Sutara: I can always tell how long an organization has been a Databricks customer based on which team is using the platform. The company was founded 11 years ago — five PhDs out of UC Berkeley, the creators of Spark — really trying to solve the big data and ML issues organizations were struggling with. Early on, they realized it wasn’t just unstructured data in the lake they needed to address. It was structured data too. That led to the creation of the Lakehouse about eight years ago — bridging the gap between structured and unstructured data. 

Today, after acquiring MosaicML about two years ago, we’re really rethinking what a data platform of the future looks like. It’s no longer a separation between AI and BI, or engineering versus data science. We’re calling it a data intelligence platform. There are two parts to that: one is how do we help companies build the AI they want, on their structured and unstructured data, in a way that allows them to swap models in and out as the technology evolves — because the top model of today isn’t going to be the top model of tomorrow. The second is how do we use AI inside the platform itself. 

Our CEO essentially stood up about 18 months ago and said: if we had to recreate Databricks today, how would we disrupt ourselves? How would we do things differently? The result of that question is a lot of what you’re seeing us build now — making the platform smarter, more intelligent about the organization’s own semantics and business definitions, and automating infrastructure management so organizations can focus their talent on solving business problems rather than managing servers. 

Raja Iqbal: So now company-specific jargon — like a company calling its employees “Tricksters” — can be understood natively by the platform. 

Robin Sutara: Exactly. If a Databricks employee types “Trickster” into a natural language interface, we’re not looking for a Lego. We’re looking for a data employee. Understanding what “revenue” means to a specific company, what their fiscal year is, what a quarter is, what EMEA or Americas means in their context — that’s the kind of intelligence we’re building into the platform. Combined with automating things like cluster management, workspace setup, query optimization, and job prioritization. How do we use AI to handle those so organizations can focus their limited talent on the right things? 
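To make the "Trickster" idea concrete, here is a minimal Python sketch of a company glossary that annotates a natural-language question with organization-specific meanings before it reaches a model. It is an illustration only and not how the Databricks platform works internally: the `BusinessGlossary` class, its `expand` method, and the sample question are all hypothetical names introduced for this example.

```python
import re
from dataclasses import dataclass, field


@dataclass
class BusinessGlossary:
    """Hypothetical company glossary (illustration only, not a Databricks API)."""

    terms: dict[str, str] = field(default_factory=dict)

    def expand(self, question: str) -> str:
        """Annotate each known term in the question with its company meaning."""
        if not self.terms:
            return question
        # Case-insensitive lookup table for the matched text.
        lookup = {term.lower(): meaning for term, meaning in self.terms.items()}
        # Longest terms first so a longer term wins over a shorter prefix.
        alternatives = "|".join(
            re.escape(term) for term in sorted(self.terms, key=len, reverse=True)
        )
        pattern = re.compile(rf"\b({alternatives})\b", re.IGNORECASE)
        # Single pass, so inserted definitions are never re-scanned.
        return pattern.sub(
            lambda m: f"{m.group(0)} ({lookup[m.group(0).lower()]})", question
        )


glossary = BusinessGlossary(
    {
        "Trickster": "a Databricks employee",
        "EMEA": "Europe, the Middle East, and Africa",
    }
)
print(glossary.expand("Which Trickster owns the EMEA revenue dashboard?"))
```

A real system would also need company-specific definitions for terms like "revenue" or "fiscal year", which depend on context rather than simple word matching; the sketch only covers the simplest case of resolving jargon.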

 

Chapter 5 — 36:00 | RAG, Compound AI Systems & The Risk of Exposing Proprietary Data 

Robin maps out how enterprises are actually building AI today — not simple prompt engineering, but layered compound systems combining proprietary models, open source, and RAG — and why the Samsung incident is still the cautionary tale every CIO should be referencing. 

Raja Iqbal: For enterprises dealing with their own proprietary data, what do you see in terms of how they’re approaching RAG, fine-tuning, and custom models? 

Robin Sutara: At the very beginning, everyone thought they were just going to do prompt engineering. Then they realized leveraging something like OpenAI for every use case was expensive, and it didn’t necessarily have the domain knowledge of their organization. There was also the concern — very understandably — about what happens to their intellectual property. The Samsung example is probably one of the most famous: employees put trade secrets into ChatGPT trying to do the right thing for the company and essentially exposed that information outside the organization. 

So I think there’s still a real level of risk aversion to feeding intellectual property into large open models — even in a RAG implementation. For most organizations, the answer is some form of compound AI systems: leveraging a proprietary model like OpenAI for certain components, creating their own models for others, using RAG on specific data sources, and tying it all together in a platform that maintains lineage and governance across the whole stack. 

Insurance is a great example. Claims processing is a compound problem. You need document extraction, OCR for handwritten notes from adjusters in the field, computer vision for uploaded photos, and validation that those photos aren’t deepfakes — which is a real problem now, with AI-generated traffic accident images being submitted as fraudulent claims. All of that is a compound system: proprietary model for one component, internal capability for another, RAG on top of that, human in the loop at the end. But the result is saving hundreds of thousands of claims adjuster hours. 
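The claims example above can be sketched as a small pipeline in Python. This is a rough illustration under stated assumptions, not a real implementation: every stage is a stub standing in for a proprietary model, an internal capability, or a retrieval step, and all names (`Claim`, `PIPELINE`, `process`, and the stage functions) are hypothetical. The point is the shape of a compound system: separate components, a lineage record of which component touched the claim, and a human in the loop at the end.

```python
from dataclasses import dataclass, field


@dataclass
class Claim:
    claim_id: str
    form_text: str               # text of the submitted claim form
    adjuster_notes: str          # stand-in for OCR output of handwritten notes
    photo_looks_synthetic: bool  # stand-in for a deepfake detector's verdict
    findings: dict = field(default_factory=dict)
    lineage: list = field(default_factory=list)  # components that touched the claim


def extract_document(claim: Claim) -> None:
    # Stub for document extraction: take the value after the last colon.
    claim.findings["policy_number"] = claim.form_text.split(":")[-1].strip()


def read_adjuster_notes(claim: Claim) -> None:
    # Stub for OCR on handwritten field notes.
    claim.findings["notes"] = claim.adjuster_notes.lower()


def check_photo(claim: Claim) -> None:
    # Stub for computer vision plus deepfake validation.
    claim.findings["photo_flagged"] = claim.photo_looks_synthetic


def retrieve_policy_context(claim: Claim) -> None:
    # Stub for a RAG step over internal policy documents.
    policy = claim.findings["policy_number"]
    claim.findings["context"] = f"policy terms for {policy}"


PIPELINE = [
    ("document_extraction", extract_document),
    ("notes_ocr", read_adjuster_notes),
    ("photo_validation", check_photo),
    ("policy_retrieval", retrieve_policy_context),
]


def process(claim: Claim) -> str:
    """Run every stage, record lineage, and route the claim to a human queue."""
    for name, stage in PIPELINE:
        stage(claim)
        claim.lineage.append(name)
    # Human in the loop: the pipeline only decides which person reviews the claim.
    claim.lineage.append("human_review")
    if claim.findings["photo_flagged"]:
        return "escalate_to_fraud_team"
    return "adjuster_approval"


claim = Claim("C-1", "Policy: P-123", "Rear bumper damage", False)
print(process(claim), claim.lineage)
```

Keeping each component behind the same simple interface is what makes it possible to swap one model for another later without rewriting the rest of the system, while the lineage list gives governance a record of what ran.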

 

Chapter 6 — 44:00 | Multi-Agent Frameworks, Open vs. Closed Source & The Future of Models 

Robin outlines Databricks’ approach to agentic AI — supporting an open ecosystem while building native capabilities — and gives her honest take on where open source and closed source models are heading in the enterprise. 

Raja Iqbal: How is Databricks approaching multi-agent systems? Do you have your own frameworks, or are you integrating the existing ecosystem? 

Robin Sutara: We have frameworks built into the platform, but it’s an open system — so if you want to bring LangChain or LlamaIndex, you can. Our intent has always been to support an open ecosystem so organizations can bring in whatever capabilities they’ve already built. We’re also building more native capabilities in this space, and there’ll be some significant announcements at our Summit in June in San Francisco. Highly recommend attending or joining virtually — there’ll be meaningful innovation around what the teams have been working on for the past year. 

Raja Iqbal: Open source versus closed source — where do you see the enterprise going? 

Robin Sutara: A combination of both, but I’m seeing more of a shift toward open models. The regulatory and legislative trajectory is pointing organizations toward needing to explain to regulators exactly what model they used, what weightings went into it, what data trained it. That’s much easier to do with an open model. That said, there are definitely proprietary models that solve very specific, niche problems efficiently. I think the power of the Databricks approach is being able to tie components of a proprietary model and an open model together in a compound system to solve for the actual business output you need. 

Raja Iqbal: Large language models or domain-specific small language models — which is the future? 

Robin Sutara: Again, both — depending on the use case. We’re definitely seeing an uptick in domain-specific models being created right now. Large language models have been around long enough that organizations have figured out their limitations, and many business problems don’t need that level of scale. More targeted models that can be fully explained and controlled are becoming increasingly attractive. 

 

Chapter 7 — 50:00 | The Human Side — Job Evolution, Change Management & Workforce Enablement 

Robin pushes back on the job replacement narrative and reframes the conversation around job evolution — arguing that the organizations winning at AI are the ones investing in the people who have domain knowledge, not the ones replacing them. 

Raja Iqbal: AI is reshaping the workforce. Where do you see this going? 

Robin Sutara: For most organizations I work with, AI is genuinely being used to improve productivity. But it comes back to culture. There’s something almost amusing — in an uncomfortable way — when people say “we saved 15 minutes per employee across 20,000 employees.” Okay, so what did they do with that time? Did you help the organization understand that those 15 minutes are now available for the thought leadership or the project they never had time for? Because if you don’t answer that question, people just see pieces of their current role slowly slipping away as automation takes hold. 

The organizations doing it well are thinking about every persona across the company. Not just data scientists and engineers, but the finance team member who has been doing the same role for 25 years. That domain knowledge is invaluable — especially when we’re talking about bias mitigation. How do you sit with that person, understand their pain points, and show them the value of what’s possible? Those organizations are going to be the ones that truly transform at the pace they want to. 

Raja Iqbal: We see this too — workers who intentionally resist adoption because they fear replacement. 

Robin Sutara: It happens at almost every organization. There will always be some people who can’t yet see the art of the possible. Sometimes that’s because a significant portion of their current function could be automated. For those individuals, I think about pivoting — how do we give someone with deep industry domain knowledge the tools and capabilities to do things differently, so they can rethink their role rather than lose it? That’s not an instantaneous thing. I worked on digital transformation at Microsoft for 15 years and still left while they were still transforming. It requires sustained investment in enablement, in training, in nurturing technical curiosity where it exists — and thoughtfully recreating the function for those who don’t have a technical background. 

 

Chapter 8 — 58:00 | Bias in AI & Why Diverse Teams Are the Only Real Mitigation 

Robin makes the case — drawing on Caroline Criado Perez’s Invisible Women — that bias in AI isn’t primarily a technical problem. It’s a people problem. And the only real mitigation is building diverse teams who can see the blind spots that homogeneous teams miss. 

Raja Iqbal: Large language models are built on data that is inherently biased — because humans are inherently biased. Does this worry you? 

Robin Sutara: Anyone who hasn’t read Invisible Women by Caroline Criado Perez should. It’s a fascinating look at the impact biased data has on everything from city planning to seatbelt design to job definitions. These things were designed based on data that essentially represents the average man — five-foot-eight, 160 pounds. And if you look at society, very few people fit that profile. So we are leaving out a majority of the world if we limit ourselves to the data sets we have. 

This is why data teams need diverse representation — not just in the data, but in the people working on the data products and services. If I create a team of all women, or all veterans, or all Americans, I will inherently bias the products and AI I’m creating. There has to be someone in the room who says: “Wait — if we use that, we’re missing the perspective from Bangalore, or we’re missing the perspective of someone who didn’t go to university, or someone from a different socioeconomic background.” 

Let me give you an example from that book. A city created an algorithm to prioritize snow clearing. They cleared roads first — obviously, to get people to work. But they didn’t clear sidewalks. Anyone who took public transportation and couldn’t afford to drive was suddenly unable to get to work safely. That wasn’t malicious. The people who built the algorithm just didn’t think about it — because no one on the team had to walk to work in the snow. 

Raja Iqbal: The Google Gemini incident is a good example of overcorrection — trying to mitigate bias and ending up with historically inaccurate results. 

Robin Sutara: If the Gemini team had had enough diverse representation working on that release, would they have caught that mitigating effect before it went public? Potentially. It’s definitely a balance. We will never fully mitigate all bias. This is exactly why I don’t think we will ever get to a point of zero human intervention in AI systems. The people with domain expertise — the ones who say “wait, that’s not right, that’s not how we’d expect that result to be” — are the very people organizations are tempted to displace. If you do that, you’re introducing incremental bias into your AI solutions by removing the people who would have caught it. 

My 20-something kid will say “ChatGPT said this happened in the 80s,” and I’ll say — baby, I was there in the 80s, I can tell you for sure that did not happen that way. We have to build feedback loops. We have to create structures that allow people to say “that’s not right” and have that signal actually reach the system. 

 

Chapter 9 — 01:08:00 | Lightning Round 

Bias mitigation vs. model performance. The most disrupted industries. Job evolution, not elimination. Culture as the biggest barrier. Open vs. closed source. And one resource Robin recommends for anyone trying to understand where this is all going. 

Raja Iqbal: Lightning round — quick answers. If resources were limited, what do you address first: bias mitigation or improving model performance? 

Robin Sutara: Bias mitigation. Absolutely. 

Raja Iqbal: Which industry will be most disrupted by AI? 

Robin Sutara: Ultimately every industry will be disrupted at some point. Right now, anything in professional services — anything that depends primarily on aggregating knowledge into strategy and recommendations. They’ll still be around, but they’ll have to innovate new business models. 

Raja Iqbal: Will AI result in job elimination, job creation, or job displacement? 

Robin Sutara: Job evolution. For most roles, there is the capability to think about how that role evolves or changes rather than being completely replaced. The domain knowledge those employees carry — particularly for bias mitigation — has real value. The goal is to help them evolve their functions, not replace them. 

Raja Iqbal: Biggest challenge to enterprise AI adoption — technology, skill gap, regulation, or culture? 

Robin Sutara: Culture. Helping people see the power and value, and then taking them along on that change journey. 

Raja Iqbal: Open source or closed source — which wins in enterprise AI? 

Robin Sutara: Most organizations will use a combination of both. But I do see open models becoming more prevalent, given the regulatory direction toward explainability and transparency. Proprietary models will still be essential for complex, niche problems. But for most business problems, you don’t need that level of power. 

Raja Iqbal: Large language models or domain-specific small language models? 

Robin Sutara: A combination of both, depending on the use case. We’re seeing a definite uptick in domain-specific models right now, as organizations have identified the limitations of large models for specific business problems. 

Raja Iqbal: One resource — book, paper, podcast, thought leader — for understanding how this revolution unfolds? 

Robin Sutara: There are so many phenomenal resources because the space is moving so quickly. I enjoy CDO Magazine for the top issues facing executives today. The Data Chief podcast. And Databricks has some excellent blogs that go beyond the technology to talk about how organizations are actually using platforms to drive real business value. The problems you’re trying to solve are probably not unique to you — and looking across industries for patterns is one of the most powerful things you can do right now. 

 

Chapter 10 — 01:14:00 | Closing: What Excites Robin Most 

Robin’s answer isn’t about a model, a platform, or a framework. It’s about access — and what it means for society when the power of technology is no longer limited to people who can code. 

Raja Iqbal: Last question — not a rapid fire. What are you most excited about as a human and as a technologist? 

Robin Sutara: What I’m most excited about is accessibility. The power of technology is no longer limited to technologists. My first coding language was Fortran — not super helpful these days. But what I love is that even people who have never written a line of code now have access to information and capability that was previously unimaginable. If I think about what that represents — what we can uncover as a society, the problems we can start to solve — I’m genuinely excited to see what that unlocks. 

Raja Iqbal: Well, thank you so much, Robin. It was a pleasure having you. 

Robin Sutara: Thank you so much. I really appreciate it. 

 
