Future of Data and AI / Hosted by Data Science Dojo

Robin Sutara on Responsible AI, Governance, Diversity, and People Behind Data

Robin Sutara
Field Chief Data Officer at Databricks

What do Apache, Excel, Microsoft, and Databricks have in common? It’s Robin!

From being a technician for Apache helicopters to leading global data strategy at Microsoft and now Databricks, Robin’s journey is anything but ordinary. In this episode, she shares how enterprises are adopting AI in practical, secure, and responsible ways—without getting lost in the hype.

We dive into how Databricks is evolving beyond the Lakehouse to power the next wave of enterprise AI—supporting custom models, Retrieval-Augmented Generation (RAG), and compound AI systems that balance innovation with governance, transparency, and risk management. Robin also breaks down the real challenges to AI adoption—not technical, but cultural.

She explains why companies must invest in change management, empower non-technical teams, and embrace diverse perspectives to make AI truly work at scale. Her take on job evolution, bias in AI, and the human side of automation is both refreshing and deeply relevant. A sharp, insightful conversation for anyone building or scaling AI inside the enterprise—especially in regulated industries where trust and explainability matter as much as innovation.

Full Episode Transcript 

Raja Iqbal: 

Hello, everyone, and welcome to Future of Data and AI. I’m your host, Raja Iqbal. My guest today is Robin Sutara. Robin is the chief data strategy officer at Databricks. Robin has previously worked in key roles like chief data officer for Microsoft UK and chief operating officer of Azure Data Engineering. It is my pleasure to have Robin on the show.

Welcome to the show, Robin. 

Robin Sutara: 

Thank you, Raja. So, so glad to be here.

Raja Iqbal: 

Yeah. So I was looking at your background. You started your career as a technician for Apache helicopters, and now you are in data, in this role at Databricks. So tell us about that journey. I mean, I would not say it is unusual, but it is interesting, right? From Apache helicopters to Databricks, and in a very important, key role.

So tell us about the journey. 

Robin Sutara: 

Yeah, I think it maybe isn’t unusual these days for people to come from differing backgrounds into data. I think I am a bit eclectic in that I didn’t start with a traditional sort of undergraduate STEM background, with the caveat that I actually did start university studying computer engineering. But at the time, I think, you know, random number generators and chipsets were really what people were focused on.

I always tease that my first coding projects were in Ada and Fortran, so I don’t even know that people use those languages anymore. So I really do think I’ve always had sort of this interest in technology and sort of that background. Unfortunately, due to a lack of funding, I had to figure out an alternative route to enter into the career field.

And so, after two years of school, I ran out of funding for my university, and I ended up enlisting in the US Army, where, based on testing, they sort of place you into roles. And I was fortunate to land, at the time, with the Apache AH-64 helicopters, doing the electrical and weapons systems. And so it was very much still gravitating toward my passion around technology.

But how was I going to be able to turn this into a career? I was really just turning sort of screwdrivers, right? Loading Hellfire missiles, loading the 50 millimeter machine guns. We were stationed in Korea on the DMZ for a while. And so I really just focused on how I was going to serve my time in the military so that I could use my GI Bill and eventually go back to school for technology.

Interestingly, at the time, though, while I was stationed in Fort Campbell, Kentucky,  Microsoft came out with this amazing product called Excel, which I’m sure everybody at  some point has used that as their database of choice. But because it was still relatively  really novel and new, we were trying to figure out, particularly at my duty station, how are  we going to use things like Excel to be able to track things like maintenance records,  Apache helicopter parts to be able to deliver? 

And because it was a computer sort of typing job, they thought it fell to the girl. So I think now they all sort of regret making that decision. But it was wonderful for me to have the opportunity. That was my first, I think, footstep into data, just to say, oh, wow, we can really start to think about how do we optimize our processes, how do we make sure we have the right parts in the right place at the right time to be able to do these repairs as efficiently as possible?

And so I told everybody, when I got out of the Army, I was going to go work for Microsoft. And they just really never thought that that would happen. So when I got out, I went to night school while I did computer hardware repair during the day. And then I was super fortunate to get an opportunity to interview for Microsoft to come in and do IE 5 support, Windows 3.1.

And so I got hired into Microsoft in the late 90s. And then I had a fabulous 20-plus-year career in various roles, starting in technology and moving back and forth, I think, between business and technical roles. As you mentioned, my last two roles were those that were really trying to help Microsoft think about their internal transformation and how they were going to use data and AI to deliver on those transformational goals that Satya Nadella came in and had for the company.

And so I had the opportunity to be chief operating officer for Azure Data Engineering, which is the group that owns everything from SQL Server on-prem up until, at the time, the point of visualization. So it was all the databases, the warehouse, ingestion tools, governance, Purview, etc.

So very, very exciting times as they were looking to do exponential growth. How could I help them drive the business to be more data driven, as opposed to just being conversational in their decision making? Could we really sort of ground those decisions in data? And then, based on that role, I got asked to move to London, in the United Kingdom, and serve as the chief data officer, where my role was sort of half internal facing, helping the organization be more data driven and focus on data and AI and the capabilities that the platform had to help the company operate internally and externally.

How could we, you know, represent that to the organization and get feedback into the product group on what customers were trying to do? And then, as you mentioned, the last two and a half years now I have been with Databricks in this role as the field CTO, which essentially means I get to travel the world and advise organizations on how to partner with Databricks as they think about using our data platform as the foundation for the data and AI transformations that they’re looking to do internally.

How do they think about not just the technology, but the people, the process, the organizational design, operating models, etc.? So can I bring the 20-plus years of experience at Microsoft, and bring some best practices from the 12,000 customers at Databricks, to really help our customers be successful?

Raja Iqbal: 

That is great. Thank you for the overview, Robin. As you were describing it, I was almost thinking of a tagline for your career: from Excel to Data Lakehouse, right? I mean, if you look at this, you started with Excel. That was your first job, and now you’re advising people how to scale, right?

Beyond those million rows. I think that was it. 

Robin Sutara: 

I always tell people I’ve gone full circle now. Apache to Apache. Right. So I’ve got an Apache  helicopter. 

Raja Iqbal: 

That’s very interesting. I love that, right. So that’s also a good point. So you worked as chief data officer for Microsoft UK, and now you’re working in the US, so you’ve been on both sides of the ocean. So what similarities and dissimilarities do you see, and anything that comes up as a result of that, in terms of enterprise adoption of data in Europe versus the United States?

You know, I think regulations are the biggest difference.

Robin Sutara: 

Yeah. 

Raja Iqbal: 

And maybe any other thing that you see in terms of enterprise adoption of AI or, for that matter, data, and any challenges that come up in that journey.

Robin Sutara: 

I think that’s a great question. Because my experience is relatively limited to, you know, developed countries as opposed to developing, I would probably say my point of view is maybe narrow in that aspect. If I think about the UK and many of the EU countries, as well as, you know, Canada, the US, and Mexico, which are the primary areas where I’ve worked, for many of them, they are developed countries.

And so the problems that they’re facing are actually relatively similar. So you brought up regulatory requirements, etc. Right. If I think about the EU AI Act and GDPR and sort of all of those things, and I look at the US and think, well, we don’t have the equivalent of the EU AI Act, right?

We do have state legislation that emulates similar sorts of regulatory requirements at this point.

Raja Iqbal: 

CCPA? You’re referring to CCPA?

Robin Sutara: 

Delaware has a version. I think even right now, you’ll see there are multiple states, like California and Delaware, that tend to lead in this space. The states have created some level of AI regulation while waiting for federal legislation or regulation to come into place. And so while we say they are different countries, there are different sorts of cultural expectations and backgrounds in organizations, which I would say are the biggest differences.

Right. British organizations are definitely different culturally than American organizations, which are different than the Canadian or, you know, Mexican organizations and companies that we work with. And even between Germany and the UK, right, there is a difference, I think, with the people when you think about a transformation.

But when I think about the technical requirements, for many of those countries it’s very similar. Right. So the regulatory requirements may be slightly different, but for the most part they’re all talking about, you know, explainability, transparency, lineage, the impact it’s having on consumers. And can you articulate that all the way from the data sets that you use to the models, to the algorithms within the models, the weightings, etc., to the data products and services? And with things like GDPR and CCPA, if the consumer wants to be forgotten, do you have the technical capabilities to deliver against that expectation, the consumer expectation or citizen or patient expectation, whatever it might be?

And so that tends to be very, very similar in developed countries, regardless of where you reside. But I do find for most organizations, the biggest differences that they’re struggling with are the expectations of their employees, the expectations of their consumers, etc., in the data and AI products and services that they’re delivering for them.

So let me maybe give you an example. When I moved with Microsoft from the US to the UK, it was actually during the course of the pandemic, and it was one week after Brexit. So I landed in London and immediately had to go into lockdown. There were very few products on the shelves at the time because most of the lorries were being stopped at the border, because they hadn’t figured out that whole EU-UK movement of the supply chain.

And so it’s super fascinating to think about: okay, despite the fact that I now live in the UK, where I have much broader protection as a consumer, how much personal information or health information was I willing to give up to the NHS, or to Tesco as a retailer, to get things like my groceries delivered, because there was such a shortage of supply?

So I think, you know, examples like that show us that, yes, there are regulatory requirements, but situational requirements may create this environment where you’re willing to rethink what information you’re willing to disclose and what you’re willing for that information to be used for. And now that we’ve come out of the pandemic, you’re seeing maybe a re-hardening, I think, of some of those GDPR requirements, particularly being enforced out of Europe, less so in the US. I think as we continue to go through, you know, regulatory decision making with a new presidential cabinet and the members that exist today, there is still a little bit of uncertainty for us.

And so I think it’s always interesting to watch those global dynamics, what you’re willing to opt into or out of, and how it impacts an organization’s decision on how they’re going to use data or AI to deliver.

Raja Iqbal:

In that sense. Because, so, I mean, I have a technical glitch. Just give me one moment.

My apologies. Sincere apologies. Let me see. Can you see me now? Okay. Sounds good. So let’s continue the discussion here. There’s the regulatory environment. Don’t you think that the difference in the regulatory environment can make a difference in terms of how adoption happens? Because the EU tends to be very conservative when it comes to, you know, how to govern AI and data.

Do you think that imposes some barrier on how enterprises are going to use AI?

Robin Sutara: 

For some organizations, it can slow the pace of innovation. But to be honest with you, over the last 18 months, I think, since this generative AI hype has happened, most organizations are actually trying to figure out practical applications of how to leverage AI in some capacity, whether it’s internal-facing process optimization, employee empowerment, or whatever they’re looking to deliver.

Even in the EU, where maybe they’re less likely to do a customer-facing application because of those risks or regulatory requirements, it doesn’t mean, I think, that they’re slowing down the pace of innovation. I think they’re just rethinking the application of those AI capabilities.

And so how do they do it more internally, where they can minimize their risks, etc.? But to be honest with you, GDPR has been around for a significant period of time, and the EU AI Act was communicated, I think, relatively early. And so for many organizations, until they see litigation around the EU AI Act and how it’s actually being enforced, I don’t see it actually slowing down their innovation.

I do think they are being cautious about making sure that they have the right components in place. They have the right reporting in place. You know, they have the right capabilities, should they be questioned, or should the regulator come back to ask about a specific application or AI implementation that they’re doing across their environment.

I think they are making sure that they’re taking that into account. But I don’t actually see a difference, in the work that I have done, between the US and the EU or the UK in their pace of innovation, or their want or drive or desire to leverage AI to innovate just as quickly as the other side of the pond.

And so I don’t see it being prohibitive for organizations to actually execute against it. In fact, most US organizations are slower than what I see in the UK or the EU, because there’s still a lot of uncertainty in the US on what the regulatory requirement is going to be.

And particularly in regulated industries like financial services, health care, you know, public sector and government, where there are higher levels of scrutiny, there’s maybe almost a slower pace of innovation there for those customer-facing applications that we saw 18 months ago in the UK or the EU. But I don’t see there being a big difference between organizations in how they’re executing, what they’re executing, or their pace of innovation, to be honest.

Raja Iqbal: 

So that’s a very interesting viewpoint. What I hear from you, and please correct me if I’m interpreting it incorrectly, is that the absence of regulation is actually slowing things down, as opposed to accelerating them. Because sometimes we hear that the EU is creating a lot of regulations, which is slowing down how innovation can happen. What I hear from you is slightly different, right?

Because they are clear on what regulations are there, that’s actually allowing them to adopt, especially in more regulated industries like healthcare and finance.

Robin Sutara:

Yeah, in broad strokes, yes. Right. There will always be a difference between organizations in their appetite for risk versus innovation. I think every company is trying to decide what that balance is and what is the right thing for them to do. You know, how much risk are they willing to take on?

Databricks has done some phenomenal work, actually. We have a field CISO who has put together an entire security framework that takes into account 68 attributes of AI that organizations should think about and come to agreement on across legal, compliance, the business, IT, etc., establishing what that risk versus innovation balance is that they’re willing to take on in an effort to actually leverage and execute against their AI strategies.

And so, as much as I’m making sort of a broad statement, I would say every organization has to determine for themselves how much risk appetite they’re willing to take on to allow for a pace of innovation, because it is a give and take, right?

And making sure that they’re providing it. But I do find that for most EU organizations, because they have those standards that they’re going to be held accountable to, and the monetary implication of not complying could be relatively significant, that almost gives them a basis to be able to leverage.

Whereas in the US, we had an executive order, and the executive order is no longer, you know, in effect. There’s just a lot of uncertainty. And so for many of them, unless they have a state legislation or regulation that they’re trying to comply with based on their business operations within that state’s boundaries, it is a lot more of, how do we balance innovation and the pace of innovation while minimizing our risk or our exposure for whatever the legislation will be at the time that it comes out?

But again, if you look at the legislation, regardless of whether it’s state, or the EU AI Act, or the UK version of the EU AI Act, for most of them the fundamentals are the same. Right? There has to be traceability, explainability, you know, transparency.

There are just fundamental things that are required, whatever they are. And so those organizations that have a little bit more of a risk appetite to innovate are still executing with those foundations in place to protect themselves as regulation and legislation start to get decided outside of the EU. And so I don’t know that I see either side of the pond completely slowing down on the pace of innovation.

If anything, I think the pace of innovation in the technology space is now creating an environment where organizations can leverage technology in better ways to execute at whatever pace of innovation they want.

Raja Iqbal:

So, I mean, is it safe to say that, more than the regulatory environment, it is the industry, perhaps the culture of the company, the company size, the risk appetite that dictates how quickly they adopt?

Robin Sutara: 

Right. And then probably the one other factor I would add in there would be, you know, the culture of the organization, like you said, the company themselves. Because I have worked with some organizations who have created some amazing innovation, but they can’t get the business users to actually use that data product or service or AI that they’ve created within the organization.

And for many of them, it’s because they forgot about the people, and they forgot about the change management that would be required to bring the organization along on the evolution of the innovation that they’re trying to execute. So, yeah, it’s a complex formula, I think. And I would love to just tell you in broad terms, but I don’t see any particular country or region or area of the world, at least in my travels and interactions, that is moving significantly slower than anybody else.

Raja Iqbal: 

Okay. That’s great. And thanks for elaborating on this. When you look at Databricks, it is interesting that Databricks started as a machine learning company. And then, for a long time, people almost forgot that they started as a machine learning company, and they became, you know, the data platform.

Right? The data platform of choice, you know, a significant player in that space. And now, going full circle, it’s AI again. So tell us, how is Databricks actually adapting to this? Because they really are seen as a data platform at the moment. At least that’s how I look at it.

I mean, maybe there are others who look at it as an ML company, but for a long time they have been the data platform.

Robin Sutara: 

Yeah. I can always tell, when I walk into an organization, how long they’ve been with us as a customer, depending on what team is using the Databricks platform. As you mentioned, the company was founded 11 years ago by five PhDs out of UC Berkeley, the creators of Spark. So really, they were trying to solve the big data and ML sort of issues that organizations were struggling with.

I think early on, they realized that it wasn’t just unstructured data in the lake that they needed to be concerned with. It was also the structured data. And so they went from, how do we solve this big data ML problem, to, how do we now help customers have a data platform that allows them to bridge the gap between their structured and unstructured data?

And that was the creation of the Lakehouse, right, eight years ago. And we’ve been able, I think, to evolve since then, so that we are the data platform of choice for many, many organizations. And so it’s been interesting that if you walk into a company where it is primarily used by the data science team, they’ve probably been with us since the beginning, the big data era.

And if it’s the data engineering team that’s the strongest group using the Databricks platform, they tend to be the Lakehouse sort of era of users that came into the company. And today, if I think about it, we have actually thought about how we continue to evolve the platform, leveraging the technical capabilities that exist today that didn’t exist when the company started 11 years ago.

And so a lot of that has been, how do we apply AI and generative AI and agentic systems, etc., within the platform? So there are sort of two problems that we’re looking to solve. We did the acquisition of MosaicML; about two years ago was that announcement. And we really had to rethink, what does a data platform of the future look like?

It’s no longer sort of this separation between AI and BI, and engineering versus data science, etc. And so while we had broken some of those barriers with the lakehouse, I think what we’re thinking about now is a data intelligence platform. And there are two factors to that, I think. One is, how do we help companies think about the AI that they’re looking to build?

Right. And how do we make sure that they can do that on their structured and unstructured data? So it builds on the Lakehouse foundations. But how do we think about the pace of technology that’s happening in AI now? The top model of today isn’t necessarily going to be the top model of tomorrow. And so for many organizations it was, can we create a platform that allows you to do data science in a way that lets you do things like MLOps or LLMOps, or swap models in and out depending on, you know, what had the best return on your investment based on the problem that you’re trying to solve? Or how do we do things like compound or agentic systems? How can the platform support that natively, building it in?

Because, again, the intent at Databricks has always been, how do we help companies minimize the number of data copies that they have to create? How do we help them break down those silos? It doesn’t do any organization any good if they constantly have to copy data in and out to create a new model or a new AI solution or a new data product to deliver.

And so I think foundationally, we have thought about how we build on the lakehouse and help companies with their own AI goals, objectives and missions. And how do we also leverage AI within our platform? So, you know, when I started with the company two and a half years ago, it was very much about how do we make Databricks simple.

How do we make sure it stays an open platform so that we’re not, you know, locking organizational data into our platform, so that they can move it as they need or use it with other tools and solutions that are built on open systems? How do we make sure that we’re doing that cost effectively?

Now, it’s a lot of, how do we make the platform smarter? How do we actually leverage AI inside of the platform? How do we disrupt ourselves? Our CEO essentially stood up, you know, a year and a half ago, when ChatGPT and large language models were really sort of at the precipice, and said, if we had to recreate Databricks, how would we disrupt ourselves?

How would we do things differently? How would we rethink how we actually built some of these products? And so it’s been a phenomenal pace of innovation for Databricks over the last 18 months, to really think about how we would have done things differently. How do we make sure that we’re creating a platform that’s bridging the gap between your structured data and your unstructured data, with some level of governance and control, so that you have full end-to-end lineage and explainability and enforcement and policy, etc., across not just structured data but also your AI assets, so your notebooks, your models, etc.?

And so for us, it’s really been about how we disrupt in that space. And so how do we leverage AI in the platform to do things like understand business semantics? So, for example, Databricks employees are called Bricksters.

So if I type Brickster into a natural language interface, I’m not looking for, you know, a Lego. I’m looking for a Databricks employee. So how do we leverage the platform to understand that? What does revenue mean? What is our fiscal year? What is a quarter?

How do we define, you know, EMEA or Americas, etc.? And the other thing is, how do you actually leverage the platform to make it more cost effective? How do we make sure that we’re helping organizations? One of the biggest issues continues to be just infrastructure management, right? Turning clusters or servers on and off, or setting up workspaces or optimizing them, or prioritizing jobs, or optimizing your queries, etc.

So how do we leverage AI in the platform to automate some of those things, so that organizations can focus their talent, the limited talent that exists, on the right thing, which is solving problems for the business, not managing infrastructure? And so I think you’ll continue to see us innovate and disrupt in that space.

How do we help companies create AI, and then how do we use AI in the platform so that it’s  intelligent about the organization, about, sort of the things that matter? 

Raja Iqbal:

Yeah. So let me expand on this example, Brickster. There can be company-specific jargon, company-specific terminology, and, for that matter, a company’s intellectual property, their proprietary data that is sitting inside the platform. And hopefully OpenAI and other models don’t have access to it.

So that knowledge is not built into your models, whether they are closed source or open source. And that’s why RAG, or as we call it, retrieval-augmented generation, and those kinds of approaches exist. You fine-tune models, and sometimes you build your own custom models that are domain specific and very specific to your company.

So, even before I go to Databricks, for the businesses you work with, do you see the appetite for, you know, RAG being the approach of choice for a lot of people? Are enterprises fine-tuning their models, or are they building their custom models?

What do you see? How are they actually dealing with their own proprietary data when it comes to building applications?

Robin Sutara: 

Yeah. So maybe just one correction. Based on an organization’s usage of the platform, and leveraging the metadata that goes into Unity Catalog, we do know those things. Granted, it’s just within that organization, in that workspace. But because of that, we are able to extrapolate: what do we think this table does? You know, can we start to use AI to do things like row and column tagging, things like that?

Right. But again, all of that, like you said, is an organization’s intellectual property. And so it exists within their instance of Unity Catalog, within their workspace, so that we can leverage the metadata in that way. And the platform is able to then expose that to them in a way that allows them to actually apply it. That is an example, though, of, you know, essentially what we call compound systems, right?

I do think, right at the very beginning, everybody thought, oh, we’re going to be able to leverage these proprietary models, and we’re just going to do prompt engineering. So how do we train everybody to be a prompt engineer? And then they realized, oh, wait, it’s really expensive to leverage a thing like OpenAI for every use case, and those models don’t necessarily have the domain knowledge of our organization. 

Or we don’t want to give access to our intellectual property, right, that potentially feeds back into the model. I think the example of Samsung is maybe one of the most famous, where the employees, who were really trying to do the right thing for the company, put trade secrets into ChatGPT and essentially exposed them outside of their organization. 

And so I think there’s still, again, talking about that risk versus innovation, a level of risk aversion to leveraging these big open source models, even if you’re doing a RAG implementation against those open source models. I do think for many organizations, there’s still this question of how much control do we have, and how much are we exposing things that really are our intellectual property and are essentially, you know, the foundation of our company and why we exist and what makes us unique or valuable compared to our competitors. 

And so for most organizations, it is some level of compound AI systems that they’re creating. They’re leveraging OpenAI or, you know, proprietary models to be able to solve some part of the problem. They’re creating their own models, right, and actually doing net-new creation of models via open source capabilities, or even things that they’re building in-house, to be able to deliver. 

Can they execute a piece of it with a small language model, or leverage RAG against a piece of it? Then the question becomes, how do you leverage the platform to tie all those pieces together so that you’re actually then able to deliver some value proposition? So if I think of examples, insurance is a great one: how do you actually process claims? 

So it’s a very simple, well, really not a simple use case, right. But it’s very much comprised of very similar sorts of problems that they’re trying to solve. Can you do some level of document extraction, and can you do some level of, you know, OCR capabilities to be able to read handwritten notes from the insurance adjuster in the field? 

How do you tie that together with, you know, some level of computer vision on unstructured data, the pictures that get uploaded? How do you now actually validate that those pictures are not deepfakes and that they’re actually legitimate? Right. I think we saw this rise of insurance companies dealing with false claims with AI-generated, you know, pictures of traffic accidents that they were paying out on. 

So I think that’s a great example, where insurance companies have actually started thinking about how do we do things like fraud detection and solve for those cases. And it’s a compound system. It’s leveraging OpenAI for the right component of that. They’re creating some level of capability internally, leveraging RAG on top of that, and then being able to pull those together into an end-to-end compound system, to be able to leverage AI capabilities to automate some of it. 

And then there is the human in the loop at the end to validate, etc. But it’s literally saving hundreds of thousands of hours for, you know, claims adjusters to be able to process that volume of information. 
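
Editor’s sketch: the claims example above can be read as a pipeline of independent components with a human reviewer at the end. The snippet below is only an illustration of that shape, not how any insurer or Databricks implements it. Every name in it (`Claim`, `extract_fields`, `check_photos`, `route`, the 0.5 threshold, the routing labels) is a hypothetical stand-in; in a real system each step would call an OCR service, a vision model, or a RAG-backed model.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Claim:
    claim_id: str
    document_text: str         # text assumed to be already extracted / OCR'd upstream
    photo_scores: List[float]  # hypothetical "looks synthetic" score per photo, in [0, 1]
    notes: List[str] = field(default_factory=list)
    needs_human_review: bool = False


def extract_fields(claim: Claim) -> Claim:
    # Stand-in for document extraction / OCR: flag claims with no readable text.
    if not claim.document_text.strip():
        claim.notes.append("no readable document text")
        claim.needs_human_review = True
    return claim


def check_photos(claim: Claim, threshold: float = 0.5) -> Claim:
    # Stand-in for a computer-vision / deepfake detector (threshold is an assumption).
    if any(score >= threshold for score in claim.photo_scores):
        claim.notes.append("photo may be AI-generated")
        claim.needs_human_review = True
    return claim


def run_pipeline(claim: Claim, steps: List[Callable[[Claim], Claim]]) -> Claim:
    # The "compound system": each component handles one part of the problem.
    for step in steps:
        claim = step(claim)
    return claim


def route(claim: Claim) -> str:
    # Human in the loop: flagged claims go to an adjuster (routing policy is assumed).
    return "adjuster_review" if claim.needs_human_review else "fast_track"


if __name__ == "__main__":
    steps = [extract_fields, check_photos]
    claim = run_pipeline(Claim("c-1", "rear bumper damage", [0.1, 0.9]), steps)
    print(route(claim), claim.notes)
```

Keeping each component behind the same `Claim -> Claim` signature is one way to swap a proprietary model for an open or in-house one without touching the rest of the pipeline.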

Raja Iqbal: 

Yeah. And I heard agentic AI and multi-agent systems a few times in the conversation so far. So is Databricks more around, you know, bringing in the current existing ecosystem? You have frameworks like LangChain and LlamaIndex, the most notable ones. 

And there are others too. So do you have your own frameworks for multi-agent collaboration, or does the platform have it built in? I’m just curious, because I work on the technical side of it. We have a bootcamp as well, right? We teach people. So how is Databricks approaching it? 

Robin Sutara: 

Yeah. So we have frameworks that are built into the platform to be able to deliver, again, some of that. But again, like I mentioned, it’s an open system. So if you want to bring LangChain in, you can. I think there are lots of organizations who have some level of capabilities that they’ve already built or been able to deliver. 

And so our intent has always been: can the platform support an open ecosystem, so that organizations can bring in the right capabilities that they need, and support those? But we are thinking about how do we help organizations automate some of that. How much can we create as a result? There’ll be some big, exciting announcements that come out at our summit in June of this year in San Francisco. 

So if you can’t attend in person, I would highly recommend that you join us virtually, because I think this will be a big space of announcements, of innovation, of what the teams have been working on over the past year. 

Raja Iqbal:

And in terms of open source versus closed source: you know, you can use OpenAI. You can use Llama or any of the open source models. So where do you see the future? Do you think it is going to be open source, or is it going to be closed source, or something else? 

Robin Sutara: 

I think that sort of ties back to the regulatory and legislative requirements that we’re going to see. I do think there will be some level of proprietary models that solve very niche sorts of problems or issues as part of a broader compound or agentic ecosystem. But I think lots of organizations are starting to really think about: if I have to do something like explain to the regulators what this model is, what weightings were used, what data went into it, etc. I see more and more organizations trying to figure out how much of that can they make open, so that they have more control and more visibility and explainability into it.

But there are definitely proprietary models that are able to deliver efficiently or effectively. So I think, provided that they’re super clear on helping organizations be able to explain to the regulators specifically what part of the compound system that proprietary model is looking to solve for, we’ll continue to see a combination of both. But I think that’s the power of something like the Databricks platform.

Can you tie it into a compound system? Can you use components of a proprietary model and an open model to be able to solve for the business problem or output that you’re looking to solve for? 

Raja Iqbal:

Okay. And when you see enterprises on their journey to adoption, what do you think is a common, call it a common myth? What do they commonly get wrong? And what is the biggest barrier? I mean, in general, what stops them or what holds them back from being an AI-enabled company, in the sense of basically adopting AI for their business processes? 

Robin Sutara: 

For almost every organization, it turns into a people and process issue, in my observation. Either you don’t have the right data talent that’s been educated and enabled on the capabilities of the technology or the platform, or you haven’t thought beyond just enabling your data personas to the business users that actually have to leverage the systems or tools. 

And so for almost every organization, we’ve talked about digital transformation or data transformation for decades now. I think it’s almost this new era where people are super afraid of what they don’t understand. And so for many organizations, their biggest hindrance is how do we not just do this migration of legacy, sort of, technical debt or process debt or business operational debt? 

We don’t want to bring that technical debt into a new format. So, for example, when cloud first came out, I remember lots of organizations just doing almost a lift and shift from on-prem to the cloud. The problem is they never thought about modernization. So could you optimize the way that warehouse was constructed to take advantage of the cloud, and not just bring the current construct from on-prem into a different infrastructure?

I think we’re seeing that same thing with AI now. Yes, the business process works in the steps of A, B, C. But they’re not thinking about, now, how do we change the business process to be X, Y, and Z, and really rethinking internal processes to take advantage of the technical capabilities. And I think for many organizations that’s hindering their ability to innovate as fast as they want, because all you’re doing is bringing your business or process or enablement debt from one format or one technology to another. 

And so we really have to think about how do we bring an organization along on that journey to be able to say, what could you do if you weren’t having to do that manual task of rationalizing 100 Excel spreadsheets every week to be able to report on revenue? What are the things that you never have time to do? 

And how do we think about enabling technology to get rid of some of that stuff? People are really afraid. Will I have a job at the end of it? How much is going to be automated? Is AI going to displace or replace part of me? And so I think it is, you know, no different than the Industrial Revolution, where machines started to take over manual processes that people were doing. 

We have to take people along on that journey to say, how are we going to enable you? Because you have the domain knowledge, the understanding of the processes, the understanding of the business, the understanding of our customers or patients or clients, whatever that might be. And so I think for most organizations, it is that unlocking of the domain expertise of the organization and making sure that the technology is an enabler, not something to be feared by the organization. 

Raja Iqbal: 

Yeah, yeah. So that’s great. So, Robin, you mentioned jobs, and I would use this as a segue to talk about society. Right. At the end of the day, we are humans, right? We live in society. Our jobs, our physical health, mental health, you know, how we work, where we work, all of that is also important to us. 

So, how do you feel about this, or where do you see things going? I mean, no one knows exactly what is going to happen, but how do you see AI adoption actually shaping the future of the workforce? 

Robin Sutara: 

Yeah. So, like I said, I do see for many organizations that I work with, it is actually optimizing for improving productivity. I think for many organizations, though, it comes back to that culture. Like, how do you make sure that the organization understands the value proposition of that productivity increase? It’s always so amusing, I think, when people say, oh, well, we saved 15 minutes for all 20,000 employees at the company, and I say, okay, so how did you translate that into something else? 

Like, what were they able to do now, as a result of saving the 15 minutes a day? Did you now say, okay, now there’s the opportunity for you to deliver against thought leadership or the next project that you haven’t had time to do? And so I think lots of organizations are sort of missing that step of, oh, now we have to actually help the organization understand, because otherwise they just see pieces of their current role or functions slowly slipping away as we start to automate or leverage AI to be able to do that. 

And so I think those organizations that are doing it really well are taking, you know, the entire company and thinking about enablement. Absolutely, organizations like yours are helping us really bring up the data science capabilities and the next level of data scientists that we’ll have. 

But how are we thinking about the business user in the finance department who’s been doing that same role or function for the last 25 years? That domain knowledge is invaluable, right, and being able to translate it. And so how do we sit with them and understand what their pain points are, and show them that value? And I think sometimes we miss that opportunity. 

And so I think those organizations that are going to be able to innovate and truly transform themselves at the pace that they want to, they’re going to have to think about every persona across the organization, and how do they create a way that allows them to go on that transformation journey? 

Raja Iqbal:

Yeah. That’s a very interesting point. So we work with companies as well. And one of the things that we have seen internally is reluctance, because, you know, some of the workers think that they are going to be replaced. Do you see that? I mean, have you heard of this, that that is a barrier to adoption, that workers intentionally do not want to adopt? 

Robin Sutara: 

Yeah, it happens at almost every organization. You’re going to have some persona, right, that just can’t see the future. They can’t see the art of the possible. They can’t see what their job would look like. And maybe that is because a majority of their current function could be automated, or could leverage AI in some capacity. 

And so for those, I really think about, you know, how do we think about pivoting. So, for example, I do think data is probably the best space in the world for us to bring diverse perspectives, diverse points of view. And that requires us thinking about how do we enable somebody that has domain expertise or industry knowledge? 

How do we think about, you know, giving them the tools, the capabilities to be able to do things differently, so that they can rethink their job or what that would look like as a result of the knowledge that they have about the industry or the knowledge that they have about the process, etc.? And I would love to say that that’s an instantaneous thing.

But if you think about it, I worked on digital transformation at Microsoft for 15 years, and when I left, they were still transforming. They’re still transforming today, after I left. And so for these organizations, I think it’s just going to be this ongoing thing. I would say it’s all about those quick wins that you’re able to execute. 

But it’s also about giving your organization and the people across your organization, particularly those that are worried about what their job of tomorrow looks like, the chance to think now about what’s the enablement that you can give them, what’s the training that you can give them. Where do you have some level of technical exuberance or a desire to learn and, you know, create new capabilities? 

How do you grow and foster that today? And then how do you rethink things for those that don’t necessarily have the same technical background or technical aptitude? What does a future for them look like? And how do you recreate their function? It’s a long process to take them on that journey. And so it’s about making sure that you’re investing, you know, in the employees where it makes sense to take them along on that journey with you. 

Raja Iqbal: 

Yeah. And as humans, we know what these tools and technologies are capable of, and we also know their limitations. So speaking of large language models, they are built on data sets that are inherently biased. Right? And bias comes from, I mean, these companies gather data from data that has been generated by humans.

And humans are inherently, I mean, we have our own biases. So as a human, does this worry you? Or are you optimistic that, with these tools, eventually there is going to be some self-correction that is going to happen? So that’s the first part of my question, in terms of bias and, in general, overreliance of humans on these tools. 

I mean, does this worry you? Because I hear all sorts of opinions. Some people, you know, they are worried. Some people say, no, I mean, I’m an optimist when it comes to technology. I would love to get the take of Robin Sutara, not the Field CDO at Databricks, but as a human. 

I mean, how do you feel about this? 

Robin Sutara: 

Yeah, I think, for anyone who has never read the book Invisible Women by Caroline Criado Perez, it’s a fascinating read on the impact that biased data can have on everything from city planning to job definition to how seatbelts in cars are designed. Right? Those things are all designed based on data that is essentially the average man, which is five-eight, 160 pounds. 

If you look at society, there are very few average men, right? There are very few men that are only five-eight, right, at 160 pounds. So it was such a fascinating book to sort of read through, to say, hey, we are leaving out a majority of society if we inherently depend on just limiting ourselves to the data sets that we have. 

And that’s why I think there has to be some level of introduction across the data teams and, you know, the data product teams that are being developed: how do you make sure that you have diverse representation on that team? Because you’re right. We all have inherent biases. So if I only create a team of all women, or I only create a team of all veterans, or I only create a team of only Americans, I think I will be very inherently biased in the products or services, whether they be data or AI, that I’m creating as a company to be able to deliver to society. 

And so how do we make sure that you are creating organizational teams and structures that  give you representation? Because there’s going to be somebody in that room that says, hey,  wait a minute. Like, if we use that, we’re missing out on this perspective from Bangalore,  right? Or, you know, somebody that didn’t graduate university, whether it’s socioeconomic  or cultural or whatever it might be. 

And so I really think when we talk human in the loop, it has to be a diverse team of humans  and really like, how do we really think about setting up our data teams, our organizational  structure, our enablement plans, making sure that we have diverse representation across  all of those? Because data in and of itself is inherently biased, right? 

It was created with biases in mind. And unless you have somebody on the team that’s going to help you recognize where those biases might exist, you might do something like plan cities that don’t take into account people that don’t own cars. So you’re now creating a socioeconomic bias for those that have to walk to work or take public transport. 

Right. Or things like the snow. There’s an example in that book about clearing snow, where they only cleared the roads and not the sidewalks. And so now, based on data, right, on algorithms, AI models that were created to prioritize snow clearance, you’ve essentially, you know, said that anybody who takes public transportation, who can’t afford to drive themselves to work, is now put at a disadvantage, because they’re unable to get to work as a result of the way you prioritized that snow clearing. Things like that, you just don’t think about, right, when you’re a city. 

I wouldn’t know. If I had been a city planner, of course I would prioritize getting the roads cleared over the sidewalks to get people to work, etc. And so it’s just really fascinating: if we understand those biases in the first place and somebody can point them out, how can you then start being mindful and planful and taking those things into account for the products that you’re going to deliver? 

You see it in healthcare. I think there are some great examples in there. I’ve done some work with Women in Data around women’s safety, and sort of, you know, how do we leverage data to help in the women’s safety arena, etc. And so I just think there’s so much opportunity as a society. I think data is phenomenal.

We can solve some amazing problems with data and AI, but it does require us to be  deliberate about creating diverse systems, meaning not just the data being diverse, but the  people being diverse that are working on the data products and services to better make  sure that we can deliver. 

Raja Iqbal: 

Yeah. So that reminds me, I mean, this diversity aspect of it. Mitigating bias can be a tough problem. I’m not sure if you remember one of the recent Google Gemini releases. You know, they were trying to mitigate bias, right? So, show me a roomful of CEOs, right? All white men showing up in a meeting room. 

Show me the Founding Fathers. I mean, the AI can actually sometimes overcorrect things, and now you’re showing a mix of all colors and races. For the right reasons, we are trying to mitigate the bias, but in some cases, there is some historical, factual correctness that you have to worry about. Right? So it can be actually very tough. 

Right? Because, inherently, models learn because of bias in data. Right. Fundamentally, they learn because of what we call signal. I mean, technically it is bias, right? So it’s a fascinating problem, and it’s a very complex problem to solve, actually. 

Robin Sutara: 

But if you think about it, I mean, if the Gemini team had had enough diverse teams, right, working on that version of the release, would they have caught that bias mitigation issue before it went public? And all of a sudden they were, you know, creating these, you know, incremental racial backgrounds of the Founding Fathers. 

But I think it’s definitely a balance. And I think there is a huge amount of risk that we won’t be able to mitigate all bias. This is why I don’t think we will ever get to not having some level of human intervention in these things, because it requires people to use this system. 

And this is why I think, for organizations, it’s about making sure that you’re enabling those people that have that domain expertise of your business, of your processes, etc. If you think you’re going to displace them, you are essentially introducing incremental bias into your company, into the AI solutions that you’re creating, because you’re getting rid of the people that have the domain expertise to say, oh, wait a minute, that’s not right, right. 

Or that’s not how we would expect that result to be, etc. And so I really think, you know, whether it’s organizational or societal or however it is, we have to think about what is that feedback loop. How do we make sure that we’re allowing people to give that feedback so that we can constantly work on optimizing and improving the system? It’s always interesting, you know, my 20-something kid says, well, ChatGPT said this happened in the 80s.

And I was like, baby, I was there in the 80s. I can tell you for sure that did not happen that way. Right. And so, like you said, there are people that just have this complete and absolute confidence in the results that they’re getting out of the system. We have to make sure that we’re creating, you know, structure that allows people to say, that’s not right. 

I mean, right, my knowledge, my expertise, my insight. And I want to have a team and an organization that can bring that point of view, because I only have a limited, narrow view based on my life experiences, my, you know, my upbringing and my education, etc. And so how do we make sure that we’re creating organizations and teams and structures to be able to support that people part of it? 

Raja Iqbal: 

And most probably having the right guardrails on these systems. Right. So I think that’s very important. So we’re coming to a close. Let’s actually just wrap up quickly. In the next phase, I will quickly ask you, you know, some rapid-fire questions, and then you will give short answers. 

I mean, elaborate if you like. Short answers would be fun. Right? So, if resources were limited, what would you address first in AI: bias mitigation or improving models’ correctness or performance? 

Robin Sutara: 

Bias mitigation. Absolutely. 

Raja Iqbal: 

Okay. Different industries are going to be disrupted by AI in different manners. Which one do you think is going to be disrupted? What use cases, in which industry, do you think will be disrupted as a result of the AI solutions that we are seeing? 

Robin Sutara: 

I think ultimately every industry will be disrupted at some point. I think right now, anything that’s sort of professional services, anything that has a dependency just on aggregation of knowledge into a strategy, capability, etc. So I think professional services immediately, and at some point all industries are going to be disrupted. 

Raja Iqbal: 

So without naming names, I mean, no more of those consulting and services companies, is that what you’re telling me? 

Robin Sutara: 

I think they’ll still be around, but I think they’ll have to, you know, innovate new business models. 

Raja Iqbal: 

And so no more, business and PowerPoints only. 

Robin Sutara: 

Exactly. 

Raja Iqbal: 

Yeah. In terms of, jobs, will AI result in job elimination, job creation or job displacement? 

Robin Sutara: 

Job evolution.

Raja Iqbal: 

Okay. Can you elaborate? I mean, I know I asked you for short answers, but, when you say  job evolution, what do you mean? 

Robin Sutara: 

Well, I think for most jobs or roles, there is the capability to think about how that role evolves or changes, as opposed to being completely replaced or displaced by AI. And so, again, bringing that domain knowledge and understanding that those employees have in that current role, there’s value there, particularly as we think about the bias mitigation, etc.

And so how do we make sure that we’re helping them evolve their roles and their functions, as opposed to displacing or replacing them with AI? 

Raja Iqbal: 

Yeah. And what is the biggest challenge to enterprise adoption of AI? Is it technology? Is it  skill gap? Is it regulation? Is it culture? Something else? 

Robin Sutara: 

I think it’s still primarily culture, right? I think it’s helping people see the power and the value, and then taking them along in that change journey. 

Raja Iqbal: 

Okay. In terms of open source versus closed source models, which one is going to be the winner in enterprise AI? 

Robin Sutara: 

I think most organizations are going to use a combination of both, but I do see open models becoming more prevalent, as opposed to closed models, for regulatory and legislative requirements and explainability and transparency. 

Raja Iqbal:

Is it time for me to sell my stocks for closed source, like OpenAI? I don’t think it’s traded, right, so I should not. 

Robin Sutara: 

Maybe. They’ll continue to solve amazingly, you know, big, complex problems, which will definitely still need proprietary models. But I think for most organizations, you don’t need that kind of power to solve every business problem. And so I think we’ll start to see more and more open sharing of models and capabilities across the ecosystem. 

Raja Iqbal: 

And perhaps an extension of the same question, large language models or domain specific  small language models. Which one do you see being used? 

Robin Sutara: 

Again, it’s going to be a combination of both, depending on the use case. We are seeing more and more domain-specific models being created right now, I think, because large language models have been around much, much longer, and organizations have figured out the limitations on what business problems can and can’t be solved with those. 

And so we’re definitely seeing an uptick in more domain-specific business models being created right now. 

Raja Iqbal:

Okay. And my last one here: if you have to mention one book or paper or a thought leader or a talk that I should go and watch to understand how this revolution is going to unfold. Really, if I want to understand what is going on, what would that book or paper or talk or thought leader be? 

Robin Sutara: 

I think there are so many phenomenal books that are being created, because the space is moving so, so quickly. For me, I enjoy reading things like CDO Magazine to sort of get the top issues facing executives today. I enjoy things like the Data Chief podcast, and Databricks has some phenomenal blogs, I think, that continue to come out, not just talking about the technology and the platform, but also how organizations are leveraging those platforms and capabilities, and sort of the real business value that they’re able to drive as a result. 

And for many organizations, the problem that you’re trying to solve for is probably not unique to you. And so, looking across industries, how do you take something from retail or supply chain and provide it to a pharma, right, a pharma company, to help solve for supply issues, etc.? I think we’re going to see a lot more knowledge sharing of some of these best practices so that we can continue to push the pace of innovation. 

Raja Iqbal:

Okay. And my last question, this is not a rapid-fire question, just a closing thought. What are you excited about, as a human and as a technologist, when you look at everything that is happening around us? 

Robin Sutara: 

I think what I’m most excited about is that the accessibility of technology is now no longer limited to just technologists. As I mentioned, I mean, my last coding language was Fortran, right? So it’s not super helpful these days. I can get by in SQL enough. But what I love is the fact that even my, you know, parents and grandparents have access to the information that, you know, typically. 

Raja Iqbal: 

And they can write Python now. Grandparents can actually write Python code. 

Robin Sutara: 

So if I think about the power now that that represents and sort of the impact, I think that we  can have on society, I think I’m super excited to see what that uncovers for us. 

Raja Iqbal: 

Well, thank you so much, Robin, for your time. It was a pleasure having you. 

Robin Sutara:

Thank you so much. I really appreciate it. Thank you.
