The Center for Data Innovation spoke with Rumman Chowdhury, Global Lead for Responsible AI at Accenture. Chowdhury discussed a tool she developed that can reduce bias in an AI system and the value of a social science background in data science.
This interview has been edited.
Joshua New: You are the first person to have your role at Accenture. Could you describe your role, and why Accenture thought a position like this was necessary?
Rumman Chowdhury: This is a brand new role, and when I was hired into it, nobody knew what it actually meant or what it would entail. All they did know was that AI was a sweeping new technology that would impact everything we do, so it was important we approach it ethically. That being said, Accenture has a really strong sustainability and responsible business practice, so it’s in the wheelhouse of what we’ve done before in terms of thinking about the social implications of business. They needed someone who understood the technology, which I do as a data scientist, but who also understood how institutions and people can react to new policies or technologies, which I do as a social scientist.
When I first started, it was all about awareness. I started in January 2017 and we were all trying to wrap our heads around this space. Social scientists, philosophers, and technical folks were all trying to understand what it meant to do responsible AI or even measure the implications of the technology. So the first year’s goal was awareness and evangelizing, and this past year was about moving to positive action. Once we were all on board with the idea that we need to be ethical, there was a lot of hand-raising about what that actually means. We started off with a lot of grand principles, but when it comes to actually executing them, that’s quite hard.
My mandate is not just to pontificate, but to actually build solutions and create a business out of it, and I relish that opportunity. Wanting to do good and wanting to do good business shouldn’t be at odds. That’s what I’m out to show—being responsible and ethical actually equates to good business. So my goals for next year have to do with agency and accountability, which is something I’ve been thinking a lot about. If you follow the conversations in the AI space, everyone is talking about “how do we unpack the black box?” and coming up with bottom-up solutions. But now we’ve realized that this is not a bottom-up problem—governance is a critical component. My prediction for the next year is that you’ll see a lot about agency and accountability now that governments are paying attention, emphasizing regulation and accountability mechanisms.
New: You recently debuted an AI fairness tool you helped develop that can help identify biases in an AI model. How does this work exactly?
Chowdhury: This tool both identifies bias and helps users fix it. There is a very specific definition of fairness that ties into legal precedent about disparate impact and protected classes. Drawing on that definition, we built a tool that achieves predictive parity.
It’s not about pushing a button and achieving fairness, it’s about enabling people to make good decisions. So one part of the tool looks at the relationship between a sensitive variable, such as gender, and the nonsensitive variables to show you how, even if you were to drop a particular variable from your model, it may still have indirect causal links to other variables and impact your outcome.
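The Accenture tool itself isn’t public, but the kind of proxy check she describes can be sketched roughly in pandas. The data, column names, and the use of simple correlation below are all assumptions for illustration, not the tool’s actual method:

```python
import pandas as pd

# Hypothetical applicant data; "gender" is the sensitive variable.
df = pd.DataFrame({
    "gender":        [0, 0, 1, 1, 0, 1, 0, 1],
    "income":        [48, 52, 61, 67, 50, 70, 47, 64],  # in $1,000s
    "part_time_job": [1, 1, 0, 0, 1, 0, 1, 0],
    "loan_approved": [0, 0, 1, 1, 0, 1, 0, 1],
})

# Even if "gender" is dropped before training, features that move with it
# (here, income and part_time_job) can still carry its signal into the model.
nonsensitive = df.drop(columns=["gender", "loan_approved"])
proxy_strength = nonsensitive.corrwith(df["gender"]).abs()
print(proxy_strength.sort_values(ascending=False))
```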
The second part of the tool has to do with disparate impact. We actually see whether different subgroups of a sensitive variable are treated differently, and what’s interesting is that we can actually fix a model for that. No statistical manipulation is perfect and it might lead to a decline in global accuracy, but what we find is that global accuracy can be quite high even if you have a non-representative dataset or if the burden of bias is carried by one subgroup. Not surprisingly, this happens with subgroups that tend to be protected classes, because those who are discriminated against will shoulder more of the bias.
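A toy illustration of that point, with made-up labels and predictions rather than anything from the tool itself: global accuracy looks respectable while the smaller subgroup absorbs nearly all of the errors.

```python
import numpy as np

# Hypothetical true outcomes, model predictions, and subgroup membership.
y_true = np.array([1, 0, 1, 0, 1, 0, 1, 0, 1, 0])
y_pred = np.array([1, 0, 1, 0, 1, 0, 1, 1, 0, 1])
group  = np.array(["A"] * 6 + ["B"] * 4)  # group B is under-represented

print("global accuracy:", (y_true == y_pred).mean())  # 0.7
for g in ("A", "B"):
    mask = group == g
    # group A: 1.0, group B: 0.25
    print(f"accuracy for group {g}:", (y_true[mask] == y_pred[mask]).mean())
```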
The third part of the tool creates predictive parity. For classification models, we allow users to equalize the false positive rates, meaning the rate at which people are granted something incorrectly. For example, if we’re talking about granting loans, this would be the proportion of people granted a loan who shouldn’t have been, given their credit. What we illustrate with the tool is what your false positive rate is for different subgroups as well as your global false positive rate. This lets you make corrections to equalize them, and we also show you the impact this has on your costs.
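As a rough sketch of that comparison, again with hypothetical loan data rather than the tool’s internals, the per-subgroup and global false positive rates can be computed directly:

```python
import numpy as np

def false_positive_rate(y_true, y_pred):
    # Share of applicants who should have been denied but were approved.
    should_deny = (y_true == 0)
    return (y_pred[should_deny] == 1).mean()

# Hypothetical outcomes (1 = would repay), approvals (1 = granted), and subgroups.
y_true = np.array([1, 0, 1, 0, 0, 1, 0, 0, 1, 0, 0, 1])
y_pred = np.array([1, 0, 1, 0, 1, 1, 1, 1, 1, 0, 0, 1])
group  = np.array(["A"] * 6 + ["B"] * 6)

print("global FPR:", round(false_positive_rate(y_true, y_pred), 2))  # 0.43
for g in ("A", "B"):
    mask = group == g
    # group A: 0.33, group B: 0.5
    print(f"FPR for group {g}:",
          round(false_positive_rate(y_true[mask], y_pred[mask]), 2))
```

Equalizing those rates, for example by adjusting the decision threshold separately for each subgroup, is one common correction; as Chowdhury notes, the tool then shows the accuracy and cost impact of that choice.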
New: From what I understand, this tool would only be effective on certain kinds of models dealing with certain kinds of data related to sensitive variables like protected classes. However, a broader concern around ethical AI is preventing bad things from happening that aren’t so obvious or explicitly illegal. What would be involved in making this tool more generalizable?
Chowdhury: If we’re talking about what is actually legal or illegal, that’s already covered by people who focus on regulatory or compliance issues. My role is to venture into that gray area of things that are bad but not illegal. You’ve probably seen examples of people searching for pictures of CEOs and seeing pictures of mostly white men. We can see the problem with that, but it’s representative—it might not be fair that it is representative, but it is representative. So as Google decides how to adjust its algorithm to turn up these pictures, there is no right answer about what to correct for. Should it be representative and show what the world actually looks like, or should it try and represent a better version of society that we want to strive for? A lot of people think that this is a self-reinforcing bias, where if children only see pictures of white men as CEOs, only white men will become CEOs.
My title emphasizes responsible AI rather than ethical AI for a reason. It’s not my role, nor do I think it should be my role, to tell people what morality is or what ethics are. I have yet to go to a single company that doesn’t have stated core values around diversity, for example, or respecting individuals, or other values that relate to these kinds of ethical questions. The point of developing the fairness tool was not just to correct imperfections, but also to spark conversations about how to approach these algorithms. There are a lot of situations that raise challenging questions about fairness, and I want to enable data scientists to go up the chain of accountability and figure out what they should be correcting for. While their model accuracy might change, it could make the results more fair. These are pretty heavy decisions to be made by an engineer, so there should be directional guidance coming from the top down. This tool is meant to get people to start focusing on that process.
In terms of generalizability, I would absolutely love to do that. I’m working on a paper now developing a framework to translate academic literature into applied tools for these situations. There are a lot of restrictions on what companies can deploy, but I’m fully on the lookout for interesting new research, partners to work with, and people to create prototypes with. My very ambitious goal for the next year is to develop a new prototype every quarter. We might not reach that goal, but I want to work super hard to get it out there. My next focus is to develop a tool for explainable AI. If I’m a company and I’m not making my own AI solutions in-house—which is the case with most companies—how do I understand potential bias in something that I’ve outsourced? So we’re planning to add to this suite of offerings in this space with more tools and more generalizable products.
New: You’ve spoken before about how challenging it can be to figure out what making AI fair or ethical actually means. This is a major frustration of mine in many tech policy discussions, in which someone will say “AI should be fair,” and everyone nods and then the conversation moves on without digging into what that actually means or how to operationalize it. Since you started at Accenture, have you seen this debate progress in a meaningful way? What can people do to address this issue in a more productive way?
Chowdhury: I think there has been so much evolution in this space. Everyone is in on this together, but everyone, including academics and people who have been grappling with these questions for years, is still grappling with the application part. Additionally, when you take the narrative of fairness outside of the western world, these issues look very, very different.
I think of this as like the Gartner hype cycle of ethical AI. When I first started, we had the Asilomar AI principles and statements like “AI should do no harm,” which is like the most standard statement you could make about any technology, but good luck operationalizing that. What do you mean by harm? Psychological? Physical? And harm to whom? These are actual real questions that didn’t get answered and this is a higher bar than we hold any other technology to. When you start asking those questions, you end up falling into this pit of philosophy and discussing what harm actually means. This can be really interesting and some people work on this full-time, and that’s great, but we need to be able to actually crawl out of this pit and deliver answers.
The fairness tool was inspired by a talk about 21 definitions of fairness, which highlights the ways people have quantitatively tried to define fairness. We settled on the notion of predictive parity because there is legal precedent and it’s something we can take action on. What’s really interesting is that Google’s AI principles came out a week before we announced the tool and they too emphasized predictive parity, entirely independently of us. As we have these conversations, I’ve found it heartening that people are coming to some of the same conclusions. Of course, I don’t think this should always be happening, and I get asked about this a lot from engineering types who say “tell me how to define fairness and I’ll do it.” But wouldn’t that be a sad world to live in if we had only one way to define things like fairness, happiness, and so on?
When I go to companies, I frame it by asking what their core values are as an organization. So for example I’m working with a banking client right now to help them develop responsible AI governance. You would think that they only care about regulatory compliance, but they’re actually quite concerned about issues like the future of work and how jobs will be impacted by AI. This comes directly from their code of business ethics. I have an easy job in some ways because these core principles are already defined for me. I don’t worry about these broad global AI principles and focus instead on what companies actually want to be doing and the impact they want to have.
New: Unlike many data scientists I’ve spoken with, you have a social science background rather than a straight STEM background. How do you think this has influenced your approach to AI?
Chowdhury: I will say that any quantitative social scientist is well equipped to be a data scientist. The only thing that I had to learn when I came to Silicon Valley was how to write scalable code. Social scientists are well equipped to be data storytellers. I guarantee that I got my first data science job because I explained that quantitative social scientists can take data and not only tell you what your output is, but also what it means in context. The term “data storyteller” didn’t exist at the time, but that concept of taking a quantitative output, explaining the context, and understanding what your takeaways and influences might be is, in my mind, what a data scientist does.
I draw on my background almost every day. For example, I was talking with my co-lead the other day about how we needed to brush up on the First Amendment given the concerns about what’s happening now on social media. These issues require us to go back into our brains and think about some really fundamental social science concepts. All this talk about AI governance and governing bodies makes me have to think about statecraft and everything I learned about how governments form and how constitutions were written. On top of that, I get to think about all the stuff I did for my master’s degree about statistics and quantitative methods.
I also happen to have a subfield in political theory. So thinking through the fairness tool and disparate impact, I’m thinking about John Rawls and his concepts of justice and fairness, particularly the veil of ignorance: the idea of making policy as if you did not know what your own educational background, gender, or other circumstances were, which challenges you to make policy as fair as possible.
The field could absolutely use more people who think this way. For example, Google just hired their first AI ethicist who used to be a philosophy instructor. I think this is a really great space for mixed conversation and I think that everyone has something to contribute.