Conf42 Machine Learning 2021 - Online

Responsible AI in Health: From Principles to Practice


Abstract

AI has made amazing technological advances possible; as the field matures, the question for AI practitioners has shifted from “can we do it?” to “should we do it?”. In this talk, Dr. Tempest van Schaik will share her Responsible AI (RAI) journey, from ethical concerns in AI projects, to turning high-level RAI principles into code, and the foundation of an RAI review board that oversees projects for the team.

She will share some of the practical RAI tools and techniques that can be used throughout the AI lifecycle, special RAI considerations for healthcare, and the experts she looks to as she continues in this journey.

Summary

  • Tempest van Schaik is a machine learning engineer at Microsoft. She will talk about responsible AI in health, from principles to practice. When new technology is developed and unleashed, safety and responsibility considerations usually follow.
  • Some of the concerns with AI today are quite familiar from biomedical research. In medical research, privacy is of utmost importance: keeping research data private. Responsible AI is also starting to consider individual harm and group harm and how to avoid both.
  • We have a responsible AI review board on my team because we work across really diverse customer AI projects. Questions that are very helpful to ask when thinking about the ethical implications of an AI project include: how could this technology be misused, and what could go wrong?
  • Health data is extremely sensitive and private, especially genetic data. An important aspect of responsible AI in health is collaborating with domain experts. Race and sex can also introduce a lot of bias into the model. There's also the issue of unequal access to healthcare.
  • The first tool, Datasheets for Datasets, documents who is and is not represented in a data set. Fairlearn helps us answer the question of whether a model treats different users fairly. The last tool, InterpretML, helps us deal with models that need to be explainable.
  • So I wanted to share some resources. I would definitely recommend watching the Coded Bias documentary on Netflix. Kate Crawford's book Atlas of AI just came out. And lastly, my team, the applied machine learning team in Commercial Software Engineering, is hiring in a number of different places.

Transcript

This transcript was autogenerated. To make changes, submit a PR.
Hi. My name is Tempest van Schaik, and I am a machine learning engineer at Microsoft in a team called Commercial Software Engineering. Today I'm going to speak about responsible AI in health, from principles to practice. An overview of what I'll speak about: the historical journey of responsible AI, lessons from biomedical research for responsible AI, some ethical questions to ask about projects, and some tools to help when working on AI projects.

I'll start with the historical journey of responsible AI as I see it. When new technology is developed and unleashed, safety and responsibility considerations usually follow. Take, for example, the development of cars. We take car safety for granted these days, but once there was the first car on the street, and then there was the first car fatality, and then car manufacturers started adding things like windshields and headlights, traffic lights appeared on the street, then seatbelts, and eventually a driver's test, which only came into being in 1955. We take all of that for granted, but it wasn't always there. I feel like responsible AI is in a similar position to the very early days of cars, where the technology has been released and now there's a lot of focus on safety and responsibility. What's been interesting to observe over the last couple of years is that machine learning researchers and practitioners have moved from asking "can we do it?" to "should we do it?". For an example, look at facial recognition. A couple of years ago this was really exciting: it's a genuinely exciting engineering breakthrough to get computers to recognize human faces, something people have been working on for decades. So there was a lot of excitement about whether we could have this breakthrough. But now that we've seen the consequences of releasing facial recognition technology, people, and even companies, are asking: should we be doing this? Should we be using this technology at all?

Now I'll speak about my personal journey with responsible AI. My background is in biomedical research, because I'm a biomedical engineer, and it's been interesting to see that some of the concerns with AI today are actually quite familiar from biomedical research. One of the most important documents written in the medical research ethics world was the Belmont Report in 1979, which established new norms for ethical research. It was written in response to some really harmful research and mistakes that had happened. That was decades ago, so medical research has a couple of decades of a head start on doing things in a more ethical way, which is quite interesting. Some of the standards from biomedical research that are now considered part of responsible AI start with data transparency. When you publish a medical research paper with human subjects, you have to be very clear about who the human subjects were: how many people there were, their race and sex, what level of education they attained, what part of the country they are from, that kind of thing. You have to state that really explicitly when you publish a paper. Being transparent about who was and was not in your data set is now considered part of responsible AI too. Another standard is informed consent; originally that meant informed consent to have your data used in a study.
In terms of responsible AI, we're obviously now asking: have people consented to having their data used by this machine learning algorithm? There's also the concept of group harm and individual harm: a study could harm an individual, or even harm the whole group that individual is from. Likewise, in responsible AI we're starting to consider both of these types of harm and how to avoid them. And then privacy: in medical research, privacy is obviously of utmost importance, keeping that research data private. Likewise, in responsible AI we now know it's our responsibility to keep people's personal data private.

Because I had this background in biomedical research, I think it had primed me, and I had this kind of lens for working in AI. I came across a project to work on, but I had ethical concerns with this particular project. Thankfully, I was supported by my leadership to ask "should we do it?" kinds of questions: should we do this project? The support that I got from my team, and the tools that I discovered for addressing responsible AI issues, have prepared me for recognizing and addressing responsible AI issues on future projects. So I wanted to share some of the learnings that I've gathered along the way, in case they're helpful for you.

Now I'm going to talk a little bit about responsible AI reviews. We have a responsible AI review board on my team because we work across really diverse customer AI projects. They're very complex, they're in different industries, each one is different, and we see a huge variety. So we have this responsible AI review board, and it's a sounding board for people to express different views and explore different ideas, and ultimately provide responsible AI recommendations for our projects. I find that the following questions are very helpful to ask when thinking about the ethical implications of an AI project. First of all, let's remember that AI is not magic. Is this problem actually solvable with AI, or can it be solved in a simpler way? For example, sometimes a SQL query will do the job. So can we just write a SQL query, or do we need an advanced machine learning algorithm that needs a lot of maintenance and carries a lot of responsibility of its own? Similar to this: does this problem have a technical solution, or is this a problem that could be solved with some kind of social intervention? So get that out of the way first: do we need technology at all? Do we need AI at all?

If yes, then it's helpful to think about who the stakeholders in this project are. Think about each different group that this AI impacts, and think especially about whether there are any vulnerable groups. Vulnerable groups might be children, the elderly, immigrant groups, or any groups that have been historically oppressed. Other stakeholders might be regulators, lawmakers, even companies and their brands. Think about all the different stakeholders that could be affected by this technology. Once we've identified a map of different stakeholders, it can be helpful to think about the possible benefits and harms to each stakeholder, and exhaustively list the benefits and harms to each. It's also useful to ask: does the data used by this code contain personally identifiable information? Most of the time when we're training a model, we don't need to know people's names, addresses and telephone numbers, so really we don't need to work with that data.
If for some reason we really do need it, that data needs to be handled in the appropriate ways. It's also useful to ask: does this model impact consequential decisions, like blocking people from getting jobs or loans or health care? In these situations we have to be extremely careful, and often the model needs to be explainable, to explain why that decision was made. A couple more questions to ask are: how could this technology be misused, and what could go wrong? I like to call this black mirror brainstorming; the idea is named after the UK TV series Black Mirror, where they explore how technology goes very, very wrong. Does the model treat different users fairly? The model might be accurate overall, but is there a particular group that it's performing very badly for? How does the training data compare to production data? If we train a language model on tweets, it is not appropriate for doing a medical literature search, because the language is very different, so it's our responsibility to make sure that those align appropriately. Another question is: what is the environmental impact of the solution? There's more and more interest in this topic recently. For example, if we have a huge language model that takes days or weeks to train, that's using a lot of computational power and a lot of electricity, so what's the environmental impact of that? It's worth thinking about. And then, how can we address concerns that arise? Through answering all these questions, what concerns have come up? Do we need to reformulate the problem and rethink it? And are there some risks that we can mitigate? I'm going to discuss some tools that we might use shortly.

I did want to highlight some special considerations for healthcare and responsible AI. The first one that might come to mind is privacy. Health data is extremely sensitive and private, especially genetic data, which tells us so much about a person and even their family members. So it's extremely important to maintain that privacy and not let information leak through the model somehow. On a similar note is security: we need to follow the right regulatory requirements to handle data securely, and sometimes that means doing a lot of training, like HIPAA training, to be compliant with handling that kind of sensitive data. I think an important aspect of responsible AI in health is collaborating with domain experts. This is crucial, I believe, for all machine learning practitioners, and especially so in healthcare. Are there doctors or nurses or even patients who can do a sense check? Do you have access to that domain expert? That's really important if you're working on a healthcare project. Then there is the idea of open versus closed science, where we want to get the balance right. On one hand we have open science: say we have sequenced a genome for cardiovascular research, but this data set could actually be really useful for respiratory research as well. Could we share it with those researchers? That could benefit everyone. So that's open science, and we've got to balance it with keeping people's data private and secure. There's also the issue of unequal access to healthcare, and that's really something we have to keep at the forefront of our minds: people in wealthier parts of the world have better access to healthcare.
Something that I have found since recently moving to the USA is how important it is to consider the bias that's introduced by the cost of healthcare, because healthcare is so expensive in the USA. We have to be really careful with any data about billing, costs and prices, because it can contain a lot of bias due to unequal access to health care, and I'm going to show an example of that shortly. And then lastly, race and sex can be extremely important disease predictors, as we've seen with COVID, which affects different groups differently. However, race and sex can also introduce a lot of bias into a model, because historically these groups have been unfairly treated in healthcare. What I've found works quite well is not just throwing away race and sex and ignoring them completely, because a model can still be biased without these features, as I'll show in the next slide. What works really well is to capture and keep that data so that you can actually audit how fair your model is for those different groups. But you can only do that if you have the data, so my recommendation is that these features are actually really helpful to have. There's a really interesting paper on this by Obermeyer et al. called "Dissecting racial bias in an algorithm used to manage the health of populations". They show how an algorithm that was actually used in production in the USA did not use race as a feature at all, but was still very racially biased. This paper is definitely worth checking out.

Now I'm going to talk about some responsible AI tools that you could find helpful. The first tool helps us answer the question: is there a good representation of all users? It's called Datasheets for Datasets, and I really like the idea because it comes from electronic engineering, where if you buy an electrical component, like a little microcontroller, you always get a data sheet with it, and the data sheet tells you all about that component: how to connect it, what the operating temperatures are, and so on. The idea is that when you build a data set, you should compile a data sheet too, explaining how the data set was collected, who was in the data set, who was not in the data set, and what limitations there are, so that every data set is accompanied by a data sheet. The next tool I would recommend helps us answer the question of whether a model treats different users fairly. It's called Fairlearn, an open source Python package produced by Microsoft. Here I've used it to look at overall accuracy: the first line shows a model with an area under the ROC curve of 92%, which is great, but the tool then helps you break accuracy down by different groups. We can see how well the model performed for female and non-female people in this example, and we could also break down the accuracy for different races and see how accurate it is for each of them. You can use any sensitive feature here to check how your model is performing.
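As a rough illustration of that kind of per-group breakdown, a minimal sketch with Fairlearn's MetricFrame might look like the following. This is not code from the talk: the data, classifier and group labels are synthetic placeholders.

```python
# Minimal sketch: break a model's AUC down by a sensitive feature with Fairlearn.
# The data, classifier and group labels here are illustrative placeholders.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from fairlearn.metrics import MetricFrame

# Stand-in data; in a real project these would be your features, labels,
# and a recorded sensitive attribute such as sex or race (kept for auditing
# even if the model never uses it as a feature).
X, y = make_classification(n_samples=2000, random_state=0)
sensitive = np.random.default_rng(0).choice(["group_a", "group_b"], size=len(y))

X_tr, X_te, y_tr, y_te, s_tr, s_te = train_test_split(X, y, sensitive, random_state=0)

model = RandomForestClassifier(random_state=0).fit(X_tr, y_tr)
scores = model.predict_proba(X_te)[:, 1]

# Overall AUC, plus the same metric broken down per group, so a group the
# model underperforms for shows up even when the overall number looks good.
mf = MetricFrame(
    metrics={"roc_auc": roc_auc_score},
    y_true=y_te,
    y_pred=scores,
    sensitive_features=s_te,
)
print(mf.overall)
print(mf.by_group)
```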
The last tool I wanted to mention helps us deal with models that need to be explainable. This tool is called InterpretML. It's also an open source Python package developed by Microsoft, and it's a whole suite of functionality and visualizations, plus a model called the explainable boosting machine, which prioritizes explainability without actually sacrificing accuracy. Here's an example of the explainable boosting machine applied to the adult income data set, where we're trying to predict who earns more than $50,000 a year. You can see that it gives a weighting to each of the different features to say how important that feature was for this prediction. So for this person we can see why the model decided whether or not they earn more than $50,000: in orange we can see what weighed in favour of that decision, like the number of years of education, and in blue we can see what weighed against it, for example marital status and age. We can see what positively and negatively affects the decision about whether someone earns more than $50,000, which makes the model very transparent and explainable.
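A sketch of that workflow might look like the code below. Again, this is not code from the talk; loading the adult income data through scikit-learn's OpenML fetcher, and dropping rows with missing values, are assumptions made for a self-contained example.

```python
# Illustrative sketch: train an Explainable Boosting Machine on the adult
# income data set and inspect local (per-prediction) explanations.
from sklearn.datasets import fetch_openml
from sklearn.model_selection import train_test_split
from interpret.glassbox import ExplainableBoostingClassifier
from interpret import show

# Adult census income data: predict whether income exceeds $50K per year.
adult = fetch_openml("adult", version=2, as_frame=True)
X, y = adult.data, adult.target
mask = X.notna().all(axis=1)   # drop rows with missing values, for simplicity
X, y = X[mask], y[mask]

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

ebm = ExplainableBoostingClassifier()
ebm.fit(X_tr, y_tr)

# Local explanations: for each prediction, which features weighed for or
# against the ">$50K" decision (e.g. years of education vs. marital status).
local_explanation = ebm.explain_local(X_te[:5], y_te[:5])
show(local_explanation)        # interactive visualization in a notebook
```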
So I wanted to share some resources. I am a machine learning practitioner, and I look to machine learning researchers who are at the forefront of thinking about this topic. I like to read and follow Cathy O'Neil, Hanna Wallach, Timnit Gebru, Rachel Thomas, Deborah Raji, Kate Crawford and Arvind Narayanan, just to name a few people. I would definitely recommend watching the Coded Bias documentary on Netflix; it's a great primer if you're new to this idea of responsible AI. Kate Crawford's book Atlas of AI just came out. I'm also looking forward to reading the book Redesigning AI by Daron Acemoglu, which is a collection of essays from some of these people. Here is a link to the Obermeyer racial bias article, and also to the different Microsoft responsible AI tools. Another resource is the GitHub repo where my team shares our best practices for engineering and machine learning fundamentals, including responsible AI; we've shared that at this link. And then lastly, my team, the applied machine learning team in Commercial Software Engineering, is hiring in a number of different places. You can find our open job roles at this link. Thank you very much; it's very easy to find me on Twitter and LinkedIn, and thanks very much to Conf42 for having me.

Tempest van Schaik

Biomedical Engineer @ Microsoft



