Just like humans, algorithms can develop bias and make skewed decisions. What are these biases and how do they impact decision-making?
An algorithmic bias in the making
If we took a hard look at every model ever built for classifying who is the optimal candidate for:
- A credit loan
- A job promotion
- A scholarship, or
- Any other opportunity,
would we see a pattern in certain groups of people being granted these opportunities over others? Are our algorithms and formulas biased?
Understanding the problem
Would we see these models repeatedly decide who belongs to the “have” and “have not” groups? Further, do these models truly pick the optimal candidate, or do they pick according to what someone personally thinks the optimal candidate looks like?
It has been argued that algorithms are not free of subjectivity. In fact, just like the humans who develop them, algorithms can carry inherent bias. Some examples of this include rejecting credit loan applications from African Americans and Latin Americans.
Another example is advertising high-paying jobs to men more often than to women. Incidents like these have led us to question how companies and governments construct the models that influence decisions.
Research groups like AI Now recently launched an initiative to fight algorithmic bias, which is bringing these issues to light. It’s crucial that we as data scientists keep our algorithms in check, so that we avoid developing yet another tool that is used to discriminate against people.
So how can we keep our algorithms in check?
In recent years, researchers have come up with ways to detect if a model is biased in its decisions about people. A 2016 paper called Equality of Opportunity in Supervised Learning proposes a framework.
This framework uses “equalized odds” and “equal opportunity” as criteria for assessing a model’s fairness when classifying people. These criteria allow a model to use features to predict an outcome or class (such as “high credit risk applicant”).
Importantly, they prohibit abusing a particular attribute of a person (such as race) to do this. The model must be equally accurate in all demographics, so it is penalized if it performs well only on the majority group. In practice, the predicted outcome must have equal true positive/negative rates and false positive/negative rates across all demographics.
The framework is applied as a post-learning step, so it doesn’t require modifying the algorithm or model itself. It assesses whether the results from a model are skewed towards a group of people.
For example, a flawed model is one that makes it harder for African Americans who do pay back their loans to be approved, while making it easier for Caucasians who don’t pay back their loans to be approved. The framework would flag this kind of model as unfair, because it would not produce equal false positive/negative rates for African Americans and Caucasians.
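The per-group check described above can be sketched in a few lines of Python. This is only an illustration, not the paper's implementation: the function names `group_rates` and `equalized_odds_gaps`, the group labels "a" and "b", and the toy data are all made up here. A label of 1 is assumed to mean "repaid the loan" and a prediction of 1 to mean "approved".

```python
from collections import defaultdict

def group_rates(y_true, y_pred, groups):
    """Return {group: (true_positive_rate, false_positive_rate)}."""
    counts = defaultdict(lambda: {"tp": 0, "fn": 0, "fp": 0, "tn": 0})
    for actual, predicted, group in zip(y_true, y_pred, groups):
        if actual == 1:
            key = "tp" if predicted == 1 else "fn"
        else:
            key = "fp" if predicted == 1 else "tn"
        counts[group][key] += 1

    rates = {}
    for group, c in counts.items():
        positives = c["tp"] + c["fn"]
        negatives = c["fp"] + c["tn"]
        tpr = c["tp"] / positives if positives else float("nan")
        fpr = c["fp"] / negatives if negatives else float("nan")
        rates[group] = (tpr, fpr)
    return rates

def equalized_odds_gaps(rates):
    """Largest between-group difference in TPR and in FPR."""
    tprs = [tpr for tpr, _ in rates.values()]
    fprs = [fpr for _, fpr in rates.values()]
    return max(tprs) - min(tprs), max(fprs) - min(fprs)

# Hypothetical toy data: group "a" is approved too easily,
# while repayers in group "b" are sometimes rejected.
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]

rates = group_rates(y_true, y_pred, groups)
tpr_gap, fpr_gap = equalized_odds_gaps(rates)
print(rates, tpr_gap, fpr_gap)
```

A model that satisfied equalized odds would show both gaps close to zero. Since the false negative rate is one minus the true positive rate, and the true negative rate is one minus the false positive rate, comparing these two rates per group covers all four.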
The framework also avoids the loss of utility that comes with demographic parity, which requires the predicted outcome to be independent of a particular sensitive attribute. Under the framework, the predicted outcome is allowed to depend on that attribute, but only through the actual outcome. This prevents the attribute from being used as a proxy for the actual outcome while avoiding loss of utility.
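To make the difference from demographic parity concrete, here is a small sketch on made-up data in which the two hypothetical groups "a" and "b" repay at different rates. The helper names `selection_rate` and `conditional_rate` are assumptions for this example. The predictor is deliberately perfect (it always equals the actual outcome), which is the case where demographic parity costs the most utility.

```python
def selection_rate(y_pred, groups, group):
    """Share of members of `group` who receive a positive prediction."""
    preds = [p for p, g in zip(y_pred, groups) if g == group]
    return sum(preds) / len(preds)

def conditional_rate(y_true, y_pred, groups, group, actual):
    """Share of positive predictions among members of `group`
    whose actual outcome equals `actual` (1 gives TPR, 0 gives FPR)."""
    preds = [p for t, p, g in zip(y_true, y_pred, groups)
             if g == group and t == actual]
    return sum(preds) / len(preds)

# Hypothetical data: 3 of 4 people in group "a" repay, 1 of 4 in group "b".
y_true = [1, 1, 1, 0, 1, 0, 0, 0]
groups = ["a"] * 4 + ["b"] * 4
y_pred = list(y_true)  # a perfect predictor, for illustration only

parity_gap = abs(selection_rate(y_pred, groups, "a")
                 - selection_rate(y_pred, groups, "b"))
tpr_gap = abs(conditional_rate(y_true, y_pred, groups, "a", 1)
              - conditional_rate(y_true, y_pred, groups, "b", 1))
fpr_gap = abs(conditional_rate(y_true, y_pred, groups, "a", 0)
              - conditional_rate(y_true, y_pred, groups, "b", 0))
print(parity_gap, tpr_gap, fpr_gap)
```

In this toy case the perfect predictor fails demographic parity, because the groups are approved at different rates, yet it satisfies equalized odds, because within each actual outcome both groups are treated identically.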
Predictor variables and skewed data
Another approach to detecting algorithmic bias is to test how different predictor variables or attributes might skew the predicted outcome. A 2017 paper called Counterfactual Fairness shows how different variables influenced the results of the 2014 New York City police stop-and-frisk initiative.
The data showed that police officers mostly stopped and frisked African Americans and Hispanics, even though most of those people were innocent or not as suspicious as predicted.
Actual incidents of crime were in fact similar across all races. When given all predictor variables, including the race attribute, the model learned to correlate race with the criminality outcome. When the researchers instead used only variables related to a person’s criminality, they were able to spot criminals more accurately than if they had built a model highly dependent on race and appearance.
First, this research shows that relying on race as a predictor leads to a skewed outcome. Second, it also shows how ineffective the police would be by allowing such bias to be at the core of their decisions.
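One rough way to see how much a model leans on a sensitive attribute is to swap that attribute and count how often the prediction changes. The sketch below is not the method of the Counterfactual Fairness paper, which relies on a causal model. It is a much simpler sensitivity check that cannot detect proxy variables. The records, the field `evidence_score`, the group labels "x" and "y", and both rule-based predictors are hypothetical.

```python
def flip_rate(predict, records, attribute, values):
    """Fraction of records whose prediction changes when `attribute`
    is replaced by any other value in `values`."""
    changed = 0
    for record in records:
        baseline = predict(record)
        for value in values:
            if value == record[attribute]:
                continue
            altered = {**record, attribute: value}
            if predict(altered) != baseline:
                changed += 1
                break  # count each record at most once
    return changed / len(records)

# Two hypothetical rule-based predictors (1 = "flag for a stop").
def race_dependent(record):
    return int(record["evidence_score"] >= 2 or record["race"] == "x")

def evidence_only(record):
    return int(record["evidence_score"] >= 2)

# Made-up records, for illustration only.
records = [
    {"race": "x", "evidence_score": 0},
    {"race": "y", "evidence_score": 0},
    {"race": "x", "evidence_score": 3},
    {"race": "y", "evidence_score": 1},
]

biased_rate = flip_rate(race_dependent, records, "race", ["x", "y"])
neutral_rate = flip_rate(evidence_only, records, "race", ["x", "y"])
print(biased_rate, neutral_rate)
```

A flip rate of zero only shows that the model ignores the attribute directly. Other features may still encode it, which is the gap the causal approach in the paper is designed to address.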
Visualizations of the predicted versus the actual data show how some locations with a high number of arrests could be completely missed if the police were to depend on race. How we construct our models and the variables we use can truly affect people’s opportunities, livelihood, and overall well-being. Therefore, model building must be handled ethically and responsibly.
As data scientists, our philosophy should be built on the pursuit of truth, not the manipulation of models to find the most convenient or profitable results at all costs, even at the cost of our ethics.
It is important that we include bias assessments as part of the process, so we can be more confident that our models are designed to better our understanding of people and make smarter decisions, not dumb and discriminatory decisions.