There’s a lot of talk about how AI bias may skew data models. In our continuing thought leadership series, AI Innovators, Brighterion Head Sudhir Jha discusses this hot-button issue with Karen Webster, CEO of PYMNTS.com.

Bias. Humans all exhibit it, even when we try not to. AI bias is no different. Even when we create algorithms to neutralize discrimination, we must be careful with the data used and how the platform’s deep learning is trained. It’s a matter of finesse, analysis and openness to change.

Sudhir Jha, Mastercard Senior Vice President and Head of Brighterion, discussed this complex issue with Karen Webster, CEO of PYMNTS.com, to bring clarity to the ongoing media coverage. We’ve posted a video of the interview. But first, let’s talk about what AI bias is – and what it isn’t.

If you’ve applied for a job in the last decade, you’ve probably learned that unless your resume used the keywords the HR software was scanning for, you likely didn’t get an interview. One major employer discovered bias in its own screening software.

Removing hiring bias

While developing a system to improve internal recruitment, Amazon was puzzled to find that the system was dismissing female candidates. The problem traced back to the training data: the system was recommending candidates who resembled those hired in the past. Yet when engineers identified the issue and removed candidates’ genders from the data, results only marginally improved.

Amazon’s engineers dug further and discovered that gendered language was at the root. Verbs like “executed” and “captured” appeared far more often on men’s resumes; women tended to use different language. Hiring became more gender-diverse once that bias was removed.

How to set your criteria to achieve a bias-free model

An article in MIT Technology Review argues that bias starts long before the data is collected, and recommends keeping three things in mind when building a model:

  1. Framing the problem. Be very clear about the problem you’re trying to solve. If you are a credit card company, are you trying to identify creditworthy customers or to maximize profitability? If the algorithm discovered that giving out subprime loans was an effective way to maximize profit, you could wind up engaging in predatory behavior even if that wasn’t your company’s intention.
  2. Collecting the data. Ensure the data you collect is a true representation of reality. For example, if a deep learning algorithm is fed mostly photos of light-skinned faces, it will have a difficult time recognizing people with dark skin. It’s what we saw above with Amazon, where the algorithm learned from past hiring data that men were more successful.
  3. Preparing the data. This is where the finesse comes in. With your business goal in mind, you must determine which attributes will deliver the results you want. MIT cites age, income, and number of paid-off loans as potential attributes of creditworthiness. Attributes such as the applicant’s gender, race, and home address could give very different results, as the sketch after this list illustrates.
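
To make that third step concrete, here is a minimal sketch of attribute selection in Python, assuming a hypothetical creditworthiness dataset; the column names, values, and model choice are illustrative only, not a description of any production system.

```python
# Minimal sketch: "preparing the data" with protected attributes excluded.
# All data and column names below are hypothetical, for illustration only.
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

applicants = pd.DataFrame({
    "age":            [25, 47, 35, 52, 29, 61],
    "income":         [38_000, 92_000, 54_000, 120_000, 41_000, 75_000],
    "paid_off_loans": [0, 4, 2, 6, 1, 3],
    "gender":         ["F", "M", "F", "M", "F", "M"],  # protected attribute
    "repaid":         [1, 1, 0, 1, 0, 1],              # label: repaid on time
})

# Keep attributes tied to the business goal (creditworthiness) and leave
# protected attributes such as gender out of the feature set.
features = ["age", "income", "paid_off_loans"]
X_train, X_test, y_train, y_test = train_test_split(
    applicants[features], applicants["repaid"],
    test_size=0.33, random_state=0, stratify=applicants["repaid"],
)

model = LogisticRegression().fit(X_train, y_train)
print("Held-out accuracy:", model.score(X_test, y_test))
```

Note that simply dropping a protected column is not enough on its own: other attributes, such as home address or the gendered resume language Amazon found, can act as proxies for it. That is why the model still needs to be watched after it is built.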

How to monitor for bias

After the AI model is built, it’s time to get it ready for production.

“It’s how you train the model, both in terms of what data set you’re using and the techniques you’re using that can definitely create the bias,” believes Sudhir Jha.

“Once the model is built, you start trusting it, not realizing it has bias. As it continues to self-learn, the bias can get reinforced over and over again,” Sudhir says. “But I would say that it’s not that different from even a human training, right?”

He offers an example from policing. If police recruits are trained to stop certain people at traffic lights, it shapes whom they come to regard as “suspicious.” “So, any training that is done has to use unbiased data and unbiased techniques. Once you can do that, AI models will not be biased.”

A bigger issue, Sudhir believes, is that models are extremely complicated, which makes bias hard to detect. He says a lot of work is going on to make AI models more interpretable and explainable, and therefore easier to audit for inherent bias.
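
Sudhir doesn’t spell out a specific monitoring technique in the interview, but one common way to put this into practice is to track a simple fairness metric, such as the gap in approval rates between groups, on the decisions a deployed model makes. The sketch below assumes a hypothetical decision log; the group labels and tolerance are illustrative only.

```python
# Minimal sketch: monitoring a deployed model for bias via approval-rate gaps.
# The decision log, group labels, and tolerance below are hypothetical.
import pandas as pd

decisions = pd.DataFrame({
    "group":    ["A", "A", "A", "B", "B", "B", "B", "A"],
    "approved": [1,   1,   0,   1,   0,   0,   0,   1],
})

# Demographic parity check: compare approval rates across groups.
rates = decisions.groupby("group")["approved"].mean()
gap = rates.max() - rates.min()
print(rates.to_string())
print(f"Approval-rate gap: {gap:.2f}")

# Flag the model for review if the gap exceeds an agreed tolerance.
TOLERANCE = 0.20  # illustrative policy value, not a recommendation
if gap > TOLERANCE:
    print("Potential bias detected: flag the model for human review and retraining.")
```

Because a self-learning model can reinforce its own bias over time, as Sudhir warns, a check like this is most useful when it runs continuously rather than once at deployment.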

Look at your organization, then look at your model

Sudhir notes that biased data doesn’t come out of a vacuum – it comes from people in organizations, even if subconsciously. Addressing those biases becomes increasingly important as machines become more autonomous in decision-making.

Sudhir cites the data science toolset Brighterion uses. It doesn’t just focus on modeling, but also on data ingestion, training and retraining. “Each part of that toolset has to ensure they are embedding those checks and balances.”
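
As a rough illustration of that idea, the sketch below wires placeholder checks into each stage of a pipeline. The stage names and checks are hypothetical stand-ins for the real validations such a toolset would run, not a description of Brighterion’s implementation.

```python
# Minimal sketch: a pipeline where every stage is gated by its own check.
# Stages and checks are placeholders, not any vendor's actual toolset.
from dataclasses import dataclass
from typing import Callable

@dataclass
class Stage:
    name: str
    run: Callable[[dict], dict]     # transforms the pipeline state
    check: Callable[[dict], bool]   # bias/quality check gating the stage

def ingest(state):  state["rows"] = 10_000; return state
def train(state):   state["model"] = "v1";  return state
def retrain(state): state["model"] = "v2";  return state

# Placeholder checks standing in for real validations
# (e.g., representative sampling, label balance, drift between retrainings).
pipeline = [
    Stage("data ingestion", ingest,  lambda s: s["rows"] > 0),
    Stage("training",       train,   lambda s: "model" in s),
    Stage("retraining",     retrain, lambda s: s["model"] == "v2"),
]

state: dict = {}
for stage in pipeline:
    state = stage.run(state)
    if not stage.check(state):
        raise RuntimeError(f"Check failed at {stage.name}; halting pipeline.")
    print(f"{stage.name}: check passed")
```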

Listen to Sudhir’s interviews with Karen Webster, CEO of PYMNTS.com, in our ongoing video series, The AI Innovators.