How Can We Ensure Big Data Doesn’t Perpetuate Discrimination?

The concepts of big data and algorithms often have a mythological feel. Add some data, filter it through a mathematical formula, and voila: insight from the algorithmic gods. This detachment from the human origin of all data is reminiscent of the famous line by infomercial legend Ron Popeil, pitching the hands-off cooking process of his Showtime Rotisserie: “Set it, and forget it!”

That philosophy may not work so well for algorithms. A slew of data scientists and other experts have recently challenged this blind optimism toward big data, in which data goes in and an unbiased, objective insight comes out. Their criticism points to a potential watershed moment: government and society at large may start demanding greater oversight unless businesses that use big data get in front of the issue and push for change from within.

“You are not going to get truth just because you are optimizing to engagement,” said data scientist Cathy O’Neil, author of Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy. O’Neil was speaking to an audience at Grand Central Tech, a New York-based startup accelerator and entrepreneurial hub. Offering a blunt assessment of the current landscape of how algorithms are created and deployed, she sounded the alarm that the status quo is ripe for exploitation and the perpetuation of discrimination. Just because it relies on math doesn’t mean we shouldn’t question the process and results.

“You’re making a decision about people that matters to them,” said O’Neil, citing examples of teachers who are assessed, and sometimes fired, based on the results of algorithms that lack transparency. O’Neil approached the topic with first-hand experience, having worked as a data scientist predicting people’s purchases. Seeing how the proverbial sausage is made, however, left her disillusioned and wanting to act as an agent for change. Looking out at an audience filled with the very people who are creating and using algorithms for their startups, she implored everyone to consider the consequences of flaws in the data or the equation: “You have to think about the cost of failure.”

While mathematical failure can seem abstract, the results are quite profound, potentially determining whether someone gets a job, gets a loan, or goes to jail. An algorithm uses historical data to make a prediction about the future, so any bias in that history can be carried forward and reinforced.
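To make that mechanism concrete, here is a minimal sketch in Python. It uses invented numbers and a deliberately naive "model" that scores applicants by the historical hire rate of their group. The group labels, the records, and the function names are all hypothetical, and real systems are far more complex, but the feedback loop is the same in kind.

```python
from collections import defaultdict

# Hypothetical historical hiring records: (group, was_hired).
# The numbers are invented for illustration only.
HISTORY = (
    [("A", True)] * 60 + [("A", False)] * 40
    + [("B", True)] * 20 + [("B", False)] * 80
)

def fit_hire_rates(records):
    """Learn the historical hire rate for each group."""
    hired = defaultdict(int)
    total = defaultdict(int)
    for group, was_hired in records:
        total[group] += 1
        hired[group] += int(was_hired)
    return {group: hired[group] / total[group] for group in total}

def predict(rates, group, threshold=0.5):
    """Recommend an applicant only if their group's past hire rate clears the threshold."""
    return rates[group] >= threshold

rates = fit_hire_rates(HISTORY)
recommend_a = predict(rates, "A")
recommend_b = predict(rates, "B")
```

With these made-up records, a past gap of 60 percent versus 20 percent becomes a rule that recommends every applicant from group A and none from group B. The model has learned nothing about the applicants themselves; it has only replayed the history it was given.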

O’Neil points out that despite her critique, she is pro-algorithm. A glaring issue, she mentions, is that data scientists are making decisions that impact individuals and society at large without having received any real training in ethics. “Data scientists who have never been trained in ethics are making ethical decisions without really thinking about it,” said O’Neil. Unlike the fields of law and medicine, which heavily incorporate ethical training, data science is a relatively new area of study. As O’Neil mentions to the audience, there is not yet a Society of Data Scientists to create a uniform code of conduct.

I sat down with O’Neil outside of the Grand Central Tech event to better understand how we can ensure that algorithms are not perpetuating bias.

“I wouldn’t call myself so much a cynic as I am a skeptic,” said O’Neil. The seeds of her skepticism were sown during her experience as a financial quant during the financial crisis of 2008. Working on the inside, O’Neil witnessed the blind faith that people put in math. “They had been duped—they had been duped because people trust math. And I didn’t want to be part of that.”

O’Neil sees the same problematic pattern of unquestioning faith in our current attitude toward big data. The consequences of being a “winner” or “loser” in the algorithm’s eyes behoove us to question its legitimacy and societal effect. “Everywhere where there used to be a discussion, a process, a human process that was complicated and sometimes political, we’re seeing that replaced with a scoring algorithm,” said O’Neil, mentioning sentencing, credit, and hiring.

There is a practical reason for using an algorithm in these processes, of course. For example, as online recruitment has expanded, so has the ability for lots of people to apply to an open position. This, in turn, has made the reliance on humans to filter the first batch of resumes all but impossible. Using an algorithm to filter applicants is often now a necessity. This raises the question, said O’Neil, “Who gets filtered out?”

This was the problem underlying a current lawsuit against the Houston School District, as teachers who lost their jobs because of an evaluation algorithm were unable to challenge the evaluation. Even the school district, the seventh largest in the nation, didn’t understand how the calculations were made. The algorithm that the school district used, the Educational Value Added Assessment System, was created by a private company that treats the algorithm as a trade secret.

It is situations like this that worry O’Neil. An unquestioning faith in the truthfulness of algorithmic outcomes is a recipe for disaster, in which formulas that cannot be critiqued create a class of winners and a class of losers. Unlike the financial crisis, where everyone noticed when the process failed, O’Neil states that the new world of data science has been failing with little notice. It was this realization that served as the impetus for O’Neil to write Weapons of Math Destruction.

“I realized that, for myself, I would never be the loser in the system that I was building. I would always be the winner. That the people building the system would always win, and the failures would be invisible to us unless we looked for them, and there was nobody looking for them.”

Even the process of being measured by a machine may not apply equally. O’Neil explains that personality tests are often a requirement of job applications. But someone like her (O’Neil received a PhD in mathematics from Harvard University) will never be subjected to a personality test in order to get a job. According to O’Neil, we are creating a new digital class system that we can’t even see. “It is not just propagating our imperfect world, it is exacerbating inequality,” she said. “It is the opposite of the American Dream.”

“We need to stop with the blind faith. We have this assumption, which is completely incorrect, that when we optimize something to efficiency, or when we optimize something to profit, or if we optimize something to political convenience, then we get for free: fairness or legality or non-discrimination. It is simply not true. When we optimize for A, we get A. We don’t get B.”

As one example, she mentions a facial recognition system that may be optimized using white faces and therefore does not work as well on black faces. Another is the use of Facebook data for credit scoring: the data gleaned from one’s Facebook profile effectively reveals race, class, and other demographic information, and basing credit decisions on it runs counter to the Equal Credit Opportunity Act.

“If businesses don’t get up in front of this, then they’re going to have to do whatever the government comes up with,” said O’Neil, who stated that government will slowly start coming to grips with the questionable legality of many algorithms. The big missing piece, which O’Neil foresees as changing, is the ability to audit algorithms. In order to ensure that we do not perpetuate discrimination and act in an illegal fashion, we need to open up the black box.

“I’m asking for an audit to make sure what you’re doing is legal,” said O’Neil. “Transparency is a very different question. I don’t care about your source code, I just want to make sure it’s not illegal.”
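One way to picture such an audit, sketched in Python under stated assumptions: the auditor treats the scoring system as a black box, feeds it applicants, and compares approval rates across groups without ever reading the source code. The `audit_selection_rates` helper, the toy `black_box` scorer, and the sample applicants are all hypothetical. The 0.8 cutoff borrows the "four-fifths" rule of thumb from US employment guidance; it is an illustrative threshold, not a test O’Neil proposed and not a legal standard for credit.

```python
from collections import defaultdict

def audit_selection_rates(decide, applicants, min_ratio=0.8):
    """Black-box audit: compare approval rates per group.

    decide     -- the system under audit, used only through its inputs and outputs
    applicants -- iterable of (group, features) pairs
    min_ratio  -- assumed cutoff (the "four-fifths" rule of thumb)
    """
    approved = defaultdict(int)
    total = defaultdict(int)
    for group, features in applicants:
        total[group] += 1
        approved[group] += int(bool(decide(features)))
    rates = {g: approved[g] / total[g] for g in total}
    best = max(rates.values())
    # Ratio of each group's rate to the highest rate; flag anything under the cutoff.
    ratios = {g: (r / best if best > 0 else 1.0) for g, r in rates.items()}
    flagged = sorted(g for g, ratio in ratios.items() if ratio < min_ratio)
    return rates, ratios, flagged

# Hypothetical scorer standing in for a proprietary model.
def black_box(features):
    return features["score"] >= 650

# Invented applicants, for illustration only.
applicants = (
    [("A", {"score": 700})] * 8 + [("A", {"score": 600})] * 2
    + [("B", {"score": 700})] * 4 + [("B", {"score": 600})] * 6
)

rates, ratios, flagged = audit_selection_rates(black_box, applicants)
```

With these invented inputs, group B is approved at half the rate of group A and is flagged for review. A flag is a prompt for a closer look, not proof that anything illegal has happened.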

Sara Wachter-Boettcher makes a similar appeal for transparency in her new book, Technically Wrong: Sexist Apps, Biased Algorithms, and Other Threats of Toxic Tech. A web consultant in Philadelphia, Wachter-Boettcher comes at the issue from a design perspective, considering how bias may be baked into the product. As an example of baked-in bias, she points to the recent firestorm of criticism received by FaceApp, a selfie-filter app that had a “hotness” filter. As TechCrunch and other media outlets reported, FaceApp had correlated hotness with whiteness, thereby making selfies more “white” in order to increase their apparent attractiveness.

“Bias in, bias out,” said Wachter-Boettcher, who, like O’Neil, emphasizes that algorithms are created by fallible humans. She believes that criticism of the current system is healthy and necessary, given the impact that algorithms have on lives. “It is easy to fall into the status quo where you think tech is neutral.”

In order to truly ensure that big data doesn’t perpetuate bias, we may need to consider what role we want technology to play. “Is it a mirror? Or should it be shifting us towards a better future?” said Wachter-Boettcher. If we view it as a mirror, the consistency offered through an algorithm may re-embed and strengthen bias.

“We need a lot more discussion about this,” said Wachter-Boettcher, noting that Silicon Valley has faced criticism for a while but has so far avoided major regulation. Wachter-Boettcher finds that the best path forward may be self-regulation, but that companies so far have been unnecessarily opaque about their algorithms. “I’m starting to see some fear of regulation, which might change behavior,” she said. “People should have access to what data is being used.”

As a recent report by the AI Now Institute makes clear, a cohesive ethical code for artificial intelligence has so far been lacking. The best next step, then, may be the creation of a Society of Data Scientists.
