“The data that most people use to train their Artificial Intelligence models comes with a built-in bias” — HackTalks

John (Juan) Tubert
4 min read · Jul 8, 2020

Originally published by HackTalks: https://hacktown.com.br/blog/blog/os-dados-que-a-maioria-das-pessoas-usa-para-treinar-seus-modelos-de-inteligencia-artificial-ja-vem-com-um-vies-embutido/

We talked to John Tubert, Head of Technology at R/GA NY, about the evolution of Artificial Intelligence, the troubling biases it can carry, and how to avoid them. Check it out.

Can you tell us about yourself and your career?
I am originally from Argentina, but have been living in the US for the past 30 years. I now live in NYC with my wife and two daughters.
For the past 20 years I’ve been working in the technology area, building and managing websites, mobile apps, social media experiences as well as other types of customer experiences.

I now lead the technology department for R/GA NY. Prior to this I led the technology team of our San Francisco office for 5 years, and also helped build the Buenos Aires R/GA office back in 2010.

How has AI been present in your work?
I have been personally interested in AI for a long time, and at work it's no different. We have used AI for internal projects as well as client work, and earlier this year we held our first global AI summit to make everyone at the company smarter about AI and ML.

How sophisticated is AI becoming? And what’s the most impressive AI application you’ve seen lately?
AI is becoming more and more sophisticated. With cheaper and more powerful computers, cloud storage, and a growing number of companies making model building easier and faster (Google AutoML, for example), more and more companies and individuals are building applications powered by AI.

There are so many impressive uses of AI that it's hard to pick just one. Personally, I love the ones that make things faster and easier, like Gmail's auto-complete, but there are also impressive uses of AI in fraud detection at companies like Amex, which can save them millions of dollars, and, even more impressive, AI that helps save lives by detecting signs of lung cancer in x-rays.

Why do many types of AI have inherent biases? How do you see that?
The answer is simple: the data most people use to train their AI models has bias built into it. Even large tech companies use biased data to train their models.
Many companies also do not add "bias testing" to their development cycle, or they use historical data that already carries biases. For example, imagine I build an AI system to find the best professors for a college, but train it on historical data from the past 100 years. If most of the professors over those 100 years were male, the algorithm might only qualify male applicants as "good" (a rough sketch of this effect follows below).
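To make the professor example concrete, here is a minimal, hypothetical sketch (the dataset, feature names, and model choice are all invented for illustration, not taken from the interview) of how a model trained on biased historical labels learns gender as a signal:

```python
# Hypothetical sketch: a model trained on 100 years of mostly-male
# "good professor" records learns gender as a predictor of "good".
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 5000

# Synthetic candidate features: gender flag, publications, teaching score.
is_male = rng.integers(0, 2, n)
publications = rng.poisson(5, n)
teaching = rng.normal(0, 1, n)

# Historical label: candidates were only ever labeled "good" when male,
# regardless of merit -- this is where the bias enters the training data.
good = ((publications + teaching > 4) & (is_male == 1)).astype(int)

X = np.column_stack([is_male, publications, teaching])
model = LogisticRegression().fit(X, good)

# Two candidates with identical qualifications who differ only by gender:
male_candidate = [[1, 8, 1.0]]
female_candidate = [[0, 8, 1.0]]
print("P(good | male)  =", model.predict_proba(male_candidate)[0, 1])
print("P(good | female)=", model.predict_proba(female_candidate)[0, 1])
# The female candidate scores far lower despite identical qualifications.
```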

Can you share an example or two of how bias can creep in and show up in an algorithm?
There are so many examples of this across different industries. A few that affect me personally are the following, which I talked about in my SXSW talk last year:
- Cameras powered by AI fail to detect darker skin tones. In some cases they mislabel people, and in others they fail to detect them altogether. So small things like soap dispensers might not work, and more important things like self-driving cars might not identify a person with darker skin as a person.
- COMPAS, an AI tool built by the company Northpointe, helps determine whether a convicted criminal is likely to reoffend. The algorithm proved to be biased against African Americans.
- Amazon also created an AI tool to rate job candidates, and they soon realized it was discriminating against women.

As far as you can see, what are the impacts that AI biases can have on our lives?
As mentioned in the previous answer, the impacts are huge. If autonomous cars cannot see people with darker skin tones well enough, how are we going to have them on the street? Can governments use AI tools to help find criminals if those tools are biased?

What measures can be taken to reduce bias? Are there any steps the technology industry more broadly can take to help address the problem?
To reduce bias, companies can hire more diverse teams, who will find the bias earlier in the process, and they can add more diverse data or use existing data sets that are more representative. Companies can also make their training data public so that, collectively, we can find the biases. A simple example of what such "bias testing" might look like is sketched below.
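As an illustration of the kind of bias testing mentioned above, here is a minimal, hypothetical sketch (the function name, data, and 0.8 rule of thumb are illustrative, not taken from any specific toolkit) that compares positive-outcome rates across groups:

```python
# Hedged sketch of a simple "bias test" that could be added to a development
# cycle: compare positive-outcome rates across groups (all names illustrative).
import numpy as np

def disparate_impact(predictions, group):
    """Ratio of positive rates: unprivileged (group == 0) vs privileged (group == 1).
    Values well below 1.0 (commonly below 0.8) suggest the model disadvantages group 0."""
    predictions = np.asarray(predictions)
    group = np.asarray(group)
    rate_unpriv = predictions[group == 0].mean()
    rate_priv = predictions[group == 1].mean()
    return rate_unpriv / rate_priv

# Example: a model outputs "hire" (1) or "reject" (0) for 10 candidates.
preds = [1, 1, 1, 1, 0, 1, 0, 0, 0, 1]
sex = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]  # 1 = male, 0 = female (synthetic)
print("Disparate impact:", disparate_impact(preds, sex))  # 0.4 / 0.8 = 0.5
```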

The good news is that companies like IBM, Google and Amazon are already taking steps to help address these problems.
Google has recently released a number of tools to help with this. IBM has been working on its AI Fairness 360 toolkit for the past two years, and Amazon is working on similar tools as well.

Are there any best practice examples on that?
I think in many ways it's too early for this, but I am sure that in the next few months there will be some to share, using the tools and practices I mentioned above.

How does R/GA handle AI Bias? Do you have any good examples to share?
We use many of the tools I mentioned above, and we constantly read and share best practices with the rest of the team. We also partner with startups through our ventures team.



John (Juan) Tubert

Chief Technology Officer @ Tombras, New York / Creative Technologist, passionate about Metaverse, Web3, chatbots and AI.