Vintage sexism can show up in modern algorithms (Image: Martyn Goddard/REX/Shutterstock)
Employers: do the ladies on your payroll have any “female weaknesses” that would make them mentally or physically unfit for the job?
The question comes to you courtesy of the year 1943. It was posed in a guide written for the flummoxed male supervisors at Transportation Magazine tasked with integrating a new female workforce during a wartime shortage of manpower.
Back then, you wouldn’t be surprised to see logical reasoning like “Men are to programmers as women are to homemakers”. Or “Men are to surgeons what women are to nurses”. Or “Men are bosses. Women are receptionists”.
But to have these associations littering software in 2016? That’s exactly what Kalai at Microsoft Research and his colleagues reported at the Fairness, Accountability, and Transparency in Machine Learning workshop in New York City. The group let a data mining algorithm loose on Google News articles to examine the word associations it found there. When they scoured the associations it had come up with, they discovered a trove of familiar stereotypes coded into occupations, with some weighted heavily as either masculine or feminine. The sexism was straight out of the 1943 playbook, with jobs ranked as male including philosopher, captain, warrior and boss. The top jobs on the “she” end of the spectrum? Homemaker, nurse and receptionist.
It’s tempting to throw your hands up and blame sexism in the tech industry, but the story is more subtle than that. The problem is partly down to the way computers learn our language, and all the inadvertent sexist, racist and otherwise unsavory predispositions it carries.
Picasso = painter
When we humans hear a word like “rose”, it might elicit a rush of related memories and associations: romance, the color red, Shakespeare’s famous line.
But for a machine, there aren’t many clues about meaning in the arrangement of a handful of letters. So, to help computers form associations, programmers often turn to a popular technique called “word-embedding”. The computer crunches through a pile of text, mapping words as “vectors” that demonstrate their relationships to each other.
Through these maps, machines can learn the subtle linguistic links that come intuitively to humans. For example, a king and a queen pair together (they’re both royalty) but one is male and the other female. It’s similar for uncle and aunt. Einstein was the scientist and Picasso the painter. Beijing is probably the capital of China, not Germany. Pile all these relationships together, and you’ve got some semblance of meaning.
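To make the king-and-queen example concrete, here is a minimal sketch of how such a relationship can be read off word vectors with simple arithmetic. Everything in it is an assumption made for illustration: the two-dimensional vectors are invented by hand (real embeddings are learned from text and have hundreds of dimensions), and the names `TOY_VECTORS`, `cosine` and `analogy` are hypothetical, not part of any particular library.

```python
import math

# Hypothetical hand-made vectors: (royalty, gender).
# These numbers are invented for illustration, not learned from any text.
TOY_VECTORS = {
    "king":  (1.0, 1.0),
    "queen": (1.0, -1.0),
    "man":   (0.0, 1.0),
    "woman": (0.0, -1.0),
    "uncle": (0.2, 1.0),
    "aunt":  (0.2, -1.0),
}


def cosine(u, v):
    """Cosine similarity: 1.0 means the two vectors point the same way."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def analogy(a, b, c, vectors):
    """Return the word d that best completes 'a is to b as c is to d'."""
    # Move from a to b, then apply the same step starting at c.
    target = tuple(
        vb - va + vc
        for va, vb, vc in zip(vectors[a], vectors[b], vectors[c])
    )
    # Skip the three query words so the answer is a new word.
    candidates = [w for w in vectors if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(vectors[w], target))


print(analogy("man", "king", "woman", TOY_VECTORS))   # -> queen
print(analogy("man", "uncle", "woman", TOY_VECTORS))  # -> aunt
```

The same arithmetic that completes harmless analogies will just as readily complete stereotyped ones if the underlying text pairs certain jobs with one gender.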
Inevitably, less agreeable associations are also hiding in those calculations, and these are what Kalai has been hunting. He thinks it’s valuable to find these flaws, because technology has so much power to amplify our stereotypes. Imagine, for example, that you’re doing a web search for “CMU computer science PhD student”. The search engine wants to give you the most relevant results, so perhaps it decides to show you links to male students first, sidelining women to the second page. In an unfortunate loop, this also makes women look even less likely to be programmers, reinforcing the bias.
This mechanism can launder a multitude of sins. In a separate study of word-embeddings, researchers at Princeton University looked at how closely words were associated with pleasant terms (“love”, “peace”, “happy”) and unpleasant ones (“death”, “disaster”, “vomit”). Flowers, for example, mapped more closely to pleasant words, while insects related more closely to the unpleasant. Musical instruments ranked as more pleasant than weapons.
Again, worrisome relationships surfaced. Female names were more closely associated with home and the arts, while male names dovetailed with career and mathematics. It wasn’t just sexism: European-American names (Adam, Stephanie, Greg) ranked as more “pleasant” than African-American names (Darnell, Yolanda). “We have found every linguistic bias we looked for,” they write.
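The kind of measurement described above, how closely a word sits to pleasant terms versus unpleasant ones, can be sketched in a few lines. This is a simplified illustration, not the Princeton group's actual procedure: the score here is just the average similarity to the pleasant words minus the average similarity to the unpleasant ones, the toy vectors are invented by hand, and the names `VECTORS`, `cosine` and `association` are hypothetical.

```python
import math

# Hypothetical 2-D toy vectors, invented purely for illustration.
VECTORS = {
    "rose": (0.9, 0.1), "daisy": (0.8, 0.2),
    "wasp": (0.1, 0.9), "maggot": (0.2, 0.8),
    "love": (1.0, 0.0), "peace": (0.9, 0.1),
    "death": (0.0, 1.0), "disaster": (0.1, 0.9),
}

PLEASANT = ["love", "peace"]
UNPLEASANT = ["death", "disaster"]


def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def association(word, pleasant, unpleasant, vectors):
    """Mean similarity to pleasant words minus mean similarity to unpleasant ones.

    A positive score means the word sits closer to the pleasant terms.
    """
    to_pleasant = sum(cosine(vectors[word], vectors[p]) for p in pleasant)
    to_unpleasant = sum(cosine(vectors[word], vectors[u]) for u in unpleasant)
    return to_pleasant / len(pleasant) - to_unpleasant / len(unpleasant)


for word in ["rose", "daisy", "wasp", "maggot"]:
    score = association(word, PLEASANT, UNPLEASANT, VECTORS)
    print(f"{word}: {score:+.2f}")
```

With these made-up vectors the flowers score positive and the insects negative. Swapping in names or occupations for the flowers and insects is what turns the same measurement into a test for bias.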
Lazy bias
What should we make of findings like these? Well, in short, even the most unbiased algorithm is going to flag up the biases of a slanted culture. If you don’t take steps to remove it, you should assume prejudice is well-represented in all your software. Worse: if you don’t get rid of it, the glossy appearance of impartiality conferred by search engines and algorithms could actually amplify our subtle biases, Kalai says.
It’s a lesson we keep having to re-learn. We learned it when it came out that Google serves higher-paying job ads to men, or that Uber drivers cancel more often on riders of colour, or that artificially intelligent hiring programs or sentencing programs might carry historical baggage.
So how do we get rid of it? It’s less about stamping out prejudice than about not being lazy, says Cegłowski, a web developer in San Francisco. His imaginary culprits are tech bros “Chad and Brad”: “mental shorthand for developers who are just trying to crush out some code on deadline, and don’t think about the wider consequences of their actions,” he explained at a talk this month at the Direction16 conference in Sydney, Australia. They don’t mean to algorithmically punish you for being female or having an ethnic name or living in a low-income neighborhood. They were just hustling to push a product out. “The tech industry slaps this stuff together in the expectation that the social implications will take care of themselves.”
Perhaps we could make algorithms that can strip these mistakes back out of software. Kalai’s group has come up with tools to tweak the word maps without losing much of the original meaning. For example, certain words could be reset to gender-neutral. Other words, like “grandma” and “grandpa”, could be “equalised”, making them more similar in meaning without losing the gender essential to their definition.
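The two adjustments described above can be sketched with basic vector arithmetic. This is a simplified illustration under stated assumptions, not the group's actual tool: it assumes gender can be captured as a single direction in the word map (here the difference between hypothetical vectors for "he" and "she"), all vectors are invented by hand, and the function names `neutralise` and `equalise` are hypothetical.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def normalise(v):
    """Scale a vector to length 1."""
    length = math.sqrt(dot(v, v))
    return tuple(x / length for x in v)


def gender_component(v, direction):
    """The part of v that lies along the (unit-length) gender direction."""
    scale = dot(v, direction)
    return tuple(scale * d for d in direction)


def neutralise(v, direction):
    """Reset a word to gender-neutral by removing its gender component."""
    g = gender_component(v, direction)
    return tuple(x - y for x, y in zip(v, g))


def equalise(a, b, direction):
    """Give a pair the same non-gender meaning, mirrored along the gender axis."""
    mean = tuple((x + y) / 2 for x, y in zip(a, b))
    shared = neutralise(mean, direction)  # meaning the pair has in common

    def rebuild(v):
        # Keep only how far v sits from the pair's midpoint along the gender axis.
        offset = gender_component(
            tuple(x - m for x, m in zip(v, mean)), direction
        )
        return tuple(s + o for s, o in zip(shared, offset))

    return rebuild(a), rebuild(b)


# Hypothetical toy vectors: (gender, family, work). Invented for illustration.
he, she = (1.0, 0.1, 0.3), (-1.0, 0.1, 0.3)
direction = normalise(tuple(h - s for h, s in zip(he, she)))

receptionist = (-0.4, 0.0, 0.9)
grandpa, grandma = (0.9, 0.8, 0.1), (-0.7, 0.9, 0.0)

neutral_receptionist = neutralise(receptionist, direction)
new_grandpa, new_grandma = equalise(grandpa, grandma, direction)

print(neutral_receptionist)
print(new_grandpa, new_grandma)
```

After these steps the neutralised word has no gender component, and the equalised pair differs only along the gender axis, so a neutral word such as the adjusted "receptionist" is equally close to both.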
Their group is hopeful, and maybe we can be too. After all, we know to laugh ruefully when we see the language of 1943. Maybe we can teach our machines the same trick.