Abstract
In recent years, headlines such as ‘Is Google Translate Sexist?’ (Mail Online 2017) or ‘The Algorithm that Helped Google Translate Become Sexist’ (Olson 2018) have appeared in the technology sections of the world’s news providers. Our highly interconnected world has made online translators indispensable tools in our daily lives. However, their output has the potential to cause great social harm. Due to the continuous pursuit of ever larger language models and, as a consequence, the opaque nature of unsupervised training datasets, language-based AI systems, such as online translators, can easily produce biased content. If left unchecked, this will inevitably have detrimental consequences. This chapter addresses the nature, impact and risks of bias in training data by looking at the concrete example of gender bias in machine translation (MT). The first section will provide an introduction to recent proposals for ethical AI guidelines in different sectors and will present the field of natural language processing (NLP). Next, I will explain different types of bias in machine learning and how they can manifest themselves in language models. I will then present the results of a corpus-linguistic analysis I performed on a sample dataset that was later used to train an MT system, exploring the gender-related imbalances in the corpus that are likely to give rise to biased results. In the final section of this chapter, I will discuss different approaches to reducing gender bias in MT and present findings from a set of experiments my colleagues and I conducted to mitigate bias in MT. The research presented in this chapter takes a highly interdisciplinary approach, drawing on expertise from linguistics, philosophy, computer science and engineering to dismantle and solve the complex problem of bias in NLP.
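To make the corpus-linguistic analysis described above concrete, the following minimal Python example shows one common way of quantifying gender imbalance in text: counting how often occupation nouns co-occur with gendered pronouns. The word lists and sample sentences are invented for illustration; this is a generic sketch of the technique, not the chapter’s actual analysis pipeline.

```python
# Minimal illustrative sketch: measure how often occupation nouns co-occur
# with gendered pronouns in the same sentence. Word lists and sample
# sentences are invented; a real analysis would run over the full corpus.
import re
from collections import defaultdict

OCCUPATIONS = {"doctor", "nurse", "engineer", "cleaner"}  # illustrative subset
MALE = {"he", "him", "his"}
FEMALE = {"she", "her", "hers"}

def cooccurrence_counts(sentences):
    """Return {occupation: {'male': n, 'female': n}} co-occurrence counts."""
    counts = defaultdict(lambda: {"male": 0, "female": 0})
    for sentence in sentences:
        tokens = set(re.findall(r"[a-z']+", sentence.lower()))
        for occupation in OCCUPATIONS & tokens:
            if tokens & MALE:
                counts[occupation]["male"] += 1
            if tokens & FEMALE:
                counts[occupation]["female"] += 1
    return dict(counts)

sample = [
    "The doctor said he would call back.",
    "The nurse told me she was busy.",
    "The engineer parked his car outside.",
]
print(cooccurrence_counts(sample))
# {'doctor': {'male': 1, 'female': 0}, 'nurse': {'male': 0, 'female': 1}, ...}
```

Skewed counts for role nouns such as ‘nurse’ or ‘engineer’ are precisely the training-data imbalances that an MT system can absorb and then reproduce, or even amplify (Zhao et al. 2017), in its translations.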
Notes
1. See Engström’s chapter in this anthology for a critique of these international frameworks.
2. Note, however, that not all language communities and countries have equal access to NLP systems (Siavoshi 2020).
3. The most common equivalent of a female ‘nurse’ in German is Krankenschwester or its abbreviated form Schwester, but counts for other, less frequent terms such as Arzthelferin were also included, since the English ‘nurse’ has multiple equivalents in German. Note that the German Schwester can also be translated as the English ‘sister’; such occurrences, however, were kept separate in the word counts (a schematic sketch of this aggregation follows these notes).
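As note 3 indicates, counts for several German equivalents of ‘nurse’ were pooled, while ‘sister’ readings of Schwester were tracked separately. The minimal Python sketch below illustrates that aggregation step; the function name and all counts are invented for illustration and are not the chapter’s actual data.

```python
# Hypothetical illustration of the aggregation described in note 3: counts
# for several German equivalents of 'nurse' are pooled, while occurrences of
# 'Schwester' meaning 'sister' are kept in a separate bucket. All numbers
# here are invented.
NURSE_EQUIVALENTS = {"Krankenschwester", "Schwester", "Arzthelferin"}

def aggregate_nurse_counts(token_counts, sister_readings=0):
    """Pool all 'nurse' equivalents; keep 'sister' readings separate."""
    total = sum(token_counts.get(term, 0) for term in NURSE_EQUIVALENTS)
    return {"nurse": total - sister_readings, "sister": sister_readings}

print(aggregate_nurse_counts(
    {"Krankenschwester": 120, "Schwester": 80, "Arzthelferin": 15},
    sister_readings=30,  # 'Schwester' tokens that translate as 'sister'
))
# {'nurse': 185, 'sister': 30}
```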
References
Ackerman, L. 2019. Syntactic and cognitive issues in investigating gendered coreference. Glossa: A Journal of General Linguistics 4(1): 117. https://doi.org/10.5334/gjgl.721.
BBC News. 2021. Reddit removed 6% of all posts made last year. 17 February. https://www.bbc.co.uk/news/technology-56099232 (accessed 23 May 2021).
Bender, E.M., and B. Friedman. 2018. Data statements for natural language processing: Toward mitigating system bias and enabling better science. Transactions of the Association for Computational Linguistics 6: 587–604.
Bender, E.M., T. Gebru, A. McMillan-Major, and S. Shmitchell. 2021. On the dangers of stochastic parrots: Can language models be too big? Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency (FAccT ’21), 610–623. https://doi.org/10.1145/3442188.3445922.
Beukeboom, C.J. 2014. Mechanisms of linguistic bias: How words reflect and maintain stereotypic expectancies. In Sydney symposium of social psychology: Social cognition and communication, eds. J.P. Forgas, J. Laszlo, and O. Vincze, 313–330. New York: Psychology Press.
Blodgett, S.L., S. Barocas, H. Daumé III, and H. Wallach. 2020. Language (technology) is power: A critical survey of bias in NLP. Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. https://arxiv.org/abs/2005.14050.
Boddington, P. 2017. Towards a Code of Ethics for Artificial Intelligence. Cham: Springer.
Bolukbasi, T., K.-W. Chang, J. Zou, V. Saligrama, and A. Kalai. 2016. Man is to computer programmer as woman is to homemaker? Debiasing word embeddings. Proceedings of the 30th International Conference on Neural Information Processing Systems, 4356–4364.
Caliskan, A., J.J. Bryson, and A. Narayanan. 2017. Semantics derived automatically from language corpora contain human-like biases. Science 356(6334): 183–186.
Chen, I.Y., F.D. Johansson, and D. Sontag. 2018. Why is my classifier discriminatory? Advances in Neural Information Processing Systems 31: 3543–3554.
Chowdhury, G.G. 2003. Natural language processing. Annual Review of Information Science and Technology 37(1): 51–89. https://doi.org/10.1002/aris.1440370103.
Criado-Perez, C. 2019. Invisible Women: Exposing Data Bias in a World Designed for Men. London: Penguin.
Darwin, H. 2017. Doing gender beyond the binary: A virtual ethnography. Symbolic Interaction 40(3): 317–334.
Davidson, T., D. Bhattacharya, and I. Weber. 2019. Racial bias in hate speech and abusive language detection datasets. https://arxiv.org/abs/1905.12516v1.
Dignum, V. 2018. Ethics in artificial intelligence: Introduction to the special issue. Ethics and Information Technology 20: 1–3. https://doi.org/10.1007/s10676-018-9450-z.
Equality and Human Rights Commission. 2018. Equality Act 2010. https://www.equalityhumanrights.com/en/equality-act/equality-act-2010 (accessed 23 May 2021).
Etzioni, A., and O. Etzioni. 2017. Incorporating Ethics into Artificial Intelligence. The Journal of Ethics 21: 403–418. https://doi.org/10.1007/s10892-017-9252-2.
Eubanks, V. 2017. Automating Inequality: How high-tech tools profile, police, and punish the poor. New York: St. Martin’s Press.
Floridi, L., J. Cowls, T.C. King, and M. Taddeo. 2020. How to design AI for social good: Seven essential factors. Science and Engineering Ethics 26: 1771–1796. https://doi.org/10.1007/s11948-020-00213-5.
Friedman, B., and H. Nissenbaum. 1996. Bias in computer systems. ACM Transactions on Information Systems (TOIS) 14(3): 330–347.
Garvey, S.C. 2021. Unsavory medicine for technological civilization: Introducing ‘Artificial Intelligence & its Discontents’. Interdisciplinary Science Reviews 46(1–2): 1–18. https://doi.org/10.1080/03080188.2020.1840820.
Gehman, S., S. Gururangan, M. Sap, Y. Choi, and N.A. Smith. 2020. RealToxicityPrompts: Evaluating neural toxic degeneration in language models. Findings of the Association for Computational Linguistics: EMNLP 2020, 3356–3369.
Google AI. 2020. Artificial intelligence at Google: Our principles. https://ai.google/principles.
Government Digital Service (GDS) and Office for Artificial Intelligence (OAI). 2019. Understanding artificial intelligence ethics and safety. https://www.gov.uk/guidance/understanding-artificial-intelligence-ethics-and-safety.
Hagendorff, T. 2020. The Ethics of AI Ethics: An Evaluation of Guidelines. Minds & Machines 30: 99–120. https://doi.org/10.1007/s11023-020-09517-8.
Hagerty, A., and I. Rubinov. 2019. Global AI ethics: A review of the social impacts and ethical implications of artificial intelligence. https://arxiv.org/ftp/arxiv/papers/1907/1907.07892.pdf.
Heaven, W.D. 2020. OpenAI’s new language generator GPT-3 is shockingly good—and completely mindless. MIT Technology Review, 20 July. https://www.technologyreview.com/2020/07/20/1005454/openai-machine-learning-language-generator-gpt-3-nlp/ (accessed 23 May 2021).
HLEGAI (High Level Expert Group on Artificial Intelligence), European Commission. 2019. Ethics guidelines for trustworthy AI. https://ec.europa.eu/digital-single-market/en/news/ethics-guidelines-trustworthy-ai.
Indurkhya, N., and F.J. Damerau, eds. 2010. Handbook of Natural Language Processing, 2nd ed. Boca Raton: CRC Press.
Jakobson, R., L.R. Waugh, and M. Monville-Burston. 1990. On language. Cambridge, MA: Harvard University Press.
Jobin, A., M. Ienca, and E. Vayena. 2019. The global landscape of AI ethics guidelines. Nature Machine Intelligence 1: 389–399. https://doi.org/10.1038/s42256-019-0088-2.
Kilgarriff, A., V. Baisa, J. Bušta, M. Jakubíček, V. Kovář, J. Michelfeit, P. Rychlý, and V. Suchomel. 2014. The sketch engine: Ten years on. Lexicography 1(1): 7–36. http://www.sketchengine.eu.
Korteling, J.E., A.-M. Brouwer, and A. Toet. 2018. A neural network framework for cognitive bias. Frontiers in Psychology. https://doi.org/10.3389/fpsyg.2018.01561.
Mail Online. 2017. Is Google translate SEXIST? Users report biased results when translating gender-neutral languages into English. Mail Online, 1 December. https://www.dailymail.co.uk/sciencetech/article-5136607/Is-Google-Translate-SEXIST.html (accessed 13 May 2021).
Mittelstadt, B. 2019. Principles alone cannot guarantee ethical AI. Nature Machine Intelligence 1: 501–507. https://doi.org/10.1038/s42256-019-0114-4.
Nadkarni, P.M., L. Ohno-Machado, and W.W. Chapman. 2011. Natural language processing: An introduction. Journal of the American Medical Informatics Association 18(5): 544–551. https://doi.org/10.1136/amiajnl-2011-000464.
Nosek, B.A., M.R. Banaji, and A.G. Greenwald. 2002. Harvesting implicit group attitudes and beliefs from a demonstration web site. Group Dynamics: Theory, Research, and Practice 6(1): 101–115. https://doi.org/10.1037/1089-2699.6.1.101.
Olson, P. 2018. The algorithm that helped Google translate become sexist. Forbes, 15 February. https://www.forbes.com/sites/parmyolson/2018/02/15/the-algorithm-that-helped-google-translate-become-sexist/?sh=7e5e82c87daa (accessed 13 May 2021).
Prates, M.O., P.H. Avelar, and L.C. Lamb. 2019. Assessing gender bias in machine translation: A case study with Google Translate. Neural Computing and Applications 32: 6363–6381. https://doi.org/10.1007/s00521-019-04144-6.
Quah, C.K. 2006. Machine translation systems. In Translation and technology, Palgrave textbooks in translating and interpreting, 57–92. London: Palgrave Macmillan. https://doi.org/10.1057/9780230287105_4.
Reddy, S., and K. Knight. 2016. Obfuscating gender in social media writing. Proceedings of the 2016 EMNLP Workshop on Natural Language Processing and Computational Social Science, 17–26.
Rudinger, R., J. Naradowsky, B. Leonard, and B. Van Durme. 2018. Gender bias in coreference resolution. Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2: 8–14.
Sap, M., D. Card, S. Gabriel, Y. Choi, and N.A. Smith. 2019. The risk of racial bias in hate speech detection. Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, 1668–1678.
Sattelberg, W. 2021. The demographics of reddit: Who uses the site? Alphr, 6 April. https://www.alphr.com/demographics-reddit/ (accessed 25 May 2021).
Shah, D., H.A. Schwartz, and D. Hovy. 2020. Predictive biases in natural language processing models: A conceptual framework and overview. https://arxiv.org/pdf/1912.11078.pdf.
Siavoshi, M. 2020. The importance of natural language processing for non-English languages. Towards Data Science, 22 September. https://towardsdatascience.com/the-importance-of-natural-language-processing-for-non-english-languages-ada463697b9d (accessed 22 May 2021).
Swan, O. 2015. Polish gender, subgender, and quasi-gender. Journal of Slavic Linguistics 23(1): 83–122. https://www.jstor.org/stable/24602179.
Tomalin, M., B. Byrne, S. Concannon, D. Saunders, and S. Ullmann. 2021. The practical ethics of bias reduction in machine translation: Why domain adaptation is better than data debiasing. Ethics and Information Technology. https://doi.org/10.1007/s10676-021-09583-1.
Tsamados, A., N. Aggarwal, J. Cowls, J. Morley, H. Roberts, M. Taddeo, and L. Floridi. 2021. The ethics of algorithms: Key problems and solutions. AI & Society. https://doi.org/10.1007/s00146-021-01154-8.
UNESCO. 2020. Elaboration of a recommendation on the ethics of artificial intelligence. https://en.unesco.org/artificial-intelligence/ethics.
Wagner, C., D. Garcia, M. Jadidi, and M. Strohmaier. 2015. It’s a man’s Wikipedia? Assessing gender inequality in an online encyclopaedia. Ninth International AAAI Conference on Web and Social Media. https://arxiv.org/abs/1501.06307.
Webster, K., M. Recasens, V. Axelrod, and J. Baldridge. 2018. Mind the GAP: A balanced corpus of gendered ambiguous pronouns. https://arxiv.org/abs/1810.05201.
Wesslen, R., D. Markant, A. Karduni, and W. Dou. 2020. Using resource-rational analysis to understand cognitive biases in interactive data visualizations. IEEE VIS 2020 Workshop on Visualization Psychology (VisPsych). https://arxiv.org/abs/2009.13368v2.
Wikimedia Foundation. 2020. Addressing Wikipedia’s gender gap. https://wikimediafoundation.org/our-work/addressing-wikipedias-gender-gap/ (accessed 23 May 2021).
Yu, H., Z. Shen, C. Miao, C. Leung, V. R. Lesser, and Q. Yang. 2018. Building ethics into artificial intelligence. Proceedings of the 27th International Joint Conference on Artificial Intelligence (IJCAI'18), 5527–5533. https://arxiv.org/abs/1812.02953.
Zhao, J., T. Wang, M. Yatskar, V. Ordonez, and K.-W. Chang. 2017. Men also like shopping: Reducing gender bias amplification using corpus-level constraints. https://arxiv.org/pdf/1707.09457.pdf.
Zhao, J., T. Wang, M. Yatskar, V. Ordonez, and K.-W. Chang. 2018. Gender bias in coreference resolution: Evaluation and debiasing methods. Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2: 15–20.
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this chapter
Cite this chapter
Ullmann, S. (2022). Gender Bias in Machine Translation Systems. In: Hanemaayer, A. (eds) Artificial Intelligence and Its Discontents. Social and Cultural Studies of Robots and AI. Palgrave Macmillan, Cham. https://doi.org/10.1007/978-3-030-88615-8_7
DOI: https://doi.org/10.1007/978-3-030-88615-8_7
Publisher Name: Palgrave Macmillan, Cham
Print ISBN: 978-3-030-88614-1
Online ISBN: 978-3-030-88615-8
eBook Packages: Social Sciences, Social Sciences (R0)