Write-Up: Bias in Algorithmic Systems

So, here is the final write-up I handed in this summer for the Critical Algorithm Studies course. The course is really cool, and if you have the chance, I absolutely encourage you to enroll. Last semester, we looked at algorithmic systems and how they (re)produce bias from various viewpoints (it’s possible that the approach will change next year).

Having studied algorithmic systems from various points of view, I have trouble ranking the current issues in algorithmic systems. For example, there is the issue of awareness and knowledge: many people are not aware of how algorithms influence their everyday life. However, the interdependency of data, algorithms, decisions and the people affected is very important, too – consider how decisions made based upon algorithms may in turn be used as input for future decisions. Looking at this interdependency, this intertwined-ness, draws the focus over to the technology industry.

Wachter-Boettcher (2017) tells this story about it: an industrial sector where people are hired into jobs not primarily because of their skills, but because of their “cultural fit”. Where these people then go on to create things according to how they think things should work. And where these people test things – not with real users and people who are different from themselves, but with their colleagues. Who have been hired because of their cultural fit. What could possibly go wrong? For one, many errors and failures in algorithmic systems can be traced back to inadequate training and testing, including skewed datasets for the respective tasks. As Angwin et al. (2016) point out, training data that misrepresents groups can lead to amplified misrepresentation and thus disparate impact in the criminal justice system, resulting in harsher verdicts for Black people compared to white people. This impact of misrepresentation has also been shown for gender-related issues, e.g. by Bolukbasi et al. (2016), who demonstrated and analysed gender bias in word embeddings. Their work shows that software designed to create analogies, trained on commonly used natural language datasets, not only shows the same bias as humans, but even amplifies it. Intersections of marginalization in training data also show their effect, as discussed by Buolamwini and Gebru (2018), who investigated gender and ethnic biases in face recognition algorithms.
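To make the word-embedding point a bit more tangible, here is a minimal sketch of how a gender lean in an embedding can be measured and removed. It is not the code of Bolukbasi et al. (2016): the three-dimensional vectors are made up by hand for illustration, the gender direction is estimated from a single he/she pair (a simplification), and all function names are my own assumptions.

```python
import math

# Hypothetical 3-dimensional toy vectors. Real embeddings have hundreds of
# dimensions and are learned from text, not written by hand.
EMBEDDINGS = {
    "he":         [1.0, 0.0, 0.2],
    "she":        [-1.0, 0.0, 0.2],
    "programmer": [0.4, 0.9, 0.1],
    "homemaker":  [-0.5, 0.8, 0.1],
}

def dot(a, b):
    """Dot product of two equally long vectors."""
    return sum(x * y for x, y in zip(a, b))

def normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]

def gender_direction(emb):
    """Unit vector pointing from 'she' to 'he' (one-pair simplification)."""
    return normalize([h - s for h, s in zip(emb["he"], emb["she"])])

def gender_score(word, emb, direction):
    """Cosine between a word and the gender direction; 0 means no lean."""
    return dot(normalize(emb[word]), direction)

def neutralize(word, emb, direction):
    """Remove the component along the gender direction, then re-normalize."""
    v = emb[word]
    proj = dot(v, direction)
    return normalize([x - proj * d for x, d in zip(v, direction)])

direction = gender_direction(EMBEDDINGS)
for word in ("programmer", "homemaker"):
    before = gender_score(word, EMBEDDINGS, direction)
    after = dot(neutralize(word, EMBEDDINGS, direction), direction)
    print(f"{word}: lean before={before:+.2f}, after={after:+.2f}")
```

With these toy numbers, “programmer” leans towards “he” and “homemaker” towards “she” until the component along the gender direction is removed. Whether such a geometric fix addresses the underlying social problem is a separate question.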

Bias and debiasing are currently very active research topics in the areas of Machine Learning, Artificial Intelligence and Automated Decision Making. According to Bonchi, Castillo, and Hajian (2016), debiasing in machine learning can be applied at various stages of the learning process: pre-processing, in-processing and post-processing. There has also been research on how to present data-mined information so as to best support users in making correct decisions. Berendt and Preibusch (2014, 2017) show that, depending on the goal of a task, information should be shown, highlighted, or hidden from the users’ view.
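As an illustration of the pre-processing stage, the following sketch reweighs training records so that group membership and outcome look statistically independent before any model is trained. Reweighing is one commonly described pre-processing technique; I am not claiming it is the one the tutorial emphasizes. The toy records, the group names "a" and "b", and the function names are assumptions made up for this example.

```python
from collections import Counter

# Hypothetical toy data: (group, label) pairs, label 1 = favourable outcome.
records = [
    ("a", 1), ("a", 1), ("a", 1), ("a", 0),
    ("b", 1), ("b", 0), ("b", 0), ("b", 0),
]

def reweighing_weights(rows):
    """Weight per (group, label): expected count under independence
    divided by the observed count, i.e. P(g) * P(y) / P(g, y)."""
    n = len(rows)
    group_counts = Counter(g for g, _ in rows)
    label_counts = Counter(y for _, y in rows)
    pair_counts = Counter(rows)
    return {
        (g, y): (group_counts[g] * label_counts[y]) / (n * pair_counts[(g, y)])
        for (g, y) in pair_counts
    }

def weighted_positive_rate(rows, weights, group):
    """Share of favourable outcomes in a group after applying the weights."""
    total = sum(weights[(g, y)] for g, y in rows if g == group)
    positive = sum(weights[(g, y)] for g, y in rows if g == group and y == 1)
    return positive / total

weights = reweighing_weights(records)
for group in ("a", "b"):
    rate = weighted_positive_rate(records, weights, group)
    print(f"group {group}: weighted favourable rate = {rate:.2f}")
```

In the raw toy data, group "a" receives the favourable outcome three times out of four and group "b" once out of four; after reweighing, both groups have the same weighted rate. In-processing would instead change the learning algorithm itself, and post-processing would adjust a trained model's outputs.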

Currently, a lot of the work on debiasing in machine learning comes from a technical perspective. It seems that many experts think that the issue of biases is something that can be (more or less) easily fixed with a technical solution (e.g., Bolukbasi et al. 2016; cf. Allhutter 2018). But aside from faulty data sets and algorithms reproducing their creators’ prejudice, there is also the issue of social biases, including power gradients and their effects (Allhutter 2018). Algorithmic systems should never be analysed and discussed without taking their contexts into account. Questions to be asked and discussed are, for example: Who created this system? Who had they in mind as users? Who does the system affect, and in what ways? Whose time is to be freed up? Who should be available to do work? (Allhutter 2018) Whose working conditions improve, whose become worse? Carson (2015) takes a rather humorous approach here, claiming that “Silicon Valley startups are obsessed with developing tech to replace their moms”. One example is chores: Carson (2015) lists three startups (two located in San Francisco, one in New York) that – using an algorithmic system – connect people who have a chore to do, but cannot or do not want to do it, with people who are willing to complete the task. According to the listicle, one of the startups has “contract workers to help host parties, wait in line […] at concerts or return items to the store”. Applying the questions I listed above, such a company can be checked for the social bias built into it. To keep the analysis short, I will focus on the first startup mentioned by Carson (2015), TaskRabbit.

Who created the system? According to their website (TaskRabbit, Inc 2018), the current leaders (CEO, CTO, COO and VPs) are three white men, one white woman and one Black woman. The women joined in 2017 (leading the acquisition of the company by IKEA) and 2018, respectively. TaskRabbit’s founder is Leah Busque, a white woman (Moran 2011).

Who had they in mind as users? According to Newton (2013), Busque had the idea to create TaskRabbit (then “RunMyErrand”) when she was out of dog food, but had no time to go to the shop. IKEA U.S., since the acquisition, has introduced a furniture assembly service provided by TaskRabbit (Perez 2018).

Who does the system affect, and in what ways? TaskRabbit “connects you with skilled Taskers to help with odd-jobs and errands, so you can be more productive, every day” (TaskRabbit, Inc 2018). This phrase shows a certain bias against “odd-jobs and errands” being part of a productive life, which raises the question of whether Taskers are considered productive at all.

Whose time is to be freed up, who should be available to do work? “Popular categories for Taskers” (TaskRabbit, Inc 2018) are Handyman, Cleaning, Delivery, Moving, Furniture Assembly, and Personal Assistant. Without an account, I was not able to browse open tasks in order to get an idea of what jobs might fall into the categories, especially “Personal Assistant”. However, various articles present anecdotes of jobs offered in New York, NY via TaskRabbit. These include wall fixes (because the Client’s father-in-law tried to fix a TV mount to a wall, and failed), companies looking for actors for Halloween parties, people looking for help to prepare for parties with 150+ guests, and help for a Thanksgiving Dinner with 25 persons attending (Morgan 2017).

There is no way to see Taskers’ profiles, except for the various testimonials presented on the website. Registered Clients (the people posting tasks on TaskRabbit) can check the LinkedIn profiles of Taskers applying for their task.

An in-depth analysis of crowd-work systems such as TaskRabbit could be even more illustrative of social biases in this sector of the labour market. It should especially focus on the Taskers’ views (not using the word “worker” is, in my opinion, already an interesting choice). It should also not be confined to an analysis of the demographic data, but cover the whole cycle of “tasking”: from the process of registering to finding tasks, submitting invoices and getting paid. This would make for a fascinating project which could be completed as part of the Media Informatics Project in the upcoming semesters.

Companies can register as Taskers at TaskRabbit. However, the platform’s focus seems to lie on individual users. The testimonials on the website, and the stories the company shares in interviews or articles, primarily talk about how the economy is changing and how individuals cannot meet their needs in spite of having full-time jobs. TaskRabbit presents itself as a solution to this and urges policy makers to catch up with these developments (Brown-Philpot 2017). This is an argument also presented by Brundage and Bryson (2016): legal frameworks in various areas are not yet up to the challenges set by algorithmic systems. Apart from labour and health regulations, as mentioned by Brown-Philpot (2017), Brundage and Bryson (2016) argue that further areas that have catching-up to do are general education (with regard to changes to existing/emerging jobs, but also changing job descriptions), AI/CS education (especially with regard to ethical challenges), patent law (deciding who can use which parts of algorithmic systems for their future work), and tax and liability laws (affecting which countries will be more promising for companies to set up headquarters in).

Apart from the already discussed areas of bias/debiasing and policy, there is one more very important topic in the area of algorithmic systems: transparency, awareness and fairness. In my opinion, it is very hard, maybe even impossible, for many people to recognize in how many areas of their life algorithmic systems are already being used (most people do not even think about how their search engine results impact their life choices). Even more obscure is who uses algorithms, how, and to what end (e.g. people applying for a job cannot know in advance whether the HR department uses AI support). Finally, the people creating algorithmic systems – from data scientists to computer engineers to the designers who are responsible for the various front-end systems – must be empathetic with people who are affected by their creations. This holds true especially for those who are affected in ways not foreseen or planned by the creators, often dubbed “edge cases” by software engineers. Wachter-Boettcher (2017) argues that this wording pushes unforeseen use cases out of the engineers’ horizon and makes them easier to ignore. She proposes to use the term “stress cases” instead, which shows better where the problem lies: in the system that is being stressed by these use cases.

Understanding how algorithmic systems work, how they impact us as individuals and as societies and how it may be possible to influence them will be big steps towards more fairness in algorithms.

References

Allhutter, Doris. 2018. “Of ‘Working Ontologists’ and ‘High-Quality Human Components’: The Politics of Semantic Infrastructures.” In Handbook of Digital STS, edited by D. Ribes and J. Vertesi. Princeton University Press.

Berendt, Bettina, and Sören Preibusch. 2014. “Better Decision Support through Exploratory Discrimination-Aware Data Mining: Foundations and Empirical Evidence.” Artificial Intelligence and Law 22 (2): 175–209. https://doi.org/10.1007/s10506-013-9152-0.

Berendt, Bettina, and Sören Preibusch. 2017. “Toward Accountable Discrimination-Aware Data Mining: The Importance of Keeping the Human in the Loop—and Under the Looking Glass.” Big Data 5 (2): 135–52. https://doi.org/10.1089/big.2016.0055.

Bolukbasi, Tolga, Kai-Wei Chang, James Y Zou, Venkatesh Saligrama, and Adam T Kalai. 2016. “Man Is to Computer Programmer as Woman Is to Homemaker? Debiasing Word Embeddings.” In 30th Conference on Neural Information Processing Systems (NIPS 2016), Barcelona, Spain.

Bonchi, Francesco, Carlos Castillo, and Sara Hajian. 2016. “Algorithmic Bias: From Discrimination Discovery to Fairness-Aware Data Mining (KDD 2016 Tutorial).” 2016. http://francescobonchi.com/algorithmic_bias_tutorial.html.

Brown-Philpot, Stacy. 2017. “Policy Must Catch up with the Gig Economy | Opinion.” 2017. https://www.freep.com/story/opinion/contributors/2017/09/04/part-time-work-benefits/630828001/.

Brundage, Miles, and Joanna Bryson. 2016. “Smart Policies for Artificial Intelligence.” Preprint, not peer reviewed. http://arxiv.org/abs/1608.08196.

Buolamwini, Joy, and Timnit Gebru. 2018. “Gender Shades: Intersectional Accuracy Disparities in Commercial Gender Classification.” Proceedings of Machine Learning Research 81: 1–15.

Carson, Biz. 2015. “Silicon Valley Startups Are Obsessed with Developing Tech to Replace Their Moms.” Business Insider. 2015. https://www.businessinsider.com/san-francisco-tech-startups-replacing-mom-2015-5.

Angwin, Julia, Jeff Larson, Surya Mattu, and Lauren Kirchner. 2016. “Machine Bias.” ProPublica. May 23, 2016. https://www.propublica.org/article/machine-bias-risk-assessments-in-criminal-sentencing.

Moran, Gwen. 2011. “Building a Business on Busy Schedules and Making Errands Pay.” Entrepreneur. November 21, 2011. https://www.entrepreneur.com/article/220557.

Morgan, Brittney. 2017. “Hiring Help: The (Real) Everyday Chores (Actual) New Yorkers Are Outsourcing.” Apartment Therapy. 2017. https://www.apartmenttherapy.com/hiring-help-the-real-everyday-chores-actual-new-yorkers-are-outsourcing-240860.

Newton, Casey. 2013. “Temping Fate: Can TaskRabbit Go from Side Gigs to Real Jobs?” The Verge. May 23, 2013. https://www.theverge.com/2013/5/23/4352116/taskrabbit-temp-agency-gig-economy.

Perez, Sarah. 2018. “IKEA U.S. Launches a Furniture Assembly Service from TaskRabbit.” TechCrunch (blog). March 13, 2018. http://social.techcrunch.com/2018/03/13/ikea-u-s-launches-a-furniture-assembly-service-from-taskrabbit/.

TaskRabbit, Inc. 2018. “TaskRabbit Connects You to Safe and Reliable Help in Your Neighborhood.” TaskRabbit, Inc. 2018. https://www.taskrabbit.com/.

Wachter-Boettcher, Sara. 2017. Technically Wrong: Sexist Apps, Biased Algorithms, and Other Threats of Toxic Tech. 1st ed. New York, NY: W. W. Norton & Company.