
Communications of the ACM


Data Science Meets Law

[Figure: patterned text of 'data' and 'law'. Credit: Andrij Borys Associates]

The legal counsel of a new social media platform asked the data science team to ensure the system strikes the right balance between the need to remove inciting content and freedom of speech. In a status meeting, the team happily reported that their algorithm removed 90% of the inciting content, and that only 20% of the removed content was non-inciting. Yet, when examining a few dozen samples, the legal counsel was surprised to find that content that was clearly non-inciting had been removed. "The algorithm is not working!" she thought. "Anyone could see that the removed content has zero likelihood of being inciting! What kind of balance did they strike?" Trying to sort things out, the team leader asked whether the counsel wanted to decrease the percentage of removed content that is non-inciting, to which the counsel replied affirmatively. Choosing another classification threshold, the team proudly reported that only 5% rather than 20% of the removed content was non-inciting, at the expense of reducing the success rate of removing inciting content to 70%. Still confused, the legal counsel wondered what went wrong: the system now was not only removing clearly non-inciting content, but it had also failed to remove evidently inciting material. After several frustrating rounds, new insights emerged. The legal counsel learned about the inherent precision-recall trade-off. The team leader, in turn, realized that the definition of inciting content used when labeling the training data was too simplistic; the legal counsel could have helped clarify the complexities of this concept in alignment with the law. Both regretted not working together on the project from day one. As it turns out, they were using the same words, but apparently, much of what they meant was lost in translation.
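The trade-off the team ran into can be made concrete with a small sketch. The scores and labels below are entirely made up for illustration: raising the classification threshold means fewer non-inciting items are removed by mistake (higher precision), but more genuinely inciting items slip through (lower recall), just as the counsel observed.

```python
# A minimal sketch of the precision-recall trade-off, using invented
# classifier scores. Items with score >= threshold are "removed"
# (predicted inciting); labels mark which items are truly inciting (1).

def precision_recall(scores, labels, threshold):
    """Compute precision and recall of the 'remove' decision at a threshold."""
    removed = [label for s, label in zip(scores, labels) if s >= threshold]
    tp = sum(removed)           # inciting items correctly removed
    fp = len(removed) - tp      # non-inciting items removed by mistake
    fn = sum(labels) - tp       # inciting items left online
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Hypothetical scores: first five items are truly inciting.
labels = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.3, 0.65, 0.4, 0.2, 0.1, 0.05]

for t in (0.25, 0.68):
    p, r = precision_recall(scores, labels, t)
    print(f"threshold={t}: precision={p:.2f}, recall={r:.2f}")
# threshold=0.25: precision=0.71, recall=1.00
# threshold=0.68: precision=1.00, recall=0.60
```

At the low threshold, every inciting item is caught, but roughly 29% of removed items are non-inciting; at the high threshold, nothing non-inciting is removed, but 40% of the inciting content stays online. No threshold choice alone fixes both numbers at once, which is why the choice is a policy question as much as a technical one.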

While both data scientists and lawyers have been involved in the design of computer systems in the past, current AI systems warrant closer collaboration and a better understanding of each other's fields.2 The growing prevalence of AI systems, as well as their growing impact on every aspect of our daily lives, creates a great need to ensure that AI systems are "responsible" and incorporate important social values such as fairness, accountability, and privacy. It is our belief that to increase the likelihood that AI systems are "responsible," an effective multidisciplinary dialogue between data scientists and lawyers is needed. First, such a dialogue will assist in clearly determining what it means for an AI system to be responsible. Moreover, it will help both disciplines spot relevant technical, ethical, and legal issues and jointly reach better outcomes early in the design stage of the system.

