No Ground Truth: Sex Trafficking and Machine Learning

A tweet last week by Carol Fenton led me to an article in the Vancouver Province by Nathan Griffiths which begins with a seemingly shocking statistic:

Roughly 40 per cent of online ads offering escort and sex work services in B.C. included language indicating child sex trafficking, according to a new pre-print study from SFU.

The article also includes this blurb from the head of a Canadian anti-sex-trafficking organization:

“This is research-based proof and evidence that human trafficking is prevalent in our province, and that human trafficked victims are bought and sold in the same industry as sex workers,” Tiana Sharifi, CEO of Sexual Exploitation Education, said in a statement.

Note the difference: The first quote speaks of sex ads that “included language indicating child sex trafficking,” whereas the second quote directly says “human trafficking is prevalent.” In some ways, this whole post is about that difference.

If you’ve been following Elizabeth Nolan Brown’s coverage of sex trafficking at Reason magazine or through her Twitter account, you’re probably familiar with the ongoing moral panic over sex-trafficking. From bogus claims of 300,000 children at risk to police “sex-trafficking stings” that arrest hundreds but result in no sex trafficking charges, there’s a long history of law enforcement agencies and anti-sex-trafficking NGOs exaggerating the problem.

So I was suspicious of the Province article. I suspected that the study’s descriptions of complicated subjects became less nuanced and more sensational as they traveled further from the source. By the time parts of the paper make it from researchers to reporters, commentators, and people with an axe to grind, a lot of important context can be lost.

I’ve obtained a copy of the pre-print research paper^[1], and that does seem to be what happened here. The paper, titled “Deploying artificial intelligence to detect and respond to the use of digital technology by perpetrators of human trafficking,” was produced by researchers at the International CyberCrime Research Centre at Simon Fraser University in British Columbia, Canada. And as far as I can tell with my limited expertise, it appears to be a perfectly reasonable piece of research.

What it’s not, however, is a study of sex trafficking. Not really.

Before explaining what I mean by that, let’s take a look at the research. The Province article gives a pretty accurate summary:

The research, the first of its kind in B.C., used machine learning and custom web crawlers to collect and analyze a data set of online ads for escort and sex work services. It was co-authored by SFU’s International Cyber Crime Research Centre and Sexual Exploitation Education, a Vancouver-based agency working to prevent human trafficking and sexual exploitation.
[…]
The research involved training a machine learning model on “a highly elaborate list of key words and phrases,” including emojis, according to Sharifi. Ads were collected from four of the most popular websites in Canada offering escort and sex work services.

You don’t develop machine learning (ML) solutions the same way you solve a problem with ordinary software: You don’t have to provide the detailed step-by-step process for reaching the solution. Instead, you use a collection of input samples matched to the desired outputs to “train” the ML model. During this process, the ML algorithm tries to extract information from the inputs that will allow it to produce the correct outputs. This is the learning part of “machine learning,” and different types of ML algorithms perform this process in different ways, have different costs, and produce different forms of solutions.

For example, suppose you want to analyze traffic patterns in a tunnel, perhaps to predict pavement wear, by determining how many trucks are passing through. You’ve already installed cameras with motion detectors to capture images of every vehicle, but the system is also capturing images of cars, and you need to separate the trucks from the cars to prepare your reports. You decide to try doing that with machine learning.

To train the model, you’ll have to show it a bunch of pictures of vehicles and tell it which images contain cars and which images contain trucks. In this case, you can get a good training set by using a sample of actual images from the tunnel and having a team of people look at each one to classify it as a car or truck.

Let’s say you determine the vehicle type for 10,000 images. You then feed the images with corresponding vehicle types to the machine learning algorithm in training mode and let it do its thing. Unfortunately, trained ML systems tend to be somewhat mysterious — they build their knowledge in ways that are inherently difficult to understand, even if you can see how the knowledge is encoded, so the only way to really tell if the training worked is to try the model on new data and see how it does.

That means having your human team and your trained ML model each classify another, say, 1000 images and compare their results. The closer the model comes to matching the human results, the better it’s performing. If it’s 95% accurate at picking out trucks, and that’s close enough for your purposes, your machine learning project has been a success.
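As a minimal sketch of that comparison, assume the human team's labels and the model's predictions for the same held-out images are available as two parallel lists. The function names and the eight-image sample below are hypothetical, made up only to show the arithmetic:

```python
def agreement_rate(human_labels, model_labels):
    """Fraction of images where the model's label matches the human label."""
    if len(human_labels) != len(model_labels):
        raise ValueError("label lists must be the same length")
    if not human_labels:
        raise ValueError("need at least one labeled image")
    matches = sum(h == m for h, m in zip(human_labels, model_labels))
    return matches / len(human_labels)


def truck_recall(human_labels, model_labels):
    """Of the images humans called trucks, the fraction the model also called trucks."""
    truck_pairs = [(h, m) for h, m in zip(human_labels, model_labels) if h == "truck"]
    if not truck_pairs:
        raise ValueError("no trucks in the human-labeled sample")
    return sum(m == "truck" for _, m in truck_pairs) / len(truck_pairs)


# Hypothetical held-out sample: labels from the human team vs. the trained model.
human = ["truck", "car", "car", "truck", "car", "truck", "car", "car"]
model = ["truck", "car", "truck", "truck", "car", "car", "car", "car"]

print(agreement_rate(human, model))  # 0.75 (6 of 8 labels match)
print(truck_recall(human, model))    # 2 of the 3 human-labeled trucks were found
```

Both numbers measure agreement with the human labelers. That works for trucks because a person looking at a photo can reliably tell what is in it.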

The researchers behind the human trafficking study wanted to do the same kind of thing, but with online sex ads, and instead of finding pictures of trucks, they wanted to find ads for trafficking victims.

This was the part that got my attention. Just as you need pictures of trucks to train an ML system to recognize trucks, you need sex trafficking ads to train an ML system to recognize sex trafficking ads. I suppose it’s possible in theory: They could have searched police human trafficking case files to find instances where the victims were the subject of online sex ads and then used those ads for training and testing the ML model against ads tied to actual sex trafficking.

The problem, as mentioned at the beginning of this post, is that actual sex trafficking cases are extremely rare — too rare to build up a large enough corpus of online ads to train an ML classifier. So where did the researchers find those ads? How did they know the ads were for human trafficking victims? The short answer is that they didn’t. Instead, they used a manual classification system to deem some of the ads as possibly related to sex trafficking, and then trained the ML models against those ads.

The researchers started with a list of trafficking indicators from the Global Report on Trafficking in Persons published by the United Nations Office on Drugs and Crime (UNODC) in 2020. With that basic framework, they set about identifying words to search for:

For this present study, we used search terms consisting of key words and key phrases identified by law enforcement agencies and community outreach organizations who specifically seek to educate about, prevent and interdict sex trafficking, we winnowed those down into key words and key phrases where we could confirm through careful manual inspection that they applied exclusively (or almost exclusively) to the marketing of the services of sex trafficking victims, and for the sake of precision, decided for the qualitative inspection that at least three of those key words or key phrases had to be present in order to flag an online advertisement as involving sex trafficking.

The researchers also obtained lists of target words and phrases from the Royal Canadian Mounted Police and Sexual Exploitation Education, a Vancouver-based anti-trafficking agency. Combining these and other sources, they eventually came up with a framework for identifying possible sex trafficking ads based on key words and phrases. Here’s the example from the paper: (Caution, dirty words!)

Indicator	Key Words	Key Phrases
Sweet, compliant, obedient	Sweet; sweet; sweetest; polite; personality; cheerful	Sweet and polite; sweet and polite; super sweet; Great Personality; cheerful mood; nice and friendly
Availability	24/7; 24hrs; anytime	Always available; always available; available right now; all day; any time; Day or night; all day and night; available day & night; Ready Now; always ready; READY ANY TIME; always ready and available; available right now; Available 24hrs; 24 hours; 7 days a week; 7days a week
Flexibility	CARDATE; CAR; CARPLAY	INCALL AND OUTCALL AND CARDATE; CAR DATE; Car-Date; Car Dates; CAR DATES; CARPLAY; incall and outcall; Incall and Outcall; INCALL AND OUTCALL: any where;
No restrictions	Openminded	Open Minded; open-minded; open minded; as much as you want; Low restriction; No Restriction; NO RESTRICTION; NO restrictions; no limit; No limit; NO LIMIT; no limits; all service; unlimited fun; MINIMUM RESTRICTIONS; ALL STYLE; ALL STYLE ARE WELCOME; What you like; your own style; all you can play; do as you wish; anything you want; whatever you desire; TRY EVERYTHING; ALL YOUR NEED; everything you want; all your needs
Unprotected sex	Bareback; raw; BAREBACK; Bbj; BBBJ; BBBj; BBBJ+; B-B-B-J; b88j; bxxJ; B*BJ; B-B-B- ; BB Be J; BEBEFS	bBlowj,ob; BARE BACK; With Out Condom; without condom; without condoms; no condom; NO CONDOM; No Condom; raw; No Cover; no cover; ✈
New in town, short visit	town; new; NEW; plane; arrived; arrival; ARRIVE; ARRIVED; ARRIVAL	Catch me while you can; a few days; new in town; COMING WEEKLY; NEW IN TOWN; NEW here in town; new to town; NEW ARRIVED; NEW ARRIVAL; New Arrival; New arrival; new arrived; Newly Arrived; just arrived; short visit; Short-Term; Short Term Only; just off plane; limited time
Young girl; new girl, fresh and juicy	Baby; baby; BABY; fresh; Girl; GIRL; GURL; Newly; New; new; NEW; YOUNG; yOunG; Youth; tight; Juicy; daddy; Daddy; Pettit; Petite; PETITE; Cherry; student; Student; students; college;	Rope bunny; babygirl; YOUNG GIRL; super tight; Tight Juicy; Sweet and Juicy; Sweet & Juicy; New girl; New girls; Brand new; SUPER TIGHT; SUPER TIGHT FIT; 100% PRETTY; PRETTY GIRL; Pretty Girl; 100% YOUNG; YOUNG girl; young girls; SEXY YOUNG GIRL; REAL YOUNG; First Time; PetiteCute; SEXY YOUNG GIRL; daddy; NeW YouNG Girl; FIRST TIME; 1st time; my first; First Day; first day; first customer; PetiteCute; cutegirl; student girl; school girl; shaved and sweet; sweet, young; SHOWERED & SHAVEN; 19yes; 19yo; 19 Year old; 19 yrs; 19year old; 19yrs old; 20yo; 20/yo; 20 yrs old; 20yrs; 20 years old; 20 year old; 20yrs old.; 19-22; 21 yrs; 21 old yrs; 21 years; 22yo: 22 yrs old; 22 yrs.; 22 year old; 22 years old; 22yrs; 22yrs old
Low prices	CHEAP	NO DEPOSIT; price is low; PAYMENT ON ARRIVAL; 100 dollars an hour; $100 1 hour; FOR CHEAP; cheap rate; CHEAP~RATE; very cheap rate; low rate
Multiple girls	girls;	COME AND PICK; Our selling-points; different girls; Different new girls;
Male Operator		male operator

To their credit, the researchers wanted to distinguish sex trafficking from adult consensual sex work. To that end, they also created a list of key words and phrases that implied consensual work:

Indicator	Key Words	Key Phrases
Availability		Available 2-8pm; Available till Midnight; pre booking; Prebooking; available daytime or evenings; limited availability; make an appointment; 2 hour notice; 9am to 5pm
Flexibility	Uber; etransfer; deposit	no car dates; No Car Dates; etransfer deposit; email money transfer
Restrictions	subscribers; Subscribers; SUBSCRIPTION;	GENTLEMEN’S ONLY; mature respectful gentlemen; NO DEGREDATION; OVER 30 ONLY; respectful clients only; NO AN..AL; no back door; NO BEBEFS; NO BBBF; RESTRICTIONS: NO C1M; NO BB8S; NO BeBeF$; No Greek; NO Greek; NO gr33k; NO GREEEK; NO C0.F; NO $wall*w; SUBSCRIPTION REQUIRED; No dr_gs
Protected sex only	safe; SAFE	no bbfks; safe play; safe play; Safe play always; SAFE PLAY ONLY; SAFE SERVICES ONLY; SAFETY PLAY ONLY; EverySafe; No B4re services; NO B*RE SERVICES; SAFE, CLEAN, & COVERED services; COVERED services; with a condom; All services covered; NO FLUID EXCHANGE
Mature, older	mature; woman; MOMMY; MILF	REAL WOMAN;
Upscale	LUXURY; CLASSY; Upscale; UPSCALE; class;	LUXURY SERVICE; UPSCALE LOCATION; Classy mature
Realistic prices		deposit in advance; Deposits are required; no low ballers; NO LOWBALLERS; NO LOWBALLERS OR DISCOUNTS; No low balling; 1 hr – $400; 1.5hr – $550; 2hr – $750: 400/hr outcall; $450 – 1hr; $450 per hour for outcalls; 500 hr; 2/600; $600 – 1.5 hr; $800 – 2hr; $1100 – 3hr; 1800/6hrs; Overnights 2500; NO NEGOTIATIONS;
Experienced, professional	Experience; professional;

Starting with these criteria, the researchers designed a set of coding rules specifying things like the number of key trafficking words or phrases required to code an ad as trafficking (three), whether to count repeated words or phrases (no), and what to do with ads that mixed trafficking indicators and sex work indicators (exclude them). They also considered how to handle obvious copy/paste duplicate ads and other confusing situations, e.g. an escort described as young and petite but with stats given as 27 years old, 5′ 6″, and 136 pounds.
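To make those rules concrete, here is a rough sketch of how such a coding scheme might look in code. This is my own illustration, not the researchers' procedure: the phrase lists are tiny subsets of the tables above, and the case-insensitive substring matching, the "sex work" and "unflagged" labels, and the treatment of any overlap as "mixed" are all assumptions on my part.

```python
# Tiny illustrative subsets of the paper's phrase lists (the real lists are much longer).
TRAFFICKING_PHRASES = ["new in town", "24/7", "no restriction", "low rate", "always available"]
SEX_WORK_PHRASES = ["safe play", "no lowballers", "make an appointment"]

MIN_TRAFFICKING_HITS = 3  # the threshold stated in the paper


def distinct_hits(ad_text, phrases):
    """Return the set of listed phrases found in the ad; repeats count once."""
    text = ad_text.lower()  # assumption: case-insensitive substring matching
    return {phrase for phrase in phrases if phrase in text}


def code_ad(ad_text):
    """Label an ad from its wording alone, following the rules described above."""
    trafficking = distinct_hits(ad_text, TRAFFICKING_PHRASES)
    sex_work = distinct_hits(ad_text, SEX_WORK_PHRASES)
    if trafficking and sex_work:
        return "excluded"      # mixed indicators are dropped (assumption: any overlap counts)
    if len(trafficking) >= MIN_TRAFFICKING_HITS:
        return "trafficking"
    if sex_work:
        return "sex work"      # assumption: one sex work phrase is enough
    return "unflagged"         # assumption: hypothetical label for everything else


print(code_ad("New in town! Always available, 24/7. NEW IN TOWN!"))  # trafficking
print(code_ad("New in town, new in town, low rate"))                 # unflagged
print(code_ad("New in town, 24/7, low rate, safe play only"))        # excluded
print(code_ad("Make an appointment. Safe play always."))             # sex work
```

Note that the only input is the text of the ad. Nothing in a scheme like this can see who wrote the ad or under what circumstances.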

In all, the qualitative research team manually coded—and cross-validated the scoring of—1,840 online sex ads, breaking them down in accordance with their likelihood of representing sex trafficking or sex trade work. The team derived extensive lists of (ever-evolving) key words and key phrases and made note of the presence (or absence) of specific emojis or symbols. All this information was made available to the web-crawling and machine-learning teams for the purposes of refining their searches and/or the development of the machine-reading/machine learning algorithms.

It was this group of 1,840 coded online sex ads, or subsets thereof, that was used to train the machine learning models (they used five different models).

That brings me to an important point: The training set for the ML systems did not use a known set of sex trafficking ads, because no actual sex trafficking ads were available for training. Instead, the training set used was composed of ads that were judged to be sex trafficking ads based on criteria — key words and phrases — provided by UNODC, the RCMP, and Sexual Exploitation Education.

It would be one thing if those words were culled by statistical methods from samples of actual ads for actual sex trafficking, but they aren’t. The paper goes into great detail about the use of key words and phrases, but at no point does it indicate that the word lists were derived through any kind of empirical research. The word lists appear to be entirely the guesswork of cops and anti-trafficking advocates, both of whom may see sex trafficking where none is occurring.

That makes me reluctant to take these word lists seriously as signs of sex trafficking. For example, the word “new” appears in ads for all kinds of products and services, from dessert topping to hair care. I think the key words and phrases are probably indicative of some aspects of the various forms of sex work represented by the ads, but I don’t see a case for declaring them to be a priori sex trafficking. As far as I can tell, the difference between the sex trafficking list and the sex work list is the amount of control that the escorts gave up or retained. The researchers are assuming trafficking victims have little control whereas free sex workers can exert more control.

Of course, I’m hardly an expert on how sex workers advertise, so I asked Maggie McNeill — sex worker and sex work activist — to look at the lists and tell me what she saw:

What I see in the examples you’ve provided are mainly due to agency vs independent (no single woman or operator, however coerced, is available “24/7”; that requires multiple shifts); to inexperience vs experienced escorts (inexperienced businesspeople of every type imagine promising the Moon and then bait & switching is a clever tactic to bring in business); and to level of desperation.
Very poor, inexperienced sex workers who need money NOW will often advertise very low prices, services (including bareback) outside the norm, very broad hours, low screening requirements, etc, in hopes of outcompeting the many others in what is right now a bit of a glutted market (as often happens in a faltering economy). Some of them are counting on being able to raise the price (“Oh, that price was just to get me here; if you want to fuck it’ll be extra”) or otherwise change the odious conditions; some are cash-and-dash artists who have no intention on following through with their promises; some are cops trying to cast a wide net; and a few are truly desperate newbies trying to avoid eviction tomorrow.
There are a couple of these which require specific comment: “New in town” generally just means touring, and has zero to do with coercion; it’s also a popular claim in cops’ bait ads, or for liars/those with bad reputations who have recently changed stage names and wish to capitalize on the male lust for “new meat”. And “young” was once the single most popular lie in sex work ads, because some men will always want “as young as possible”; it’s less popular now among experienced or well-mentored escorts because it attracts cops and “trafficking” fetishists…

The bottom line is that when an ML model flagged an ad as “sex trafficking,” it was not because it was trained to recognize sex trafficking, but because it was trained to recognize ads that the researchers coded as “sex trafficking.” And since the researchers coded the ads based on word and phrase lists that were based on information from UNODC, the RCMP, and SEE, the best we can say is that they were training machine learning systems to recognize ads for escorts that people at UNODC, the RCMP, and SEE would think were sex trafficking ads. That doesn’t mean sex trafficking actually happened.
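Here is a deliberately tiny illustration of that circularity. It is a toy stand-in, not any of the paper's five models: the "model" is a single learned cutoff on the number of distinct flagged phrases in an ad, and the training data is hypothetical, with labels produced by a coder applying the "at least three" rule.

```python
def fit_threshold(counts, labels):
    """Pick the cutoff t whose rule `count >= t` best reproduces the given labels."""
    best_t, best_correct = None, -1
    for t in range(0, max(counts) + 2):
        correct = sum((c >= t) == y for c, y in zip(counts, labels))
        if correct > best_correct:
            best_t, best_correct = t, correct
    return best_t, best_correct / len(labels)


# Hypothetical training data: distinct flagged phrases per ad, with labels
# assigned by a coder who flags any ad containing three or more of them.
counts = [0, 1, 2, 3, 4, 5, 2, 3, 1, 6]
labels = [c >= 3 for c in counts]

threshold, agreement = fit_threshold(counts, labels)
print(threshold, agreement)  # 3 1.0
```

The learner scores perfectly, but all it has done is rediscover the coder's rule. Its score says how well it imitates the coder and nothing about whether the coder's rule tracks real trafficking.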

I want to emphasize that the researchers were in no way hiding this. The background and methodology sections of the study go over the coding and training procedures in page after page of detailed explanation, describing their choices and explaining their rationales. There may be a few sentences here and there which could be confusing when taken out of context, but the paper as a whole is quite explicit about exactly what the researchers are doing.

They also go into some detail explaining the difficulty in identifying sex trafficking, which leads to this cautionary note:

However, it is important to heed the caution of Raets and Janssen (2021)^[2]: “One critical issue…is the lack of ground truth in identifying human trafficking cases” (p. 227).

Having skimmed the pre-print, I think I can see how it was distorted by the time it reached the pages of the Province article:

Roughly 40 per cent of online ads offering escort and sex work services in B.C. included language indicating child sex trafficking

This is somewhat careful in saying that 40% of ads “included language” about sex trafficking, which is not the same as saying that 40% of ads are sex trafficking. On the other hand, Sharifi is dead wrong in this quote:

“This is research-based proof and evidence that human trafficking is prevalent in our province, and that human trafficked victims are bought and sold in the same industry as sex workers,”

As should be clear by now, the only thing proven by this research is that machine learning models can produce similar results to human analysts when trying to detect signs of human trafficking in online ads. This does not prove anything about actual human trafficking in online ads.

I said at the beginning that the research was not really about sex trafficking, and I think it’s clear that there’s no actual sex trafficking data used to train the machine learning model. Rather, this research is about the ability of machine learning to perform certain tasks: The researchers started with humans using key words and phrases to identify human trafficking in online ads and then tried to get a machine to do the same thing. This was a machine learning research project that used sex trafficking ads as raw material.

A few other items in the Province article are also illuminating:

“Teens or older children are at greater risk for online recruitment, as they often have unfettered internet access with limited monitoring by parents or guardians,” wrote the report authors.

That’s technically true — the authors did write that — but that wasn’t a conclusion of the research. It’s part of the Introduction, where the authors summarized a St. Augustine Record newspaper article.

They also noted that children who are lured and groomed in B.C. “may well be trafficked in other parts of Canada or in other countries.”

This was also part of the introduction, and it’s unsourced, probably because it’s an obvious general statement used to describe how research in British Columbia was relevant to other locations as well.

In 2019, there were 511 reported incidents of human trafficking, according to the report. Of those, only a third resulted in charges and even then, 89 per cent of charges were stayed, withdrawn, discharged or dismissed. Among reported cases of human trafficking, 95 per cent of the victims were women.

Again, none of this was part of the research. It was all just background sourced from three different papers published by other people.

At its most basic, this research was an attempt to get a machine learning algorithm to do a task that humans had been doing: trying to use key words and phrases in online escort ads to detect sex trafficking, especially child sex trafficking. And judging by the results, it pretty much worked, at least well enough for the researchers to announce their intent to integrate these algorithms into their web crawler to create an A.I. system to detect sex trafficking content.

I have mixed feelings about that. Given the rarity of actual sex trafficking in escort work and the harm that is done to consensual sex workers through misdirected or pretextual law enforcement, I question the wisdom of providing police with better surveillance tools. If the researchers had trained the models against the ground truth of actual online sex trafficking incidents, I would say a tool like this helps legitimate sex workers by directing police attention away from them and onto serious criminals. But since the models are based on the unsupported linguistic notions of cops and anti-trafficking activists, I fear they will be misunderstood and misused to justify harassment of sex workers.

In any case, it’s important to be accurate and address research like this for what it actually is.

Footnotes

↑1	Barry Cartwright, Richard Frank, Tiana Sharifi, Mandeep Pannu. Deploying artificial intelligence to detect and respond to the use of digital technology by perpetrators of human trafficking. International CyberCrime Research Centre, Simon Fraser University, British Columbia, pre-print April 2022.
↑2	Raets, S., & Janssens, J. (2021). Trafficking and Technology: Exploring the Role of Digital Communication Technologies in the Belgian Human Trafficking Business. European Journal on Criminal Policy and Research, 27(2), 215–238. https://doi.org/10.1007/s10610-019-09429-z
