Some Dog-Sniffing Math

In one of his posts today, New York criminal defense lawyer Scott Greenfield writes about the error rate for drug-sniffing dogs:

More to the point was the dog hits simply aren’t anywhere nearly as worthy of credit as courts have held. Consider whether it would be equally acceptable for a cop to flip a coin in order to establish probable cause to search.  For a dog whose established ability to sniff out drugs runs in the typical 50% range, it’s no more likely to be accurate than a flip of a coin.

I’m guessing the “50% range” figure comes from a Chicago Tribune article a few weeks ago based on an analysis of state drug dog data in Illinois, which found a relatively low accuracy rate:

The dogs are trained to dig or sit when they smell drugs, which triggers automobile searches. But a Tribune analysis of three years of data for suburban departments found that only 44 percent of those alerts by the dogs led to the discovery of drugs or paraphernalia.

That 44% figure for success means that the false-positive ratio is a whopping 56%. Scott was being generous when he rounded down to 50%. However, in comparing dogs to flipping a coin, Scott makes a very common math mistake by confusing the dog’s false alert ratio with the dog’s total alert ratio.

It helps if we make up some numbers. Suppose the police dogs in some department are used in 1000 sniffs, and the dogs alert in 200 of them, but a search only finds drugs on 88 of those people. This means the other 112 are false positives, and we can calculate the false positive ratio as the number of false alerts divided by the total number of alerts:

fp = 112 / 200 = 56%

To keep the situation simple, let’s assume the dog never misses any drugs, so the 88 drug carriers are all there were in the sample population of 1000. In other words, 8.8% of the people are carrying drugs.

Now we can calculate what would happen if the police officer flipped a coin instead. Out of 1000 people, the coin would be expected to “alert” for 500 of them. Since 8.8% of the people are carrying drugs, we would expect 44 of these people to have drugs, meaning the other 456 are false positives. Thus the false positive ratio would be:

fp = 456 / 500 = 91.2%
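For anyone who wants to check the arithmetic, here is a minimal Python sketch of the two calculations so far. The function and variable names are my own, and the numbers are the made-up ones from above, including the simplifying assumption that the dog never misses any drugs.

```python
def false_positive_ratio(alerts, hits):
    """Share of alerts that did not turn up drugs."""
    return (alerts - hits) / alerts


sniffs = 1000    # made-up sample size from the text
carriers = 88    # assumption: the dog never misses, so these are all the carriers
prevalence = carriers / sniffs

# The dog: 200 alerts, 88 of which found drugs.
dog_fp = false_positive_ratio(alerts=200, hits=88)

# The coin: "alerts" on half the people, with no regard to who carries drugs,
# so the expected share of hits among its alerts is just the prevalence.
coin_alerts = sniffs / 2
coin_hits = coin_alerts * prevalence
coin_fp = false_positive_ratio(coin_alerts, coin_hits)

print(f"dog:  {dog_fp:.1%}")
print(f"coin: {coin_fp:.1%}")
```

Run as written, this reproduces the hand calculation: 56.0% for the dog and 91.2% for the coin.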

That’s a heck of a lot worse than the dog’s 56% ratio. The only way the coin could achieve a false positive ratio as good as the dog’s is if 44% of all the people sniffed are carrying drugs. Then you’d expect the 500 searches to find drugs on 220 people with the other 280 being false positives:

fp = 280 / 500 = 56%

As long as less than 44% of the population is carrying drugs, a dog with a known 56% false positive ratio is performing quite a bit better than a random coin flip.
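The same reasoning can be sketched for any share of drug carriers, not just the 8.8% in my made-up sample. This is only an illustration under the assumptions already stated. The function name, the `heads_probability` parameter, and the list of prevalence values are hypothetical choices of mine, not anything from the Tribune data.

```python
def coin_false_positive_ratio(prevalence, heads_probability=0.5):
    """Expected false-positive ratio when a coin decides who gets searched.

    The coin is independent of who carries drugs, so the expected share of
    its "alerts" that find nothing is 1 - prevalence, whatever the coin's bias.
    """
    alert_rate = heads_probability             # fraction of people searched
    hit_rate = heads_probability * prevalence  # fraction searched AND carrying
    return (alert_rate - hit_rate) / alert_rate


DOG_FP = 0.56  # from the Tribune's 44% success figure

# Prevalence at which a coin would match the dog's false-positive ratio.
break_even = 1 - DOG_FP
print(f"break-even prevalence: {break_even:.0%}")

# Illustrative prevalence values (hypothetical, chosen away from the break-even).
for prevalence in (0.05, 0.088, 0.20, 0.60):
    coin_fp = coin_false_positive_ratio(prevalence)
    verdict = "dog does better" if DOG_FP < coin_fp else "coin does better"
    print(f"prevalence {prevalence:5.1%}: coin fp {coin_fp:5.1%} ({verdict})")
```

Note that the coin's bias cancels out. A coin that comes up heads 10% of the time searches fewer people, but the fraction of those searches that come up empty is the same as for a fair coin.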

Not that that’s saying much. And it doesn’t really hurt Scott’s point, either, because the dog is still wrong more than half the time, and each time it’s wrong, some innocent person has to endure the humiliation of a police search.

As is probably often the case, although Scott was wrong, the opposition is even wronger:

Dog-handling officers and trainers argue the canine teams’ accuracy shouldn’t be measured in the number of alerts that turn up drugs. They said the scent of drugs or paraphernalia can linger in a car after drugs are used or sold, and the dogs’ noses are so sensitive they can pick up residue from drugs that can no longer be found in a car.

This might be correct in a narrow sense. Dogs certainly are capable of detecting trace odors left behind by things that are no longer there. It’s a reasonable defense of the dog’s nasal prowess.

But so what? This isn’t about the dog, it’s about whether the search is justified. The only reason the police are allowed to invade your privacy and seize your property is that they have a good reason to believe they will find evidence of a crime. If the police aren’t finding evidence as often as they expect to, it suggests their reason for the search is not as good as they say it is. The cause of their error isn’t as important as the fact that they are in error.

I’m no lawyer, but I’m pretty sure a judge isn’t supposed to grant a search warrant because a location might once have had evidence of a crime. The police are supposed to have reason to believe that the evidence will be there when they search. If that’s a good rule for a judge, it ought to be a good rule for a dog. But it’s clear that in at least 56% of the cases when a dog alerts, the evidence isn’t there.

As if that wasn’t bad enough, the Tribune story gives us good reason to believe that the 56% error rate is optimistic.

The Tribune obtained and analyzed data from 2007 through 2009 collected by the state Department of Transportation to study racial profiling. But the data are incomplete. IDOT doesn’t offer guidance on what exactly constitutes a drug dog alert, said spokesman Guy Tridgell, and most departments reported only a handful of searches based on alerts. At least two huge agencies — the Chicago Police Department and Illinois State Police — reported none.

The Tribune asked both agencies for their data, but state police could not provide a breakdown of how often their dog alerts led to seizures, and Chicago police did not provide any data.

That leaves figures only for suburban departments. Among those whose data are included, just six departments averaged at least 10 alerts per year, with the top three being the McHenry County sheriff’s department, Naperville police and Romeoville police.

In other words, the 56% error rate is for dogs working in departments that were willing to disclose their dogs’ performance statistics. We can only wonder how bad the numbers are in departments that don’t want to reveal how well their dogs were doing. And then there are the departments that apparently don’t even care enough to keep statistics.

The most damning item in the Tribune article, however, is that the dogs’ success rate declines to 27% when the person being sniffed is Hispanic.

This is a reminder that these statistics aren’t a measure of the dog’s performance, they’re a measure of the performance of the dog-and-handler system, and I don’t think it’s the dogs that are likely to be prejudiced against Hispanics.

The most benign explanation for these numbers is that police dog handlers are more likely to expect Hispanics to have drugs, and that they somehow inadvertently cue the dog to alert. For example, if they lead a non-alerting dog around the cars of Hispanic drivers for a longer period of time than other drivers, the dog may learn that he can get his master to stop by doing a drug alert.

This sort of unintentional cueing is sometimes called the Clever Hans effect, after a horse that appeared to be able to accomplish all sorts of amazing mental feats, signalling his answers by stomping his foot. Eventually, scientists figured out that his owner would tense up when the horse was supposed to start answering a question and then relax as soon as he reached the right number of stomps. There is evidence that some drug dogs are doing the same thing.

Other explanations for the high error rate with Hispanics are that the police dog handlers are more likely to misinterpret a dog’s behavior as an alert, are intentionally cueing the dog to alert, or are simply lying about the alert because they want to do a search.

(It might also seem possible that Hispanics and their cars are simply exposed to drugs more often, perhaps due to greater involvement in drug culture, and that the dogs are alerting to drug traces. But I can’t think of an explanation for how Hispanics could have increased the rate at which they have had drugs without also increasing the rate at which they have drugs when searched. It seems to me those statistics should rise and fall together, which would not affect the dogs’ error rate.)

A big part of the problem with drug dogs is the lack of standards:

Experts said police agencies are inconsistent about the level of training they require and few states mandate training or certification. Jim Watson, secretary of the North American Police Work Dog Association, said a tiny minority of states require certification, though neither he nor other experts could say exactly how many.

A federally sponsored advisory commission has recommended a set of best practices, though they are not backed by any legal mandate.

Compare this to the situation with the breath testing devices used by police to detect intoxicated drivers. Those things are calibrated and tested regularly. If you get busted for blowing 0.09 and your lawyer can show that the testing device hasn’t been calibrated and tested according to the proper schedule, there’s a pretty good chance you’ll go free.

But if a dog at the side of the road alerts at your car, the cops are going to search you, and whatever they find will be usable, because the judges always believe the dogs.

Update: Radley Balko is taking on this same topic today.


  1. While I tend not to be overly concerned with complex math questions, I think I was correct on this one, as you are looking at an entirely different issue. My statement, about flipping a coin, had to do with a single search, a single coin toss, and a single decision based on that coin toss.

    (By the way, there are many studies about dog sniffs, and they generally come out in the 50% range. You found one of the many, but only one, and have taken some liberties in your extrapolation. Be careful about getting hung up on the one you find and assuming that someone else must be talking about the same thing merely because you’re unfamiliar with the subject.)

    Unless the odds of a single coin toss coming up heads (for example) are different than 50-50, that’s as far as my analogy goes. It’s fine that you’ve chosen to go elsewhere with the analogy, but don’t base your wandering down your own path on me, or conclude I must be wrong because you’ve decided to write about something entirely different.


  2. I guessed that Scott was referring to the Tribune article because it’s a recent report that’s been discussed around the blogosphere, and because Scott’s post was a convenient lead-in for me to discuss that article.

    In any case, the main point is not about a particular study of drug-sniffing dogs, but about the mathematics of conditional probability. In particular, the following two statements are not equivalent:

    (1) The probability of a drug dog alerting is 50%.

    (2) The probability that drugs will be found, given that a drug dog has alerted, is 50%.

    The first statement is a simple probability, but the second statement is a conditional probability, i.e. it expresses a statement of fact about the probability of one event given that some other event has actually occurred.

    A police officer flipping a coin to decide whether to search someone is a simple probability similar to the first statement. However, a statement that drug dogs are 50% accurate is a conditional probability in that it talks about the probability of finding drugs only after another condition has been met (the drug dog alerting).

  3. A leap at the wheel

    If only 44% of alerts lead to finding drugs, I think the logical argument is to claim that an alert indicates the suspect probably has no drugs and should be allowed to go on their way.


  4. Jeez, Mark, you’re giving me a headache. Never mind.

