Monday, June 05, 2006

Schneier on the NSA's base rate fallacy

Bruce Schneier has published another version of his perennial essay on the base rate fallacy and why data mining is a poor way to look for terrorists.

"Data mining works best when you're searching for a well-defined profile, a reasonable number of attacks per year, and a low cost of false alarms. Credit-card fraud is one of data mining's success stories: All credit-card companies mine their transaction databases for data for spending patterns that indicate a stolen card...

Terrorist plots are different; there is no well-defined profile and attacks are very rare. This means that data-mining systems won't uncover any terrorist plots until they are very accurate, and that even very accurate systems will be so flooded with false alarms that they will be useless."

He works through his usual example to examine the numbers. Even assuming you could build a highly accurate system (no current technology comes anywhere close), the police would have to investigate tens of millions of potential terrorist plots every day to have a chance of finding one real plot a month. Totally unrealistic. As Bruce says, finding terrorist plots is a needle-in-a-haystack problem, and you don't make the needle any easier to find by throwing more hay on the stack.
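
To make the arithmetic concrete, here is a minimal Python sketch of the base rate calculation. The inputs are illustrative assumptions rather than figures quoted in this post: a trillion events to sift through per year, ten real plots hidden among them, a 1-in-100 false positive rate and a 1-in-1,000 false negative rate. The function name alarm_stats is hypothetical too.

```python
def alarm_stats(events_per_year, real_plots_per_year,
                false_positive_rate, false_negative_rate):
    """Estimate the alarm workload of a screening system (illustrative only)."""
    # Everything that is not a real plot can still trigger a false alarm.
    innocent_events = events_per_year - real_plots_per_year
    false_alarms = innocent_events * false_positive_rate
    # Real plots that the system actually flags.
    true_alarms = real_plots_per_year * (1 - false_negative_rate)
    return {
        "false_alarms_per_day": false_alarms / 365,
        "true_alarms_per_year": true_alarms,
        "false_alarms_per_true_alarm": false_alarms / true_alarms,
        # Chance that any given alarm points at a real plot.
        "precision": true_alarms / (true_alarms + false_alarms),
    }


# Assumed inputs, chosen for illustration (not taken from this post).
stats = alarm_stats(
    events_per_year=10**12,
    real_plots_per_year=10,
    false_positive_rate=0.01,
    false_negative_rate=0.001,
)

for name, value in stats.items():
    print(f"{name}: {value:,.10g}")
```

Under these assumed inputs the system raises roughly 27 million false alarms a day, about a billion for every real plot it flags, so the chance that any single alarm is genuine is around one in a billion. Improving the false positive rate helps only linearly, while the rarity of real plots keeps the ratio lopsided.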

