False Positive

I’ve come across the idea of false positives a number of times. In a healthcare context, for example, it is a complicating factor for widespread screening programmes for conditions such as various types of cancer. If you devise a test which is 99% accurate for a condition which affects a relatively small proportion of the population, you will miss 1 in every hundred actual cases. Even worse, you will wrongly flag the same proportion of healthy people. If 1 in 1,000 people actually have the condition and you test a million people (i.e. 1,000 actual cases), you will miss 10 of them. In the scheme of things, that might not be too bad, but you will also identify roughly 10,000 healthy people (1% of the 999,000 who are unaffected) as having the condition. You probably have some backup tests to double-check and thus mitigate the problem, but at the very least you have caused widespread anxiety. Perhaps it is justified by the 990 who get prompt treatment, but it illustrates an issue which gets worse as the population increases and the proportion of true positives decreases.
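For anyone who wants to check the sums, here is a minimal sketch of the screening arithmetic in Python. It assumes that "99% accurate" means the test gets 99% of affected people right and 99% of healthy people right, which is my reading rather than anything a real test guarantees. The function name `screening_outcomes` is made up for illustration.

```python
from fractions import Fraction


def screening_outcomes(population, prevalence, accuracy):
    """Expected counts for a screening test.

    Assumption: the same accuracy applies to affected and healthy people.
    Returns (true_pos, false_neg, false_pos, true_neg).
    """
    affected = population * prevalence
    healthy = population - affected
    true_pos = affected * accuracy    # affected people correctly flagged
    false_neg = affected - true_pos   # affected people missed
    true_neg = healthy * accuracy     # healthy people correctly cleared
    false_pos = healthy - true_neg    # healthy people wrongly flagged
    return true_pos, false_neg, false_pos, true_neg


# Figures from the text: a million people, 1 in 1,000 affected, 99% accuracy.
# Fractions keep the arithmetic exact.
tp, fn, fp, tn = screening_outcomes(
    1_000_000, Fraction(1, 1000), Fraction(99, 100)
)
share_genuine = tp / (tp + fp)  # proportion of positive results that are real
print(f"caught: {tp}, missed: {fn}, false alarms: {fp}")
print(f"share of positives that are genuine: {float(share_genuine):.1%}")
```

On these assumptions, only about 9% of the people who test positive actually have the condition, even though the test itself is right 99% of the time.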

Writing for Boing Boing, Cory Doctorow highlights a Stats-based response to UK Tories’ call for social media terrorism policing, which runs on the same principles. If you could devise a very clever algorithm with 99% accuracy at spotting terrorists via their online activity and run it on the UK population, you would turn up a massive number of leads to follow up, only a small proportion of whom would be terrorists.
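To get a feel for the scale, here is a rough sketch of the same sums applied to a population of 60 million, the figure implied by the article. The numbers of genuine targets below are purely hypothetical values I have picked to show the trend; nobody has given me real ones. The function name `flagged_breakdown` is likewise invented for illustration.

```python
def flagged_breakdown(population, actual, accuracy_pct=99):
    """Expected leads from an algorithm that is accuracy_pct% accurate.

    Assumption: the same accuracy applies to targets and to everyone else.
    Returns (genuine_leads, false_leads, share_of_leads_that_are_genuine).
    """
    genuine_leads = actual * accuracy_pct / 100
    false_leads = (population - actual) * (100 - accuracy_pct) / 100
    share = genuine_leads / (genuine_leads + false_leads)
    return genuine_leads, false_leads, share


POPULATION = 60_000_000  # figure implied by the article

# Hypothetical counts of genuine targets, for illustration only.
for actual in (60, 600, 6_000):
    genuine, false, share = flagged_breakdown(POPULATION, actual)
    print(
        f"{actual:>5} actual: {genuine:,.0f} genuine leads, "
        f"{false:,.0f} false leads, {share:.3%} of leads genuine"
    )
```

Whichever hypothetical count you pick, the false leads stay at around 600,000, because they are driven almost entirely by the size of the innocent population rather than by the number of genuine targets.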

The number of people in the UK who use social media is safely below the 60 million implied by the article, but false positives would still swamp genuine positives, and this assumes that your algorithm is impressively accurate to begin with. Genuine terrorists would also be more likely to use obfuscation techniques to disguise their intent. Meanwhile, pity the poor fool who puts words like “airport”, “bomb” and “hello NSA” in the same sentence.

Wait a minute! I’d better change that but first let me go and see who’s ringing the doorbell at this time of night…


