You can’t help but ask: If we can see these clear suspect images shared by the FBI now, couldn’t someone possibly have detected the threat before the fact and headed off all that mayhem?
In short, the answer is no. That is simply too complex a task, beyond our human and current technological capabilities, said Prof. Jeremy Wolfe, a visual attention researcher in the departments of ophthalmology and radiology at Harvard Medical School and Brigham and Women’s Hospital. His take on the challenge:
With all these video cameras all over the place providing a mountain of data, somebody is likely to ask if, given that the signs were all there ahead of time, shouldn’t someone have seen this coming? Sadly, that task is essentially impossible, at least at this point. When you don’t have anything specific to look for or a specific place to look, even though you’ve got all this footage, there’s no way to do it.
For example, imagine this problem: Remember the bombs a few years ago in the London subway? You’ve got video cameras everywhere in the London subway. After the fact, you can see the bomber carrying a backpack down into the subway and coming out without it. Before the fact, you just don’t have enough humans to look at all that imagery of all the people on the ‘Tube.’ Perhaps we could use a computer.
Now imagine trying to create a computer program that will mark everybody as they go into the subway, find them again when they come out of the subway — in a different place, of course — and figure out if they are still carrying their backpack. If you think about that for a moment, you realize that is a monumentally hard problem. And that’s actually a smaller problem than the problem of ‘Is anybody doing anything suspicious in Boston today?’
The problem is still hard, but much more tractable, after the fact. Now you concentrate on a restricted range of time and a very restricted place. This probably still involves quite a lot of footage but we can point human eyeballs at every bit of that footage. We can fast-forward and rewind and turn this from an absolutely impossible search task to really quite a tractable one. It’s going to be difficult but it’s quite a tractable identification task.
In the case of the Boston bombing, you might have a pretty good idea that you’re looking for something like a black nylon bag. When you see it, you back up in time to see when it’s not there. That’s not trivial but that’s just good old-fashioned hard work as opposed to being flat out impossible.
We have seen this in other sorts of investigations, after the fact. The ability to walk this backwards does not in any way mean it would be reasonable to expect it to be picked up going forward in time. That’s a much more difficult task.
Let’s use a much more trivial example: If I send you into my house and say ‘I lost something. Please find it,’ that’s not going to be terribly easy. However, if I tell you that it’s a black nylon bag, that’s going to give you a lot of help. And if I say, ‘I left it under the kitchen table,’ then it will be quite trivial.
It is a silly example, but it does point out the difficulty of trying to detect an attack ahead of time: What you’re looking for is a threat, and a threat could come in all sorts of different forms. For instance, ahead of time we didn’t know we were looking for a bomber. It could have been a sniper.
But, I protested, don’t we know a terrorist is likely going to have a gun or a bomb?
Let’s assume it was even that simple, and not a guy with a knife. Now let’s think, there are 26 miles of the race course to monitor, and that’s assuming you’ve decided to worry about the Boston Marathon for some reason. But we didn’t have a reason to focus on the marathon. We would need to look at the whole city or the whole country. There just aren’t enough people to look at all the possible information you have. You just would never be able to do it.
Indeed, you’ve got to be impressed by the fact that people do manage to disrupt many attacks before they happen. However they are doing that, they aren’t doing it by examining every piece of video footage from everywhere. That is a technique for learning what happened, not what is going to happen.