Our recent affair with the Weiner pictures is by no means the first time alleged experts told us an image was fake because of some abstruse analysis of the pixels or header information. This last affair was a big enough embarrassment, however, to leave us wondering about those experts, and how they can all be so wrong. It also leaves us wondering how much to trust anyone who tells us an image was faked because of some JPEG artifacts or EXIF headers or whatnot.
In this diary, I will answer both of those questions. The shorty-short answer is that virtually all photo-is-fake diaries are bogus. The short answer is that it's very easy to mistakenly see evidence that an image is fake. The slightly longer answer is that lots of people have zero expertise in forensics, but are convinced they have profound knowledge of it because of career experience working with images.
Let me disclose that I do have actual career experience in image forensics. I am a professor of electrical engineering whose published research is in the area of multimedia security, including image and audio forensics. I am also a program chair and organizer for an annual conference on multimedia security, which in recent years has been about 25% image forensics.
Off the top of my head I can easily name at least 10 people who know 10 times as much as I do on the subject, but still, my background qualifies me to tell you a few important things about image forensics:
1.) Image forensics is a branch of information security, and it requires an extensive background in signal processing, probability and statistics.
If you can't explain to me why complex exponentials are eigenfunctions of time-invariant linear systems, then you are not an expert in image forensics---indeed, you are probably not an expert in signal processing in general. If you can't tell me the difference between ML and MAP rules, then you are not qualified to declare an image fake based on a statistical test.
In my experience, pretty much everyone on the Internet who tells me a picture is faked does not have any of this background. Bear that in mind the next time you see a look-at-the-pixels diary.
2.) You don't learn any of this stuff by using Photoshop for ten years.
Beware anyone who declares photos fake because of shadows or pixels, and then declares himself/herself an expert because of career experience in using Photoshop.
Would you believe I was an expert in computer networking because I have 16 years of experience using web browsers? Maybe if I wrote one, sure, but just by virtue of being a user? Lots of people are users, and it's very easy to be a user without knowing anything about algorithms or impulse responses.
3.) You don't learn about image forensics in art school, or by studying graphic design, or computer graphics, or even by getting a general degree in computer science or electrical engineering.
For some reason, people who work a lot with images get the impression that they are experts on images, and that image analysis is their domain of expertise. In reality, this is like working with people for 10 years and then deciding you are qualified to perform surgery.
An EE background will get you to the point that you can read and understand Hany Farid's papers, but it still doesn't make you any sort of authority on image forensics. You certainly don't become an authority by virtue of having a degree in graphic design.
4.) Even people with a decent background in signal processing and statistics will get statistics wrong.
For some reason, even very educated people trip over basic probability. Google the Monty Hall Problem to see some disturbing examples. From Wikipedia:
After the Monty Hall problem appeared in Parade, approximately 10,000 readers, including nearly 1,000 with PhDs, wrote to the magazine claiming that vos Savant was wrong. (Tierney 1991) Even when given explanations, simulations, and formal mathematical proofs, many people still do not accept that switching is the best strategy.
When anyone tries to tell you the odds of anything, bear this in mind. It is apparently very easy even for the most educated people to get the odds wrong. I don't know why that is, because probability is mathematically very simple. There is just something conceptually trippy about it. This is a bit scary when you consider that most people with degrees in science only take like one statistics course in their sophomore year. And believe me, kids who take a statistics course in their sophomore year don't know dick about anything by the time they become seniors.
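If you don't trust the argument for switching, a quick simulation is an easy way to check it. The following is only a minimal sketch of the standard three-door game; the function name, trial count and seed are my own choices for illustration, not anything from the Parade column or the Wikipedia article.

```python
import random

def monty_hall_win_rate(switch, trials=100_000, seed=0):
    """Estimate the chance of winning the car when staying or switching."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(trials):
        car = rng.randrange(3)    # door hiding the car
        pick = rng.randrange(3)   # contestant's first choice
        # The host opens a door that is neither the contestant's pick nor the car.
        opened = rng.choice([d for d in range(3) if d != pick and d != car])
        if switch:
            # Move to the one remaining closed door.
            pick = next(d for d in range(3) if d != pick and d != opened)
        wins += (pick == car)
    return wins / trials

stay_rate = monty_hall_win_rate(switch=False)
switch_rate = monty_hall_win_rate(switch=True)
print(f"stay: {stay_rate:.3f}  switch: {switch_rate:.3f}")
```

Staying should win about a third of the time and switching about two thirds, which is the answer all those letter writers rejected.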
5.) Ex-post-facto analysis of anything is a morass of mistakes and fallacies.
A common fallacy we see on the Internet is the confusion of the inverse. In plain English this means that we see something we think is very unlikely or anomalous (look at the pixels here, and the pixels there!) and then we conclude that it couldn't have happened by chance.
This is a common argument on the Internet in general, and it's bogus. The basic mistake is a confusion between "This image is unlikely if XYZ happened" and "XYZ is unlikely because this image happened." The gist is that elementary probability fools pretty much everyone, including people who should know better.
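To see how far apart those two statements can be, here is a small Bayes' rule calculation. Every number in it is made up purely for illustration: I assume 1 in 100 circulating images is fake, that some "anomaly" shows up in 90% of fakes, and that it also shows up in 5% of genuine images. The function and variable names are hypothetical too.

```python
def posterior_fake(prior_fake, p_anomaly_given_fake, p_anomaly_given_real):
    """Bayes' rule: P(fake | anomaly) from P(anomaly | fake) and P(anomaly | real)."""
    joint_fake = p_anomaly_given_fake * prior_fake
    joint_real = p_anomaly_given_real * (1.0 - prior_fake)
    return joint_fake / (joint_fake + joint_real)

# Illustrative, invented numbers (not measurements of anything).
p_fake = posterior_fake(prior_fake=0.01,
                        p_anomaly_given_fake=0.90,
                        p_anomaly_given_real=0.05)
print(f"P(anomaly | real) = 0.05, yet P(fake | anomaly) = {p_fake:.3f}")
```

Under these assumptions the anomaly is "unlikely if the image is real" (5%), and yet an image showing it is still real about 85% of the time, because genuine images vastly outnumber fakes. Change the assumed numbers and the answer changes, which is exactly why you need the numbers before drawing a conclusion.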
This is exacerbated by another fallacy, which I call the dartboard fallacy: when you throw a dart at a dartboard, the outcome you observe is incredibly unlikely to happen by chance. And yet it happened. This is actually true of most real-world events: the exact way things happen is always highly unlikely. If you spill a salt shaker, the exact arrangement of grains on the floor is incredibly unlikely. If you take a digital photograph, the CCD noise pattern is incredibly unlikely. At every moment, the exact arrangement of molecules in your immediate vicinity is incredibly unlikely. Thus it is easy to look at anything that happens in reality and spot some weird statistical anomaly.
This is central to most claims that an image is faked. We look at the pixels or headers or file creation dates and find something unusual. We always will, because every image file has something identifiably unusual about it.
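A rough way to put numbers on this: suppose an amateur sleuth checks a perfectly genuine image for 20 different kinds of oddity, and each check has a 5% chance of flagging a genuine image on its own. Both figures are assumptions I picked for illustration, and real checks are not independent, so treat this as a sketch of the effect rather than a model of any actual forensic test.

```python
import random

def p_at_least_one_false_alarm(num_checks, false_alarm_rate):
    """Chance that at least one of several independent checks flags a genuine image."""
    return 1.0 - (1.0 - false_alarm_rate) ** num_checks

def simulate_false_alarms(num_checks, false_alarm_rate, trials=20_000, seed=0):
    """Monte Carlo estimate of the same probability."""
    rng = random.Random(seed)
    flagged = 0
    for _ in range(trials):
        # A genuine image gets flagged if any single check fires by chance.
        if any(rng.random() < false_alarm_rate for _ in range(num_checks)):
            flagged += 1
    return flagged / trials

analytic = p_at_least_one_false_alarm(20, 0.05)
simulated = simulate_false_alarms(20, 0.05)
print(f"analytic: {analytic:.3f}  simulated: {simulated:.3f}")
```

With these assumed values, roughly 64% of genuine images turn up at least one "suspicious" finding. Look at enough things and something will always look weird.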
6.) Never believe anyone who tells you an image is fake unless:
- The poster is an established expert in image forensics, or
- They found the original image, or an original source of the faked image, or
- They can compute an actual p-value or some other statistic backing up their claims, with none of this fuzzy "that looks fishy" crap, and
- There has been ample time for a statistician to read the diary and point out what they did wrong.
Basically that excludes almost every diary we ever see. So my "expert" advice is to assume, a priori, that anyone who tells you a photo is fake because of the pixels or headers is a non-expert promoting pseudoscience, perhaps unwittingly.
Also,
7.) If anyone expresses any skepticism based on these principles, please do not shit on them.
I mean, for the good of the site.