IMINT Forensics: How to Detect Manipulated and AI-Generated Images (Tools, Techniques, Workflow)

IMINT · May 2, 2026

Pixels lie. The job is to make them confess.

IMINT forensics is the part of imagery intelligence that doesn't care what the picture shows — it cares what the file betrays. Every edit leaves a fingerprint somewhere: in the compression grid, in the sensor noise, in a shadow that points the wrong way. A working analyst doesn't argue with the picture. They open it in three tools and watch it contradict itself.

This is the field guide to that workflow — the techniques, the tools, and the parts where most people get it wrong.

Why image forensics is harder than it used to be

Until 2022, "manipulated" mostly meant Photoshop: a removed wedding ring, a pasted face, a sky swap. The forensics game was about catching those edits. Then diffusion models showed up and changed the question. Now the picture might never have existed at all — no scene, no camera, no edit. Just a model hallucinating photons.

That split the discipline in two. Classical forensics (the math of compression and sensors) is still essential, because most disinfo is still cheap recompression and sloppy splicing. AI-vs-AI detection (models trained to spot models) is the new layer on top. You need both.

The classical toolkit: where edits leave bruises

Error Level Analysis (ELA)

ELA re-saves a JPEG at a known quality, subtracts the result from the original, and visualises the difference. Untouched regions converge to a low local minimum of error. Edited regions don't. A pasted-in face suddenly glows white over a dark, uniform background.

Sounds magical. It isn't. ELA is famous for false positives — high-frequency areas like text, edges and fine textures light up the same way edits do. One of the more cited forensic critiques notes that ELA "incorrectly labels altered images as original and incorrectly labels original images as altered with the same likelihood" when used naively. Treat it as a first look, never a verdict.
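
The core of ELA fits in a dozen lines, which is part of why it gets over-trusted. A minimal sketch, assuming Pillow and numpy; the quality setting, the 20x scale and the filenames are illustrative choices, not canon:

```python
import io
import numpy as np
from PIL import Image

def ela(path, quality=95, scale=20):
    original = Image.open(path).convert("RGB")
    buf = io.BytesIO()
    original.save(buf, "JPEG", quality=quality)   # re-save at a known quality
    resaved = Image.open(buf)
    # Per-pixel absolute difference, amplified so differences stand out.
    diff = np.abs(np.asarray(original, dtype=np.int16)
                  - np.asarray(resaved, dtype=np.int16))
    return Image.fromarray(np.clip(diff * scale, 0, 255).astype(np.uint8))

ela("suspect.jpg").save("suspect_ela.png")   # hypothetical filenames
```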

JPEG ghosts and double quantisation

When you re-save a JPEG, the second compression leaves arithmetic ghosts in regions that were previously compressed at a different quality. Splice a face from a different JPEG and the spliced region shows a different ghost pattern. This is what tools like Forensically and FotoForensics expose.

Double-quantisation analysis is the more rigorous cousin: it looks at DCT coefficient histograms for the periodic artefacts that two stacked compressions produce. Slower, harder, more reliable. If you only run ELA you will miss things that double-quantisation will catch immediately.
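
A sketch of the ghost sweep, in the spirit of Farid's JPEG-ghost technique: re-save across a range of qualities, average the squared difference over blocks, and look for regions whose error minimum lands at an outlier quality. Pillow and numpy assumed; block size and quality range are arbitrary choices:

```python
import io
import numpy as np
from PIL import Image

def ghost_maps(path, qualities=range(50, 100, 5), block=16):
    img = np.asarray(Image.open(path).convert("L"), dtype=np.float64)
    h = (img.shape[0] // block) * block
    w = (img.shape[1] // block) * block
    img = img[:h, :w]
    maps = []
    for q in qualities:
        buf = io.BytesIO()
        Image.fromarray(img.astype(np.uint8)).save(buf, "JPEG", quality=q)
        resaved = np.asarray(Image.open(buf), dtype=np.float64)
        d = (img - resaved) ** 2
        # Blockwise mean difference at this re-save quality.
        d = d.reshape(h // block, block, w // block, block).mean(axis=(1, 3))
        maps.append(d)
    return np.stack(maps)   # shape: (n_qualities, blocks_y, blocks_x)

# Blocks whose minimum sits at a different quality than the bulk of the
# image are ghost candidates -- likely pasted from another JPEG.
ghost_quality = ghost_maps("suspect.jpg").argmin(axis=0)
```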

PRNU — the camera's fingerprint

Every CMOS sensor introduces a unique pattern of pixel-level noise called Photo Response Non-Uniformity. Two pixels next to each other on the same sensor respond slightly differently to identical light. Average enough pictures from the same camera and that pattern emerges. Now you have a fingerprint that survives JPEG compression, mild filtering, and modest cropping.

PRNU is how forensic labs say "these two photos came from this specific phone." It's also how investigators flag splices: if a region of an image fails to correlate with the rest of the PRNU pattern, something foreign was pasted in.

The catch: PRNU is fragile. Heavy compression, denoising, or platform re-encoding can grind it down to noise. WhatsApp and Instagram routinely do exactly this.
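
The structure of a PRNU check is simple even when production extractors aren't. A deliberately crude sketch: a Gaussian blur stands in for the wavelet denoiser real pipelines use (the Lukas/Fridrich/Goljan extractor), the filenames are hypothetical, and every image is assumed to share one resolution and orientation:

```python
import numpy as np
from PIL import Image
from scipy.ndimage import gaussian_filter

def residual(path):
    # Noise residual: the image minus a denoised copy of itself.
    img = np.asarray(Image.open(path).convert("L"), dtype=np.float64)
    return img - gaussian_filter(img, sigma=1.5)

def fingerprint(paths):
    # The sensor pattern emerges as the mean of many residuals.
    return np.mean([residual(p) for p in paths], axis=0)

def correlation(a, b):
    a, b = a - a.mean(), b - b.mean()
    return float((a * b).sum() / np.sqrt((a * a).sum() * (b * b).sum()))

fp = fingerprint([f"cam_{i:02d}.jpg" for i in range(1, 21)])
print(correlation(fp, residual("questioned.jpg")))
# Same camera: well above the noise floor. A spliced region, cropped out
# and tested on its own, should fail to correlate.
```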

CFA inconsistency

A single-sensor camera doesn't capture full-colour pixels — it captures one colour per pixel through a Bayer filter, then interpolates the rest. That interpolation leaves a periodic statistical pattern across the image. Spliced or AI-generated regions don't share the host image's CFA pattern, because they came from a different sensor (or no sensor at all).

This is one of the cleanest tells when it works, and it's the basis for the techniques described in Popescu and Farid's foundational paper on exposing forgeries in colour-filter-array interpolated images.
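
A crude periodicity probe in that spirit, not the full expectation-maximisation method from the paper: high-pass the green channel and measure the energy at the Nyquist peaks its spectrum should contain if the pixels really came through a Bayer interpolation. Pillow and numpy assumed; works best on lightly compressed images, since heavy JPEG grinds the pattern down:

```python
import numpy as np
from PIL import Image

g = np.asarray(Image.open("suspect.png").convert("RGB"),
               dtype=np.float64)[..., 1]          # green channel
# High-pass residual: interpolated pixels are smoother than captured ones.
highpass = g - 0.25 * (np.roll(g, 1, 0) + np.roll(g, -1, 0)
                       + np.roll(g, 1, 1) + np.roll(g, -1, 1))
spectrum = np.abs(np.fft.fftshift(np.fft.fft2(highpass)))
h, w = spectrum.shape
# Period-2 interpolation puts energy at the Nyquist peaks of the spectrum.
peaks = spectrum[0, w // 2] + spectrum[h // 2, 0] + spectrum[0, 0]
print(peaks / spectrum.mean())   # high ratio: consistent with demosaicing
```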

Lighting and shadow physics

This is where forensics stops being software and starts being geometry. Cast shadows must converge to a single light source. Attached shadows on faces must agree with that direction. Eye reflections (catchlights) should match the room's actual light layout. A face composited into a scene almost always fails one of these tests.

Hany Farid — the practical father of the discipline — has built a whole sub-field around this. He calls it "photo forensics from lighting environments," and it's one of the few techniques that works equally well on Photoshop edits and AI generations, because diffusion models still produce inconsistent shadows in complex scenes with multiple objects.
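
The cast-shadow test reduces to a few lines once an analyst has hand-picked the point pairs. A sketch with illustrative pixel coordinates: extend each shadow-to-object line, compute the pairwise intersections, and a wide scatter among them means at least one shadow disagrees about where the light is:

```python
import itertools
import numpy as np

def intersect(p1, d1, p2, d2):
    # Solve p1 + t*d1 = p2 + s*d2 for the 2-D intersection point.
    t, _ = np.linalg.solve(np.array([d1, -d2]).T, p2 - p1)
    return p1 + t * d1

# (shadow point, object point) pairs in pixel coordinates -- illustrative
# values an analyst would click out by hand, not automatic detections.
pairs = [((120, 700), (150, 420)), ((480, 690), (500, 410)),
         ((800, 710), (790, 430))]
lines = [(np.array(s, float), np.array(o, float) - np.array(s, float))
         for s, o in pairs]
points = [intersect(p1, d1, p2, d2)
          for (p1, d1), (p2, d2) in itertools.combinations(lines, 2)]
# Tight cluster: one consistent light source. Wide scatter: a shadow lies.
print(np.std(points, axis=0))
```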

Copy-move detection

Someone clones grass over a body. Someone duplicates a crowd to make a rally look bigger. Copy-move detection scans the image for self-similar regions — patches that are identical except for position. It catches the lazy edits. It misses the careful ones, because skilled editors rotate, scale and recolour their pastes before pressing save.
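
A naive version of that scan is short. This sketch hashes coarsely quantised blocks and flags identical hashes at distant positions; it catches literal clones only, exactly the limitation described above. Block size, stride and quantisation step are arbitrary:

```python
import collections
import numpy as np
from PIL import Image

img = np.asarray(Image.open("suspect.jpg").convert("L"), dtype=np.int16)
B, STEP, Q = 16, 4, 16        # block size, stride, quantisation step
seen = collections.defaultdict(list)
for y in range(0, img.shape[0] - B, STEP):
    for x in range(0, img.shape[1] - B, STEP):
        key = (img[y:y + B, x:x + B] // Q).tobytes()   # coarse block hash
        seen[key].append((y, x))

for positions in seen.values():
    (y0, x0), (y1, x1) = positions[0], positions[-1]
    # Ignore self-matches and near neighbours (flat sky, plain walls).
    if len(positions) > 1 and abs(y0 - y1) + abs(x0 - x1) > 2 * B:
        print("possible clone:", positions)
```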

Detecting AI-generated images

Classical methods leak signal on AI images, but they weren't built for this. The new layer is model-vs-model: detectors trained on diffusion outputs that learn the distributional fingerprints of specific generators.

What actually shows up under a magnifier:

  • Anatomical errors. Six fingers, fused fingers, twelve teeth, asymmetric pupils. Recent research catalogues these as the most reliable human-eye tells in current diffusion outputs.
  • Catchlight asymmetry. A real face lit by one window has matching reflections in both eyes. AI faces frequently don't.
  • Background incoherence. Crowds turn into mush. Text becomes dream-text. Bookshelves grow impossible spines.
  • Frequency-domain signatures. Diffusion models leave faint periodic patterns in the Fourier spectrum that don't appear in camera images. This is what most automated detectors actually key on (see the sketch after this list).
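
A simple way to eyeball the frequency domain is the azimuthally averaged power spectrum: collapse the 2-D spectrum into a 1-D profile by radius and look for bumps where a camera image would fall off smoothly. A diagnostic sketch, assuming Pillow and numpy, not a classifier:

```python
import numpy as np
from PIL import Image

img = np.asarray(Image.open("suspect.png").convert("L"), dtype=np.float64)
spec = np.abs(np.fft.fftshift(np.fft.fft2(img))) ** 2
cy, cx = spec.shape[0] // 2, spec.shape[1] // 2
y, x = np.indices(spec.shape)
r = np.hypot(y - cy, x - cx).astype(int)
# Mean power at each integer radius = the 1-D spectral profile.
profile = np.bincount(r.ravel(), weights=spec.ravel()) / np.bincount(r.ravel())
print(np.log10(profile[1:cy]))   # plot this; look for bumps near the tail
```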

The detection layer is moving fast. Reality Defender runs a multi-model ensemble across video, image, audio and text. Sensity, founded by University of Amsterdam researchers, claims 95–98% accuracy on its premium tier. TrueMedia.org, the nonprofit option, shut down in early 2025. Hive's Chrome extension covers the "while you browse" use case.

A note on accuracy claims: vendor numbers are best-case. Independent comparative work consistently finds that detectors strong on one generator collapse on the next, especially after the image has been re-encoded by a social platform. Run multiple detectors. Trust none of them alone.

The practical toolbox

What an analyst actually opens:

Forensically — browser-based, free, contains ELA, clone detection, noise analysis, level sweep, magnifier and metadata viewer. The thing you reach for in the first sixty seconds.

FotoForensics — the workhorse for ELA on uploaded images, plus a respectable metadata reader. Public uploads are a privacy concern, so don't drop sensitive material in.

Ghiro — automated, open-source, batch-oriented. The one you deploy when you have a thousand images and a deadline. Indexes EXIF, GPS, hashes, ELA, and groups results into cases.

Image Verification Assistant — the InVID/WeVerify suite. Multiple forensic algorithms in one upload. The browser plugin adds deepfake detection, reverse search and a magnifier good enough for licence plates and shoulder patches.

Amped Authenticate — the commercial, court-grade tool. Used by law enforcement worldwide. Combines virtually every classical technique with deepfake detection inside a single forensic-report-friendly interface.

Sherloq — open-source desktop app, basically a power-user GUI on top of every classical algorithm worth running locally. No cloud uploads, which matters for sensitive cases.

ExifTool — the metadata Swiss Army knife. Not a forensic tool by itself, but no analysis starts without it.

For AI detection, run images through at least two of: Hive AI Detection, Reality Defender, Sensity, and the deepfake panel inside InVID-WeVerify.

A workflow that actually works

  1. Read the metadata first. Camera make and model, software field, timestamp, GPS (see the sketch after this list). Half the cases die here — the "iPhone photo from the front line" turns out to have been made in Photoshop on a Windows desktop.
  2. Reverse-image search before anything else. If the image has been online since 2018, you're not investigating, you're re-discovering.
  3. Run ELA and clone detection in Forensically. Note hot spots, but don't conclude anything yet.
  4. Pull the JPEG ghost or double-quantisation view. Cross-reference with the ELA hot spots.
  5. Eyeball the lighting. Pick a clear cast shadow. Trace it back. Does every other shadow agree?
  6. If you suspect AI, hit it with two detector ensembles, then hand-check the magnifier view: hands, teeth, eyes, background text.
  7. Write the verdict in probabilities, not certainties. "Highly likely composited" is honest. "Confirmed fake" is a lawsuit.
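
Step one in code, for the batch-minded: ExifTool's -j flag emits JSON, which makes the first-sixty-seconds metadata read scriptable. The filename and field list are illustrative; exiftool is assumed to be on PATH:

```python
import json
import subprocess

meta = json.loads(subprocess.check_output(["exiftool", "-j", "suspect.jpg"]))[0]
for field in ("Make", "Model", "Software", "CreateDate", "GPSPosition"):
    print(f"{field:12} {meta.get(field, '-- missing --')}")
# "Software: Adobe Photoshop" on a claimed straight-from-camera image is
# the classic early exit. Everything missing usually means platform
# laundering stripped the file on the way to you.
```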

Where this goes wrong

The most common failure isn't the technique — it's the analyst. People run ELA, see red splotches in a textured area, and call it manipulation. They run a deepfake detector that returns 0.62 and call it a deepfake. Forensic findings have to be triangulated. Three independent methods agreeing is a finding. One method screaming is a hypothesis.

The other failure is platform laundering. Telegram, WhatsApp, Twitter, Instagram — each one strips metadata, re-encodes the JPEG, and degrades the very signals the classical methods need. By the time an image has been forwarded twice, ELA is mostly noise and PRNU is gone. If you can get the original file, get it. If you can't, weight your conclusions accordingly.

Pixels lie. So do the people who interpret them. The tools are getting better. So are the fakes. The job — staying one careful step ahead — doesn't change.