One photo, thirty minutes, and a fugitive who'd been hiding for three decades was back on the front page. That's what facial recognition does in a serious OSINT workflow — and that's also exactly why it scares regulators.
Facial recognition is the loudest sub-discipline of IMINT (Imagery Intelligence) right now. Not because it's new — the math has been around since the 90s — but because the indexes have grown so large that a single front-facing crop now pulls hits from across the open web, social platforms, dating sites, leaked databases, and surveillance scrapes. Used well, it's the fastest way to turn an anonymous face into a name. Used carelessly, it's a fast track to libelling the wrong human being.
This post is the working operator's view: how the engines actually work, which one to reach for and when, the techniques that keep your false-positive rate down, and the places where the legal floor is paper-thin.
What facial recognition actually does (in 60 seconds)
Forget the "AI magic" framing. A face engine does three things: detect the face in your image, convert it to a numeric vector (an embedding — typically 128 to 512 floats encoding learned facial geometry, not hand-picked measurements like eye spacing), and compare that vector against an indexed corpus of pre-computed embeddings using cosine similarity. The "match" you see is just a similarity score above a threshold.
That detail matters operationally for two reasons. First: an engine can only return faces it has indexed. PimEyes has not crawled your private Instagram; it crawled the open web. Coverage is everything. Second: similarity is not identity. Two unrelated people with similar bone structure score high all the time. NIST's FRVT benchmarks have shown for years that false-positive rates vary by orders of magnitude depending on demographic, image quality, and threshold — and that demographic skew is real, with false-positive rates differing across demographic groups by factors of ten to beyond a hundred for many of the algorithms tested.
If you're treating a single match as a confirmed ID, you're not doing OSINT. You're guessing in HD.
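The whole pipeline reduces to vector comparison. Here's a minimal sketch of the scoring step — the 0.6 threshold is illustrative, not any engine's real cutoff, and real systems compare against millions of indexed vectors with approximate-nearest-neighbour search rather than a single pair:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two face embeddings: the dot product
    of the vectors divided by the product of their lengths."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def is_candidate(query: list[float], indexed: list[float],
                 threshold: float = 0.6) -> bool:
    """A 'match' is nothing more than similarity above a threshold.
    Lower the threshold and the hit list grows — along with your
    false-positive rate."""
    return cosine_similarity(query, indexed) >= threshold
```

The threshold is where the demographic-skew problem lives: a cutoff tuned on one population over- or under-matches on another, which is exactly what the FRVT numbers measure.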
The engines, ranked by what they're actually good for
PimEyes — the open-web workhorse
PimEyes is the default for global, non-social web coverage. The company has publicly described an index in the billions of images, with their CEO confirming roughly 2.1 billion faces hashed at one point and newer reporting putting indexed images around 3.5 billion. A video-search expansion was announced in mid-2025. PimEyes deliberately does not index most social platforms — it claims to focus on the open web — which is both a feature (cleaner results, fewer scraped portrait shots) and a limit (don't expect Instagram coverage).
This is the engine Bellingcat used to surface Daniela Klette, the former Red Army Faction (RAF) fugitive who'd been on the run for thirty years. Capoeira event photos pulled from association websites — that was the weak spot. Not her face on social media. Her face in someone else's photo, on the open web. That's the PimEyes sweet spot.
FaceCheck.id — for social profiles and the grey corners
FaceCheck.id indexes around 977 million images and explicitly hits social media — Instagram, Facebook, Twitter/X, LinkedIn, TikTok, YouTube — plus dating apps, escort directories, and adult content. Where PimEyes is the cleaner global crawler, FaceCheck is the one you reach for when the subject is likely active on platforms PimEyes won't touch. The trade-off is obvious: broader, dirtier, and legally murkier in many jurisdictions. Treat its hits as leads, not evidence.
The Russian-internet stack: FindClone, Search4Faces, SearchFace.ru
If your subject has any plausible Russian or post-Soviet trace, the global engines underdeliver. Search4Faces indexes profile photos from VKontakte, Odnoklassniki, TikTok and Clubhouse — somewhere on the order of hundreds of thousands of main profile pictures plus a much larger pool of secondary VK photos. Bellingcat documents Search4Faces as a workhorse for VK-anchored investigations. SearchFace.ru has been a similar tool since 2019, and FindClone — the paid Russian-network engine — has been used in war-crimes attribution and protest doxxing alike, as documented in AlgorithmWatch's reporting on the dual-use reality of these services.
One operational note: VK has historically threatened legal action over scraped indexes. These tools come and go. Always check whether the engine is currently up before you build a workflow around it.
Lenso.ai — the new entrant worth testing
Lenso.ai launched out of Poland in 2024 with a reverse-face product, an API for OSINT platforms, and alerting on new matches. It's faster than most competitors on benchmark tests, but its index is younger and shallower than PimEyes. Useful as a second-engine vote when you want corroboration without paying for two heavy subscriptions.
Yandex — still the reverse-image king for faces
Yandex Images isn't a face-recognition engine in the strict sense — it's a reverse-image search that happens to be unusually good at finding the same human across different photos. For years it was the default OSINT recommendation because Google's reverse image search refuses to return obvious face matches and Yandex doesn't share that restraint. Coverage outside Russia and CIS has shrunk since 2022 sanctions, but it still surfaces matches the Western engines miss, especially when there's any Eastern European footprint.
Pair it with TinEye, Bing Images, and Baidu Images for full reverse-image coverage. They won't do face recognition — they do exact-image matching — but they're how you find the original source of a photo your face engine returned.
Clearview AI — the elephant you can't use
Clearview claims a 70+ billion image index sourced from public web and social scraping — the largest in the world. Reporting from The Record documents that law-enforcement search volume doubled into 2023. The catch: it's restricted to government and law enforcement clients. If you're not LE, it doesn't exist for you. Worth knowing it's out there because it shapes what's possible — not what you'll be running tonight.
The technique stack that actually keeps you out of trouble
The tools are the easy part. The discipline is the rest of this section.
Crop and normalise before you submit. Engines score better on tight, front-facing crops with the eyes roughly horizontal. Group shot? Crop the face out. Subject at a 30-degree angle? Most engines tolerate it; some don't. If you have multiple photos of the same person, run all of them — embeddings differ across pose and lighting, so different photos surface different parts of the index.
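The crop step is mechanical enough to script. A minimal sketch of the margin-and-clamp logic — the 25% margin is a working assumption, not a published optimum, and the box format follows the common (left, top, right, bottom) convention a detector would hand you:

```python
def crop_box(face_box: tuple[int, int, int, int],
             img_w: int, img_h: int,
             margin: float = 0.25) -> tuple[int, int, int, int]:
    """Expand a detector's (left, top, right, bottom) face box by a
    proportional margin and clamp to image bounds. A tight-but-not-
    skin-tight crop tends to score better than either the raw
    detector box or the full group shot."""
    left, top, right, bottom = face_box
    dw = int((right - left) * margin)
    dh = int((bottom - top) * margin)
    return (max(0, left - dw), max(0, top - dh),
            min(img_w, right + dw), min(img_h, bottom + dh))
```

Feed the result to Pillow's `Image.crop` (or any equivalent) and submit the cropped file, not the original.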
Multi-engine, always. One hit from one engine is a lead. Two independent hits across two engines, especially across different coverage zones (e.g., PimEyes + Search4Faces), is meaningful. If you're paying for one subscription, you're working with a blind spot.
Pivot off the source page, not the face. The face match opens a door. The page it sits on is what tells you who the person is — username, account name, captions, comments, contextual photos. The actual identification almost always comes from the second pivot, not the engine.
Verify with the background, not just the face. Run Yandex, TinEye, and SauceNAO on the photo itself once you have a candidate. If the background appears in another photo of the same person on a different platform, you've cross-validated. If the EXIF survives, run it through ExifTool. If you suspect the photo's been edited, drop it into FotoForensics.
Look for biometric tells the engine ignores. Bellingcat's reviewers explicitly look for moles, ear-lobe shape, freckle patterns, scars — features the embedding may not weight heavily but a human can confirm. If the engine returns a 92% match but the subject's left ear has a notch and the candidate's doesn't, the engine is wrong.
Demand at least two corroborating signals before you call it. Face match + matching username on a profile = lead. Face match + username + same background + known associate in a comment = identification. The number of stories that fell apart because someone treated a single PimEyes hit as gospel is not small.
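The two-signals rule is simple enough to encode as a checklist. A sketch — the field names are illustrative, not a standard schema, and no scoring function substitutes for a human reviewing the source pages:

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    name: str
    face_match: bool = False       # hit from at least one engine
    second_engine: bool = False    # independent hit, different engine
    username_match: bool = False   # same handle on the source page
    background_match: bool = False # same location in another photo
    known_associate: bool = False  # associate appears in comments etc.

def assessment(c: Candidate) -> str:
    """Encode the discipline: a lone face match is a lead, never an
    identification. Two corroborating signals on top of the match is
    the floor for calling it."""
    if not c.face_match:
        return "no basis"
    signals = sum([c.second_engine, c.username_match,
                   c.background_match, c.known_associate])
    if signals >= 2:
        return "identification"
    if signals == 1:
        return "strong lead"
    return "lead"
```

The point of writing it down, even this crudely, is that it forces you to log which signals you actually have before you publish a name.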
The legal floor is uneven and getting lower
Different jurisdictions, different rules. The EU's GDPR and the new AI Act treat biometric search as high-risk processing. Illinois' BIPA has produced multi-million-dollar settlements against face-search vendors. Several US states have biometric-privacy bills at various stages of passage. The UK's ICO has taken enforcement action against Clearview. None of this stops the tools from working — but it does shape what you can publish, what evidence holds up, and whether your client gets sued.
If you're operating commercially: get the legal review before the search, not after. If you're a journalist: document your methodology and your verification chain. The tools that scare people the most are the ones that look least defensible when read out loud in a courtroom.
Bottom line
Facial recognition isn't a button that returns identities. It's a probabilistic narrowing tool that points you at pages worth reading. The operators who use it well treat it as the cheapest part of the workflow — the work is in the corroboration. Engines change names, indexes shift, and a year from now half this list will look different. The technique stack — crop, multi-engine, pivot, corroborate, two signals minimum — is what survives.
Use it like that, and a face becomes a thread you can pull. Use it lazily, and you'll publish the wrong name. There is no third option.
