Why detecting AI-generated images is difficult
Earlier generations of AI image tools — particularly early GAN models — produced visible artifacts: blurry text, warped backgrounds, extra fingers, and asymmetric faces. These heuristics spread widely, and many people still rely on them. But modern diffusion models like Stable Diffusion 3, DALL-E 3, Midjourney v6, and GPT-Image-2 produce images where these obvious tells have been largely eliminated.
The artifacts that remain are statistical and frequency-domain anomalies — differences in how pixel values are distributed, patterns in high-frequency components, and subtle inconsistencies in lighting physics. These are invisible to the eye but detectable by AI models trained specifically to identify them.
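To make "patterns in high-frequency components" concrete, the sketch below measures what share of a small grayscale patch's spectral energy sits at high spatial frequencies. It is a toy statistic on synthetic data, not a detector: real detectors learn such cues from large training sets. The function names, the naive transform, and the 0.25 cycles-per-pixel cutoff are illustrative assumptions.

```python
import cmath
import math


def dft2(patch):
    """Naive 2D discrete Fourier transform of a small square grayscale patch."""
    n = len(patch)
    out = [[0j] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            total = 0j
            for x in range(n):
                for y in range(n):
                    angle = -2 * math.pi * (u * x + v * y) / n
                    total += patch[x][y] * cmath.exp(1j * angle)
            out[u][v] = total
    return out


def high_frequency_fraction(patch, cutoff=0.25):
    """Share of non-DC spectral energy at or above `cutoff` cycles per pixel."""
    n = len(patch)
    spectrum = dft2(patch)
    high = total = 0.0
    for u in range(n):
        for v in range(n):
            if u == 0 and v == 0:
                continue  # skip the DC term (mean brightness)
            # Fold indices so frequencies above Nyquist map back into [0, 0.5]
            fu = min(u, n - u) / n
            fv = min(v, n - v) / n
            energy = abs(spectrum[u][v]) ** 2
            total += energy
            if math.hypot(fu, fv) >= cutoff:
                high += energy
    return high / total if total else 0.0


# Two synthetic 8x8 patches: a gentle gradient and a pixel-level checkerboard
smooth = [[x + y for y in range(8)] for x in range(8)]
checker = [[255 * ((x + y) % 2) for y in range(8)] for x in range(8)]
print(high_frequency_fraction(smooth), high_frequency_fraction(checker))
```

The checkerboard concentrates nearly all of its energy at the highest frequency, while the gradient keeps most of its energy at low frequencies. A single ratio like this cannot separate real from generated images on its own. It only illustrates the kind of signal that is invisible to the eye but measurable.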
Key Stat
ScamAI research (arXiv:2602.07814) found that leading open-source AI image detectors achieve accuracy as low as 50–60% on out-of-distribution generated images — barely better than a coin flip.
Visual signs that may indicate AI generation
Visual inspection still catches some AI-generated images, particularly those from older tools or quick generations without manual refinement. These are clues, not proof — any of them can also appear in real photographs. Treat them as reasons to investigate further, not definitive conclusions.
- Hands and fingers — AI models historically struggle with hand anatomy. Extra, missing, or oddly bent fingers appear in lower-quality generations.
- Background physics — AI-generated backgrounds sometimes show physically impossible elements: text that is illegible or reversed, architectural details that do not make structural sense, reflections that do not match the scene.
- Skin texture uniformity — AI-generated faces often have unnaturally smooth or overly uniform skin with no natural imperfections like pores or fine lines.
- Hair rendering — individual hair strands often merge into each other or display an unnatural flow, particularly at the edges of the head.
- Teeth — AI-generated smiles sometimes show unnaturally uniform teeth or a strange blending where teeth meet gums.
- Earrings and jewelry — asymmetric or impossible jewelry is a common tell in AI-generated portraits.
Importantly, GPT-Image-2, Midjourney v6, and Stable Diffusion XL have substantially improved in all of these areas. A fully realistic AI image from a current model may pass every visual check. This is why automated detection is essential for any platform where AI-generated images pose a fraud or authenticity risk.
Metadata and provenance checks
Some AI image generators embed metadata in the files they produce. Checking EXIF data can reveal that an image was produced by a specific software tool, or that it has no camera metadata at all, which is consistent with digital creation but does not prove it. Metadata is easily stripped by resaving, screenshotting, or uploading to social platforms, which compress and reprocess images, so a genuine photograph may also arrive with no EXIF data.
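As a rough illustration of the EXIF check, the sketch below scans a JPEG's header segments for an APP1 block carrying Exif data, using only the standard library. The function name is hypothetical, and the scan only reports whether an Exif block is present. It does not parse camera fields, and it does not handle other formats such as PNG or WebP.

```python
import struct


def has_exif_segment(jpeg_bytes: bytes) -> bool:
    """Return True if a JPEG byte string contains an APP1 segment holding Exif data."""
    if not jpeg_bytes.startswith(b"\xff\xd8"):  # SOI marker opens every JPEG
        raise ValueError("not a JPEG file")
    pos = 2
    while pos + 4 <= len(jpeg_bytes):
        if jpeg_bytes[pos] != 0xFF:
            break  # malformed or unexpected data, stop scanning
        marker = jpeg_bytes[pos + 1]
        if marker == 0xDA:  # start of scan: compressed image data follows
            break
        # Segment length is big-endian and includes its own two bytes
        (length,) = struct.unpack(">H", jpeg_bytes[pos + 2:pos + 4])
        payload = jpeg_bytes[pos + 4:pos + 2 + length]
        if marker == 0xE1 and payload.startswith(b"Exif\x00\x00"):
            return True
        pos += 2 + length
    return False
```

To try it on a file, read the bytes in binary mode and pass them in, for example `has_exif_segment(open("photo.jpg", "rb").read())` with a path of your own. A `False` result is a reason to look further, not evidence of AI generation.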
The Content Credentials (C2PA) standard, promoted by Adobe and Truepic among others, attaches a cryptographically signed provenance record to an image, ideally at the moment of capture. This requires C2PA-enabled cameras and editing software. As of 2026, C2PA adoption covers a small fraction of cameras in use, and it cannot retroactively authenticate existing images. AI-based detection is the main automated method that can be applied to any image regardless of its metadata.
Pro Tip
Reverse image search can help verify if a supposed real photo appears elsewhere with different context — but it will not detect a freshly generated AI image that has never been indexed.
Free tools to detect AI-generated images
Several free tools exist for checking individual images. They vary significantly in accuracy and the generation technologies they cover.
- ScamAI (app.scam.ai) — 200 free detections per month via API or web interface. Eva-v1 model covers GAN images, diffusion outputs (Stable Diffusion, DALL-E, Midjourney, Flux, GPT-Image-2), and face swaps. Returns a confidence score.
- Hive Moderation — free tier available for AI-generated content detection. Broad content moderation platform, less specialized for deepfakes specifically.
- Illuminarty — web-based tool focused on detecting AI-generated art. Better on Midjourney and Stable Diffusion outputs.
For individual one-off checks, any of these tools provides a starting point. For systematic screening, such as verifying every profile photo on a platform or every image submitted in an insurance claim, a free-tier web tool is not sufficient. Production use cases require an API.
API-based detection for developers and platforms
For any application where AI-generated images represent a fraud, trust, or authenticity risk, integrating a detection API is the appropriate solution. API-based detection runs automatically at scale, returns machine-readable confidence scores, and processes images in under 4 seconds.
ScamAI's detection API accepts image URLs or base64-encoded payloads and returns a JSON response. The integration is minimal — most platforms are up and running in under 10 minutes.
import requests

response = requests.post(
    "https://api.scam.ai/v1/detect",
    headers={"Authorization": "Bearer YOUR_API_KEY"},
    json={"image_url": "https://example.com/profile-photo.jpg"},
    timeout=30,  # avoid hanging indefinitely on a stalled connection
)
response.raise_for_status()  # surface HTTP errors before parsing the body
result = response.json()
# Example response: {"is_deepfake": true, "confidence": 0.94, "model": "eva-v1-pro"}

The confidence score lets you set your own threshold based on your use case. A KYC provider screening onboarding selfies might flag anything above 0.7 for manual review. A content moderation platform might auto-reject anything above 0.95 and queue lower-confidence results for human review.
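One way to act on the score is a small routing function. The sketch below combines the two example policies into a single three-way decision. The function name, the action labels, and the default cutoffs of 0.7 and 0.95 are illustrative assumptions taken from the examples above, not part of any API.

```python
def route(confidence: float, review_at: float = 0.7, reject_at: float = 0.95) -> str:
    """Map a detector confidence score to an action (thresholds are illustrative)."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0 and 1")
    if confidence > reject_at:
        return "reject"   # high confidence that the image is AI-generated
    if confidence > review_at:
        return "review"   # uncertain band: send to a human reviewer
    return "accept"


for score in (0.30, 0.94, 0.97):
    print(score, route(score))
```

Keeping the cutoffs as parameters makes it easy to tune them per use case without touching the calling code.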
Understanding confidence scores and false positives
No detection system achieves 100% accuracy. ScamAI's Eva-v1 achieves 95.3% accuracy, which means approximately 47 incorrect results per 1,000 images analyzed. These errors divide into false positives (real images flagged as AI-generated) and false negatives (AI images that pass detection).
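The arithmetic above can be worked through in a few lines. The source gives only an overall accuracy figure, so the sketch assumes, purely for illustration, that the detector is equally accurate on real and AI-generated images (sensitivity and specificity both 0.953). The function name and the prevalence values are also hypothetical.

```python
def expected_outcomes(n_images, prevalence, sensitivity, specificity):
    """Expected error counts for a detector on a batch with a given share of AI images."""
    fake = n_images * prevalence
    real = n_images - fake
    true_pos = fake * sensitivity    # AI images correctly flagged
    false_neg = fake - true_pos      # AI images that pass detection
    true_neg = real * specificity    # real images correctly passed
    false_pos = real - true_neg      # real images flagged as AI-generated
    flagged = true_pos + false_pos
    return {
        "false_positives": false_pos,
        "false_negatives": false_neg,
        "total_errors": false_pos + false_neg,
        "precision": true_pos / flagged if flagged else 0.0,
    }


# Assumption: equal sensitivity and specificity of 0.953 (not stated in the source)
for prevalence in (0.5, 0.01):
    print(prevalence, expected_outcomes(1000, prevalence, 0.953, 0.953))
```

Under that assumption the total stays at about 47 errors per 1,000 images, but the mix changes with how common AI images are. When only 1% of submissions are AI-generated, almost all of the errors are false positives, and most flagged images turn out to be real.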
The appropriate threshold depends on the cost of each error type in your application. If false positives cause significant friction for legitimate users (for example, blocking a real passport photo in a KYC flow), a higher confidence threshold is appropriate. If missing a deepfake represents a larger risk (for example, approving a fraudulent insurance claim), a lower threshold that flags more images for review is preferable.
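The trade-off can be sketched as a threshold sweep over a labeled sample, choosing the cutoff with the lowest total cost. The function name, the cost weights, and the eight sample scores below are invented for illustration. In practice the sample would be your own images with known ground truth.

```python
def pick_threshold(scored, cost_fp, cost_fn, candidates=None):
    """Return (threshold, cost, false_positives, false_negatives) with the lowest cost.

    scored: list of (confidence, is_ai) pairs from a labeled sample.
    An image counts as flagged when its confidence is above the threshold.
    """
    if candidates is None:
        candidates = [i / 100 for i in range(5, 100, 5)]
    best = None
    for t in candidates:
        fp = sum(1 for score, is_ai in scored if score > t and not is_ai)
        fn = sum(1 for score, is_ai in scored if score <= t and is_ai)
        cost = fp * cost_fp + fn * cost_fn
        if best is None or cost < best[1]:
            best = (t, cost, fp, fn)  # ties keep the lowest threshold
    return best


# Hypothetical labeled sample: (detector confidence, image is actually AI-generated)
sample = [(0.98, True), (0.91, True), (0.74, True), (0.62, False),
          (0.55, True), (0.40, False), (0.22, False), (0.08, False)]

print(pick_threshold(sample, cost_fp=10, cost_fn=1))  # false positives are expensive
print(pick_threshold(sample, cost_fp=1, cost_fn=10))  # missed fakes are expensive
```

With false positives weighted heavily, the sweep settles on a higher cutoff. With missed fakes weighted heavily, it settles on a lower one. Eight samples are far too few for a real calibration, so treat the numbers as a demonstration of the mechanics only.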
Pro Tip
Use ScamAI's free tier of 200 images/month to calibrate confidence thresholds on your actual data before deploying at production scale.