Reconnaissance Active

Structured Intelligence From Any URL

Send us a URL. We return clean, validated schema.org JSON‑LD — not raw HTML, not noisy markdown. Machine-readable facts your AI agent can use immediately. 200+ entity types identified and extracted.

jsonrecon — api response
// POST /extract
// GET /extract?url=acmecorp.com/about

{
  "@context": "https://schema.org",
  "@type": "Corporation",
  "name": "Acme Corporation",
  "founder": { "name": "Wile E. Coyote" },
  "contactPoint": [ { "telephone": "+1-800-555-1234" } ],
  "foundingDate": "1949",
  "confidence": "high",
  "source": "llm_extraction"
}

200+ Schema.org types detected
30+ Academic publishers covered
<3s Average extraction time
$0 LLM cost on specialized domains
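
A minimal sketch of how a client might build the GET request shown in the sample and read the returned fields. Only the `/extract` path, the `url` query parameter, and the response fields come from the sample; the host `api.example.com` and the helper names `build_extract_url` and `summarize` are placeholders for illustration.

```python
import json
from urllib.parse import urlencode

# Hypothetical host: the real service hostname is not given in the sample.
BASE_URL = "https://api.example.com"


def build_extract_url(target: str, base_url: str = BASE_URL) -> str:
    """Build the GET /extract?url=... request URL for a target page."""
    return f"{base_url}/extract?{urlencode({'url': target})}"


# Fields copied from the sample response shown above.
SAMPLE_RESPONSE = """
{
  "@type": "Corporation",
  "name": "Acme Corporation",
  "founder": { "name": "Wile E. Coyote" },
  "contactPoint": [ { "telephone": "+1-800-555-1234" } ],
  "foundingDate": "1949",
  "confidence": "high",
  "source": "llm_extraction"
}
"""


def summarize(response_text: str) -> dict:
    """Pull out the handful of facts an agent is likely to need."""
    doc = json.loads(response_text)
    return {
        "type": doc.get("@type"),
        "name": doc.get("name"),
        "founder": (doc.get("founder") or {}).get("name"),
        "phones": [
            c["telephone"] for c in doc.get("contactPoint", []) if "telephone" in c
        ],
        "confidence": doc.get("confidence"),
    }


if __name__ == "__main__":
    print(build_extract_url("acmecorp.com/about"))
    print(summarize(SAMPLE_RESPONSE))
```

The target URL is percent-encoded by `urlencode`, so paths and query strings inside the target survive the round trip.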

What We Extract

Every URL is a target. We deploy the optimal extraction strategy automatically — API integration, meta tag parsing, browser rendering, or AI analysis.

Entity Identification

Automatically detects the entity type on any page: Restaurant, Product, Event, Person, Article, MedicalCondition, SoftwareApplication, and the rest of the 200+ supported schema.org types.

Validated Schema.org Output

Every response is specification-compliant JSON-LD with proper @context, @type, and validated property names. Drop it directly into your knowledge graph or downstream pipeline.
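
Before loading a response into a pipeline, a consumer may still want a cheap structural check of its own. The sketch below is one way to do that, assuming the simple string forms of `@context` and `@type`; it is not the service's validator and does not check property names against the schema.org vocabulary. The function name `check_jsonld` is illustrative.

```python
# Accepted spellings of the schema.org context (assumption: string form only;
# JSON-LD also allows object and array contexts, which this sketch rejects).
SCHEMA_ORG_CONTEXTS = {
    "https://schema.org",
    "https://schema.org/",
    "http://schema.org",
    "http://schema.org/",
}


def check_jsonld(doc: dict) -> list[str]:
    """Return a list of structural problems; an empty list means none found."""
    problems = []

    context = doc.get("@context")
    if not isinstance(context, str) or context not in SCHEMA_ORG_CONTEXTS:
        problems.append("@context is missing or is not schema.org")

    entity_type = doc.get("@type")
    if not isinstance(entity_type, str) or not entity_type:
        problems.append("@type is missing or empty")

    return problems


if __name__ == "__main__":
    good = {"@context": "https://schema.org", "@type": "Corporation", "name": "Acme Corporation"}
    bad = {"name": "Acme Corporation"}
    print(check_jsonld(good))
    print(check_jsonld(bad))
```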

Multi-Tier Scraping

Three-tier acquisition system: stealth browser rendering for bot-protected sites, fast headless rendering for standard pages, and direct HTTP for lightweight targets.
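
The three tiers can be pictured as an ordered fallback chain. The sketch below assumes the cheapest tier is tried first and the next tier is used only when the previous one returns nothing or raises; the actual ordering and selection logic are not described here, and the fetchers are stand-in callables rather than real browser or HTTP clients.

```python
from typing import Callable, Optional

# A fetcher takes a URL and returns page HTML, or None if it was blocked.
Fetcher = Callable[[str], Optional[str]]


def acquire(url: str, tiers: list[tuple[str, Fetcher]]) -> tuple[str, str]:
    """Try each tier in order and return (tier_name, html) from the first success."""
    for name, fetch in tiers:
        try:
            html = fetch(url)
        except Exception:
            # Treat any failure in a tier as "blocked" and escalate.
            html = None
        if html:
            return name, html
    raise RuntimeError(f"all tiers failed for {url}")


if __name__ == "__main__":
    # Stand-in fetchers for illustration only.
    def direct_http(url: str) -> Optional[str]:
        return None  # pretend the site rejected a plain HTTP client

    def headless(url: str) -> Optional[str]:
        return "<html><body>rendered</body></html>"

    def stealth(url: str) -> Optional[str]:
        return "<html><body>rendered via stealth</body></html>"

    tiers = [("direct_http", direct_http), ("headless", headless), ("stealth", stealth)]
    print(acquire("https://acmecorp.com/about", tiers))
```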

Token-Efficient Intelligence

Instead of feeding your AI agent 50 KB of raw HTML, we deliver a compact JSON object with only the facts that matter. Save tokens, reduce latency, increase accuracy.
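
A rough back-of-the-envelope comparison, assuming about 4 characters per token (a common rule of thumb; real tokenizers vary by model and content). At that ratio, 50 KB of HTML is on the order of 12,800 tokens, while a compact object like the sample response is far smaller.

```python
import json

CHARS_PER_TOKEN = 4  # rule-of-thumb assumption, not a property of any specific tokenizer


def estimate_tokens(n_chars: int) -> int:
    """Estimate token count from character count, rounding up."""
    return -(-n_chars // CHARS_PER_TOKEN)  # ceiling division


RAW_HTML_CHARS = 50 * 1024  # the 50 KB page from the text above

# Same fields as the sample response.
COMPACT_DOC = {
    "@type": "Corporation",
    "name": "Acme Corporation",
    "founder": {"name": "Wile E. Coyote"},
    "contactPoint": [{"telephone": "+1-800-555-1234"}],
    "foundingDate": "1949",
    "confidence": "high",
    "source": "llm_extraction",
}


def compare(raw_chars: int, doc: dict) -> dict:
    """Compare estimated token cost of raw HTML against a compact JSON document."""
    compact = json.dumps(doc, separators=(",", ":"))
    raw_tokens = estimate_tokens(raw_chars)
    json_tokens = estimate_tokens(len(compact))
    return {
        "raw_tokens": raw_tokens,
        "json_tokens": json_tokens,
        "saved": raw_tokens - json_tokens,
    }


if __name__ == "__main__":
    print(compare(RAW_HTML_CHARS, COMPACT_DOC))
```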

Confidence Scoring

Every extraction includes a confidence rating — high, medium, or low — based on extraction source and data quality. Know exactly how much to trust the intel.
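
One way a consumer might act on the rating is to route each document by its `confidence` value. The policy below (use, verify, discard) is an illustrative assumption, not a recommendation from the service; only the three level names and the `confidence` field come from the text and sample above.

```python
def route_by_confidence(doc: dict) -> str:
    """Map the confidence field to a hypothetical downstream action."""
    level = str(doc.get("confidence", "")).lower()
    if level == "high":
        return "use"        # accept the facts as-is
    if level == "medium":
        return "verify"     # cross-check before relying on them
    return "discard"        # low, missing, or unrecognized rating


if __name__ == "__main__":
    print(route_by_confidence({"name": "Acme Corporation", "confidence": "high"}))
    print(route_by_confidence({"name": "Acme Corporation", "confidence": "low"}))
```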

Pre-Flight Assessment

Run a free difficulty check before you pay. See the expected schema type, scraping difficulty, and known blockers for any URL before committing funds.

Optimized Reconnaissance

For high-value domains, we bypass generic extraction entirely. Purpose-built modules deliver native-quality data at zero LLM cost.

Entity Specialization

Heuristic Trees
Local Businesses & Corporations

Bypass generic page types and drill down into rich organizational entities. We reliably extract nested contact details, founding data, geolocation, and hierarchies straight into clean JSON representations.

Corporation • Organization • LocalBusiness • Hotel • + Many More

Google Patents

Meta Tag Parsing
patents.google.com

Extracts patent numbers, inventors, assignees, filing dates, citations, related patents, and direct PDF links from 40+ citation meta tags — no AI required.

CreativeWork • Patent • Citations • PDF Link • + Many More

Scholarly Articles

Citation Parsing
PLOS • PubMed • Nature • arXiv • IEEE • Springer • +24 more

Universal citation meta tag parser covering 30+ academic publishers. Authors with affiliations, DOI, journal/volume/issue hierarchy, PDF links, references, and keywords.

ScholarlyArticle • DOI • Authors • References • + Many More
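
To illustrate what citation meta tag parsing involves, here is a minimal sketch using Python's standard `html.parser`. It reads a few widely used `citation_*` tags (`citation_title`, `citation_author`, `citation_doi`, `citation_journal_title`, `citation_pdf_url`) and maps them to a ScholarlyArticle. The mapping choices are assumptions for illustration, the sample HTML is invented, and this is not the service's parser.

```python
from html.parser import HTMLParser


class CitationMetaParser(HTMLParser):
    """Collect <meta name="citation_*" content="..."> tags, keeping repeats in order."""

    def __init__(self):
        super().__init__()
        self.tags: dict[str, list[str]] = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        name = (attr.get("name") or "").lower()
        content = attr.get("content")
        if name.startswith("citation_") and content:
            self.tags.setdefault(name, []).append(content.strip())


def to_scholarly_article(html: str) -> dict:
    """Map a few citation meta tags to schema.org properties (illustrative mapping)."""
    parser = CitationMetaParser()
    parser.feed(html)
    parser.close()
    tags = parser.tags

    def first(key: str):
        values = tags.get(key)
        return values[0] if values else None

    doc = {"@context": "https://schema.org", "@type": "ScholarlyArticle"}
    if first("citation_title"):
        doc["name"] = first("citation_title")
    if tags.get("citation_author"):
        doc["author"] = [{"@type": "Person", "name": n} for n in tags["citation_author"]]
    if first("citation_doi"):
        doc["identifier"] = {
            "@type": "PropertyValue",
            "propertyID": "DOI",
            "value": first("citation_doi"),
        }
    if first("citation_journal_title"):
        doc["isPartOf"] = {"@type": "Periodical", "name": first("citation_journal_title")}
    if first("citation_pdf_url"):
        doc["encoding"] = {
            "@type": "MediaObject",
            "contentUrl": first("citation_pdf_url"),
            "encodingFormat": "application/pdf",
        }
    return doc


# Invented sample page; the title, authors, and DOI are placeholders.
SAMPLE_HTML = """
<head>
  <meta name="citation_title" content="Example Title">
  <meta name="citation_author" content="Ada Example">
  <meta name="citation_author" content="Bo Sample">
  <meta name="citation_doi" content="10.1234/example">
</head>
"""

if __name__ == "__main__":
    print(to_scholarly_article(SAMPLE_HTML))
```

Because the tags are read directly from the page head, no model call is needed for this kind of page.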

How It Works

Every URL runs through an intelligent pipeline that selects the fastest, most accurate extraction strategy.

Step 01

Target Acquired

URL validated, DNS checked, domain identified. Specialized fast-paths engaged if available.
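
A sketch of what this first step might look like: normalize the target, reject anything that is not an HTTP(S) URL with a host, then look the host up in a fast-path table. The table contents and module names (`google_patents`, `scholarly_article`) are illustrative assumptions, and the DNS check is left out so the example runs offline.

```python
from typing import Optional
from urllib.parse import urlsplit

# Illustrative fast-path table: host -> hypothetical specialized module name.
FAST_PATHS = {
    "patents.google.com": "google_patents",
    "arxiv.org": "scholarly_article",
    "www.nature.com": "scholarly_article",
}


def normalize_target(raw: str) -> str:
    """Add a default scheme if missing and validate scheme and host."""
    raw = raw.strip()
    if "://" not in raw:
        raw = "https://" + raw  # assumption: bare targets default to HTTPS
    parts = urlsplit(raw)
    if parts.scheme not in ("http", "https") or not parts.hostname:
        raise ValueError(f"not a valid http(s) URL: {raw!r}")
    return parts.geturl()


def pick_fast_path(url: str) -> Optional[str]:
    """Return the specialized module for this host, or None for the generic pipeline."""
    host = urlsplit(url).hostname or ""
    return FAST_PATHS.get(host)


if __name__ == "__main__":
    for target in ("acmecorp.com/about", "https://patents.google.com/"):
        url = normalize_target(target)
        print(url, "->", pick_fast_path(url) or "generic pipeline")
```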

Step 02

Data Acquisition

Optimal scraping tier deployed: API call, stealth browser, headless rendering, or direct HTTP, based on target defenses.

Step 03

Intelligence Extraction

Existing JSON-LD analyzed. If insufficient, AI identifies entity types and extracts structured facts.
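
The "check existing JSON-LD first" idea can be sketched with the standard library alone: collect every `<script type="application/ld+json">` block, parse it, and return the first one that looks sufficient. What counts as sufficient is an assumption here (a non-empty `@type` and `name`); `@graph` containers are not unpacked. Returning `None` stands for "fall through to AI extraction".

```python
import json
from html.parser import HTMLParser
from typing import Optional


class JsonLdCollector(HTMLParser):
    """Collect parsed JSON-LD objects embedded in a page."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buf: list[str] = []
        self.blocks: list[dict] = []

    def handle_starttag(self, tag, attrs):
        script_type = (dict(attrs).get("type") or "").lower()
        if tag == "script" and script_type == "application/ld+json":
            self._in_jsonld = True
            self._buf = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buf.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                parsed = json.loads("".join(self._buf))
            except json.JSONDecodeError:
                return  # skip malformed blocks
            items = parsed if isinstance(parsed, list) else [parsed]
            self.blocks.extend(item for item in items if isinstance(item, dict))


REQUIRED_KEYS = ("@type", "name")  # assumption: minimal bar for "sufficient"


def existing_jsonld(html: str) -> Optional[dict]:
    """Return the first sufficient JSON-LD block, or None if extraction is still needed."""
    collector = JsonLdCollector()
    collector.feed(html)
    collector.close()
    for block in collector.blocks:
        if all(block.get(key) for key in REQUIRED_KEYS):
            return block
    return None


if __name__ == "__main__":
    page = (
        '<html><head><script type="application/ld+json">'
        '{"@context": "https://schema.org", "@type": "Corporation", "name": "Acme Corporation"}'
        "</script></head><body></body></html>"
    )
    print(existing_jsonld(page))
```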

Step 04

Intel Delivered

Validated schema.org JSON-LD with confidence scoring. Cached for rapid subsequent retrieval.
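
Caching by URL with an expiry is one plausible way to get rapid repeat retrieval. The sketch below is an in-memory illustration with an assumed one-hour lifetime; the service's real cache keys, storage, and expiry policy are not described here.

```python
import time
from typing import Callable, Optional


class ExtractionCache:
    """Tiny in-memory cache keyed by URL, with a time-to-live per entry."""

    def __init__(self, ttl_seconds: float = 3600.0, clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds      # assumed lifetime; tune per use case
        self._clock = clock          # injectable so expiry can be tested
        self._store: dict[str, tuple[float, dict]] = {}

    def put(self, url: str, doc: dict) -> None:
        self._store[url] = (self._clock(), doc)

    def get(self, url: str) -> Optional[dict]:
        entry = self._store.get(url)
        if entry is None:
            return None
        stored_at, doc = entry
        if self._clock() - stored_at > self._ttl:
            del self._store[url]     # expired: drop and report a miss
            return None
        return doc


if __name__ == "__main__":
    cache = ExtractionCache(ttl_seconds=60)
    cache.put("https://acmecorp.com/about", {"@type": "Corporation", "name": "Acme Corporation"})
    print(cache.get("https://acmecorp.com/about"))
    print(cache.get("https://acmecorp.com/missing"))
```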

Powered by x402

HTTP 402 Payment Protocol
Machine-native micropayments

JSON Recon is accessible via the x402 payment protocol, an open standard for machine-to-machine payments over HTTP. AI agents discover our service at /.well-known/x402 and pay per request using cryptocurrency on Base. No API keys, no subscriptions, no human sign-up required. Our endpoints are also listed in the x402 Bazaar, the protocol's machine-readable service catalog for automated discovery.

Base (Live) • More networks coming soon
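
The pay-per-request flow can be sketched as "request, receive 402 with payment requirements, retry with a payment header". Everything protocol-specific below is an assumption to be confirmed against the x402 specification: the `X-PAYMENT` header name, the `accepts` list in the 402 body, the `network` field, and base64-encoded JSON as the header value. The transport and signer are injected stand-ins, so no wallet or network access is involved.

```python
import base64
import json
from typing import Callable

# (status code, response headers, response body)
Response = tuple[int, dict, str]
Transport = Callable[[str, dict], Response]   # sends a request with extra headers
Signer = Callable[[dict], dict]               # turns a payment requirement into a signed payload

PAYMENT_HEADER = "X-PAYMENT"  # assumption: verify the header name against the x402 spec


def fetch_with_payment(url: str, send: Transport, sign: Signer, network: str = "base") -> Response:
    """Request a resource; if the server answers 402, pay and retry once."""
    status, headers, body = send(url, {})
    if status != 402:
        return status, headers, body

    # Assumed 402 body shape: {"accepts": [{"network": "...", ...}, ...]}
    requirements = json.loads(body).get("accepts", [])
    offer = next((r for r in requirements if r.get("network") == network), None)
    if offer is None:
        raise RuntimeError(f"server does not accept payment on {network!r}")

    payload = sign(offer)
    encoded = base64.b64encode(json.dumps(payload).encode("utf-8")).decode("ascii")
    return send(url, {PAYMENT_HEADER: encoded})


if __name__ == "__main__":
    # Stand-in server: demands payment unless the payment header is present.
    def fake_send(url: str, headers: dict) -> Response:
        if PAYMENT_HEADER in headers:
            return 200, {}, '{"@type": "Corporation", "name": "Acme Corporation"}'
        return 402, {}, json.dumps({"accepts": [{"network": "base"}]})

    # Stand-in signer: a real one would produce a signed payment authorization.
    def fake_sign(offer: dict) -> dict:
        return {"network": offer["network"], "signature": "placeholder"}

    print(fetch_with_payment("https://api.example.com/extract", fake_send, fake_sign))
```

Keeping the transport and signer as parameters makes the control flow testable without touching a chain or a live endpoint.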