Text Extraction

Document OCR

Extract structured text from scanned documents, PDFs, and photographed forms with high fidelity. GeraLens handles multi-column layouts, rotated text, handwriting, and mixed scripts including Arabic, Armenian, and Cyrillic. Output is structured JSON with bounding-box coordinates and confidence scores per field.
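As a rough illustration of how output like this might be consumed, the sketch below parses a response and separates fields that fall under a confidence threshold so they can be reviewed by hand. The JSON shape (`fields`, `name`, `text`, `bbox`, `confidence`), the sample values, and the 0.90 threshold are all assumptions made for this example, not the documented GeraLens schema.

```python
import json
from dataclasses import dataclass

# Hypothetical response: the key names and values are assumptions for
# illustration only, not the documented GeraLens output schema.
SAMPLE_RESPONSE = """
{
  "page": 1,
  "fields": [
    {"name": "invoice_number", "text": "INV-0042",
     "bbox": [120, 80, 310, 112], "confidence": 0.99},
    {"name": "total", "text": "1,250.00",
     "bbox": [480, 910, 600, 942], "confidence": 0.97},
    {"name": "signature_name", "text": "A. Petrosyan",
     "bbox": [100, 1020, 380, 1070], "confidence": 0.84}
  ]
}
"""


@dataclass
class Field:
    name: str
    text: str
    bbox: tuple  # assumed order: (x_min, y_min, x_max, y_max)
    confidence: float


def parse_fields(raw: str) -> list:
    """Turn the raw JSON string into a list of Field records."""
    payload = json.loads(raw)
    return [
        Field(f["name"], f["text"], tuple(f["bbox"]), float(f["confidence"]))
        for f in payload["fields"]
    ]


def split_by_confidence(fields, threshold=0.90):
    """Return (accepted, needs_review) based on the per-field confidence."""
    accepted = [f for f in fields if f.confidence >= threshold]
    needs_review = [f for f in fields if f.confidence < threshold]
    return accepted, needs_review


if __name__ == "__main__":
    accepted, needs_review = split_by_confidence(parse_fields(SAMPLE_RESPONSE))
    for field in needs_review:
        print(f"review {field.name!r}: {field.text!r} ({field.confidence:.2f})")
```

Routing low-confidence fields to manual review is one common way to use per-field scores, particularly for handwriting, where the stated accuracy is lower than for typed text.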

Accuracy: 97%+ on typed text, 89%+ on handwriting
Latency: <800ms per page

Use cases

  • Invoice data extraction for accounting
  • Contract clause identification
  • Form auto-fill from photographed documents
  • Archive digitisation at scale

Gera Systems integration

This capability is built into GeraCompliance within the Gera Systems platform. It is also available via the GeraLens API for third-party integrations.

Start using Document OCR

GeraLens provides this capability via a simple REST API. Join the beta to get access; the beta includes 1,000 free API calls per month.
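A call to a REST API of this kind could look roughly like the sketch below, which uses only the Python standard library. The endpoint URL, the bearer-token header, and sending the raw document bytes as the request body are all placeholders assumed for illustration; the real values come with beta access and are not stated here.

```python
import json
import urllib.request

# Placeholder endpoint: the real GeraLens URL is not given on this page.
API_URL = "https://api.example.com/v1/document-ocr"


def build_ocr_request(document_bytes: bytes, api_key: str,
                      content_type: str = "application/pdf") -> urllib.request.Request:
    """Build (but do not send) a POST request carrying one document.

    The auth scheme and body format are assumptions for this sketch.
    """
    return urllib.request.Request(
        API_URL,
        data=document_bytes,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": content_type,
            "Accept": "application/json",
        },
    )


def send(request: urllib.request.Request, timeout: float = 10.0) -> dict:
    """Send the request and decode the JSON body of the response."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


if __name__ == "__main__":
    request = build_ocr_request(b"%PDF-1.7 ...", api_key="YOUR_API_KEY")
    print(request.get_method(), request.full_url)
    # send(request) would perform the network call; it is left out here
    # because the endpoint above is only a placeholder.
```

Keeping request construction separate from sending makes it easy to count calls against the monthly beta allowance before any network traffic happens.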


Document OCR — questions & answers

How accurate is GeraLens Document OCR?
Document OCR achieves 97%+ accuracy on typed text and 89%+ on handwriting. Accuracy is tracked per class and published rather than averaged into a single headline number.
How fast is GeraLens Document OCR?
Typical inference latency is <800ms per page via the GeraLens REST API.
What can I use GeraLens Document OCR for?
Common use cases include invoice data extraction for accounting, contract clause identification, form auto-fill from photographed documents, and archive digitisation at scale.
Which Gera Systems product does Document OCR integrate with?
Document OCR is wired into GeraCompliance and is also available standalone via the GeraLens API.