Text Extraction
Document OCR
Extract structured text from scanned documents, PDFs, and photographed forms with high fidelity. GeraLens handles multi-column layouts, rotated text, handwriting, and mixed scripts including Arabic, Armenian, and Cyrillic. Output is structured JSON with bounding-box coordinates and confidence scores per field.
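Because each field carries a confidence score, a common pattern is to accept high-confidence fields automatically and route the rest to a human reviewer. The sketch below illustrates that idea only. The response shape (`pages`, `fields`, `name`, `text`, `confidence`, `bbox`) and the 0.9 threshold are assumptions made for the example, not the documented GeraLens schema.

```python
import json

# Hypothetical response shape: every key name here is an assumption made for
# illustration; consult the GeraLens API reference for the real schema.
SAMPLE_RESPONSE = """
{
  "pages": [
    {
      "page": 1,
      "fields": [
        {"name": "invoice_number", "text": "INV-0042", "confidence": 0.98,
         "bbox": {"x": 412, "y": 88, "width": 140, "height": 22}},
        {"name": "signature_note", "text": "pd. in full", "confidence": 0.71,
         "bbox": {"x": 60, "y": 910, "width": 200, "height": 40}}
      ]
    }
  ]
}
"""


def split_by_confidence(response_text, threshold=0.9):
    """Split extracted fields into (accepted, needs_review) at the threshold."""
    data = json.loads(response_text)
    accepted, needs_review = [], []
    for page in data.get("pages", []):
        for field in page.get("fields", []):
            # Carry the page number along so a reviewer can locate the field.
            entry = {**field, "page": page.get("page")}
            if field.get("confidence", 0.0) >= threshold:
                accepted.append(entry)
            else:
                needs_review.append(entry)
    return accepted, needs_review


accepted, needs_review = split_by_confidence(SAMPLE_RESPONSE)
print([f["name"] for f in accepted])      # fields at or above the threshold
print([f["name"] for f in needs_review])  # fields to send for manual review
```

The threshold is a workflow choice: handwriting-heavy documents will likely need a lower cut-off or more manual review than typed invoices.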
Accuracy
97%+ on typed text, 89%+ on handwriting
Latency
<800ms per page
Use cases
- Invoice data extraction for accounting
- Contract clause identification
- Form auto-fill from photographed documents
- Archive digitisation at scale
Gera Systems integration
This capability is wired into GeraCompliance within the Gera Systems platform. It is available via the GeraLens API for third-party integrations.
Start using Document OCR
GeraLens provides this capability via a simple REST API. Join the beta to access it; the beta includes 1,000 free API calls per month.
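As a rough idea of what a call over REST might look like, here is a minimal sketch using only the Python standard library. The endpoint URL, the bearer-token header, and the raw-bytes upload format are all placeholders assumed for the example; the actual GeraLens endpoint, authentication scheme, and payload format are not stated on this page and should be taken from the beta documentation.

```python
import json
import urllib.request

# Hypothetical values: placeholder endpoint and key, not real GeraLens details.
API_URL = "https://api.example.com/v1/document-ocr"
API_KEY = "YOUR_BETA_API_KEY"


def build_ocr_request(document_bytes, content_type="application/pdf"):
    """Build (but do not send) a POST request carrying one document."""
    return urllib.request.Request(
        API_URL,
        data=document_bytes,
        method="POST",
        headers={
            # Assumed auth scheme; the real API may use a different header.
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": content_type,
            "Accept": "application/json",
        },
    )


def extract_text(document_bytes, timeout=10.0):
    """Send the document and return the decoded JSON response."""
    request = build_ocr_request(document_bytes)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Keeping request construction separate from sending makes it easy to inspect headers or swap in a different HTTP client before any beta quota is spent.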
Document OCR: questions & answers
- How accurate is GeraLens Document OCR?
  Document OCR reaches 97%+ accuracy on typed text and 89%+ on handwriting. Accuracy is tracked per class and published rather than averaged into a single headline number.
- How fast is GeraLens Document OCR?
  Typical inference latency is <800ms per page via the GeraLens REST API.
- What can I use GeraLens Document OCR for?
  Common use cases include invoice data extraction for accounting, contract clause identification, form auto-fill from photographed documents, and archive digitisation at scale.
- Which Gera Systems product does Document OCR integrate with?
  Document OCR is wired into GeraCompliance and is also available standalone via the GeraLens API.