How GeraLens works
Computer vision for the real world — recognise, segment, decide.
Quick answers
- What does GeraLens do?
- Computer vision for real-world apps — object detection, segmentation, OCR, scene understanding, and visual reasoning. Available as REST or streaming WebSocket.
- How accurate is GeraLens?
- On standard public benchmarks we match or exceed the leading APIs. In real-world conditions (poor lighting, occlusion, multilingual OCR) we deliberately tune for robustness over benchmark gloss.
- What languages does the OCR support?
- 30+ scripts, including Latin, Cyrillic, Arabic, Hebrew, Devanagari, Chinese, Japanese, Korean, Armenian, and Georgian. We add languages on request when there is real demand.
- Is there an on-device option?
- Yes — selected detection and OCR models ship as on-device bundles for offline and privacy-sensitive workloads.
The journey, step by step
1. Pick a capability
Object detection, segmentation, OCR, scene understanding, or visual reasoning. Each capability declares its latency, accuracy, and pricing.
2. Send your image or video
Use REST or WebSocket, with files or live streams. Responses are typed, so you can render them or pipe them to another step.
3. Reason and act
Combine outputs with an LLM (via Nexus) for visual reasoning, or trigger an action (alert, log, decision) directly.
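To make steps 2 and 3 concrete, here is a minimal Python sketch of sending an image over REST, reading a typed response, and turning it into an action. It is illustrative only: the endpoint URL, the JSON field names (`detections`, `label`, `confidence`, `box`), and the helpers `build_request`, `parse_detections`, and `decide` are all assumptions made for this example, not the documented GeraLens API. The request is built but not sent, and the response is a hand-written sample.

```python
import json
from dataclasses import dataclass
from urllib.request import Request

# Hypothetical endpoint; the real GeraLens URL and auth scheme may differ.
API_URL = "https://api.example.com/v1/detect"


@dataclass(frozen=True)
class Detection:
    """One detected object, using an assumed response layout."""
    label: str
    confidence: float
    box: tuple  # assumed order: x, y, width, height


def build_request(image_bytes: bytes, api_key: str) -> Request:
    """Step 2: prepare a REST call carrying the raw image (not sent here)."""
    return Request(
        API_URL,
        data=image_bytes,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/octet-stream",
        },
    )


def parse_detections(body: str) -> list:
    """Turn the assumed JSON response into typed Detection records."""
    payload = json.loads(body)
    return [
        Detection(d["label"], float(d["confidence"]), tuple(d["box"]))
        for d in payload["detections"]
    ]


def decide(detections: list, watch: set, threshold: float = 0.8) -> str:
    """Step 3: alert on a confident watched label, otherwise just log."""
    for d in detections:
        if d.label in watch and d.confidence >= threshold:
            return "alert"
    return "log"


# Hand-written sample response, standing in for a real API reply.
sample = json.dumps({
    "detections": [
        {"label": "person", "confidence": 0.93, "box": [10, 20, 50, 120]},
        {"label": "dog", "confidence": 0.41, "box": [200, 80, 60, 40]},
    ]
})

detections = parse_detections(sample)
action = decide(detections, watch={"person"})
print(action)
```

Keeping the decision rule in a small pure function like `decide` makes the threshold and the watched labels easy to test without calling the service. The same typed records could instead be passed to an LLM step for visual reasoning.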
Ready to start?
GeraLens is the computer-vision platform — object detection, segmentation, OCR, scene understanding, and visual reasoning — production-tuned for real-world conditions and integrated with the Gera agen