How GeraLens works

Computer vision for the real world — recognise, segment, decide.

Quick answers

What does GeraLens do?
Computer vision for real-world apps — object detection, segmentation, OCR, scene understanding, and visual reasoning. Available as REST or streaming WebSocket.
How accurate is GeraLens?
On standard public benchmarks we match or exceed the leading APIs. Under real-world conditions (poor lighting, occlusion, multilingual OCR), we deliberately tune for robustness rather than benchmark gloss.
What languages does the OCR support?
30+ scripts, including Latin, Cyrillic, Arabic, Hebrew, Devanagari, Chinese, Japanese, Korean, Armenian, and Georgian. We add languages on request when there is real demand.
Is there an on-device option?
Yes — selected detection and OCR models ship as on-device bundles for offline and privacy-sensitive workloads.

The journey, step by step

  1. Pick a capability

    Object detection, segmentation, OCR, scene understanding, or visual reasoning. Each declares latency, accuracy, and pricing.

  2. Send your image or video

    REST or WebSocket. Files or live streams. Typed responses you can render or pipe to another step.

  3. Reason and act

    Combine outputs with an LLM (via Nexus) for visual reasoning, or trigger an action (alert, log, decision) directly.
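
To make the three steps concrete, here is a minimal Python sketch of the flow: choose a capability, package an image into a request body, parse a typed response, and decide on an action. It is an illustration only and not the GeraLens API. The capability identifiers, the JSON field names (`capability`, `image`, `detections`, `label`, `confidence`, `box`), the sample response, and the helper functions are all assumptions made up for this example. No network call is made, so the request and response are plain JSON strings.

```python
import base64
import json
from dataclasses import dataclass
from typing import List, Tuple

# Hypothetical identifiers for the capabilities listed in step 1.
# The real names (if any) may differ.
CAPABILITIES = {
    "object_detection",
    "segmentation",
    "ocr",
    "scene_understanding",
    "visual_reasoning",
}


@dataclass
class Detection:
    """One typed result item. The field layout is an assumption."""
    label: str
    confidence: float
    box: Tuple[float, ...]  # assumed order: x, y, width, height


def build_request(capability: str, image_bytes: bytes) -> str:
    """Steps 1 and 2: pick a capability and package the image as a JSON body."""
    if capability not in CAPABILITIES:
        raise ValueError(f"unknown capability: {capability}")
    return json.dumps({
        "capability": capability,
        # Base64 keeps binary image data safe inside a JSON string.
        "image": base64.b64encode(image_bytes).decode("ascii"),
    })


def parse_detections(response_body: str) -> List[Detection]:
    """Turn an (assumed) JSON response into typed Detection objects."""
    data = json.loads(response_body)
    return [
        Detection(d["label"], float(d["confidence"]), tuple(d["box"]))
        for d in data["detections"]
    ]


def decide(detections: List[Detection], watch_label: str,
           threshold: float = 0.8) -> str:
    """Step 3: return "alert" if the watched label is seen with enough
    confidence, otherwise "log"."""
    for det in detections:
        if det.label == watch_label and det.confidence >= threshold:
            return "alert"
    return "log"


# A made-up response body, standing in for what a service might return.
sample_response = json.dumps({
    "detections": [
        {"label": "person", "confidence": 0.93, "box": [12, 30, 80, 200]},
        {"label": "dog", "confidence": 0.41, "box": [150, 90, 60, 45]},
    ]
})

request_body = build_request("object_detection", b"\x89PNG placeholder bytes")
detections = parse_detections(sample_response)
action = decide(detections, watch_label="person")
```

In a real integration, `request_body` would be sent over REST or a WebSocket, and the parsed detections could instead be passed to an LLM for visual reasoning. The threshold-based `decide` function is just one simple way to turn typed outputs into an alert-or-log decision.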

Ready to start?

GeraLens is the computer-vision platform — object detection, segmentation, OCR, scene understanding, and visual reasoning — production-tuned for real-world conditions and integrated with the Gera agen
