LingBot-Map

Hugging Face

Feed-forward 3D foundation model for real-time scene reconstruction from streaming video or image sequences. Uses a Geometric Context Transformer with anchor context, a pose-reference window, and trajectory memory for drift correction. Runs at ~20 FPS on 518x378 inputs and remains stable over sequences of 10k+ frames via paged KV-cache attention and keyframe strategies.
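
The bounded-context idea behind the description above can be illustrated with a small sketch. This is not LingBot-Map's code or API: the class name `StreamingContext`, its parameters, and the stride-based keyframe rule are all hypothetical, chosen only to show how a few fixed anchor frames, a sliding window of recent frames, and a capped keyframe memory keep the attended context at a constant size however long the stream runs.

```python
from collections import deque


class StreamingContext:
    """Hypothetical bounded frame context for a streaming model (illustration only)."""

    def __init__(self, num_anchor=2, window_size=8, keyframe_every=50, max_keyframes=64):
        self.num_anchor = num_anchor            # first frames, kept for the whole stream
        self.keyframe_every = keyframe_every    # assumed rule: keep every N-th frame long-term
        self.anchors = []                       # fixed reference frames
        self.window = deque(maxlen=window_size)       # most recent frames
        self.keyframes = deque(maxlen=max_keyframes)  # capped long-term memory

    def push(self, frame_idx):
        """Register a new frame index and update the three context buckets."""
        if len(self.anchors) < self.num_anchor:
            self.anchors.append(frame_idx)
            return
        # When the window is full, the oldest frame is about to be evicted;
        # promote it to the keyframe memory if it falls on the stride.
        if len(self.window) == self.window.maxlen:
            evicted = self.window[0]
            if evicted % self.keyframe_every == 0:
                self.keyframes.append(evicted)
        self.window.append(frame_idx)

    def context(self):
        """Frame indices the model would attend to for the next frame."""
        return self.anchors + list(self.keyframes) + list(self.window)


ctx = StreamingContext()
for i in range(10_000):
    ctx.push(i)

# The context is capped at num_anchor + max_keyframes + window_size frames.
print(len(ctx.context()))
```

With these assumed defaults the context never exceeds 2 + 64 + 8 = 74 frames, which is the property that lets memory stay flat over long sequences. A real system would store per-frame key/value tensors in each bucket and would likely select keyframes by pose or content change rather than a fixed stride.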

Tags: 3D, Local, Latest, LingBot Family
Type
3D
Source
Local
License
Apache 2.0

Local Model Specs

Architecture
Geometric Context Transformer
Runtime
PyTorch
VRAM Usage
6 GB
Disk Size
4.63 GB

Details

Release Date
-
Knowledge Cutoff
-
Model ID
lingbot-map
Last updated: April 19, 2026