The visual localization system uses a priori information about the location and orientation of known visual landmarks. It does not build maps of the environment automatically; instead, it relies on maps supplied by upper-layer systems or hand-crafted by a user. Maps are normally pre-loaded into the system's map database; a user process can also create a new map at run time and push it down to the localization system for immediate use. All maps are stored in the map database, a software component packaged with the Skilligent Visual Localization System.
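To make the idea of a landmark map concrete, the following is a minimal sketch of how a user process might represent a map and a map store. All names here (`Landmark`, `LandmarkMap`, `MapDatabase`, and their fields) are hypothetical and chosen for illustration only; they are not the Skilligent API, and the actual map format may differ.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Landmark:
    """A known visual landmark (hypothetical structure, for illustration)."""
    landmark_id: str
    position: tuple      # (x, y, z) in the map's global reference frame
    orientation: tuple   # (roll, pitch, yaw) in radians, assumed convention


@dataclass
class LandmarkMap:
    """A named collection of landmarks defining one global reference frame."""
    name: str
    landmarks: dict = field(default_factory=dict)

    def add(self, landmark):
        self.landmarks[landmark.landmark_id] = landmark


class MapDatabase:
    """In-memory stand-in for a map database (assumption, not the real one)."""

    def __init__(self):
        self._maps = {}

    def store(self, landmark_map):
        # Storing under an existing name replaces the older map, which is
        # one way to model a map being pushed down at run time.
        self._maps[landmark_map.name] = landmark_map

    def load(self, name):
        return self._maps[name]
```

A hand-crafted map would be built by calling `add` once per landmark and then passing the map to `store`; a map created at run time would follow the same path.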
The visual localization system reports the estimated position and attitude of the video camera in the global reference frame defined by a map. For navigation purposes, a user process can translate the camera pose into the three-dimensional pose of the robot's base. The translation is straightforward and involves a product of two matrices. The method also works if the robot is equipped with a pan-tilt camera; in this case, precise information about the orientation of the camera relative to the robot's base is required.
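The camera-to-base translation can be sketched with 4x4 homogeneous transforms. The sketch below assumes the reported camera pose is available as a matrix `t_map_camera` (camera in the map frame) and that the camera mounting is known as `t_base_camera` (camera in the robot-base frame); the base pose is then the product of the first matrix and the inverse of the second. The function names and the matrix representation are assumptions made for illustration, not part of the localization system's interface.

```python
import math


def matmul(a, b):
    """Multiply two 4x4 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def invert_rigid(t):
    """Invert a rigid transform [R p; 0 1] as [R^T, -R^T p; 0 1]."""
    r_t = [[t[j][i] for j in range(3)] for i in range(3)]
    p = [t[i][3] for i in range(3)]
    new_p = [-sum(r_t[i][k] * p[k] for k in range(3)) for i in range(3)]
    return [r_t[i] + [new_p[i]] for i in range(3)] + [[0.0, 0.0, 0.0, 1.0]]


def make_transform_yaw(yaw, x, y, z):
    """Helper: rotation about the vertical axis followed by a translation."""
    c, s = math.cos(yaw), math.sin(yaw)
    return [[c, -s, 0.0, x],
            [s, c, 0.0, y],
            [0.0, 0.0, 1.0, z],
            [0.0, 0.0, 0.0, 1.0]]


def base_pose_from_camera_pose(t_map_camera, t_base_camera):
    """Pose of the robot base in the map frame.

    t_map_camera:  camera pose in the map frame (reported by localization).
    t_base_camera: camera pose in the base frame (mounting calibration).
    """
    return matmul(t_map_camera, invert_rigid(t_base_camera))
```

With a fixed camera, `t_base_camera` is a constant obtained from calibration. With a pan-tilt camera it would have to be recomputed from the current pan and tilt angles before each call, which is why the camera's orientation relative to the base must be known precisely.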
Optionally, the camera may be mounted on a pan-tilt servo mechanism, which lets the robot "look around" when trying to localize itself. To do so, the robot has to trigger a dedicated behavior that rotates the camera when a location fix is urgently required. The visual localization system does not provide this behavior; instead, an upper layer of the control hierarchy is expected to initiate it when no location update has been received for a certain period of time.
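One way an upper control layer might decide when to start the look-around behavior is a simple timeout watchdog. The sketch below is an assumption about how such a layer could be structured; the class name `LocalizationWatchdog`, its methods, and the timeout value are hypothetical and not part of the localization system.

```python
import time


class LocalizationWatchdog:
    """Tracks how long it has been since the last location update."""

    def __init__(self, timeout_s, clock=time.monotonic):
        self.timeout_s = timeout_s
        self._clock = clock          # injectable clock, eases testing
        self._last_fix = clock()

    def on_location_update(self):
        """Call whenever the localization system reports a new pose."""
        self._last_fix = self._clock()

    def should_look_around(self):
        """True once no update has arrived for longer than the timeout."""
        return self._clock() - self._last_fix > self.timeout_s
```

The control loop would call `on_location_update` from its pose callback and poll `should_look_around` periodically, starting the camera sweep when it returns true. A monotonic clock is used so that wall-clock adjustments do not cause spurious triggers.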
The visual localization system uses the underlying Skilligent Robot Vision System to recognize and track visual landmarks. A mobile robot or unmanned vehicle must be equipped with at least one video camera to capture images of the environment. The camera should be mounted on the robot's base in a way that maximizes the chances of detecting visual landmarks. Forward-looking, upward-looking, downward-looking and backward-looking camera arrangements are possible. The best arrangement for a particular environment depends on which objects can serve as landmarks. For example, a downward-looking camera makes the most sense for terrain-matching localization of UAVs.
An upward-looking arrangement features a camera pointed vertically at the ceiling. This arrangement has shown good results, allowing high-precision indoor robot localization (a typical localization error is on the scale of a few inches). It requires artificial landmarks to be attached to the ceiling, unless the ceiling already has distinctive visual landmarks of its own due to the specifics of the room or building.