Introduction

Welcome to the VisionLib Engine Documentation. VisionLib is an augmented reality tracking library created by Visometry. It enables you to create augmented reality applications at industrial scale using computer vision tracking technologies.

VisionLib's Enhanced Model Tracking

With its so-called Enhanced Model Tracking, VisionLib is among the most widely recognized AR tracking libraries for industrial and enterprise use. Computer vision and model tracking are key to any AR application in which real physical objects, so-called tracking targets, are augmented and extended with digital information. Be it for repair and maintenance, AR-based training, or marketing and sales purposes - none of these would work without precise and reliable object detection and tracking.

We've mastered this computer vision technique because we consider it the only approach stable enough to track 3D objects. It helps overcome typical "AR killers", such as unstable lighting conditions or dynamic, changing elements in the real world.

The VisionLib Engine

In technical terms, VisionLib is a multi-platform AR tracking library that incorporates a whole toolbox of algorithms needed for AR tracking. With these, it determines the position and orientation of a single camera or multiple cameras with respect to known objects in the real world. That is what we refer to as tracking: the device's camera targets an expected object, and the camera's pose is calculated continuously in order to track that target, thus enabling augmented reality applications.

Whatever tracking technique you use, VisionLib exposes this information for you to use in your development environment, enabling you to place your content precisely aligned to real objects with high accuracy.
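
To make this concrete, here is a minimal sketch in plain Python with NumPy. It is not VisionLib's actual API; it only illustrates what a tracked pose is: a rigid transform [R|t] that maps coordinates of the known object into camera coordinates, from which virtual content can be projected into the camera image.

```python
import numpy as np

# Conceptual sketch (not the VisionLib API): the result of model tracking is
# the camera pose relative to a known object, i.e. a rigid transform
# consisting of a rotation R and a translation t.

def make_pose(R: np.ndarray, t: np.ndarray) -> np.ndarray:
    """Build a 4x4 model-to-camera transform from R (3x3) and t (3,)."""
    pose = np.eye(4)
    pose[:3, :3] = R
    pose[:3, 3] = t
    return pose

def project(points_model: np.ndarray, pose: np.ndarray, K: np.ndarray) -> np.ndarray:
    """Project 3D points given in object (model) coordinates into the image.

    points_model: (N, 3) points on the tracking target, e.g. CAD vertices.
    pose:         4x4 model-to-camera transform estimated by the tracker.
    K:            3x3 pinhole camera intrinsics.
    """
    n = points_model.shape[0]
    homogeneous = np.hstack([points_model, np.ones((n, 1))])  # (N, 4)
    cam = (pose @ homogeneous.T)[:3]                          # (3, N) camera coords
    pix = K @ cam                                             # pinhole projection
    return (pix[:2] / pix[2]).T                               # (N, 2) pixel coords

# Example: identity rotation, object 2 m in front of the camera.
pose = make_pose(np.eye(3), np.array([0.0, 0.0, 2.0]))
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])
print(project(np.array([[0.0, 0.0, 0.0], [0.1, 0.0, 0.0]]), pose, K))
```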

Getting Started

If you are already familiar with augmented reality and computer vision tracking for AR, read on to make yourself familiar with the VisionLib Engine, or get into development right away.

Background on Augmented Reality and Tracking

Augmented reality (AR) describes digital experiences where virtual content, such as 2D or 3D graphics, is blended with and aligned to the real world. In that sense, AR extends, enhances, or augments a user's view of the real world. When an app superimposes that content on a live camera image, the user experiences augmented reality: the illusion that virtual elements exist as part of the real world.

Working with the camera viewport of mobile devices in this way is called the video-see-through effect. On HoloLens and other mixed reality glasses, which let you see reality through a transparent display instead of through a video stream, this blending is instead called the optical-see-through effect.

Computer Vision Tracking for Augmented Reality

Regardless of which device you use, the core requirement for any AR experience is the ability to create and track correspondences between the user's real world and a virtual space. VisionLib leverages computer vision techniques to enable this matching and tracking of correspondences.

With a multitude of tracking techniques available, ranging from marker-based and feature-based to edge-based approaches, computer vision is an evolving research field in which a few popular techniques have found success and are considered state of the art.

For you as a developer, creating AR experiences involves several layers. First, the tracking layer: VisionLib manages the real-world acquisition through computer vision. The second layer is responsible for rendering visual elements, i.e. your virtual content. With VisionLib's API, you can choose from several development environments, of which Unity3D is the most popular and the easiest to get started with. The first two layers are synchronized through VisionLib's API, which passes the math from tracking to rendering. The third layer is your specific application logic.
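
The following pseudocode-style sketch illustrates this layering; `tracker`, `renderer`, and `app_logic` are hypothetical placeholder objects, not VisionLib classes:

```python
# Hypothetical per-frame loop illustrating the layer split described above;
# the object names are placeholders, not VisionLib API types.

def run_frame(tracker, renderer, app_logic, frame):
    result = tracker.process(frame)          # layer 1: computer vision tracking
    if result.is_tracking:
        # Layer 1 -> layer 2 hand-off: the estimated camera pose (the "math")
        # drives the virtual camera so content stays registered to the target.
        renderer.set_camera_pose(result.model_to_camera)
        renderer.draw(frame)                 # layer 2: render content over the image
    app_logic.update(result)                 # layer 3: your application logic
```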

About Tracking – Of ›World Understanding‹ & ›Object Tracking‹

Odometry: "Bubble AR"

Earlier forms of augmented reality on mobile devices relied solely on inertial sensors (like compass, gyroscope, and accelerometer) to place and align information in the real world. Such content augments our view, but it is not aligned to any specific object or location in the real world; it is just "pinned" there, spatially and imprecisely. Because inertial sensors alone tend to drift, they miss important cues, which prevents the AR view from presenting unambiguous spatial assignments.
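
A toy calculation illustrates why purely inertial placement drifts; the sensor noise level here is an arbitrary assumption, not a measured value:

```python
import numpy as np

# Toy illustration of inertial drift: integrating a noisy accelerometer twice
# turns small, zero-mean sensor noise into position error that grows over time.
rng = np.random.default_rng(0)
dt, steps = 0.01, 3000                       # 100 Hz for 30 seconds
accel_noise = rng.normal(0.0, 0.05, steps)   # m/s^2, device actually at rest
velocity = np.cumsum(accel_noise) * dt       # first integration
position = np.cumsum(velocity) * dt          # second integration
print(f"position error after 30 s: {abs(position[-1]):.2f} m")
```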

Computer Vision: Marker, Poster/Image Tracking

Tracking images, posters, or markers by means of computer vision always rests on the same foundation: a known image or pattern is recognized and then tracked within the captured video stream. Based on what is called feature tracking, such 2D materials are good targets because they result in fixed feature maps. Using these trackers was, and still is, popular for many AR cases. For example, if you have a printed product catalog, you can use images on particular pages to superimpose a depicted product in a 3D AR view.
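
The sketch below shows this principle with OpenCV (illustrating the general technique, not VisionLib internals): ORB features are extracted once from a known reference image and matched against a camera frame, and a homography maps the flat marker into the frame. The image file names are placeholders:

```python
import cv2
import numpy as np

# Sketch of the feature-tracking principle behind image/poster markers:
# detect features in a known reference image once, then match them against
# each camera frame and estimate a homography.
reference = cv2.imread("marker.png", cv2.IMREAD_GRAYSCALE)
frame = cv2.imread("camera_frame.png", cv2.IMREAD_GRAYSCALE)

orb = cv2.ORB_create(nfeatures=1000)
ref_kp, ref_desc = orb.detectAndCompute(reference, None)   # fixed feature map
frm_kp, frm_desc = orb.detectAndCompute(frame, None)

matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = sorted(matcher.match(ref_desc, frm_desc), key=lambda m: m.distance)

# A homography maps the flat (2D) marker into the frame; this is exactly why
# such trackers work well for posters but cannot track arbitrary 3D objects.
src = np.float32([ref_kp[m.queryIdx].pt for m in matches[:50]]).reshape(-1, 1, 2)
dst = np.float32([frm_kp[m.trainIdx].pt for m in matches[:50]]).reshape(-1, 1, 2)
H, mask = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)
```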

Image tracking usually enables precise augmentation results, but printed images are only 2D. You could "tag" your environment with images upfront and align actual physical 3D objects to these "spatial markers" in order to mimic true 3D object tracking. However, when things change - e.g. your aligned objects move, or markers are removed or repositioned - the superimposition won't match reality anymore and the experience will break. In all cases, image markers need preparation in advance.

Computer Vision: AR with SLAM

SLAM (Simultaneous Localization and Mapping) has become a capable AR enabler. This technique lets you spontaneously reconstruct maps of your current environment by means of computer vision. It makes it possible to blend content into reality quite stably, and it works well for placing holograms with some basic environmental understanding.

But SLAM is not capable of precise model detection, and since it only reconstructs maps of objects or spaces, it has problems with changing environments or lighting conditions and is not very stable over time. As a consequence, it is hard to almost impossible for a developer to pin information to particular spots in reality upfront, and if you work with stored or anchored SLAM information, it can eventually break.

Computer Vision: Unambiguity and Clarity with Model Tracking

Whenever you want to create AR apps in which information must be augmented precisely and unambiguously at a specific point or object in reality, model tracking is the state of the art to work with. Model tracking localizes and tracks objects by means of 3D and CAD data and is a key enabler for all AR applications that need to pin information and virtual content exactly to a certain point or location.

And because VisionLib's model tracking overcomes typical AR and computer vision obstacles, such as varying light or moving elements, there is no need for further preparation, i.e. neither tagging objects or environments with markers nor acquiring SLAM maps upfront.
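
As a rough intuition for how edge-based model tracking can be robust to lighting, consider the following conceptual sketch (OpenCV-based and hypothetical, not VisionLib's actual algorithm): the model's contour, projected under a pose hypothesis, is scored against the distance to detected image edges, and a tracker minimizes this cost over the pose parameters:

```python
import cv2
import numpy as np

# Conceptual sketch of the idea behind edge-based model tracking (not
# VisionLib's actual algorithm): hypothesize a pose, project the model's
# contour into the image, and score it against detected image edges.

def edge_cost(frame_gray: np.ndarray, projected_contour: np.ndarray) -> float:
    """Mean distance from projected model contour points to image edges.

    frame_gray:        camera image, single channel.
    projected_contour: (N, 2) pixel coordinates of model contour points,
                       obtained by projecting the CAD model under a pose
                       hypothesis (see the projection sketch earlier).
    """
    edges = cv2.Canny(frame_gray, 100, 200)
    # Distance transform: each pixel holds the distance to the nearest edge
    # (edge pixels are set to zero by the inversion below).
    dist = cv2.distanceTransform(255 - edges, cv2.DIST_L2, 3)
    xs = np.clip(projected_contour[:, 0].astype(int), 0, dist.shape[1] - 1)
    ys = np.clip(projected_contour[:, 1].astype(int), 0, dist.shape[0] - 1)
    return float(dist[ys, xs].mean())

# A tracker would minimize this cost over the 6 pose parameters each frame,
# which is why it needs no markers and tolerates lighting changes reasonably
# well: object contours remain stable where textures and features do not.
```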

Model Tracking as Game Changer

As a developer, this is a game changer for AR cases in which you have to rely on stable tracking and detection. And by using CAD and 3D data as the reference, you can position your AR content in relation to the digital twin.

As a user, this is a game changer, too, because you get reliable and valuable AR apps that support you in many different areas. Consider, for example, AR-enhanced manuals that guide you visually through a procedure step by step, attaching torque values or other specific information to the very screw they belong to. This puts AR views on a whole new scale: industrial scale.



Ready for VisionLib? Next, take a look at the › Technical Overview