VisionLib focuses on tracking objects. It uses several tracking techniques, centered around what we call Enhanced Model Tracking.
Originally, Model Tracking simply referred to one particular tracking technique. In recent years, it has become a synonym for object tracking and one of the most prominent techniques, especially in Industrial Augmented Reality.
Model Tracking combines 3D graphics with computer vision and lets you use 3D and CAD data for detecting, localizing and tracking physical objects in the video stream.
How this works and how you can use it in your projects is the subject of this section.
The majority of AR tracking techniques use a collection of computer vision and image processing methods to enable marker-, feature-, or geometry-based (object) tracking.
In essence, Model Tracking uses geometry (i.e. shape-related) information for detection, localization and tracking of physical objects in the video stream. It takes the geometry from 3D data (e.g. CAD), derives image-space edges from that geometry and matches these to edges found in the video stream.
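The idea of deriving image-space edges from 3D geometry and comparing them with edges in the video frame can be sketched in a few lines. This is a minimal illustration, not VisionLib's implementation: it assumes a simple pinhole camera, model vertices already given in camera coordinates, and a hypothetical list of detected edge pixels. All function and parameter names are made up for this example.

```python
import math

def project_point(point, focal_length, cx, cy):
    """Project a 3D point in camera coordinates (z > 0) to pixel coordinates."""
    x, y, z = point
    return (focal_length * x / z + cx, focal_length * y / z + cy)

def project_edges(vertices, edges, focal_length, cx, cy):
    """Turn 3D model edges (pairs of vertex indices) into 2D image-space segments."""
    pts = [project_point(v, focal_length, cx, cy) for v in vertices]
    return [(pts[a], pts[b]) for a, b in edges]

def point_segment_distance(p, segment):
    """Shortest distance in pixels from point p to a 2D line segment."""
    (ax, ay), (bx, by) = segment
    px, py = p
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0.0:
        return math.hypot(px - ax, py - ay)
    # Clamp the projection parameter so the closest point stays on the segment.
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / length_sq))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def mean_edge_error(image_edge_points, segments):
    """Average distance from detected edge pixels to the nearest projected model edge."""
    total = sum(min(point_segment_distance(p, s) for s in segments)
                for p in image_edge_points)
    return total / len(image_edge_points)

# Hypothetical model: a 1 m square, 2 m in front of the camera.
square = [(-0.5, -0.5, 2.0), (0.5, -0.5, 2.0), (0.5, 0.5, 2.0), (-0.5, 0.5, 2.0)]
square_edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
segments = project_edges(square, square_edges, focal_length=800.0, cx=320.0, cy=240.0)

# A detected edge pixel 5 px below the projected top edge.
print(mean_edge_error([(320.0, 45.0)], segments))  # prints 5.0
```

A low error means the assumed pose explains the edges seen in the image. A tracker would adjust the pose to reduce such an error from frame to frame.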
As such, the 3D model becomes the tracking reference for the physical object. Once tracking is established, i.e. the object has been localized, VisionLib calculates and delivers all core information needed to match the coordinate systems of tracking and 3D graphics.
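To illustrate what matching the coordinate systems means in practice, the sketch below applies a tracked pose to a point defined in model coordinates. It assumes the pose is given as a 3x3 rotation matrix plus a translation that map model coordinates to camera coordinates. That representation, and all names used here, are assumptions for this example and not VisionLib's actual output format.

```python
import math

def rotation_about_y(angle_rad):
    """3x3 rotation matrix for a rotation about the y axis."""
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return [[c, 0.0, s],
            [0.0, 1.0, 0.0],
            [-s, 0.0, c]]

def model_to_camera(point, rotation, translation):
    """Map a point from model coordinates to camera coordinates: R * p + t."""
    return tuple(
        sum(rotation[row][k] * point[k] for k in range(3)) + translation[row]
        for row in range(3)
    )

# Hypothetical tracked pose: object turned 90 degrees about y, 2 m in front of the camera.
pose_rotation = rotation_about_y(math.pi / 2)
pose_translation = (0.0, 0.0, 2.0)

# An augmentation anchor authored in the CAD model's coordinate system (10 cm along x).
label_anchor = (0.1, 0.0, 0.0)
anchor_in_camera = model_to_camera(label_anchor, pose_rotation, pose_translation)
print(anchor_in_camera)  # approximately (0.0, 0.0, 1.9)
```

Because the augmentation is authored relative to the same 3D model that serves as the tracking reference, a single pose is enough to place every piece of content on the physical object.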
This is crucial for Mixed and Augmented Reality applications: Model Tracking enables seamless and believable blending of 2D and 3D objects into a video stream, creating the illusion that they exist in reality.
Because Model Tracking relies on an object's geometry, it has advantages over SLAM, the state-of-the-art tracking technique and the foundation behind AR APIs and platforms like ARKit, ARCore, HoloLens, etc.
Scenes with movement or changing light are edge cases for SLAM-based tracking, because SLAM uses only image-derived features as its tracking reference. Unfortunately, these features depend on the pixels of the surrounding environment and tend not to be very stable over time. Movement or changes in lighting often cause them to disappear in subsequent frames, and SLAM then struggles.
Additionally, creating SLAM data as a reference for object tracking is a manual and rather inconvenient process. In particular, registering the SLAM coordinate system with that of the 2D/3D augmentation graphics takes time and is not very accurate. And because pre-acquired SLAM data can break and become invalid rather quickly, this process is not very reliable for industrial applications.
While Model Tracking outperforms techniques such as SLAM in accuracy, persistence and scalability, it needs at least the 3D model data as a setup step before tracking can work.
SLAM, however, is a great technique for blending 3D graphics and reality rather spontaneously, with no prior knowledge of the scene to be tracked (the so-called 'SLAM map' can be created on the fly). For example, by quickly moving the camera around, you can spontaneously, though only arbitrarily, place superimpositions in the space around you.
Combining Model Tracking and SLAM lets you fuse the advantages of each: Model Tracking detects and tracks the desired object with high accuracy, and SLAM acquires spontaneous, short-term information about the environment.
The latter lets you cope, for instance, with situations where the user abruptly turns the camera or device in a different or even opposite direction. Because the object is then no longer visible to the camera, tracking would stop with Model Tracking alone. With the SLAM environment information, tracking and augmentation continue when the user looks back at the object. Information about how to enable this and take full advantage of the combination can be found in the [description of the Optional Tracking Parameters](vlUnitySDK_Article_UnderstandingTrackingParams).
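The reasoning behind this combination can be sketched as a chain of rigid transforms. While the object is visible, its pose from Model Tracking is anchored in the SLAM world. Once it is out of view, its pose relative to the camera is predicted from the SLAM camera pose alone. The following is a simplified illustration under these assumptions, not VisionLib's implementation; the class and function names are hypothetical, and a transform is represented as a (rotation, translation) pair.

```python
def compose(a, b):
    """Return the rigid transform a * b (apply b first, then a)."""
    ra, ta = a
    rb, tb = b
    r = [[sum(ra[i][k] * rb[k][j] for k in range(3)) for j in range(3)]
         for i in range(3)]
    t = [sum(ra[i][k] * tb[k] for k in range(3)) + ta[i] for i in range(3)]
    return (r, t)

def invert(a):
    """Invert a rigid transform: (R, t) becomes (R^T, -R^T t)."""
    ra, ta = a
    rt = [[ra[j][i] for j in range(3)] for i in range(3)]
    t = [-sum(rt[i][k] * ta[k] for k in range(3)) for i in range(3)]
    return (rt, t)

IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

class ObjectPoseKeeper:
    """Keeps an object pose alive with SLAM while Model Tracking has lost the object."""

    def __init__(self):
        self.object_in_world = None

    def update(self, camera_in_world, object_in_camera):
        """camera_in_world comes from SLAM; object_in_camera is None when the object is lost."""
        if object_in_camera is not None:
            # Object visible: trust Model Tracking and anchor the object in the SLAM world.
            self.object_in_world = compose(camera_in_world, object_in_camera)
            return object_in_camera
        if self.object_in_world is None:
            return None  # The object has never been seen, so nothing can be predicted.
        # Object not visible: predict its pose from the current SLAM camera pose.
        return compose(invert(camera_in_world), self.object_in_world)

keeper = ObjectPoseKeeper()
# Frame 1: camera at the world origin, object tracked 2 m ahead.
keeper.update((IDENTITY, [0.0, 0.0, 0.0]), (IDENTITY, [0.0, 0.0, 2.0]))
# Frame 2: camera moved 1 m along x, object no longer detected.
predicted = keeper.update((IDENTITY, [1.0, 0.0, 0.0]), None)
print(predicted[1])  # translation close to [-1.0, 0.0, 2.0]
```

In this sketch the accurate object pose always comes from Model Tracking. SLAM only bridges the frames in which the object is not visible, so drift in the SLAM map affects the result only for a short time.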
Besides plain Model Tracking, VisionLib features a couple of unique functionalities, such as:
As a developer, getting into computer vision techniques usually means a dive into deep tech.
At VisionLib, we aim to make working with computer vision for XR as easy as possible, while keeping the core tracking engine accessible from outside.
Model Tracking uses edge and geometry information of objects in order to track them. By design, some objects have a shape that suits this: a geometric structure that is easy to track, which results in a good match and good tracking results. Others are almost symmetric, or have round, smooth or blunt edges. So tracking these objects might not always be a one-click success, especially in industrial AR cases.
We recommend familiarizing yourself with the core concepts of VisionLib. All articles in this and the next section (› Tracking Essentials) are a good read for more details and background information.
How to apply and use these concepts is covered in particular in our tutorial section.