VisionLib focuses on tracking objects. It uses several tracking techniques, centered around what we call Enhanced Model Tracking.
Originally, Model Tracking referred to one specific tracking technique. In recent years, it has become a synonym for object tracking and one of the most prominent techniques, especially in industrial Augmented Reality.
Model Tracking combines 3D graphics with computer vision and lets you use 3D and CAD data for detecting, localizing and tracking physical objects in the video stream.
How this works and how you can use it in your projects is the subject of this section.
The majority of AR tracking techniques use a collection of computer vision and image processing methods to enable marker-, feature-, or geometry-based (object) tracking.
In essence, Model Tracking uses geometry (i.e. shape-related) information for detecting, localizing and tracking physical objects in the video stream. To do so, it uses 3D and CAD data as tracking references: it derives the edges of the 3D model, finds the edges in the video stream, and matches the two.
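The edge-matching idea can be illustrated with a small sketch. This is not VisionLib's implementation or API: the function names and the nearest-point distance measure are assumptions chosen for illustration only. Given edge points already projected from the 3D model for several candidate poses, it picks the candidate whose edges lie closest to the edges found in the image.

```python
import math
from typing import Dict, List, Tuple

Pixel = Tuple[float, float]


def mean_edge_distance(model_edge_px: List[Pixel], image_edge_px: List[Pixel]) -> float:
    """Average distance from each projected model edge point
    to its nearest edge point detected in the image (lower is better)."""
    if not model_edge_px or not image_edge_px:
        raise ValueError("both point lists must be non-empty")
    total = 0.0
    for mx, my in model_edge_px:
        total += min(math.hypot(mx - ix, my - iy) for ix, iy in image_edge_px)
    return total / len(model_edge_px)


def best_candidate(candidates: Dict[str, List[Pixel]], image_edge_px: List[Pixel]) -> str:
    """Return the name of the candidate pose whose projected edges fit the image best."""
    return min(candidates, key=lambda name: mean_edge_distance(candidates[name], image_edge_px))


# Hypothetical data: a horizontal edge in the image and two pose hypotheses.
image_edges = [(10.0, 10.0), (20.0, 10.0), (30.0, 10.0)]
candidates = {
    "aligned": [(10.0, 11.0), (20.0, 11.0), (30.0, 11.0)],
    "shifted": [(10.0, 15.0), (20.0, 15.0), (30.0, 15.0)],
}
winner = best_candidate(candidates, image_edges)
```

A real tracker refines the pose iteratively instead of choosing from a fixed set of candidates, but the quantity being minimized is of the same kind: a disagreement between model edges and image edges.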
The 3D model thus becomes the tracking reference for the physical objects, or tracking targets. Once tracking is established, i.e. the object has been localized, VisionLib calculates and delivers all core information needed to match the coordinate systems of tracking and 3D graphics. That is the essence of Mixed and Augmented Reality applications: such mathematical descriptions make it possible to blend objects in the video stream seamlessly with 2D and 3D graphics, creating the illusion that they exist in reality.
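To make "matching the coordinate systems" more concrete, here is a minimal sketch using the standard pinhole camera model. It assumes the tracking result is available as a rotation matrix and a translation vector (object pose in camera coordinates) and that the camera intrinsics `fx`, `fy`, `cx`, `cy` are known. All names are hypothetical and not part of the VisionLib API.

```python
from typing import Sequence, Tuple

Vec3 = Tuple[float, float, float]


def transform_point(rotation: Sequence[Sequence[float]], translation: Vec3, point: Vec3) -> Vec3:
    """Map a point from model coordinates into camera coordinates: R * p + t."""
    x, y, z = (
        sum(rotation[r][c] * point[c] for c in range(3)) + translation[r]
        for r in range(3)
    )
    return (x, y, z)


def project_point(point_cam: Vec3, fx: float, fy: float, cx: float, cy: float) -> Tuple[float, float]:
    """Project a camera-space point to pixel coordinates (pinhole model, no lens distortion)."""
    x, y, z = point_cam
    if z <= 0:
        raise ValueError("point is behind the camera")
    return (fx * x / z + cx, fy * y / z + cy)


# Assumed pose: object 2 m in front of the camera, not rotated.
rotation = ((1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0))
translation = (0.0, 0.0, 2.0)
model_corner = (0.1, -0.1, 0.0)  # a point on the 3D model, in metres

pixel = project_point(
    transform_point(rotation, translation, model_corner),
    fx=800.0, fy=800.0, cx=320.0, cy=240.0,
)
```

With the same pose applied to the virtual camera of the rendering engine, any graphics placed in model coordinates end up at the pixels where the physical object appears.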
Because Model Tracking relies mainly on an object's geometry, it comes with some advantages compared to, e.g., SLAM, the state-of-the-art tracking technique and foundation behind ARKit, ARCore, HoloLens and others:
These are edge cases for, e.g., SLAM-based object tracking, because SLAM uses so-called features as its tracking reference. Unfortunately, features are somewhat pixel-based and tend not to be very stable over longer periods of time. That is why they cannot handle movement or changing light very well.
Additionally, creating SLAM data as a reference for object tracking is a very manual and somewhat inconvenient process. In particular, registering the SLAM coordinate system with that of the 2D/3D augmentation graphics takes time and is not very accurate. And given that (pre-acquired) SLAM data might break and become invalid rather quickly, this process is not very reliable for industrial applications.
While Model Tracking outperforms, e.g., SLAM in accuracy, persistence and scalability, it needs at least the 3D model data as a setup step for tracking to work.
SLAM, however, is a great technique for blending 3D graphics and reality rather spontaneously, with no prior knowledge of the scene to be tracked (the so-called 'SLAM map' can be created on the fly). For example, by quickly moving the camera around, users can spontaneously, but also arbitrarily, place superimpositions in the space around them.
Combining Model Tracking and SLAM makes it possible to fuse the advantages of each: Model Tracking to detect and track the desired object with high accuracy, and SLAM to acquire spontaneous, short-term information about the environment.
Having the latter, for instance, lets you cope with situations where the user turns the camera or device abruptly in a different or opposite direction. Because the object would then no longer be visible to the camera, tracking would stop with Model Tracking alone. Having the SLAM environment information makes it possible to look back at the object with tracking and augmentation continuing.
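One way to picture this fallback is as a composition of poses. The sketch below is an illustration under stated assumptions, not VisionLib's actual fusion logic: it assumes Model Tracking reports the object pose relative to the camera, SLAM reports the camera pose relative to a world frame, and both are 4x4 rigid transforms. The class and function names are hypothetical.

```python
from typing import List, Optional

Matrix = List[List[float]]  # 4x4 rigid transform, row-major


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Multiply two 4x4 matrices."""
    return [[sum(a[r][k] * b[k][c] for k in range(4)) for c in range(4)] for r in range(4)]


def invert_rigid(t: Matrix) -> Matrix:
    """Invert a rigid transform: rotation R^T, translation -R^T * t."""
    rot_t = [[t[c][r] for c in range(3)] for r in range(3)]
    trans = [-sum(rot_t[r][k] * t[k][3] for k in range(3)) for r in range(3)]
    return [rot_t[r] + [trans[r]] for r in range(3)] + [[0.0, 0.0, 0.0, 1.0]]


def translation(x: float, y: float, z: float) -> Matrix:
    """Helper: a pure translation as a 4x4 matrix."""
    return [[1.0, 0.0, 0.0, x], [0.0, 1.0, 0.0, y], [0.0, 0.0, 1.0, z], [0.0, 0.0, 0.0, 1.0]]


class ObjectAnchor:
    """Remembers where the object sits in the SLAM world frame."""

    def __init__(self) -> None:
        self.world_from_object: Optional[Matrix] = None

    def update(self, world_from_camera: Matrix,
               camera_from_object: Optional[Matrix] = None) -> Optional[Matrix]:
        """Return the object pose in camera coordinates, or None if it is unknown."""
        if camera_from_object is not None:
            # Object is visible: trust Model Tracking and refresh the anchor.
            self.world_from_object = matmul(world_from_camera, camera_from_object)
            return camera_from_object
        if self.world_from_object is None:
            return None  # object has never been localized
        # Object is out of view: derive its pose from the SLAM camera pose.
        return matmul(invert_rigid(world_from_camera), self.world_from_object)


anchor = ObjectAnchor()
anchor.update(translation(0.0, 0.0, 0.0), translation(0.0, 0.0, 2.0))  # object seen 2 m ahead
fallback = anchor.update(translation(1.0, 0.0, 0.0))  # camera moved 1 m, object lost
```

In this sketch the more accurate Model Tracking pose always takes priority when the object is visible, and the SLAM-derived pose only bridges the gaps.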
Besides plain Model Tracking, VisionLib features a couple of unique functionalities, such as:
As a developer, getting into computer vision techniques usually means diving into deep tech.
At VisionLib, we aim to make working with computer vision for XR as easy as possible, while keeping the core tracking engine accessible from outside.
Model Tracking uses edge and geometry information of objects in order to track them. By design, some objects have an advantageous shape and an easily trackable geometric structure, which results in a good match and good tracking results. Others are almost symmetric, or have round, smooth or blunt edges. Tracking these objects might therefore not always be a one-click success, especially in industrial AR cases.
We recommend familiarizing yourself with the core concepts of VisionLib. All articles in this and the next section (› Tracking Essentials) are a good read for more details and background information.
Their application and usage are covered in our tutorial section.