
Model Tracker

Tracking configuration parameters for setting up a model tracker.

The "modelTracker" uses model-based tracking for determining the camera pose relative to a real-world object.

Configuration File Parameters

The following parameters can be set inside the tracking configuration file:

Each parameter entry below is listed in the form: Parameter, Type (with value range where applicable), Default value, Runtime set-/getable, Example.
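For orientation, a minimal tracking configuration using these parameters might look as follows. This is only a sketch: the surrounding "tracker" envelope is an assumption based on typical VisionLib configuration files, and only the keys inside "parameters" are documented on this page.

{
  "type": "VisionLibTrackerConfig",
  "version": 1,
  "tracker": {
    "type": "modelTracker",
    "version": 1,
    "parameters": {
      "modelURI": "project-dir:/mymodel.obj",
      "metric": "m",
      "useColor": false
    }
  }
}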

modelURI string not defined YES "modelURI": "http://mysuperserver/mymodel.ply"

URI to the mandatory surface-model.

The following file formats are generally supported: 3D, 3DS, AC, ASE, B3D, BLEND, BVH, COB, CSM, DAE, DXF, FBX, GLB, GLTF, HMP, IFC, IRR, IRRMESH, LWO, LWS, LXO, MD2, MD3, MD5, MDC, MDL, MS3D, NDO, NFF, OBJ, OFF, OGEX, PK3, PLY, Q3D, Q3S, RAW, SCN, SMD, STL, TER, VTA, X, XGL, XML and ZGL.

Using FBX files can lead to problems if the file uses an internal scaling factor. Depending on the model loader you are using, the resulting 3D-model may be scaled by that factor. In particular, the scaling factor is interpreted differently by our library than by Unity.

We recommend using OBJ and PLY files, because these formats are well tested, can easily be produced using MeshLab (http://meshlab.sourceforge.net/), and (in the case of PLY) store the data in binary form.

Using http URIs is currently not possible on UWP, which includes HoloLens. If you require this feature, please contact us.

lineModelURI string not defined NO "lineModelURI": "project-dir:/mylinemodel.obj"

URI to line-model. If this is not defined, then the line-model will be generated automatically at runtime using the surface-model. The supported file format is OBJ.

occlusionModelURI string not defined NO "occlusionModelURI": "project-dir:/box.obj"

Optional URI to an occlusion model. The occlusion model can be used to occlude parts of the generated line-model. This can be handy if the 3D-model used to generate the line-model (modelURI) contains structures which do not reliably match the real-world geometry. For example, the hubcaps of a car might always look different, because one can't control the rotation of the tires. In that case you can use a simple occlusion model to cover the problematic parts. This should improve the tracking results, because the generated line-model will not include those unreliable structures.

Supports the same file formats as the modelURI parameter.
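As an illustration, a car model with a separate occlusion model covering the wheel regions could be configured like this (both file names are hypothetical):

"modelURI": "project-dir:/car.obj",
"occlusionModelURI": "project-dir:/wheel_covers.obj"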

initDataURI string not defined NO "initDataURI": "project-dir:InitData.binz"

URI to previously recorded initialization data. This allows an automatic initialization from multiple views.

initPose json not defined YES
"initPose": {
"type": "visionlib",
"t": [-0.1, -0.2, 6.1],
"r": [0.1, -0.1, 0.7, -0.2]
}

This defines the initial pose for the initialization phase.

A "visionlib" init pose consists of a translation $t$ (vector t) and a rotation $R$ (quaternion r). Please notice, that the transformation $(R,t)$ does not represent the position and orientation of the camera directly. Instead it represents the transformation of a 3D point $P_w$ from world coordinates into a 3D point $P_c$ in camera coordinates: $P_c = RP_w + t$. In the internal VisionLib coordinate system the x-axis points right, the y-axis points down and the z-axis points forward. Instead of specifying t and r directly in the configuration file, you can also set the "uri" property to the path of a separate JSON file (e.g. "uri": "project-dir:InitialPose.json"), which must contain one object with t and r.

metric string mm NO "metric": "cm"

This should be set to the unit of measurement of your model. Valid values are metric units ("mm", "cm", "dm", "m" or "km") and imperial units ("in", "ft", "yd", "ch", "fur", "ml").

useColor boolean false NO "useColor": true

This option allows VisionLib to recognize edges between pixels with the same intensity value, if they differ in hue or saturation. It can increase the tracking quality, at the cost of needing more resources and processing power. Turn this setting on if the object cannot be distinguished from the background by intensity alone, or if internal edges that are easily separated by their hue should be used.

keyFrameDistance float [0.001..100000.0] 100.0 YES "keyFrameDistance": 200

The minimum distance between keyframes in mm. The line-model will only be generated for certain keyframes. Therefore a higher value improves the performance at the cost of a lower precision, while lower values cost more performance but increase the precision.
keyFrameRotation float 20.0 YES "keyFrameRotation": 10.0

Specifies the sensitivity of keyframe generation with respect to the rotation of the camera, in degrees. You should not change this parameter unless you know what you are doing! Lower values cost more performance but increase the precision.

laplaceThreshold float [0.001..100000.0] 5.0 YES "laplaceThreshold": 7

Threshold in mm for generating the line-model. This value specifies the minimum depth distance between two neighboring pixels necessary to be recognized as an edge.

normalThreshold float [0.001..1000.0] 1000.0 YES "normalThreshold": 1000.0

Threshold for generating the line-model. This value specifies the minimum normal difference between two neighboring pixels necessary to be recognized as an edge. Usually we set this threshold very high, because from our experience normal-based lines can't be recognized very reliably. It might make sense to use lower values for certain models, though.

textureColorSensitivity float [0.0..1.0] 0.0 YES "textureColorSensitivity": 0.8

Sensitivity for generating the line-model based on the model's color texture. It can only be used if your model provides texture data. Using a high value will extract many edges from the texture; using low values will extract very few edges from the texture. The value 0.0 means that no edges will be extracted from the texture, so the texture information won't be used for tracking at all.

lineGradientThreshold integer [0...765] 40 YES "lineGradientThreshold": 50

Threshold for edge candidates in the image. High values will only consider pixels with high contrast as candidates. Low values will also consider other pixels. This is a trade-off. If there are too many candidates the algorithm might choose the wrong pixels. If there are not enough candidates the line-model might not stick to the object in the image.

lineSearchLengthInit (Deprecated) integer [3...1000] 15 YES "lineSearchLengthInit": 17

The model-based tracker projects the 3D line-model into the camera image and searches for edge pixels orthogonal to the projected lines. The "lineSearchLengthInit" specifies the length of the orthogonal search lines in pixels during the initialization phase. Please use lineSearchLengthInitRelative and lineSearchLengthTrackingRelative instead, since those parameters work resolution-independently.
lineSearchLengthTracking (Deprecated) integer [3...1000] 15 YES "lineSearchLengthTracking": 17

Same as "lineSearchLengthInit", but used during the tracking phase.

lineSearchLengthInitRelative float [0.0...1.0] 0.05 YES "lineSearchLengthInitRelative": 0.07

The model-based tracker projects the 3D line-model into the camera image and searches for edge pixels orthogonal to the projected lines. The "lineSearchLengthInitRelative" specifies the length of the orthogonal search lines as a fraction of the shorter image side length during the initialization phase. If the parameter is set inside the range [0.0...1.0] but the resulting absolute length is less than three pixels, the search lines will have a length of three pixels. Please use lineSearchLengthInitRelative and lineSearchLengthTrackingRelative in favor of the non-relative parameters, since they work resolution-independently.
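For example, assuming a camera image of 1920x1080 pixels (an assumed resolution), "lineSearchLengthInitRelative": 0.05 results in search lines of 0.05 × 1080 = 54 pixels.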

lineSearchLengthTrackingRelative float [0.0...1.0] 0.03125 YES "lineSearchLengthTrackingRelative": 0.03125

Same as "lineSearchLengthInitRelative", but used during the tracking phase.

minNumOfCorrespondences integer [5...100000] 50 YES "minNumOfCorrespondences": 100

The minimum number of found correspondences between the projected line-model and the edge pixels in the camera image. If there are not enough correspondences the tracking result will get marked as invalid. This is a trade-off. If the value is too high, you can't track objects which only take little space in the image (e.g. because the user is too far away from the object). If the value is too low, the algorithm might start tracking the wrong object.

maxNumOfCorrespondences integer -1 NO "maxNumOfCorrespondences": -1

This option restricts VisionLib to using at most the given number of correspondences for optimizing the pose. In scenarios where models with many edges are tracked, you can limit the processing effort with this parameter. If you experience lag due to large models, this parameter can help to increase runtime performance. However, decreasing the value too much will lead to a less precise pose. If set to a value below 1 (the default), all correspondences will be taken into account.

minInitQuality float [0.0..1.0] 0.6 YES "minInitQuality": 0.76

This is a quality threshold for validating the tracking result during the initialization phase. The value highly depends on your scenario. If the line-model matches the real object perfectly and there is no occlusion a high value is recommended. However, usually the line-model doesn't perfectly match the real object which is why a lower value might work better. Then again if the value is too low the algorithm might start tracking the wrong object. In our experience it is better if the "minInitQuality" value is higher than the value for "minTrackingQuality", because it's difficult for the algorithm to recover from a bad initialization.

minTrackingQuality float [0.0..1.0] 0.55 YES "minTrackingQuality": 0.76

This is a quality threshold for validating the tracking result after the initialization phase. The value highly depends on your scenario. If the line-model matches the real object perfectly and there is no occlusion a high value is recommended. However, usually the line-model doesn't perfectly match the real object which is why a lower value might work better. Then again if the value is too low the algorithm might start tracking the wrong object. In our experience it is better if the "minTrackingQuality" value is lower than the value for "minInitQuality", because we don't want the tracking to fail after the initialization due to effects like motion-blur.
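Following this recommendation, the two quality thresholds are typically configured together, with the init quality above the tracking quality, e.g.:

"minInitQuality": 0.76,
"minTrackingQuality": 0.55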

maxFramesFeaturePrediction integer [1-5000] 30 YES, except when using ARKit "maxFramesFeaturePrediction": 25

This value defines the maximum number of frames which are used for predicting the pose when the model tracking step cannot be successfully validated. The prediction is useful for fast camera movements and motion blur. The model tracking step might fail during these cases, but the prediction will provide a rough solution. The prediction is not robust and might drift away. If the maximum number of predicted frames is exceeded the re-initialization step will be invoked.

extendibleTracking boolean false NO "extendibleTracking": true

If this value is set to true the model-based tracking will be extended with a SLAM-based tracking. This allows you to continue tracking even if the model isn't visible in the camera image anymore. The user needs to perform a SLAM dance, which means to translate and rotate the camera, so that there is enough baseline for the feature reconstruction. NOTE: This property can only be used with an external SLAM source like ARCore, ARKit, HoloLens or ARFoundation. It will be ignored otherwise.

legacyCameraMode boolean false NO "legacyCameraMode": true

Only on Android devices: If set, VisionLib will not use ARCore to acquire images, even if it is present, and the intrinsic calibration will not be retrieved from ARCore. Use this if ARCore is present but the performance of the device is too limited to use it together with VisionLib tracking.

poseFilteringSmoothness float [0.0..1000] 0.0 NO "poseFilteringSmoothness": 0.25

This value defines the smoothness of the pose filter. Lower values make the filter smoother but more lagged; higher values make it less lagged but less smooth. Setting the value to 0 turns off the filter.

debugLevel integer [0..1] 0 NO "debugLevel": 1

This value specifies the amount of visual output for debugging purposes. Debug level 0 generates no debug information at all; this mode is faster and should be used for a final release. Debug level 1 produces images visualizing several pieces of debug information. Debug level 2 is a mode for internal debugging purposes; only use it if you know what you want to do with it. Enabling debug output can significantly harm the performance of the tracking pipeline.

showLineModel boolean false YES "showLineModel": true

This option allows you to draw the line-model into the camera image. The line-model will be drawn during all tracking states. If you need more fine-grained control, you can use additional parameters to visualize the three possible tracking states separately.

"showLineModel": {
"enabled": {
"tracked": true,
"critical": true
}
}

Enabling one of these flags has the following effect:

  • "tracked" draws the line-model into the camera image while the object is tracked successfully.
  • "critical" draws the line-model into the camera image while the tracking is critical.
  • "lost" draws the line-model into the camera image while the tracked object is lost.

To change the color used for drawing the line-model in the corresponding tracking states, define the "color" parameter.

"showLineModel": {
"enabled": {
"tracked": true,
"critical": true
},
"color": {
"tracked": [
0,
0,
255
],
"critical": [
0,
255,
255
]
}
}

The color value must be an array with three numeric values between 0 and 255. The first value represents the red-component, the second value the green-component and the third value the blue-component ([red, green, blue]).

To enable a color-coded visualization of the tracking state of each correspondence used for drawing the line-model, set the "color" parameter to "perCorrespondency".

"showLineModel": {
"color": "perCorrespondency"
}
*Please notice, that the result might not be visually appealing and glitched graphics should be expected.* Therefore providing your own visualization should be preferred.

synchronous boolean false NO "synchronous": true

This parameter exists ONLY FOR TESTING PURPOSES. Don't set it to true unless you really know what you are doing! Usually the tracking utilizes multiple threads. If this parameter is set to true, the whole tracking process will run in a single thread. This reduces the performance, but in combination with the synchronous worker interface it allows you to get deterministic tracking results. This is useful if you want to get the same tracking results for a sequence of images independent of the current processor and GPU utilization.

enableTorch boolean false YES "enableTorch": true

If this value is set to true, the light of the camera will be enabled (if available). For now, this function can only be used on iOS.

lineModelBufferSize integer 10 YES "lineModelBufferSize": 100

This value specifies the buffer size for caching line-models. The generated hypotheses are cached in this structure and can be serialized when lineModelPersistence is enabled.

lineModelGeneration boolean true YES "lineModelGeneration": true

Controls the generation of the hypotheses used for tracking. If disabled, you need to have hypotheses loaded or already generated in memory in order to allow tracking. Please keep in mind that hypotheses are position-dependent and might change for every view point with respect to the tracked model.

lineModelRenderSize integer 1024 NO "lineModelRenderSize": 2048

Resolution (width and height) in pixels of the internally rendered hypothesis image. Set to 0 to use the resolution of the tracking image. Using a high resolution (e.g. 2048) will result in more detailed line-models. This is especially noticeable in regions with many small details in close proximity; with a low resolution, those details would simply be interpreted as noise and filtered out. A high resolution is a trade-off: it might improve the line-model appearance, but it will deteriorate the performance at the same time. Using a low resolution (e.g. 512) will result in worse line-models, but with improved efficiency.

lineModelPersistence boolean false NO "lineModelPersistence": true

When enabled, the writeInitData and readInitData commands additionally try to write/read the generated hypotheses along with the init data. In conjunction with the lineModelGeneration and lineModelBufferSize parameters, you can pre-load/learn line-models for the tracked object without the need to generate hypotheses on the fly.
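As a sketch, a configuration that tracks exclusively from previously learned line-models (assuming init data including hypotheses has been written in an earlier session) might combine the three parameters like this:

"lineModelGeneration": false,
"lineModelBufferSize": 100,
"lineModelPersistence": true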

learnTemplates boolean true YES "learnTemplates": false

Controls whether the algorithm learns new init data at runtime.

staticScene boolean false NO "staticScene": false

When tracking unchangeable (static) scenes on devices with additional SLAM capabilities (iOS with extendibleTracking, or HoloLens), it can be useful to turn this flag on. This knowledge helps VisionLib stabilize the tracking. Model tracking will run decoupled from SLAM when using "staticScene": true. Use it in combination with "poseFilteringSmoothness": 0.25 for very smooth augmentations.
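For example, for a static scene on a SLAM-capable device:

"staticScene": true,
"poseFilteringSmoothness": 0.25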

estimateWorldScale boolean false NO "estimateWorldScale": false

Experimental setting to estimate the scale factor between world and model coordinate system based on SLAM transformations and tracking results.

simulateExternalSLAM boolean false NO "simulateExternalSLAM": true

If you have recorded an image sequence along with extendibleTracking, it is possible to simulate the whole sequence on a desktop machine. Enabling this flag replays the device recording through the exact same pipeline that would run on the device itself. However, this applies only to tracking; additional data, like detected planes and ARWorldMap data, cannot be used.

allowedNumberOfFramesSLAMPrediction integer NO "allowedNumberOfFramesSLAMPrediction": 120

If the tracking gets lost and extendibleTracking is activated, VisionLib predicts the pose of the object in the world via the SLAM transform. This parameter limits the number of frames for which purely SLAM-based pose predictions may keep tracking alive, i.e. prevent the transition from status "critical" to "lost". The counter starts when the tracking state transitions to "critical", and every following frame in which model tracking was unsuccessful increases the counter. When allowedNumberOfFramesSLAMPrediction is reached, the tracking state changes to "lost" and the counter is reset. If tracking is recovered (i.e. the state returns to "tracked") before this threshold is reached, the counter is also reset.

Note: If extendibleTracking is enabled and while tracking is "critical", both allowedNumberOfFramesSLAMPrediction and allowedNumberOfFramesSLAMPredictionObjectVisible are active. Tracking is "lost" as soon as the first of the two thresholds is reached. Both counters are then reset.

allowedNumberOfFramesSLAMPredictionObjectVisible integer NO "allowedNumberOfFramesSLAMPredictionObjectVisible": 60

This parameter behaves the same as allowedNumberOfFramesSLAMPrediction with one key difference: here, the frame counter only increases while the SLAM-predicted object location is inside the camera's field of view. If the predicted object location moves out of the field of view, the counter is paused, and counting resumes at the same number when the predicted object location re-enters the field of view.

Note: If extendibleTracking is enabled and while tracking is "critical", both allowedNumberOfFramesSLAMPrediction and allowedNumberOfFramesSLAMPredictionObjectVisible are active. Tracking is "lost" as soon as the first of the two thresholds is reached. Both counters are then reset.
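Combining the prediction limits with extendibleTracking could look like this (values taken from the examples above):

"extendibleTracking": true,
"allowedNumberOfFramesSLAMPrediction": 120,
"allowedNumberOfFramesSLAMPredictionObjectVisible": 60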

constraint (type: "1DRotation") object undefined YES*

"constraint":
{
"type": "1DRotation",
"parameters": {
"up_world": {
"x": 0,
"y": 1,
"z": 0
},
"up_model": {
"x": 0,
"y": 1,
"z": 0
},
"center_model": {
"x": 0,
"y": 0,
"z": 0
}
}
}

Makes sure that the up-vector of the model and the up-vector of the world are aligned. Enabling this constraint will lead to any tracking result transformation CameraFromModel fulfilling up_world = WorldFromCamera * CameraFromModel * up_model, where WorldFromCamera is the inverse SLAM transformation. When the model is initially rotated to fulfill the constraint, it will be rotated around center_model.

Note: This feature can only be used when external SLAM is available or simulateExternalSLAM is switched on.

The constraint can also be set or updated at runtime with the command set1DRotationConstraint. The param of this command is the same as the content of parameters above. To disable the constraint at runtime, you can use the command disableConstraints without any param.
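A runtime command for setting the constraint might then look as follows (a sketch; the {"name", "param"} envelope is an assumption based on the worker's JSON command interface):

{
  "name": "set1DRotationConstraint",
  "param": {
    "up_world": { "x": 0, "y": 1, "z": 0 },
    "up_model": { "x": 0, "y": 1, "z": 0 },
    "center_model": { "x": 0, "y": 0, "z": 0 }
  }
}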

You can find some additional parameters in the image recorder configuration description. These allow you to record an image sequence on your device.

Runtime parameters

Of the parameters mentioned above, the following can also be accessed at runtime using the "setAttribute" and "getAttribute" JSON commands of the Worker interface, or by using the VLRuntimeParameters_ModelTracker_v1 prefab in Unity:

  • keyFrameDistance
  • keyFrameRotation
  • laplaceThreshold: Changing this will only have an effect for future keyframes.
  • normalThreshold: Changing this will only have an effect for future keyframes.
  • lineSearchLengthInit
  • lineSearchLengthTracking
  • lineSearchLengthInitRelative
  • lineSearchLengthTrackingRelative
  • minInitQuality
  • minTrackingQuality
  • debugLevel
  • ...

The "setInitPose" and "getInitPose" JSON commands can be used to change init pose as well.