Depth / LIDAR Data

While most of this documentation focuses on the visible-wavelength images that are typical of current-day machine learning, this article will focus on the depth information that is included with Limbo synthetic images. To get started, we’ll load our test dataset and choose an appropriate sample:

[1]:

from IPython.display import display
import limbo.data

dataset = limbo.data.Dataset("../data")

sample = dataset[2]

if sample.image:
    display(sample.image)

Imagine that you are training a robot to navigate around containers in the real world. In addition to visible-wavelength cameras, the robot has LIDAR sensors that provide distance information. How would you train a model to use both types of data? Synthetic Limbo Data Format images can contain per-pixel depth (distance to the viewer) information that is perfect for this use case:

[6]:

if sample.synthetic:
    display(sample.synthetic.depth)

… hmm, that doesn’t look perfect. Fear not, this is due to the much wider dynamic range of values in the depth information, which records the distance (in meters) from the content of each pixel to the camera. To make the data visible, we need to get a sense of the distribution of values in the image. The limbo.data.Synthetic.depth property returns the depth information as an imagecat.data.Image from the Imagecat library, so we need to do a little digging to get at the raw data:

[8]:

depth = sample.synthetic.depth.layers["Z"].data.squeeze()
depth.min(), depth.max()

[8]:

(4.8908067, 3690.8577)
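
The minimum and maximum only describe the extremes. As a minimal sketch of how the rest of the distribution could be summarized, the following uses plain NumPy on a stand-in array. The array contents, the "sky" band, and the names `p5`, `p50`, `p95`, and `near_fraction` are assumptions made for illustration; in practice you would use the `depth` array extracted above instead.

```python
import numpy as np

# Stand-in for the real depth layer: mostly nearby surfaces plus a band of
# very distant "sky" pixels. Substitute the `depth` array extracted above.
rng = np.random.default_rng(0)
depth = rng.uniform(5.0, 50.0, size=(64, 64)).astype(np.float32)
depth[:16, :] = 3690.0  # hypothetical sky region (top quarter of the image)

# Percentiles are less sensitive to a few extreme values than min/max.
p5, p50, p95 = np.percentile(depth, [5, 50, 95])

# Fraction of pixels that lie within 50 meters of the camera.
near_fraction = float(np.mean(depth <= 50.0))

print(p5, p50, p95, near_fraction)
```

If a large share of pixels sits far beyond the median, as the sky does here, a display range chosen from the percentiles will usually be more useful than one chosen from the raw minimum and maximum.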

Note that the farthest pixels in the image are over three kilometers away (!) and clearly must be sky. Let’s focus on everything within about 50 meters of the camera (the code below maps the range from 6 to 50 meters), and remap the distance to a white-to-black colormap:

[19]:

import imagecat.color.brewer
import PIL.Image

palette = imagecat.color.brewer.palette("Greys", reverse=True)
visible = imagecat.color.linear_map(depth, palette, min=6, max=50)
visible = (visible * 255).astype("ubyte")
PIL.Image.fromarray(visible)

Now we can see that the depth information makes sense, with brighter pixels representing surfaces that are closer to the camera.
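
To make the remapping step concrete, here is a rough NumPy-only sketch of a clamped linear map from depth to 8-bit gray. It is not Imagecat's implementation: the function name `depth_to_gray`, its parameters `near` and `far`, and the choice to clamp out-of-range values are all assumptions made for illustration.

```python
import numpy as np

def depth_to_gray(depth, near, far):
    """Map depths in [near, far] to 8-bit gray, with the nearest depth white.

    Hypothetical helper for illustration only.
    """
    # Clamp so that anything nearer than `near` or beyond `far` saturates.
    clipped = np.clip(depth, near, far)
    # Normalize to [0, 1], where 0 is the nearest depth and 1 the farthest.
    t = (clipped - near) / (far - near)
    # Invert so near surfaces are bright, then scale to the 0..255 range.
    return np.round((1.0 - t) * 255).astype(np.uint8)

# A tiny made-up depth map: two near pixels, one mid-range, one "sky" pixel.
sample_depth = np.array([[4.9, 6.0], [28.0, 3690.0]], dtype=np.float32)
gray = depth_to_gray(sample_depth, near=6.0, far=50.0)
print(gray)
```

With this mapping, every pixel at or beyond `far` becomes black, so the sky no longer dominates the dynamic range of the image.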

Since LIDAR information is typically much lower-resolution than visible-wavelength data, and may have a narrow range of detectable distances based on time-of-flight, we assume that researchers will need to spatially downsample the depth data to match whatever real-world sensors they are targeting.
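
As one possible starting point, the sketch below block-averages a depth map and discards returns beyond a maximum range. Everything here is an assumption for illustration: the function name `simulate_lidar`, the `factor` and `max_range` parameters, and the use of simple block averaging, which does not model the scan pattern or noise of any particular sensor.

```python
import numpy as np

def simulate_lidar(depth, factor, max_range):
    """Block-average a 2D depth map and drop returns beyond `max_range`.

    Hypothetical helper: blocks with no in-range pixels are set to NaN.
    """
    h, w = depth.shape
    # Crop so both dimensions divide evenly by the block size.
    h, w = h - h % factor, w - w % factor
    blocks = depth[:h, :w].reshape(h // factor, factor, w // factor, factor)

    # Only pixels within the sensor's range contribute to each block.
    valid = blocks <= max_range
    counts = valid.sum(axis=(1, 3))
    totals = np.where(valid, blocks, 0.0).sum(axis=(1, 3))

    # Average the valid pixels; leave NaN where a block has none.
    out = np.full(counts.shape, np.nan)
    np.divide(totals, counts, out=out, where=counts > 0)
    return out

# Made-up 4x4 depth map (meters); 4000.0 stands in for sky.
depth_map = np.array([
    [10.0, 12.0, 4000.0, 4000.0],
    [14.0, 16.0, 4000.0, 4000.0],
    [20.0, 20.0, 30.0, 4000.0],
    [20.0, 20.0, 50.0, 4000.0],
])
coarse = simulate_lidar(depth_map, factor=2, max_range=100.0)
print(coarse)
```

Averaging is only one choice. Taking the minimum depth in each block, or sampling a single pixel per block, may be a closer match for some sensors.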