Limbo Data
Limbo Data is a growing collection (~1,000,000 images and counting) of synthetic computer vision training data created for our research. The subject matter of the dataset is uranium hexafluoride containers that are part of the nuclear fuel cycle. Can you guess which of the following images is real?
Answer: none! Every one of these images is computer-generated imagery (CGI). And because these images are generated synthetically, they also include pixel-perfect matte, contour, and bounding box annotations, which can be accessed using the Limbo Software.
Campaigns
The data has been divided into the following campaigns … see the documentation on each campaign for details on its contents:
Downloads
Tip
In our testing, downloads using the Safari browser often failed due to “too many HTTP redirects”. Consider using a different browser instead.
To obtain the Limbo Data, you’ll need to do the following:
Visit https://bdc.lbl.gov and register a new account.
At the bottom of the BDC New User Registration form, be sure to select the snl-limbo organization.
Once your account is registered, visit https://bdc.lbl.gov and log in.
At the top of the window, choose Data Workspace.
Click the Workspace button with the gear icon.
Under Projects enable the SNL-Limbo filter.
Under Data Collections choose the snl-limbo collection.
At the bottom of the Workspace Configuration window, choose Save followed by Close.
Click the Query Generator button.
Under Data Collections, enable the snl-limbo collection.
Click the Load Data Collections button.
Select the Individual non-HDF5 files tab.
In the Files dropdown, click the + icons until all of the files are visible.
Select the files you wish to download.
Each campaign has been split into tar.gz files named campaignXX-YYYY.tar.gz that are roughly 5-8GB each.
We encourage you to explore the breadth of the data by downloading one tar.gz file from each campaign, before downloading entire campaigns.
Download times can be very long and institutional firewalls can be finicky, so we encourage you to start with a single file before ramping up, and to download only as many files as you can reasonably use for training.
Click the Submit Query button at the bottom of the window.
A popup window should open to indicate that the query was submitted successfully.
Close the popup window.
Click the Download Manager at the top of the window.
Your download query will appear as a row in a table.
The Download column for your query will likely read “Unavailable” until your download is ready.
It may take a significant amount of time before your download is ready. Refresh your browser periodically to see whether the download is ready.
Once the download is ready, use the Download link. Be careful not to click Copy to Jupyterhub.
The files you selected will be delivered as a single ZIP file. Unzip the downloaded file, then untar the data files.
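The final unzip-then-untar step can be scripted. The following is a minimal sketch using only the Python standard library; the function name `extract_download` and the `archives` and `data` subdirectories are hypothetical choices made for illustration, and it assumes the ZIP file contains the campaignXX-YYYY.tar.gz files you selected.

```python
import tarfile
import zipfile
from pathlib import Path


def extract_download(zip_path, out_dir):
    """Unzip the delivered ZIP, then untar every .tar.gz found inside it.

    Returns the names of the tar.gz archives that were unpacked.
    """
    out_dir = Path(out_dir)
    archive_dir = out_dir / "archives"  # hypothetical staging directory
    data_dir = out_dir / "data"         # hypothetical destination directory
    archive_dir.mkdir(parents=True, exist_ok=True)
    data_dir.mkdir(parents=True, exist_ok=True)

    # Step 1: unpack the single ZIP file delivered by the Download Manager.
    with zipfile.ZipFile(zip_path) as zf:
        zf.extractall(archive_dir)

    # Step 2: untar each campaign archive found anywhere inside the ZIP.
    extracted = []
    for tar_path in sorted(archive_dir.rglob("*.tar.gz")):
        with tarfile.open(tar_path, "r:gz") as tf:
            tf.extractall(data_dir)
        extracted.append(tar_path.name)
    return extracted
```

Because each archive is roughly 5-8GB, make sure the destination has room for both the tar.gz files and their unpacked contents, or delete each archive after it has been unpacked.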
Usage
You’re ready to use the data! If you already use Python for your experiments, you’ll likely want to use the Limbo Software to integrate the data into your workflow; if not, you can read the Limbo Data Format to understand how the data is stored.
Out of the box, our .tar.gz files contain a PNG image for each sample plus a JSON metadata file containing contour, bounding box, and tag annotations ready to use for training object segmentation, object detection, and classification models. If you need high dynamic range, depth, matte (bitmap mask), or other information, you’ll need to extract them using the Limbo Software.
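Before wiring the data into a training pipeline, it can help to take a quick inventory of what an unpacked archive contains. The sketch below is an illustration only: the function name `summarize_extracted` and its `limit` parameter are hypothetical, and it deliberately assumes nothing about the JSON field names or the directory layout, which are described in the Limbo Data Format. It counts the PNG and JSON files under a directory and reports the top-level keys of the first few JSON files.

```python
import json
from pathlib import Path


def summarize_extracted(data_dir, limit=5):
    """Count PNG and JSON files and report the top-level shape of a few JSON files."""
    data_dir = Path(data_dir)
    png_files = sorted(data_dir.rglob("*.png"))
    json_files = sorted(data_dir.rglob("*.json"))

    shapes = {}
    for path in json_files[:limit]:
        with open(path, encoding="utf-8") as f:
            doc = json.load(f)
        key = path.relative_to(data_dir).as_posix()
        # Record only what is observed; no field names are assumed.
        if isinstance(doc, dict):
            shapes[key] = sorted(doc.keys())
        else:
            shapes[key] = type(doc).__name__
    return {"png": len(png_files), "json": len(json_files), "shapes": shapes}
```

Comparing the reported keys against the Limbo Data Format documentation is a quick way to confirm that an archive was unpacked completely.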
Terms of Use
The imagery and associated metadata within the Limbo repository was prepared as an account of work sponsored by an agency of the United States Government. Neither the United States Government, nor any agency thereof, nor any of their employees, nor any of their contractors, subcontractors, or their employees, makes any warranty, express or implied, or assumes any legal liability or responsibility for the accuracy, completeness, or usefulness of any information, data, apparatus, product, or process disclosed, or represent that its use would not infringe privately owned rights. Any reference herein to any specific commercial product, process, or service by trade name, trademark, manufacturer, or otherwise, does not constitute or imply its endorsement, recommendation, or favoring by the United States Government, any agency thereof, or any of their contractors or subcontractors. Any views and opinions expressed herein do not state or reflect those of the United States Government, any agency thereof, or any of their contractors. Neither the names of the copyright holders nor the United States Government may be used to endorse or promote products derived from the imagery and associated metadata without specific prior written permission.
THE IMAGERY AND ASSOCIATED METADATA WITHIN THE LIMBO REPOSITORY IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS “AS IS” AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS OR THE U.S. GOVERNMENT BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THE IMAGERY AND ASSOCIATED METADATA WITHIN THE LIMBO REPOSITORY, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
Researcher shall use the Limbo repository only for non-commercial research and educational purposes. Commercial use of the Limbo repository and/or any of its contents is strictly prohibited.
Researcher accepts full responsibility for his or her use of the Limbo repository and shall defend and indemnify the Limbo team, National Technology & Engineering Solutions of Sandia, LLC (NTESS), Lawrence Berkeley National Laboratory, the U.S. Department of Energy, and the U.S. Government, including their employees, Trustees, officers and agents, against any and all claims arising from Researcher’s use of the Database.
NTESS and Lawrence Berkeley National Laboratory reserve the right to terminate Researcher’s access to the Limbo repository at any time.
The images herein have been released as unclassified, unlimited release with the following release numbers:
Campaigns 2-5: SAND2021-9615 O
Campaigns 6-9: SAND2022-0008 O
Campaigns 10-12: SAND2022-6864 O
Campaigns 13-15: SAND2022-9272 O
Campaigns 16-20: SAND2022-13738 O
Reference images: SAND2022-0160 O
Reference-2 images: SAND2022-10310 O
Adversarial examples: SAND2022-10311 O
3D models and textures: SAND2024-08317 O
Reference Images
The term “reference images” refers to a collection of real-world images collected from open sources. The licensing of each individual image is documented to the best of our ability. While the use of these images may fall under “fair use” for research and development projects, the publication of some may be limited depending on their license. It is the responsibility of each user of these images to cite and publish these images in accordance with their license requirements.
Synthetic Images
All synthetic images within Limbo belong to NTESS, and are released to the public as unclassified, unlimited release (UUR) information. When using these images, please cite:
Copyright 2022 National Technology & Engineering Solutions of Sandia, LLC. The U.S. Government has certain rights to the synthetic images.
Gastelum, Z.N., Shead, T.M., and Rushdi, A., “A Large Safeguards-Informed Hybrid Imagery Dataset for Computer Vision Research & Development.” Proceedings of the Institute of Nuclear Materials Management and European Safeguards Research and Development Association Joint Annual Meeting, September 2021. Available at: https://esarda.jrc.ec.europa.eu/esarda-43rd-joint-annual-meeting_en
We further recommend that any use of an individual image include its associated SAND number, listed by campaign above.
Annotations & Metadata
The annotations and other metadata provided with the real and synthetic images within Limbo are provided for research purposes only and are provided as-is. The metadata belong to NTESS, and are released to the public as unclassified, unlimited release (UUR) via the same SAND release number as the corresponding images.
Notice: This work was produced by National Technology & Engineering Solutions of Sandia, LLC under contract No. DE-NA0003525 with the U.S. Department of Energy. The United States Government retains a non-exclusive, paid-up, irrevocable, world-wide license to publish or reproduce this work, or allow others to do so, for United States Government purposes. The Department of Energy provides public access to federally sponsored research in accordance with the DOE Public Access Plan https://www.energy.gov/downloads/doe-public-access-plan.