Spend less time wrangling data, and more time building the models you love.

SceneBox was built from the ground up to save you time, with feature-rich toolsets that help you find and curate the right data to train your models.

Data Discovery

Bring structure to your unstructured datasets, then sift through them for valuable data.

Rich Interface for Discovery & Dashboards

Effortlessly explore and summarize your data in SceneBox's rich web interface. Use advanced queries, interactive dashboards, and flexible filters to curate training datasets, examine bias, and uncover data bugs. Searching and summarizing your data has never been easier.

Exploration with Embeddings

ML-generated embeddings are a powerful way to explore vast datasets. Use SceneBox's embeddings view to visualize your data, then cluster, select, curate, and spot outliers by diving into various corners of your dataset. You can either bring your own embedding spaces or use SceneBox's model zoo to index your data.

Synchronized Temporal Data

Managing and searching your multi-sensor temporal data has never looked so good. Perception logs are often composed of multiple streams of data from various sources including RGB cameras, Lidars, time-series, GPS, etc.

The SceneBox event engine synchronizes data across any number of sources to enable queries such as: "Find all the Lidar scenes when a car and a pedestrian were detected from a side camera, vehicle speed was >50 km/h, and it was raining, in Seattle, at night".
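To make the idea concrete, here is a minimal sketch of a synchronized query in plain Python. It is not the SceneBox event engine: the stream names, sample values, and the `nearest_sample` and `find_scenes` helpers are all hypothetical, and nearest-timestamp matching is just one simple way to align sensors that record at different rates.

```python
from bisect import bisect_left

def nearest_sample(timestamps, values, t):
    """Return the value whose timestamp is closest to t (timestamps sorted ascending)."""
    i = bisect_left(timestamps, t)
    if i == 0:
        return values[0]
    if i == len(timestamps):
        return values[-1]
    before, after = timestamps[i - 1], timestamps[i]
    return values[i] if after - t < t - before else values[i - 1]

# Hypothetical streams recorded at different rates (timestamps in seconds).
lidar_ts = [0.0, 0.1, 0.2, 0.3]
lidar_frames = ["scan_0", "scan_1", "scan_2", "scan_3"]
speed_ts = [0.00, 0.12, 0.25]
speed_kmh = [42.0, 55.0, 61.0]
camera_ts = [0.02, 0.11, 0.19, 0.31]
camera_labels = [{"car"}, {"car", "pedestrian"}, {"car", "pedestrian"}, {"car"}]

def find_scenes(min_speed_kmh, required_labels):
    """Return Lidar frames where speed and camera detections both match."""
    hits = []
    for t, frame in zip(lidar_ts, lidar_frames):
        speed = nearest_sample(speed_ts, speed_kmh, t)
        labels = nearest_sample(camera_ts, camera_labels, t)
        # required_labels <= labels checks that every required label was detected.
        if speed > min_speed_kmh and required_labels <= labels:
            hits.append(frame)
    return hits

matches = find_scenes(50.0, {"car", "pedestrian"})
print(matches)
```

Each Lidar frame is paired with the closest speed reading and the closest camera detection, so a single filter can span all three streams.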

Similarity Search

Found a few samples of an interesting corner case, but don't have the right query or metadata to find more? SceneBox uses ML to search fully unstructured datasets. Think "Google Photos" for your data. SceneBox enables 1-to-N and M-to-N search across one or multiple embedding spaces.
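As a rough illustration of 1-to-N and M-to-N search, the sketch below ranks indexed assets by cosine similarity to one or more query embeddings. It is an assumption-laden toy, not SceneBox's search backend: the asset IDs, three-dimensional vectors, and the `top_k_similar` helper are invented for the example, and a production system would typically use an approximate nearest-neighbor index instead of a full scan.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def top_k_similar(queries, index, k=2):
    """Score each indexed asset by its best similarity to any query, return top k IDs."""
    scored = []
    for asset_id, vec in index.items():
        best = max(cosine_similarity(q, vec) for q in queries)
        scored.append((best, asset_id))
    scored.sort(reverse=True)
    return [asset_id for _, asset_id in scored[:k]]

# Hypothetical embedding index: asset ID -> embedding vector.
index = {
    "img_001": [0.9, 0.1, 0.0],
    "img_002": [0.0, 1.0, 0.0],
    "img_003": [0.8, 0.2, 0.1],
    "img_004": [0.0, 0.2, 0.9],
}

# 1-to-N: one example sample, find the closest assets.
one_to_n = top_k_similar([[1.0, 0.0, 0.0]], index, k=2)

# M-to-N: several example samples, find assets close to any of them.
many_to_n = top_k_similar([[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]], index, k=3)

print(one_to_n, many_to_n)
```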

Data-agnostic & Schema-free

Perception data is often rich, versatile, and multi-modal. SceneBox can index any type of data, including images, videos, Lidar scans, and point clouds, along with any metadata, geolocation, embedding vectors, annotations, and time-series data. In addition, SceneBox supports spatio-temporal composite data formats such as ROS, RTMaps, KITTI, MDF4, etc. SceneBox's schema-free data management gives users maximum freedom to search, enrich, edit, and summarize their datasets.

On Demand Extraction

Sift through large volumes of data and extract only the samples your model needs. SceneBox smartly samples your large datasets and makes them entirely searchable without the need to index every frame. This allows you to quickly filter out all the "uneventful" data, zoom into the action, and curate the most valuable datasets.

Data Operations

Make data work for you with powerful automations and functions.

Annotation Integrations

Send your curated datasets to your labeler with a single click (or a single API call) and manage all the annotated data and annotation workflows from a unified interface. SceneBox's Annotation Hub provides integrations with best-in-class annotation platforms. SceneBox also provides a hosted CVAT to streamline your labeling workflows.

Simplified Workflow Management

Data operations often involve many interdependent moving pieces, and full visibility into the entire process is key to curating the best datasets effectively. Manage your data campaigns by viewing associated datasets, status of labeling operations, pre-tagging, labeling consensus, and much more.

Data Sharing & Collaboration

Add multiple users to your projects and adjust their access controls so they can comment, curate, share, and collaborate with your team through SceneBox. Or invite external users to review and audit shared data.

Data Enrichment

You bring the data, and we’ll bring more. SceneBox will automatically enrich your metadata where possible. For example, if we know your GPS + time data, we’ll add weather, visibility, city, state, etc. SceneBox also integrates with PyTorch and Detectron2 model zoos. You can use any of these off-the-shelf models to pre-tag or embed your datasets.

Diagnosis & Remedy

Root out failure modes, visualize patterns, and fix errors to improve your models.

Model Comparison

Use powerful metrics such as mean intersection over union (mean IoU) to quickly compare models (including ground truths) or annotators. An IoU distribution and confusion matrix are provided to help you visually debug your model and data, find the corner cases where you need to collect more data for training, and identify labeling noise and errors. Then use similarity search to find more raw data like those cases.
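For readers who want to see what these metrics compute, here is a small self-contained sketch of box IoU, mean IoU, and confusion counts. It assumes axis-aligned bounding boxes in `(x_min, y_min, x_max, y_max)` form and already-matched prediction and ground-truth pairs; the function names and sample data are illustrative and are not part of the SceneBox API.

```python
from collections import Counter

def box_iou(a, b):
    """IoU of two axis-aligned boxes given as (x_min, y_min, x_max, y_max)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def mean_iou(pairs):
    """Average IoU over (prediction, ground truth) box pairs."""
    return sum(box_iou(pred, truth) for pred, truth in pairs) / len(pairs)

def confusion_counts(true_labels, predicted_labels):
    """Count how often each (true, predicted) label combination occurs."""
    return Counter(zip(true_labels, predicted_labels))

# Hypothetical matched boxes: one partial overlap, one perfect match.
pairs = [
    ((0, 0, 2, 2), (1, 1, 3, 3)),
    ((0, 0, 2, 2), (0, 0, 2, 2)),
]
score = mean_iou(pairs)

counts = confusion_counts(
    ["laptop", "tv", "laptop"],
    ["laptop", "tv", "tv"],
)
print(round(score, 3), counts[("laptop", "tv")])
```

Low-IoU pairs and off-diagonal confusion counts are the natural starting points when looking for corner cases or labeling errors.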

Label Comparison & Debugger

Look at the embeddings of your data to easily identify label noise, or find discrepancies in annotations from multiple labelers or models.

In this example, a laptop is mislabeled as a TV. Looking at the image embeddings view, the mislabeled datapoint is noticed in a cluster of other laptops. The incorrect label is visually singled out with a color that does not match its cluster, making the error obvious and helping you debug your labels.
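The same idea can be sketched in code: flag any datapoint whose label disagrees with the majority label of its nearest neighbors in embedding space. This is one simple heuristic under stated assumptions, not SceneBox's debugger; the two-dimensional embeddings, the labels, and the `suspicious_labels` helper are hypothetical.

```python
import math
from collections import Counter

def suspicious_labels(embeddings, labels, k=3):
    """Return (index, given_label, neighbor_majority) for likely mislabeled points."""
    flagged = []
    for i, (vec, label) in enumerate(zip(embeddings, labels)):
        # Distances from point i to every other point, nearest first.
        dists = sorted(
            (math.dist(vec, other), j)
            for j, other in enumerate(embeddings)
            if j != i
        )
        neighbor_labels = [labels[j] for _, j in dists[:k]]
        majority, votes = Counter(neighbor_labels).most_common(1)[0]
        # Flag only when a strict majority of neighbors disagrees with the label.
        if majority != label and votes > k / 2:
            flagged.append((i, label, majority))
    return flagged

# Hypothetical 2-D embeddings: a laptop cluster near (0, 0), a TV cluster near (5, 5).
embeddings = [
    (0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1),
    (5.0, 5.0), (5.1, 5.0), (5.0, 5.1), (5.1, 5.1),
]
# Index 3 sits in the laptop cluster but carries a "tv" label.
labels = ["laptop", "laptop", "laptop", "tv", "tv", "tv", "tv", "tv"]

flagged = suspicious_labels(embeddings, labels, k=3)
print(flagged)
```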

Integrations & Deployment

Minimize data integration efforts with SceneBox's overlay architecture and extensive APIs.

Data Overlay

You can deploy SceneBox over your data lakes across multiple data sources (AWS, GCS, Azure, on-prem), where it acts as a window into your data without changing its residency or having to send large, raw data to other servers.

Deploy On-Premise or Over the Cloud

In addition to SceneBox SaaS, SceneBox can be deployed over any major cloud VPC (AWS, GCP, Azure), or on-premise for full control and privacy of your data. With SceneBox's cloud-agnostic microservices architecture, it requires only a Kubernetes system to operate.

$ pip install scenebox
>>> from scenebox import SceneEngineClient
>>> sec = SceneEngineClient("my_token")
>>> sec.add_image("s3://my-bucket/...")
>>> sec.search_assets("images", filters=...)

Powerful Python Client & REST Interface

SceneBox allows programmatic interactions for custom integrations or the automation of data operations using Python and REST APIs.

Bring data to your fingertips with SceneBox today.