Introducing VDP: open-source visual data ETL

VDP is the single point of visual data integration, where users can sync visual data from anywhere into centralised warehouses or applications, just like how the modern data stack handles structured data.

[Figure: a diagram showing how VDP works. Caption: Introducing VDP, a general and modularised ETL infrastructure for unstructured visual data]
🎉 Check out VDP on GitHub

What is VDP

A few months ago, we introduced Visual Data Preparation, the missing piece of the modern data stack that seamlessly integrates Vision AI.

When people say they are data-driven, most of the time they mean they are driven by structured data. Although 80% of the world's data are unstructured, such data are harder to analyse, and few companies know how to handle them or have the resources to do so. We can help with this. Specifically, we have built VDP, a general and modularised ETL infrastructure for unstructured visual data, to tackle the problem.

VDP streamlines the end-to-end visual data processing pipeline:

  • Extract unstructured visual data from pre-built data sources such as cloud/on-prem storage, or IoT devices
  • Transform it into analysable structured data by Vision AI models imported from various ML platforms
  • Load the transformed data into warehouses, applications, or other destinations
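The three steps above can be illustrated with a minimal Python sketch. This is not VDP's API: every name here (`Image`, `Detection`, `extract`, `transform`, `load`, `dummy_model`, `run_pipeline`) is hypothetical, the "model" is a stand-in that labels an image by its payload size, and the destination is an in-memory list rather than a real warehouse. It only shows the shape of the data flow.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator, List


@dataclass
class Image:
    """Hypothetical unit of unstructured visual data."""
    name: str
    pixels: bytes


@dataclass
class Detection:
    """Hypothetical structured record produced by a model."""
    image_name: str
    label: str
    score: float


def extract(source: Iterable[Image]) -> Iterator[Image]:
    """Extract: pull raw images from a source (storage, IoT feed, ...)."""
    yield from source


def transform(
    images: Iterable[Image],
    model: Callable[[Image], List[Detection]],
) -> Iterator[Detection]:
    """Transform: run a model on each image to get structured records."""
    for image in images:
        yield from model(image)


def load(records: Iterable[Detection], destination: List[dict]) -> int:
    """Load: write structured records to a destination; return the count."""
    count = 0
    for record in records:
        destination.append(
            {
                "image": record.image_name,
                "label": record.label,
                "score": record.score,
            }
        )
        count += 1
    return count


def dummy_model(image: Image) -> List[Detection]:
    """Stand-in for a real Vision AI model: labels by payload size only."""
    label = "large" if len(image.pixels) > 4 else "small"
    return [Detection(image.name, label, 1.0)]


def run_pipeline(
    source: Iterable[Image],
    model: Callable[[Image], List[Detection]],
    destination: List[dict],
) -> int:
    """Chain extract -> transform -> load and return the rows loaded."""
    return load(transform(extract(source), model), destination)
```

Calling `run_pipeline(images, dummy_model, rows)` with a list of `Image` objects and an empty list `rows` fills `rows` with one dictionary per detection. Because the model is passed in as a plain callable, swapping in a different model does not change the extract or load steps, which mirrors the modular design described above.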

We believe VDP is the future of unstructured data ETL, where developers won't need to build their own data connectors, high-maintenance model serving platforms, or ETL pipeline automation tools.

Our mission is to make VDP the single point of visual data integration, so users can sync visual data from anywhere into centralised warehouses or applications and focus on gaining insights across all data sources, just like how the modern data stack handles structured data. Check out highlights and core concepts if you want to learn more about how VDP works.

To benefit a broader community, we have released VDP under the open-source Apache License 2.0. Check it out here. We've made it easy to get started with VDP on your local machine, with Kubernetes support coming soon. Click here to get started.

If you want to chat about VDP or share your use cases, come and hang out with us in our Discord community.

What is VDP not?

Many brilliant MLOps platforms and tools providing Vision AI solutions have emerged in the last few years. Most of these tools are built from a model-centric perspective and fall into the following categories:

  • General ML platforms for model training, experiment tracking, model deployment, etc.
  • Platforms that serve a specific vertical, such as e-commerce or manufacturing.
  • Platforms that focus on a single component of MLOps, such as data labelling, dataset preparation, and model serving.

VDP is built from a data-driven perspective. Although the Vision AI model is the most critical component in a visual data ETL pipeline, the ultimate goal of VDP is to streamline the end-to-end visual data flow, with the transform component being able to flexibly import Vision AI models from different sources. Please see the detailed FAQ page.

Open-source and cloud versions

VDP is in Alpha and under active and heavy development. Check out our open roadmap. If you have any questions or feature requests, open a topic in the VDP Discussions or hop into our Discord to get help from an active and friendly community!

Our team is working hard to build a fully managed cloud product for VDP, offering:

  • Painless setup
  • Maintenance-free infrastructure
  • Start for free, pay as you grow

Interested in trying it out? Join the waitlist today and we'll keep you posted on the progress!
