The total amount of digital data generated worldwide is increasing rapidly. Approximately 80% of this newly generated data, a share that continues to grow, is unstructured: data that does not conform to a table- or object-based model. Examples include text, images, protein structures, geospatial information, and IoT data streams. Even so, the vast majority of companies and organizations have no way of storing and analyzing these increasingly large quantities of unstructured data. Embeddings, which are high-dimensional, dense vectors that represent the semantic content of unstructured data, can remedy this. In this tutorial, we'll introduce embeddings and vector search from both an ML and an application-level perspective. We'll start with a high-level overview of embeddings and discuss best practices for generating and using them. We'll then apply this knowledge to build two systems: semantic text search and reverse image search. Finally, we'll see how to put these applications into production using Milvus, a popular open-source vector database.
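To make the idea of vector search concrete, here is a minimal sketch in plain Python. It is not the tutorial's implementation: the three-dimensional vectors are hand-made stand-ins for real embeddings (which would come from a trained model and typically have hundreds of dimensions), and the names `cosine_similarity`, `search`, and `index` are hypothetical. It only illustrates the core operation, ranking stored vectors by their similarity to a query vector.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def search(query, index, top_k=2):
    """Brute-force search: score every stored vector, return the top_k best."""
    scored = [(cosine_similarity(query, vec), key) for key, vec in index.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [(key, score) for score, key in scored[:top_k]]


# Assumption: toy "embeddings" invented for illustration, not model output.
index = {
    "a photo of a cat": [0.9, 0.1, 0.0],
    "a picture of a kitten": [0.8, 0.2, 0.1],
    "quarterly sales report": [0.0, 0.1, 0.9],
}

# A query vector that lies close to the two cat-related items.
query = [0.85, 0.15, 0.05]

results = search(query, index)
for key, score in results:
    print(f"{score:.3f}  {key}")
```

With these toy vectors, the two cat-related entries rank above the sales report because their vectors point in nearly the same direction as the query. A brute-force scan like this compares the query against every stored vector, which is why systems at production scale generally rely on a vector database with an approximate nearest-neighbor index instead.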
Join us for the second season of the Future of Data and AI conference in the summer of 2023, after our successful first conference with 10,000+ registered attendees.