Introduction

The Media Search Engine

The Media Search Engine was the biggest project I built during my first internship at MiraVideo. It was created for video editors: a database of reusable videos, images, and clips, plus a search engine for querying them. These resources are useful for creating and editing short videos and reels, whether for promoting multimedia channels or simply advertising products. In this blog, I am going to break down the components of the search engine and how I built them.

Media Database and Data ETL Pipeline

The first component of the search engine is, of course, the database backing it. We had to build our own internal database for the video editors to use, since we also had content contracts with video streaming platforms like IQiYi.

Initially, the videos we got from the providers were just raw, hour-long MP4 files. I had to come up with a data ETL pipeline to break these videos down into documents and vectors. The documents are indexed into Elasticsearch, which serves as our NoSQL database; the vectors are saved to Milvus, our vector database. Let’s break down this pipeline stage by stage.

  1. The video is uploaded to cloud object storage; the pipeline is triggered when the upload completes.
  2. We are now at stage 1 of the pipeline. In this stage, the video is broken down into shorter clips based on a pre-configured variable that sets how long each clip is. These shorter clips are further broken down into images by random frame selection.
  3. The segmented clips and frames are then sent to stage 2, where we extract metadata from the clips and images, for example: title, resolution, length in seconds, aspect ratio, frame rate, bitrate, video codec, and audio codec.
  4. Concurrently with stage 2, the images and clips are sent to PaddleVideo to infer categories. The categories are stored as a list of tags, sorted by their corresponding confidence (e.g. get all videos related to sports).
  5. Also concurrently with stage 2, the images and clips are sent to MovieNet models for feature extraction. We mainly extract three kinds of visual features: human features, overall environmental features, and action features. Each feature is a 2048-dimensional vector, stored in a Milvus HNSW index for querying (e.g. get all videos where people are shooting at each other).
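As a rough illustration of step 4, the model predictions can be reduced to a confidence-sorted tag list before indexing. The threshold and the shape of the prediction tuples here are assumptions for the sketch, not the production values:

```python
def predictions_to_tags(predictions, min_confidence=0.3):
    """Convert raw (label, confidence) predictions into a tag list
    sorted by descending confidence, dropping low-confidence labels."""
    kept = [(label, conf) for label, conf in predictions if conf >= min_confidence]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return [label for label, _ in kept]

tags = predictions_to_tags([("sports", 0.92), ("indoor", 0.12), ("crowd", 0.55)])
# tags == ["sports", "crowd"]
```

The sorted order means the most confident tag can double as the clip’s primary category when rendering search results.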

This concludes the ETL pipeline and now we have the processed data, ready for querying.
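For the segmentation stage, a pipeline like this would typically lean on FFmpeg’s segment muxer. This sketch only builds the command; the paths, clip length, and use of stream copy are my assumptions, not the exact production setup:

```python
def build_segment_command(src, out_pattern, clip_seconds):
    """Build an ffmpeg command that splits a long video into
    fixed-length clips without re-encoding."""
    return [
        "ffmpeg", "-i", src,
        "-c", "copy",                      # stream copy: fast, no re-encode
        "-f", "segment",                   # use ffmpeg's segment muxer
        "-segment_time", str(clip_seconds),
        "-reset_timestamps", "1",          # each clip starts at t=0
        out_pattern,
    ]

cmd = build_segment_command("raw/episode.mp4", "clips/episode_%03d.mp4", 30)
```

Stream copy keeps segmentation cheap, at the cost of clip boundaries snapping to keyframes rather than exact timestamps.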

<aside> 🛠️

Tools I used for this component

</aside>

Amazon S3 - Cloud Object Storage - AWS

Elasticsearch: The Official Distributed Search & Analytics Engine | Elastic

GStreamer: open source multimedia framework

FFmpeg

PaddlePaddle-Parallel Distributed Deep Learning, efficient and extensible deep learning framework.

MovieNet

The High-Performance Vector Database Built for Scale | Milvus

Building the Application

After a few weeks of ingesting data, we had some processed data ready to be queried. I had been testing video and image queries with the CLI and Postman, but obviously, for our actual end users, I needed to build an application. Building the search engine webapp had two main parts.

Query Language

Since I was writing some pretty complicated Elasticsearch DSL and Milvus queries to get the data, the first step was to design a simplified query language that all of the video editors could use easily.

After a few iterations of query language design, I gave up on the idea of letting the editors write queries. Using a “language” to get results seemed unintuitive to them no matter how simple I made it. I ended up converting this into a list of selectable tags: the actual Elasticsearch and Milvus queries are constructed automatically from the combination of tags the user selects.
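A minimal sketch of that tag-to-query translation on the Elasticsearch side (the field names `tags` and `duration_seconds` are illustrative, not the real mapping):

```python
def tags_to_es_query(selected_tags, min_duration=None):
    """Translate user-selected tags into an Elasticsearch bool query.
    Each tag becomes a term filter; optional numeric constraints
    become range filters."""
    filters = [{"term": {"tags": tag}} for tag in selected_tags]
    if min_duration is not None:
        filters.append({"range": {"duration_seconds": {"gte": min_duration}}})
    return {"query": {"bool": {"filter": filters}}}

query = tags_to_es_query(["sports", "outdoor"], min_duration=10)
```

Using `filter` clauses rather than `must` keeps tag matching out of relevance scoring, which fits a UI where tags are exact checkboxes rather than free text.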


<aside> 🛠️

Tools I used for this component

</aside>

Welcome! - The Apache HTTP Server Project

Let's Encrypt

Welcome to Flask — Flask Documentation (3.0.x)

Docker: Accelerated Container Application Development

React

Frontend & Backend

The last step is to build the webapp and host it. Let’s break it down into steps.

  1. Get a server on our cloud provider and set it up with Apache 2 for hosting.
  2. Move all the query functions I wrote before into Flask API endpoints. Set up the rest of the Flask backend: auth middleware, CORS middleware, Flask blueprints, etc.
  3. Create the UI and frontend logic with React.
  4. Containerize the finalized app with Docker.
  5. Deploy and run the container on the hosting server, and set up a wildcard Apache 2 redirect so that users can access search.mirav.com.
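As an illustration of step 5, the Apache virtual host might look something like the following. The port, domain, and certificate paths are placeholders based on the tools mentioned above (Let’s Encrypt, Docker), not the production config:

```apache
<VirtualHost *:443>
    ServerName search.mirav.com
    SSLEngine on
    SSLCertificateFile /etc/letsencrypt/live/search.mirav.com/fullchain.pem
    SSLCertificateKeyFile /etc/letsencrypt/live/search.mirav.com/privkey.pem

    # Proxy all traffic to the Docker container running the app
    ProxyPreserveHost On
    ProxyPass / http://127.0.0.1:8000/
    ProxyPassReverse / http://127.0.0.1:8000/
</VirtualHost>
```

With this in place, the container only needs to expose a local port, and Apache handles TLS termination and the public domain.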

Conclusion

That’s it! Now all the editors can go to search.mirav.com and start hunting for the clips and images they want to use in their videos. I learned a lot while building the search engine and had lots of fun. As part of my internship, I also shared how I felt about my entire one-year experience in this blog.

Thanks for reading! If you are interested, check out more content like this here.