Keyword Prediction from Movie/TV Series Metadata

Project Details

Background: As the amount of video content available to consumers grows, methods that help viewers find relevant content become increasingly important. When movies are labeled with keywords describing different aspects of the content, such as the setting or theme, they can be sorted or searched by these new criteria. Labeling descriptive keywords by hand is, however, both time-consuming and subjective.

Solution: This project developed and analyzed a method for automatically predicting keywords for a movie or TV series from commonly available descriptive metadata such as the cast list, genres, a plot description, and reviews. The method can be used to assign keywords to an unlabeled movie or to find additional keywords for an already labeled movie.
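
To make the task concrete, the following is a minimal sketch of keyword prediction from metadata. It is not the method developed in this project, which is not specified here. It assumes a simple similarity-weighted vote: each labeled movie votes for its own keywords in proportion to how much its metadata overlaps with the query movie. The field names (cast, genres, plot, keywords) and the function names are hypothetical.

```python
from collections import Counter

def tokens(movie):
    """Flatten structured fields and free text into one set of tokens.

    Missing or empty fields simply contribute no tokens.
    """
    toks = {f"cast:{c}" for c in (movie.get("cast") or [])}
    toks |= {f"genre:{g}" for g in (movie.get("genres") or [])}
    toks |= {f"word:{w}" for w in (movie.get("plot") or "").lower().split()}
    return toks

def jaccard(a, b):
    """Jaccard similarity of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def predict_keywords(query, labeled, top_k=3):
    """Rank keywords by similarity-weighted votes from labeled movies.

    Keywords the query movie already has are skipped, so the same call
    serves both unlabeled and partially labeled movies.
    """
    q = tokens(query)
    scores = Counter()
    for movie in labeled:
        sim = jaccard(q, tokens(movie))
        for kw in movie["keywords"]:
            scores[kw] += sim
    known = set(query.get("keywords") or [])
    ranked = [kw for kw, s in scores.most_common() if s > 0 and kw not in known]
    return ranked[:top_k]

# Toy data, invented for illustration only.
labeled = [
    {"genres": ["sci-fi"], "cast": ["A. Actor"],
     "plot": "a crew travels through space", "keywords": ["space", "future"]},
    {"genres": ["romance"], "cast": ["B. Actor"],
     "plot": "two strangers fall in love in paris", "keywords": ["paris", "love"]},
]
query = {"genres": ["sci-fi"], "plot": "a lost crew drifts in space"}
print(predict_keywords(query, labeled, top_k=2))
```

Passing a query that already carries some keywords returns only new suggestions, which corresponds to the second use case described above.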

Additional Applications: Many techniques employed in this project are suitable for other tasks. These include handling structured and unstructured input data, incorporating multiple modalities of data, and imputing missing data.
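
As one illustration of the last technique, the sketch below shows mean imputation with a missing-value indicator. This is a common baseline and an assumption made for this example, not necessarily the imputation approach used in the project. The runtime field and the function name are hypothetical.

```python
from statistics import mean

def impute_mean(records, field):
    """Return copies of records with a missing `field` filled by the observed mean.

    A companion `<field>_imputed` flag records which values were filled in,
    so a downstream model can still tell real values from imputed ones.
    """
    observed = [r[field] for r in records if r.get(field) is not None]
    if not observed:
        raise ValueError(f"no observed values for {field!r}")
    fill = mean(observed)
    out = []
    for r in records:
        missing = r.get(field) is None
        out.append({**r,
                    field: fill if missing else r[field],
                    f"{field}_imputed": missing})
    return out

# Toy records, invented for illustration only.
records = [{"runtime": 90}, {"runtime": None}, {"runtime": 110}, {}]
for row in impute_mean(records, "runtime"):
    print(row)
```

The input records are left unmodified; each output row is a new dictionary.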