Yawn, study, cuddle: IBM, MIT roll out video data set for AI model training
Correction: This article has been updated to reflect that the data set is for non-commercial and educational use.
Artificial intelligence has grown adept at processing still images, but interpreting video remains a challenge. The MIT-IBM Watson AI Lab launched the Moments in Time Dataset, a collection of one million three-second video clips, to help AI models identify actions. The data set is publicly available for non-commercial and educational use.
The videos in Moments in Time were manually classified by verb, offering "wide semantic coverage of English verbs," said Dan Gutfreund, video analytics scientist at IBM Research, in an interview with CIO Dive. While videos are labeled only by action right now, the lab hopes to later extend the vocabulary and add subject and object identifiers, according to Gutfreund.
The data set fills a void because previous data sets were often human-centric and dealt with more complex events such as changing a tire, which includes many individual actions, said Gutfreund. Moments in Time identifies a larger variety of basic actions that can be used to identify such complex events. The data set also has more diversity and balance because actions are identified through many videos.
Implementing AI and ML is key to companies' success in the coming few years, but many organizations lack the resources, tools and talent to make this a reality.
Democratizing AI is necessary to bring the technology within reach of many SMBs and non-tech companies. Accessible tools, platforms and automation are some examples, but market-based access to data is the first step.
Developers need access to large swathes of information to teach and test their models. There is no shortage of still images on the internet today, but an equivalent repository of videos has long been lacking, according to Gutfreund. In conjunction with open algorithm libraries offered by many tech companies, these tools offer developers the bones for an AI model.
With an expected $47 billion of worldwide revenue from AI by 2020, CIOs and other tech executives need to be bullish about using these resources to integrate the technology with business operations. This is one race where catching up from last place will be hard to accomplish.
Follow Alex Hickey on Twitter