Is the time required to conduct emotion processing on a video roughly equal to the length of the video?


I am using Affectiva’s SDK for Linux to conduct emotion processing on videos. Even though I disabled the visual display that shows the tracking of facial movements, processing still seems to take about as long as the video itself. I have been processing videos from 30 seconds to 5 minutes, but I plan to move to longer videos in the future.

Is the processing time required to detect emotion in a video equivalent to the length of the video itself?

I guess this would make sense if the software is still playing back the entire video to analyze it. Is there a way to speed things up? Perhaps I am missing something?

Thank you for your time, consideration and help! :slight_smile:


Yes, the processing time for a video is proportional to its length, since the video is decoded into its constituent frames for analysis; however, it is not equal to the video’s duration.

In other words, a 20 second video would typically take roughly twice as long to process as a 10 second video because the former has twice as many frames as the latter. However, the 20 second video would not necessarily take 20 seconds to process, except by coincidence.
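To make that concrete, here is a small back-of-envelope model. The per-frame cost used below is a made-up illustrative number, not an Affectiva benchmark; the point is only that total time scales with the frame count:

```python
# Processing time scales with the number of frames, not wall-clock duration.
# per_frame_cost_s is a hypothetical figure chosen purely for illustration.

def estimated_processing_seconds(duration_s, fps, per_frame_cost_s):
    """Estimate analysis time as (number of frames) x (cost per frame)."""
    frames = duration_s * fps
    return frames * per_frame_cost_s

# A 20 s clip has twice the frames of a 10 s clip at the same frame rate,
# so it takes roughly twice as long -- but neither equals its own duration.
t10 = estimated_processing_seconds(10, 30, 0.02)  # 300 frames, ~6 s
t20 = estimated_processing_seconds(20, 30, 0.02)  # 600 frames, ~12 s
print(t10, t20)
```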

You can potentially speed things up a few ways:

  • only activate the classifiers you need (each activated classifier incurs additional processing time)
  • decrease the value of the processFPS parameter you pass to the VideoDetector constructor
  • look at your callback methods to see if there is code there that can be optimized
  • and of course, you could run on a faster machine :slight_smile:
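Of those, the processFPS effect is the easiest to quantify: the detector samples the video at that rate, so the number of frames analyzed, and hence the total work, scales linearly with it. The frame rates below are illustrative assumptions, not measured Affectiva figures:

```python
# Frames analyzed = video duration x processFPS, so lowering processFPS
# cuts the work proportionally (at the cost of coarser temporal resolution).

def frames_analyzed(duration_s, process_fps):
    return duration_s * process_fps

five_min = 5 * 60
full = frames_analyzed(five_min, 30)  # sample at the video's native 30 fps
half = frames_analyzed(five_min, 15)  # sample at 15 fps instead
print(full, half)  # 9000 vs 4500 frames: roughly half the processing work
```

Whether 15 fps is acceptable depends on how fine-grained you need the emotion timeline to be; for slowly changing expressions, a lower sampling rate often suffices.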