We live in an era in which the internet is flourishing with image and video data. Several algorithms and architectures have been devised to make the most of such data, and they have been used to solve crucial problems. The number of features in image and video data can be extremely high, reaching a dimensionality in the thousands, which makes the pre-processing step of feature selection extremely important. This work proposes using Evolutionary Computation to optimize the task of Video Action Recognition (Classification). The VGG-16 architecture is used to extract features from the image frames taken from each video. The Binary Particle Swarm Optimization algorithm is then used to perform feature selection on these extracted features. Two separate experiments are performed to optimize hyper-parameter selection, one using Particle Swarm Optimization and the other using an Evolution Strategy. The robustness and consistency of the proposed methodology are tested on two popular datasets. The results show that the implementations optimized with Evolutionary Algorithms perform much better than the traditional technique with no optimization.
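To make the feature-selection step concrete, the following is a minimal sketch of Binary Particle Swarm Optimization over a feature matrix, not the implementation used in this work. The function name `binary_pso_select`, the nearest-centroid fitness, the sparsity penalty of 0.01, and the coefficients `w`, `c1`, `c2` are all illustrative assumptions. In the described pipeline the columns of `X` would correspond to VGG-16 features, and the fitness would likely be the accuracy of the downstream classifier.

```python
import numpy as np

def binary_pso_select(X, y, n_particles=20, n_iters=30, seed=0):
    """Sketch of Binary PSO feature selection (hypothetical helper).

    Returns a boolean mask over the columns of X and its fitness.
    """
    rng = np.random.default_rng(seed)
    n_features = X.shape[1]
    classes = np.unique(y)

    def fitness(bits):
        mask = bits.astype(bool)
        if not mask.any():
            return -np.inf  # an empty feature subset is invalid
        Xs = X[:, mask]
        # Assumed stand-in fitness: in-sample nearest-centroid accuracy.
        centroids = np.stack([Xs[y == c].mean(axis=0) for c in classes])
        dists = ((Xs[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        acc = (classes[dists.argmin(axis=1)] == y).mean()
        # Small assumed penalty that favours smaller subsets.
        return acc - 0.01 * mask.mean()

    # Positions are 0/1 vectors; velocities are real-valued.
    pos = (rng.random((n_particles, n_features)) < 0.5).astype(float)
    vel = rng.uniform(-1.0, 1.0, (n_particles, n_features))
    pbest = pos.copy()
    pbest_fit = np.array([fitness(p) for p in pos])
    g = pbest_fit.argmax()
    gbest, gbest_fit = pbest[g].copy(), pbest_fit[g]

    w, c1, c2 = 0.7, 1.5, 1.5  # assumed inertia and acceleration weights
    for _ in range(n_iters):
        r1 = rng.random(vel.shape)
        r2 = rng.random(vel.shape)
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        vel = np.clip(vel, -6.0, 6.0)  # keep the sigmoid away from 0 and 1
        # Binary update: the sigmoid of the velocity is the probability of a 1.
        prob = 1.0 / (1.0 + np.exp(-vel))
        pos = (rng.random(vel.shape) < prob).astype(float)

        fit = np.array([fitness(p) for p in pos])
        improved = fit > pbest_fit
        pbest[improved] = pos[improved]
        pbest_fit[improved] = fit[improved]
        g = pbest_fit.argmax()
        if pbest_fit[g] > gbest_fit:
            gbest, gbest_fit = pbest[g].copy(), pbest_fit[g]

    return gbest.astype(bool), float(gbest_fit)
```

Calling `mask, score = binary_pso_select(X, y)` returns a boolean mask, and `X[:, mask]` keeps only the selected columns. The key difference from continuous PSO is the position update: velocities stay real-valued, but each bit is resampled with probability given by the sigmoid of its velocity.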