VLAD (Vector of Locally Aggregated Descriptors) is an extension of the Bag of Words (BoW) model. This function computes a VLAD descriptor for an image from two inputs: the visual words (the codebook) and the image descriptors. The visual words matrix has size no_of_words x no_of_dimensions_of_descriptors, where the number of dimensions depends on the descriptor used (e.g., SIFT has 128 dimensions, and SURF has 64 by default). The imageDescriptors matrix has size no_of_descriptors_detected x no_of_dimensions_of_descriptors, with the same number of dimensions as the visual words.
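
To make the matrix shapes concrete, here is a minimal sketch of a VLAD computation in Python with NumPy. It is not this function's actual implementation: the name `compute_vlad`, the hard assignment of each descriptor to its nearest visual word, and the final L2 normalization are assumptions chosen to match the commonly described form of VLAD.

```python
import numpy as np


def compute_vlad(visual_words, image_descriptors):
    """Illustrative VLAD sketch (hypothetical helper, not the original function).

    visual_words:      (no_of_words, dims) array, the codebook.
    image_descriptors: (no_of_descriptors_detected, dims) array.
    Returns a 1-D vector of length no_of_words * dims.
    """
    visual_words = np.asarray(visual_words, dtype=np.float64)
    image_descriptors = np.asarray(image_descriptors, dtype=np.float64)
    if visual_words.shape[1] != image_descriptors.shape[1]:
        raise ValueError("descriptor dimensions must match the visual words")

    no_of_words, dims = visual_words.shape

    # Squared Euclidean distance from every descriptor to every visual word,
    # giving an array of shape (no_of_descriptors_detected, no_of_words).
    diffs = image_descriptors[:, None, :] - visual_words[None, :, :]
    sq_dists = np.einsum("nkd,nkd->nk", diffs, diffs)

    # Assumption: hard assignment to the single nearest visual word.
    nearest = sq_dists.argmin(axis=1)

    # Sum the residuals (descriptor minus its visual word) per visual word.
    vlad = np.zeros((no_of_words, dims))
    residuals = image_descriptors - visual_words[nearest]
    np.add.at(vlad, nearest, residuals)

    # Flatten and L2-normalize (assumed normalization step).
    vlad = vlad.ravel()
    norm = np.linalg.norm(vlad)
    if norm > 0:
        vlad /= norm
    return vlad
```

Under these assumptions the result has no_of_words x no_of_dimensions_of_descriptors entries, so a codebook of 64 visual words with 128-dimensional SIFT descriptors would yield an 8192-dimensional vector.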