An Efficient Modern Baseline for FloodNet VQA

Published in ICML NewInML Workshop, 2022

Recommended citation: Kane, Aditya and Sahil Khose. “An Efficient Modern Baseline for FloodNet VQA.” (2022). https://arxiv.org/pdf/2205.15025.pdf

Designing efficient and reliable visual question answering (VQA) systems remains a challenging problem, particularly for disaster management and response. In this work, we revisit fundamental combination methods such as concatenation, addition, and element-wise multiplication, pairing them with modern image and text feature abstraction models. We design a simple and efficient system that outperforms pre-existing methods on the FloodNet dataset and achieves state-of-the-art performance. This simplified system requires significantly less training and inference time than modern VQA architectures. We also study the performance of various backbones and report their consolidated results. Code is available; the link is on the paper's arXiv page.
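
To make the three combination methods named in the abstract concrete, here is a minimal NumPy sketch. It is an illustration only and not the paper's implementation: the helper name `fuse`, the `mode` strings, and the toy feature vectors are all assumptions, and in practice the features would come from pretrained image and text backbones.

```python
import numpy as np

def fuse(image_feat, text_feat, mode="concat"):
    """Combine an image feature vector and a text feature vector.

    Hypothetical helper for illustration; not taken from the paper's code.
    """
    if mode not in ("concat", "add", "multiply"):
        raise ValueError(f"unknown fusion mode: {mode}")

    image_feat = np.asarray(image_feat, dtype=float)
    text_feat = np.asarray(text_feat, dtype=float)

    if mode == "concat":
        # Concatenation keeps both vectors intact; output size is the sum.
        return np.concatenate([image_feat, text_feat], axis=-1)

    # Addition and element-wise multiplication need matching shapes.
    if image_feat.shape != text_feat.shape:
        raise ValueError("add/multiply require features of the same shape")

    if mode == "add":
        return image_feat + text_feat
    return image_feat * text_feat  # element-wise (Hadamard) product

# Toy vectors standing in for backbone outputs (assumed values).
image_feat = [1.0, 2.0]
text_feat = [3.0, 4.0]
fused_concat = fuse(image_feat, text_feat, "concat")
fused_add = fuse(image_feat, text_feat, "add")
fused_mul = fuse(image_feat, text_feat, "multiply")
```

Concatenation doubles the feature dimension here, while addition and multiplication preserve it, so the size of whatever classifier consumes the fused vector depends on the method chosen.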

Download the paper: https://arxiv.org/pdf/2205.15025.pdf

If you find our paper useful in your research, please consider citing:

@misc{https://doi.org/10.48550/arxiv.2205.15025,
  doi = {10.48550/ARXIV.2205.15025},
  url = {https://arxiv.org/abs/2205.15025},
  author = {Kane, Aditya and Khose, Sahil},
  keywords = {Computer Vision and Pattern Recognition (cs.CV), Artificial Intelligence (cs.AI), Computation and Language (cs.CL), FOS: Computer and information sciences},
  title = {An Efficient Modern Baseline for FloodNet VQA},
  publisher = {arXiv},
  year = {2022},
  copyright = {arXiv.org perpetual, non-exclusive license}
}