CLEVR-Math: A Dataset for Compositional Language, Visual and Mathematical Reasoning
Lindström, A. D. & Abraham, S. S. ORCID: 0000-0003-3902-2867 (2022). CLEVR-Math: A Dataset for Compositional Language, Visual and Mathematical Reasoning. In: Neural-Symbolic Learning and Reasoning. 16th International Workshop on Neural-Symbolic Learning and Reasoning (NeSy), 28-30 Sep 2022, Windsor, UK.
Abstract
We introduce CLEVR-Math, a multi-modal dataset of simple math word problems involving addition and subtraction, each represented partly by a textual description and partly by an image illustrating the scenario. The text describes actions performed on the scene depicted in the image. Since the question posed may not be about the scene in the image, but about the state of the scene before or after the actions are applied, the solver must envision the state changes caused by these actions. Solving these word problems requires a combination of language, visual and mathematical reasoning. We apply state-of-the-art neural and neuro-symbolic models for visual question answering to CLEVR-Math and empirically evaluate their performance. Our results show that neither method generalises to chains of operations. We discuss the limitations of both approaches in addressing the task of multi-modal word problem solving.
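As a rough illustration of the kind of problem the abstract describes, the following minimal Python sketch treats a scene as a set of object counts and applies a chain of add/remove actions before answering a counting question. The scene contents, the `(op, shape, n)` action format, and the `apply_actions` helper are all hypothetical; they are not taken from the dataset's actual format or from the models evaluated in the paper.

```python
from collections import Counter

# Hypothetical scene: object counts as they might be read off an image.
scene = Counter({"cube": 3, "sphere": 2, "cylinder": 1})


def apply_actions(scene, actions):
    """Return a new scene state after applying (op, shape, n) actions in order."""
    state = Counter(scene)  # copy, so the depicted scene stays unchanged
    for op, shape, n in actions:
        if op == "add":
            state[shape] += n
        elif op == "remove":
            if state[shape] < n:
                raise ValueError(f"cannot remove {n} {shape}(s) from the scene")
            state[shape] -= n
        else:
            raise ValueError(f"unknown operation: {op}")
    return state


# Illustrative question: "Remove 2 cubes. Add 1 sphere. How many objects are left?"
actions = [("remove", "cube", 2), ("add", "sphere", 1)]
after = apply_actions(scene, actions)
total = sum(after.values())
print(total)  # 6 - 2 + 1 = 5
```

The point of the sketch is that the answer depends on the imagined state after the actions, not on the scene as depicted; longer action lists correspond to the chains of operations mentioned in the abstract.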
Publication Type: | Conference or Workshop Item (Paper) |
---|---|
Additional Information: | Copyright © 2022 for the individual papers by the papers' authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). |
Publisher Keywords: | Neuro-Symbolic, Visual Question Answering, Math Word Problem Solving, Multimodal Reasoning |
Subjects: | Q Science > QA Mathematics |
Departments: | School of Science & Technology; School of Science & Technology > Department of Computer Science |
Available under License: Creative Commons Attribution 4.0 International Public License.