| Main Authors: | Luo, Haozheng; Qin, Ruiyang; Xu, Chenwei; Ye, Guo; Luo, Zening |
|---|---|
| Format: | Preprint |
| Published: | 2020 |
| Subjects: | Artificial Intelligence; Human-Computer Interaction; Robotics |
| Online Access: | https://arxiv.org/abs/2012.00822 |
| _version_ | 1866909220520067072 |
|---|---|
| author | Luo, Haozheng; Qin, Ruiyang; Xu, Chenwei; Ye, Guo; Luo, Zening |
| author_facet | Luo, Haozheng; Qin, Ruiyang; Xu, Chenwei; Ye, Guo; Luo, Zening |
| contents | In this paper, we introduce a robotic agent specifically designed to analyze external environments and address participants' questions. The primary focus of this agent is to assist individuals using language-based interactions within video-based scenes. Our proposed method integrates video recognition technology and natural language processing models within the robotic agent. We investigate the crucial factors affecting human-robot interactions by examining pertinent issues arising between participants and robot agents. Methodologically, our experimental findings reveal a positive relationship between trust and interaction efficiency. Furthermore, our model demonstrates a 2% to 3% performance enhancement in comparison to other benchmark methods. |
| format | Preprint |
| id | arxiv_https___arxiv_org_abs_2012_00822 |
| institution | arXiv |
| publishDate | 2020 |
| record_format | arxiv |
| title | Open-Ended Multi-Modal Relational Reasoning for Video Question Answering |
| topic | Artificial Intelligence; Human-Computer Interaction; Robotics |
| url | https://arxiv.org/abs/2012.00822 |