Bibliographic Details
Main Authors: Li, Yinheng; Ding, Han; Chen, Hang
Format: Preprint
Published: 2024
Subjects:
Online Access: https://arxiv.org/abs/2407.19180
_version_ 1866913448839872512
author Li, Yinheng
Ding, Han
Chen, Hang
contents Data processing plays a significant role in current multimodal model training. In this paper, we provide a comprehensive review of common data processing techniques used in modern multimodal model training, with a focus on diffusion models and multimodal large language models (MLLMs). We summarize all techniques into four categories: data quality, data quantity, data distribution, and data safety. We further present our findings on the choice of data processing methods for different types of models. This study aims to provide multimodal model developers with guidance on effective data processing techniques.
format Preprint
id arxiv_https___arxiv_org_abs_2407_19180
institution arXiv
publishDate 2024
record_format arxiv
title Data Processing Techniques for Modern Multimodal Models
topic Computer Vision and Pattern Recognition
url https://arxiv.org/abs/2407.19180