Internformat: :: Library Catalog

Gespeichert in:

Bibliographische Detailangaben
Hauptverfasser:	Taniguchi, Takara, Furuta, Ryosuke
Format:	Preprint
Veröffentlicht:	2024
Schlagworte:	Computer Vision and Pattern Recognition Multimedia
Online-Zugang:	https://arxiv.org/abs/2410.05935
Tags:	Tag hinzufügen Keine Tags, Fügen Sie den ersten Tag hinzu!

_version_	1866912064314802176
author	Taniguchi, Takara Furuta, Ryosuke
author_facet	Taniguchi, Takara Furuta, Ryosuke
contents	We tackle one-shot object detection in Japanese Manga. The rising global popularity of Japanese manga has made the object detection of character faces increasingly important, with potential applications such as automatic colorization. However, obtaining sufficient data for training conventional object detectors is challenging due to copyright restrictions. Additionally, new characters appear every time a new volume of manga is released, making it impractical to re-train object detectors each time to detect these new characters. Therefore, one-shot object detection, where only a single query (reference) image is required to detect a new character, is an essential task in the manga industry. One challenge with one-shot object detection in manga is the large variation in the poses and facial expressions of characters in target images, despite having only one query image as a reference. Another challenge is that the frequency of character appearances follows a long-tail distribution. To overcome these challenges, we propose a data augmentation method in feature space to increase the variation of the query. The proposed method augments the feature from the query by adding Gaussian noise, with the noise variance at each channel learned during training. The experimental results show that the proposed method improves the performance for both seen and unseen classes, surpassing data augmentation methods in image space.
format	Preprint
id	arxiv_https___arxiv_org_abs_2410_05935
institution	arXiv
publishDate	2024
record_format	arxiv
spellingShingle	Learning Gaussian Data Augmentation in Feature Space for One-shot Object Detection in Manga Taniguchi, Takara Furuta, Ryosuke Computer Vision and Pattern Recognition Multimedia We tackle one-shot object detection in Japanese Manga. The rising global popularity of Japanese manga has made the object detection of character faces increasingly important, with potential applications such as automatic colorization. However, obtaining sufficient data for training conventional object detectors is challenging due to copyright restrictions. Additionally, new characters appear every time a new volume of manga is released, making it impractical to re-train object detectors each time to detect these new characters. Therefore, one-shot object detection, where only a single query (reference) image is required to detect a new character, is an essential task in the manga industry. One challenge with one-shot object detection in manga is the large variation in the poses and facial expressions of characters in target images, despite having only one query image as a reference. Another challenge is that the frequency of character appearances follows a long-tail distribution. To overcome these challenges, we propose a data augmentation method in feature space to increase the variation of the query. The proposed method augments the feature from the query by adding Gaussian noise, with the noise variance at each channel learned during training. The experimental results show that the proposed method improves the performance for both seen and unseen classes, surpassing data augmentation methods in image space.
title	Learning Gaussian Data Augmentation in Feature Space for One-shot Object Detection in Manga
topic	Computer Vision and Pattern Recognition Multimedia
url	https://arxiv.org/abs/2410.05935

Ähnliche Einträge