| Main Authors: | Yoshida, Ryo; Hayashi, Yoshihiro; Furuya, Hidemine; et al. |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | Chemical Physics; Materials Science; Soft Condensed Matter; Machine Learning |
| Online Access: | https://arxiv.org/abs/2511.11626 |

| Field | Value |
|---|---|
| author | Yoshida, Ryo; Hayashi, Yoshihiro; Furuya, Hidemine; Hosoya, Ryohei; Kaneko, Kazuyoshi; Sugisawa, Hiroki; Kaneko, Yu; Takahashi, Aiko; Noguchi, Yoh; Nanjo, Shun; Shinoda, Keiko; Hamakawa, Tomu; Ohno, Mitsuru; Kitamura, Takuya; Yonekawa, Misaki; Wu, Stephen; Ohnishi, Masato; Liu, Chang; Tsurimoto, Teruki; Arifin; Wakiuchi, Araki; Noda, Kohei; Morikawa, Junko; Hayakawa, Teruaki; Shiomi, Junichiro; Naito, Masanobu; Shiratori, Kazuya; Nagai, Tomoki; Tomotsu, Norio; Inoue, Hiroto; Sakashita, Ryuichi; Ishii, Masashi; Kuwajima, Isao; Furuichi, Kenji; Hiroi, Norihiko; Takemoto, Yuki; Ohkuma, Takahiro; Yamamoto, Keita; Kowatari, Naoya; Suzuki, Masato; Matsumoto, Naoya; Umetani, Seiryu; Ikebata, Hisaki; Shudo, Yasuyuki; Nagao, Mayu; Kamada, Shinya; Kamio, Kazunori; Shomura, Taichi; Nakamura, Kensaku; Iwamizu, Yudai; Abe, Atsutoshi; Yoshitomi, Koki; Horie, Yuki; Koike, Katsuhiko; Iwakabe, Koichi; Gima, Shinya; Usui, Kota; Usuki, Gikyo; Tsutsumi, Takuro; Matsuoka, Keitaro; Sada, Kazuki; Kitabata, Masahiro; Kikutsuji, Takuma; Kamauchi, Akitaka; Iijima, Yusuke; Suzuki, Tsubasa; Goda, Takenori; Takabayashi, Yuki; Imai, Kazuko; Mochizuki, Yuji; Doi, Hideo; Okuwaki, Koji; Nitta, Hiroya; Ozawa, Taku; Kamijima, Hitoshi; Shintani, Toshiaki; Mitamura, Takuma; Zamengo, Massimiliano; Sugami, Yuitsu; Akiyama, Seiji; Murakami, Yoshinari; Betto, Atsushi; Matsuo, Naoya; Kagao, Satoru; Kobayashi, Tetsuya; Matsubara, Norie; Kubo, Shosei; Ishiyama, Yuki; Ichioka, Yuri; Usami, Mamoru; Yoshizaki, Satoru; Mizutani, Seigo; Hanawa, Yosuke; Kunieda, Shogo; Yambe, Mitsuru; Nakamura, Takeru; Murashima, Hiromori; Takahashi, Kenji; Wada, Naoki; Kawano, Masahiro; Harada, Yosuke; Fujita, Takehiro; Fujita, Erina; Himeno, Ryoji; Kino, Hiori; Fukumizu, Kenji |
| contents | Developing large-scale foundational datasets is a critical milestone in advancing artificial intelligence (AI)-driven scientific innovation. However, unlike AI-mature fields such as natural language processing, materials science, particularly polymer research, has significantly lagged in developing extensive open datasets. This lag is primarily due to the high costs of polymer synthesis and property measurements, along with the vastness and complexity of the chemical space. This study presents PolyOmics, an omics-scale computational database generated through fully automated molecular dynamics simulation pipelines that provide diverse physical properties for over $10^5$ polymeric materials. The PolyOmics database was developed collaboratively by approximately 260 researchers from 48 institutions to bridge the gap between academia and industry. Machine learning models pretrained on PolyOmics can be efficiently fine-tuned for a wide range of real-world downstream tasks, even when only limited experimental data are available. Notably, the generalisation capability of these simulation-to-real transfer models improves significantly as the size of the PolyOmics database increases, exhibiting power-law scaling. The emergence of scaling laws supports the "more is better" principle, highlighting the significance of ultralarge-scale computational materials data for improving real-world prediction performance. This unprecedented omics-scale database reveals vast unexplored regions of polymer materials, providing a foundation for AI-driven polymer science. |
| format | Preprint |
| id | arxiv_https___arxiv_org_abs_2511_11626 |
| institution | arXiv |
| publishDate | 2025 |
| record_format | arxiv |
| title | Omics-scale polymer computational database transferable to real-world artificial intelligence applications |
| topic | Chemical Physics; Materials Science; Soft Condensed Matter; Machine Learning |
| url | https://arxiv.org/abs/2511.11626 |
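
The abstract reports that generalisation improves with database size following a power law, but this record does not give the functional form or any fitted values. As a minimal sketch, assuming the common form error ≈ a · N^(-alpha), where N is the number of pretraining samples, the exponent can be estimated by ordinary least squares on log-transformed values. The function name `fit_power_law` and the data points below are hypothetical and purely illustrative; they are not results from the paper.

```python
import math


def fit_power_law(sizes, errors):
    """Fit error = a * size**(-alpha) by least squares in log-log space.

    Returns (a, alpha). Assumes all sizes and errors are positive.
    """
    xs = [math.log(n) for n in sizes]
    ys = [math.log(e) for e in errors]
    count = len(xs)
    mean_x = sum(xs) / count
    mean_y = sum(ys) / count
    # Slope and intercept of the regression line log(e) = intercept + slope * log(n)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), -slope


# Synthetic points generated from a known law (a = 2.0, alpha = 0.25).
# These are made-up values for checking the fit, not PolyOmics results.
sizes = [1_000, 10_000, 100_000]
errors = [2.0 * n ** -0.25 for n in sizes]

a, alpha = fit_power_law(sizes, errors)
print(f"a = {a:.3f}, alpha = {alpha:.3f}")
```

On a log-log plot a power law appears as a straight line with slope -alpha, which is why a linear fit in log space is a reasonable first check. With real, noisy measurements, a fit that also models an irreducible error floor may be more appropriate.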