Staff View: Advancing Neural Network Performance through Emergence-Promoting Initialization Scheme :: Library Catalog

Bibliographic Details
Main Authors:	Li, Johnny Jingze; George, Vivek Kurien; Silva, Gabriel A.
Format:	Preprint
Published:	2024
Subjects:	Machine Learning; Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2407.19044

_version_	1866909447864975360
author	Li, Johnny Jingze; George, Vivek Kurien; Silva, Gabriel A.
author_facet	Li, Johnny Jingze; George, Vivek Kurien; Silva, Gabriel A.
contents	Emergence in machine learning refers to the spontaneous appearance of complex behaviors or capabilities that arise from the scale and structure of training data and model architectures, despite not being explicitly programmed. We introduce a novel yet straightforward neural network initialization scheme that aims at achieving greater potential for emergence. Measuring emergence as a kind of structural nonlinearity, our method adjusts the layer-wise weight scaling factors to achieve higher emergence values. This enhancement is easy to implement, requiring no additional optimization steps for initialization compared to GradInit. We evaluate our approach across various architectures, including MLP and convolutional architectures for image recognition and transformers for machine translation. We demonstrate substantial improvements in both model accuracy and training speed, with and without batch normalization. The simplicity, theoretical innovation, and demonstrable empirical advantages of our method make it a potent enhancement to neural network initialization practices. These results suggest a promising direction for leveraging emergence to improve neural network training methodologies. Code is available at: https://github.com/johnnyjingzeli/EmergenceInit.
format	Preprint
id	arxiv_https___arxiv_org_abs_2407_19044
institution	arXiv
publishDate	2024
record_format	arxiv
title	Advancing Neural Network Performance through Emergence-Promoting Initialization Scheme
topic	Machine Learning; Computer Vision and Pattern Recognition
url	https://arxiv.org/abs/2407.19044
