An approach to accelerate the training of convolutional neural networks by tuning the hyperparameters of learning

Радюк, П.М.; Radiuk, P.

dc.contributor.author	Радюк, П.М.
dc.contributor.author	Radiuk, P.
dc.date.accessioned	2021-02-05T16:58:28Z
dc.date.available	2021-02-05T16:58:28Z
dc.date.issued	2020
dc.description.abstract	Over recent decades, the adoption of deep learning methods, in particular convolutional neural networks (CNNs), has led to impressive success in processing static images and video. However, CNN training mostly relies on sets of quasi-optimal architecture and training hyperparameters. This approach requires long training times and does not guarantee a satisfactory result. Nevertheless, hyperparameter tuning is crucial to CNN effectiveness, since different hyperparameters yield models with substantially different characteristics. Poorly chosen hyperparameters usually lead to low model performance. At present, the problem of selecting optimal hyperparameters for CNNs remains unsolved. This work proposes several practical approaches to hyperparameter tuning that reduce training time and improve model accuracy. The article examines the validation loss function during underfitting and overfitting and gives guidelines for reaching the optimization point. The work also considers adjusting the learning rate and momentum to accelerate network training. All experiments are based on the well-known CIFAR-10 and CIFAR-100 datasets.	uk_UA
dc.description.abstract	Over the last decade, a set of machine learning algorithms known as deep learning has led to significant improvements in computer vision and in natural language recognition and processing. This has led to the widespread use of commercial, learning-based products in many fields of human activity. Despite this success, deep neural networks remain a black box in practice. Today, setting hyperparameters and designing a network architecture requires experience and a great deal of trial and error, and relies more on chance than on a scientific approach. At the same time, simplifying deep learning is a pressing task. To date, no simple method exists for establishing the optimal values of the training hyperparameters, namely learning rate, batch size, momentum, and weight decay. Grid search and random search over the hyperparameter space are extremely resource-intensive. The choice of hyperparameters is critical for both training time and the final result. In addition, practitioners often choose one of the standard architectures (for example, ResNets) together with ready-made sets of hyperparameters. However, such sets are usually suboptimal for specific practical tasks. The presented work offers an approach to finding the optimal set of training hyperparameters for a CNN. Treating all hyperparameters together is valuable because they are interdependent. The aim of the work is to develop an approach for setting hyperparameters that reduces the time spent designing a CNN and ensures its effective operation.
In recent decades, the introduction of deep learning methods, in particular convolutional neural networks (CNNs), has led to impressive success in image and video processing. However, CNN training has mostly relied on quasi-optimal hyperparameters. Such an approach usually requires huge computational and time costs to train the network and does not guarantee a satisfactory result. Yet hyperparameters play a crucial role in the effectiveness of a CNN, as different hyperparameters lead to models with significantly different characteristics. Poorly selected hyperparameters generally lead to low model performance. The issue of choosing optimal hyperparameters for CNNs has not yet been resolved. The presented work proposes several practical approaches to setting hyperparameters that reduce training time and increase model accuracy. The article examines the validation loss during underfitting and overfitting and provides guidelines for reaching the optimization point. The paper also considers adjusting the learning rate and momentum to accelerate network training. All experiments are based on the widely used CIFAR-10 and CIFAR-100 datasets.	uk_UA
dc.identifier.citation	Радюк П. М. Підхід до прискорення навчання згорткової нейронної мережі за рахунок налаштування гіперпараметрів навчання / П. М. Радюк // Комп’ютерні системи та інформаційні технології. – 2020. – № 2. – С. 31-36.	uk_UA
dc.identifier.uri	https://elar.khmnu.edu.ua/handle/123456789/9958
dc.language.iso	uk	uk_UA
dc.publisher	Хмельницький національний університет	uk_UA
dc.subject	швидкість навчання	uk_UA
dc.subject	розмір підвиборки набору даних	uk_UA
dc.subject	імпульс навчання	uk_UA
dc.subject	зниження ваги	uk_UA
dc.subject	гіперпараметри	uk_UA
dc.subject	згорткова нейронна мережа	uk_UA
dc.subject	точність валідації	uk_UA
dc.subject	learning rate	uk_UA
dc.subject	batch size	uk_UA
dc.subject	momentum	uk_UA
dc.subject	weight decay	uk_UA
dc.subject	hyperparameters	uk_UA
dc.subject	convolutional neural network	uk_UA
dc.subject	validation accuracy	uk_UA
dc.subject.udc	004.023+004.93	uk_UA
dc.title	Підхід до прискорення навчання згорткової нейронної мережі за рахунок налаштування гіперпараметрів навчання	uk_UA
dc.title.alternative	An approach to accelerate the training of convolutional neural networks by tuning the hyperparameters of learning	uk_UA
dc.type	Article	uk_UA

Files

Name: 22-Текст статті-74-1-10-20201128.pdf
Size: 987.87 KB
Format: Adobe Portable Document Format

Ліцензійна угода

Зараз показуємо 1 - 1 з 1

Назва:: license.txt
Розмір:: 4.26 KB
Формат:: Item-specific license agreed upon to submission
Опис:

Завантажити

Collections

CSIT - 2020