Methods of analysis of multimodal data to increase the accuracy of classification

Boyko, Nataliya; Бойко, Наталія Іванівна; Бойко, Наталия Ивановна; Muzyka, Mykhaylo; Музика, Михайло Васильович; Музыка, Михаил Васильевич

Applied Aspects of Information Technology = Прикладні аспекти інформаційних технологій, 2022, Vol. 5, No. 2

dc.contributor.author	Boyko, Nataliya
dc.contributor.author	Бойко, Наталія Іванівна
dc.contributor.author	Бойко, Наталия Ивановна
dc.contributor.author	Muzyka, Mykhaylo
dc.contributor.author	Музика, Михайло Васильович
dc.contributor.author	Музыка, Михаил Васильевич
dc.date.accessioned	2022-08-08T18:48:06Z
dc.date.available	2022-08-08T18:48:06Z
dc.date.issued	2022-07-04
dc.identifier.citation	Boyko, N., Muzyka, M. (2022). Methods of analysis of multimodal data to increase the accuracy of classification. Applied Aspects of Information Technology, Vol. 5, N 2, p. 147–160.	en
dc.identifier.citation	Boyko, N. Methods of analysis of multimodal data to increase the accuracy of classification / N. Boyko, M. Muzyka // Applied Aspects of Information Technology = Прикладні аспекти інформ. технологій. – Odesa, 2022. – Vol. 5, N 2. – P. 147–160.	en
dc.identifier.issn	2617-4316
dc.identifier.issn	2663-7723
dc.identifier.uri	http://dspace.opu.ua/jspui/handle/123456789/12919
dc.description.abstract	This paper proposes methods for analyzing multimodal data that help improve the overall accuracy of results, together with K-Nearest Neighbor (KNN) classification methods intended to minimize the associated risk. The mechanism for increasing the accuracy of KNN classification is considered. The research methods used are comparison, analysis, induction, and experiment. The work aims to improve the accuracy of KNN classification by comparing existing algorithms and applying new methods. Many literature and media sources on classification with the k-nearest-neighbors algorithm were analyzed, and the most interesting variants of the algorithm were selected. Emphasis is placed on achieving maximum classification accuracy by comparing and improving existing methods for choosing the number k and for finding the nearest class. Algorithms with and without data analysis and preprocessing are also compared. All strategies discussed in the article are evaluated in practice. An experimental classification by k nearest neighbors with different variations was performed, using two data sets of different sizes. Different values of k and different test sample sizes were taken as classification arguments. The paper studies three variants of the k-nearest-neighbors algorithm: the classical KNN, KNN with the lowest average, and hybrid KNN. These algorithms are compared across test sample sizes and values of k. The data are analyzed before classification. As for selecting the number k, no simple method gives the maximum result with high accuracy. The essence of the algorithm is to find the k objects closest to the sample among objects already assigned to predefined, numbered classes. Then, among these k objects, the occurrences of each class are counted and the most common class is assigned to the sample. If two classes tie for the largest number of occurrences, the class with the smaller number is assigned.	en
dc.language.iso	en	en
dc.publisher	Odessa National Polytechnic University	en
dc.subject	Method	en
dc.subject	algorithm	en
dc.subject	analysis	en
dc.subject	machine learning	en
dc.subject	multimodal data	en
dc.subject	classification	en
dc.subject	K-Nearest Neighbor	en
dc.title	Methods of analysis of multimodal data to increase the accuracy of classification	en
dc.title.alternative	Методи машинного навчання для класифікації мультимодальних даних [Machine learning methods for the classification of multimodal data]	en
dc.type	Article	en
opu.citation.journal	Applied Aspects of Information Technology	en
opu.citation.volume	5	en
opu.citation.firstpage	147	en
opu.citation.lastpage	160	en
opu.citation.issue	2	en
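
The classification rule described in the abstract (find the k nearest labeled objects, count class occurrences, assign the most common class, and break ties in favor of the smaller class number) can be sketched as follows. This is a minimal illustration of the classical variant only, not the authors' implementation: the function name `knn_classify`, the use of Euclidean distance, and the toy data are assumptions, since the abstract names no distance metric and does not define the lowest-average or hybrid variants.

```python
import math
from collections import Counter


def knn_classify(train_points, train_labels, query, k):
    """Classify `query` by majority vote among its k nearest training points.

    Classes are integers; if several classes tie for the most votes,
    the smallest class number wins, as stated in the abstract.
    """
    if not 1 <= k <= len(train_points):
        raise ValueError("k must be between 1 and the number of training points")

    # Assumption: Euclidean distance (the abstract does not specify a metric).
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )

    # Count how often each class occurs among the k nearest neighbors.
    votes = Counter(label for _, label in distances[:k])
    top_count = max(votes.values())

    # Tie-break: among the most frequent classes, take the smallest number.
    return min(label for label, count in votes.items() if count == top_count)


# Hypothetical toy data: two well-separated classes numbered 1 and 2.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0)]
labels = [1, 1, 2, 2]

near_class_1 = knn_classify(points, labels, (0.2, 0.5), k=3)
tied_vote = knn_classify(points, labels, (5.0, 5.5), k=4)
```

With `k=3` the query near the origin gets two votes for class 1 and one for class 2. With `k=4` every point votes, the classes tie at two votes each, and the smaller class number is returned even though the query lies among the class 2 points, which shows how strongly the choice of k can affect the result.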