
VacancySBERT: the approach for representation of titles and skills for semantic similarity search in the recruitment domain


dc.contributor.author Bocharova, Maiia
dc.contributor.author Бочарова, Майя Юріївна
dc.contributor.author Бочарова, Майя Юрьевна
dc.contributor.author Malakhov, Eugene
dc.contributor.author Малахов, Євген Валерійович
dc.contributor.author Малахов, Евгений Валерьевич
dc.contributor.author Mezhuyev, Vitaliy
dc.contributor.author Межуєв, Віталій Іванович
dc.contributor.author Межуев, Виталий Иванович
dc.date.accessioned 2023-05-03T20:07:21Z
dc.date.available 2023-05-03T20:07:21Z
dc.date.issued 2023-04-10
dc.identifier.citation Bocharova, M., Malakhov, E., Mezhuyev, V. (2023). VacancySBERT: the approach for representation of titles and skills for semantic similarity search in the recruitment domain. Applied Aspects of Information Technology, Vol. 6, N 1, p. 52–59. en
dc.identifier.citation Bocharova, M. VacancySBERT: the approach for representation of titles and skills for semantic similarity search in the recruitment domain / M. Bocharova, E. Malakhov, V. Mezhuyev // Applied Aspects of Information Technology = Прикладні аспекти інформ. технологій. – Odesa, 2023. – Vol. 6, N 1. – P. 52–59. en
dc.identifier.issn 2617-4316
dc.identifier.issn 2663-7723
dc.identifier.uri http://dspace.opu.ua/jspui/handle/123456789/13461
dc.description.abstract The paper focuses on deep learning semantic search algorithms applied in the HR domain. The aim of the article is to develop a novel approach to training a Siamese network to link the skills mentioned in a job ad with its title. It is shown that the title normalization process can be based either on classification or on similarity comparison approaches. While classification algorithms strive to assign a sample to a predefined set of categories, similarity search algorithms take a more flexible approach, since they are designed to find samples similar to a given query sample without requiring predefined classes and labels. In this article, semantic similarity search is used to find candidates for title normalization. A pre-trained language model is adapted by teaching it to match titles and skills based on co-occurrence information. For the purposes of this research, fifty billion title-description pairs were collected for training the model, along with thirty-three thousand title-description-normalized-title triplets, where the normalized job title was selected manually by the job ad creator, for testing purposes. FastText, BERT, SentenceBERT and JobBERT are used as baselines. Recall in the top one, five and ten model suggestions is used as the accuracy metric of the designed algorithm. It is shown that the novel training objective achieves a significant improvement in comparison to other generic and domain-specific text encoders. Two settings are compared in this article: treating titles as standalone strings, and including skills as additional features during inference. Improvements of 10% and 21.5% are achieved using VacancySBERT and VacancySBERT (with skills), respectively. The benchmark has been released as open source to foster further research in the area. en
dc.language.iso en en
dc.publisher Nauka i Tekhnika en
dc.subject Natural language processing en
dc.subject document representation en
dc.subject semantic similarity search en
dc.subject sentence embeddings en
dc.subject deep neural networks en
dc.subject data mining en
dc.title VacancySBERT: the approach for representation of titles and skills for semantic similarity search in the recruitment domain en
dc.title.alternative VacancySBERT: підхід до представлення назв посад та навичок для семантичного пошуку в домені підбору персоналу uk
dc.type Article en
opu.citation.journal Applied Aspects of Information Technology en
opu.citation.volume 6 en
opu.citation.firstpage 52 en
opu.citation.lastpage 59 en
opu.citation.issue 1 en
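
The abstract above can be illustrated with a minimal training and evaluation sketch. The snippet below is not the authors' released code: it assumes the sentence-transformers library, a generic bert-base-uncased backbone, hypothetical title_skill_pairs and test-set variables, and in-batch negatives (MultipleNegativesRankingLoss) as a stand-in for the paper's co-occurrence-based Siamese objective.

# Sketch of SBERT-style Siamese training on (title, skills) pairs and
# Recall@k evaluation. Data and model choices are illustrative assumptions;
# see the published paper and open-source benchmark for the actual setup.
import numpy as np
from torch.utils.data import DataLoader
from sentence_transformers import SentenceTransformer, InputExample, losses, models

# Siamese encoder: one BERT tower with mean pooling, shared by both inputs.
word = models.Transformer("bert-base-uncased", max_seq_length=64)
pool = models.Pooling(word.get_word_embedding_dimension(), pooling_mode="mean")
model = SentenceTransformer(modules=[word, pool])

# Hypothetical training data: a title paired with skills from the same ad.
title_skill_pairs = [
    ("Senior Java Developer", "java, spring, hibernate, microservices"),
    ("Data Scientist", "python, pandas, scikit-learn, statistics"),
]
train_examples = [InputExample(texts=[t, s]) for t, s in title_skill_pairs]
loader = DataLoader(train_examples, shuffle=True, batch_size=32)

# In-batch negatives: each title is pulled toward its own skills and pushed
# away from the skills of the other ads in the batch.
train_loss = losses.MultipleNegativesRankingLoss(model)
model.fit(train_objectives=[(loader, train_loss)], epochs=1, warmup_steps=100)

# Title normalization as semantic similarity search: embed raw titles
# (queries) and the catalog of normalized titles, rank by cosine similarity.
raw_titles = ["Sr. Java Eng."]                             # queries from test triplets
normalized_titles = ["Java Developer", "Data Scientist"]   # normalization catalog
gold = [0]                                  # index of the correct normalized title

q = model.encode(raw_titles, normalize_embeddings=True)
c = model.encode(normalized_titles, normalize_embeddings=True)
ranks = np.argsort(-(q @ c.T), axis=1)      # cosine == dot product on unit vectors

for k in (1, 5, 10):
    recall = np.mean([g in r[:k] for g, r in zip(gold, ranks)])
    print(f"Recall@{k}: {recall:.3f}")

The in-batch negative objective mirrors the co-occurrence signal described in the abstract: titles and skill sets from the same vacancy form positive pairs, everything else in the batch serves as a negative, so no manual labels are required for training.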

