CRZP - record detail

DDP - Published diploma thesis

Emotion Recognition in Text Using Neural Networks (Rozpoznávanie emócií v texte pomocou neurónových sietí)

Author: Jancura, Martin
Supervisor: Hládek, Daniel
Reviewer: Ondáš, Stanislav
School: Technical University of Košice 1040 104005
Year of submission: 2023
Number of pages: 62
Permanent link - CRZP: https://opac.crzp.sk/?fn=detailBiblioForm&sid=E4E659F3575B0C5BCF0C726CCD36

Primary language: Slovak

Type of work: Diploma thesis

Field of study: 2508 | *informatics

Date the thesis was submitted to CRZP: 24.04.2023

Date the protocol was created: 24.04.2023

Date the license agreement information was delivered: 07.06.2023

Thesis may be published from: 24.04.2023

Keywords (other):

neural networks, web scraping, BERT, transformers, language models, GPT-3, evaluation sets for the Slovak language

Abstract in primary language

This diploma thesis addresses sentiment recognition in text using neural networks. Its aim is to select a suitable model for recognizing sentiment in Slovak texts and to test its accuracy on various test datasets. The opening chapters explain the basic types of sentiment that can be classified in text. The thesis then describes the theoretical foundations and types of neural networks, how they work, and the different ways in which they approach sentiment recognition in text. Popular language models based on transformer neural networks are also introduced, chiefly different variants of the BERT and GPT-3 models, together with their properties and the methods by which they perform text classification. The practical part presents several evaluation sets for sentiment recognition in Slovak texts, one of which is created using web scraping and the Puppeteer library. Selected texts from these evaluation sets are then used to test the chosen language models on the sentiment classification task. The model that achieves the best results in terms of prediction accuracy is selected as the most suitable for sentiment classification in Slovak texts.

Abstract in secondary language

This diploma thesis deals with sentiment recognition in text using neural networks. The aim of the work is to select a suitable model for sentiment recognition in Slovak texts and to test its accuracy on various test datasets. The introductory chapters explain the basic types of sentiment that can be classified in text. Next, the theoretical foundations and types of neural networks, the principle of their operation, and the various ways in which they handle sentiment recognition in text are described. Popular language models based on transformer neural networks are also presented, mainly different variants of the BERT and GPT-3 models. Their properties and the methods by which they perform text classification are described. The practical part presents several evaluation sets for sentiment recognition in Slovak texts, one of which is created using a technique called web scraping and the Puppeteer library. Some texts from the evaluation sets are then used to test the selected language models on the sentiment classification task. The model that achieves the best results in terms of prediction accuracy is chosen as the most suitable for sentiment classification in Slovak texts.