Method of formalized procedure for synthesis and computation of features for fake news detection

Шупта, Андрій; Shupta, Andrii

dc.contributor.author	Шупта, Андрій
dc.contributor.author	Shupta, Andrii
dc.date.accessioned	2025-10-18T17:56:06Z
dc.date.available	2025-10-18T17:56:06Z
dc.date.issued	2025
dc.description.abstract	This paper proposes a new method that formalizes the fake news detection procedure. The method relies on the capabilities of large language models (LLMs) to synthesize suspicious textual attributes and convert them into numeric vectors suitable for classification. The aim of the study is to refine the process of converting textual signals into numeric features, which improves the integration of linguistic signals with deep contextual feature vectors. Experiments were conducted on English-language (FakeNewsNet) and Ukrainian-language (Ukrainian news) datasets, where the proposed method outperformed baseline approaches, reaching accuracy of up to 89.6% for English and 88.3% for Ukrainian. Key results show that combining numeric indicators (for example, paraphrase and sentiment coefficients) with LLM-based generation yields higher recall in detecting misleading news articles. The proposed feature computation procedure improves detection accuracy while preserving the transparency of the model's decision-making. The study highlights the importance of systematically designed numeric features that complement LLM generations, offering a path toward more reliable, adaptive, and explainable fake news detection systems.
dc.description.abstract	Digital disinformation is pervasive and keeps evolving, so detection systems must be accurate, transparent, and adaptable to new deceptive strategies. Large Language Models (LLMs) are effective at discerning nuanced textual patterns, but their use in fake news detection often yields “black-box” systems, which limits trust and hinders the response to emergent manipulative techniques. This paper introduces a novel method designed to bridge this gap. We present a structured procedure that synthesizes suspicious textual attributes, guided by LLM-driven insights, and transforms them into a set of quantifiable, interpretable numerical features. These features, which cover paraphrase intensity, sentiment polarity, stylistic anomalies, and fact-checking congruity, are then integrated with the deep contextual embeddings generated by LLMs. Experiments were conducted on English (FakeNewsNet) and Ukrainian (Ukrainian news) datasets. The proposed method outperformed established baseline approaches, reaching accuracy of up to 89.6% for English and 88.3% for Ukrainian texts. Key findings show that explicitly incorporating these engineered numeric indicators improves recall for deceptive articles, a critical factor in mitigating the societal impact of misinformation. Furthermore, the method’s modularity supports adaptability: newly identified deceptive patterns can be added as numeric features without completely retraining the foundational LLM. The study underscores the value of systematically engineered, interpretable numeric features as a complement to the powerful, yet often opaque, embeddings of LLMs.
dc.identifier.citation	Shupta A. Method of formalized procedure for synthesis and computation of features for fake news detection / A. Shupta // Herald of Khmelnytskyi National University. Technical Sciences. – 2025. – Vol. 355, No. 4. – P. 719-723.
dc.identifier.uri	https://elar.khmnu.edu.ua/handle/123456789/19658
dc.language.iso	uk
dc.publisher	Khmelnytskyi National University
dc.subject	fake news detection
dc.subject	large language models
dc.subject	feature computation procedure
dc.subject	natural language processing
dc.subject	text classification
dc.subject.udc	004.85:004.912:025.4.03
dc.title	Method of formalized procedure for synthesis and computation of features for fake news detection
dc.title.alternative	Method of formalized procedure for synthesis and computation of features for fake news detection
dc.type	Article

Files

Name: 1+(12).pdf
Size: 926.11 KB
Format: Adobe Portable Document Format

Collection

Herald of KhNU. Technical Sciences - 2025