Title:Dataset of sentiment tagged language resources for Macedonian language
Authors:ID Kochovska, Sofija (Author)
ID Vičič, Jernej (Author)
ID Kavšek, Branko (Author)
Files:.pdf RAZ_Kochovska_Sofija_2026.pdf (251,79 KB)
MD5: 5832CFA5E6E8777E84143B994BAEE2E4
 
URL https://www.sciencedirect.com/science/article/pii/S2352340925010972?via%3Dihub
 
Language:English
Work type:Article
Typology:1.01 - Original Scientific Article
Organization:FAMNIT - Faculty of Mathematics, Science and Information Technologies
Abstract:Macedonian is a South Slavic language spoken by about 2 million people, primarily in North Macedonia and among diaspora communities worldwide. It has several distinctive features. Most notably, it uses definite articles attached to the end of nouns: for example, kniga (a book) becomes knigata (the book). Furthermore, it does not use grammatical cases, which makes its grammar relatively straightforward compared to other Slavic languages. The dataset comprises two lists of sentiment-annotated words that form the core of the Macedonian sentiment-annotated lexicon, a list of stopwords, a list of Affirmative and non-Affirmative words (AnA words) composed mostly of intensifiers and diminishers, and a list of polarity shifters. The materials are intended mainly for rule-based sentiment analysis, though some of the lists can be used more broadly.
Keywords:Macedonian language, sentiment analysis, sentiment lexicon, rule-based methods, natural language processing, low-resource languages, AnA words, stopwords, intensifiers, diminishers, polarity shifters
Publication version:Version of Record
Publication date:12.12.2025
Year of publishing:2026
Number of pages:pp. 1-6
Numbering:Vol. 64, article 112384
PID:20.500.12556/RUP-22497
UDC:004.65:811.163.3
ISSN on article:2352-3409
DOI:10.1016/j.dib.2025.112384
COBISS.SI-ID:261524739
Publication date in RUP:20.01.2026
Views:52
Downloads:2

Record is a part of a journal

Title:Data in brief
Publisher:Elsevier
ISSN:2352-3409
COBISS.SI-ID:32117977

Licences

License:CC BY 4.0, Creative Commons Attribution 4.0 International
Link:http://creativecommons.org/licenses/by/4.0/
Description:This is the standard Creative Commons license that gives others maximum freedom to do what they want with the work as long as they credit the author.

Secondary language

Language:Slovenian
Keywords:makedonski jezik, analiza čustev, leksikon čustev, metode, ki temeljijo na pravilih, obdelava naravnega jezika, jeziki z omejenimi viri, besede AnA, stop besede, ojačevalniki, pomanjševalniki, spreminjalci polarnosti

