1. Dataset of Uzbek base words : extraction and data analysis based on the school corpus
Khabibulla Madatov, Surayyo Khajibaeva, Jernej Vičič, 2026, original scientific article
Abstract: The article presents a dataset of Uzbek base words extracted from a purposefully prepared corpus using the Synonym Thesaurus Support method. This method identifies base words for each school-grade by analysing a large text corpus comprising 142 textbooks intended for school education in Uzbekistan. The definition of the base word used in this article and in the proposed dataset is a word within a synonymic series that:
- is the most widely used,
- is distinguished by semantic clarity and stability,
- has stylistic neutrality.
Based on the proposed approach, school textbooks were analysed by dividing them into Primary (school grades 1 - 4), Basic Secondary (school grades 5 - 9), and Secondary (school grades 10 - 11) blocks. Base words that stand out from the general corpus were identified for each school-grade. This method extracted new base words not found in previous school grades and specific to the observed grade. The main idea of the method is to extract base words from the lemma set of each school-grade using a corpus of synonyms. This allows analysing the level of lexical complexity and class-specific vocabulary richness of texts intended for schoolchildren. The final results are lists of base words specifically extracted from primary (school-grades 1 - 4), basic secondary (school-grades 5 - 9), and secondary (school-grades 10 - 11) school texts: 17,599, 48,203, and 20,491 base words, respectively.
Keywords: school corpus, base word, basic vocabulary, Uzbek language
Published in RUP: 20.05.2026; Views: 25; Downloads: 4
Full text (2,42 MB)
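The grade-by-grade extraction described in the abstract of entry 1 can be illustrated with a minimal sketch. It is not the authors' Synonym Thesaurus Support implementation: it models only the "most widely used" criterion (semantic clarity and stylistic neutrality are not modelled), and the function names, the English toy words, and the frequency counts are all hypothetical.

```python
from collections import Counter


def pick_base_word(series, freq):
    """Return the most frequent member of a synonym series.

    Assumption: frequency alone decides; ties go to the first listed word.
    """
    return max(series, key=lambda word: freq[word])


def new_base_words_by_grade(grade_lemmas, synonym_series, freq):
    """Map each grade to the base words that first appear in that grade."""
    base_words = {pick_base_word(series, freq) for series in synonym_series}
    seen = set()
    result = {}
    for grade in sorted(grade_lemmas):
        found = grade_lemmas[grade] & base_words  # base words used in this grade
        result[grade] = found - seen              # keep only those not seen earlier
        seen |= found
    return result


# Hypothetical toy data (English stand-ins, not taken from the dataset).
freq = Counter({"big": 50, "large": 20, "huge": 5,
                "fast": 30, "quick": 12, "rapid": 3})
synonym_series = [["big", "large", "huge"], ["fast", "quick", "rapid"]]
grade_lemmas = {
    1: {"big", "cat"},
    2: {"big", "fast", "large"},
    3: {"rapid", "fast"},
}

new_words = new_base_words_by_grade(grade_lemmas, synonym_series, freq)
```

In this toy run, "big" is attributed to grade 1 and "fast" to grade 2, while grade 3 contributes no new base word because "fast" was already seen.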
2. Roma in the healthcare system : experiences of healthcare professionals
Debora Levstik Jašarevič, Andrej Kočevar, Daša Petrič, Mirko Prosen, Sabina Ličen, 2026, original scientific article
Abstract: Introduction. The Roma community faces numerous challenges in accessing health services, including language barriers, discrimination, low health literacy, and social exclusion. Aim. The aim of the research was to examine the experiences of health professionals in treating Roma patients, with a focus on communication, cultural differences, and access to health services. Special emphasis was also placed on the presence of antigypsyism in the healthcare system. Methods. The research was based on a qualitative descriptive design. The sample included 15 healthcare workers with experience in treating Roma patients, primarily from the Dolenjska region. Data were collected through semi-structured individual interviews conducted between November and December 2024. The results were analyzed using thematic analysis. Results. Thematic analysis identified four themes: (1) Cultural characteristics of the Roma community, (2) Healthcare of Roma, (3) Interpersonal relationships between healthcare workers and Roma, (4) Communication with Roma. The results show that language barriers, low health literacy, and cultural differences are the main challenges in the treatment of Roma patients. Healthcare workers highlighted the use of “Roma helpers” – cultural mediators, communication adaptations, and educational workshops as successful strategies. Conclusion. The research highlights the importance of intercultural competences for improving healthcare for the Roma community. Healthcare professionals identified key strategies for addressing antigypsyism, such as patience, building trust, involving Roma cultural mediators, and organising targeted workshops. These strategies align with current guidelines, as they are based on respect, inclusion, and co-design of services with the Roma community. The findings can contribute to the development of tailored programmes that promote inclusion and reduce health inequalities.
Keywords: Roma community, cultural differences, language barriers, healthcare workers, health literacy
Published in RUP: 17.05.2026; Views: 77; Downloads: 0
Full text (116,07 KB)
3. Inferring a Mobile User’s Valence and Arousal through On-Screen Text Analysis
Edita Džubur, Veljko Pejović, 2025, independent scientific component part or a chapter in a monograph
Abstract: Understanding a user’s emotional state is critical for building adaptive and intelligent mobile applications. In this paper we investigate the feasibility of inferring valence and arousal from the text displayed on smartphone screens. We developed AV-Sense, a mobile application that combines the Experience Sampling Method, a technique that prompts users to report their feelings in the moment, with passive screentext logging. In a two-week study with 12 participants, we collected 787 ESM responses and over 650,000 screentext entries. Data analysis revealed meaningful temporal and individual patterns in reported affect. We then explored the use of large language models to predict valence and arousal from screentext, but results indicated limited predictive power in this setting. Our findings highlight both the potential and current challenges of screentext-based affect inference, laying the groundwork for future research on emotion-aware applications and naturalistic psychological studies.
Keywords: text analysis, experience sampling method, screentext sensing, valence, arousal, large language models
Published in RUP: 30.01.2026; Views: 525; Downloads: 3
Full text (350,11 KB)
4. Transparent Persona Generation With LLMs : An Evidence-based and Traceable Method for User-centred Design
Bojan Blažica, Manca Topole, Marko Debeljak, 2025, independent scientific component part or a chapter in a monograph
Abstract: Personas are a cornerstone of user-centred design, but traditional methods for developing them are difficult to validate, prone to bias and labour-intensive. Data-driven approaches have improved scalability, but often lack the narrative richness and empathy that make personas effective. We present a methodology that uses large language models (LLMs) to accelerate the creation of personas while underpinning and constraining the results with contextual and empirical data. Our approach emphasises transparency and traceability: each generated persona attribute can be linked to its source material, including project documentation, workshop transcripts, survey results or other contextual corpora. By combining the narrative strengths of LLMs with the rigour of an evidence-based foundation, the method generates personas that are both descriptive and verifiable. We present a five-step workflow methodology: (1) generation of persona candidates from contextual data using LLMs, (2) iterative refinement to ensure representativeness of personas, (3) selection of the most relevant profiles through expert evaluation, (4) design of detailed persona profiles, and (5) enrichment with empirical evidence to ensure traceability and validation. The methodology is illustrated with a case study from the field of soil health, but can also be applied to other design contexts where alignment between different stakeholders is crucial. We argue that this approach positions LLMs not as a substitute for human expertise, but as an accelerator of persona work that improves accountability, reduces bias and facilitates communication in collaborative design processes.
Keywords: personas, large language model, traceability, user-centered design, decision support systems
Published in RUP: 30.01.2026; Views: 542; Downloads: 5
Full text (275,89 KB)
5. Role of music therapy in the development of language skills in children with autism spectrum disorder : a systematic literature review
Lucija Mlakar, Vesna Posavčević, 2026, review article
Abstract: Autism Spectrum Disorder (ASD) is a complex neurodevelopmental condition that typically emerges in early childhood, marked by difficulties in communication, social interaction, behaviour, and emotional regulation. Despite these challenges, many children with ASD demonstrate exceptional musical abilities, making music a powerful medium for enhancing self-expression, fostering social bonds, and supporting neurological development crucial for speech and social skills. Historically, minimally verbal children with ASD were often excluded from research due to the difficulty of assessment using standardised tools; however, recent advancements have enabled more inclusive studies. Over the past decade, naturalistic approaches have gained prominence, with music therapy emerging as a particularly promising intervention. A systematic literature review, based on original research sourced from PubMed, Sage, and ScienceDirect, examined six studies involving children aged two to twelve years with minimal verbal abilities and a clinical diagnosis of autism. These studies consistently found that music therapy significantly supports the development of language and social communication skills, while also enhancing fronto-temporal brain connectivity. The review contributes valuable insights into the current state of research, underscores the importance of early intervention and parental involvement, and lays the groundwork for further exploration into the role of music therapy in language development for children with ASD.
Keywords: autism spectrum disorder, children, minimal language abilities, social communication, fronto-temporal brain connectivity, music therapy, non-music therapy
Published in RUP: 28.01.2026; Views: 599; Downloads: 20
Full text (3,34 MB)
6. Psychosis as a Transformation of the Flesh : Some Merleau-Pontian Musings on Madness
Adnan Sivić, 2025, original scientific article
Abstract: Psychosis is often understood in one of two ways: as a breakdown of cognitive circuitry, which has nothing to teach us as far as phenomenology is concerned and that can be treated only by focusing on the underlying causal processes that bring it about (reductionism and the ‘madness-as-nonsense’ view), or, alternatively, as a different interpretation of reality, one with nothing distinctly pathological about it (relativism). In this paper, I outline a different approach, drawing largely on Merleau-Ponty’s work, which aims to encompass both the properly unintelligible (pathological) and intelligible (expressive, phenomenologically informative) aspects of psychosis. By applying Merleau-Ponty’s analysis of expression to the problem of psychosis and psychotic language, the latter can be understood as an attempt at expression – a kind of speech without language that is most often incomplete, but that can under specific circumstances be made intelligible to others, often to significant therapeutic benefit. The present paper thus aims to complement and conceptually elucidate recent work in phenomenological psychiatry, which has demonstrated the clinical significance of enabling patients to express various aspects of their psychotic episodes.
Keywords: psychosis, phenomenology, philosophy of psychiatry, Merleau-Ponty, phenomenology of language
Published in RUP: 22.01.2026; Views: 320; Downloads: 2
Full text (157,99 KB)
7. Dataset of sentiment tagged language resources for Macedonian language
Sofija Kochovska, Jernej Vičič, Branko Kavšek, 2026, original scientific article
Abstract: Macedonian is a South Slavic language spoken by about 2 million people, primarily in North Macedonia and among diaspora communities worldwide. It’s known for a few distinctive features. Most notably, it uses definite articles attached to the end of nouns: for example, kniga (a book) becomes knigata (the book). Furthermore, it doesn’t use grammatical cases, which makes its grammar relatively straightforward compared to other Slavic languages. The dataset comprises two lists of sentiment-annotated words that form the core of the Macedonian sentiment-annotated lexicon, a list of stopwords, a list of Affirmative and non-Affirmative words (AnA words) composed mostly of intensifiers and diminishers, and a list of polarity shifters. The main usage of the presented materials is in rule-based sentiment analysis, but the usage of some of the lists can be much broader.
Keywords: Macedonian language, sentiment analysis, sentiment lexicon, rule-based methods, natural language processing, low-resource languages, AnA words, stopwords, intensifiers, diminishers, polarity shifters
Published in RUP: 20.01.2026; Views: 467; Downloads: 4
Full text (251,79 KB)
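To show how the kinds of lists named in entry 7 could fit together in rule-based sentiment analysis, here is a minimal sketch. The scoring rules, the numeric weights, and the English toy words are assumptions made for illustration; they are not taken from the dataset or from the authors' method.

```python
def score_sentence(tokens, lexicon, modifiers, shifters, stopwords):
    """Sum word polarities, applying pending modifiers and shifters.

    Assumed rules: an intensifier or diminisher scales the next sentiment
    word, a polarity shifter flips its sign, and stopwords are skipped.
    """
    total = 0.0
    weight = 1.0    # product of modifiers waiting for a sentiment word
    negate = False  # True if an odd number of shifters is pending
    for token in tokens:
        token = token.lower()
        if token in stopwords:
            continue
        if token in shifters:
            negate = not negate
        elif token in modifiers:
            weight *= modifiers[token]
        elif token in lexicon:
            value = lexicon[token] * weight
            total += -value if negate else value
            weight, negate = 1.0, False  # reset after each sentiment word
        # Any other token is ignored and leaves pending state untouched.
    return total


# Hypothetical toy resources (English stand-ins for the Macedonian lists).
lexicon = {"good": 1.0, "bad": -1.0}
modifiers = {"very": 1.5, "slightly": 0.5}  # intensifier, diminisher
shifters = {"not"}
stopwords = {"the", "is", "a"}

positive = score_sentence("the film is very good".split(),
                          lexicon, modifiers, shifters, stopwords)
shifted = score_sentence("the film is not very good".split(),
                         lexicon, modifiers, shifters, stopwords)
```

With these toy values, the first sentence scores 1.5 and the second scores -1.5, because the shifter flips the intensified polarity.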
8. The Digital Competence of Foreign Language Teachers at HEI in Serbia : A Study based on the European framework DigCompEdu
Danijela Ljubojević, Nikoleta Gutvajn, 2025, independent scientific component part or a chapter in a monograph
Abstract: In recent years, considerable literature has grown up around the theme of digital education at higher education institutions. The issue has grown in importance in light of the Covid-19 pandemic, which made teachers face rapidly changing demands. However, the question remains what this abrupt change has brought when it comes to the development of digital competences of teachers. This study therefore set out to determine the level of digital competence of foreign language teachers working at HEI in Serbia, as well as to (self-)assess their strengths and identify areas of improvement. In order to carry out this study, a questionnaire of 33 questions was implemented based on the DigCompEdu framework. The European Framework for the Digital Competence of Educators (DigCompEdu) lists 22 competences organised in six areas: Professional engagement, Digital resources, Teaching and learning, Assessment, Empowering learners, Facilitating Learners’ Digital Competence. The findings of this study show that the actual level of digital competences of teachers is A2+/B1. They also outline concrete and feasible national, institutional and interinstitutional policy recommendations to enhance the development of digital competences in higher education.
Keywords: foreign language teachers, digital competences, DigCompEdu
Published in RUP: 22.12.2025; Views: 374; Downloads: 1
Full text (397,27 KB)
9. TF-IDF-based classification of Uzbek educational texts
Khabibulla Madatov, Sapura Sattarova, Jernej Vičič, 2025, original scientific article
Abstract: This paper presents a baseline study on automatic Uzbek text classification. Uzbek is a morphologically rich and low-resource language, which makes reliable preprocessing and evaluation challenging. The approach integrates Term Frequency–Inverse Document Frequency (TF–IDF) representation with three conventional methods: linear regression (LR), k-Nearest Neighbors (k-NN), and cosine similarity (CS, implemented as a 1-NN retrieval model). The objective is to categorize school learning materials by grade level (grades 5–11) to support improved alignment between curricular texts and students’ intellectual development. A balanced dataset of Uzbek school textbooks across different subjects was constructed, preprocessed with standard NLP tools, and converted into TF–IDF vectors. Experimental results on the internal test set of 70 files show that LR achieved 92.9% accuracy (precision = 0.94, recall = 0.93, F1 = 0.93), while CS performed comparably with 91.4% accuracy (precision = 0.92, recall = 0.91, F1 = 0.92). In contrast, k-NN obtained only 28.6% accuracy, confirming its weakness in high-dimensional sparse feature spaces. External evaluation on seven Uzbek literary works further demonstrated that LR and CS yielded consistent and interpretable grade-level mappings, whereas k-NN results were unstable. Overall, the findings establish reliable baselines for Uzbek educational text classification and highlight the potential of extending beyond lexical overlap toward semantically richer models in future work.
Keywords: Uzbek language, text classification, low-resource languages, TF-IDF, cosine similarity, linear regression, k-Nearest Neighbors
Published in RUP: 17.10.2025; Views: 657; Downloads: 4
Full text (286,87 KB)
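The cosine-similarity variant mentioned in entry 9 (TF-IDF vectors queried as a 1-NN retrieval model) can be sketched with the standard library alone. This is an illustration under stated assumptions, not the paper's pipeline: the smoothed IDF formula, the function names, and the English toy documents are all hypothetical, and the abstract does not specify the exact weighting the authors used.

```python
import math
from collections import Counter


def fit_idf(docs):
    """Compute a smoothed IDF per term (assumed variant: ln((1+n)/(1+df)) + 1)."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    return {term: math.log((1 + n) / (1 + count)) + 1.0
            for term, count in df.items()}


def tfidf(doc, idf):
    """Turn a token list into a sparse TF-IDF vector; unseen terms are dropped."""
    tf = Counter(term for term in doc if term in idf)
    return {term: count * idf[term] for term, count in tf.items()}


def cosine(a, b):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(value * b.get(term, 0.0) for term, value in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def predict_grade(tokens, labelled_vectors, idf):
    """1-NN retrieval: return the grade of the most similar training text."""
    query = tfidf(tokens, idf)
    grade, _ = max(labelled_vectors, key=lambda item: cosine(query, item[1]))
    return grade


# Hypothetical toy corpus: (grade, tokens) pairs standing in for textbooks.
training = [
    (5, ["cat", "dog", "play", "ball"]),
    (11, ["integral", "derivative", "limit", "function"]),
]
idf = fit_idf([tokens for _, tokens in training])
labelled_vectors = [(grade, tfidf(tokens, idf)) for grade, tokens in training]

easy = predict_grade(["dog", "play", "garden"], labelled_vectors, idf)
hard = predict_grade(["limit", "integral"], labelled_vectors, idf)
```

Because each toy query overlaps with only one training text, `easy` resolves to grade 5 and `hard` to grade 11.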
10. Is open source the future of AI? : a data-driven approach
Domen Vake, Bogdan Šinik, Jernej Vičič, Aleksandar Tošić, 2025, original scientific article
Abstract: Large language models (LLMs) have become central to both academic research and industrial applications, fueling debates on their accuracy, usability, privacy, and potential misuse. While proprietary models benefit from substantial investments in data and computing resources, open-sourcing is often suggested as a means to enhance trust and transparency. Yet, open-sourcing comes with its own challenges, such as risks of illicit applications, limited financial incentives, and intellectual property concerns. Positioned between these extremes are hybrid approaches, including partially open models and licensing restrictions, that aim to balance openness with control. In this paper, we adopt a data-driven approach to examine the open-source development of LLMs. By analyzing contributions in model improvements, modifications, and methodologies, we assess how community efforts impact model performance. Our findings indicate that the open-source community can significantly enhance models, demonstrating that community-driven modifications can yield efficiency gains without compromising performance. Moreover, our analysis reveals distinct trends in community growth and highlights which architectures benefit disproportionately from open-source engagement. These insights provide an empirical foundation to inform balanced discussions among industry experts and policymakers on the future direction of AI development.
Keywords: large language models, artificial intelligence, open source, data science, HuggingFace
Published in RUP: 25.09.2025; Views: 1176; Downloads: 4
Full text (606,12 KB)