Title: Dataset of vocabulary in Uzbek primary education: extraction and analysis in case of the school corpus
Authors: Madatov, Khabibulla (Author)
Sattarova, Sapura (Author)
Vičič, Jernej (Author)
Files: RAZ_Madatov_Khabibulla_2025.pdf (342.87 KB)
MD5: B099D0590099A4FB7D1438D190B9CE01
 
URL: https://www.sciencedirect.com/science/article/pii/S2352340925000812
 
Language: English
Work type: Journal article
Typology: 1.01 - Original scientific article
Organization: FAMNIT - Faculty of Mathematics, Natural Sciences and Information Technologies
Description: The main goal of this research is to determine the number of new words that a primary school pupil should acquire during each academic year. To accomplish this, we created two datasets. The first dataset was compiled from the "Explanatory Vocabulary of the Uzbek Language" (EDUL). The second dataset was created from 35 primary school textbooks for grades 1-4 approved by the Ministry of Preschool and School Education of the Republic of Uzbekistan, and the authors named it the "Uzbek Primary School Corpus" (UPSC). Using the "Comparative Lemma Extraction Method" (CLEM) proposed by the authors of the article, a vocabulary for grades 1-4 was created, which made it possible to determine the number of new words that primary school pupils should learn each academic year. Word forms are disregarded, since Uzbek is a morphologically rich language.
Keywords: Uzbek language, primary school, corpus construction, natural language processing (NLP), comparative lemma extraction method
Publication date: 03.02.2025
Year of publishing: 2025
Number of pages: pp. 1-12
Numbering: Vol. 59, article 111349
PID: 20.500.12556/RUP-21537
UDC: 004.65:811.5
ISSN on article: 2352-3409
DOI: 10.1016/j.dib.2025.111349
COBISS.SI-ID: 225129475
Publication date in RUP: 08.08.2025
Views: 499
Downloads: 3

Record is a part of a journal

Title: Data in brief
Publisher: Elsevier
ISSN: 2352-3409
COBISS.SI-ID: 32117977

Document is financed by a project

Funder: EC - European Commission
Project number: 739574
Name: Renewable materials and healthy environments research and innovation centre of excellence
Acronym: InnoRenew CoE

Funder: EC - European Commission
Project number: 610170-EPP-1-2019-1-ES-EPPKA2-CBHE-JP
Name: Establishment of training and research centers and Courses development on Intelligent BigData Analysis in CA

License

License: CC BY 4.0, Creative Commons Attribution 4.0 International
Link: http://creativecommons.org/licenses/by/4.0/deed.sl
Description: This is the standard Creative Commons license that gives users the most options for further use of the work, provided they credit the author.

Secondary language

Language: Slovenian
Keywords: uzbeški jezik, osnovna šola, konstrukcija korpusa, obdelava naravnega jezika (NLP), metoda primerjalne ekstrakcije lem

