Metadata design for the first electronic learner corpus of Romanian

Carmen Mîrzea Vasile; Elena Irimia

doi:10.62229/rst/7.1/3

Authors

Carmen Mîrzea Vasile, University of Bucharest (author)
Elena Irimia, Research Institute for Artificial Intelligence “Mihai Drăgănescu”, Romanian Academy (author)

DOI:

https://doi.org/10.62229/rst/7.1/3

Keywords:

corpus metadata, individual differences, variables in language learning, Romanian as L2/FL, learner corpus of Romanian

Abstract

This paper introduces ongoing work on collecting, annotating and documenting the first digital Romanian Learner Corpus (LECOR), with a focus on its metadata. We briefly describe the institutional context of the project, the current state of the art in the field, the objectives in terms of structure, dimensions and annotations, and the work already completed at this stage of the project. We then present the modular structure of the metadata scheme and give a detailed account of all the metadata fields and their possible values, from general metadata concerning the whole corpus (Section 3.1) to metadata organised around the student/learner (Section 3.2) and the text/composition (Section 3.3). We also give some examples of how metadata has been handled in various studies, including studies based on the LECOR corpus.

Author biographies

Carmen Mîrzea Vasile, University of Bucharest

University of Bucharest

“Iorgu Iordan – Alexandru Rosetti” Institute of Linguistics, Romanian Academy

Elena Irimia, Research Institute for Artificial Intelligence “Mihai Drăgănescu”, Romanian Academy

Research Institute for Artificial Intelligence “Mihai Drăgănescu”, Romanian Academy
University of Bucharest
