Romsa
Workshop for grammar model building in the Giella infrastructure
The workshop has participants working with (at least) Faroese, Kven, Mansi and Mari.
Program
The days will consist of common lectures before lunch, and individual
Monday morning will start with a common intro session where participants
Possible afternoon topics will vary from language to language:
Specific issues in morphophonology (twolc), morphology (lexc), syntax (cg),
Common lecture topics:
- Infrastructure: The Giella file structure and how it works
- Basic Unix command line course: How to get around in the file structure + central commands
- Writing finite state transducers
- Writing constraint grammar rules
- Testing
Days
We start at 09:00 in room A3019 (other rooms: A3018, A3012, E2004, A1018)
Monday
- Morning 1: Intro + Machines + infra tree structure (Trond)
- Morning 2: Unix I: navigation + basic commands
- Morning 3: FST
- Evening: Hands-on
Tuesday
- Morning 1: Unix II:
- Morning 2: FST
- Morning 3: CG
- Evening: Hands-on
Wednesday-Thursday
- Morning: Let us see
- Evening: Hands-on
Friday
- Morning: Summary, repetition
- Evening: Planning forward
Overview, languages
FSTs
Language | code | stems | affixes (lines) | yaml tests | fails | cg rules | focus |
---|---|---|---|---|---|---|---|
Faroese | fao | 91000 | 2700 | 10955 | 359 | 293 | fst, proofing |
Kven | fkv | 41000 | 2500 | 8467 | 547 | 1585 | fst, dialect split |
Mansi | mns | 10000 | 13800 | 1922 | 848 | 0 | fst |
Mari | mhr | 55000 | 1200 | 6612 | 6 | 51 | cg |
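To compare how far along the test suites are, the figures in the table can be turned into failure rates. The sketch below is only an illustration: it assumes that the "fails" column counts failing yaml tests out of the "yaml tests" column, and the names `yaml_results` and `fail_rate` are made up for this example.

```python
# Figures copied from the table above: (yaml tests, fails) per language code.
# Assumption: "fails" is the number of failing yaml tests.
yaml_results = {
    "fao": (10955, 359),
    "fkv": (8467, 547),
    "mns": (1922, 848),
    "mhr": (6612, 6),
}


def fail_rate(tests: int, fails: int) -> float:
    """Return the share of failing yaml tests as a percentage."""
    return 100.0 * fails / tests


# Failure rate per language code.
rates = {code: fail_rate(tests, fails) for code, (tests, fails) in yaml_results.items()}

# Print the languages from highest to lowest failure rate.
for code, rate in sorted(rates.items(), key=lambda item: item[1], reverse=True):
    print(f"{code}: {rate:.1f}% of yaml tests fail")
```

Read this way, Mansi has by far the largest share of failing tests, which fits its stated focus on the fst, while Mari has almost none and can concentrate on cg.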
Groups:
- Faroese: John Mikkelsen (programmer + linux, BA fao, svn ok)
- Kven: Anna-Kaisa Räisänen, Aili Eriksen, Mari Keränen, Sindre Trosterud (mac, svn ok)
- Mansi: Csilla Horvath, Veronika Vincze, Agaston Nagy (mac, svn ok)
- Mari: Jeremy Bradley, Sasha Simonenko (mac, svn ok)
- Russian: Uliana Petrunina, Svetlana Sokolova (mac, svn ok)
Preparations:
- Have a look at the code if you do not know it (click on your language code)
- Here is the language documentation
- We will give an introduction to Unix along these lines: http://giellalt.uit.no/tools/newunix.html (see: Unix for linguists/лингвистов/lingvistar)