Documentation on North Saami
Source file documentation
Using the analysers
Projects involving North Saami
Tags used for analysis
Discussions on improving our linguistic analysis
Morphophonology, morphology and syntax
- Documentation of the twol-sme.txt rule file
- Documentation of the lexicon files
- The use of flag diacritics
- Documentation of the disambiguation file (partly obsolete)
- CG meetings 2013: 11.02 // 18.02
- Syntax regression testing: run sh test/src/syntax/disambiguation_developertest.sh (you may have to adjust the path following $GTBIG; the files are in $GTBIG/gt/sme/corp)
- See also the general disambiguation page.
Pre- and postprocessing
- Documentation of the preprocessing of running text
- Documentation of inituppercase.regex (initial capitalisation) and allcaps.xfst, the file for words written in all caps. Note: the latter is presently not in use.
- Translation from Xerox-style to vislcg3-style output is done with the script lookup2cg
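
To illustrate what the conversion between the two styles involves, here is a minimal Python sketch. It is not the lookup2cg script itself. It assumes the usual Xerox lookup layout (wordform, a tab, then lemma and tags joined by "+", with a blank line between tokens, and "+?" in a third column for unknown words) and the usual vislcg3 cohort layout (wordform in "<...>", each reading on an indented line). The function name lookup_to_cg and the sample analyses are hypothetical, and the sketch ignores compounds, derivations and lemmas that contain "+".

```python
import re


def lookup_to_cg(lookup_output: str) -> str:
    """Convert Xerox lookup-style text into a vislcg3-style cohort listing.

    Simplified illustration only: assumes one analysis per line and
    a blank line between the analyses of consecutive tokens.
    """
    cg_lines = []
    # A blank line separates the analyses of one token from the next.
    for block in re.split(r"\n\s*\n", lookup_output.strip()):
        wordform = None
        readings = []
        for line in block.splitlines():
            fields = line.split("\t")
            if len(fields) < 2:
                continue  # skip lines that do not look like lookup output
            wordform = fields[0]
            if len(fields) > 2 and fields[2].strip() == "+?":
                # Unknown word: keep the form and mark it with "?"
                # (assumed convention for unanalysed tokens).
                readings.append(f'\t"{fields[1]}" ?')
                continue
            # "lemma+Tag1+Tag2" becomes '"lemma" Tag1 Tag2'.
            lemma, *tags = fields[1].split("+")
            readings.append("\t" + " ".join([f'"{lemma}"', *tags]))
        if wordform is not None:
            cg_lines.append(f'"<{wordform}>"')
            cg_lines.extend(readings)
    return "\n".join(cg_lines)


if __name__ == "__main__":
    # Illustrative input; the tags are examples, not a full analysis.
    sample = "viessu\tviessu+N+Sg+Nom\n\nlea\tleat+V+IV+Ind+Prs+Sg3\n"
    print(lookup_to_cg(sample))
```

Each token becomes one cohort: a wordform line followed by one tab-indented reading per analysis, which is the shape vislcg3 expects on input.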
There is a separate page on speller optimisations for SME.
Obsolete test reports, for reference
- A test plan for sme (obsolete)
- A test diary for sme (obsolete)
- Bug report sheet from the days before we got a bug report system (obsolete)
- Our earlier treatment of foreign words (obsolete)