Both usme and usmeNorm accept the compound luossačuopma, while the correct form is luosačuopma. For compounds consisting of Ani+Ani/Anipart/Food we should only accept Acc/Gen of the first part of the compound.

~ $ usme
0%>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>100%
luosačuopma
    luosačuopma   luossa+Ani+N+SgGenCmp+Cmp#čuopma+N+Sg+Nom
    luosačuopma   luosačuopma+N+Sg+Nom
luossačuopma
    luossačuopma  luossa+Ani+N+SgNomCmp+Cmp#čuopma+N+Sg+Nom

~ $ usmeNorm
0%>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 100%
luossačuopma
    luossačuopma  luossa+Ani+N+SgNomCmp+Cmp#čuopma+N+Sg+Nom
luosačuopma
    luosačuopma   luossa+Ani+N+SgGenCmp+Cmp#čuopma+N+Sg+Nom
    luosačuopma   luosačuopma+N+Sg+Nom
I think this is a great way to make use of the semantic tags for the speller. We already have +Ani and +Food tags, and we could introduce +AniPart.
(In reply to comment #1)
> I think this is a great way to make use of the semantic tags for the speller.
> We have already got +Ani and +Food tags, we could introduce +AniPart

Who should write the rules? Where should they be written?
add Linda, one two three
Hello, this is the bug that I was talking about in Uppsala. At first I thought it was a matter for the grammar checker, but if possible I would rather have it in the speller.

This should always be in SgGen:

    boazu+CmpN/SgN+CmpN/SgG+CmpN/PlG+Sem/Ani:boah'cu BOAZU "reindeer N" ;

when one of these is the second part:

    juolgi+Sem/Body:juol'gi AIGI ;
    juolut+CmpN/SgN+CmpN/SgG+CmpN/DefPlGen+Sem/Plant:juoluh GAHPIRLONGSHORT ;
    márfi+Sem/Food:már'fi GOAHTI-I ;

Compound tagging? Like CmpN/AniSgG? Or flags?
This is an amazing bug. CmpN/LeftAniSgG: is this how we do it? Take boazu + juolgi: when "juolgi" has the tag CmpN/LeftAniSgG, the speller rules out *boazojuolgi and only suggests bohccojuolgi.
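
To make the intended restriction concrete, here is a minimal sketch in Python that filters analysis strings of the kind shown in the first comment. It is not how the speller itself would enforce the rule. The semantic classes of the second parts (in particular that čuopma counts as Food) and the set of restricting classes are assumptions made for illustration, and `is_allowed` is a hypothetical helper, not part of any existing tool.

```python
# Hypothetical semantic classes of second parts. In the real lexicon this
# information would come from the Sem/ tags on the lexc entries.
SECOND_PART_SEM = {"čuopma": "Food", "juolgi": "Body", "márfi": "Food"}

# Assumed set of second-part classes that force SgGen on an Ani first part.
RESTRICTING_SEM = {"Ani", "Body", "Food"}


def is_allowed(analysis: str) -> bool:
    """Return False for Ani + restricted-class compounds whose first part
    is not in the singular genitive compound form."""
    if "+Cmp#" not in analysis:
        return True  # not analysed as a compound, nothing to check
    first, second = analysis.split("+Cmp#", 1)
    first_tags = first.split("+")[1:]      # tags after the lemma
    second_lemma = second.split("+", 1)[0]
    if "Ani" not in first_tags:
        return True  # the rule only concerns Ani first parts
    if SECOND_PART_SEM.get(second_lemma) not in RESTRICTING_SEM:
        return True  # second part does not trigger the restriction
    return "SgGenCmp" in first_tags


for a in ("luossa+Ani+N+SgGenCmp+Cmp#čuopma+N+Sg+Nom",
          "luossa+Ani+N+SgNomCmp+Cmp#čuopma+N+Sg+Nom"):
    print(a, is_allowed(a))
```

Under these assumptions the SgGenCmp analysis is kept and the SgNomCmp analysis is rejected, which matches the behaviour requested for luosačuopma versus luossačuopma.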
Sjur, what is the status of compound-tags in hfst-speller?
(In reply to comment #6)
> Sjur, what is the status of compound-tags in hfst-speller?

OK, I found this:

So, the only reasonable way to handle this is by using flag diacritics. But these tags are not flags, and can't be turned into flags either (that would break the PLX conversion). What is needed is a flag diacritic system parallel to the existing tag system, implementing the same semantics that way. It will NOT be pretty, and we need to devote some time to it to get it right. But it is the only practical way to solve this that I can see.
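
As a rough illustration of what a parallel flag diacritic system could look like, the Python sketch below simulates how P (set), R (require), D (disallow) and C (clear) flags accept or reject a path. The feature name LEFT and the values ANINOM and ANIGEN are invented for this example, as is the placement of the flags on the first and second parts; the real design would have to be worked out in the lexc sources.

```python
import re

# Matches flags such as @P.LEFT.ANINOM@ or @D.LEFT@ (value is optional).
FLAG = re.compile(r"@([PRDC])\.([A-Z]+)(?:\.([A-Z]+))?@")


def path_is_valid(symbols):
    """Walk a path of symbols and apply the P/R/D/C flag diacritics."""
    state = {}
    for sym in symbols:
        m = FLAG.fullmatch(sym)
        if m is None:
            continue  # ordinary symbol, not a flag
        op, feat, val = m.groups()
        if op == "P":                      # set feature to value
            state[feat] = val
        elif op == "C":                    # clear feature
            state.pop(feat, None)
        elif op == "R":                    # require feature (= value)
            if val is None:
                if feat not in state:
                    return False
            elif state.get(feat) != val:
                return False
        elif op == "D":                    # disallow feature (= value)
            if val is None:
                if feat in state:
                    return False
            elif state.get(feat) == val:
                return False
    return True


# Hypothetical flag placement: the first part records its compound form,
# and the restricted second part disallows a nominative Ani first part.
nom_path = ["luossa", "@P.LEFT.ANINOM@", "#", "@D.LEFT.ANINOM@", "čuopma"]
gen_path = ["luosa", "@P.LEFT.ANIGEN@", "#", "@D.LEFT.ANINOM@", "čuopma"]
print(path_is_valid(nom_path), path_is_valid(gen_path))
```

With this placement the second part can still stand alone as a word, because the disallow flag only fails when a nominative Ani first part has set the feature earlier on the path.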
No need to have Biret Ánne, Berit Merete, Inga and Ritva on the CC list anymore. Added Sandra.