Bug 2714 - the correct form is not among the spellchecker suggestions
Summary: the correct form is not among the spellchecker suggestions
Status: ASSIGNED
Alias: None
Product: HFST spellers
Classification: Unclassified
Component: Suggestions, common
Version: unspecified
Hardware: Macintosh Other
Importance: P5 - Later normal
Assignee: Kevin Brubeck Unhammer
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2020-12-30 18:23 CET by Linda Wiechetek
Modified: 2021-12-07 13:04 CET
CC List: 4 users

See Also:


Description Linda Wiechetek 2020-12-30 18:23:26 CET
While testing the grammar checker and working with the YAML tests, I noticed a number of cases where there are either no suggestions at all or the correct suggestion is missing.

For example here:

Don čuoččut gárdde siste, luite ealu, eallu manai, ja olbmot vuoddjájedje skohteriiguin {maái}${maŋŋái}.

"<maái>"
        "mađi" Adv <W:31.9512> <WA:11.9512> <spelled> "mađi"S PROTECT:3470 SELECT:3685 &SUGGESTWF &typo #16->16 ADD:8753:spelled
typo
        "mađi" CS <W:31.9512> <WA:11.9512> <spelled> "mađi"S PROTECT:3470 SELECT:3685 &SUGGESTWF &typo #16->16 ADD:8753:spelled
typo
        "mađđi" N Sem/Dummytag Sg Acc <W:31.9512> <WA:11.9512> <spelled> "mađi"S PROTECT:3470 SELECT:3685 &SUGGESTWF &typo #16->16 ADD:8753:spelled
typo
        "mađđi" N Sem/Dummytag Sg Gen <W:31.9512> <WA:11.9512> <spelled> "mađi"S PROTECT:3470 SELECT:3685 &SUGGESTWF &typo #16->16 ADD:8753:spelled
typo
        "mađi" Po <W:31.9512> <WA:11.9512> <spelled> "mađi"S PROTECT:3470 SELECT:3685 &SUGGESTWF &typo #16->16 ADD:8753:spelled
typo
        "mai" Adv <W:32> <WA:12> <spelled> "mai"S PROTECT:3470 SELECT:3685 &SUGGESTWF &typo #16->16 ADD:8753:spelled
typo
        "mái" Adv <W:34.3018> <WA:14.3018> <spelled> "mái"S PROTECT:3470 SELECT:3685 &SUGGESTWF &typo #16->16 ADD:8753:spelled
typo
        "mái" N Sem/Time Sg Acc <W:34.3018> <WA:14.3018> <spelled> "mái"S PROTECT:3470 SELECT:3685 &SUGGESTWF &typo #16->16 ADD:8753:spelled
typo
        "mái" N Sem/Time Sg Nom <W:34.3018> <WA:14.3018> <spelled> "mái"S PROTECT:3470 SELECT:3685 &SUGGESTWF &typo #16->16 ADD:8753:spelled
typo
        "mái" N Sem/Time Sg Gen <W:34.3018> <WA:14.3018> <spelled> "mái"S PROTECT:3470 SELECT:3685 &SUGGESTWF &typo #16->16 ADD:8753:spelled
typo
        "manni" N Sem/Food Sg Acc <W:35.3018> <WA:15.3018> <spelled> "mani"S PROTECT:3470 SELECT:3685 &SUGGESTWF &typo #16->16 ADD:8753:spelled
typo
        "manni" N Sem/Food Sg Gen <W:35.3018> <WA:15.3018> <spelled> "mani"S PROTECT:3470 SELECT:3685 &SUGGESTWF &typo #16->16 ADD:8753:spelled
typo
        "mannat" V <AG-Nom-Any> <gitta> <eret> <rasta> <badjel> <birra> <oktii><RF-Com-Any> <oktii> <TH-Nom-*Ani><MA-Adv-Manner> <MA-Adv-Manner> <IN-Com-Veh> <XT-Acc-Measure> <SO-luhtte-Ani> <DE-Ill-Plc> <DE-sisa-Build> <DE-lusa-Ani> <PT-Gen-Plc><DE-Ill-Any> <PT-Gen-Plc> <PT-rastá-Plc> <PT-meaddel-Plc> <PT-čađa-Plc> <PT-bokte-Plc> <SO-Loc-*Ani><DE-Ill-*Ani> <SO-Loc-*Ani> <CO-mielde-Ani> <LO-luhtte-Any> <LO-Loc-Plc> <DE-Ill-Plc><PU-Inf> <BE-Ill-Ani><PU-Ess-Any> <PU-Inf> <PU-AktioEss> <RO-Ess-Any> IV Ind Prt Sg3 <W:35.9385> <WA:9.93848> <spelled> "manai"S SUBSTITUTE:2769 SUBSTITUTE:2929 SUBSTITUTE:2978 SUBSTITUTE:2987 SUBSTITUTE:2999 SUBSTITUTE:3138 SUBSTITUTE:3171 SUBSTITUTE:3720 SUBSTITUTE:3801 SUBSTITUTE:3805 SUBSTITUTE:3874 SUBSTITUTE:3876 SUBSTITUTE:3881 SUBSTITUTE:3886 SUBSTITUTE:3888 SUBSTITUTE:3890 SUBSTITUTE:3973 SUBSTITUTE:3975 SUBSTITUTE:3982 SUBSTITUTE:3984 SUBSTITUTE:4012 SUBSTITUTE:4093 SUBSTITUTE:4166 SUBSTITUTE:4182 SUBSTITUTE:4614 SUBSTITUTE:4616 SUBSTITUTE:4668 SUBSTITUTE:4677 SUBSTITUTE:4683 SUBSTITUTE:4709 SUBSTITUTE:4744 SUBSTITUTE:4879 PROTECT:3470 SELECT:3685 &SUGGESTWF &typo #16->16 SETCHILD:4881 SETCHILD:4972 SETCHILD:4881 SETCHILD:4972 SETCHILD:4881 SETCHILD:4972 ADD:8753:spelled
typo
        "mun" Pron Pers Du1 Nom <W:38.7979> <WA:12.7979> <spelled> "moai"S PROTECT:3470 SELECT:3685 &SUGGESTWF &typo #16->16 ADD:8753:spelled
typo
        "málli" N Sem/Food Sg Acc <W:41.3018> <WA:15.3018> <spelled> "máli"S PROTECT:3470 SELECT:3685 &SUGGESTWF &typo #16->16 ADD:8753:spelled
typo
        "málli" N Sem/Food Sg Gen <W:41.3018> <WA:15.3018> <spelled> "máli"S PROTECT:3470 SELECT:3685 &SUGGESTWF &typo #16->16 ADD:8753:spelled
typo
        "masai" N Sem/Hum Sg Nom <W:41.3018> <WA:15.3018> <spelled> "masai"S PROTECT:3470 SELECT:3685 &SUGGESTWF &typo #16->16 ADD:8753:spelled
typo
        "miinai" Pron Indef Ill <W:41.3018> <WA:15.3018> <spelled> "masai"S PROTECT:3470 SELECT:3685 &SUGGESTWF &typo #16->16 ADD:8753:spelled
typo
        "miinai" Pron Indef Loc <W:41.3018> <WA:15.3018> <spelled> "masai"S PROTECT:3470 SELECT:3685 &SUGGESTWF &typo #16->16 ADD:8753:spelled
typo
        "maya" N Sem/Hum Sg Ill <W:41.3018> <WA:15.3018> <spelled> "mayai"S PROTECT:3470 SELECT:3685 &SUGGESTWF &typo #16->16 ADD:8753:spelled
typo
        "mii" Pron Indef Sg Nom <W:45.2939> <WA:5.29395> <spelled> "mii"S PROTECT:3470 SELECT:3685 &SUGGESTWF &typo #16->16 ADD:8753:spelled
typo
        "mii" Pron Interr Sg Nom <W:45.2939> <WA:5.29395> <spelled> "mii"S PROTECT:3470 SELECT:3685 &SUGGESTWF &typo #16->16 ADD:8753:spelled
typo
        "mun" Pron Pers Pl1 Nom <W:45.2939> <WA:5.29395> <spelled> "mii"S PROTECT:3470 SELECT:3685 &SUGGESTWF &typo #16->16 ADD:8753:spelled
typo
        "mii" Pron Rel Sg Nom <W:45.2939> <WA:5.29395> <spelled> "mii"S PROTECT:3470 SELECT:3685 &SUGGESTWF &typo #16->16 ADD:8753:spelled
typo
;       "maái" ? SELECT:3685
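
A minimal sketch of how a case like this could be checked automatically. It assumes the {error}${correction} markup used in the test sentence above, and it assumes that the quoted token followed by S in each reading (for example "mađi"S) is the suggested surface form, as the trace appears to show. The function names are hypothetical and not part of any Divvun tool.

```python
import re

# {error}${correction} pairs as written in the test sentence.
MARKUP = re.compile(r"\{([^{}]+)\}\$\{([^{}]+)\}")
# A quoted token immediately followed by S, delimited by whitespace.
# Assumption: this tag carries the suggested word form.
SUGGESTED = re.compile(r'(?<!\S)"([^"\s]+)"S(?!\S)')


def expected_corrections(sentence):
    """Return (error, expected correction) pairs found in the markup."""
    return MARKUP.findall(sentence)


def suggested_forms(trace):
    """Return the distinct suggested forms in order of first appearance."""
    seen = []
    for form in SUGGESTED.findall(trace):
        if form not in seen:
            seen.append(form)
    return seen


def missing_corrections(sentence, trace):
    """Return the pairs whose expected correction is not suggested."""
    forms = suggested_forms(trace)
    return [pair for pair in expected_corrections(sentence)
            if pair[1] not in forms]
```

Applied to the sentence and trace above, missing_corrections would report the pair (maái, maŋŋái), since no reading carries "maŋŋái"S.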
Comment 1 Linda Wiechetek 2021-01-04 12:17:22 CET
In this sentence, "homopárada" ending in -a should be suggested. Instead, "homopárade" ending in -e is suggested, based on the assumption that the word is a compound with "párra" containing an a-á error (Spellrelax is at work here). Since Spellrelax steps in, we no longer get a regular suggestion from the spellchecker.


dahje bonju beivviid oktavuođas homoparade, gos sámi lesbiskat ja homofiilat vuosttaš háve čájehedje iežaset sámi duogáža.




"<homoparade>"
        "párra" N Sem/Ani_Group_Hum Sg Acc PxDu2 Err/Spellrelax <W:0.0> <cohort-with-dynamic-compound> ADD:2208 @<OBJ MAP:23728:IfNoTransV> &typo #5->5 ADD:8761:Err/Orth-any
                "homo" N Sem/Hum Cmp/SgGen Cmp <W:0.0> #5->5
typo
        "párra" N Sem/Ani_Group_Hum Sg Acc PxDu2 <W:0.0> <cohort-with-dynamic-compound> ADD:2208 @<OBJ MAP:23728:IfNoTransV> &typo &SUGGEST #5->5 ADD:8761:Err/Orth-any COPY:8770:Err/Orth-any
                "homo" N Sem/Hum Cmp/SgGen Cmp <W:0.0> #5->5
homo+N+Cmp/SgGen+Cmp#párra+N+Sg+Acc+PxDu2       homopárade
        "párra" N Sem/Ani_Group_Hum Sg Gen PxDu2 Err/Spellrelax <W:0.0> <cohort-with-dynamic-compound> ADD:2208 @<ADVL MAP:23141:r521 &typo #5->5 ADD:8761:Err/Orth-any
                "homo" N Sem/Hum Cmp/SgGen Cmp <W:0.0> #5->5
typo
        "párra" N Sem/Ani_Group_Hum Sg Gen PxDu2 <W:0.0> <cohort-with-dynamic-compound> ADD:2208 @<ADVL MAP:23141:r521 &typo &SUGGEST #5->5 ADD:8761:Err/Orth-any COPY:8770:Err/Orth-any
                "homo" N Sem/Hum Cmp/SgGen Cmp <W:0.0> #5->5
homo+N+Cmp/SgGen+Cmp#párra+N+Sg+Gen+PxDu2       homopárade
        "párra" N Sem/Ani_Group_Hum Sg Acc PxDu2 Err/Spellrelax <W:0.0> <cohort-with-dynamic-compound> ADD:2208 @<OBJ MAP:23728:IfNoTransV> &typo #5->5 ADD:8761:Err/Orth-any
                "homo" N Sem/Hum Cmp/SgNom Cmp <W:0.0> #5->5
typo
        "párra" N Sem/Ani_Group_Hum Sg Acc PxDu2 <W:0.0> <cohort-with-dynamic-compound> ADD:2208 @<OBJ MAP:23728:IfNoTransV> &typo #5->5 ADD:8761:Err/Orth-any COPY:8770:Err/Orth-any
                "homo" N Sem/Hum Cmp/SgNom Cmp <W:0.0> #5->5
typo
        "párra" N Sem/Ani_Group_Hum Sg Gen PxDu2 Err/Spellrelax <W:0.0> <cohort-with-dynamic-compound> ADD:2208 @<ADVL MAP:23141:r521 &typo #5->5 ADD:8761:Err/Orth-any
                "homo" N Sem/Hum Cmp/SgNom Cmp <W:0.0> #5->5
typo
        "párra" N Sem/Ani_Group_Hum Sg Gen PxDu2 <W:0.0> <cohort-with-dynamic-compound> ADD:2208 @<ADVL MAP:23141:r521 &typo #5->5 ADD:8761:Err/Orth-any COPY:8770:Err/Orth-any
               "homo" N Sem/Hum Cmp/SgNom Cmp <W:0.0> #5->5
typo
Comment 2 Linda Wiechetek 2021-01-04 16:27:17 CET
Here is another example, and I don't know why there are no suggestions:

{Lávvárdaga}${Lávvardaga} geassemánu 19. beaivve álgá ges kursa gos oahpat speallat golffa.


"<Lávvárdaga>"
        "Lávvárdaga" ? <firstCohort> &typo #1->1 ADD:8865:uncorrected-typos
typo
:
Comment 3 Sjur Nørstebø Moshagen 2021-10-27 21:53:58 CEST
(In reply to Linda Wiechetek from comment #2)
> Here is another example, and I don't know why there are no suggestions:
> 
> {Lávvárdaga}${Lávvardaga} geassemánu 19. beaivve álgá ges kursa gos oahpat
> speallat golffa.
> 
> 
> "<Lávvárdaga>"
>         "Lávvárdaga" ? <firstCohort> &typo #1->1 ADD:8865:uncorrected-typos
> typo
> :

I have no idea. It is the spellchecker in the grammar checker package that gives no suggestions. All "regular" spellcheckers give the correct suggestion in first place:


echo Lávvárdaga | divvunspell suggest -a tools/spellcheckers/se-desktop.zhfst 
Reading from stdin...
Input: Lávvárdaga		[INCORRECT]
Lávvardaga		43.35547
Lávvordaga		60.3018
Lávevárdaga		72.3018


echo Lávvárdaga | hfst-ospell -S tools/spellcheckers/se-desktop.zhfst 
"Lávvárdaga" is NOT in the lexicon:
Corrections for "Lávvárdaga":
lávvardaga    38.355469
lávvordaga    55.301800
básvárdaga    67.301804


echo '5 Lávvárdaga' | hfst-ospell-office tools/spellcheckers/se-desktop.zhfst 
@@ hfst-ospell-office is alive
&	Lávvardaga	Lávvordaga	Básvárdaga	Lávivárdaga	Gávuvárdaga


So to me this looks like a technical error that Kevin should look at. I am forwarding the bug report to him.
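
For comparing outputs like the ones above, here is a small sketch that parses the "form weight" lines pasted from divvunspell or hfst-ospell and reports the rank of the expected form. It works on pasted text only and does not call either tool; the function names are hypothetical.

```python
import re

# A suggestion line: one word form, whitespace, then a numeric weight.
SUGGESTION_LINE = re.compile(r"(\S+)\s+(\d+(?:\.\d+)?)")


def parse_suggestions(output):
    """Return (form, weight) pairs in the order they appear in the output."""
    suggestions = []
    for line in output.splitlines():
        match = SUGGESTION_LINE.fullmatch(line.strip())
        if match:
            suggestions.append((match.group(1), float(match.group(2))))
    return suggestions


def rank_of(expected, suggestions, ignore_case=True):
    """Return the 1-based rank of the expected form, or None if absent."""
    def key(form):
        return form.casefold() if ignore_case else form

    for rank, (form, _weight) in enumerate(suggestions, start=1):
        if key(form) == key(expected):
            return rank
    return None
```

Case is ignored by default because the hfst-ospell output above lists the corrections in lower case.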
Comment 4 Kevin Brubeck Unhammer 2021-12-06 15:58:27 CET
Oh, it is <LastCohort> / <FirstCohort> that keeps cg-spell from seeing that this is an unknown word. I will get it fixed.
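
To illustrate the kind of bug described here (this is not the libdivvun code, only a hypothetical sketch): if an unknown word is recognised by a reading whose only tag is "?", then an extra tag such as <firstCohort> makes the check fail, and the word is never sent to the speller. Both function names and the strict check itself are assumptions made for illustration.

```python
def tags_of(reading: str) -> list[str]:
    """Return the tags of a reading line, dropping the quoted lemma.

    Simplification: assumes the lemma contains no spaces.
    """
    return reading.strip().split()[1:]


def is_unknown_strict(reading: str) -> bool:
    # Hypothetical over-strict test: "?" must be the only tag.
    return tags_of(reading) == ["?"]


def is_unknown_tolerant(reading: str) -> bool:
    # Hypothetical tolerant test: "?" is present, other tags are allowed.
    return "?" in tags_of(reading)


plain = '"Lávvárdaga" ?'
with_position_tag = '"Lávvárdaga" ? <firstCohort>'

print(is_unknown_strict(plain), is_unknown_strict(with_position_tag))
print(is_unknown_tolerant(plain), is_unknown_tolerant(with_position_tag))
```

With the strict test, the sentence-initial reading from comment #2 would not count as unknown, which would match the observed lack of suggestions.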
Comment 5 Kevin Brubeck Unhammer 2021-12-07 13:04:59 CET
Should be fixed in https://github.com/divvun/libdivvun/commit/2d4086bfb3001deec458892daf0ed5bdef92ca96 ; new packages will probably arrive tomorrow.