Bug 2145 - Downcasing of derived proper nouns broken by hyperminimisation
Status: ASSIGNED
Alias: None
Product: smj lexicon
Classification: Unclassified
Component: stem lexica
Version: unspecified
Hardware: All All
Importance: P5 - Later normal
Assignee: Sjur Nørstebø Moshagen
Reported: 2016-01-11 14:22 CET by Sjur Nørstebø Moshagen
Modified: 2018-05-09 12:52 CEST

Description Sjur Nørstebø Moshagen 2016-01-11 14:22:44 CET
./configure --with-hfst --enable-hyperminimisation

gives the following:

[ 1/18][FAIL] Narvijkka+N+Prop+Sem/Plc+Sg+Gen+Der/k+N+Sg+Nom => Missing results: narvijkak
[ 1/18][FAIL] Narvijkka+N+Prop+Sem/Plc+Sg+Gen+Der/k+N+Sg+Nom => Unexpected results: Narvijkak

This is because hyperminimisation inserts an extra symbol at the very beginning of the net: @P.LEXNAME.Root@. This symbol breaks the context requirements of the downcasing regex.
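
To illustrate the mechanism, here is a minimal Python sketch. It is a toy model, not the actual downcasing regex from the smj sources: the analysis path is represented as a list of symbols, and the functions `downcase_strict` and `downcase_flag_tolerant` are hypothetical names invented for this example. The assumption is that the downcasing rule requires the capital letter to be the very first symbol of the path, so a leading flag diacritic such as @P.LEXNAME.Root@ prevents it from matching.

```python
import re

# Flag diacritics look like @P.LEXNAME.Root@ (operator, feature, optional value).
FLAG_RE = re.compile(r"^@[A-Z]\.[^@.]+(\.[^@.]+)?@$")


def is_flag(symbol):
    """Return True if the symbol is a flag diacritic."""
    return bool(FLAG_RE.match(symbol))


def downcase_strict(symbols):
    """Toy rule: downcase only if the FIRST symbol is an uppercase letter."""
    if symbols and len(symbols[0]) == 1 and symbols[0].isupper():
        return [symbols[0].lower()] + list(symbols[1:])
    return list(symbols)


def downcase_flag_tolerant(symbols):
    """Toy fix: skip leading flag diacritics before checking the context."""
    out = list(symbols)
    for i, sym in enumerate(out):
        if is_flag(sym):
            continue  # flags are invisible on the surface, so look past them
        if len(sym) == 1 and sym.isupper():
            out[i] = sym.lower()
        break  # only the first non-flag symbol is considered
    return out


def surface(symbols):
    """Join the symbols into a surface string, dropping flag diacritics."""
    return "".join(s for s in symbols if not is_flag(s))


plain = list("Narvijkak")                  # path without hyperminimisation
hypermin = ["@P.LEXNAME.Root@"] + plain    # path with the extra initial symbol

print(surface(downcase_strict(plain)))
print(surface(downcase_strict(hypermin)))
print(surface(downcase_flag_tolerant(hypermin)))
```

In this model the strict rule downcases the plain path but leaves the hyperminimised one capitalised, which mirrors the Missing/Unexpected pair in the test output above. Allowing optional flag diacritics in the left context is one possible way to make the rule robust; whether the real fix took that form is not stated in this report.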

Hyperminimisation is rarely used, but it can be turned on for speller optimisations by someone who is unaware of, or has forgotten, this bug. It should therefore be fixed.
Comment 1 Thomas Omma 2016-01-12 08:54:24 CET
hyperminimisation! :O
Comment 2 Sjur Nørstebø Moshagen 2018-05-09 12:52:55 CEST
This is now finally fixed, for the most part. There are still a few regressions in the spellers for words with CmpN/*Left tags, so I will keep this bug open until it is completely fixed.