When applying the pipeline

    hfst-tokenise --giella-cg -W $GTHOME/langs/sms/tools/tokenisers/tokeniser-disamb-gt-desc.pmhfst |
      vislcg3 -g $GTHOME/langs/sms/src/syntax/disambiguator.cg3

to the sms text

    De suu teâđast ... son käggõõđi teä suu koʹdde, pačču di koʹdde.

the output contains an unwanted Modifier Letter Apostrophe (U+02BC) that is not present in the input. (This is one of the characters in the sms orthography.)

    "<De>"
        "de" Adv Sem/Time
        "de" CC
    :
    "<suu>"
        "son" Pron Pers Sg3 Acc
        "son" Pron Pers Sg3 Gen
    :
    "<teâđast>"
        "teâtt" N Sg Loc
        "teâđast" Adv
        "teâđsted" V Ind Prs Sg3
    :
    "<...>"
        "..." CLB
    "<>"
        "ʼ" N Symbol
    :
    "<son>"
        "son" Pron Pers Sg3 Nom
    :
    "<käggõõđi>"
        "käggõõđi" ?
    :
    "<teä>"
        "teä" Adv Sem/Time Sem/Time
    :
    "<suu>"
        "son" Pron Pers Sg3 Acc
        "son" Pron Pers Sg3 Gen
    :
    "<koʹdde>"
        "kåʹdded" V Ind Prt Pl3
    "<,>"
        "," CLB
    "<>"
        "ʼ" N Symbol
    :
    "<pačču>"
        "pääččad" V Ind Prt Pl3
    :
    "<di>"
        "di" CC
    :
    "<koʹdde>"
        "kåʹdded" V Ind Prt Pl3
    "<.>"
        "." CLB
    "<>"
        "ʼ" N Symbol
    :\n
    "<>"
        "ʼ" N Symbol

In the analysis, epsilon has become Modifier Letter Apostrophe (U+02BC) in seemingly random places. Each occurrence shows up as a cohort with an empty wordform, "<>", and the reading "ʼ" N Symbol.

One workaround is to comment out "Modifier Letter Apostrophe U+02BC" in the "src/morphology/generated_files/symbols.lexc" file locally. This immediately suppresses the unwanted output.

Of course, the introduction of the Modifier Letter Apostrophe might be associated with this spellrelax rule:

    ʼ (->) 0 , # U+02BC MODIFIER LETTER APOSTROPHE, accept also ZERO

Should native symbols be removed from, or commented out of, symbols.lexc?
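
To check whether a local change (such as commenting the symbol out of symbols.lexc) actually removes the spurious cohorts, it may help to scan the pipeline output automatically. The following is only a minimal sketch, not part of the Giella infrastructure: the function name find_empty_cohorts is hypothetical, and it assumes the usual CG stream layout in which each cohort starts with a "<wordform>" line and its readings follow on indented lines.

```python
import re

# A cohort header is a line of the form "<wordform>" (assumed CG stream layout).
COHORT_RE = re.compile(r'^"<(.*)>"\s*$')


def find_empty_cohorts(cg_text):
    """Return (line_number, readings) for every cohort whose wordform is empty."""
    hits = []
    current = None  # readings of the empty cohort being collected, if any
    for lineno, line in enumerate(cg_text.splitlines(), start=1):
        match = COHORT_RE.match(line)
        if match:
            current = None
            if match.group(1) == "":
                current = []
                hits.append((lineno, current))
        elif current is not None and line[:1] in (" ", "\t") and line.strip():
            # Indented, non-blank line: a reading of the current empty cohort.
            current.append(line.strip())
    return hits


# Small excerpt modelled on the output quoted above.
sample = "\n".join([
    '"<...>"',
    '\t"..." CLB',
    '"<>"',
    '\t"\u02bc" N Symbol',
    ': ',
    '"<son>"',
    '\t"son" Pron Pers Sg3 Nom',
])

for lineno, readings in find_empty_cohorts(sample):
    print(lineno, readings)
```

In practice the text would come from the vislcg3 output rather than from the hard-coded sample, for example by reading sys.stdin. An empty result after a change suggests that the empty-wordform cohorts are gone for that input, though it says nothing about whether the change is the right fix.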