UIT The arctic university of Norway > Giellatekno



Moments for further work

(1) Fuzzy search for examples (sentences, or word in context):

(pointer to Microsoft Research ASL Assistant by Michael Gamon and colleagues: http: //research.microsoft.com/en-us/projects/msreslassistant/ http: //msr-waypoint.com/pubs/81027/NAACL09-final.pdf The idea is to show the learner multiple alternatives of expressing something - in the Microsoft case related to "errors", but it could also allow the learner to explore alternative options of expressing what they want to express.

MA Thesis on discovering subphrasal repeats

(2) software engineering to support modular code(re)use and further development:

  • reorganise oahpa-code
    • modularise the code
    • testscripts
    • tags from lexc/FST
    • lemmas and translations from dict

(3) scaffolding feedback for learners

  • based on forms that are close, but not what the precise fst would provide
  • relate existing linguistic forms to the different reasons that they diverge from the precise target forms (e.g., close in phonetic form, orthographic form, semantically similar (or antonym), related through L2 process such as overgeneralization or transfer)i
  • more work on L2 FST
  • make a more robust system (in different ways..)

(4) Improving vocabulary learning

  • Evaluating existing open systems: Anki http://ankisrs.net/, cf. also Flashcard system for vocabulary learning http://gielese.no/
  • look at frequency, timing of repetition (work by Hedderik van Rijn), ...
  • Possibility for teachers and pupils to add words and word-lists
  • This is an area for improvement, with two components:
    • (a) using the Giellatekno resources to provide vocabulary lists that can be loaded into standard tools, such as Anki
    • (b) integrating language use (frequency) as ranking or filtering of output of Giellatekno tools for users (esp. for learners, who would e.g. get the output for

(5) Combining the perspective of language as a system (grammar) with language in use

  • Requires: collecting frequency data for words, based on a representative.
  • Current North-Saami corpus is good but not really balanced, esp. for beginning learners
  • Developing complexity analysis for Saami would make it possible to

(6) Learner Modeling

  • course login
    • testing, improvements
    • student modeling (both vocabulary and morphology training)
    • teachers can choose lemmas to tasks
    • Research on usage, logs (Trond's talk in Oulu, many other problems, involving programming)

(7) Visual Input Enhancement of authentic texts

  • Konteaksta
    • testing
    • Firefox-plugin
    • bookmark-plugin? - model from the dictionary
  • NDS - (online dictionary, currently not making use of sentence context)
    • use the Konteaksta-model to do the disambiguation
    • (one possible option for a good first task for a programmer)
  • Question-generation + answer evaluation for authentic texts
    • supports combination of form-learning in authentic, functional contexts ("incidental focus on form")
    • Michael Heilmann's dissertation (in Noah's ARK)
    • BA and MA thesis by Tobias Kolditz (MA), Eran Raveh (BA)
    • (could be a bigger project in the startup phase of the programmer)

(8) Small improvements, bugs, and programming tasks

  • Oahpa
    • text-to-speech for Leksa?
    • Sg option in morfa
    • level options in leksa
    • linguistic terminology in feedback vs course material
    • change localisation so it doesn't follow the choice done in the operating system (there is a dropdown menu, but currently what one choses is not reflected in the system)
  • Korp: info about that Korp doesn't function for Internet Explorer
  • Web-service (cgi-scripts)
    • tag explanations for generator
    • better sorting of generation
  • Text-to-speech to the students