Steps for Extending RGL to a Large Scale Translation Grammar We will add Dutch to the system of big translation grammars. =The Translate grammar= This is where we are $ pwd /Users/aarne/GF/lib/src/translator We start from files for German $ ls -l *Ger.gf -rw-r--r-- 1 aarne staff 1615550 Apr 10 23:38 DictionaryGer.gf -rw-r--r-- 1 aarne staff 3042 Jan 22 15:39 ExtensionsGer.gf -rw-r--r-- 1 aarne staff 662 Apr 9 11:14 TranslateGer.gf We make copies of these ones $ cp -p ExtensionsGer.gf ExtensionsDut.gf $ cp -p TranslateGer.gf TranslateDut.gf Then we change Ger->Dut in these files We take the common parts of a dictionary ; Ger doesn't have them this way but Spa does $ grep "L\." DictionarySpa.gf >DictionaryDut.gf $ grep "S\." DictionarySpa.gf >>DictionaryDut.gf Then we add a header, copying from DictionarySpa and changing Spa->Dut. And of course a "}" to the end! concrete DictionarySpa of Dictionary = CatSpa ** open ParadigmsSpa, MorphoSpa, IrregSpa, (L=LexiconSpa), (S=StructuralSpa), Prelude in { We can now try compile this, using -s to suppress 60k warnings about missing linearizations: $ gf -s DictionaryDut.gf This goes fine - but what about the translator itself? $ gf -s TranslateDut.gf File TenseDut.gf does not exist. Just change it to TenseX as in many other languages, as Dutch has no special tenses. Try again (in GF shell): > i TranslateDut.gf File ConstructionDut.gf does not exist. Let us just comment this inheritance out from TranslateDut, like in some other languages where this module is not yet available. The same with DocumentationDut. ---- ConstructionDut, ---- DocumentationDut, I use four dashes for comments meaning "to be fixed soon". Try again: > i TranslateDut.gf File ChunkDut.gf does not exist. This is more critical, since we want a robust translator! Let's fix this: $ cd ../chunk/ $ cp -p ChunkGer.gf ChunkDut.gf $ cd ../translator/ Again, go to ChunkDut.gf and change Ger->Dut. Also look for double quotes and change strings in them. E.g. copula_inf_Chunk = ss "sein" --> copula_inf_Chunk = ss "zijn" Now try again (in GF): > i TranslateDut.gf Warning: In inherited module Extensions, ... no occurrence of element BaseVPI Now we notice that ExtraDut is just a dummy module. We comment out all references to it in ExtensionsDut; of course we will fix ExtraDut later. E.g. ---- BaseVPI = E.BaseVPI ; We could continue commenting out things that don't compile. We could just give up and comment out ExtensionsDut from TranslateDut. It doesn't use many functions anyway... ---- ExtensionsDut [CompoundCN,AdAdV,UttAdV,ApposNP,MkVPI, MkVPS, PredVPS, PassVPSlash, PassAgentVPSlash], Unfortunately, ChunkDut also needs it. So let's at least make it compile by commenting out all offensive functions. There is not much left, and in ChunkDut we also comment out whatever the compiler complains about, with four dashes. We obtain concrete ChunkDut of Chunk = CatDut ---- , ExtensionsDut ** ChunkFunctor - [UseVC, VPS_Chunk, emptyNP, VPI_Chunk] with (Syntax = SyntaxDut), (Extensions = ExtensionsDut) ** open SyntaxDut, (E = ExtensionsDut), Prelude, ResDut, (P = ParadigmsDut) in { Et voilà: > i TranslateDut.gf linking ... OK Languages: TranslateDut Let us try it: > gr | l -treebank Translate: ChunkPhr (PlusChunk fullstop_Chunk (OneChunk refl_SgP1_Chunk)) TranslateDut: * . mij zelf Let us make it compilable in GF/lib/src/Makefile by adding entries for TranslateDut and Translate11 - since we now have 11 languages. Again, we can look for TranslateGer and make a copy beside it, as well as Translate10: TranslateGer: TranslateGer.pgf TranslateDut: TranslateDut.pgf TranslateDut.pgf:: ; $(GFMKT) -name=TranslateDut translator/TranslateDut.gf # Without dependencies: Translate11: $(GFMKT) -name=Translate11 $(TRANSLATE11) +RTS -K32M # With dependencies: Translate11.pgf: $(TRANSLATE10) $(GFMKT) -name=Translate11 $(TRANSLATE11) +RTS -K32M Since we have everything up to date in Translate10, let us just add the necessary new things to include Dut: $ pwd /Users/aarne/GF/lib/src $ make TranslateDut.pgf $ make Translate11 We can first try it in the plain C runtime: $ pgf-translate Translate11.pgf Phr TranslateEng TranslateDut > what is this 0.07 sec [18.070923] ChunkPhr (OneChunk (QS_Chunk (UseQCl (TTAnt TPres ASimul) PPos (QuestIComp (CompIP whatSg_IP) (DetNP (DetQuant this_Quant NumSg)))))) * wat is dit wat is dit > can we translate now 0.19 sec [35.258053] ChunkPhr (OneChunk (QS_Chunk (UseQCl (TTAnt TPres ASimul) PPos (QuestCl (PredVP (UsePron we_Pron) (AdvVP (ComplVV can_1_VV (UseV translate_V)) now_Adv)))))) * kunnen we nu [translate_V] kunnen we nu [translate_V] What about the web application? First make the new grammar accessible: cd GF/src/www/robust/ $ ls App10.pgf Translate10.pgf Translate8.pgf $ ln -s /Users/aarne/GF/lib/src/Translate11.pgf Then update the reference to this grammar - change Translate10 to Translate11 in one place: $ cd .. $ grep Translate10 */*.js js/gftranslate.js:gftranslate.jsonurl="/robust/Translate10.pgf" Try start the gf server gf -server --document-root=/Users/aarne/GF/src/www/ Point your browser to http://localhost:41296/wc.html Wait a bit, and you will see Dutch among the available languages! =Building the Android app= Navigate to the App directory and create AppDut; also change Ger->Dut as before $ pwd /Users/aarne/GF/examples/app $ cp -p AppGer.gf AppDut.gf Extend the Makefile as before: TRANSLATE11=$(TRANSLATE10) AppDut.pgf # Without dependencies: App11: $(GFMKT) -name=App11 $(TRANSLATE11) +RTS -K200M Make it: $ make AppDut.pgf $ make App11 Check that all languages are consistently included: $ gf +RTS -K200M App11.pgf Languages: AppBul AppChi AppDut AppEng AppFin AppFre AppGer AppHin AppIta AppSpa AppSwe App> l house_N къща 房 子 huis house talo maison Haus शाला casa casa hus Now follow the instructions in README in the app/ directory. You also need to add to Translator.java, in a place near AppGer reference, new Language("nl-NL", "Dutch", "AppDut", R.xml.qwerty), =The TopDictionary= Once you have DictionaryDut, go to GF/lib/src/translate/ and do $ ghci Prelude> :l CheckDict.hs *Main> createConcrete "Dut" This creates the file GF/lib/src/translate/todo/tmp/TopDictionaryDut.gf, which has words in frequency order. Copy this one level up, to GF/lib/src/translate/todo/TopDictionaryDut.gf, and follow the instructions in http://www.grammaticalframework.org/lib/src/translator/todo/check-dictionary.html to improve the dictionary in frequency order.