[08:50:30] hiho
[08:50:41] *** bakibour is now known as bourbaki
[09:55:07] hi bourbaki!
[09:55:08] 11:48:46 < bakibour> Can I add some rule like UnknownToken = {s = #someregularexpression#} ; ?
[09:55:11] 11:48:55 < chru> using regular expressions will turn out to be impossible, I think, because all strings need to be known at compile time
[09:55:14] 11:49:24 < bakibour> Hm, ok, that makes it really hard to use with vague texts though.
[09:55:17] 11:49:36 < bakibour> vague as in if you start with a small dictionary
[09:55:19] there is a category Symbol
[09:56:35] and it produces stuff like this: SUnmarked (PhrUtt NoPConj (UttPrS (UseCl_none (PredVP_none (UsePron it_Pron) (ComplVV_none (UseV_v ASimul TPres PPos (LiftVV must_VV)) (InfVP_none (UseNP_none ASimul ? PPos (UsePN (SymbPN (MkSymb "James"))))))))) NoVoc)
[09:57:29] but the string "James" needs to be in the grammar, no?
[09:57:32] no
[10:03:34] http://www.grammaticalframework.org/lib/doc/synopsis.html#toc117
[10:08:06] chru: sometimes when there are hundreds of completely stupid parses, it's because the grammar is treating completely normal words like "it" as Symbol
[10:09:17] I parsed "él es James"
[10:09:18] PhrUtt NoPConj (UttQS (UseQCl (TTAnt TPres ASimul) PPos (QuestCl (PredVP (UsePN (SymbPN (MkSymb "\233l"))) (UseComp (CompNP (UsePN (SymbPN (MkSymb "James"))))))))) NoVoc
[10:09:21] PhrUtt NoPConj (UttQS (UseQCl (TTAnt TPres ASimul) PPos (QuestCl (PredVP (UsePron he_Pron) (UseComp (CompNP (UsePN (SymbPN (MkSymb "James"))))))))) NoVoc
[10:09:24] PhrUtt NoPConj (UttQS (UseQCl (TTAnt TPres ASimul) PPos (QuestCl (PredVP (UsePron it_Pron) (UseComp (CompNP (UsePN (SymbPN (MkSymb "James"))))))))) NoVoc
[10:09:53] mh, I see.
[10:12:31] so in the case of non-numerical strings, symbols will always be parsed as PNs? and those PNs are combined by UsePN + UseComp etc.? looks like a lot is parseable then (although the parses are probably not that informative).
[10:12:53] I think so, yeah
[10:13:24] hmm, I'll check
[10:14:14] PN, NP, Card, Ord and S
[10:17:38] interesting
[11:08:11] inariksit: hey, sorry, was in a meeting, reading back
[11:13:45] So I can get my RegEx in there also?
[11:14:19] What parser does GF use? Earley?
[11:24:14] bourbaki: no, you can't parse by regex; the Symbol grammar uses GF's string literal, which is hardcoded in the runtime
[11:27:06] GF uses some kind of chart parsing, but I don't know if it's strictly Earley
[11:27:48] Can I extend built-in functions like that symb table etc.?
[11:28:09] Can I use GF to infer parts of a sentence?
[11:28:17] e.g. like
[11:28:28] i googoo you
[11:29:00] googoo might be inferred as a verb
[11:29:06] to be :)?
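[Editor's note: since unknown words surface in the printed trees above as SymbPN (MkSymb "..."), an application can recover them from a tree's string form without the words ever being in the grammar. A minimal sketch in Python; the helper name `symb_literals` and the regex are mine, not part of any GF API:]

```python
import re

# A GF abstract tree printed as a string, as shown in the log above.
tree = ('PhrUtt NoPConj (UttQS (UseQCl (TTAnt TPres ASimul) PPos (QuestCl '
        '(PredVP (UsePron he_Pron) (UseComp (CompNP (UsePN (SymbPN '
        '(MkSymb "James"))))))))) NoVoc')

def symb_literals(tree_str):
    """Pull out the literal strings wrapped by SymbPN (MkSymb "...")."""
    return re.findall(r'MkSymb "((?:[^"\\]|\\.)*)"', tree_str)

print(symb_literals(tree))  # → ['James']
```

[In a real application one would walk the tree via the runtime bindings rather than scrape the printed form, but the idea is the same: the Symbol categories mark exactly where out-of-vocabulary material landed.]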
[11:29:56] sounds like you want the robust parser, which can parse chunks and give you a tree with metavariable holes in it
[11:30:07] that's part of the C runtime (i.e. not the Haskell runtime)
[11:30:25] you can use it from Python too via bindings
[11:30:41] C and Haskell are fine, not the py guy ;)
[11:31:04] Does the book cover all this stuff?
[11:31:30] chru: I was thinking about purchasing your book; not sure if it helps me with the stuff I intended to do, though :)
[11:32:09] I'm not sure either ;)
[11:32:41] it has lots of stuff on the semantic side, but nothing about robustness or wide coverage
[11:32:46] Do you at least get royalties?
[11:33:18] yes, but not a lot
[11:33:38] I wrote an article some years back and got like 15 USD every 3 years :)
[11:33:48] Can't make a living on writing stuff...
[11:34:26] Well, I thought that if I was able to parse "holes" and infer types, the user would be able to extend the system
[11:34:27] at least not on writing academic stuff. computational semantics is not a bestseller ;)
[11:34:34] hehe
[11:34:42] It should be
[11:35:02] I still think that once stuff like this works, everyone will want to jump on the bandwagon.
[11:36:25] one thing I thought about once: have a dummy expression for each semantic type (predicate, relation, entity, or NP, VP, whatever) that is linearized in all forms as a particular string, e.g. 'XXX'. then when a sentence cannot be parsed, try replacing substrings with 'XXX' -- once it parses, you've found a hole (and you know the semantic type of that hole).
[11:37:33] in your example, 'I XXX you' would parse if you have a verb dummy
[11:37:51] Sounds cool, would that be much work to get into GF?
[11:38:25] you just add the dummy expressions to your grammar. all the rest should be in the outside application that calls GF
[11:39:03] but there are a lot of possibilities to replace one or more substrings, so this might explode a bit...
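[Editor's note: the dummy-substitution idea described above lives entirely in the application around GF, so it can be sketched independently. A toy Python sketch, assuming a parser oracle: the stub `parses` stands in for an actual call into a GF runtime, and `find_holes` enumerates every contiguous token span replaced by the dummy — which also illustrates the combinatorial blow-up just mentioned:]

```python
def parses(tokens, known):
    # Toy stand-in for a GF parser: a sentence "parses" iff every token
    # is in the (tiny) known vocabulary. A real oracle would call GF.
    return all(t in known for t in tokens)

def find_holes(sentence, known, dummy="XXX"):
    """Replace each contiguous token span with the dummy and report the
    spans whose replacement makes the sentence parseable."""
    tokens = sentence.split()
    n = len(tokens)
    holes = []
    for i in range(n):
        for j in range(i + 1, n + 1):
            candidate = tokens[:i] + [dummy] + tokens[j:]
            if parses(candidate, known):
                holes.append((i, j, " ".join(tokens[i:j])))
    return holes

known = {"i", "you", "XXX"}
holes = find_holes("i googoo you", known)
print(holes)
# Several spans succeed -- the "explodes a bit" problem. The narrowest
# hole is the interesting one:
print(min(holes, key=lambda h: h[1] - h[0]))  # → (1, 2, 'googoo')
```

[With one dummy per semantic type, the same loop would try each dummy in turn; whichever dummy makes the sentence parse tells you the type of the hole, e.g. a verb dummy for 'googoo'.]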
[11:39:35] Yep :)
[11:39:50] This is what I tried to get with the regexp
[11:40:23] Every leaf node could be mapped to a regexp *, which would yield the same
[11:40:34] The same outcome as you described, that is.
[11:41:23] yes