Recently, there have been some interesting updates to the way grammars can be built and tested with MontySolr. Because the older, existing features were never presented here, I will cover them as well.

Let’s start with tests. Inside contrib/antlrqueryparser/grammars, and also inside contrib/invenio/grammars, there is a new spreadsheet. It contains the gunit tests for the grammar(s). The first sheet is for adding and reorganizing the tests, the second sheet is for exporting them. My usual workflow (the copy step happens in the second sheet) is:

open in an editor StandardLuceneGrammar.gunit
open StandardLuceneGrammar.xls
edit Sheet1
switch to the second sheet
select all and copy (Ctrl+a, Ctrl+c)
paste into StandardLuceneGrammar.gunit
save
run "ant gunit" (inside contrib/antlrqueryparser)

This way you will obtain:

gunit:
     [echo] 
     [echo]             Running GUNIT: StandardLuceneGrammar        
     [echo]             
     [java] -----------------------------------------------------------------------
     [java] executing testsuite for grammar:StandardLuceneGrammar with 179 tests
     [java] -----------------------------------------------------------------------
     [java] 36 failures found:
     [java] test144 (atom, line159) - 
     [java] expected: (MODIFIER (TMODIFIER (FIELD (QNORMAL this))))
     [java] actual: (MODIFIER (TMODIFIER (FIELD (QNORMAL th\\*is))))
     .......
     [java] test179 (atom, line194) - 
     [java] expected: (QTRUNCATED *t*\\a)
     [java] actual: (MODIFIER (TMODIFIER (FIELD (QTRUNCATED *t*\\a))))
     [java] 
     [java] 
     [java] Tests run: 179, Failures: 36
     [java] Java Result: 36

Gunit found errors. It identifies each failing test by its line number, so I go to that line in the spreadsheet (e.g. line 194) and check the test. Then I either update the expected result or, if the grammar was wrong, fix the grammar. This makes edits much faster than editing the gunit file by hand.
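With dozens of failures it can help to pull the line numbers out of the report automatically. The following is only a sketch, not part of MontySolr: the function name failing_tests and the regular expression are my own, and the pattern assumes the failure headers look exactly like the ones in the output above.

```python
import re

# Matches failure headers such as: "[java] test144 (atom, line159) - "
# (pattern assumed from the sample output above, not from gunit documentation)
FAILURE_RE = re.compile(r"\[java\]\s+(test\d+)\s+\((\w+),\s*line(\d+)\)")


def failing_tests(output):
    """Return (test name, rule, line number) for every failure header found."""
    return [
        (m.group(1), m.group(2), int(m.group(3)))
        for m in FAILURE_RE.finditer(output)
    ]


# A shortened copy of the report shown above.
sample = """
     [java] 36 failures found:
     [java] test144 (atom, line159) -
     [java] expected: (MODIFIER (TMODIFIER (FIELD (QNORMAL this))))
     [java] test179 (atom, line194) -
     [java] Tests run: 179, Failures: 36
"""

for name, rule, line in failing_tests(sample):
    print(f"{name}: rule {rule}, line {line}")
```

The printed line numbers are the ones to look up in the spreadsheet.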

And because ANTLRWorks (the popular editor for ANTLR grammars) is not always reliable in reporting mistakes, gunit tests are crucial. I run them after every significant change to the grammar.

By default, the Ant build (inside contrib/antlrqueryparser) assumes that the grammar being debugged is called “StandardLuceneGrammar”, but nothing prevents you from creating a new grammar, placing it inside ./grammars (both the .g and the .gunit file) and calling ant like this:

ant gunit -Dgrammar=MyNewGrammar
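If you maintain several grammars, the same call can be scripted. This is a minimal sketch under assumptions: the helper names gunit_command and run_gunit are hypothetical, the working directory is assumed to be contrib/antlrqueryparser relative to where the script runs, and ant must be on your PATH.

```python
import subprocess


def gunit_command(grammar=None):
    """Build the argument list for 'ant gunit', optionally for a named grammar."""
    cmd = ["ant", "gunit"]
    if grammar:
        # Same property as on the command line: -Dgrammar=<name>
        cmd.append(f"-Dgrammar={grammar}")
    return cmd


def run_gunit(grammars, workdir="contrib/antlrqueryparser"):
    """Run 'ant gunit' once per grammar and return {grammar: captured stdout}."""
    outputs = {}
    for grammar in grammars:
        result = subprocess.run(
            gunit_command(grammar),
            cwd=workdir,            # assumed location of the build file
            capture_output=True,
            text=True,
        )
        outputs[grammar] = result.stdout
    return outputs
```

Calling run_gunit(["StandardLuceneGrammar", "MyNewGrammar"]) would then return the gunit report for each grammar as text.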

The other handy feature for developing/debugging a grammar is to generate a graph of the parse tree for every gunit test.

ant generate-html

This will produce an HTML page with a chart for every valid gunit line.

To use this feature, your system must have the xdot package, which is used to generate SVG images from .dot files (e.g. on Ubuntu “sudo apt-get install xdot”). If it is installed in a non-standard location, you can edit contrib/antlrqueryparser/build.properties.

For testing individual cases, you don’t want to wait for hundreds of graphs; you would rather look at individual queries. There are a few interesting commands for that:

rchyla@diana antlrqueryparser> ant view -Dquery="x (a or b)"

This will produce a chart of the parse tree for that query.

If you instead want to regenerate a grammar, you can run:

ant generate-antlr -Dgrammar=<name>

Or, if you are debugging, you can use the following commands:

rchyla@diana antlrqueryparser> ant try-view -Dquery="a -c"

Which will recompile the grammar before displaying the chart.

And …

rchyla@diana antlrqueryparser> ant try-tree -Dquery="a -c"

…will show just the parse tree (as you get familiar with ANTLR, this output will become much faster to decode).

tree:
     [echo] 
     [echo]                 Generating TREE: StandardLuceneGrammar  
     [echo]                 Query: a -c 
     [echo]                 Rule: mainQ       
     [echo]             
     [java] Grammar: StandardLuceneGrammar rule:mainQ
     [java] query: a -c
     [java] (DEFOP (MODIFIER (TMODIFIER (FIELD (QNORMAL a)))) (MODIFIER - (TMODIFIER (FIELD (QNORMAL c)))))
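
Until the one-line form becomes easy to read, it can be re-indented with a few lines of code. The sketch below is my own illustration, not something shipped with MontySolr. The names parse_tree and show are hypothetical, and it assumes a well-formed tree in which every "(" is followed by a node label and no token contains spaces or parentheses.

```python
def parse_tree(text):
    """Parse a LISP-style tree string into nested lists: [label, child, ...]."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def node():
        nonlocal pos
        token = tokens[pos]
        pos += 1
        if token != "(":
            return token              # a leaf such as "a" or "-"
        children = []
        while tokens[pos] != ")":
            children.append(node())
        pos += 1                      # skip the closing ")"
        return children

    return node()


def show(tree, depth=0):
    """Return the tree as a list of indented lines, one node per line."""
    if isinstance(tree, str):
        return ["  " * depth + tree]
    lines = ["  " * depth + tree[0]]  # first element is the node label
    for child in tree[1:]:
        lines.extend(show(child, depth + 1))
    return lines


tree = parse_tree(
    "(DEFOP (MODIFIER (TMODIFIER (FIELD (QNORMAL a)))) "
    "(MODIFIER - (TMODIFIER (FIELD (QNORMAL c)))))"
)
print("\n".join(show(tree)))
```

For the query above this prints DEFOP on the first line, with the two MODIFIER subtrees indented beneath it and the "-" modifier appearing as the first child of the second one.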