NOTE: There is a new Parsing API underway. This will be something equivalent to the Lexer API, but applied to parsing. The plan is for GSF to remove its own Parser registration and interfaces and replace them with the standard Parser API. This will allow embedding and coordination not just among GSF-based languages, but among all languages that implement the Parsing API (such as Java and C/C++, which are not GSF based).
You need to implement the Parser interface, and register it with GSF. Once you've done that, your Parser will be called whenever some feature needs a parse tree, and one isn't available or up to date already.
Typically, you'll probably just wrap an existing parser here - for Ruby we're using the JRuby parser, for JavaScript we're using Rhino, for Groovy we're using groovyc, etc. However, you can obviously write a parser from scratch as well - I think that's the approach the PHP editor team has taken (though I'm not sure of the details).
The key thing is that you parse the file, and then return a ParserResult from your Parser. The ParserResult is typically going to be your own subclass of ParserResult, where you store additional state such as your own AST (abstract syntax tree). Then, when you're implementing the various features, you're handed back your own ParserResult, which you can cast to your own result class to pull out the AST and use it for semantic code analysis.
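As a rough illustration of the subclass-and-cast pattern described above, here is a minimal, self-contained sketch. The ParserResult and Parser types below are simplified stand-ins written for this example, not the real GSF classes (whose signatures differ), and ToyParserResult, ToyParser and ToyFeature are hypothetical names. The "AST" is just a list of tokens.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified stand-in for the framework's result base class (assumption:
// the real GSF class carries more state and has a different constructor).
class ParserResult {
}

// Simplified stand-in for the framework's Parser interface (assumption).
interface Parser {
    ParserResult parse(String source);
}

// Language-specific result that stores extra state: here a toy "AST"
// consisting of the whitespace-separated tokens of the source.
class ToyParserResult extends ParserResult {
    private final List<String> ast;

    ToyParserResult(List<String> ast) {
        this.ast = ast;
    }

    List<String> getAst() {
        return ast;
    }
}

// The parser returns its own subclass through the generic return type.
class ToyParser implements Parser {
    @Override
    public ParserResult parse(String source) {
        List<String> ast = new ArrayList<>();
        for (String token : source.trim().split("\\s+")) {
            if (!token.isEmpty()) {
                ast.add(token);
            }
        }
        return new ToyParserResult(ast);
    }
}

// A feature implementation is handed the generic result, casts it back
// to the language-specific class, and works with the AST.
class ToyFeature {
    static int countNodes(ParserResult result) {
        ToyParserResult toyResult = (ToyParserResult) result;
        return toyResult.getAst().size();
    }
}
```

The cast is safe only because the framework hands each feature the result produced by that language's own parser. A defensive implementation might check with instanceof first.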
One of the trickiest parts of wrapping an existing parser is handling error recovery. Most parsers aren't designed to deal with erroneous source; they can simply abort with an error message. In the IDE, however, parsing broken source is the norm. If the user is trying to do code completion, the source code may look like this:
    def foo(bar)
      bar.|
    end

Here the user is trying to complete methods on the bar symbol (the | marks the caret position), but obviously the source is currently broken. Your parser needs to be able to handle this scenario!
There are a number of strategies for attacking this problem. You can take a look at the RubyParser and the JsParser implementations for Ruby and JavaScript for some inspiration.
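One possible strategy, sketched below, is to try parsing the source as-is and, if that fails, blank out the offending character next to the caret and retry. This is only an illustration under stated assumptions, and it is not necessarily how RubyParser or JsParser do it. The RecoveringParser class and its methods are hypothetical, and strictParse is a toy stand-in for a real parser that aborts on broken input.

```java
import java.util.Optional;

class RecoveringParser {

    // Toy stand-in for a strict parser: it rejects any line that ends with
    // a dangling '.', and otherwise returns the number of non-blank lines
    // as its "AST".
    static Optional<Integer> strictParse(String source) {
        int count = 0;
        for (String line : source.split("\n")) {
            String trimmed = line.trim();
            if (trimmed.endsWith(".")) {
                return Optional.empty(); // syntax error: abort
            }
            if (!trimmed.isEmpty()) {
                count++;
            }
        }
        return Optional.of(count);
    }

    // If a '.' sits immediately before the caret, replace it with a space.
    // Replacing instead of deleting keeps every other offset unchanged.
    static String sanitize(String source, int caret) {
        if (caret > 0 && caret <= source.length()
                && source.charAt(caret - 1) == '.') {
            return source.substring(0, caret - 1) + ' ' + source.substring(caret);
        }
        return source;
    }

    // First attempt the unmodified source; fall back to the sanitized copy.
    static Optional<Integer> parseWithRecovery(String source, int caret) {
        Optional<Integer> result = strictParse(source);
        if (result.isPresent()) {
            return result;
        }
        return strictParse(sanitize(source, caret));
    }
}
```

For the broken example above, calling parseWithRecovery with the caret offset just after "bar." fails on the first attempt and succeeds on the sanitized copy. Replacing the character with a space rather than removing it is a deliberate choice here: AST offsets from the sanitized parse still line up with positions in the editor's document.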
GSF supports incremental parsing. This is described in the separate incremental parsing document.