
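Concept
A grammar file serves as a DSL that drives the construction of the parser: a parser generator reads the grammar file and generates the parser from it. To update the parser, update the grammar and regenerate.
As the name suggests, several mature parser generators can produce the parser code for us, so the only work left is to define the grammar file.
The example below uses an ANTLR grammar (the code shown targets ANTLR 3); a conceptual understanding is enough.
ANTLR grammar example
Text to parse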
    greetings.txt...
    hello Rebecca
    hello Neal
    hello Ola
Defining the grammar
    Greetings.g...
    grammar Greetings;

    @header {
    package helloAntlr;
    }
    @lexer::header {
    package helloAntlr;
    }

    script : greeting* EOF;
    greeting : 'hello' Name;

    Name : ('a'..'z' | 'A'..'Z')+;

    WS : (' ' |'\t' | '\r' | '\n')+ {skip();} ;
    COMMENT : '#'(~'\n')* {skip();} ;
    ILLEGAL : .;
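To get a feel for what the generator saves us from writing, here is a rough hand-written recognizer for the same greetings language. It is only a sketch under simplifying assumptions: the class `HandWrittenGreetings` and its methods are hypothetical, it is not the code ANTLR generates, and its tokenization is looser than the grammar's (for example, it does not treat `hello` as a reserved word).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical hand-written counterpart of the Greetings grammar.
// This is NOT what ANTLR generates; it only illustrates the two stages.
class HandWrittenGreetings {

    // Lexer stage: drop '#' comments, then split on whitespace (the WS rule).
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String line : text.split("\n")) {
            int hash = line.indexOf('#');
            if (hash >= 0) {
                line = line.substring(0, hash);
            }
            for (String word : line.trim().split("\\s+")) {
                if (!word.isEmpty()) {
                    tokens.add(word);
                }
            }
        }
        return tokens;
    }

    // Parser stage, mirroring:  script : greeting* EOF;  greeting : 'hello' Name;
    static List<String> parse(String text) {
        List<String> tokens = tokenize(text);
        List<String> guests = new ArrayList<>();
        int pos = 0;
        while (pos < tokens.size()) {
            if (!tokens.get(pos).equals("hello")) {
                throw new IllegalArgumentException(
                        "expected 'hello' but found '" + tokens.get(pos) + "'");
            }
            pos++;
            if (pos >= tokens.size() || !tokens.get(pos).matches("[a-zA-Z]+")) {
                throw new IllegalArgumentException("expected a name after 'hello'");
            }
            guests.add(tokens.get(pos));
            pos++;
        }
        return guests;
    }

    public static void main(String[] args) {
        // Prints the names collected from the three greetings.
        System.out.println(parse("hello Rebecca\nhello Neal\nhello Ola"));
    }
}
```

Every change to the language would mean editing this code by hand, which is exactly the work a grammar file plus a generator takes over.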
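Building the parser
This step generates the Java source files for the lexer and the parser from the grammar.
The Ant build script below does this; it is shown for reference only.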
    <property name="dir.src" value="src"/>
    <property name="dir.gen" value="gen"/>
    <property name="dir.lib" value="lib"/>
    <path id="path.antlr">
      <fileset dir="${dir.lib}">
        <include name="antlr*.jar"/>
        <include name="stringtemplate*.jar"/>
      </fileset>
    </path>
    <target name="gen">
      <mkdir dir="${dir.gen}/helloAntlr"/>
      <java classname="org.antlr.Tool" classpathref="path.antlr"
            fork="true" failonerror="true">
        <arg value="-fo"/>
        <arg value="${dir.gen}/helloAntlr"/>
        <arg value="${dir.src}/helloAntlr/Greetings.g"/>
      </java>
    </target>
Using the generated parser
- Code that uses the generated parser
    import java.io.IOException;
    import java.io.Reader;
    import java.util.ArrayList;
    import java.util.List;
    import org.antlr.runtime.ANTLRReaderStream;
    import org.antlr.runtime.CommonTokenStream;
    import org.antlr.runtime.RecognitionException;

    class GreetingsLoader {
      private Reader input;
      public GreetingsLoader(Reader input) {
        this.input = input;
      }
      public List<String> run() {
        try {
          // Lexer turns characters into tokens; parser consumes the token stream.
          GreetingsLexer lexer = new GreetingsLexer(new ANTLRReaderStream(input));
          GreetingsParser parser = new GreetingsParser(new CommonTokenStream(lexer));
          parser.script();  // start rule of the grammar
          return guests;
        } catch (IOException e) {
          throw new RuntimeException(e);
        } catch (RecognitionException e) {
          throw new RuntimeException(e);
        }
      }
      // Nothing fills this list yet: the grammar so far has no actions.
      private List<String> guests = new ArrayList<String>();
    }

    @Test
    public void readsValidFile() throws Exception {
      Reader input = new FileReader("src/helloAntlr/greetings.txt");
      GreetingsLoader loader = new GreetingsLoader(input);
      loader.run();
    }
- Define an invalid input file for testing
    invalid.txt...
    hello Rebecca
    XXhello Neal
    hello Ola
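Adding behavior code to the grammar
Use delegation: override the default error-handling method reportError in the grammar and forward each error to a helper object.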
    Greetings.g...
    @members {
      GreetingsLoader helper;
      public void reportError(RecognitionException e) {
        helper.reportError(e);
      }
    }

    class GreetingsLoader {
      private List errors = new ArrayList();
      void reportError(RecognitionException e) {
        errors.add(e);
      }
      public boolean hasErrors() {return !isOk();}
      public boolean isOk() {return errors.isEmpty();}
      private String errorReport() {
        if (isOk()) return "OK";
        StringBuffer result = new StringBuffer("");
        for (Object e : errors) result.append(e.toString()).append("\n");
        return result.toString();
      }

      public void run() {
        try {
          GreetingsLexer lexer = new GreetingsLexer(new ANTLRReaderStream(input));
          GreetingsParser parser = new GreetingsParser(new CommonTokenStream(lexer));
          parser.helper = this;
          parser.script();
          if (hasErrors()) throw new RuntimeException("it all went pear-shaped\n" + errorReport());
        } catch (IOException e) {
          throw new RuntimeException(e);
        } catch (RecognitionException e) {
          throw new RuntimeException(e);
        }
      }
    }
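The delegation above can be tried out without ANTLR on the classpath. The following is a minimal sketch in which `SketchParser` and `SketchLoader` are hypothetical stand-ins for the generated parser and for `GreetingsLoader`; the "parse" is a trivial line check, used only to show how errors flow from the parser to the helper while parsing continues.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in for the generated parser: it only knows how to forward errors.
class SketchParser {
    SketchLoader helper;  // set by the loader before parsing starts

    // Plays the role of the reportError override from the @members section.
    void reportError(Exception e) {
        helper.reportError(e);
    }

    // Pretend parse: every line must start with "hello ".
    // A bad line is reported, and the loop moves on to the next line.
    void script(String text) {
        for (String line : text.split("\n")) {
            if (!line.startsWith("hello ")) {
                reportError(new IllegalArgumentException("unexpected input: " + line));
            }
        }
    }
}

// Stand-in for GreetingsLoader: collects errors and decides what to do with them.
class SketchLoader {
    private final List<Exception> errors = new ArrayList<>();

    void reportError(Exception e) { errors.add(e); }
    boolean isOk() { return errors.isEmpty(); }
    int errorCount() { return errors.size(); }

    void run(String text) {
        SketchParser parser = new SketchParser();
        parser.helper = this;  // wire the delegate, as parser.helper = this does above
        parser.script(text);
    }
}

class DelegationDemo {
    public static void main(String[] args) {
        SketchLoader loader = new SketchLoader();
        loader.run("hello Rebecca\nXXhello Neal\nhello Ola");
        System.out.println("ok=" + loader.isOk() + ", errors=" + loader.errorCount());
    }
}
```

Keeping the error list in the loader rather than in the grammar means the grammar file holds only a one-line forwarding call, and the policy for handling errors stays in ordinary Java code.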
Adding behavior code with hooks
That is, the grammar calls hook methods whose implementations live in a handwritten superclass.
    Greetings.g...
    grammar Greetings;

    options {superClass = BaseGreetingsParser;}

    @header {
    package subclass;
    }
    @lexer::header {
    package subclass;
    }

    script : greeting * EOF;
    greeting : 'hello' n=Name {recordGuest($n);};

    Name : ('a'..'z' | 'A'..'Z')+;
    WS : (' ' |'\t' | '\r' | '\n')+ {skip();} ;
    COMMENT : '#'(~'\n')* {skip();} ;
    ILLEGAL : .;
Custom superclass
    abstract public class BaseGreetingsParser extends Parser {
      public BaseGreetingsParser(TokenStream input) {
        super(input);
      }
      void recordGuest(Token t) {guests.add(t.getText());}
      List<String> getGuests() { return guests; }
      private List<String> guests = new ArrayList<String>();
      private List errors = new ArrayList();
      public void reportError(RecognitionException e) {
        errors.add(e);
      }
      public boolean hasErrors() {return !isOk();}
      public boolean isOk() {return errors.isEmpty();}
    }
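The relationship between the handwritten superclass and the generated parser can be sketched without ANTLR. In the example below, `SketchBaseParser` stands in for `BaseGreetingsParser` and `SketchGeneratedParser` for the class the generator would emit; both names, and the simplistic token loop, are assumptions made purely for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Handwritten superclass: owns the state and implements the hook.
abstract class SketchBaseParser {
    private final List<String> guests = new ArrayList<>();

    void recordGuest(String name) { guests.add(name); }
    List<String> getGuests() { return guests; }
}

// Stand-in for the generated parser. Regenerating it never touches the
// superclass, so handwritten behavior survives every grammar change.
class SketchGeneratedParser extends SketchBaseParser {
    // Pretend version of:  greeting : 'hello' n=Name {recordGuest($n);};
    void script(String text) {
        String[] tokens = text.trim().split("\\s+");
        for (int i = 0; i + 1 < tokens.length; i += 2) {
            if (tokens[i].equals("hello")) {
                recordGuest(tokens[i + 1]);  // the hook call embedded in the rule
            }
        }
    }
}

class HookDemo {
    public static void main(String[] args) {
        SketchGeneratedParser parser = new SketchGeneratedParser();
        parser.script("hello Rebecca hello Neal hello Ola");
        System.out.println(parser.getGuests());
    }
}
```

Compared with the delegation approach, the grammar here needs no `@members` section and no helper field: the hook is simply an inherited method, so the action in the grammar shrinks to a single call.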