Here's the situation. There's a programming language I've been using called JASS (for the Warcraft III map editor, if you care), and very few tools exist for it, so I want to create something that will syntax highlight it.
Good news: I have a grammar for the language in Extended Backus-Naur Form. Bad news: I can't figure out what to do with it! I played around with Parse::RecDescent (PRD) for a while, but the grammar it takes certainly isn't EBNF, and I can't seem to convert the grammar I have into one it will recognize.
To reiterate, all I really want to do is syntax highlight the stupid language. Since I had the EBNF grammar, it seemed the easiest route to take, but now I'm not so sure.
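Since highlighting only needs to classify tokens rather than build a parse tree, one possible shortcut is a plain regex tokenizer. The following is a minimal sketch, written in Python purely for illustration. The keyword set is copied from the quoted literals in the grammar; the `//` comment rule and the names `KEYWORDS`, `TOKEN_RE` and `tokenize` are my own assumptions and are not part of the grammar as given.

```python
import re

# Keywords taken from the quoted string literals in the EBNF grammar.
KEYWORDS = {
    "type", "extends", "handle", "globals", "endglobals", "constant",
    "native", "takes", "nothing", "returns", "function", "endfunction",
    "local", "array", "set", "call", "if", "then", "else", "elseif",
    "endif", "loop", "endloop", "exitwhen", "return", "debug", "and",
    "or", "not", "null", "true", "false", "code", "integer", "real",
    "boolean", "string",
}

# Alternatives are tried in order, so comments and strings win over operators.
# Assumption: '//' starts a comment that runs to the end of the line.
# The word pattern is slightly looser than the grammar's id rule
# (it also accepts a trailing underscore).
TOKEN_RE = re.compile(r"""
    (?P<comment>//[^\n]*)
  | (?P<string>"(?:\\.|[^"\\])*")
  | (?P<fourcc>'[^'\n]{4}')
  | (?P<number>\$[0-9a-fA-F]+|0[xX][0-9a-fA-F]+|[0-9]+\.[0-9]*|\.[0-9]+|[0-9]+)
  | (?P<word>[a-zA-Z][a-zA-Z0-9_]*)
  | (?P<op>==|!=|>=|<=|[-+*/><=\[\](),])
  | (?P<space>\s+)
  | (?P<other>.)
""", re.VERBOSE)


def tokenize(source):
    """Yield (kind, text) pairs that together cover every character of source."""
    for m in TOKEN_RE.finditer(source):
        kind, text = m.lastgroup, m.group()
        if kind == "word":
            kind = "keyword" if text in KEYWORDS else "identifier"
        yield kind, text


if __name__ == "__main__":
    for kind, text in tokenize('set x = 0x1F // hi\n'):
        if kind != "space":
            print(kind, repr(text))
```

A highlighter would then wrap each token in whatever markup the output format needs, keyed on the token kind. This cannot catch syntax errors the way a real parser would, but for colouring that may not matter.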
The long grammar lists follow this point. You have been warned:
This is the EBNF form as it was given to me (at least it claims it's EBNF):
//----------------------------------------------------------------------
// Global Declarations
//----------------------------------------------------------------------
program := file+
file := newline? (declr newline)* func*
declr := typedef | globals | native_func
typedef := 'type' id 'extends' ('handle' | id)
globals := 'globals' newline global_var_list 'endglobals'
global_var_list := ('constant' type id '=' expr newline
| var_declr newline)*
native_func := 'constant'? 'native' func_declr
func_declr := id 'takes' ('nothing' | param_list)
'returns' (type | 'nothing')
param_list := type id (',' type id)*
func := 'constant'? 'function' func_declr newline
local_var_list statement_list 'endfunction' newline
//----------------------------------------------------------------------
// Local Declarations
//----------------------------------------------------------------------
local_var_list := ('local' var_declr newline)*
var_declr := type id ('=' expr)? | type 'array' id
//----------------------------------------------------------------------
// Statements
//----------------------------------------------------------------------
statement_list := (statement newline)*
statement := set | call | ifthenelse | loop | exitwhen | return
| debug
set := 'set' id '=' expr | 'set' id '[' expr ']' '=' expr
call := 'call' id '(' args? ')'
args := expr (',' expr)*
ifthenelse := 'if' expr 'then' newline statement_list
else_clause? 'endif'
else_clause := 'else' newline statement_list
| 'elseif' expr 'then' newline statement_list
else_clause?
loop := 'loop' newline statement_list 'endloop'
exitwhen := 'exitwhen' expr
// must appear in a loop
return := 'return' expr?
debug := 'debug' (set | call | ifthenelse | loop)
//----------------------------------------------------------------------
// Expressions
//----------------------------------------------------------------------
expr := binary_op | unary_op | func_call | array_ref | func_ref
| id | const | parens
binary_op := expr ([+-*/><]|'=='|'!='|'>='|'<='|'and'|'or') expr
unary_op := ('+'|'-'|'not') expr
// expr must be integer or real when used with unary '+'
func_call := id '(' args? ')'
array_ref := id '[' expr ']'
func_ref := 'function' id
const := int_const | real_const | bool_const | string_const | 'null'
int_const := decimal | octal | hex | fourcc
decimal := [1-9][0-9]*
octal := '0'[0-7]*
hex := '$'[0-9a-fA-F]+ | '0'[xX][0-9a-fA-F]+
fourcc := ''' .{4} '''
real_const := [0-9]+'.'[0-9]* | '.'[0-9]+
bool_const := 'true' | 'false'
string_const := '"' .* '"'
// any double-quotes in the string must be escaped with \
parens := '(' expr ')'
//----------------------------------------------------------------------
// Base RegEx
//----------------------------------------------------------------------
type := id | 'code' | 'handle' | 'integer' | 'real' | 'boolean'
| 'string'
id := [a-zA-Z]([a-zA-Z0-9_]* [a-zA-Z0-9])?
newline := '\n'+
That's nice and lovely, isn't it? Too bad I can't seem to figure out how to use it.
I tried to munge it so PRD would take it, and this is what I came up with:
program : file(s)
file : newline(?) declr_newline(s?) func(s?)
declr_newline : declr newline
declr : typedef | globals | native_func
typedef : 'type' id 'extends' ('handle' | id)
globals : 'globals' newline global_var_list 'endglobals'
global_var_list : tmp_g_v_l(s)
tmp_g_v_l : 'constant' type id '=' expr newline
| var_declr newline
native_func : constant(?) 'native' func_declr
func_declr : id 'takes' ('nothing' | param_list)
'returns' (type | 'nothing')
param_list : type id tmp_p_l(s)
tmp_p_l : ',' type id
func : 'constant'(?) 'function' func_declr newline
local_var_list statement_list 'endfunction' newline
local_var_list : tmp_l_v_n(s)
tmp_l_v_n : 'local' var_declr newline
var_declr : type id tmp_e_e(?) | type 'array' id
tmp_e_e : '=' expr
statement_list : tmp_stm_nl(s)
tmp_stm_nl : statement newline
statement : set | call | ifthenelse | loop | exitwhen | return
| debug
set : 'set' id '=' expr | 'set' id '[' expr ']' '=' expr
call : 'call' id '(' args? ')'
args : expr (',' expr)(s)
ifthenelse : 'if' expr 'then' newline statement_list
else_clause? 'endif'
else_clause : 'else' newline statement_list
| 'elseif' expr 'then' newline statement_list
else_clause?
loop : 'loop' newline statement_list 'endloop'
exitwhen : 'exitwhen' expr
return : 'return' expr?
debug : 'debug' (set | call | ifthenelse | loop)
expr : binary_op | unary_op | func_call | array_ref | func_ref
| id | const | parens
binary_op : expr (/[+-*/><]/|'=='|'!='|'>='|'<='|'and'|'or') expr
unary_op : ('+'|'-'|'not') expr
func_call : id '(' args(?) ')'
array_ref : id '[' expr ']'
func_ref : 'function' id
const : int_const | real_const | bool_const | string_const | 'null'
int_const : decimal | octal | hex | fourcc
decimal : /[1-9][0-9]*/
octal : /0[0-7]*/
hex : /\$[0-9a-fA-F]+/ | /0[xX][0-9a-fA-F]+/
fourcc : "'" /.{4}/ "'"
real_const : /[0-9]+\.[0-9]*/ | /\.[0-9]+/
bool_const : 'true' | 'false'
string_const : /".*"/
parens : '(' expr ')'
type : id | 'code' | 'handle' | 'integer' | 'real' | 'boolean'
| 'string'
id : /[a-zA-Z]([a-zA-Z0-9_]* [a-zA-Z0-9])?/
newline : /\n+/
However, when I attempt to create a PRD object using this grammar, the new method returns undef and no error messages are set anywhere I can find. It just prints out about 300 semicolons.
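One candidate for the failure is the character class in binary_op. In `[+-*/><]`, the `+-*` is read as a range from `+` to `*`, and that range runs backwards in character-code order. The sketch below checks this with Python's regex engine, used here only because it is easy to run; I believe Perl rejects the same class with an invalid-range error, but I have not confirmed that this is what makes PRD give up. The helper name `compiles` is made up for the example.

```python
import re


def compiles(pattern):
    """Return True if the pattern is a valid regular expression."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False


# As written in binary_op: '+-*' is parsed as a range from '+' (code 43)
# down to '*' (code 42), which is not a valid range.
as_written = r"[+-*/><]"

# Same set of characters with the hyphen moved to the front,
# where it is taken literally.
reordered = r"[-+*/><]"

print(compiles(as_written))  # False
print(compiles(reordered))   # True
```

Separately, the unescaped `/` inside that class probably ends the PRD regex early, since `/` is also the delimiter; escaping it as `\/` is an assumption on my part about what PRD expects. The id rule has a similar trap: the space in `[a-zA-Z0-9_]* [a-zA-Z0-9]` is just concatenation in the EBNF, but inside `/.../` it becomes a literal space.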
Does anyone see a good solution forward?
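Another thing worth checking is left recursion: expr can start with binary_op, and binary_op starts with expr, which a recursive-descent parser cannot follow without looping. A common workaround is to rewrite the rule as `expr : operand (binary_op operand)(s?)` so that no rule begins with itself. The Python sketch below shows that rewritten shape on a cut-down expression grammar (identifiers, integers, parentheses, unary and binary operators only). All function names are hypothetical, and operator precedence is ignored, just as it is in the grammar as given.

```python
import re

TOKEN = re.compile(r"\s*(==|!=|>=|<=|[-+*/><()]|[A-Za-z][A-Za-z0-9_]*|[0-9]+)")
BINARY_OPS = {"+", "-", "*", "/", ">", "<", "==", "!=", ">=", "<=", "and", "or"}
UNARY_OPS = {"+", "-", "not"}


def tokenize(text):
    """Split an expression into a flat list of token strings."""
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if not m:
            raise SyntaxError(f"unexpected character at position {pos}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens


def parse_expr(tokens, i=0):
    # expr : operand (binary_op operand)(s?)
    # The loop replaces the left-recursive 'expr binary_op expr'.
    left, i = parse_operand(tokens, i)
    while i < len(tokens) and tokens[i] in BINARY_OPS:
        op = tokens[i]
        right, i = parse_operand(tokens, i + 1)
        left = (op, left, right)  # builds a left-associative tree
    return left, i


def parse_operand(tokens, i):
    # operand : unary_op operand | '(' expr ')' | id | integer
    if i >= len(tokens):
        raise SyntaxError("unexpected end of input")
    tok = tokens[i]
    if tok in UNARY_OPS:
        inner, i = parse_operand(tokens, i + 1)
        return (tok, inner), i
    if tok == "(":
        inner, i = parse_expr(tokens, i + 1)
        if i >= len(tokens) or tokens[i] != ")":
            raise SyntaxError("expected ')'")
        return inner, i + 1
    if tok == ")" or tok in BINARY_OPS:
        raise SyntaxError(f"unexpected token {tok!r}")
    return tok, i + 1


if __name__ == "__main__":
    print(parse_expr(tokenize("a + b * 2")))
    # (('*', ('+', 'a', 'b'), '2'), 5)
```

I believe PRD also has a `<leftop: ...>` directive meant for this kind of rule, which might be tidier than spelling out the repetition by hand, but I have not tried it against this grammar.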