It isn't entirely clear to me what you are trying to achieve. Most importantly, are you not worried about operator associativity? I'll just show simple answers based on using right-recursion - this leads to left-associative operators being parsed.
The straight answer to your visible question would be to juggle a fusion::vector2<char, ast::expression>
- which isn't really any fun, especially in Phoenix lambda semantic actions. (I'll show below, what that looks like).
Meanwhile I think you should read up on the Spirit docs
- here in the old Spirit docs (eliminating left recursion); Though the syntax no longer applies, Spirit still generates LL recursive descent parsers, so the concept behind left-recursion still applies. The code below shows this applied to Spirit Qi
- here: the Qi examples contain three
calculator
samples, which should give you a hint on why operator associativity matters, and how you would express a grammar that captures the associativity of binary operators. Obviously, it also shows how to support parenthesized expressions to override the default evaluation order.
Code:
I have three version of code that works, parsing input like:
std::string input("1/2+3-4*5");
into an ast::expression
grouped like (using BOOST_SPIRIT_DEBUG):
<expr>
....
<success></success>
<attributes>[[1, [2, [3, [4, 5]]]]]</attributes>
</expr>
The links to the code are here:
First thing, I'd get rid of the alternative parse expressions per operator; this leads to excessive backtracking1. Also, as you've found out, it makes the grammar hard to maintain. So, here is a simpler variation that uses a function for the semantic action:
1check that using BOOST_SPIRIT_DEBUG!
static ast::expression make_binop(char discriminant,
const ast::expression& left, const ast::expression& right)
{
switch(discriminant)
{
case '+': return ast::binary_op<ast::add>(left, right);
case '-': return ast::binary_op<ast::sub>(left, right);
case '/': return ast::binary_op<ast::div>(left, right);
case '*': return ast::binary_op<ast::mul>(left, right);
}
throw std::runtime_error("unreachable in make_binop");
}
// rules:
number %= lexeme[double_];
varname %= lexeme[alpha >> *(alnum | '_')];
simple = varname | number;
binop = (simple >> char_("-+*/") >> expr)
[ _val = phx::bind(make_binop, qi::_2, qi::_1, qi::_3) ];
expr = binop | simple;
As you can see, this has the potential to reduce complexity. It is only a small step now, to remove the binop intermediate (which has become quite redundant):
number %= lexeme[double_];
varname %= lexeme[alpha >> *(alnum | '_')];
simple = varname | number;
expr = simple [ _val = _1 ]
> *(char_("-+*/") > expr)
[ _val = phx::bind(make_binop, qi::_1, _val, qi::_2) ]
> eoi;
As you can see,
- within the
expr
rule, the _val
lazy placeholder is used as a pseudo-local variable that accumulates the binops. Across rules, you'd have to use qi::locals<ast::expression>
for such an approach. (This was your question regarding _r1
).
- there are now explicit expectation points, making the grammar more robust
- the
expr
rule no longer needs to be an auto-rule (expr =
instead of expr %=
)
Finally, for fun and gory, let me show how you could have handled your suggested code, along with the shifting bindings of _1, _2 etc.:
static ast::expression make_binop(
const ast::expression& left,
const boost::fusion::vector2<char, ast::expression>& op_right)
{
switch(boost::fusion::get<0>(op_right))
{
case '+': return ast::binary_op<ast::add>(left, boost::fusion::get<1>(op_right));
case '-': return ast::binary_op<ast::sub>(left, boost::fusion::get<1>(op_right));
case '/': return ast::binary_op<ast::div>(left, boost::fusion::get<1>(op_right));
case '*': return ast::binary_op<ast::mul>(left, boost::fusion::get<1>(op_right));
}
throw std::runtime_error("unreachable in make_op");
}
// rules:
expression::base_type(expr) {
number %= lexeme[double_];
varname %= lexeme[alpha >> *(alnum | '_')];
simple = varname | number;
binop %= (simple >> (char_("-+*/") > expr))
[ _val = phx::bind(make_binop, qi::_1, qi::_2) ]; // note _2!!!
expr %= binop | simple;
As you can see, not nearly as much fun writing the make_binop
function that way!