I have tried to write a mustache parser with the excellent Boost.Xpressive from the brilliant Eric Niebler. But since this is my first parser, I am not familiar with the "normal" approach and lingo of compiler writers, and I feel a bit lost after a few days of trial and error. So I come here hoping someone can point out the foolishness of my n00bish ways ;)
This is the HTML code with the mustache templates that I want to extract (http://mustache.github.io/):
Now <bold>is the {{#time}}gugus {{zeit}} oder nicht{{/time}} <i>for all good men</i> to come to the {007} aid of their</bold> {{country}}. Result: {{#Res1}}Nullum <b>est</b> mundi{{/Res1}}
I have the following problems that I couldn't yet solve alone:
- The parser I wrote doesn't print anything, yet it compiles without a single warning. Earlier versions printed parts of the mustache code, but never all of it correctly.
- I don't know how to loop through the whole input to find all occurrences while still being able to access each one the way the smatch what; variable allows. The docs only show how to find the first occurrence with "what", or how to output all occurrences with the "iterator". I actually need a combination of both: once something is found, I need to query the tag's name and the content between the tags (which "what" would offer but the "iterator" won't allow) and act accordingly. I guess I could use "actions", but how?
- I think it should be possible to do the tag finding and the "content between tags" in one swoop, right? Or do I need to parse twice for that, and if so, how?
- Is it okay to parse the opening and closing brackets the way I did, since there are always 2 brackets? Or should I match them in sequence or use repeat<2,2>('{')?
- I still feel a bit unsure about when keep() and by_ref() are necessary and when it is better not to use them.
- I couldn't find the other options for the 4th parameter of the iterator sregex_token_iterator cur( str.begin(), str.end(), html, -1 ); (here -1, which outputs everything except the matching tags).
- Is my parser expression correctly finding nested mustache tags?
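On the two iterator questions in the list above, here is a minimal sketch of the usual pattern. It uses the standard library's std::regex as a stand-in so that it has no Boost dependency; that swap is an assumption made for illustration, not the Xpressive code from this question. As far as I know, Xpressive's sregex_iterator and sregex_token_iterator (in namespace boost::xpressive) follow the same conventions: dereferencing a regex iterator yields a full match_results object per occurrence, and the token iterator's last argument selects which sub-match to emit, with -1 meaning the text between matches, 0 the whole match, and N the Nth sub-match. The helper names find_sections and tokens are hypothetical, and the section pattern is not recursive, so it does not handle a section nested inside a section of the same name.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper type: one {{#name}}...{{/name}} section.
struct Section {
    std::string name;     // tag name, e.g. "time"
    std::string content;  // raw text between the opening and closing tag
};

// Visits every section in the input. Each dereferenced iterator is a
// std::smatch, so all sub-matches stay available, just like "what".
std::vector<Section> find_sections(const std::string& text)
{
    // Group 1: tag name. Group 2: content (non-greedy, does not cross newlines).
    // \1 requires the closing tag to carry the same name as the opening tag.
    static const std::regex section(R"(\{\{#(\w+)\}\}(.*?)\{\{/\1\}\})");

    std::vector<Section> out;
    for (std::sregex_iterator it(text.begin(), text.end(), section), end;
         it != end; ++it)
    {
        const std::smatch& what = *it;
        out.push_back({what[1].str(), what[2].str()});
    }
    return out;
}

// Shows the effect of the token iterator's last argument:
//   -1 -> the pieces of text between the tags
//    0 -> each whole tag, braces included
//    1 -> only sub-match 1 (the text inside the braces)
std::vector<std::string> tokens(const std::string& text, int selector)
{
    static const std::regex tag(R"(\{\{([#/]?\w+)\}\})");

    std::vector<std::string> out;
    for (std::sregex_token_iterator it(text.begin(), text.end(), tag, selector), end;
         it != end; ++it)
    {
        out.push_back(it->str());
    }
    return out;
}
```

For the input "a {{#t}}x {{y}} z{{/t}} b {{c}}", find_sections returns a single section named "t" with the content "x {{y}} z", tokens(..., -1) returns the four text pieces around the tags, and tokens(..., 1) returns "#t", "y", "/t" and "c".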
#include <algorithm>
#include <iostream>
#include <string>
#include <boost/xpressive/xpressive_static.hpp>
#include <boost/xpressive/match_results.hpp>
typedef std::string::const_iterator It;
using namespace boost::xpressive;
std::string str = "Now <bold>is the {{#time}}gugus {{zeit}} oder nicht{{/time}} <i>for all good men</i> to come to the {007} aid of their</bold> {{country}}. Result: {{#Res1}}Nullum <b>est</b> mundi{{/Res1}}";
// Parser setup --------------------------------------------------------
mark_tag mtag (1), cond_mtag (2), user_str (3);
sregex brackets = "{{"
    >> keep ( mtag = repeat<1, 20> (_w) )
    >> "}}"
    ;
sregex cond_brackets = "{{#"
    >> keep (cond_mtag = repeat<1, 20> (_w) )
    >> "}}"
    >> * (
        keep (user_str = + (*_s >> +alnum >> *_s) ) |
        by_ref (brackets) |
        by_ref (cond_brackets)
    )
    >> "{{/"
    >> cond_mtag
    >> "}}"
    ;
sregex mexpression = *( by_ref (cond_brackets) | by_ref (brackets) );
// Looping + catching the results --------------------------------------
smatch what2;
std::cout << "\nregex_search:\n" << str << '\n';
It strBegin = str.begin(), strEnd = str.end();
int ic = 0;
do
{
    if ( !regex_search ( strBegin, strEnd, what2, mexpression ) )
    {
        std::cout << ">> Breakout of this life...! Exit after " << ic << " loop(s)." << std::endl;
        break;
    }
    else
    {
        std::cout << "**Loop Nr: " << ic << '\n';
        std::cout << "what2[0] " << what2[0] << '\n'; // whole match
        std::cout << "what2[mtag] " << what2[mtag] << '\n';
        std::cout << "what2[cond_mtag] " << what2[cond_mtag] << '\n';
        std::cout << "what2[user_str] " << what2[user_str] << '\n';
        // display the nested results
        std::for_each (
            what2.nested_results().begin(),
            what2.nested_results().end(),
            output_nested_results() // <--identical function from E.Nieblers documentation
        );
        strBegin = what2[0].second;
    }
    ++ic;
}
while (ic < 6 || strBegin != str.end() );
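One possible reason the loop above never shows a tag: a pattern of the form *( A | B ) is allowed to match zero repetitions, so a search can succeed at the very first character with an empty match. In that case what2[0] is empty and what2[0].second equals strBegin, so the search position never advances. The sketch below reproduces the effect with std::regex, used here only as a dependency-free stand-in for Xpressive; the helper first_hit and the struct Hit are hypothetical names introduced for illustration.

```cpp
#include <cstddef>
#include <regex>
#include <string>

// Hypothetical helper type: where the first match starts and what it contains.
struct Hit {
    std::ptrdiff_t position;  // offset of the match, or -1 if nothing matched
    std::string text;         // matched text (may be empty)
};

// Runs a single regex_search and reports the first match.
Hit first_hit(const std::string& text, const std::string& pattern)
{
    const std::regex re(pattern);
    std::smatch what;
    if (!std::regex_search(text, what, re))
        return {-1, ""};
    return {what.position(0), what.str(0)};
}

// With a trailing '*', the pattern also matches "nothing", so the search
// stops at offset 0 with an empty match instead of reaching the tag:
//   first_hit("ab {{x}}", R"((?:\{\{\w+\}\})*)")  -> position 0, text ""
//
// Without the '*', the search has to skip ahead to a real tag:
//   first_hit("ab {{x}}", R"(\{\{\w+\}\})")       -> position 3, text "{{x}}"
```

If this is indeed what happens here, the Xpressive equivalent would presumably be to search for ( by_ref (cond_brackets) | by_ref (brackets) ) without the leading *, or to use + instead. Note also that the loop condition ic < 6 || strBegin != str.end() keeps looping while either part is true, so the counter alone does not stop it.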