I'm trying to write a regular expression for my HTML parser.
I want to match an HTML tag with a given attribute (e.g. <div> with class="tab news selected") that contains one or more <a href> tags. The regexp should match the entire tag (from <div> to </div>). I always seem to get "memory exhausted" errors; my program probably takes every tag it can find as a matching one.
I'm using the Boost.Regex library.
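
A minimal sketch of the kind of match described above, written against the standard <regex> header rather than Boost so that it is self-contained. The helper name find_div_with_link and its parameters are hypothetical. It assumes the target <div> contains no nested <div>, and that the class string contains no regex metacharacters. The body pattern refuses to step over a </div>, which keeps a match from running across unrelated tags. The same pattern should carry over to boost::regex and boost::regex_search, though that is untested here.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: returns the first <div class="cls"> ... </div> span
// that contains at least one <a ... href ...> tag, or an empty string if
// there is none. Assumes no nested <div> inside the target <div> and that
// cls contains no regex metacharacters.
std::string find_div_with_link(const std::string& html,
                               const std::string& cls) {
    // Lazily consume any character (newlines included) as long as the
    // current position does not start a closing </div>.
    const std::string body = R"((?:(?!</div>)(?:.|\n|\r))*?)";

    const std::regex re(
        "<div\\s+class=\"" + cls + "\"[^>]*>"  // opening tag with the class
            + body                             // content before the link
            + "<a\\s[^>]*href"                 // at least one <a href>
            + body                             // content after the link
            + "</div>",                        // first closing tag
        std::regex::icase);

    std::smatch m;
    if (std::regex_search(html, m, re)) {
        return m.str(0);  // the whole tag, from <div ...> to </div>
    }
    return std::string();
}
```

Calling find_div_with_link(page, "tab news selected") on a page held in a std::string returns the whole matching tag, or an empty string when no such <div> contains a link. Real-world HTML with nested <div> elements is beyond what a single regular expression like this can handle reliably.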