Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share

I have the following data:

int  time="1356280261"
char value="3000"

bankLine {
  char value="3000"
  char currency="EUR"
  int  time="1356280261"
} #bankLine

I am parsing this data recursively and only want to match the 2 variables outside the block separately.

I have this regex to match a variable:

/(?:char|int)\s*([A-z0-9]*)\s*=\s*"(.*)"/

Yet, the regex matches all occurrences inside the block, too.

How can I match only the first 2 variables individually and ignore everything inside the bankLine block?

1 Answer

It's a bit hackish, but you can try adding a negative lookahead, like this:

/(?:char|int)\s*([A-z0-9]*)\s*=\s*"(.*)"(?![^{]*})/
                                        ^^^^^^^^^^

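This assumes that all braces are balanced. It works for flat blocks like the one in the question, since you're looking for the case outside braces. Nesting is a caveat, though: a declaration inside a block that is followed by a nested open brace before any close brace would still be matched.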

The lookahead is based on this observation: if, scanning forward from the match, you encounter a close brace before encountering any open brace, then we can reasonably assume that the match sits within braces.
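To see the lookahead in action, here is a minimal sketch in Python's `re` module (the question does not say which regex engine is used, so the language choice is an assumption, as are the names `DATA`, `NESTED`, and `top_level_vars`). It runs the pattern against the data from the question and against a made-up nested input to probe the edge case where a declaration sits before a nested block.

```python
import re

# Sample data copied from the question.
DATA = '''int  time="1356280261"
char value="3000"

bankLine {
  char value="3000"
  char currency="EUR"
  int  time="1356280261"
} #bankLine
'''

# Hypothetical nested input, not from the question.
NESTED = '''outer {
  char a="1"
  inner {
    char b="2"
  }
  char c="3"
}
char d="4"
'''

# Declaration pattern plus the negative lookahead: reject the match
# if a "}" can be reached without passing a "{" first.
PATTERN = re.compile(r'(?:char|int)\s*([A-z0-9]*)\s*=\s*"(.*)"(?![^{]*})')

def top_level_vars(text):
    """Return (name, value) pairs the pattern accepts."""
    return PATTERN.findall(text)

print(top_level_vars(DATA))    # [('time', '1356280261'), ('value', '3000')]
print(top_level_vars(NESTED))  # [('a', '1'), ('d', '4')]
```

On the flat data only the two outer variables come back. In the nested input, `a` is also returned even though it is inside `outer`, because the next brace after it is an open brace; if the real data can nest, a small brace-counting parser is likely the safer route.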

One is tempted to extend this the other way to include a negative lookbehind, but unfortunately most implementations do not support variable-length lookbehinds.
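As one concrete case, Python's built-in `re` module rejects variable-length lookbehinds at compile time. The sketch below only checks whether a pattern compiles; the helper name `compiles` and the sample patterns are illustrative, and other engines may behave differently.

```python
import re

def compiles(pattern):
    """Return True if Python's re module accepts the pattern."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False

# Fixed-width lookbehind: accepted.
print(compiles(r'(?<=ab)c'))        # True

# Variable-length lookbehind ("an open brace, then any run of
# non-close-brace characters, before here"): rejected by re.
print(compiles(r'(?<!\{[^}]*)x'))   # False
```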

EDIT:

As discussed in the comments below, these fixes are recommended:

/(?:char|int)\s*([A-Za-z0-9]*)\s*=\s*"([^"]*)"(?![^{]*})/
                     ^^^                ^^^^^
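The effect of the two fixes can be checked with a short Python sketch (the test strings and the names `loose` and `strict` are made up for illustration; the lookahead is left out so that only the two changes are compared). `[A-z]` spans the ASCII range from `A` to `z`, which also contains the characters `[`, `\`, `]`, `^`, `_` and the backtick, and a greedy `.*` runs to the last quote on the line.

```python
import re

# Original character class and value group.
loose = re.compile(r'(?:char|int)\s*([A-z0-9]*)\s*=\s*"(.*)"')
# Recommended fixes: explicit letter ranges and a quote-free value.
strict = re.compile(r'(?:char|int)\s*([A-Za-z0-9]*)\s*=\s*"([^"]*)"')

# Two declarations on one line: the greedy ".*" swallows both.
line = 'char a="1" char b="2"'
print(loose.findall(line))    # [('a', '1" char b="2')]
print(strict.findall(line))   # [('a', '1'), ('b', '2')]

# "[A-z]" accepts punctuation that sits between "Z" and "a" in ASCII.
punct = '[\\]^_`'
print(bool(re.fullmatch(r'[A-z]+', punct)))      # True
print(bool(re.fullmatch(r'[A-Za-z]+', punct)))   # False
```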
