Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share

I have the following code:

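import java.util.regex.Matcher;
import java.util.regex.Pattern;

public static void createTokens() {
    String test = "test is a word word word word big small";
    // In a Java string literal the regex \s must be written as \\s.
    Matcher mtch = Pattern.compile("test is a (\\s*.+?\\s*) word (\\s*.+?\\s*)").matcher(test);
    while (mtch.find()) {
        // Print every capturing group of the current match.
        for (int i = 1; i <= mtch.groupCount(); i++) {
            System.out.println(mtch.group(i));
        }
    }
}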

And I get the following output:

word
w

But I expected it to be:

word
word

Can somebody please explain why?


1 Answer

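Because your quantifiers are non-greedy (reluctant), they match as little text as possible while still producing an overall match. The first group is followed by the literal " word ", so it has to grow until that literal can match, which gives "word". Nothing follows the second group, so its .+? stops after the single character it needs, and both \s* parts match nothing. That is why the second group is just "w".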
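The difference is easier to see on a smaller pattern. The following is a minimal sketch, not part of the original question: the class name ReluctantDemo and the helper firstGroup are made up for illustration, and the input is just the tail of the question's test string.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ReluctantDemo {
    // Hypothetical helper: returns the text captured by group 1 of the
    // first match, or null if the pattern does not match at all.
    static String firstGroup(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String input = "word word big small";

        // Reluctant with nothing after the group: one character is enough.
        System.out.println(firstGroup("(.+?)", input));      // w

        // Reluctant followed by a literal: grows until " big" can match.
        System.out.println(firstGroup("(.+?) big", input));  // word word

        // Greedy: takes everything up to the end of the input.
        System.out.println(firstGroup("(.+)", input));       // word word big small
    }
}
```

A reluctant quantifier only grows when something later in the pattern forces it to, which is why a trailing .+? captures a single character.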

Remove the ? from the second group so that it becomes greedy:

Matcher mtch = Pattern.compile("test is a (\\s*.+?\\s*) word (\\s*.+\\s*)").matcher(test);

and you'll get:

word
word word big small
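If the goal is the output the question expected (word, then word), one possible alternative is to let each group capture a single whitespace-delimited token with \S+ instead of .+. This is a sketch under that assumption about the intent, not something the answer above proposes; the class name TokenGroupsDemo and the method captureTokens are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TokenGroupsDemo {
    // Assumption: each group should hold exactly one token with no spaces.
    // \S+ matches a run of non-whitespace characters, so it cannot run past
    // the end of the current word the way a greedy .+ does.
    static String[] captureTokens(String input) {
        Matcher m = Pattern.compile("test is a (\\S+) word (\\S+)").matcher(input);
        if (!m.find()) {
            return new String[0]; // no match: return an empty array
        }
        return new String[] { m.group(1), m.group(2) };
    }

    public static void main(String[] args) {
        String test = "test is a word word word word big small";
        for (String token : captureTokens(test)) {
            System.out.println(token); // prints "word" twice
        }
    }
}
```

With \S+ the result no longer depends on greedy versus reluctant matching for this input, because each group is bounded by the surrounding whitespace.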
