Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share
menu search
person
Welcome To Ask or Share your Answers For Others

Categories

In my program, I have a string (obtained from an external library) which doesn't match any regular expression.

String content = // extract text from PDF
assertTrue(content.matches(".*")); // fails
assertTrue(content.contains("S P E C I A L")); // passes
assertTrue(content.matches("S P E C I A L")); // fails

Any idea what might be wrong? When I print content to stdout, it looks ok.

Here is the code for extracting text from the PDF (I am using iText 5.0.1):

PdfReader reader = new PdfReader(source);
PdfTextExtractor extractor = new PdfTextExtractor(reader,
    new SimpleTextExtractingPdfContentRenderListener());
return extractor.getTextFromPage(1);
See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
thumb_up_alt 0 like thumb_down_alt 0 dislike
235 views
Welcome To Ask or Share your Answers For Others

1 Answer

By default, the . does not match line breaks. So my guess is that your content contains a line break.

Also note that matches will match the entire string, not just a part of it: it does not do what contains does!

Some examples:

String s = "foo
bar";
System.out.println(s.matches(".*"));       // false
System.out.println(s.matches("foo"));      // false
System.out.println(s.matches("foo
bar")); // true
System.out.println(s.matches("(?s).*"));   // true

The (?s) in the last example will cause the . to match line breaks as well. So (?s).* will match any string.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
thumb_up_alt 0 like thumb_down_alt 0 dislike
Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share
...