Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share

I have a multiline HTML document that I am trying to get some stuff from. I'm using Java's regex (I know - XML parsers bla bla bla, just bear with me here please :) ).

    dfahfadhadaaaa<object classid="java:com.sun.java.help.impl.JHSecondaryViewer" width="14" height="14">
    <param name="content" value="../Glossary/glInterlinkedTask.html">

    <param name="text" value="interlinked task">
    <param name="viewerActivator" value="javax.help.LinkLabel">
    <param name="viewerStyle" value="javax.help.Popup">
    <param name="viewerSize" value="390,340">
    <param name="textFontFamily" value="SansSerif">
    <param name="textFontWeight" value="plain">
    <param name="textFontStyle" value="italic">
    <param name="textFontSize" value="12pt">
    <param name="textColor" value="blue">

    <param name="iconByID" value="">
    </object>
    sjtsjsrjrsjsrjsrj

I've got this HTML in a string: input.

    input = input.replaceAll("<object classid=\"java:com.sun.java.help.impl.JHSecondaryViewer.*?object>", "buh bye!");

Obviously, it's not working. HOWEVER, I can get a pattern match if I use Pattern.compile with Pattern.DOTALL.

So, my question is - how can I do something like Pattern.DOTALL with String.replaceAll?


1 Answer

Attach (?s) to the front of your pattern:

    input = input.replaceAll("(?s)<object classid=\"java:com\\.sun\\.java\\.help\\.impl\\.JHSecondaryViewer.*?object>", "buh bye!");

From the Javadoc:

Dotall mode can also be enabled via the embedded flag expression (?s). (The s is a mnemonic for "single-line" mode, which is what this is called in Perl.)
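
To see the two approaches side by side, here is a minimal sketch. The class name DotallReplaceDemo, its method names, and the shortened sample input are made up for illustration; only String.replaceAll, Pattern.compile, Pattern.DOTALL and Matcher.replaceAll come from the JDK.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; the names here are illustrative only.
final class DotallReplaceDemo {
    // Pattern from the answer: (?s) makes '.' match line terminators as well.
    static final String WITH_EMBEDDED_FLAG =
        "(?s)<object classid=\"java:com\\.sun\\.java\\.help\\.impl\\.JHSecondaryViewer.*?object>";

    // Same pattern without (?s), compiled with the explicit DOTALL flag instead.
    static final Pattern WITH_DOTALL = Pattern.compile(
        "<object classid=\"java:com\\.sun\\.java\\.help\\.impl\\.JHSecondaryViewer.*?object>",
        Pattern.DOTALL);

    // Variant 1: the embedded flag works directly with String.replaceAll.
    static String stripWithEmbeddedFlag(String input) {
        return input.replaceAll(WITH_EMBEDDED_FLAG, "buh bye!");
    }

    // Variant 2: a precompiled Pattern, handy when the regex is reused often.
    static String stripWithCompiledPattern(String input) {
        return WITH_DOTALL.matcher(input).replaceAll("buh bye!");
    }

    public static void main(String[] args) {
        // Shortened stand-in for the HTML in the question (an assumption).
        String input = "before<object classid=\"java:com.sun.java.help.impl.JHSecondaryViewer\" width=\"14\">\n"
            + "<param name=\"text\" value=\"interlinked task\">\n"
            + "</object>\nafter";
        System.out.println(stripWithEmbeddedFlag(input));
        System.out.println(stripWithCompiledPattern(input));
    }
}
```

Both calls should produce the same result, since (?s) and Pattern.DOTALL switch on the same mode. The compiled variant is mostly worth it when the same pattern runs many times.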

Other flags work this way as well:

Special constructs (non-capturing)

...

(?idmsux-idmsux) Nothing, but turns match flags i d m s u x on - off
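
As a rough illustration of that table entry, the sketch below uses a few of the other embedded flags with String.replaceAll and String.matches. The class and method names (EmbeddedFlagsDemo and so on) are invented for this example; the flag syntax itself is the one quoted from the Javadoc.

```java
// Hypothetical demo class; names are illustrative only.
final class EmbeddedFlagsDemo {
    // (?i) turns on case-insensitive matching for the rest of the pattern.
    static String replaceClosingTagIgnoringCase(String s) {
        return s.replaceAll("(?i)</object>", "[end]");
    }

    // (?m) makes ^ match at the start of every line, not just the start of input.
    static String quoteEachLine(String s) {
        return s.replaceAll("(?m)^", "> ");
    }

    // Flags can be combined: (?is) is case-insensitive plus dotall.
    static String dropObjectElements(String s) {
        return s.replaceAll("(?is)<object.*?</object>", "");
    }

    // A flag can be switched off again mid-pattern with (?-i).
    static boolean firstHalfAnyCase(String s) {
        return s.matches("(?i)abc(?-i)def");
    }

    public static void main(String[] args) {
        System.out.println(replaceClosingTagIgnoringCase("x</OBJECT>y"));
        System.out.println(quoteEachLine("a\nb"));
        System.out.println(dropObjectElements("a<OBJECT x>\n</Object>b"));
        System.out.println(firstHalfAnyCase("ABCdef") + " " + firstHalfAnyCase("ABCDEF"));
    }
}
```

In the last method, (?-i) ends case-insensitive matching, so "def" has to appear in lower case while "abc" may appear in any case.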

On a side note, if your goal is to remove unsafe objects from HTML from an untrusted source, please don't use regular expressions, and please don't blacklist tags.

