Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share

I have a problem with jsoup. I am trying to fetch a document from a URL that redirects to another URL through a meta refresh tag, but the redirect is not followed. For example, http://www.amerisourcebergendrug.com automatically redirects to http://www.amerisourcebergendrug.com/abcdrug/ through its meta refresh URL, but jsoup stays on http://www.amerisourcebergendrug.com and never fetches http://www.amerisourcebergendrug.com/abcdrug/.

Document doc = Jsoup.connect("http://www.amerisourcebergendrug.com").get();

I have also tried:

Document doc = Jsoup.connect("http://www.amerisourcebergendrug.com").followRedirects(true).get();

but neither works.

Any workaround for this?

Update: The page may use a meta refresh redirect.


1 Answer

Update (case-insensitive and fairly fault-tolerant):


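import java.net.URI;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class MetaRefreshExample {

    public static void main(String[] args) throws Exception {

        URI uri = URI.create("http://www.amerisourcebergendrug.com");

        Document d = Jsoup.connect(uri.toString()).get();

        // The content attribute is either "<delay>" or "<delay>; url=<target>"
        Pattern refreshPattern = Pattern.compile("(?si)\\d+;\\s*url=(.+)|\\d+");

        for (Element refresh : d.select("html head meta[http-equiv=refresh]")) {

            Matcher m = refreshPattern.matcher(refresh.attr("content"));

            // find the first one that is valid
            if (m.matches()) {
                // group(1) is null for a delay-only refresh, which just reloads the page
                if (m.group(1) != null)
                    d = Jsoup.connect(uri.resolve(m.group(1)).toString()).get();
                break;
            }
        }

        // Print the URI the final document was loaded from
        System.out.println(d.baseUri());
    }
}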

Outputs correctly:

http://www.amerisourcebergendrug.com/abcdrug/
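
The parsing step can also be pulled out into a small helper that does not touch the network, which makes it easier to check against different content values. The following is only a sketch under stated assumptions: the class name MetaRefreshParser, the method name target, and the sample content strings are all hypothetical, and the pattern is a slightly more lenient variant of the one above (it also accepts a quoted URL), not something jsoup provides.

```java
import java.net.URI;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not part of jsoup): extracts the redirect target
// from the content attribute of a <meta http-equiv="refresh"> tag.
class MetaRefreshParser {

    // "<delay>" optionally followed by "; url=<target>", where the target
    // may be wrapped in single or double quotes. Matching is case-insensitive.
    private static final Pattern REFRESH = Pattern.compile(
            "\\s*\\d+\\s*(?:;\\s*url\\s*=\\s*(['\"]?)(.+?)\\1)?\\s*",
            Pattern.CASE_INSENSITIVE);

    // Returns the target resolved against base, or empty when the content
    // has no URL part, does not look like a refresh value, or the URL
    // is not a syntactically valid URI.
    static Optional<URI> target(URI base, String content) {
        Matcher m = REFRESH.matcher(content);
        if (!m.matches() || m.group(2) == null) {
            return Optional.empty();
        }
        try {
            return Optional.of(base.resolve(m.group(2).trim()));
        } catch (IllegalArgumentException e) {
            // URI.resolve(String) rejects illegal characters such as spaces
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        URI base = URI.create("http://www.amerisourcebergendrug.com");
        // Assumed content value; the real page's attribute may differ
        System.out.println(target(base, "0; url=/abcdrug/"));
        // Delay-only refresh: there is no target to follow
        System.out.println(target(base, "30"));
    }
}
```

With jsoup, the content string would come from refresh.attr("content") and the base from the URI the document was fetched from, as in the answer's loop.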

Old answer:

Are you sure that it isn't working? For me:

System.out.println(Jsoup.connect("http://www.ibm.com").get().baseUri());

outputs http://www.ibm.com/us/en/ correctly.

