Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

I want to scrape an HTTPS website, but the request fails.

Here is my code:

require(rvest)
url <- "https://www.sunnyplayer.com/de/"
content <- read_html(url)

But I get this error in the console: "Error in open.connection(x, "rb") : Timeout was reached". How can I fix this problem?


1 Answer

The same thing happens to me behind a proxy. To get around it, use download.file to save the page to a local file, then parse that file with read_html.

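# Save the page to a local file first (any writable path works)
download.file(url, destfile = 'C:/whatever.html')
# Then parse the saved copy instead of the live URL
content <- read_html('C:/whatever.html')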
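
The two steps above can be wrapped in a small helper. This is only a minimal sketch, not part of the original answer: the function name fetch_html, the temporary-file default, and the 120-second timeout are assumptions made for illustration. It relies on download.file honouring the timeout option (in seconds), and it will not help if the site or the proxy blocks the request outright.

```r
library(rvest)  # provides read_html()

# Hypothetical helper (name and defaults are assumptions, not from the answer):
# download the page to a local file, then parse the saved copy.
fetch_html <- function(url, destfile = tempfile(fileext = ".html"), timeout = 120) {
  # Raise the download timeout for this call only, then restore the old value
  old <- options(timeout = timeout)
  on.exit(options(old), add = TRUE)

  # mode = "wb" writes the bytes unchanged, which matters on Windows
  status <- download.file(url, destfile = destfile, mode = "wb", quiet = TRUE)
  if (status != 0) stop("download failed with status ", status)

  # Parse the local file rather than opening a connection to the URL
  read_html(destfile)
}

# Usage with the URL from the question (needs network access):
# content <- fetch_html("https://www.sunnyplayer.com/de/")
```

Writing to a temporary file by default avoids hard-coding a path such as C:/whatever.html; pass destfile explicitly if you want to keep the downloaded copy.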
