Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


I am trying to scrape this website using RSelenium. I have scraped most of the page's content successfully, but I am stuck on the "facility visits" and "facility complaints" sections. Both buttons have a JavaScript href when I inspect them with developer tools, so I have been using PhantomJS with RSelenium.

I can navigate to the page via PhantomJS, but whenever I try to extract the text from the fields using $getElementText, I get the following error:

Selenium message:{"errorMessage":"Element does not exist in cache","request":{"headers":{"Accept":"application/json, text/xml, application/xml, */*","Accept-Encoding":"gzip, deflate","Host":"localhost:4444","User-Agent":"libcurl/7.53.1 r-curl/2.6 httr/1.2.1"},"httpVersion":"1.1","method":"GET","url":"/attribute/id","urlParsed":{"anchor":"","query":"","file":"id","directory":"/attribute/","path":"/attribute/id","relative":"/attribute/id","port":"","host":"","password":"","user":"","userInfo":"","authority":"","protocol":"","source":"/attribute/id","queryKey":{},"chunks":["attribute","id"]},"urlOriginal":"/session/c0f30500-55d0-11e7-96dd-3b147ee40d88/element/:wdc:1497974074536/attribute/id"}}

Error: Summary: StaleElementReference Detail: An element command failed because the referenced element is no longer attached to the DOM. class: org.openqa.selenium.StaleElementReferenceException Further Details: run errorDetails method

When I use $getCurrentUrl and $screenshot(display = TRUE), they show the correct URL and the correct page rendered.

I know it has something to do with how elements are attached to the DOM, but I am not sure how to resolve the issue in R.

Code below:

library(RSelenium)

url <- "https://dhs.arkansas.gov/dccece/cclas/FacilityInformation.aspx?FacilityNumber=23516"
rd <- remoteDriver(browserName = "phantomjs")
rd$open()
rd$navigate(url)

# Locate and click the "facility visits" link
webElem <- rd$findElement(using = "xpath", value = '//*[@id="ctl00_ContentPlaceHolder1_lbtnVisits"]')
webElem$clickElement()

# These calls reuse webElem after the click; the error above comes from the attribute/id request
webElem$findElements("css", "#aspnetForm > div.page > div.main")
webElem$getElementAttribute("id")


1 Answer

You are most likely getting the StaleElementReference error because you keep using webElem after clicking it.

The click probably causes the page to update, so the element that webElem refers to is replaced or modified in the DOM. The old reference is then no longer attached to the DOM and is considered "stale", so any further call on it fails.


An easy fix is to simply re-locate webElem after it is clicked:

webElem <- rd$findElement(...
webElem$clickElement()
webElem <- rd$findElement(... # re-locate webElem
webElem$findElements('css',"#aspnetForm > div.page > div.main")
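
If the page takes a moment to update after the click, a single re-locate can still hit a stale or missing element. Below is a minimal sketch of one way to handle that, not a tested solution for this particular site: it re-runs the lookup on every attempt so an old handle is never reused. The names with_fresh_element and scrape_visits are hypothetical helpers made up for this example, the selectors are copied from the question, and the retry count and wait time are arbitrary assumptions.

```r
# Hypothetical helper: call locate() to get a fresh element on every attempt,
# then apply action() to it. Retries when either step raises an error.
with_fresh_element <- function(locate, action, attempts = 3, wait = 0.5) {
  last_error <- NULL
  for (i in seq_len(attempts)) {
    outcome <- tryCatch(
      list(ok = TRUE, value = action(locate())),
      error = function(e) list(ok = FALSE, error = e)
    )
    if (outcome$ok) return(outcome$value)
    last_error <- outcome$error
    Sys.sleep(wait)  # give the page time to finish updating
  }
  stop(last_error)   # all attempts failed: re-raise the last error
}

# Hypothetical wrapper; `rd` is assumed to be an already opened
# RSelenium remoteDriver (rd$open() has been called).
scrape_visits <- function(rd, url) {
  rd$navigate(url)
  link <- rd$findElement(
    using = "xpath",
    value = '//*[@id="ctl00_ContentPlaceHolder1_lbtnVisits"]'
  )
  link$clickElement()
  # Do not reuse `link` after the click; search from the driver instead.
  with_fresh_element(
    locate = function() rd$findElement(
      using = "css selector",
      value = "#aspnetForm > div.page > div.main"
    ),
    action = function(el) el$getElementText()[[1]]
  )
}
```

With the rd and url objects from the question, scrape_visits(rd, url) would return the text of the main container, assuming that selector still matches after the click. Searching from rd rather than from the clicked link avoids depending on the old handle at all.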
