Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share

I need to render an HTML page server-side and "extract" the raw bytes of a canvas element so I can save it as a PNG. The problem is that the canvas element is created from JavaScript (I'm using jQuery's Flot to generate a chart, basically). So I guess I need a way to "host" the DOM and JavaScript functionality of a browser without actually using the browser. I settled on mshtml (but I'm open to any and all suggestions), as it seems it should be able to do exactly that. This is an ASP.NET MVC project.

I've searched far and wide and haven't seen anything conclusive.

So I have this simple HTML page, kept as minimal as possible to demonstrate the problem,

<!DOCTYPE html>
<html>
<head>
    <title>Wow</title>
    <script src="http://ajax.aspnetcdn.com/ajax/jQuery/jquery-1.7.1.min.js" type="text/javascript"></script>
</head>
<body>
    <div id="hello">
    </div>
    <script type="text/javascript">
        function simple() 
        {
            $("#hello").append("<p>Hello</p>");
        }                    
    </script>
</body>
</html>

which produces the expected output when run from a browser.

I want to be able to load the original HTML into memory, execute the JavaScript function, then manipulate the final DOM tree. I cannot use any System.Windows.WebBrowser-like class, as my code needs to run in a service environment.

So here's my code:

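        // Needs: using System.IO; using System.Net; using System.Threading; using mshtml;
        // `url` and `SleepTime` are defined elsewhere in the calling code.
        IHTMLDocument2 domRoot = (IHTMLDocument2)new HTMLDocument();

        // Download the page source and write it into the in-memory document.
        using (WebClient wc = new WebClient())
        {
            using (var stream = new StreamReader(wc.OpenRead((string)url)))
            {
                string html = stream.ReadToEnd();
                domRoot.write(html);
                domRoot.close();
            }
        }

        // Wait until the document reports that it has finished loading.
        while (domRoot.readyState != "complete")
            Thread.Sleep(SleepTime);

        // Snapshot of the body before any script is run.
        string beforeScript = domRoot.body.outerHTML;

        // Ask the document's window to execute script code.
        IHTMLWindow2 parentWin = domRoot.parentWindow;
        parentWin.execScript("simple");

        while (domRoot.readyState != "complete")
            Thread.Sleep(SleepTime);

        // Snapshot of the body after the script call, for comparison.
        string afterScript = domRoot.body.outerHTML;

        // Release the underlying COM object.
        System.Runtime.InteropServices.Marshal.FinalReleaseComObject(domRoot);
        domRoot = null;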

The problem is that "beforeScript" and "afterScript" are exactly the same. The IHTMLDocument2 instance goes through the normal "uninitialized", "loading", "complete" cycle, and no errors are thrown.

Anybody have any ideas on what I'm doing wrong? Completely lost here.


1 Answer

You could consider using WatiN. Generate your page, then use the WatiN API to capture the rendered page to a file.

http://fwdnug.com/blogs/ddodgen/archive/2008/06/19/watin-api-capturewebpagetofile.aspx
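
Separately from WatiN, one thing that may be worth checking in the question's snippet is the string handed to execScript. In JavaScript, a bare function name is just an expression that evaluates to the function object and does not invoke it; only the name followed by parentheses performs the call. The sketch below illustrates that difference in plain JavaScript. It is only an illustration under stated assumptions: `runScript` is a hypothetical stand-in for a host method that evaluates a string of script, and the `appended` array stands in for the `#hello` element. It does not use mshtml, and it does not show whether external scripts such as jQuery get loaded in that environment.

```javascript
// Stand-in for the page's DOM (assumption): an array instead of the #hello element.
const appended = [];

// Same shape as the function in the question, minus jQuery.
function simple() {
  appended.push("<p>Hello</p>");
}

// Hypothetical stand-in for a host call that evaluates a string of script code.
function runScript(code) {
  return eval(code);
}

// A bare name evaluates to the function object; the function body does not run.
const refOnly = runScript("simple");
const countAfterRef = appended.length;

// A call expression actually invokes the function.
runScript("simple()");
const countAfterCall = appended.length;

console.log(typeof refOnly, countAfterRef, countAfterCall); // function 0 1
```

If the same reasoning applies to the hosted document, passing a call expression rather than the bare name would be the first thing to try before reaching for a different tool.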

