Automating XML and HTML processing is an important goal of Xillio Content Tools. You can crawl and scrape websites, extract exactly the parts of content you need from pages, APIs, or feeds, and let robots build new XML or clean up and modify HTML. This article aims to give a complete overview of the possibilities and to explain how to use all built-in HTML/XML functionality.
Loading a web document is very simple in itself. Open a new robot and put the following code in it:
html_page = loadpage("http://www.google.com");
Now put a breakpoint on the second line, press play, and look at the debug panel. You should see the NODE variable. Note that the preview offers two tabs: Source and Web. The Web tab is only meant to give you a quick glance at which page you got. Don't rely on it for debugging, since it is rendered with a different engine than the internal page variable. The Source tab is the one you need.
The same function also loads XML documents, for example from an API:
xml_doc = loadpage("http://www.omdbapi.com/?t=back+to+the+future&y=&plot=full&r=xml");
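To make concrete what kind of document such a request returns and how its parts can be picked out, here is a minimal sketch in Python rather than Xill. It uses only the standard library and does not touch the network. The sample XML, the element and attribute names, and the helper movie_attributes are all assumptions made for illustration; the real API response may be shaped differently.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, NOT captured from the real API.
# The element and attribute names are assumptions for illustration only.
SAMPLE_XML = """<root response="True">
  <movie title="Back to the Future" year="1985" />
</root>"""


def movie_attributes(xml_text):
    """Parse an XML string and return the attributes of the first <movie> element.

    Returns an empty dict when the document contains no <movie> child.
    """
    root = ET.fromstring(xml_text)   # build the element tree from text
    movie = root.find("movie")       # first direct child named <movie>
    return dict(movie.attrib) if movie is not None else {}


if __name__ == "__main__":
    attrs = movie_attributes(SAMPLE_XML)
    print(attrs.get("title"), attrs.get("year"))
```

The point of the sketch is that a loaded XML document is a tree of nodes with attributes, which is the same kind of structure the debug panel shows for a loaded page.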
There is a lot more to web navigation in Xill, such as the click() and input() functions. That is beyond the scope of this article, but you can read about it in the web navigation tutorial.
Loading from the local file system can be done by using loadxml,