Module: util/html

HTML parser/scraper utilities.

Methods


<static> extract(ii, cb)

Extract HTML information from a URL (http/https/file).
Parameters:

    Name  Type      Description
    ii    Object    Input Information

        Properties
        Name      Type                     Argument    Description
        page      string                               URL ('http://', 'https://' or 'file://')
        selector  string | Array.<string>              CSS selector
        encoding  string                   <optional>  HTML encoding (default 'utf8')
        paginate  string                   <optional>  CSS selector for pagination
        result    string                   <optional>  Result object CSS selectors

    cb    function  Callback

        Properties
        Name  Type                     Description
        er    Error                    Error
        data  string | Array.<string>  Output data

Example
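// Load the HTML utility module.
var html = mbot.load('util/html');
// Extract the text of every element matching the selector.
html.extract({
    page: 'https://news.google.com/news/',
    selector: ['h2 > a > .titletext']
}, function(er, data) {
    // er is set on failure; otherwise data holds the extracted output.
    if (er)
        console.log('error: ' + er);
    else
        console.log('data: ' + data.join('\n'));
});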
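
The callback form above can also be wrapped in a Promise. The sketch below is not part of util/html: checkInput, toArray and extractAsync are hypothetical helper names introduced here for illustration. It relies only on the documented shape of ii (a page URL starting with 'http://', 'https://' or 'file://', and a selector that is a string or an array of strings) and on the documented callback arguments (er, data). Treating a missing data value as an empty array is an assumption, since the documentation does not say whether that case can occur.

```javascript
// Hypothetical helper (not part of util/html): checks the documented
// shape of `ii` and returns an Error, or null when the input looks valid.
function checkInput(ii) {
    if (!ii || typeof ii.page !== 'string' ||
            !/^(https?|file):\/\//.test(ii.page))
        return new Error('page must start with http://, https:// or file://');
    var sel = ii.selector;
    var ok = typeof sel === 'string' ||
        (Array.isArray(sel) && sel.every(function(s) {
            return typeof s === 'string';
        }));
    if (!ok)
        return new Error('selector must be a string or an array of strings');
    return null;
}

// Hypothetical helper: `data` is documented as string | Array.<string>,
// so normalize it to an array. Mapping null/undefined to [] is an assumption.
function toArray(data) {
    if (data === undefined || data === null)
        return [];
    return Array.isArray(data) ? data : [data];
}

// Hypothetical wrapper: calls html.extract(ii, cb) and returns a Promise
// that resolves with an array of strings or rejects with the error.
// `html` is any object exposing extract(ii, cb) as documented above.
function extractAsync(html, ii) {
    return new Promise(function(resolve, reject) {
        var bad = checkInput(ii);
        if (bad)
            return reject(bad);
        html.extract(ii, function(er, data) {
            if (er)
                reject(er);
            else
                resolve(toArray(data));
        });
    });
}
```

With the module loaded as in the example above, a call would look like extractAsync(html, { page: ..., selector: ... }).then(function(lines) { ... }). Passing the module in as an argument keeps the wrapper independent of how util/html is loaded.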