Server-side DOM implementation based on Mozilla's dom.js

As the name might suggest, domino's goal is to provide a DOM in Node.

In contrast to the original dom.js project, domino was not designed to run untrusted code. It therefore doesn't have to hide its internals behind a proxy facade, which makes the code both simpler and faster.

Domino currently doesn't use any harmony/ES6 features like proxies or WeakMaps and therefore also runs in older Node versions.

Speed over Compliance

Domino is intended for building pages rather than scraping them. It therefore neither executes scripts nor downloads external resources.

Domino also generally does not implement properties that have been deprecated in HTML5.

Domino sticks to DOM level 4, which means that Attributes do not inherit the Node interface.

Note that because domino does not use proxies, Element.attributes is not a true JavaScript array; it is an object with a length property and an item(n) accessor method. It does, however, implement direct indexed accessors (element.attributes[i]) and is live. See GitHub issue #27 for further discussion.
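
As an illustration of the array-like interface described above, the following sketch copies an element's attributes into a plain object using only the length property and indexed access. The helper name attributesToObject is hypothetical and not part of domino, and the sketch assumes each attribute exposes name and value, as Attr objects do in the DOM.

```javascript
// Hypothetical helper (not part of domino): snapshot an element's attributes
// into a plain object, relying only on `length` and indexed access.
function attributesToObject(element) {
  const attrs = element.attributes;
  const result = {};
  for (let i = 0; i < attrs.length; i++) {
    const attr = attrs[i]; // attrs.item(i) reaches the same entry
    result[attr.name] = attr.value;
  }
  return result;
}
```

With domino this could be called as attributesToObject(document.querySelector('a')). Because the collection is live, taking a snapshot like this first is a reasonable precaution before adding or removing attributes in a loop.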

CSS Selector Support

Domino provides support for querySelector(), querySelectorAll(), and matches() backed by the Zest selector engine.
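
A minimal sketch of how these methods combine: it collects the text of every element matching one selector while skipping those that also match a second one. The helper name selectText and its parameters are hypothetical; only querySelectorAll(), matches(), and textContent are taken from the DOM API.

```javascript
// Hypothetical helper (not part of domino): text of all elements under `root`
// that match `selector`, skipping any that also match `exclude`.
function selectText(root, selector, exclude) {
  const nodes = root.querySelectorAll(selector);
  const texts = [];
  // Index-based loop: the result is treated as array-like, not as an Array.
  for (let i = 0; i < nodes.length; i++) {
    const node = nodes[i];
    if (exclude && node.matches(exclude)) {
      continue;
    }
    texts.push(node.textContent);
  }
  return texts;
}
```

With a domino document, a call could look like selectText(document, 'li', '.hidden').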

Optimization

Domino represents the DOM tree structure the same way WebKit and other browser-based implementations do: as a linked list of children, which is converted to an array-based representation if and only if the Node#childNodes accessor is used. Tree modification code (inserting and removing children) performs best if you avoid Node#childNodes and instead traverse the tree using Node#firstChild/Node#nextSibling (or Node#lastChild/Node#previousSibling), or use querySelector() and related methods.
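
The traversal style recommended above can be sketched as follows. The helper name forEachChild is hypothetical; it relies only on Node#firstChild and Node#nextSibling and never touches Node#childNodes.

```javascript
// Hypothetical helper (not part of domino): visit each direct child of
// `parent` by following sibling links instead of reading `childNodes`.
function forEachChild(parent, visit) {
  let child = parent.firstChild;
  while (child) {
    // Read the next sibling before calling `visit`, so the callback
    // may safely remove or move `child`.
    const next = child.nextSibling;
    visit(child);
    child = next;
  }
}
```

A typical use would be removing selected children while walking, for example forEachChild(list, function (node) { if (node.nodeType === 8) list.removeChild(node); }) to drop comment nodes.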

Usage

Domino supports the DOM level 4 API, and thus API documentation can be found on standard reference sites. For example, you could start from MDN's documentation for Document and Node.

The only exception is the initial creation of a document:

var domino = require('domino');
var Element = domino.impl.Element; // etc

var window = domino.createWindow('<h1>Hello world</h1>', 'http://example.com');
var document = window.document;

// alternatively: document = domino.createDocument(htmlString, true)

var h1 = document.querySelector('h1');
console.log(h1.innerHTML);
console.log(h1 instanceof Element);

There is also an incremental parser available, if you need to interleave parsing with other processing:

var domino = require('domino');

var pauseAfter = function(ms) {
  var start = Date.now();
  return function() { return (Date.now() - start) >= ms; };
};

var incrParser = domino.createIncrementalHTMLParser();
incrParser.write('<p>hello<');
incrParser.write('b>&am');
incrParser.process(pauseAfter(1/*ms*/)); // can interleave processing
incrParser.write('p;');
// ...etc...
incrParser.end(); // when done writing the document

while (incrParser.process(pauseAfter(10/*ms*/))) {
  // ...do other housekeeping...
}

console.log(incrParser.document().outerHTML);
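
The same write/process/end sequence can be wrapped in a small helper when the input arrives as a list of chunks. This is only a sketch: the name parseInSlices and the onYield callback are hypothetical, and it assumes, as the loop above does, that process() returns a truthy value while work remains.

```javascript
// Hypothetical helper (not part of domino): feed chunks to an incremental
// parser, giving each processing step a time budget of `sliceMs`.
function parseInSlices(parser, chunks, sliceMs, onYield) {
  // Same idea as pauseAfter() above: the returned function reports true
  // once `ms` milliseconds have elapsed.
  const pauseAfter = (ms) => {
    const start = Date.now();
    return () => Date.now() - start >= ms;
  };
  const yieldToCaller = typeof onYield === 'function' ? onYield : () => {};

  for (const chunk of chunks) {
    parser.write(chunk);
    parser.process(pauseAfter(sliceMs)); // parse part of what has arrived
    yieldToCaller();                     // let the caller do other work
  }
  parser.end(); // no more input will be written

  // Assumption: process() stays truthy until the document is complete.
  while (parser.process(pauseAfter(sliceMs))) {
    yieldToCaller();
  }
  return parser.document();
}
```

With domino, a call could look like parseInSlices(domino.createIncrementalHTMLParser(), ['<p>hello<', 'b>&am', 'p;'], 10). The helper is synchronous, so onYield is the only place where other housekeeping can run between slices.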

If you want a more standards-compliant way to create a Document, you can also use DOMImplementation:

var domino = require('domino');
var domimpl = domino.createDOMImplementation();
var doc = domimpl.createHTMLDocument();

By default, many domino methods are stored in writable properties to allow polyfills (as browsers do). If desired, you can lock down the implementation as follows:

global.__domino_frozen__ = true; // Must precede any `require('domino')`
var domino = require('domino');

Tests

Domino includes tests from the W3C DOM Conformance Suites as well as tests from the HTML Working Group.

The tests can be run via npm test or directly through the Mocha command line.

License and Credits

The majority of the code was originally written by Andreas Gal and David Flanagan as part of the dom.js project. Please refer to the included LICENSE file for the original copyright notice and disclaimer.

Felix Gnass extracted the code and turned it into a stand-alone npm package.

The code has been maintained since 2013 by C. Scott Ananian on behalf of the Wikimedia Foundation, which uses it in its Parsoid project. A large number of improvements have been made, mostly focusing on correctness, performance, and (to a lesser extent) completeness of the implementation.