libxml-to-js

SaltwaterC | 1.6k | BSD-3-Clause | 0.3.12

XML to JavaScript object parser based on libxmljs

xml, javascript, object, parser, libxml, libxmljs, namespace, cdata, xpath

readme

About

This is an XML to JavaScript object parser. It uses the libxmljs module for the actual XML parsing. It aims to be an easy replacement for xml2js v1, but it doesn't follow the xml2js API.

libxml-to-js uses the string parser method of libxmljs. It is basically a modified version of the algorithm from here, adapted to fit the formal specifications of xml2js output.

Installation

npm install libxml-to-js

Installing the underlying dependency, libxmljs, fails if you don't have gcc (or a compatible compiler), the libxml2 development headers, and the xml2-config script. On Linux, install your distribution's libxml2 development package: libxml2-dev (Debian, Ubuntu, etc.) or libxml2-devel (RHEL, CentOS, Fedora, etc.).

Usage mode

var parser = require('libxml-to-js');
var xml = 'xml string';

parser(xml, function (error, result) {
  if (error) {
    console.error(error);
  } else {
    console.log(result);
  }
});

With XPath query:

parser(xml, '//xpath/query', function (error, result) {
  if (error) {
    console.error(error);
  } else {
    console.log(result);
  }
});
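
Both call forms follow the Node.js callback convention, so they can be wrapped in a Promise if that suits the calling code better. The following is a minimal sketch, not part of libxml-to-js: parseAsync is a hypothetical helper name, and the parser function is passed in as an argument so the helper assumes nothing beyond the two call signatures shown above.

```javascript
// Hypothetical helper (not part of libxml-to-js): wraps a callback-style
// parser in a Promise. `parse` is expected to accept either
// (xml, callback) or (xml, xpath, callback), as in the examples above.
function parseAsync(parse, xml, xpath) {
  return new Promise(function (resolve, reject) {
    // Node-style callback: a truthy error rejects, anything else resolves.
    function done(error, result) {
      if (error) {
        reject(error);
      } else {
        resolve(result);
      }
    }

    // Only forward the XPath query when the caller supplied one.
    if (typeof xpath === 'string') {
      parse(xml, xpath, done);
    } else {
      parse(xml, done);
    }
  });
}
```

With the module loaded as parser, a call would look like parseAsync(parser, xml, '//xpath/query').then(console.log, console.error).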

Gotcha

Because libxmljs has no method for returning the namespace attributes of a specific element, namespaces aren't returned the way you might expect:

  • only the namespaces actually used by the XML document are returned. Unused namespaces are omitted, because namespaces are pushed into the returned object as the parsing recursion detects them.
  • the returned namespaces are attached as attributes of the root element, under the xmlns key, in order to keep the code simple.

Example from the WordPress RSS 2 feed:

<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
  xmlns:content="http://purl.org/rss/1.0/modules/content/"
  xmlns:wfw="http://wellformedweb.org/CommentAPI/"
  xmlns:dc="http://purl.org/dc/elements/1.1/"
  xmlns:atom="http://www.w3.org/2005/Atom"
  xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
  xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
  >
<!-- the rest of the doc -->
</rss>

is parsed as:

{ '@':
  { version: '2.0',
    xmlns:
      { atom: 'http://www.w3.org/2005/Atom',
        sy: 'http://purl.org/rss/1.0/modules/syndication/',
        dc: 'http://purl.org/dc/elements/1.1/',
        content: 'http://purl.org/rss/1.0/modules/content/',
        wfw: 'http://wellformedweb.org/CommentAPI/',
        slash: 'http://purl.org/rss/1.0/modules/slash/' } },
// the rest of the doc
}
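
Since the used namespaces end up under the xmlns key of the root element's attributes, they can be read back from the result object. A minimal sketch, assuming the result has the shape shown above; getNamespaces and prefixFor are hypothetical helper names, not part of libxml-to-js, and the sample object is a hand-written excerpt rather than real parser output:

```javascript
// Hypothetical helpers (not part of libxml-to-js). They assume the result
// shape shown above: root attributes under '@', namespaces under '@'.xmlns.
function getNamespaces(result) {
  var attrs = (result && result['@']) || {};
  return attrs.xmlns || {};
}

// Returns the prefix bound to the given namespace URI, or null if the
// namespace is absent (for example because the document never uses it).
function prefixFor(result, uri) {
  var namespaces = getNamespaces(result);
  var prefixes = Object.keys(namespaces);
  for (var i = 0; i < prefixes.length; i++) {
    if (namespaces[prefixes[i]] === uri) {
      return prefixes[i];
    }
  }
  return null;
}

// Hand-written excerpt of the parsed feed above.
var sample = {
  '@': {
    version: '2.0',
    xmlns: {
      atom: 'http://www.w3.org/2005/Atom',
      dc: 'http://purl.org/dc/elements/1.1/'
    }
  }
};

console.log(prefixFor(sample, 'http://www.w3.org/2005/Atom')); // atom
console.log(prefixFor(sample, 'http://example.com/unused'));   // null
```

Returning null for a missing URI keeps the "unused namespaces aren't returned" case explicit for the caller.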

Contributors

changelog

v0.3.12

  • Fixes #16 - Entities remain in output object text.
  • Replace the development tooling.

v0.3.11

  • Adds more CDATA support #13. Thanking XApp-Studio for the patch.
  • jslint compliant.

v0.3.10

  • Fixes a couple of global variable leaks #10.

v0.3.9

  • Takes a safer approach to the err argument of the catch block in the exported method. It turns out that in production the err argument may be undefined, which breaks things.

v0.3.8

  • Fixes #8 regarding the error handling inside the passed callback to the parser. Thanking kongelaks for reporting it.

v0.3.7

  • Fixes #6 simple test case losing attribute name. Thanks to Richard Anaya for the contribution.

v0.3.6

  • Fixes #5 which was introduced in v0.3.5.

v0.3.5

  • Improved the text kludge and namespaces support. Thanks to @VirgileD for the contribution.

v0.3.4

  • XPath queries support for parsing just parts of the XML document. Thanks to @Marsup for the contribution.

v0.3.3

  • Refactored the namespace support in order to make it more stable. The parser used to crash on large XML documents in a nondeterministic manner (missing method errors or segmentation faults, for the same input).

v0.3.2

  • Does not return the namespace at all if the prefix is null.

v0.3.1

  • The returned error argument is now an instance of Error().
  • In case of error, the result argument is not returned.
  • Ignores the namespace prefix if the namespace prefix is null.

v0.3

  • The error argument is null in case of successful execution in order to follow the node.js convention. This may break some code if the evaluation is made against 'undefined'.
  • Won't recurse if the child's name is 'undefined'.
  • XML namespace support into the key names. This may break some existing code.
  • CDATA support for the values.
  • Support for returning the used namespaces into the XML document.
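
The change to the error argument matters for callers that compared it against undefined. A minimal sketch of the difference; fakeParser is a hypothetical stand-in that only mimics the v0.3 callback convention, so the snippet runs without libxml-to-js installed:

```javascript
// Hypothetical stand-in for the parser: on success it passes null as the
// error argument, as described for v0.3.
function fakeParser(xml, callback) {
  callback(null, { parsed: xml });
}

var outcomes = {};

fakeParser('<doc/>', function (error, result) {
  // Fragile: null !== undefined, so this check reports a failure
  // even though parsing succeeded.
  outcomes.strictUndefined = (error === undefined) ? 'ok' : 'failed';

  // Robust: a truthiness check treats both null and undefined as success.
  outcomes.truthy = error ? 'failed' : 'ok';
});

console.log(outcomes); // { strictUndefined: 'failed', truthy: 'ok' }
```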

v0.2.2

  • Cleaned up the garbage from the npm published package.

v0.2.1

  • Updated the package.json file with more data.

v0.2

  • Dropped the SAX parser in favor of maintaining the string parser. Basically it cuts the code base in half. xml2js might implement libxmljs as well, therefore nothing is really lost.

v0.1

  • Initial release, featuring SAX and string parsers.