# domhandler [![Build Status](https://travis-ci.org/fb55/domhandler.svg?branch=master)](https://travis-ci.org/fb55/domhandler)

The DOM handler (formerly known as DefaultHandler) creates a tree containing all nodes of a page. The tree may be manipulated using the [domutils](https://github.com/fb55/domutils) library.

## Usage
```javascript
var handler = new DomHandler([ <func> callback(err, dom), ] [ <obj> options ]);
// var parser = new Parser(handler[, options]);
```

Available options are described below.

## Example
```javascript
var htmlparser = require("htmlparser2");
var rawHtml = "Xyz <script language= javascript>var foo = '<<bar>>';< / script><!--<!-- Waah! -- -->";
var handler = new htmlparser.DomHandler(function (error, dom) {
    if (error) {
        // [...do something for errors...]
    } else {
        // [...parsing done, do something...]
        console.log(dom);
    }
});
var parser = new htmlparser.Parser(handler);
parser.write(rawHtml);
parser.end();
```

Output:

```javascript
[{
    data: 'Xyz ',
    type: 'text'
}, {
    type: 'script',
    name: 'script',
    attribs: {
        language: 'javascript'
    },
    children: [{
        data: 'var foo = \'<bar>\';<',
        type: 'text'
    }]
}, {
    data: '<!-- Waah! -- ',
    type: 'comment'
}]
```
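
Since the result is a plain array of nested objects, it can be walked with ordinary recursion. The following is a minimal sketch that works on a hand-copied version of the output above, so it runs without the parser; `collectText` is a hypothetical helper name, not part of domhandler or domutils.

```javascript
// Tree copied by hand from the example output above.
var dom = [{
    data: 'Xyz ',
    type: 'text'
}, {
    type: 'script',
    name: 'script',
    attribs: {
        language: 'javascript'
    },
    children: [{
        data: 'var foo = \'<bar>\';<',
        type: 'text'
    }]
}, {
    data: '<!-- Waah! -- ',
    type: 'comment'
}];

// Hypothetical helper: collects the `data` of every text node, depth-first.
// Comment nodes are skipped because their type is not 'text'.
function collectText(nodes) {
    var parts = [];
    nodes.forEach(function (node) {
        if (node.type === 'text') {
            parts.push(node.data);
        } else if (Array.isArray(node.children)) {
            parts = parts.concat(collectText(node.children));
        }
    });
    return parts;
}

console.log(collectText(dom));
```

For real projects, the domutils library mentioned earlier is likely the better place to look for ready-made traversal helpers.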

## Option: normalizeWhitespace

Indicates whether the whitespace in text nodes should be normalized, that is, whether every run of whitespace should be replaced with a single space. The default value is `false`.

The following HTML will be used:

```html
<font>
	<br>this is the text
<font>
```

### Example: true

```javascript
[{
    type: 'tag',
    name: 'font',
    children: [{
        data: ' ',
        type: 'text'
    }, {
        type: 'tag',
        name: 'br'
    }, {
        data: 'this is the text ',
        type: 'text'
    }, {
        type: 'tag',
        name: 'font'
    }]
}]
```

### Example: false

```javascript
[{
    type: 'tag',
    name: 'font',
    children: [{
        data: '\n\t',
        type: 'text'
    }, {
        type: 'tag',
        name: 'br'
    }, {
        data: 'this is the text\n',
        type: 'text'
    }, {
        type: 'tag',
        name: 'font'
    }]
}]
```
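
The difference between the two results can be reproduced with a single regular expression. This is only a sketch of the transformation as described for `normalizeWhitespace`, not the library's own code; `normalizeText` is a hypothetical helper name.

```javascript
// Hypothetical helper, not part of domhandler's API.
// Replaces every run of whitespace (spaces, tabs, newlines) with one space.
function normalizeText(data) {
    return data.replace(/\s+/g, ' ');
}

// Text node contents taken from the "false" example.
console.log(JSON.stringify(normalizeText('\n\t')));
console.log(JSON.stringify(normalizeText('this is the text\n')));
```

Applied to the text nodes of the `false` example, this yields the `data` values shown in the `true` example. Going by the signature in the Usage section, the option itself would be passed in the options object, for example `new DomHandler(callback, { normalizeWhitespace: true })`.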

## Option: withDomLvl1

Adds DOM level 1 properties to all elements.

<!-- TODO: description -->
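
The exact set of properties is not documented here. DOM Level 1 itself defines node properties such as `nodeValue`, `childNodes`, `firstChild` and `lastChild`, plus `tagName` on elements. The sketch below shows one way such names could be layered over the tree's own `name`, `data` and `children` fields. Both the `addDomLevel1Aliases` helper and the mapping are assumptions made for illustration, not domhandler's implementation.

```javascript
// Hypothetical helper for illustration only; not domhandler's implementation.
// Defines read-only DOM Level 1 style getters on top of the node's own fields.
function addDomLevel1Aliases(node) {
    Object.defineProperties(node, {
        tagName: { get: function () { return this.name; } },
        nodeValue: { get: function () { return this.data; } },
        childNodes: { get: function () { return this.children || []; } },
        firstChild: {
            get: function () {
                return (this.children && this.children[0]) || null;
            }
        },
        lastChild: {
            get: function () {
                var c = this.children;
                return (c && c[c.length - 1]) || null;
            }
        }
    });
    return node;
}

var font = addDomLevel1Aliases({
    type: 'tag',
    name: 'font',
    children: [
        { data: ' ', type: 'text' },
        { type: 'tag', name: 'br' }
    ]
});

console.log(font.tagName);
console.log(font.childNodes.length);
console.log(font.firstChild.data === ' ');
console.log(font.lastChild.name);
```

To see which properties the option really adds, enable it on a small document and inspect the resulting nodes.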

## Option: withStartIndices

Indicates whether a `startIndex` property will be added to nodes. When the parser is used in a non-streaming fashion, `startIndex` is an integer indicating the position of the start of the node in the document. The default value is `false`.

## Option: withEndIndices

Indicates whether an `endIndex` property will be added to nodes. When the parser is used in a non-streaming fashion, `endIndex` is an integer indicating the position of the end of the node in the document. The default value is `false`.
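
With both options enabled, a node's indices can be used to cut its original markup out of the source string. The sketch below uses a hand-written node instead of real parser output, and it assumes that `endIndex` points at the last character of the node (inclusive). That assumption should be checked against actual output before relying on it. `sourceOf` is a hypothetical helper name.

```javascript
var source = 'Xyz <br>';

// Hypothetical node with hand-written indices, not real parser output.
// Assumption: `endIndex` is the index of the node's last character (inclusive).
var node = { type: 'tag', name: 'br', startIndex: 4, endIndex: 7 };

// Hypothetical helper: returns the slice of `source` covered by `node`.
function sourceOf(node, source) {
    return source.slice(node.startIndex, node.endIndex + 1);
}

console.log(sourceOf(node, source));
```

Because the indices are only described for non-streaming use, a helper like this makes most sense when the whole document is passed to the parser at once.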