html-to-text npm guide
Advanced converter that parses HTML and returns beautiful text.

Features
Changelog

Available here: CHANGELOG.md

Version 6 contains a ton of changes, so it is worth taking a look.

Version 7 contains an important change for custom formatters.

Version 8 brings selectors support to greatly increase flexibility, but that also changes some things introduced in version 6. Base element(s) selection also got important changes.

Installation
Usage

Convert a single document:

    const { convert } = require('html-to-text');
    // There is also an alias to `convert` called `htmlToText`.
    const html = '

Configure once with `compile`, then reuse the returned function:

    const { compile } = require('html-to-text');
    const convert = compile({ wordwrap: 130 });
    const htmls = [ '

Options

General options
Deprecated or removed options
Other things deprecated:
Selectors

An example:

    const { convert } = require('html-to-text');
    const html = '<a href="/page.html">Page</a><a href="#" class="button">Action</a>';
    const text = convert(html, {
      selectors: [
        { selector: 'a', options: { baseUrl: 'https://example.com' } },
        { selector: 'a.button', format: 'skip' }
      ]
    });
    console.log(text); // Page [https://example.com/page.html]

The selectors array is our loose approximation of a stylesheet.
To achieve the best performance when checking each DOM element against the provided selectors, they are compiled into a decision tree. How you choose selectors also matters. For example,

Supported selectors

The following selectors can be used in any combination:
You can match ...

Predefined formatters

The following selectors have a formatter specified as part of the default configuration. Everything can be overridden, but you don't have to repeat the

More formatters are also available for use:
Format options

The following options are available for built-in formatters.
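As a rough illustration of how format options attach to selectors, the sketch below builds an options object only; it does not call the library. The option names `uppercase`, `ignoreHref` and `itemPrefix` are recalled from the upstream documentation and should be checked against the version in use.

```javascript
// Options object meant to be passed as the second argument of `convert`
// or as the argument of `compile`.
const options = {
  wordwrap: 80,
  selectors: [
    // Keep <h1> text in its original case (assumed option name: `uppercase`).
    { selector: 'h1', options: { uppercase: false } },
    // Print link text without the URL (assumed option name: `ignoreHref`).
    { selector: 'a', options: { ignoreHref: true } },
    // Use a dash as the unordered list bullet (assumed option name: `itemPrefix`).
    { selector: 'ul', options: { itemPrefix: ' - ' } },
    // Drop images entirely with the `skip` formatter.
    { selector: 'img', format: 'skip' }
  ]
};
```

Each entry names only the options it changes, so the default formatter for that selector stays in place unless `format` is given explicitly.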
Deprecated format options
Override formatting

This changed significantly in version 6.
Each formatter is a function of four arguments that returns nothing. Arguments are:
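In the upstream documentation the four arguments are named `elem`, `walk`, `builder` and `formatOptions`; treat those names as an assumption here, since the list is missing from this copy. The sketch below defines a hypothetical formatter with that shape and dry-runs it against stub objects rather than the real BlockTextBuilder, so it needs no installed package. The `marker` option is invented for the illustration.

```javascript
// Hypothetical formatter: surrounds the element's text with a marker character.
function emphasisFormatter(elem, walk, builder, formatOptions) {
  const marker = formatOptions.marker || '*'; // `marker` is a made-up option
  builder.addInline(marker);
  walk(elem.children, builder); // let the children contribute their text
  builder.addInline(marker);
  // No return value: the formatter only mutates the builder.
}

// Stubs for a dry run; these are stand-ins, not the library's own objects.
const output = [];
const stubBuilder = { addInline: (str) => output.push(str) };
const stubWalk = (children, builder) => {
  for (const child of children) {
    if (child.type === 'text') builder.addInline(child.data);
  }
};
const stubElem = { children: [{ type: 'text', data: 'hello' }] };

emphasisFormatter(stubElem, stubWalk, stubBuilder, { marker: '_' });
console.log(output.join('')); // _hello_
```

With the real library, such a function would presumably be registered under a name in the `formatters` option and referenced from a selector entry through `format`.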
Custom formatter example:

    const { convert } = require('html-to-text');
    const html = '

Refer to the built-in formatters for more examples. The easiest way to write your own is to pick an existing one and customize it.

Refer to BlockTextBuilder for available functions and arguments.

Note:

Command Line Interface

It is possible to use html-to-text as a command line interface. This allows easy validation of your generated text and integration with other systems that do not run on Node.js.

All options described above are also available. You can use them like this:
The Example
Contributors
License

MIT License