Tutorial for NodeJS's htmlparser?


I don't really understand the README of htmlparser, and I've searched the internet but can't find a proper tutorial for it (or for other Node.js parsers).

I believe that when a fairly complete, long-established library has no tutorial, it's usually because the library is easy enough to use that nobody feels the need to write one. But I find the Node.js HTML parser pretty hard to understand.

Problem courtesy of: songyy


You should check out htmlparser2. It's the newer htmlparser, and it has a decent README. I tend not to use it in a streaming fashion, so my code looks something like this:

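var htmlparser = require("htmlparser2");
var handler = new htmlparser.DomHandler(function (err, dom) {
    if (err) throw err;
    // ... work with the parsed `dom` (an array of nodes) here
});
new htmlparser.Parser(handler).parseComplete(html_string);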

For the code inside the handler function, I use soupselect because it's documented and I'm lazy. The htmlparser2 authors suggest domutils instead, but it has no documentation.
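
If you'd rather not pull in either helper library, the `dom` passed to the handler can also be walked by hand. Below is a minimal sketch of that idea, not the API of soupselect or domutils. It assumes each node is a plain object shaped like `{ type, name, attribs, children, data }`, as produced by the DOM handler. `findByTagName` is a hypothetical helper name, and `sampleDom` is a hand-written stand-in for real parser output so the snippet runs on its own.

```javascript
// Hypothetical helper (not part of htmlparser2): collect every element
// with the given tag name by walking the node tree recursively.
// Assumes nodes shaped like { type, name, attribs, children, data }.
function findByTagName(nodes, tagName) {
    var found = [];
    (nodes || []).forEach(function (node) {
        if (node.type === "tag" && node.name === tagName) {
            found.push(node);
        }
        // Recurse into children; text nodes have none, so this adds nothing for them.
        found = found.concat(findByTagName(node.children, tagName));
    });
    return found;
}

// Hand-written stand-in for the `dom` argument, roughly what
// '<div><a href="/a">A</a><p><a href="/b">B</a></p></div>' would produce.
var sampleDom = [{
    type: "tag", name: "div", attribs: {}, children: [
        { type: "tag", name: "a", attribs: { href: "/a" },
          children: [{ type: "text", data: "A" }] },
        { type: "tag", name: "p", attribs: {}, children: [
            { type: "tag", name: "a", attribs: { href: "/b" },
              children: [{ type: "text", data: "B" }] }
        ] }
    ]
}];

// Pull the href out of every link, in document order.
var hrefs = findByTagName(sampleDom, "a").map(function (node) {
    return node.attribs.href;
});
console.log(hrefs);
```

Inside the real handler callback you would pass `dom` instead of `sampleDom`. For anything beyond simple tag lookups, a selector library such as soupselect is likely less work.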

Solution courtesy of: Tim Brown


This recipe can be found in its original form on Stack Overflow.