Command-line tool for querying and extracting data from HTML
jq, but for HTML
hq
$ hq 'a' index.html | hq -r '@href'
$ cat page.html | hq -r 'h1'
$ hq '.container .title' data.html | hq -r 'text'
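
For readers unfamiliar with this style of pipeline, the first example (select every `a` element, then print its `href`) can be approximated with Python's standard library. This is only an illustrative sketch of the idea, not how hq is implemented. It assumes that `-r '@href'` prints the raw value of each matched element's `href` attribute, which is inferred from the example rather than confirmed. The names `HrefCollector` and `extract_hrefs` are hypothetical.

```python
from html.parser import HTMLParser


class HrefCollector(HTMLParser):
    """Collect the href attribute of every <a> start tag, in document order."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; the parser lowercases
        # tag and attribute names before calling this method.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.hrefs.append(value)


def extract_hrefs(html):
    """Return the href values of all <a> elements found in an HTML string."""
    parser = HrefCollector()
    parser.feed(html)
    parser.close()
    return parser.hrefs


if __name__ == "__main__":
    sample = '<p><a href="/docs">Docs</a> and <a href="https://example.com">site</a></p>'
    for href in extract_hrefs(sample):
        print(href)
```

Unlike a CSS selector engine, this sketch matches only on the tag name, so a descendant selector such as `.container .title` would need a real selector library.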