htmlcleaner

Java-based tool for parsing, cleaning, and manipulating HTML

About

HTML parser written in Java

Commands

htmlcleaner

Examples

Clean and format an HTML file:
$ htmlcleaner input.html -o output.html

Parse HTML and output as cleaned XML:
$ htmlcleaner input.html -xml

Remove unwanted tags and attributes from HTML:
$ htmlcleaner input.html -advanced -o cleaned.html
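
Batch-clean a directory of HTML files (a minimal sketch, not part of the tool itself). It assumes the -o form from the first example works as shown. The function name clean_all and the HTMLCLEANER override variable are hypothetical, introduced here only for illustration.

```shell
#!/bin/sh
# Sketch: run htmlcleaner on every .html file in a source directory,
# writing each result under the same file name in an output directory.
# Assumption: "htmlcleaner <input> -o <output>" behaves as in the example above.
# HTMLCLEANER is a hypothetical override so the command can be swapped or stubbed.
clean_all() {
    src_dir=$1
    out_dir=$2
    mkdir -p "$out_dir" || return 1
    for f in "$src_dir"/*.html; do
        # Skip the unexpanded glob when the directory has no .html files
        [ -e "$f" ] || continue
        "${HTMLCLEANER:-htmlcleaner}" "$f" -o "$out_dir/$(basename "$f")" || return 1
    done
}
```

Usage: clean_all ./pages ./cleaned
The function stops at the first file that fails, so a bad input does not go unnoticed in a long run.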