Java-based HTML parser and cleaner for tidying and manipulating HTML documents
HTML parser written in Java
htmlcleaner
$ htmlcleaner input.html -o output.html
$ htmlcleaner input.html -xml
$ htmlcleaner input.html -advanced -o cleaned.html