HtmlParser
Posted: June 20, 2009 Filed under: Uncategorized
HtmlParser is one of the best-known HTML parsers in the Java world. It is easy to use and straightforward.
Let the code explain it all :)
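import org.htmlparser.Parser;
import org.htmlparser.filters.HasAttributeFilter;
import org.htmlparser.tags.TableColumn;
import org.htmlparser.util.NodeList;

// The following goes inside a method such as main()
try {
    // Match every tag whose class attribute equals "note"
    HasAttributeFilter filter = new HasAttributeFilter("CLASS", "note");
    // Java does not expand "~", so build the path from the user.home property
    Parser parser = new Parser(System.getProperty("user.home") + "/workspace/htmlParser/sample.html");
    NodeList list = parser.extractAllNodesThatMatch(filter);
    for (int i = 0; i < list.size(); i++) {
        // Assumes every match is a <td> cell whose first child holds the text
        System.out.println(((TableColumn) list.elementAt(i)).getFirstChild().getText());
    }
} catch (Exception e) {
    e.printStackTrace();
}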
In fact, only htmlparser.jar needs to be on the classpath to run the code above.
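For anyone curious what an attribute filter does conceptually, here is a rough, library-free sketch in plain Java. It is not HtmlParser's actual implementation: SketchNode, AttributeFilterSketch, findByAttribute and sampleTree are hypothetical names invented for illustration, and the tree contents are made up rather than taken from sample.html. The idea it shows is simply a depth-first walk that keeps every node whose attribute equals the requested value.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Simplified stand-in for a parsed tag; NOT one of HtmlParser's real classes.
final class SketchNode {
    final String name;
    final String text;
    final Map<String, String> attributes = new LinkedHashMap<>();
    final List<SketchNode> children = new ArrayList<>();

    SketchNode(String name, String text) {
        this.name = name;
        this.text = text;
    }

    // Attribute names are stored upper-cased so lookups ignore case.
    SketchNode attr(String key, String value) {
        attributes.put(key.toUpperCase(Locale.ENGLISH), value);
        return this;
    }

    SketchNode add(SketchNode child) {
        children.add(child);
        return this;
    }
}

final class AttributeFilterSketch {
    // Collects, in document order, every node whose attribute equals the value.
    static List<SketchNode> findByAttribute(SketchNode root, String attribute, String value) {
        List<SketchNode> matches = new ArrayList<>();
        collect(root, attribute.toUpperCase(Locale.ENGLISH), value, matches);
        return matches;
    }

    // Depth-first, pre-order traversal: test the node, then its children.
    private static void collect(SketchNode node, String attribute, String value,
                                List<SketchNode> out) {
        if (value.equals(node.attributes.get(attribute))) {
            out.add(node);
        }
        for (SketchNode child : node.children) {
            collect(child, attribute, value, out);
        }
    }

    // Hand-built tree with invented contents, standing in for a parsed page.
    static SketchNode sampleTree() {
        return new SketchNode("table", "")
            .add(new SketchNode("tr", "")
                .add(new SketchNode("td", "first note").attr("class", "note"))
                .add(new SketchNode("td", "plain cell"))
                .add(new SketchNode("td", "second note").attr("class", "note")));
    }

    public static void main(String[] args) {
        // Prints the text of each matching cell, one per line.
        for (SketchNode match : findByAttribute(sampleTree(), "CLASS", "note")) {
            System.out.println(match.text);
        }
    }
}
```

Running main on this made-up tree prints the text of the two cells marked class="note" and skips the plain one. With the real library, the Parser builds the tree from the HTML file and the filter object plays the role of the equality test in collect.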