HtmlParser

HtmlParser is one of the best-known HTML parsers in the Java world. It is easy to use and straightforward.

Let the code explain it all :)

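import org.htmlparser.Parser;
import org.htmlparser.filters.HasAttributeFilter;
import org.htmlparser.tags.TableColumn;
import org.htmlparser.util.NodeList;

try {
    // Match every tag whose CLASS attribute equals "note".
    HasAttributeFilter filter = new HasAttributeFilter("CLASS", "note");
    // Java does not expand "~", so build the path from the user.home property.
    Parser parser = new Parser(System.getProperty("user.home") + "/workspace/htmlParser/sample.html");
    NodeList list = parser.extractAllNodesThatMatch(filter);
    for (int i = 0; i < list.size(); i++) {
        // The cast assumes every match is a <td> with at least one child node.
        System.out.println(((TableColumn) list.elementAt(i)).getFirstChild().getText());
    }
} catch (Exception e) {
    e.printStackTrace();
}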

In fact, htmlparser.jar is the only library needed on the classpath to run the code above.


