HtmlParser is one of the best-known HTML parsers in the Java world. It is easy to use and straightforward.

Let the code explain it all :)

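// Needs org.htmlparser.Parser, org.htmlparser.filters.HasAttributeFilter and org.htmlparser.util.NodeList
try {
    HasAttributeFilter filter = new HasAttributeFilter("CLASS", "note");
    // Note: Java does not expand "~", so use an absolute path if the file is not found
    Parser parser = new Parser("~/workspace/htmlParser/sample.html");
    NodeList list = parser.extractAllNodesThatMatch(filter);
    for (int i = 0; i < list.size(); i++) {
        System.out.println(list.elementAt(i).toHtml()); // assumed body: the original loop body is missing
    }
} catch (Exception e) {
    e.printStackTrace();
}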

In fact, only htmlparser.jar is needed to run the code above.


STAX results

Two kinds of STAX results used to bother me:

1. Result: None.

Since STAX simply returns whatever you specify between <return></return>, a result of None is very likely caused by a typo in a variable name.

2. Result: org.python.core.PyInstance@XXXXXXX

If an argument for a function is not provided, or is provided through a variable that is not defined, STAX returns this confusing result.
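The second result has the same shape as the text that Java's default Object.toString() produces: a class name, an @ sign, and a hexadecimal hash code. The sketch below shows that shape in plain Java. The names DefaultToStringDemo, Opaque and describe are made up for illustration and have nothing to do with STAX. It only suggests that what comes back is an object reference rather than a value. It does not describe how STAX works internally.

```java
public class DefaultToStringDemo {
    // Hypothetical class with no toString() override, standing in for any opaque object.
    static class Opaque {}

    // Returns the text Java produces for the given object via toString().
    static String describe(Object o) {
        return o.toString();
    }

    public static void main(String[] args) {
        String text = describe(new Opaque());
        // The output has the form ClassName@hexHashCode; the hex part varies from run to run.
        System.out.println(text);
    }
}
```

If you see a result of this form, a sensible first check is whether every argument passed to the function is actually defined.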

File path problem in calling STAX functions

I previously said it is better to use Unix-style file paths for all STAF-related operations. I was wrong. When calling a function in one STAX XML file from another STAX XML file, the Unix-style path does not work (I am working on Windows).

Another thing to note is the return value: the result of the calling STAX job overwrites the result of the called STAX job.

Here is a simple example:

The code of staxCall.xml:

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE stax SYSTEM "stax.dtd">
<stax>
<defaultcall function="f1"/>
<function name="f1">
    <sequence>
        <import machine="'local'" file="'c:/STAF/services/custom/test/simple.xml'"/>
        <call function="'notepad'"/>
        <return>'result of f1'</return>
    </sequence>
</function>
</stax>


The code of simple.xml:

<?xml version="1.0" encoding="UTF-8" standalone="no"?> 
<!DOCTYPE stax SYSTEM "stax.dtd"> 
<defaultcall function="notepad"/> 
<function name="notepad"> 
        <request>'start command notepad'</request>
    <if expr="RC != 0">

STAF local stax execute file c:\STAF\services\custom\test\staxCall.xml wait returnresult

Job ID: 24
Result: result of f1
Status: Normal