[html5] Using validator.nu as a standalone library
Trubin, Stanislav
trubin at amazon.com
Thu Apr 21 17:34:37 PDT 2011
Hello all!
I am trying to build an offline solution that provides HTML4/5 validation with as few dependencies as possible. I got stuck creating a basic Java application that uses validator.nu’s core as a standalone library. I would greatly appreciate any code examples that would help in creating such an application.
Here is a rough outline of the program I’d like to end up with:
Foo.java
import …
public class Foo {
    public static void main(String[] args) {
        String input = "<html><head><title>test</title></head></html>";
        // initialize parser
        // parse providing html chunk as a String input (or URL)
        // print out all results in some sort of for-loop
        // (ParserResult is a placeholder for whatever the parser returns)
        for (int i = 0; i < ParserResult.length; i++) {
            System.out.println("Error #" + i + ": Line " + ParserResult[i].getLineNumber() + ", Char " + ParserResult[i].getColumnNumber() + ", Message " + ParserResult[i].getMessage());
        }
    }
}
This is how far I got before getting stuck:
import java.io.IOException;
import java.io.OutputStreamWriter;

import nu.validator.htmlparser.sax.HtmlParser;
import nu.validator.htmlparser.sax.HtmlSerializer;
import nu.validator.htmlparser.test.TreeDumpContentHandler;
import nu.validator.xml.SystemErrErrorHandler;

import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;

public class Foo {
    public static void main(String[] args) throws SAXException, IOException {
        TreeDumpContentHandler treeDumpContentHandler = new TreeDumpContentHandler(new OutputStreamWriter(System.out, "UTF-8"));
        ContentHandler serializer = new HtmlSerializer(System.out);
        SystemErrErrorHandler eh = new SystemErrErrorHandler();

        HtmlParser htmlParser = new HtmlParser();
        htmlParser.setContentHandler(serializer);
        htmlParser.setLexicalHandler(treeDumpContentHandler);
        htmlParser.setErrorHandler(eh);
        htmlParser.parse("http://www.google.com");
    }
}
I get a few basic errors back (such as an unescaped ampersand character in the URL), but nothing HTML-specific (for instance, “Attribute height not allowed on element tr at this point.”).
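The result loop from the outline can be sketched with plain SAX: an ErrorHandler that collects each SAXParseException, plus an InputSource wrapped around a StringReader so the markup comes from a String rather than a URL. This is only a sketch under assumptions. The names CollectingErrorHandlerDemo, CollectingErrorHandler, collectProblems and describe are made up for illustration, and the JDK's built-in XML parser stands in for the HTML parser so that the example runs without extra jars. Since HtmlParser is driven through setErrorHandler and parse in the code above, passing a new HtmlParser() as the reader should presumably behave the same way, but that is untested here, and it does not by itself explain the missing HTML-specific messages.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.XMLReader;

public class CollectingErrorHandlerDemo {

    /** Hypothetical helper: stores every reported problem instead of printing it. */
    static class CollectingErrorHandler implements ErrorHandler {
        final List<SAXParseException> problems = new ArrayList<>();

        @Override public void warning(SAXParseException e) { problems.add(e); }
        @Override public void error(SAXParseException e) { problems.add(e); }
        @Override public void fatalError(SAXParseException e) { problems.add(e); }
    }

    /** Parses markup held in a String and returns whatever the reader reported. */
    public static List<SAXParseException> collectProblems(XMLReader reader, String markup)
            throws IOException {
        CollectingErrorHandler handler = new CollectingErrorHandler();
        reader.setErrorHandler(handler);
        try {
            // An InputSource over a StringReader feeds the String to the parser directly.
            reader.parse(new InputSource(new StringReader(markup)));
        } catch (SAXException e) {
            // The JDK parser throws after reporting a fatal error to the handler,
            // so the problem has already been recorded in the list.
        }
        return handler.problems;
    }

    /** Formats one problem in the shape used by the outline's println. */
    public static String describe(int index, SAXParseException e) {
        return "Error #" + index + ": Line " + e.getLineNumber()
                + ", Char " + e.getColumnNumber()
                + ", Message " + e.getMessage();
    }

    public static void main(String[] args) throws Exception {
        // Assumption: the JDK XML parser is a stand-in; an HtmlParser instance
        // would be passed here instead when the htmlparser jar is on the classpath.
        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        List<SAXParseException> problems = collectProblems(reader, "<a><b></a>");
        for (int i = 0; i < problems.size(); i++) {
            System.out.println(describe(i + 1, problems.get(i)));
        }
    }
}
```

Collecting the exceptions first, rather than printing inside the handler, keeps the reporting loop separate from the parse, which matches the for-loop in the outline.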
Thank you and best regards,
Stan