[imps] Patch for Validator.nu

Nicolas Raoul nicolas.raoul.lists at gmail.com
Mon Apr 13 00:20:54 PDT 2009


Dear all,

While reading the source code of the Validator.nu project, I found an
old "FIXME", so I fixed it. Below is the patch I wrote. It humbly adds
URI scheme name validation to SAXDriver.java, a file that originally
came from the Aelfred2 project and was modified by the Validator.nu
project. I will also try to submit the patch upstream, but the
Aelfred2 project seems dead.
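
For anyone who wants to try the rule outside SAXDriver, here is a
minimal standalone sketch of the same check. It assumes the scheme is
whatever precedes the first ':' in the URI. The class and method names
(SchemeCheck, hasValidScheme, isAsciiAlpha) are hypothetical and are not
part of the patch or of Validator.nu.

```java
public class SchemeCheck {

    // Hypothetical helper, not part of the patch: returns true if the text
    // before the first ':' is a scheme name as described in RFC 2396
    // (first char ASCII alpha, the rest ASCII alphanumeric or one of "+-.").
    static boolean hasValidScheme(String uri) {
        int index = uri.indexOf(':');
        if (index < 1) {
            return false; // no ':' at all, or an empty scheme
        }
        if (!isAsciiAlpha(uri.charAt(0))) {
            return false;
        }
        for (int i = 1; i < index; i++) {
            char c = uri.charAt(i);
            boolean allowed = isAsciiAlpha(c)
                    || (c >= '0' && c <= '9')
                    || c == '+' || c == '-' || c == '.';
            if (!allowed) {
                return false;
            }
        }
        return true;
    }

    // ASCII letters only; Character.isLetter would also accept non-ASCII.
    static boolean isAsciiAlpha(char c) {
        return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z');
    }

    public static void main(String[] args) {
        System.out.println(hasValidScheme("http://www.w3.org/1999/xhtml")); // true
        System.out.println(hasValidScheme("svn+ssh://example.org/repo"));   // true
        System.out.println(hasValidScheme("1http://example.org/"));         // false
        System.out.println(hasValidScheme("no-colon-here"));                // false
    }
}
```

Unlike the patch, which reports problems through fatal(), this sketch
only returns a boolean, so it can be run on its own.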

A short intro, since this is my first post here: I am Nicolas Raoul, and
my goal is to improve HTML5 implementations, especially the
Validator.nu implementation. I am physically at W3C's Tokyo office,
and on IRC in #whatwg. Feel free to point me to things you want to see
fixed, especially if they are Java-related.

Cheers,
Nicolas Raoul.
http://nrw.free.fr

PATCH:

Index: SAXDriver.java
===================================================================
--- SAXDriver.java	(revision 37)
+++ SAXDriver.java	(working copy)
@@ -827,8 +827,17 @@
             warn("relative URI for namespace: " + uri);
         }

-        // FIXME: char [0] must be ascii alpha; chars [1..index]
-        // must be ascii alphanumeric or in "+-." [RFC 2396]
+		// char [0] must be ascii alpha [RFC 2396]
+		if( ! isAlpha(uri.charAt(0))) {
+			fatal("First character of the URI must be ascii alpha");
+		}
+		
+		// chars [1..index] must be ascii alphanumeric or in "+-." [RFC 2396]
+		for(int i=1; i<index; i++) {
+			if( ! isAlphanumericOrPlusMinusPoint(uri.charAt(i))) {
+				fatal("Character " + i + " of the URI must be ascii alphanumeric or in \"+-.\"");
+			}
+		}

         // Namespace Constraints
         // name for xml prefix must be http://www.w3.org/XML/1998/namespace
@@ -1065,7 +1074,38 @@
             lexicalHandler.comment(ch, start, length);
         }
     }
+
+    /**
+     * Checks whether a given character is an ASCII alpha or not.
+     * @param character
+     * 			The character to check.
+     * @return
+     * 			True if the character is an ASCII alpha.
+     */
+	private boolean isAlpha(char character) {
+		// Range of alpha characters in ASCII, should be the same for
+		// Unicode according to "The Unicode Standard 4.0", section 5.2
+		return ((int)character > 64 && (int)character < 91) // Unicode A..Z
+			|| ((int)character > 96 && (int)character < 123); // Unicode a..z
+	}

+	/**
+	 * Checks whether a given character is in "+-." or an ASCII alphanumeric.
+	 * This is useful for a check related to [RFC 2396].
+	 * @param character
+     * 			The character to check.
+	 * @return
+     * 			True if the character is in "+-." or an ASCII alphanumeric.
+	 */
+	private boolean isAlphanumericOrPlusMinusPoint(char character) {
+		return ((int)character > 64 && (int)character < 91) // Unicode A..Z
+		|| ((int)character > 96 && (int)character < 123) // Unicode a..z
+		|| ((int)character > 47 && (int)character < 58) // Unicode 0..9
+		|| character == '+'
+		|| character == '-'
+		|| character == '.';
+	}
+
     void fatal(String message) throws SAXException {
         SAXParseException fatal;


