This file describes the JAXP (XML processing) implementation of GNU Classpath.
GNU Classpath includes interfaces and implementations for basic XML processing
in the Java programming language, some general-purpose SAX2 utilities, and
transformation.

These classes used to be maintained as part of an external project, GNU JAXP,
but are now integrated with the rest of the core class library provided by
GNU Classpath.

PACKAGES

. javax.xml.* ... JAXP 1.3 interfaces

. gnu.xml.aelfred2.* ... SAX2 parser + validator
. gnu.xml.dom.* ... DOM Level 3 Core, Traversal, XPath implementation
. gnu.xml.dom.ls.* ... DOM Level 3 Load & Save implementation
. gnu.xml.xpath.* ... JAXP XPath implementation
. gnu.xml.transform.* ... JAXP XSL transformer implementation
. gnu.xml.pipeline.* ... SAX2 event pipeline support
. gnu.xml.stream.* ... StAX pull parser implementation
. gnu.xml.util.* ... various XML utility classes
. gnu.xml.libxmlj.dom.* ... libxmlj DOM Level 3 Core and XPath
. gnu.xml.libxmlj.sax.* ... libxmlj SAX parser
. gnu.xml.libxmlj.transform.* ... libxmlj XSL transformer
. gnu.xml.libxmlj.util.* ... libxmlj utility classes

In the external directory you can find the following packages.
They are not maintained as part of GNU Classpath, but are used by the
classes in the above packages.

. org.xml.sax.* ... SAX2 interfaces
. org.w3c.dom.* ... DOM Level 3 interfaces

CONFORMANCE

The primary test resources are at http://xmlconf.sourceforge.net
and include:

SAX2/XML conformance tests
The "xml.testing.Driver" addresses the core XML 1.0
specification requirements, which closely correspond to the
functionality SAX1 provides. The driver uses SAX2 APIs to
test that functionality. It is used with a bugfixed version of
the NIST/OASIS XML conformance test cases.

The AElfred2 parser is highly conformant, though it still takes
a few implementation shortcuts. See its package documentation
for information about known XML conformance issues in AElfred2.

The primary issue is that it uses Unicode character tables, rather than
those in the XML specification, to determine which names are
valid. Most applications won't notice the difference, and this
solution is smaller and faster than the alternative.

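To make the trade-off concrete, here is a minimal sketch of a name check
built on the Unicode tables that ship with java.lang.Character. It is an
illustration only, not AElfred2's actual code: the class and method names
are hypothetical, and the exact set of extra punctuation characters is an
assumption based on the XML 1.0 Name production.

```java
class NameCheckSketch {
    /**
     * Approximates the XML 1.0 Name production using Java's built-in
     * Unicode identifier tables instead of the tables in the XML
     * specification. Hypothetical helper, for illustration only.
     */
    static boolean isNameByUnicodeTables(String s) {
        if (s.isEmpty()) {
            return false;
        }
        int first = s.codePointAt(0);
        // Names start with a letter-like character, '_' or ':'.
        if (!(Character.isUnicodeIdentifierStart(first)
                || first == '_' || first == ':')) {
            return false;
        }
        for (int i = Character.charCount(first); i < s.length(); ) {
            int cp = s.codePointAt(i);
            // Later characters may also be digits, marks, '.', '-' or ':'.
            boolean ok = Character.isUnicodeIdentifierPart(cp)
                    || cp == '.' || cp == '-' || cp == ':';
            if (!ok) {
                return false;
            }
            i += Character.charCount(cp);
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isNameByUnicodeTables("foo:bar-1"));
        System.out.println(isNameByUnicodeTables("1abc"));
    }
}
```

A check like this needs no character tables of its own, which is where
the size and speed advantage comes from. The price is that the accepted
set is not identical to the one the XML specification defines.
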
For validation, a secondary issue is that constraints relating to
entity modularity are not checked; they can't all be cleanly
layered. For example, validity constraints related to standalone
declarations and PE nesting are not checked.

The current implementation has also been tested against Elliotte
Rusty Harold's SAXTest test suite (http://www.cafeconleche.org/SAXTest)
and achieves approximately 93% conformance to the SAX specification
according to these tests, higher than any other current Java parser.

SAX2
SAX2 API conformance currently has a minimal JUnit (0.2) test suite,
which can be accessed at the xmlconf site listed above. It does
not cover namespaces or the LexicalHandler and DeclHandler extensions
anywhere near as exhaustively as the SAX1-level functionality is
tested by the "xml.testing.Driver". However:

- Applying the DOM unit tests to this implementation gives
  the LexicalHandler (comments, and boundaries of DTDs,
  CDATA sections, and general entities) a workout, and
  does the same for DeclHandler entity declarations.

- The pipeline package's layered validator demands that
  element and attribute declarations are reported correctly.

By those metrics, SAX2 conformance for AElfred2 is also strong.

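For readers unfamiliar with the LexicalHandler extension mentioned above,
the following sketch shows how a SAX2 application typically receives
comment and CDATA boundary events. It uses only standard SAX2 and JAXP
calls and whatever parser the runtime provides. The class and method
names are made up for the example, and it is not part of any test suite
named in this file.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.ext.DefaultHandler2;

class LexicalEventsSketch {
    /** Parses xml and returns the lexical events seen, in document order. */
    static List<String> lexicalEvents(String xml) throws Exception {
        final List<String> events = new ArrayList<>();

        // DefaultHandler2 implements both LexicalHandler and DeclHandler.
        DefaultHandler2 handler = new DefaultHandler2() {
            @Override
            public void comment(char[] ch, int start, int length) {
                events.add("comment:" + new String(ch, start, length));
            }

            @Override
            public void startCDATA() {
                events.add("startCDATA");
            }

            @Override
            public void endCDATA() {
                events.add("endCDATA");
            }
        };

        XMLReader reader =
            SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        reader.setContentHandler(handler);
        // Lexical events are delivered only if this SAX2 property is set.
        reader.setProperty(
            "http://xml.org/sax/properties/lexical-handler", handler);
        reader.parse(new InputSource(new StringReader(xml)));
        return events;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(
            lexicalEvents("<a><!--hi--><![CDATA[x]]></a>"));
    }
}
```

DeclHandler works the same way through the
"http://xml.org/sax/properties/declaration-handler" property.
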
DOM Level 3 Core Tests
The DOM implementation has been tested against the W3C DOM Level 3
Core conformance test suite (http://www.w3.org/DOM/Test/). Current
conformance according to these tests is 72.3%. Many of the test
failures are due to the fact that GNU JAXP does not currently
provide any W3C XML Schema support.

XSL transformation
The transformer and XPath implementation have been tested against
the OASIS XSLT and XPath TC test suite. Conformance against the
Xalan tests is currently 77%.


libxmlj
========================================================================

libxmlj is an effort to create a 100% JAXP-compatible Java wrapper for
libxml2 and libxslt. JAXP is the Java API for XML Processing, libxml2
is the XML C library for GNOME, and libxslt is the XSLT C library for
GNOME.

libxmlj currently supports most of the DOM Level 3 Core, Traversal, and
XPath APIs, SAX2, and XSLT transformations. There is no W3C XML Schema
support yet.

libxmlj can parse and transform XML documents extremely quickly in
comparison to Java-based JAXP implementations. DOM manipulations, however,
involve JNI overhead, so DOM tree construction and traversal
can be slower than with the Java implementation.

libxmlj is highly experimental, doesn't always conform to the DOM
specification correctly, and may leak memory. Production use is not advised.

The implementation can be found in gnu/xml/libxmlj and native/jni/xmlj.
See the INSTALL file for the required versions of libxml2 and libxslt.
Running configure with --enable-xmlj will build it.

Usage
------------------------------------------------------------------------

To enable the various GNU JAXP factories, set the following system properties
(command-line version shown, but they can equally be set programmatically):

AElfred2:
  -Djavax.xml.parsers.SAXParserFactory=gnu.xml.aelfred2.JAXPFactory

GNU DOM (using DOM Level 3 Load & Save):
  -Djavax.xml.parsers.DocumentBuilderFactory=gnu.xml.dom.DomDocumentBuilderFactory

GNU DOM (using AElfred-only pipeline classes):
  -Djavax.xml.parsers.DocumentBuilderFactory=gnu.xml.dom.JAXPFactory

GNU XSL transformer:
  -Djavax.xml.transform.TransformerFactory=gnu.xml.transform.TransformerFactoryImpl

GNU StAX:
  -Djavax.xml.stream.XMLEventFactory=gnu.xml.stream.XMLEventFactoryImpl
  -Djavax.xml.stream.XMLInputFactory=gnu.xml.stream.XMLInputFactoryImpl
  -Djavax.xml.stream.XMLOutputFactory=gnu.xml.stream.XMLOutputFactoryImpl

libxmlj SAX:
  -Djavax.xml.parsers.SAXParserFactory=gnu.xml.libxmlj.sax.GnomeSAXParserFactory

libxmlj DOM:
  -Djavax.xml.parsers.DocumentBuilderFactory=gnu.xml.libxmlj.dom.GnomeDocumentBuilderFactory

libxmlj XSL transformer:
  -Djavax.xml.transform.TransformerFactory=gnu.xml.transform.GnomeTransformerFactory

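As a rough illustration of the programmatic route, the sketch below sets
the SAXParserFactory property from code before asking JAXP for a factory.
The property key and the AElfred2 class name are the ones listed above.
The helper class, its method names, and the fallback to the runtime's
default parser are assumptions made for the example, so that it also runs
on a runtime that does not ship the GNU classes.

```java
import java.io.StringReader;
import javax.xml.parsers.FactoryConfigurationError;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class FactorySelectionSketch {
    static final String KEY = "javax.xml.parsers.SAXParserFactory";

    /**
     * Requests the given factory implementation through the system
     * property. If that class cannot be loaded, restores the previous
     * setting and returns the runtime's default factory instead.
     */
    static SAXParserFactory newFactory(String implClassName) {
        String previous = System.getProperty(KEY);
        System.setProperty(KEY, implClassName);
        try {
            return SAXParserFactory.newInstance();
        } catch (FactoryConfigurationError e) {
            if (previous == null) {
                System.clearProperty(KEY);
            } else {
                System.setProperty(KEY, previous);
            }
            return SAXParserFactory.newInstance();
        }
    }

    /** Counts start-element events, to show that the factory is usable. */
    static int countElements(SAXParserFactory factory, String xml)
            throws Exception {
        final int[] count = {0};
        factory.newSAXParser().parse(
            new InputSource(new StringReader(xml)),
            new DefaultHandler() {
                @Override
                public void startElement(String uri, String localName,
                                         String qName, Attributes atts) {
                    count[0]++;
                }
            });
        return count[0];
    }

    public static void main(String[] args) throws Exception {
        SAXParserFactory f = newFactory("gnu.xml.aelfred2.JAXPFactory");
        System.out.println("using " + f.getClass().getName());
        System.out.println(countElements(f, "<a><b/><b/></a>"));
    }
}
```

The other factories can be selected the same way with their own property
keys. The property has to be set before the first newInstance() call
whose result should be affected.
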
When using libxmlj, the libxmlj shared library must be available.
In general it is picked up by the runtime using GNU Classpath. If not, you
might want to try adding the directory where libxmlj.so is installed
(by default ${prefix}/lib/classpath/) with ldconfig or specifying it in the
LD_LIBRARY_PATH environment variable. Additionally, you may need to specify
the location of your shared libraries to the runtime environment using the
java.library.path system property.

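If the library is not being found, a small diagnostic along the following
lines may help narrow things down. It only checks whether a file with the
expected name exists in one of the java.library.path directories. The
class is hypothetical, and the short library name "xmlj" is an assumption
derived from the libxmlj.so file name above.

```java
import java.io.File;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.regex.Pattern;

class LibraryPathCheck {
    /**
     * Returns the first file named fileName found in the directories
     * listed in libraryPath, or null if none of them contains it.
     */
    static Path locate(String libraryPath, String fileName) {
        if (libraryPath == null) {
            return null;
        }
        String separator = Pattern.quote(File.pathSeparator);
        for (String dir : libraryPath.split(separator)) {
            if (dir.isEmpty()) {
                continue;
            }
            Path candidate = Paths.get(dir, fileName);
            if (Files.isRegularFile(candidate)) {
                return candidate;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Maps the short name to the platform's shared library file name.
        String name = System.mapLibraryName("xmlj");
        Path hit = locate(System.getProperty("java.library.path"), name);
        System.out.println(hit != null
            ? "found " + hit
            : name + " not found on java.library.path");
    }
}
```

A missing file here does not prove that loading will fail, since the
runtime and the system loader may search other locations as well.
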
Missing (libxmlj) Features
------------------------------------------------------------------------

See BUGS in native/jni/xmlj for known bugs in the libxmlj native bindings.

This implementation should be thread-safe, but currently all
transformation requests are queued via Java synchronization, which
means that it effectively runs single-threaded. Long story short,
neither libxml2 nor libxslt is fully reentrant.

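The effect of queuing requests through Java synchronization can be
pictured with the sketch below. It is not the libxmlj source: it uses the
runtime's default TransformerFactory and a hypothetical wrapper class,
and shows only the general pattern of funnelling every transformation
through one shared lock.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class SerializedTransforms {
    // One lock shared by all callers: only one transformation runs
    // at a time, however many threads call transform().
    private static final Object LOCK = new Object();

    /** Applies the stylesheet text to the document text under the lock. */
    static String transform(String xslt, String xml) throws Exception {
        synchronized (LOCK) {
            Transformer t = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xslt)));
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(xml)),
                        new StreamResult(out));
            return out.toString();
        }
    }

    public static void main(String[] args) throws Exception {
        String xslt =
            "<xsl:stylesheet version='1.0'"
            + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
            + "<xsl:output method='text'/>"
            + "<xsl:template match='/'>"
            + "<xsl:value-of select='/a'/>"
            + "</xsl:template>"
            + "</xsl:stylesheet>";
        System.out.println(transform(xslt, "<a>hi</a>"));
    }
}
```

Callers on other threads simply wait for the lock, which keeps a
non-reentrant native library safe but gives no parallel speedup.
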
Update: it may be possible to make libxmlj thread-safe nonetheless
using thread context variables.

Update: thread context variables have been introduced. This is largely
untested, though, so libxmlj still has the single-thread
bottleneck.