This file describes the JAXP (XML processing) implementation of GNU Classpath.
GNU Classpath includes interfaces and implementations for basic XML processing
in the Java programming language, some general-purpose SAX2 utilities, and
transformation.

These classes used to be maintained as part of an external project, GNU JAXP,
but are now integrated with the rest of the core class library provided by
GNU Classpath.

PACKAGES

. javax.xml.*                  ... JAXP 1.3 interfaces

. gnu.xml.aelfred2.*           ... SAX2 parser + validator
. gnu.xml.dom.*                ... DOM Level 3 Core, Traversal, XPath implementation
. gnu.xml.dom.ls.*             ... DOM Level 3 Load & Save implementation
. gnu.xml.xpath.*              ... JAXP XPath implementation
. gnu.xml.transform.*          ... JAXP XSL transformer implementation
. gnu.xml.pipeline.*           ... SAX2 event pipeline support
. gnu.xml.stream.*             ... StAX pull parser and SAX-over-StAX driver
. gnu.xml.util.*               ... various XML utility classes
. gnu.xml.libxmlj.dom.*        ... libxmlj DOM Level 3 Core and XPath
. gnu.xml.libxmlj.sax.*        ... libxmlj SAX parser
. gnu.xml.libxmlj.transform.*  ... libxmlj XSL transformer
. gnu.xml.libxmlj.util.*       ... libxmlj utility classes

In the external directory you can find the following packages.
They are not maintained as part of GNU Classpath, but are used by the
classes in the above packages.

. org.xml.sax.*                ... SAX2 interfaces
. org.w3c.dom.*                ... DOM Level 3 interfaces
. org.relaxng.datatype.*       ... RELAX NG pluggable datatypes API

CONFORMANCE

    The primary test resources are at http://xmlconf.sourceforge.net
    and include:

    SAX2/XML conformance tests
        The "xml.testing.Driver" addresses the core XML 1.0
        specification requirements, which closely correspond to the
        functionality SAX1 provides. The driver uses SAX2 APIs to
        test that functionality. It is used with a bugfixed version of
        the NIST/OASIS XML conformance test cases.

        The AElfred2 parser is highly conformant, though it still takes
        a few implementation shortcuts. See its package documentation
        for information about known XML conformance issues in AElfred2.

        The primary issue is using Unicode character tables, rather than
        those in the XML specification, for determining what names are
        valid. Most applications won't notice the difference, and this
        solution is smaller and faster than the alternative.

        For validation, a secondary issue is that constraints relating to
        entity modularity are not validated; they can't all be cleanly
        layered. For example, validity constraints related to standalone
        declarations and PE nesting are not checked.

        The current implementation has also been tested against Elliotte
        Rusty Harold's SAXTest test suite (http://www.cafeconleche.org/SAXTest)
        and achieves approximately 93% conformance to the SAX specification
        according to these tests, higher than any other current Java parser.

    SAX2
        SAX2 API conformance currently has a minimal JUnit (0.2) test suite,
        which can be accessed at the xmlconf site listed above. It does
        not cover namespaces or the LexicalHandler and DeclHandler extensions
        anywhere near as exhaustively as the SAX1 level functionality is
        tested by the "xml.testing.Driver". However:

        - Applying the DOM unit tests to this implementation gives
          the LexicalHandler (comments, and boundaries of DTDs,
          CDATA sections, and general entities) a workout, and
          does the same for DeclHandler entity declarations.

        - The pipeline package's layered validator demands that
          element and attribute declarations are reported correctly.

        By those metrics, SAX2 conformance for AElfred2 is also strong.
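
For readers unfamiliar with the LexicalHandler extension mentioned above, the
following is a minimal sketch of how an application receives comment and CDATA
boundary events through the standard SAX2 property. It uses whichever SAX
parser the JAXP lookup returns, not AElfred2 specifically; the class name
LexicalEvents and the sample document are illustrative assumptions and are not
part of GNU Classpath.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.ext.DefaultHandler2;

// Hypothetical helper class, for illustration only.
public class LexicalEvents {

    /** Parses xml and returns the lexical events seen, in document order. */
    public static List<String> collect(String xml) {
        final List<String> events = new ArrayList<>();
        // DefaultHandler2 implements both ContentHandler and LexicalHandler.
        DefaultHandler2 handler = new DefaultHandler2() {
            @Override
            public void comment(char[] ch, int start, int length) {
                events.add("comment:" + new String(ch, start, length));
            }
            @Override
            public void startCDATA() { events.add("startCDATA"); }
            @Override
            public void endCDATA() { events.add("endCDATA"); }
        };
        try {
            XMLReader reader =
                SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            reader.setContentHandler(handler);
            // Standard SAX2 property id for registering a LexicalHandler.
            reader.setProperty(
                "http://xml.org/sax/properties/lexical-handler", handler);
            reader.parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException(e);
        }
        return events;
    }

    public static void main(String[] args) {
        System.out.println(collect("<doc><!--note--><![CDATA[x < y]]></doc>"));
    }
}
```

A DeclHandler is registered the same way, using the property id
http://xml.org/sax/properties/declaration-handler.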

    DOM Level 3 Core Tests
        The DOM implementation has been tested against the W3C DOM Level 3
        Core conformance test suite (http://www.w3.org/DOM/Test/). Current
        conformance according to these tests is 72.3%. Many of the test
        failures are due to the fact that GNU JAXP does not currently
        provide any W3C XML Schema support.

    XSL transformation
        The transformer and XPath implementation have been tested against
        the OASIS XSLT and XPath TC test suite. Conformance against the
        Xalan tests is currently 77%.


libxmlj
========================================================================

libxmlj is an effort to create a 100% JAXP-compatible Java wrapper for
libxml2 and libxslt. JAXP is the Java API for XML Processing, libxml2
is the XML C library for GNOME, and libxslt is the XSLT C library for
GNOME.

libxmlj currently supports most of the DOM Level 3 Core, Traversal, and
XPath APIs, SAX2, and XSLT transformations. There is no W3C XML Schema
support yet.

libxmlj can parse and transform XML documents extremely quickly in
comparison to Java-based JAXP implementations. DOM manipulations, however,
involve JNI overhead, so DOM tree construction and traversal
can be slower than in the Java implementation.

libxmlj is highly experimental, doesn't always conform to the DOM
specification correctly, and may leak memory. Production use is not advised.

The implementation can be found in gnu/xml/libxmlj and native/jni/xmlj.
See the INSTALL file for the required versions of libxml2 and libxslt.
Running configure with --enable-xmlj will build it.

Usage
------------------------------------------------------------------------

To enable the various GNU JAXP factories, set the following system properties
(command-line version shown, but they can equally be set programmatically):

 AElfred2:
  -Djavax.xml.parsers.SAXParserFactory=gnu.xml.aelfred2.JAXPFactory

 GNU DOM (using DOM Level 3 Load & Save):
  -Djavax.xml.parsers.DocumentBuilderFactory=gnu.xml.dom.DomDocumentBuilderFactory

 GNU DOM (using AElfred-only pipeline classes):
  -Djavax.xml.parsers.DocumentBuilderFactory=gnu.xml.dom.JAXPFactory

 GNU XSL transformer:
  -Djavax.xml.transform.TransformerFactory=gnu.xml.transform.TransformerFactoryImpl

 GNU StAX:
  -Djavax.xml.stream.XMLEventFactory=gnu.xml.stream.XMLEventFactoryImpl
  -Djavax.xml.stream.XMLInputFactory=gnu.xml.stream.XMLInputFactoryImpl
  -Djavax.xml.stream.XMLOutputFactory=gnu.xml.stream.XMLOutputFactoryImpl

 GNU SAX-over-StAX:
  -Djavax.xml.parsers.SAXParserFactory=gnu.xml.stream.SAXParserFactory

 libxmlj SAX:
  -Djavax.xml.parsers.SAXParserFactory=gnu.xml.libxmlj.sax.GnomeSAXParserFactory

 libxmlj DOM:
  -Djavax.xml.parsers.DocumentBuilderFactory=gnu.xml.libxmlj.dom.GnomeDocumentBuilderFactory

 libxmlj XSL transformer:
  -Djavax.xml.transform.TransformerFactory=gnu.xml.libxmlj.transform.GnomeTransformerFactory
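
The properties listed above can also be set from code before the first
factory lookup. The following is a minimal sketch: the class FactorySelection
and its selectIfAvailable helper are hypothetical names, and the availability
check is an added precaution so that the program still runs on a class library
that does not ship the GNU classes.

```java
import javax.xml.parsers.SAXParserFactory;

// Hypothetical helper class, for illustration only.
public class FactorySelection {

    static final String SAX_KEY = "javax.xml.parsers.SAXParserFactory";

    /**
     * Sets the system property key to implClass, but only if that class can
     * be loaded. Returns true if the property was set.
     */
    public static boolean selectIfAvailable(String key, String implClass) {
        try {
            Class.forName(implClass);
        } catch (ClassNotFoundException | LinkageError e) {
            return false; // leave the default JAXP lookup in place
        }
        System.setProperty(key, implClass);
        return true;
    }

    public static void main(String[] args) {
        // Programmatic equivalent of
        // -Djavax.xml.parsers.SAXParserFactory=gnu.xml.aelfred2.JAXPFactory
        boolean gnu = selectIfAvailable(SAX_KEY, "gnu.xml.aelfred2.JAXPFactory");
        SAXParserFactory factory = SAXParserFactory.newInstance();
        System.out.println((gnu ? "GNU" : "default") + " factory: "
            + factory.getClass().getName());
    }
}
```

The same helper works for the DocumentBuilderFactory, TransformerFactory and
StAX keys; the property must be set before the corresponding newInstance()
call, because the lookup happens at that point.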

When using libxmlj, the libxmlj shared library must be available.
In general it is picked up by the runtime using GNU Classpath. If not, try
adding the directory where libxmlj.so is installed (by default
${prefix}/lib/classpath/) to the linker's search path with ldconfig, or
list it in the LD_LIBRARY_PATH environment variable. Additionally, you may
need to tell the runtime environment where your shared libraries are by
setting the java.library.path system property.
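
When the library is not picked up, it can help to see which directories on
java.library.path actually contain it. The following is a small diagnostic
sketch, not part of GNU Classpath: the class LibraryPathCheck is hypothetical,
and the library name "xmlj" is an assumption derived from the file name
libxmlj.so given above.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical diagnostic class, for illustration only.
public class LibraryPathCheck {

    /**
     * Returns the files on libraryPath whose name is the platform-specific
     * file name for libName (for example libxmlj.so for "xmlj" on Linux).
     */
    public static List<File> candidates(String libraryPath, String libName) {
        List<File> found = new ArrayList<>();
        if (libraryPath == null) {
            return found;
        }
        String fileName = System.mapLibraryName(libName);
        for (String dir : libraryPath.split(Pattern.quote(File.pathSeparator))) {
            if (dir.isEmpty()) {
                continue;
            }
            File candidate = new File(dir, fileName);
            if (candidate.isFile()) {
                found.add(candidate);
            }
        }
        return found;
    }

    public static void main(String[] args) {
        List<File> hits =
            candidates(System.getProperty("java.library.path"), "xmlj");
        System.out.println(hits.isEmpty()
            ? "libxmlj not found on java.library.path"
            : "found: " + hits);
    }
}
```

If nothing is found, pass the directory explicitly, for example with
-Djava.library.path=${prefix}/lib/classpath on the command line.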

Missing (libxmlj) Features
------------------------------------------------------------------------

See BUGS in native/jni/xmlj for known bugs in the libxmlj native bindings.

This implementation should be thread-safe, but currently all
transformation requests are queued via Java synchronization, which
means that it effectively runs single-threaded. Long story short,
neither libxml2 nor libxslt is fully reentrant.

Update: it may be possible to make libxmlj thread-safe nonetheless
using thread context variables.

Update: thread context variables have been introduced. This is largely
untested, though, so libxmlj still has the single-thread
bottleneck.
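
To make the bottleneck concrete, the following sketch shows the general
pattern of queueing transformations behind one Java lock. It is not the
libxmlj source: the class SerializedTransforms is hypothetical and it uses
whichever TransformerFactory the JAXP lookup returns. The point is only that
callers on several threads end up running one at a time.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical class, for illustration only.
public class SerializedTransforms {

    // One global lock: every transformation waits its turn.
    private static final Object LOCK = new Object();

    /** Runs an identity transformation on xml while holding the global lock. */
    public static String identity(String xml) {
        synchronized (LOCK) {
            try {
                Transformer t = TransformerFactory.newInstance().newTransformer();
                t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
                StringWriter out = new StringWriter();
                t.transform(new StreamSource(new StringReader(xml)),
                            new StreamResult(out));
                return out.toString();
            } catch (TransformerException e) {
                throw new IllegalStateException(e);
            }
        }
    }

    public static void main(String[] args) throws InterruptedException {
        // Four threads submit work, but the lock lets only one run at a time.
        Thread[] workers = new Thread[4];
        for (int i = 0; i < workers.length; i++) {
            final int n = i;
            workers[i] = new Thread(() ->
                System.out.println(identity("<job id=\"" + n + "\"/>")));
            workers[i].start();
        }
        for (Thread w : workers) {
            w.join();
        }
    }
}
```

With this pattern, adding threads does not increase throughput; it only
changes the order in which requests are served.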


Validation
===================================================

Pluggable datatypes
---------------------------------------------------
Validators should use the RELAX NG pluggable datatypes API to retrieve
datatype (XML Schema simple type) implementations in a schema-neutral
fashion. The following code demonstrates looking up a W3C XML Schema
nonNegativeInteger datatype:

  // DatatypeLibraryLoader (org.relaxng.datatype.helpers) is a
  // DatatypeLibraryFactory, so createDatatypeLibrary needs an instance.
  DatatypeLibrary xsd = new DatatypeLibraryLoader()
    .createDatatypeLibrary(XMLConstants.W3C_XML_SCHEMA_NS_URI);
  Datatype nonNegativeInteger = xsd.createDatatype("nonNegativeInteger");

It is also possible to create new types by derivation. For instance,
to create a datatype that will match a US ZIP code:

  DatatypeBuilder b = xsd.createDatatypeBuilder("string");
  // addParameter takes a ValidationContext as its third argument; passing
  // null assumes the pattern facet does not consult it.
  b.addParameter("pattern", "(^[0-9]{5}$)|(^[0-9]{5}-[0-9]{4}$)", null);
  Datatype zipCode = b.createDatatype();
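
To see which strings the pattern above is meant to accept, it can be tried
with java.util.regex on its own. This is only a sketch: the class ZipPattern
is hypothetical, and it assumes the pattern parameter is interpreted like a
java.util.regex expression matched against the whole value, which this
document does not state.

```java
import java.util.regex.Pattern;

// Hypothetical class, for illustration only.
public class ZipPattern {

    // Same expression as the "pattern" parameter above.
    private static final Pattern ZIP =
        Pattern.compile("(^[0-9]{5}$)|(^[0-9]{5}-[0-9]{4}$)");

    /** Returns true if s is five digits, optionally followed by -dddd. */
    public static boolean isZip(String s) {
        return ZIP.matcher(s).matches();
    }

    public static void main(String[] args) {
        String[] samples = {"12345", "12345-6789", "1234", "12345-678"};
        for (String s : samples) {
            System.out.println(s + " -> " + isZip(s));
        }
    }
}
```

The first two samples are accepted and the last two are rejected.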

A datatype library implementation for XML Schema is provided; other
library implementations may be added.