The Simple API for XML (SAX) APIs
The SAX Packages: The SAX parser is defined in the following packages.
Package | Description |
org.xml.sax | Defines the SAX interfaces. The name "org.xml" is the package prefix that was settled on by the group that defined the SAX API. |
org.xml.sax.ext | Defines SAX extensions that are used for more sophisticated SAX processing, for example, to process a document type definition (DTD) or to see the detailed syntax of a file. |
org.xml.sax.helpers | Contains helper classes that make it easier to use SAX, for example by defining a default handler that has null methods for all of the interfaces, so you only need to override the ones you actually want to implement. |
javax.xml.parsers | Defines the SAXParserFactory class, which returns the SAXParser. Also defines exception classes for reporting errors. |
javax.xml.parsers Package: the main classes needed here
SAXParser | Defines the API that wraps an XMLReader implementation class |
SAXParserFactory | Defines a factory API that enables applications to configure and obtain a SAX based parser to parse XML documents |
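As an illustration of how an application might configure and obtain a parser through these two classes, here is a minimal sketch. The class name FactoryConfigSketch and the method newParser are hypothetical names chosen for this example, and the particular settings (namespace-aware, non-validating) are assumptions rather than recommendations.

```java
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXException;

// Hypothetical example class; not part of the SAX or JAXP APIs.
class FactoryConfigSketch {

    // Configures the factory first, then asks it for a parser.
    // Settings made on the factory apply to parsers it creates afterwards.
    static SAXParser newParser() throws ParserConfigurationException, SAXException {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);  // assumed setting for this sketch
        factory.setValidating(false);     // assumed setting: no DTD validation
        return factory.newSAXParser();
    }

    public static void main(String[] args) throws Exception {
        SAXParser parser = newParser();
        System.out.println("namespace aware: " + parser.isNamespaceAware());
        System.out.println("validating: " + parser.isValidating());
    }
}
```

The parser reports the configuration it was created with, which is a quick way to check that the factory settings took effect.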
org.xml.sax Package: a few of its interfaces
ContentHandler | Receive notification of the logical content of a document. |
DTDHandler | Receive notification of basic DTD-related events. |
EntityResolver | Basic interface for resolving entities. |
ErrorHandler | Basic interface for SAX error handlers. |
org.xml.sax.helpers Package: the class needed here
DefaultHandler | Default base class for SAX2 event handlers. |
Understanding the SAX Parser
First, create an instance of the SAXParserFactory class, which generates an instance of the parser. This parser wraps an XMLReader object. When the parser's parse() method is invoked, the reader invokes one of several callback methods implemented in the application. These callback methods are defined by the interfaces ContentHandler, ErrorHandler, DTDHandler, and EntityResolver.
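The flow described above can be sketched in a few lines of Java. This is only an illustrative example: the class name ElementCountSketch, the method countElements, and the sample XML string are hypothetical, and the handler overrides just one callback (startElement) to count how often each element name occurs.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical example class; not part of the SAX or JAXP APIs.
class ElementCountSketch {

    // Parses the given XML text and counts start tags by qualified name.
    static Map<String, Integer> countElements(String xml) throws Exception {
        Map<String, Integer> counts = new LinkedHashMap<>();

        // DefaultHandler supplies empty callbacks; only startElement is overridden.
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attributes) {
                counts.merge(qName, 1, Integer::sum);
            }
        };

        // Factory -> parser -> parse(); the wrapped reader drives the callbacks.
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        parser.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), handler);
        return counts;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(countElements("<a><b/><b>text</b></a>"));
    }
}
```

Because the application never pulls data from the parser, all state (here, the counts map) has to live in the handler or be captured by it.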
Brief description of the key SAX APIs:
SAXParserFactory
- A SAXParserFactory object creates an instance of the parser determined by the system property javax.xml.parsers.SAXParserFactory.
SAXParser
- The SAXParser class defines several kinds of parse() methods. Generally, you pass an XML data source and a DefaultHandler object to the parser, which processes the XML and invokes the appropriate methods on the handler object.
XMLReader
- The SAXParser wraps an XMLReader (you can obtain it with SAXParser's getXMLReader() and configure it). It is the XMLReader that carries on the conversation with the SAX event handlers you define.
DefaultHandler
- Not shown in the diagram, a DefaultHandler implements the ContentHandler, ErrorHandler, DTDHandler, and EntityResolver interfaces (with null methods). You override only the ones you're interested in.
ContentHandler
- Methods like startDocument, endDocument, startElement, and endElement are invoked at the start and end of the document and of each element. This interface also defines the methods characters and processingInstruction, which are invoked when the parser encounters the text in an XML element or an inline processing instruction, respectively.
ErrorHandler
- The methods error, fatalError, and warning are invoked in response to various parsing errors. The default error handler throws an exception for fatal errors and ignores other errors (including validation errors). To ensure correct handling, you'll need to supply your own error handler to the parser.
DTDHandler
- Defines methods you will rarely need. Used while processing a DTD to recognize and act on declarations for an unparsed entity.
EntityResolver
- The resolveEntity method is invoked when the parser needs to identify the data referenced by a URI.
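To make the ErrorHandler and EntityResolver roles more concrete, the following sketch registers both on the XMLReader obtained from the parser. The names ErrorHandlingSketch, CollectingErrorHandler, and parseAndCollect are hypothetical, and the resolver's policy (every external entity resolves to empty content) is an assumption made for this example, not a general recommendation.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical example class; not part of the SAX or JAXP APIs.
class ErrorHandlingSketch {

    // Records every reported problem instead of silently ignoring it.
    static class CollectingErrorHandler implements ErrorHandler {
        final List<String> messages = new ArrayList<>();

        @Override
        public void warning(SAXParseException e) {
            messages.add("warning: " + e.getMessage());
        }

        @Override
        public void error(SAXParseException e) {
            messages.add("error: " + e.getMessage());
        }

        @Override
        public void fatalError(SAXParseException e) throws SAXException {
            messages.add("fatal: " + e.getMessage());
            throw e; // stop parsing; the document is not well-formed
        }
    }

    // Parses the XML text and returns the problems that were reported.
    static List<String> parseAndCollect(String xml) throws Exception {
        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        CollectingErrorHandler errors = new CollectingErrorHandler();
        reader.setContentHandler(new DefaultHandler());
        reader.setErrorHandler(errors);
        // Assumed policy for this sketch: resolve every external entity to
        // empty content so that parsing never touches the network or disk.
        reader.setEntityResolver((publicId, systemId) ->
                new InputSource(new StringReader("")));
        try {
            reader.parse(new InputSource(new StringReader(xml)));
        } catch (SAXParseException e) {
            // Already recorded by fatalError above.
        }
        return errors.messages;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(parseAndCollect("<a><b></a>"));
    }
}
```

Rethrowing from fatalError keeps the usual "stop on malformed input" behavior, while error and warning only record the problem and let parsing continue.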