The experimental, loadable  DOM XML/XSLT filter module
   mod-dom.so 
    is invoked by the zebra.cfg configuration statement
    
     recordtype.xml: dom.db/filter_dom_conf.xml
    
    In this example the DOM XML filter is configured to work 
    on all data files with suffix 
    *.xml, where the configuration file is found in the
    path db/filter_dom_conf.xml.
   
The DOM XSLT filter configuration file must be valid XML. It might look like this:
 
    
    <?xml version="1.0" encoding="UTF-8"?>
    <dom xmlns="http://indexdata.com/zebra-2.0">
      <input>
        <xmlreader level="1"/>
        <!-- <marc inputcharset="marc-8"/> -->
      </input>
      <extract>
         <xslt stylesheet="common2index.xsl"/>
      </extract>
      <store>
         <xslt stylesheet="common2store.xsl"/>
      </store>
      <retrieve name="dc">
        <xslt stylesheet="store2dc.xsl"/>
      </retrieve>
      <retrieve name="mods">
        <xslt stylesheet="store2mods.xsl"/>
      </retrieve>
    </dom>
    
    
     The root XML element <dom> and all other DOM
     XML filter elements reside in the namespace 
     xmlns="http://indexdata.com/zebra-2.0".
   
    All pipeline definition elements - i.e. the
     <input>,
     <extract>,
     <store>, and 
     <retrieve> elements - are optional.
     Missing pipeline definitions are interpreted as
     do-nothing identity pipelines.
   
    All pipeline definition elements may contain zero or more 
    <xslt stylesheet="path/file.xsl"/>
    XSLT transformation instructions, which are performed
    sequentially from top to bottom.
    The paths in the stylesheet attributes
    are relative to Zebra's working directory, or absolute to the file
    system root.
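
    The sequential, top-to-bottom semantics described above can be
    pictured with a small Python sketch. It is only an illustration of
    the idea, not Zebra's implementation: the two step functions are
    hypothetical stand-ins for XSLT stylesheets, and documents are
    plain strings rather than DOM trees.

```python
from functools import reduce

def run_pipeline(steps, document):
    """Apply each step in order; an empty list returns the document unchanged."""
    return reduce(lambda doc, step: step(doc), steps, document)

# Hypothetical stand-ins for two XSLT stylesheets.
def strip_whitespace(doc):
    return doc.strip()

def wrap_in_record(doc):
    return "<record>" + doc + "</record>"

# A missing (empty) pipeline behaves as the identity.
print(run_pipeline([], "  <title>x</title>  "))
# Steps run from first to last, so the order matters.
print(run_pipeline([strip_whitespace, wrap_in_record], "  <title>x</title>  "))
```

    Swapping the two steps gives a different result, which is why the
    order of the <xslt> elements inside a pipeline definition matters.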
   
    The <input> pipeline definition element
    may contain either one XML Reader definition
    <xmlreader level="1"/>, used to split
    an XML collection input stream into individual XML DOM
    documents at the prescribed element level, 
    or one MARC binary
    parsing instruction
    <marc inputcharset="marc-8"/>, which defines
    a conversion to MARCXML format DOM trees. The allowed values
    of the inputcharset attribute depend on your
    local iconv™ set-up.
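
    To make the idea of splitting at an element level concrete, here
    is a minimal Python sketch. It assumes that the root element
    counts as level 0, so that level 1 selects the direct children of
    the root; this numbering is an assumption made for illustration,
    and the collection document is invented. Zebra itself works on a
    stream rather than on a fully parsed tree.

```python
import xml.etree.ElementTree as ET

def split_at_level(root, level):
    """Return the elements found `level` steps below the root (root is level 0)."""
    current = [root]
    for _ in range(level):
        current = [child for elem in current for child in elem]
    return current

# Invented sample collection with two records.
COLLECTION = (
    "<collection>"
    "<record><title>A</title></record>"
    "<record><title>B</title></record>"
    "</collection>"
)

collection_root = ET.fromstring(COLLECTION)
records = split_at_level(collection_root, 1)
for rec in records:
    # Each element would become one individual document.
    print(ET.tostring(rec, encoding="unicode"))
```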
   
    Both input parsers deliver individual DOM XML documents to the
    following chain of zero or more  
    <xslt stylesheet="path/file.xsl"/>
    XSLT transformations. At the end of this pipeline, the documents
    are in the common format, used to feed both the 
     <extract> and 
     <store> pipelines.
   
       The <extract> pipeline takes documents
       from any common DOM XML format to the Zebra specific
        indexing DOM XML format.
       It may consist of zero or more 
       <xslt stylesheet="path/file.xsl"/>
       XSLT transformations, and the outcome is handed to the
       Zebra core to drive the process of building the inverted
       indexes. See
       Section 2.5, “Canonical Indexing Format” for
       details.
     
       The <store> pipeline takes documents
       from any common DOM XML format to the Zebra specific
        storage DOM XML format.
       It may consist of zero or more 
       <xslt stylesheet="path/file.xsl"/>
       XSLT transformations, and the outcome is handed to the
       Zebra core for deposition into the internal storage system.
    
      Finally, there may be one or more 
      <retrieve> pipeline definitions, each
      of them again consisting of zero or more
      <xslt stylesheet="path/file.xsl"/>
       XSLT transformations. These are used for document
      presentation after search, and take the internal storage DOM
      XML to the requested output formats during record present
      requests.  
    
     The possibly multiple 
     <retrieve> pipeline definitions
     are distinguished by their unique name
     attributes. These are the literal schema or 
     element set names used in 
      SRW,
      SRU and
      Z39.50 protocol queries.
   
     DOM XML indexing comes in two flavors: pure
     processing-instruction governed plain XML documents, and - very
     similar to the Alvis filter indexing format - XML documents
     containing XML <record> and
     <index> instructions from the magic
     namespace xmlns:z="http://indexdata.com/zebra-2.0".
    
The output of the processing instruction driven 
      indexing XSLT stylesheets must contain
      processing instructions named 
       zebra-2.0. 
      The output of the XSLT indexing transformation is then
      parsed using DOM methods, and the contained instructions are
      performed on the elements and their
      subtrees directly following the processing instructions.
      
For example, the output of the command
  
       xsltproc dom-index-pi.xsl marc-one.xml
     might look like this:
      
      <?xml version="1.0" encoding="UTF-8"?>
      <?zebra-2.0 record id=11224466 rank=42?>
      <record>
        <?zebra-2.0 index control:0?>
        <control>11224466</control>
        <?zebra-2.0 index any:w title:w title:p title:s?>
        <title>How to program a computer</title>
      </record>
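
      The pairing of each processing instruction with the element
      that directly follows it can be sketched in Python with the
      standard xml.dom.minidom module. This is only an illustration of
      the rule described above, not the code Zebra runs; the helper
      names are hypothetical, and the XML declaration is left out of
      the sample string for brevity.

```python
from xml.dom import minidom

# The sample output from above, without the XML declaration.
SAMPLE = (
    "<?zebra-2.0 record id=11224466 rank=42?>\n"
    "<record>\n"
    "  <?zebra-2.0 index control:0?>\n"
    "  <control>11224466</control>\n"
    "  <?zebra-2.0 index any:w title:w title:p title:s?>\n"
    "  <title>How to program a computer</title>\n"
    "</record>"
)

def next_element(node):
    """Return the first element sibling after `node`, skipping text nodes."""
    sibling = node.nextSibling
    while sibling is not None and sibling.nodeType != minidom.Node.ELEMENT_NODE:
        sibling = sibling.nextSibling
    return sibling

def collect_instructions(node, found):
    """Walk the tree in document order, pairing each zebra-2.0
    processing instruction with the element that follows it."""
    for child in node.childNodes:
        if (child.nodeType == minidom.Node.PROCESSING_INSTRUCTION_NODE
                and child.target == "zebra-2.0"):
            target = next_element(child)
            name = target.tagName if target is not None else None
            found.append((child.data, name))
        elif child.nodeType == minidom.Node.ELEMENT_NODE:
            collect_instructions(child, found)
    return found

instructions = collect_instructions(minidom.parseString(SAMPLE), [])
for data, element_name in instructions:
    print(data, "->", element_name)
```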
      
     
The output of the element driven indexing XSLT stylesheets must contain
    certain elements in the magic 
     xmlns:z="http://indexdata.com/zebra-2.0"
    namespace. The output of the XSLT indexing transformation is then
    parsed using DOM methods, and the contained instructions are
    performed on the magic elements and their
    subtrees.
    
For example, the output of the command
   
      xsltproc dom-index-element.xsl marc-one.xml 
     might look like this:
      
      <?xml version="1.0" encoding="UTF-8"?>
      <z:record xmlns:z="http://indexdata.com/zebra-2.0" 
                z:id="11224466" z:rank="42">
          <z:index name="control:0">11224466</z:index>
          <z:index name="any:w title:w title:p title:s">
                    How to program a computer</z:index>
      </z:record>
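
      As a rough illustration of how the magic elements carry the
      same information, the following Python sketch reads the record
      attributes and the index instructions from the sample output
      using the standard xml.etree.ElementTree module. It only shows
      how the namespace-qualified names are addressed; it is not how
      Zebra itself processes the document.

```python
import xml.etree.ElementTree as ET

ZEBRA_NS = "http://indexdata.com/zebra-2.0"

# The sample output from above, without the XML declaration.
SAMPLE = """<z:record xmlns:z="http://indexdata.com/zebra-2.0"
          z:id="11224466" z:rank="42">
    <z:index name="control:0">11224466</z:index>
    <z:index name="any:w title:w title:p title:s">
              How to program a computer</z:index>
</z:record>"""

record = ET.fromstring(SAMPLE)

# Prefixed attributes live in the Zebra namespace.
record_id = record.get("{%s}id" % ZEBRA_NS)
rank = record.get("{%s}rank" % ZEBRA_NS)

# The unprefixed name attribute holds the index specification;
# itertext() gathers all text nodes inside the element recursively.
entries = []
for index in record.iter("{%s}index" % ZEBRA_NS):
    text = "".join(index.itertext()).strip()
    entries.append((index.get("name"), text))

print(record_id, rank)
for name, text in entries:
    print(name, "->", text)
```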
      
     
Both indexing formats are defined with equal semantics and behavior in mind:
Zebra specific instructions are either
         processing instructions named
         zebra-2.0 or
         elements contained in the namespace
         xmlns:z="http://indexdata.com/zebra-2.0".
         
There must be exactly one record
           instruction, which sets the scope for the following,
           possibly nested index instructions. 
         
            The unique record instruction
            may have additional attributes id,
            rank and type.
            Attribute id is the value of the opaque ID
            and may be any string not containing the whitespace character 
            ' '.
            The rank attribute value must be a
            non-negative integer. See 
            Section 9, “Relevance Ranking and Sorting of Result Sets”.
            The type attribute specifies how the record
            is to be treated. The following values may be given for 
            type:
            
insert: The record is inserted. If the record already exists, it is skipped (i.e. not replaced).
replace: The record is replaced. If the record does not already exist, it is skipped (i.e. not inserted).
delete: The record is deleted. If the record does not already exist, it is skipped (i.e. nothing is deleted).
update: The record is inserted or replaced depending on whether the record exists or not. This is the default behavior, but it may effectively be changed from "outside" the scope of the DOM filter by zebraidx commands or extended services updates.
          Note that the value of type determines the action
          only if the Zebra indexer is running
          in "update" mode (i.e. zebraidx update) or if the specialUpdate
          action of the
          Extended
          Service Update is used.
          For this reason a specialUpdate may end up deleting records!
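
          The four values of type amount to a small decision table.
          The following Python sketch restates the descriptions above
          as a function; the function name and the "skip" result are
          hypothetical labels chosen for illustration, not part of
          Zebra.

```python
def resolve_action(record_type, exists):
    """Return the action taken for a record, or "skip" when nothing happens."""
    if record_type == "insert":
        return "skip" if exists else "insert"
    if record_type == "replace":
        return "replace" if exists else "skip"
    if record_type == "delete":
        return "delete" if exists else "skip"
    if record_type == "update":
        # The default: insert new records, replace existing ones.
        return "replace" if exists else "insert"
    raise ValueError("unknown record type: %r" % record_type)

for record_type in ("insert", "replace", "delete", "update"):
    for exists in (False, True):
        print(record_type, exists, "->", resolve_action(record_type, exists))
```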
         
 There may be multiple, possibly nested index
         instructions. Each must contain at least one 
         indexname:indextype
         pair, and may contain multiple such pairs separated by the 
         whitespace character  ' '. In each index
         pair, the name and the type of the index are separated by a 
         colon character ':'.
         
Any index name consisting of ASCII letters and following the standard Zebra rules will do; see Section 3.5.1, “Mapping of PQF APT access points”.
 
         Index types are restricted to the values defined in
         the standard configuration
         file default.idx, see
         Section 2.3, “BIB-1 Attribute Set” and 
         Chapter 10, Field Structure and Character Sets
   for details.
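
         The syntax of an index instruction value can be illustrated
         with a short Python sketch that splits it into name and type
         pairs. The function is hypothetical and checks only the shape
         of each pair; it does not verify that the index type is
         actually defined in default.idx.

```python
def parse_index_spec(spec):
    """Split an index instruction value into (name, type) pairs."""
    pairs = []
    # Pairs are separated by the space character; empty pieces are ignored.
    for token in (piece for piece in spec.split(" ") if piece):
        name, sep, index_type = token.partition(":")
        if not sep or not name or not index_type:
            raise ValueError("expected indexname:indextype, got %r" % token)
        pairs.append((name, index_type))
    if not pairs:
        raise ValueError("at least one indexname:indextype pair is required")
    return pairs

print(parse_index_spec("any:w title:w title:p title:s"))
```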
         
 
         DOM input documents which do not result in both one
         unique valid 
         record instruction and one or more valid 
         index instructions cannot be searched and
         found. Therefore,
         processing of invalid documents is aborted, and any content of
         the <extract> and 
         <store> pipelines is discarded.
          A warning is issued in the logs. 
         
The examples work as follows: 
     From the original XML file 
     marc-one.xml (or from the XML record DOM of the
     same form coming from an <input>
     pipeline),
     the indexing
     pipeline <extract>
     produces an indexing XML record, which is defined by
     the record instruction.
     Zebra uses the content of 
     z:id="11224466" 
     or
     id=11224466 
     as internal
     record ID, and - in case static ranking is set - the content of 
     rank=42
     or
     z:rank="42"
     as static rank.
    
In these examples, the following literal indexes are constructed:
       any:w
       control:0
       title:w
       title:p
       title:s
     
     where the indexing type is defined after the 
     literal ':' character.  
     Any value from the standard configuration
     file default.idx will do.
     Finally, any 
     text() node content recursively contained
     inside the <z:index> element, or any
     element following an index processing instruction,
     will be filtered through the
     appropriate char map for character normalization, and will be
     inserted in the named indexes.
    
Finally, this example configuration can be queried using PQF queries, either transported by Z39.50 (here using yaz-client)
      
      Z> open localhost:9999
      Z> elem dc
      Z> form xml
      Z>
      Z> find @attr 1=control @attr 4=3 11224466
      Z> scan @attr 1=control @attr 4=3 ""
      Z>
      Z> find @attr 1=title program
      Z> scan @attr 1=title ""
      Z>
      Z> find @attr 1=title @attr 4=2 "How to program a computer"
      Z> scan @attr 1=title @attr 4=2 ""
      
     
     or the proprietary
     extensions x-pquery and
     x-pScanClause to
     SRU, and SRW
     
      
      http://localhost:9999/?version=1.1&operation=searchRetrieve&x-pquery=@attr 1=title program
      http://localhost:9999/?version=1.1&operation=scan&x-pScanClause=@attr 1=title ""
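
      The URLs above are shown unencoded for readability. When such a
      request is built programmatically, the PQF query has to be
      percent-encoded. A minimal Python sketch using the standard
      urllib.parse module follows; the helper name sru_url is
      hypothetical, and the host and port are simply those of the
      example.

```python
from urllib.parse import urlencode, quote, parse_qs, urlsplit

BASE = "http://localhost:9999/"

def sru_url(operation, extra):
    """Build a request URL with percent-encoded parameter values."""
    params = {"version": "1.1", "operation": operation}
    params.update(extra)
    # quote_via=quote encodes spaces as %20 rather than '+'.
    return BASE + "?" + urlencode(params, quote_via=quote)

SEARCH_URL = sru_url("searchRetrieve", {"x-pquery": "@attr 1=title program"})
SCAN_URL = sru_url("scan", {"x-pScanClause": '@attr 1=title ""'})

print(SEARCH_URL)
print(SCAN_URL)

# Decoding the query string gives back the original PQF text.
print(parse_qs(urlsplit(SEARCH_URL).query)["x-pquery"][0])
```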
      
     See the section called “The SRU Server” for more information on SRU/SRW configuration, and the section called “YAZ server virtual hosts” or the YAZ CQL section for the details of the YAZ frontend server.
     Notice that there are no *.abs,
     *.est, *.map, or other GRS-1
     filter configuration files involved in this process, and that the
     literal index names are used during search and retrieval.
    
     If we want to support the usual
     bib-1 Z39.50 numeric access points, it is a
     good idea to choose string index names defined in the default
     configuration file tab/bib1.att, see   
     Section 3.4, “The Attribute Set (.att) Files”.