The Zebra internal query engine has been extended to meet specific needs
    not covered by the bib-1 attribute set query
    model. These extensions are non-standard
    and non-portable: most functional extensions
    are modeled over the bib-1 attribute set,
    defining type 7 and higher values.
    There are also the special 
    string type index names for the
    idxpath attribute set.  
   
     Zebra defines a hardwired string index name
     called _ALLRECORDS. It matches any record
     contained in the database, if used in conjunction with 
     the relation attribute 
     AlwaysMatches (103).
     
     The _ALLRECORDS index name is used for total database
     export. The search term is ignored and may be empty.
     
      Z> find @attr 1=_ALLRECORDS @attr 2=103 ""
     
     Combinations with other index types are possible. For example, to
     find all records which are not indexed in
     the Title register, issue one of the two
     equivalent queries:
     
      Z> find @not @attr 1=_ALLRECORDS @attr 2=103 "" @attr 1=Title @attr 2=103 ""
      Z> find @not @attr 1=_ALLRECORDS @attr 2=103 "" @attr 1=4 @attr 2=103 ""
     
      The special string index _ALLRECORDS is
      experimental, and the provided functionality and syntax may very
      well change in future releases of Zebra.
     
     Zebra extends the BIB-1 attribute types, and these extensions are
     recognized regardless of attribute 
     set used in a search operation query.
    
Table 5.9. Zebra Search Attribute Extensions
| Name | Value | Operation | Zebra version | 
|---|---|---|---|
| Embedded Sort | 7 | search | 1.1 | 
| Term Set | 8 | search | 1.1 | 
| Rank Weight | 9 | search | 1.1 | 
| Term Reference | 10 | search | 1.4 | 
| Local Approx Limit | 11 | search | 1.4 | 
| Global Approx Limit | 12 | search | 2.0.8 | 
| Maximum number of truncated terms (truncmax) | 13 | search | 2.0.10 | 
| Specifies whether un-indexed fields should be ignored. A zero value (default) throws a diagnostic when an un-indexed field is specified. A non-zero value makes it return 0 hits. | 14 | search | 2.0.16 | 
The embedded sort is a way to specify sorting within a query, which removes the need to send a separate Sort Request. It is faster, and clients do not have to deal with the Sort Facility.
      All ordering operations are based on a lexicographical ordering, 
      except when the 
      structure attribute numeric (109) is used. In
      this case, ordering is numerical. See 
      Section 2.4.3, “Structure Attributes (type 4)”.
     
      The possible values after attribute type 7 are
      1 ascending and 
      2 descending. 
      The sort criterion is given as its own attributes+term (APT) node,
      separate from the rest of the query, and must be @or'ed with it. 
      The term associated with APT is the sorting level in integers,
      where 0 means primary sort, 
      1 means secondary sort, and so forth.
      See also Section 9, “Relevance Ranking and Sorting of Result Sets”.
     
For example, to search for water and sort by title (ascending):
       Z> find @or @attr 1=1016 water @attr 7=1 @attr 1=4 0
      
Or, to search for water and sort by title ascending, then by date descending:
       Z> find @or @or @attr 1=1016 water @attr 7=1 @attr 1=4 0 @attr 7=2 @attr 1=30 1
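
The nesting of the @or operators above can be generated mechanically. The following Python sketch is only an illustration: the helper name `with_embedded_sort` and its argument format are hypothetical and not part of Zebra or YAZ. It assumes one extra `@or` per sort node, as in the two examples above.

```python
def with_embedded_sort(query, sort_keys):
    """Wrap a PQF query string with embedded-sort APT nodes.

    sort_keys is a list of (use_attribute, direction) pairs in priority
    order; direction is "asc" (@attr 7=1) or "desc" (@attr 7=2).
    This helper is hypothetical and only builds the query text.
    """
    directions = {"asc": 1, "desc": 2}
    sort_nodes = []
    for level, (use, direction) in enumerate(sort_keys):
        # The term of each sort node is its sorting level: 0 = primary, 1 = secondary, ...
        sort_nodes.append(f"@attr 7={directions[direction]} @attr 1={use} {level}")
    # One @or per sort node, followed by the query and then the sort nodes.
    return " ".join(["@or"] * len(sort_nodes) + [query] + sort_nodes)


single = with_embedded_sort("@attr 1=1016 water", [(4, "asc")])
double = with_embedded_sort("@attr 1=1016 water", [(4, "asc"), (30, "desc")])
print(single)
print(double)
```

The two printed strings reproduce the query parts of the examples above.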
      
Rank weight is a way to pass a value to a ranking algorithm, so that one APT has one value while another has a different one. See also Section 9, “Relevance Ranking and Sorting of Result Sets”.
For example, searching for utah in title with weight 30 as well as any with weight 20:
  
       Z> find @attr 2=102 @or @attr 9=30 @attr 1=4 utah @attr 9=20 utah
      
Zebra supports the searchResult-1 facility. If the Term Reference Attribute (type 10) is given, that specifies a subqueryId value returned as part of the search result. It is a way for a client to name an APT part of a query.
Experimental. Do not use in production code.
Zebra computes - unless otherwise configured - the exact hit count for every APT (leaf) in the query tree. These hit counts are returned as part of the searchResult-1 facility in the binary encoded Z39.50 search response packages.
When an estimation limit is set on the result set of an APT leaf, Zebra stops processing that result set once the limit is reached. Hit counts under this limit are still precise, but hit counts over it are estimated using the statistics gathered from the truncated result set.
      Specifying a limit of 0 results in exact hit counts.
     
For example, we might be interested in exact hit count for a, but for b we allow hit count estimates for 1000 and higher.
       Z> find @and a @attr 11=1000 b
      
The estimated hit count facility makes searches faster, as large hit lists only need to be processed partially. It is mostly used in huge databases, where you want to trade exactness of hit counts for speed of execution.
Do not use approximative hit count limits in conjunction with relevance ranking, as re-sorting of the result set only works when the entire result set has been processed.
      By default Zebra computes precise hit counts for a query as
      a whole. Setting attribute 12 makes it perform approximative
      hit counts instead. It has the same semantics as the
      estimatehits setting described in Section 2, “The Zebra Configuration File”.
     
The attribute (12) can occur anywhere in the query tree. Unlike regular attributes it does not relate to the leaf (APT) - but to the whole query.
Do not use approximative hit count limits in conjunction with relevance ranking, as re-sorting of the result set only works when the entire result set has been processed.
Zebra extends the BIB-1 attribute types, and these extensions are recognized regardless of attribute set used in a scan operation query.
Table 5.10. Zebra Scan Attribute Extensions
| Name | Type | Operation | Zebra version | 
|---|---|---|---|
| Result Set Narrow | 8 | scan | 1.3 | 
| Approximative Limit | 12 | scan | 2.0.20 | 
      If attribute Result Set Narrow (type 8)
      is given for scan, the value is the name of a
      result set. Each hit count in scan is 
      @and'ed with the result set given. 
     
Consider for example the case of scanning all title fields around the scanterm mozart, then refining the scan by issuing a filtering query for amadeus to restrict the scan to the result set of the query:
      Z> scan @attr 1=4 mozart 
      ...
      * mozart (43)
        mozartforskningen (1)
        mozartiana (1)
        mozarts (16)
      ...
      Z> f @attr 1=4 amadeus   
      ...
      Number of hits: 15, setno 2
      ...
      Z> scan @attr 1=4 @attr 8=2 mozart
      ...
      * mozart (14)
        mozartforskningen (0)
        mozartiana (0)
        mozarts (1)
      ...
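
The effect of Result Set Narrow can be pictured as a set intersection per scan term. The Python sketch below uses made-up toy postings (record ids per term) and hypothetical helper names. It is a conceptual model only and says nothing about how Zebra implements scan internally.

```python
# Toy data (invented for illustration): term -> set of record ids.
title_index = {
    "mozart": {1, 2, 3, 4},
    "mozartiana": {5},
    "mozarts": {2, 6},
}
# Record ids of a previously created result set, e.g. from a search on "amadeus".
result_set = {2, 3, 7}


def scan_narrowed(index, narrowing_set):
    """Return term -> hit count, each count @and'ed with the narrowing set."""
    return {term: len(postings & narrowing_set)
            for term, postings in sorted(index.items())}


def skip_zero_hits(counts):
    """Drop terms whose narrowed hit count is 0."""
    return {term: n for term, n in counts.items() if n > 0}


narrowed = scan_narrowed(title_index, result_set)
print(narrowed)
print(skip_zero_hits(narrowed))
```

In this model every scanned term still has to be intersected with the result set before it can be skipped, which gives an intuition for why skipping many 0-hit terms can be expensive.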
      
Zebra 2.0.2 and later is able to skip 0 hit counts. This, however, is known not to scale if the number of terms to skip is high, which is most likely to happen when the result set is small (resulting in many 0 hits).
     The attribute-set idxpath consists of a single 
     Use (type 1) attribute. All non-use attributes behave as normal. 
    
     This feature is enabled when defining the
     xpath enable option in the GRS-1 filter
     *.abs configuration files. If one wants to use
     the special idxpath numeric attribute set, the
     main Zebra configuration file zebra.cfg
     directive attset: idxpath.att must be enabled.
    
      The idxpath is deprecated, may not be
      supported in future Zebra versions, and should definitely
      not be used in production code.
     
This attribute set allows one to search GRS-1 filter indexed records by XPATH-like structured index names.
       The idxpath option defines hard-coded
       index names, which might clash with your own index names.
      
Table 5.11. Zebra specific IDXPATH Use Attributes (type 1)
| IDXPATH | Value | String Index | Notes | 
|---|---|---|---|
| XPATH Begin | 1 | _XPATH_BEGIN | deprecated | 
| XPATH End | 2 | _XPATH_END | deprecated | 
| XPATH CData | 1016 | _XPATH_CDATA | deprecated | 
| XPATH Attribute Name | 3 | _XPATH_ATTR_NAME | deprecated | 
| XPATH Attribute CData | 1015 | _XPATH_ATTR_CDATA | deprecated | 
      See tab/idxpath.att for more information.
     
      Search for all documents starting with root element 
      /root (either using the numeric or the string
      use attributes):
      
       Z> find @attrset idxpath @attr 1=1 @attr 4=3 root/ 
       Z> find @attr idxpath 1=1 @attr 4=3 root/ 
       Z> find @attr 1=_XPATH_BEGIN @attr 4=3 root/ 
      
      Search for all documents where specific nested XPATH 
      /c1/c2/../cn exists. Notice the very
      counter-intuitive reverse notation!
      
       Z> find @attrset idxpath @attr 1=1 @attr 4=3 cn/cn-1/../c1/ 
       Z> find @attr 1=_XPATH_BEGIN @attr 4=3 cn/cn-1/../c1/ 
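
Because the reverse notation is easy to get wrong, a small conversion helper can be useful on the client side. This Python sketch assumes plain element paths such as /c1/c2/c3, without predicates, attributes or wildcards. The function name `to_idxpath_term` is hypothetical.

```python
def to_idxpath_term(xpath):
    """Convert a simple XPath like '/c1/c2/c3' to the reversed term 'c3/c2/c1/'."""
    steps = [step for step in xpath.split("/") if step]
    if not steps:
        raise ValueError("expected at least one element name")
    # Innermost element first, each step followed by a slash.
    return "/".join(reversed(steps)) + "/"


print(to_idxpath_term("/root"))
print(to_idxpath_term("/c1/c2/c3"))
```

The result is meant to be used as the search term of an `@attr 1=_XPATH_BEGIN @attr 4=3` query, as in the examples above.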
      
Search for CDATA string text in any element
       Z> find @attrset idxpath @attr 1=1016 text
       Z> find @attr 1=_XPATH_CDATA text
      
Search for CDATA string anothertext in any attribute:
 
       Z> find @attrset idxpath @attr 1=1015 anothertext
       Z> find @attr 1=_XPATH_ATTR_CDATA anothertext
      
Search for all documents which have an XML element node including an XML attribute named creator:
 
       Z> find @attrset idxpath @attr 1=3 @attr 4=3 creator 
       Z> find @attr 1=_XPATH_ATTR_NAME @attr 4=3 creator 
      
      Combining usual bib-1 attribute set searches
      with idxpath attribute set searches:
      
       Z> find @and @attr idxpath 1=1 @attr 4=3 link/ @attr 1=4 mozart
       Z> find @and @attr 1=_XPATH_BEGIN @attr 4=3 link/ @attr 1=_XPATH_CDATA mozart
      
      Scanning is supported on all idxpath
      indexes, whether specified as numeric use attributes or as string
      index names. 
      
       Z> scan  @attrset idxpath @attr 1=1016 text
       Z> scan  @attr 1=_XPATH_ATTR_CDATA anothertext
       Z> scan  @attrset idxpath @attr 1=3 @attr 4=3 ''
      
The rules for PQF APT mapping are rather tricky to grasp in the first place. We deal first with the rules for deciding which internal register or string index to use, according to the use attribute or access point specified in the query. Thereafter we deal with the rules for determining the correct structure type of the named register.
Zebra understands four fundamentally different types of access points, of which only the numeric use attribute type access points are defined by the Z39.50 standard. All other access point types are Zebra specific, and non-portable.
Table 5.12. Access point name mapping
| Access Point | Type | Grammar | Notes | 
|---|---|---|---|
| Use attribute | numeric | [1-9][0-9]* | directly mapped to string index name | 
| String index name | string | [a-zA-Z](\-?[a-zA-Z0-9])* | normalized name is used as internal string index name | 
| Zebra internal index name | zebra | _[a-zA-Z](_?[a-zA-Z0-9])* | hardwired internal string index name | 
| XPATH special index | XPath | /.* | special xpath search for GRS-1 indexed records | 
      Attribute set names and 
      string index names are normalized
      according to the following rules: all single
      hyphens '-' are stripped, and all upper case
      letters are folded to lower case.
     
      Numeric use attributes are mapped 
      to the Zebra internal
      string index according to the attribute set definition in use.
      The default attribute set is BIB-1, and may be
      omitted in the PQF query.
     
According to normalization and numeric use attribute mapping, it follows that the following PQF queries are considered equivalent (assuming the default configuration has not been altered):
      Z> find  @attr 1=Body-of-text serenade
      Z> find  @attr 1=bodyoftext serenade
      Z> find  @attr 1=BodyOfText serenade
      Z> find  @attr 1=bO-d-Y-of-tE-x-t serenade
      Z> find  @attr 1=1010 serenade
      Z> find  @attrset BIB-1 @attr 1=1010 serenade
      Z> find  @attrset bib1 @attr 1=1010 serenade
      Z> find  @attrset Bib1 @attr 1=1010 serenade
      Z> find  @attrset b-I-b-1 @attr 1=1010 serenade
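
The normalization rule can be sketched in a few lines of Python. This version simply removes every hyphen and lowercases the name. How Zebra treats doubled hyphens is not covered here, and the function name is hypothetical.

```python
def normalize_name(name):
    """Strip hyphens and fold upper case letters to lower case."""
    return name.replace("-", "").lower()


index_names = ["Body-of-text", "bodyoftext", "BodyOfText", "bO-d-Y-of-tE-x-t"]
attrset_names = ["BIB-1", "bib1", "Bib1", "b-I-b-1"]

# All spellings of each name collapse to a single normalized form.
print({normalize_name(n) for n in index_names})
print({normalize_name(n) for n in attrset_names})
```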
     
      The numerical
      use attributes (type 1)  
      are interpreted according to the
      attribute sets which have been loaded in the
      zebra.cfg file, and are matched against specific
      fields as specified in the .abs file which
      describes the profile of the records which have been loaded.
      If no use attribute is provided, a default of 
      BIB-1 Use Any (1016) is assumed.
      The predefined use attribute sets
      can be reconfigured by  tweaking the configuration files
      tab/*.att, and 
      new attribute sets can be defined by adding similar files in the
      configuration path profilePath of the server.  
    
      String indexes can be accessed directly,
      independently of which attribute set is in use; any attribute set
      specification is simply ignored. The above mentioned name normalization applies.
      String index names are defined in the
      used indexing  filter configuration files, for example in the
      GRS-1 
      *.abs configuration files, or in the
      alvis filter XSLT indexing stylesheets.
     
      Zebra internal indexes can be accessed directly,
      according to the same rules as the user defined
      string indexes. The only difference is that   
      Zebra internal index names are hardwired,
      all uppercase and
      must start with the character '_'. 
     
      Finally, XPATH access points are only
      available using the GRS-1 filter for indexing.
      These access point names must start with the character
      '/', they are not
      normalized, but passed unaltered to the Zebra internal
      XPATH engine. See Section 2.1.6, “Zebra's special access point of type 'XPath' 
      for GRS-1 filters”.
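
The four access point types can be told apart purely by their syntax. The following Python sketch classifies the value of a use attribute with regular expressions modeled on the grammars in Table 5.12. It assumes that numeric use attributes are positive integers without leading zeros. The names `ACCESS_POINT_GRAMMARS` and `classify_access_point` are hypothetical.

```python
import re

# Grammars modeled on Table 5.12, tried in order.
ACCESS_POINT_GRAMMARS = [
    ("numeric", re.compile(r"[1-9][0-9]*")),
    ("string", re.compile(r"[a-zA-Z](-?[a-zA-Z0-9])*")),
    ("zebra", re.compile(r"_[a-zA-Z](_?[a-zA-Z0-9])*")),
    ("xpath", re.compile(r"/.*")),
]


def classify_access_point(value):
    """Return the access point type of a use attribute value, or None."""
    for kind, grammar in ACCESS_POINT_GRAMMARS:
        if grammar.fullmatch(value):
            return kind
    return None


for value in ["1016", "Body-of-text", "_ALLRECORDS", "/root/title"]:
    print(value, "->", classify_access_point(value))
```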
     
Internally Zebra has in its default configuration several different types of registers or indexes, whose tokenization and character normalization rules differ. This reflects the fact that searching fundamentally different tokens like dates, numbers, bitfields and string based text needs different rule sets.
Table 5.13. Structure and completeness mapping to register types
| Structure | Completeness | Register type | Notes | 
|---|---|---|---|
| phrase (@attr 4=1), word (@attr 4=2), word-list (@attr 4=6), free-form-text (@attr 4=105), or document-text (@attr 4=106) | Incomplete field (@attr 6=1) | Word ('w') | Traditional tokenized and character normalized word index | 
| phrase (@attr 4=1), word (@attr 4=2), word-list (@attr 4=6), free-form-text (@attr 4=105), or document-text (@attr 4=106) | Complete field (@attr 6=3) | Phrase ('p') | Character normalized, but not tokenized index for phrase matches | 
| urx (@attr 4=104) | ignored | URX/URL ('u') | Special index for URL web addresses | 
| numeric (@attr 4=109) | ignored | Numeric ('n') | Special index for digital numbers | 
| key (@attr 4=3) | ignored | Null bitmap ('0') | Used for non-tokenized and non-normalized bit sequences | 
| year (@attr 4=4) | ignored | Year ('y') | Non-tokenized and non-normalized 4 digit numbers | 
| date (@attr 4=5) | ignored | Date ('d') | Non-tokenized and non-normalized ISO date strings | 
| ignored | ignored | Sort ('s') | Used with special sort attribute set (@attr 7=1, @attr 7=2) | 
| overruled | overruled | special | Internal record ID register, used whenever Relation Always Matches (@attr 2=103) is specified | 
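
Table 5.13 can be read as a lookup from attribute values to a register type. The Python sketch below encodes only the rows of the table. Combinations that the table does not list (for example completeness value 2) return None rather than a guess, and the sort register row is left out. The function and constant names are hypothetical.

```python
# Structures whose register depends on the completeness attribute.
WORD_LIKE_STRUCTURES = {1, 2, 6, 105, 106}
# Structures that map to one register regardless of completeness.
FIXED_REGISTERS = {104: "u", 109: "n", 3: "0", 4: "y", 5: "d"}


def register_type(structure, completeness=1, relation=None):
    """Map structure (type 4), completeness (type 6) and relation (type 2)
    attribute values to a register type code from Table 5.13."""
    if relation == 103:
        # Always Matches overrules structure and completeness.
        return "special"
    if structure in FIXED_REGISTERS:
        return FIXED_REGISTERS[structure]
    if structure in WORD_LIKE_STRUCTURES:
        if completeness == 1:
            return "w"
        if completeness == 3:
            return "p"
    return None  # combination not listed in the table


print(register_type(1, 1))                 # phrase, incomplete field
print(register_type(1, 3))                 # phrase, complete field
print(register_type(109))                  # numeric
print(register_type(1, 3, relation=103))   # always matches
```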
     If a Structure attribute of
     Phrase is used in conjunction with a
     Completeness attribute of
     Complete (Sub)field, the term is matched
     against the contents of the phrase (long word) register, if one
     exists for the given Use attribute.
     A phrase register is created for those fields in the
     GRS-1 *.abs file that contain a
     p-specifier.
      
       Z> scan @attr 1=Title @attr 4=1 @attr 6=3 beethoven 
       ...
       bayreuther festspiele (1)
       * beethoven bibliography database (1)
       benny carter (1)
       ...
       Z> find @attr 1=Title @attr 4=1 @attr 6=3 "beethoven bibliography" 
       ...
       Number of hits: 0, setno 5
       ...
       Z> find @attr 1=Title @attr 4=1 @attr 6=3 "beethoven bibliography database" 
       ...
       Number of hits: 1, setno 6
       
     If Structure=Phrase is
     used in conjunction with Incomplete Field - the
     default value for Completeness, the
     search is directed against the normal word registers, but if the term
     contains multiple words, the term will only match if all of the words
     are found immediately adjacent, and in the given order.
     The word search is performed on those fields that are indexed as
     type w in the GRS-1 *.abs file.
      
       Z> scan @attr 1=Title @attr 4=1 @attr 6=1 beethoven 
       ...
         beefheart (1)
       * beethoven (18)
         beethovens (7)
       ...
       Z> find @attr 1=Title @attr 4=1 @attr 6=1 beethoven 
       ...
       Number of hits: 18, setno 1
       ...
       Z> find @attr 1=Title @attr 4=1 @attr 6=1 "beethoven  bibliography"
       ...
       Number of hits: 2, setno 2
       ...
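
The difference between the two completeness settings can be modeled with a few lines of Python. This is a conceptual sketch only: it assumes a very simple normalization (lowercasing and splitting on whitespace) that stands in for Zebra's configurable character maps. The function names are hypothetical.

```python
def tokenize(text):
    """Simplified stand-in for tokenization and character normalization."""
    return text.lower().split()


def phrase_register_match(field, term):
    """Complete field: the term must equal the whole normalized field."""
    return tokenize(field) == tokenize(term)


def word_register_phrase_match(field, term):
    """Incomplete field: all words of the term must occur immediately
    adjacent and in the given order somewhere in the field."""
    field_words, term_words = tokenize(field), tokenize(term)
    n = len(term_words)
    if n == 0:
        return False
    return any(field_words[i:i + n] == term_words
               for i in range(len(field_words) - n + 1))


title = "Beethoven Bibliography Database"
print(phrase_register_match(title, "beethoven bibliography"))
print(phrase_register_match(title, "beethoven bibliography database"))
print(word_register_phrase_match(title, "beethoven  bibliography"))
print(word_register_phrase_match(title, "bibliography beethoven"))
```

The first two calls mirror the phrase register searches shown earlier (no hit for the partial title, a hit for the complete one). The last two show that on the word register a partial phrase matches, but only in the given word order.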
     
     If the Structure attribute is
     Word List,
     Free-form Text, or
     Document Text, the term is treated as a
     natural-language, relevance-ranked query.
     This search type uses the word register, i.e. those fields
     that are indexed as type w in the
     GRS-1 *.abs file.
    
     If the Structure attribute is
     Numeric String the term is treated as an integer.
     The search is performed on those fields that are indexed
     as type n in the GRS-1 
      *.abs file.
    
     If the Structure attribute is
     URX the term is treated as a URX (URL) entity.
     The search is performed on those fields that are indexed as type
     u in the *.abs file.
    
If the Structure attribute is Local Number, the term is treated as a native Zebra Record Identifier.
If the Relation attribute is Equals (default), the term is matched in a normal fashion (modulo truncation and processing of individual words, if required). If Relation is Less Than, Less Than or Equal, Greater than, or Greater than or Equal, the term is assumed to be numerical, and a standard regular expression is constructed to match the given expression. If Relation is Relevance, the standard natural-language query processor is invoked.
For the Truncation attribute, No Truncation is the default. Left Truncation is not supported. Process # in search term is supported, as is Regxp-1. Regxp-2 enables the fault-tolerant (fuzzy) search. As a default, a single error (deletion, insertion, replacement) is accepted when terms are matched against the register contents.
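
What "a single error (deletion, insertion, replacement)" means can be illustrated with the classic edit distance. The Python sketch below is a textbook dynamic-programming implementation used only to explain the concept. It is not a description of how Zebra matches terms internally, and the function names are hypothetical.

```python
def edit_distance(a, b):
    """Number of single-character deletions, insertions and replacements
    needed to turn string a into string b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(prev[j] + 1,                # delete ca
                           cur[j - 1] + 1,             # insert cb
                           prev[j - 1] + (ca != cb)))  # replace or keep
        prev = cur
    return prev[-1]


def within_tolerance(query_term, index_term, max_errors=1):
    """True if the index term is reachable with at most max_errors edits."""
    return edit_distance(query_term, index_term) <= max_errors


for candidate in ["bethoven", "beethovens", "beathoven", "bach"]:
    print(candidate, within_tolerance("beethoven", candidate))
```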
Each term in a query is interpreted as a regular expression if the truncation value is either Regxp-1 (@attr 5=102) or Regxp-2 (@attr 5=103). Both query types follow the same syntax with the operands:
Table 5.14. Regular Expression Operands
| Operand | Description | 
|---|---|
| x | Matches the character x. | 
| . | Matches any character. | 
| [ .. ] | Matches the set of characters specified, such as [abc] or [a-c]. | 
The above operands can be combined with the following operators:
Table 5.15. Regular Expression Operators
| Operator | Description | 
|---|---|
| x* | Matches x zero or more times. Priority: high. | 
| x+ | Matches x one or more times. Priority: high. | 
| x? | Matches x zero or once. Priority: high. | 
| xy | Matches x, then y. Priority: medium. | 
| x\|y | Matches either x or y. Priority: low. | 
| ( ) | The order of evaluation may be changed by using parentheses. | 
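
The operands and operators listed above have the same meaning in most regular expression dialects, so their behaviour can be explored outside Zebra. The Python sketch below uses the standard `re` module on a made-up term list. It assumes that a pattern has to match a whole term, and it says nothing about any other syntax that Python supports but Zebra may not.

```python
import re

# Invented sample terms standing in for the contents of a register.
terms = ["information", "informatics", "inform", "retrieval"]


def matching_terms(pattern, candidates):
    """Return the candidates that the pattern matches completely."""
    compiled = re.compile(pattern)
    return [term for term in candidates if compiled.fullmatch(term)]


print(matching_terms("informat.*", terms))         # x* applied to '.'
print(matching_terms("inform(ation)?", terms))     # optional group
print(matching_terms("informati(on|cs)", terms))   # alternation in parentheses
print(matching_terms("inform|retrieval", terms))   # low-priority alternation
print(matching_terms("retr[aeiou]+val", terms))    # character set with x+
```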
     If the first character of the Regxp-2 query
     is a plus character (+) it marks the
     beginning of a section with non-standard specifiers.
     The next plus character marks the end of the section.
     Currently Zebra only supports one specifier, the error tolerance,
     which consists of one digit. 
     
    
Since the plus operator is normally a suffix operator the addition to the query syntax doesn't violate the syntax for standard regular expressions.
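
The specifier section can be separated from the rest of the term with a short parser. The Python sketch below follows the description above: a leading plus, one digit, and a closing plus. The function name is hypothetical, and the fallback value of 1 is taken from the default of a single accepted error mentioned earlier.

```python
import re

# A leading '+', exactly one digit (the error tolerance), and a closing '+'.
_SPECIFIER_SECTION = re.compile(r"\+([0-9])\+")


def split_error_tolerance(term, default=1):
    """Return (error_tolerance, regular_expression) for a Regxp-2 term."""
    match = _SPECIFIER_SECTION.match(term)
    if match:
        return int(match.group(1)), term[match.end():]
    # No specifier section: keep the term unchanged and use the default.
    return default, term


print(split_error_tolerance("+2+informat.*"))
print(split_error_tolerance("informat.*"))
```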
For example, a phrase search with regular expressions in the title-register is performed like this:
      Z> find @attr 1=4 @attr 5=102 "informat.* retrieval"
     
Combinations with other attributes are possible. For example, a ranked search with a regular expression:
      Z> find @attr 1=4 @attr 5=102 @attr 2=102 "informat.* retrieval"