cts:and-not-query(
  $positive-query as cts:query,
  $negative-query as cts:query
) as cts:and-not-query
Summary:
Returns a query specifying the set difference of
the matches specified by two sub-queries.
|
Parameters:
$positive-query
:
A positive query, specifying the search results
filtered in.
|
$negative-query
:
A negative query, specifying the search results
to filter out.
|
|
Usage Notes:
The cts:and-not-query constructor is fragment-based: the negated
part matches only if $negative-query does not produce a match
anywhere in a fragment. Therefore, a search using
cts:and-not-query is only guaranteed to be accurate if the
underlying query that is being negated is accurate from its index
resolution (that is,
if the unfiltered results of the $negative-query parameter to
cts:and-not-query are accurate). The accuracy of the index
resolution depends on many factors, such as the query itself, whether
you search at a fragment root (that is, whether the first parameter of
cts:search specifies an XPath that resolves to a fragment root),
the index options enabled on the database, and the search options.
In cases where the $negative-query parameter has false
positive matches,
the negation of the query can miss matches (have false negative matches).
In these cases,
searches with cts:and-not-query can miss results, even if those
searches are filtered.
|
Example:
cts:search(//PLAY,
cts:and-not-query(
cts:word-query("summer"),
cts:word-query("glorious")))
=> .. sequence of 'PLAY' elements containing some
text node with the word 'summer' BUT NOT containing
any text node with the word 'glorious'. This sequence
may be (in fact is) non-empty, but certainly does not
contain the PLAY element with:
PLAY/TITLE =
"The Tragedy of King Richard the Second"
since this play contains both 'glorious' and 'summer'.
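The usage notes above warn that false positives in $negative-query can cause missed results. One possible workaround, shown here only as a sketch and not as a documented recommendation, is to resolve the positive query alone and then test each returned node for the negative term with cts:contains. This may be noticeably slower on large result sets, since every candidate node is inspected.

```xquery
xquery version "1.0-ml";

(: Sketch of a workaround (an assumption, not part of the reference):
   search on the positive query only, then check each returned PLAY
   element directly for the negative term. :)
for $play in cts:search(//PLAY, cts:word-query("summer"))
where fn:not(cts:contains($play, cts:word-query("glorious")))
return $play/TITLE
```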
|

cts:and-query(
  $queries as cts:query*,
  [$options as xs:string*]
) as cts:and-query
Summary:
Returns a query specifying the intersection
of the matches specified by the sub-queries.
|
Parameters:
$queries
:
A sequence of sub-queries.
|
$options
(optional):
Options to this query. The default is ().
Options include:
- "ordered"
- An ordered and-query, which specifies that the sub-query matches
must occur in the order of the specified sub-queries. For example,
if the sub-queries are "cat" and "dog", an ordered
query will only match fragments where both "cat" and "dog" occur,
and where "cat" comes before "dog" in the fragment.
- "unordered"
- An unordered and-query, which specifies that the sub-query matches
can occur in any order.
|
|
Usage Notes:
If the options parameter contains neither "ordered" nor "unordered",
then the default is "unordered".
If you specify the empty sequence for the queries parameter
to cts:and-query, you will get a match for every document in
the database. For example, the following query returns true whenever
the database contains at least one document:
cts:contains(collection(), cts:and-query(()))
In order to match a cts:and-query, the matches
from each of the specified sub-queries must all occur in the same
fragment.
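As an illustration of the "ordered" option described above, the following sketch uses the "cat" and "dog" sub-queries from the parameter description. Searching over fn:collection() is an assumption made for the example; any searchable expression could be used instead.

```xquery
xquery version "1.0-ml";

(: Matches only fragments in which a match for "cat" occurs
   before a match for "dog". Without "ordered", the two terms
   could occur in either order. :)
cts:search(fn:collection(),
  cts:and-query(
    (cts:word-query("cat"), cts:word-query("dog")),
    "ordered"))
```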
|
Example:
cts:search(//PLAY,
cts:and-query((
cts:word-query("to be or"),
cts:word-query("or not to be"))))
=> .. a sequence of 'PLAY' elements which are
ancestors (or self) of some node whose text content
contains the phrase 'to be or' AND some node
whose text content contains the phrase 'or not to be'.
With high probability this intersection contains only
one 'PLAY' element, namely,
PLAY/TITLE =
"The Tragedy of Hamlet, Prince of Denmark".
|

cts:directory-query(
  $uris as xs:string*,
  [$depth as xs:string?]
) as cts:directory-query
Summary:
Returns a query matching documents in the directories with the given URIs.
|
Parameters:
$uris
:
One or more directory URIs.
|
$depth
(optional):
"1" for immediate children only, "infinity" for all descendants.
If not supplied, the depth is "1".
|
|
Example:
cts:search(//function,
cts:directory-query(("/reports/","/analysis/"),"1"))
=> .. a sequence of 'function' elements in any document
in the directory "/reports/" or the directory "/analysis/".
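For comparison, a sketch of the same kind of search with a depth of "infinity", which also covers documents in subdirectories. The subdirectory URI mentioned in the comment is hypothetical.

```xquery
xquery version "1.0-ml";

(: With "infinity", documents in nested directories such as the
   hypothetical "/reports/2009/q1/" are matched as well as the
   immediate children of "/reports/". :)
cts:search(//function,
  cts:directory-query("/reports/", "infinity"))
```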
|
Example:
cts:search(//function, cts:and-query((cts:word-query("repair"),
cts:directory-query(("/reports/", "/analysis/"), "1"))))
=> .. relevance ordered sequence of 'function' elements in
any document that both contains the word "repair" and is
in either the directory "/reports/" or in the directory
"/analysis/".
|

cts:element-attribute-pair-geospatial-query(
  $element-name as xs:QName*,
  $latitude-attribute-names as xs:QName*,
  $longitude-attribute-names as xs:QName*,
  $regions as cts:region*,
  [$options as xs:string*],
  [$weight as xs:double?]
) as cts:element-attribute-pair-geospatial-query
Summary:
Returns a cts:query matching elements by name that have
specific attributes representing the latitude and longitude values of
a point contained within the given geographic box, circle, or polygon,
or equal to the given point. Points that lie
between the southern boundary and the northern boundary of a box,
travelling northwards,
and between the western boundary and the eastern boundary of the box,
travelling eastwards, will match.
Points contained within the given radius of the center point of a circle will
match, using the curved distance on the surface of the Earth.
Points contained within the given polygon will match, using great circle arcs
over a spherical model of the Earth as edges. An error may result
if the polygon is malformed in some way.
Points equal to a given point will match, taking into account the fact
that longitudes converge at the poles.
Using the geospatial query constructors requires a valid geospatial
license key; without a valid license key, searches that include
geospatial queries will throw an exception.
|
Parameters:
$element-name
:
One or more parent element QNames to match.
When multiple QNames are specified,
the query matches if any QName matches.
|
$latitude-attribute-names
:
One or more latitude attribute QNames to match.
When multiple QNames are specified, the query matches
if any QName matches; however, only the first matching latitude
attribute in any point instance will be checked.
|
$longitude-attribute-names
:
One or more longitude attribute QNames to match.
When multiple QNames are specified, the query matches
if any QName matches; however, only the first matching longitude
attribute in any point instance will be checked.
|
$regions
:
One or more geographic boxes, circles, polygons, or points. Where
multiple regions
are specified, the query matches if any region matches.
|
$options
(optional):
Options to this query. The default is ().
Options include:
- "coordinate-system=wgs84"
- Use the WGS84 coordinate system.
- "units=miles"
- Distance (for circles) is measured in miles.
- "boundaries-included"
- Points on boxes', circles', and polygons' boundaries are counted as matching. This is the default.
- "boundaries-excluded"
- Points on boxes', circles', and polygons' boundaries are not counted as matching.
- "boundaries-latitude-excluded"
- Points on boxes' latitude boundaries are not counted as matching.
- "boundaries-longitude-excluded"
- Points on boxes' longitude boundaries are not counted as matching.
- "boundaries-south-excluded"
- Points on the boxes' southern boundaries are not counted as matching.
- "boundaries-west-excluded"
- Points on the boxes' western boundaries are not counted as matching.
- "boundaries-north-excluded"
- Points on the boxes' northern boundaries are not counted as matching.
- "boundaries-east-excluded"
- Points on the boxes' eastern boundaries are not counted as matching.
- "boundaries-circle-excluded"
- Points on circles' boundaries are not counted as matching.
- "cached"
- Cache the results of this query in the list cache.
- "uncached"
- Do not cache the results of this query in the list cache.
- "synonym"
- Specifies that all of the terms in the $text parameter are
considered synonyms for scoring purposes. The result is that
occurrences of more than one of the synonyms are scored as if
there were more occurrences of the same term (as opposed to
having a separate term that contributes to score).
|
$weight
(optional):
A weight for this query. The default is 1.0.
This option is currently
ignored; geospatial queries do not contribute to the score.
|
|
Usage Notes:
The point value is expressed as the numerical values in the
textual content of the named attributes.
The point values and the boundary specifications are given in degrees
relative to the WGS84 coordinate system. Southern latitudes and Western
longitudes take negative values. Longitudes will be wrapped to the range
(-180,+180) and latitudes will be clipped to the range (-90,+90).
If the northern boundary of a box is south of the southern boundary,
no points will
match. However, longitudes wrap around the globe, so that if the western
boundary is east of the eastern boundary (that is, if the value of 'w' is
greater than the value of 'e'), then the box crosses the anti-meridian.
Special handling occurs at the poles, as all longitudes exist at latitudes
+90 and -90.
If neither "cached" nor "uncached" is present, the default is "cached".
|
Example:
(: create a document with test data :)
xdmp:document-insert("/points.xml",
<root>
<item><point lat="10.5" long="30.0"/></item>
<item><point lat="15.35" long="35.34"/></item>
<item><point lat="5.11" long="40.55"/></item>
</root> );
cts:search(doc("/points.xml")//item,
cts:element-attribute-pair-geospatial-query(xs:QName("point"),
xs:QName("lat"), xs:QName("long"), cts:box(10.0, 35.0, 20.0, 40.0)))
(:
returns the following node:
<item><point lat="15.35" long="35.34"/></item>
:)
;
cts:search(doc("/points.xml")//item,
cts:element-attribute-pair-geospatial-query(xs:QName("point"),
xs:QName("lat"), xs:QName("long"), cts:box(10.0, 40.0, 20.0, 35.0)))
(:
returns the following node (wrapping around the Earth):
<item><point lat="10.5" long="30.0"/></item>
:)
;
cts:search(doc("/points.xml")//item,
cts:element-attribute-pair-geospatial-query(xs:QName("point"),
xs:QName("lat"), xs:QName("long"), cts:box(20.0, 35.0, 10.0, 40.0)))
(:
throws an error (latitudes do not wrap)
:)
;
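The same constructor also accepts circles. The sketch below assumes the /points.xml test document from the example above is already loaded; the 100-mile radius and the combination of options are arbitrary illustrative choices.

```xquery
xquery version "1.0-ml";

(: Circle of radius 100 miles centered on the first test point.
   The other two test points differ from the center by about five
   degrees of latitude (one degree is roughly 69 miles), so only
   the item at lat="10.5" long="30.0" lies inside the circle. :)
cts:search(fn:doc("/points.xml")//item,
  cts:element-attribute-pair-geospatial-query(
    xs:QName("point"),
    xs:QName("lat"),
    xs:QName("long"),
    cts:circle(100, cts:point(10.5, 30.0)),
    ("units=miles", "boundaries-excluded")))
```

Because "boundaries-excluded" only affects points lying exactly on the circle's edge, it does not change the result for this data; it is included to show how several options are passed as a sequence.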
|

cts:element-attribute-range-query(
  $element-name as xs:QName*,
  $attribute-name as xs:QName*,
  $operator as xs:string,
  $value as xs:anyAtomicType*,
  [$options as xs:string*],
  [$weight as xs:double?]
) as cts:element-attribute-range-query
Summary:
Returns a cts:query matching elements by name with attributes by name
whose range-index entry satisfies the given comparison ($operator)
against a given value. Searches with the
cts:element-attribute-range-query
constructor require an attribute range index on the specified QName(s);
if there is no range index configured, then an exception is thrown.
|
Parameters:
$element-name
:
One or more element QNames to match.
When multiple QNames are specified,
the query matches if any QName matches.
|
$attribute-name
:
One or more attribute QNames to match.
When multiple QNames are specified,
the query matches if any QName matches.
|
$operator
:
A comparison operator.
Operators include:
- "<"
- Match range index values less than $value.
- "<="
- Match range index values less than or equal to $value.
- ">"
- Match range index values greater than $value.
- ">="
- Match range index values greater than or equal to $value.
- "="
- Match range index values equal to $value.
- "!="
- Match range index values not equal to $value.
|
$value
:
Some values to match.
When multiple values are specified,
the query matches if any value matches.
|
$options
(optional):
Options to this query. The default is ().
Options include:
- "collation=URI"
- Use the range index with the collation specified by
URI. If not specified, then the default collation
from the query is used. If a range index with the specified
collation does not exist, an error is thrown.
- "cached"
- Cache the results of this query in the list cache.
- "uncached"
- Do not cache the results of this query in the list cache.
- "min-occurs=number"
- Specifies the minimum number of occurrences required. If
fewer than this number of occurrences are found, the fragment does not match.
The default is 1.
- "max-occurs=number"
- Specifies the maximum number of occurrences allowed. If
more than this number of occurrences are found, the fragment does not match.
The default is unbounded.
- "synonym"
- Specifies that all of the terms in the $text parameter are
considered synonyms for scoring purposes. The result is that
occurrences of more than one of the synonyms are scored as if
there were more occurrences of the same term (as opposed to
having a separate term that contributes to score).
|
$weight
(optional):
A weight for this query. The default is 1.0. In the current release,
this option is ignored; range queries do not contribute to the score.
|
|
Usage Notes:
If you want to constrain on a range of values, you can combine multiple
cts:element-attribute-range-query constructors together
with cts:and-query or other composable cts:query
constructors.
If neither "cached" nor "uncached" is present, the default is "cached".
Negative "min-occurs" or "max-occurs" values will be treated as 0 and
non-integral values will be rounded down. An error will be raised if
the "min-occurs" value is greater than the "max-occurs" value.
|
Example:
(: create a document with test data :)
xdmp:document-insert("/attributes.xml",
<root>
<entry sku="100">
<product>apple</product>
</entry>
<entry sku="200">
<product>orange</product>
</entry>
<entry sku="1000">
<product>electric car</product>
</entry>
</root>) ;
(:
requires an attribute (range) index of
type xs:int on the "sku" attribute of
the "entry" element
:)
cts:search(doc("/attributes.xml")/root/entry,
cts:element-attribute-range-query(
xs:QName("entry"), xs:QName("sku"), ">=",
500))
(:
returns the following node:
<entry sku="1000">
<product>electric car</product>
</entry>
:)
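The usage notes mention combining several range queries to constrain on a range of values. A minimal sketch of that idea follows, assuming the /attributes.xml test document and the xs:int attribute range index on entry/@sku from the example above. The bounds 100 and 500 are illustrative.

```xquery
xquery version "1.0-ml";

(: Half-open range 100 <= sku < 500, built from two range queries
   joined by cts:and-query. With the test data above, this should
   select the entries with sku="100" and sku="200". :)
cts:search(fn:doc("/attributes.xml")/root/entry,
  cts:and-query((
    cts:element-attribute-range-query(
      xs:QName("entry"), xs:QName("sku"), ">=", 100),
    cts:element-attribute-range-query(
      xs:QName("entry"), xs:QName("sku"), "<", 500))))
```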
|

cts:element-attribute-value-query(
  $element-name as xs:QName*,
  $attribute-name as xs:QName*,
  $text as xs:string*,
  [$options as xs:string*],
  [$weight as xs:double?]
) as cts:element-attribute-value-query
Summary:
Returns a query matching elements by name with attributes by name
with text content equal to a given phrase.
|
Parameters:
$element-name
:
One or more element QNames to match.
When multiple QNames are specified,
the query matches if any QName matches.
|
$attribute-name
:
One or more attribute QNames to match.
When multiple QNames are specified,
the query matches if any QName matches.
|
$text
:
One or more attribute values to match.
When multiple strings are specified,
the query matches if any string matches.
|
$options
(optional):
Options to this query. The default is ().
Options include:
- "case-sensitive"
- A case-sensitive query.
- "case-insensitive"
- A case-insensitive query.
- "diacritic-sensitive"
- A diacritic-sensitive query.
- "diacritic-insensitive"
- A diacritic-insensitive query.
- "punctuation-sensitive"
- A punctuation-sensitive query.
- "punctuation-insensitive"
- A punctuation-insensitive query.
- "whitespace-sensitive"
- A whitespace-sensitive query.
- "whitespace-insensitive"
- A whitespace-insensitive query.
- "stemmed"
- A stemmed query.
- "unstemmed"
- An unstemmed query.
- "wildcarded"
- A wildcarded query.
- "unwildcarded"
- An unwildcarded query.
- "exact"
- An exact match query. Shorthand for "case-sensitive",
"diacritic-sensitive", "punctuation-sensitive",
"whitespace-sensitive", "unstemmed", and "unwildcarded".
- "lang=iso639code"
- Specifies the language of the query. The iso639code
code portion is case-insensitive, and uses the languages
specified by
ISO 639.
The default is specified in the database configuration.
- "min-occurs=number"
- Specifies the minimum number of occurrences required. If
fewer than this number of words occur, the fragment does not match.
The default is 1.
- "max-occurs=number"
- Specifies the maximum number of occurrences allowed. If
more than this number of words occur, the fragment does not match.
The default is unbounded.
- "synonym"
- Specifies that all of the terms in the $text parameter are
considered synonyms for scoring purposes. The result is that
occurrences of more than one of the synonyms are scored as if
there were more occurrences of the same term (as opposed to
having a separate term that contributes to score).
|
$weight
(optional):
A weight for this query.
Higher weights move search results up in the relevance
order. The default is 1.0. The
weight should be between -16 and 16; weights greater than 16 will
have the same effect as a weight of 16.
Weights with an absolute value less than 0.0625 (between -0.0625 and
0.0625) are rounded to 0, which means that they do not contribute to the
score.
|
|
Usage Notes:
If neither "case-sensitive" nor "case-insensitive"
is present, $text is used to determine case sensitivity.
If $text contains no uppercase, it specifies "case-insensitive".
If $text contains uppercase, it specifies "case-sensitive".
If neither "diacritic-sensitive" nor "diacritic-insensitive"
is present, $text is used to determine diacritic sensitivity.
If $text contains no diacritics, it specifies "diacritic-insensitive".
If $text contains diacritics, it specifies "diacritic-sensitive".
If neither "punctuation-sensitive" nor "punctuation-insensitive"
is present, $text is used to determine punctuation sensitivity.
If $text contains no punctuation, it specifies "punctuation-insensitive".
If $text contains punctuation, it specifies "punctuation-sensitive".
If neither "whitespace-sensitive" nor "whitespace-insensitive"
is present, the query is "whitespace-insensitive".
If neither "stemmed" nor "unstemmed"
is present, the database configuration determines stemming.
If the database has "stemmed searches" enabled, it specifies "stemmed".
Otherwise it specifies "unstemmed".
If neither "wildcarded" nor "unwildcarded"
is present, the database configuration and $text determine wildcarding.
If the database has any wildcard indexes enabled ("three character
searches", "two character searches", "one character searches", or
"trailing wildcard searches") and if $text contains either of the
wildcard characters '?' or '*', it specifies "wildcarded".
Otherwise it specifies "unwildcarded".
If neither "stemmed" nor "unstemmed"
is present, then the database configuration determines if a query
is run as "stemmed" (stemmed searches enabled) or "unstemmed"
(word searches enabled and stemmed searches disabled).
If the query is a wildcard
query and is also a phrase query (contains two or more terms),
then any wildcard terms in the query will be "unstemmed".
Negative "min-occurs" or "max-occurs" values will be treated as 0 and
non-integral values will be rounded down. An error will be raised if
the "min-occurs" value is greater than the "max-occurs" value.
When multiple element and/or attribute QNames are specified,
then all possible element/attribute QName combinations are used
to select the matching values.
|
Example:
cts:search(//module,
cts:element-attribute-value-query(
xs:QName("function"),
xs:QName("type"),
"MarkLogic Corporation"))
=> .. relevance-ordered sequence of 'module' element
ancestors (or self) of 'function' elements that have
an attribute 'type' whose value equals 'MarkLogic
Corporation'.
|
Example:
cts:search(//module,
cts:and-query((
cts:element-attribute-value-query(
xs:QName("function"),
xs:QName("type"),
"MarkLogic Corporation",
"punctuation-insensitive", 0.5),
cts:element-word-query(
xs:QName("title"),
"faster"))))
=> .. relevance-ordered sequence of 'module' element
ancestors (or self) of both:
(a) 'function' elements with attribute 'type' whose
value equals the string 'MarkLogic Corporation',
ignoring embedded punctuation,
AND
(b) 'title' elements whose text content contains the
word 'faster', with the results from (a) given
weight 0.5, and the results from (b) given default
weight 1.0.
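The usage notes state that when several element and attribute QNames are given, every element/attribute combination is considered, and that multiple $text strings match if any one matches. The sketch below illustrates this together with the "exact" option. The "method" element and "kind" attribute are hypothetical names introduced for the example, as is the second text value.

```xquery
xquery version "1.0-ml";

(: Considers function/@type, function/@kind, method/@type and
   method/@kind, and matches if any of them has a value exactly
   equal to either of the two strings. "method" and "kind" are
   hypothetical names used only for illustration. :)
cts:search(//module,
  cts:element-attribute-value-query(
    (xs:QName("function"), xs:QName("method")),
    (xs:QName("type"), xs:QName("kind")),
    ("MarkLogic Corporation", "MarkLogic"),
    "exact"))
```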
|

cts:element-attribute-word-query(
  $element-name as xs:QName*,
  $attribute-name as xs:QName*,
  $text as xs:string*,
  [$options as xs:string*],
  [$weight as xs:double?]
) as cts:element-attribute-word-query
Summary:
Returns a query matching elements by name
with attributes by name
with text content containing a given phrase.
|
Parameters:
$element-name
:
One or more element QNames to match.
When multiple QNames are specified,
the query matches if any QName matches.
|
$attribute-name
:
One or more attribute QNames to match.
When multiple QNames are specified,
the query matches if any QName matches.
|
$text
:
Some words or phrases to match.
When multiple strings are specified,
the query matches if any string matches.
|
$options
(optional):
Options to this query. The default is ().
Options include:
- "case-sensitive"
- A case-sensitive query.
- "case-insensitive"
- A case-insensitive query.
- "diacritic-sensitive"
- A diacritic-sensitive query.
- "diacritic-insensitive"
- A diacritic-insensitive query.
- "punctuation-sensitive"
- A punctuation-sensitive query.
- "punctuation-insensitive"
- A punctuation-insensitive query.
- "whitespace-sensitive"
- A whitespace-sensitive query.
- "whitespace-insensitive"
- A whitespace-insensitive query.
- "stemmed"
- A stemmed query.
- "unstemmed"
- An unstemmed query.
- "wildcarded"
- A wildcarded query.
- "unwildcarded"
- An unwildcarded query.
- "exact"
- An exact match query. Shorthand for "case-sensitive",
"diacritic-sensitive", "punctuation-sensitive",
"whitespace-sensitive", "unstemmed", and "unwildcarded".
- "lang=iso639code"
- Specifies the language of the query. The iso639code
code portion is case-insensitive, and uses the languages
specified by
ISO 639.
The default is specified in the database configuration.
- "min-occurs=number"
- Specifies the minimum number of occurrences required. If
fewer than this number of words occur, the fragment does not match.
The default is 1.
- "max-occurs=number"
- Specifies the maximum number of occurrences allowed. If
more than this number of words occur, the fragment does not match.
The default is unbounded.
- "synonym"
- Specifies that all of the terms in the $text parameter are
considered synonyms for scoring purposes. The result is that
occurrences of more than one of the synonyms are scored as if
there were more occurrences of the same term (as opposed to
having a separate term that contributes to score).
|
$weight
(optional):
A weight for this query.
Higher weights move search results up in the relevance
order. The default is 1.0. The
weight should be between -16 and 16; weights greater than 16 will
have the same effect as a weight of 16.
Weights with an absolute value less than 0.0625 (between -0.0625 and
0.0625) are rounded to 0, which means that they do not contribute to the
score.
|
|
Usage Notes:
If neither "case-sensitive" nor "case-insensitive"
is present, $text is used to determine case sensitivity.
If $text contains no uppercase, it specifies "case-insensitive".
If $text contains uppercase, it specifies "case-sensitive".
If neither "diacritic-sensitive" nor "diacritic-insensitive"
is present, $text is used to determine diacritic sensitivity.
If $text contains no diacritics, it specifies "diacritic-insensitive".
If $text contains diacritics, it specifies "diacritic-sensitive".
If neither "punctuation-sensitive" nor "punctuation-insensitive"
is present, $text is used to determine punctuation sensitivity.
If $text contains no punctuation, it specifies "punctuation-insensitive".
If $text contains punctuation, it specifies "punctuation-sensitive".
If neither "whitespace-sensitive" nor "whitespace-insensitive"
is present, the query is "whitespace-insensitive".
If neither "stemmed" nor "unstemmed"
is present, the database configuration determines stemming.
If the database has "stemmed searches" enabled, it specifies "stemmed".
Otherwise it specifies "unstemmed".
If neither "wildcarded" nor "unwildcarded"
is present, the database configuration and $text determine wildcarding.
If the database has any wildcard indexes enabled ("three character
searches", "two character searches", "one character searches", or
"trailing wildcard searches") and if $text contains either of the
wildcard characters '?' or '*', it specifies "wildcarded".
Otherwise it specifies "unwildcarded".
If neither "stemmed" nor "unstemmed"
is present, then the database configuration determines if a query
is run as "stemmed" (stemmed searches enabled) or "unstemmed"
(word searches enabled and stemmed searches disabled).
If the query is a wildcard
query and is also a phrase query (contains two or more terms),
then any wildcard terms in the query will be "unstemmed".
Negative "min-occurs" or "max-occurs" values will be treated as 0 and
non-integral values will be rounded down. An error will be raised if
the "min-occurs" value is greater than the "max-occurs" value.
|
Example:
cts:search(//module,
cts:element-attribute-word-query(
xs:QName("function"),
xs:QName("type"),
"MarkLogic Corporation"))
=> .. relevance-ordered sequence of 'module' element
ancestors of 'function' elements that have a 'type'
attribute whose value contains the phrase
'MarkLogic Corporation'.
|
Example:
cts:search(//module,
cts:element-attribute-word-query(
xs:QName("function"),
xs:QName("type"),
"MarkLogic Corporation", "case-insensitive"))
=> .. relevance-ordered sequence of 'module' element
ancestors of 'function' elements that have a 'type'
attribute whose value contains the phrase
'MarkLogic Corporation', or any other case-shift,
like 'MARKLOGIC CorpoRation'.
|
Example:
cts:search(//module,
cts:and-query((
cts:element-attribute-word-query(
xs:QName("function"),
xs:QName("type"),
"MarkLogic Corporation",
"punctuation-insensitive", 0.5),
cts:element-word-query(
xs:QName("title"),
"faster"))))
=> .. relevance-ordered sequence of 'module' element
ancestors of both:
(a) 'function' elements with 'type' attribute whose value
contains the phrase 'MarkLogic Corporation',
ignoring embedded punctuation,
AND
(b) 'title' elements whose text content contains the
term 'faster',
with the results of the first query given weight 0.5,
as opposed to the default 1.0 for the second query.
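The wildcard defaults described in the usage notes depend on both the database configuration and the content of $text. The following sketch passes "wildcarded" explicitly so the intent does not rely on those defaults. The pattern "Mark*" is an illustrative value, and efficient resolution is assumed to need one of the wildcard indexes named in the usage notes.

```xquery
xquery version "1.0-ml";

(: Matches 'type' attribute values on 'function' elements that
   contain a word beginning with "mark" in any letter case.
   Options are passed as a sequence of strings. :)
cts:search(//module,
  cts:element-attribute-word-query(
    xs:QName("function"),
    xs:QName("type"),
    "Mark*",
    ("wildcarded", "case-insensitive")))
```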
|

cts:element-child-geospatial-query(
  $parent-element-name as xs:QName*,
  $child-element-names as xs:QName*,
  $regions as cts:region*,
  [$options as xs:string*],
  [$weight as xs:double?]
) as cts:element-child-geospatial-query
Summary:
Returns a cts:query matching elements by name that have
specific element children representing the latitude and longitude values of
a point contained within the given geographic box, circle, or polygon, or
equal to the given point. Points that lie
between the southern boundary and the northern boundary of a box,
travelling northwards,
and between the western boundary and the eastern boundary of the box,
travelling eastwards, will match.
Points contained within the given radius of the center point of a circle will
match, using the curved distance on the surface of the Earth.
Points contained within the given polygon will match, using great circle arcs
over a spherical model of the Earth as edges. An error may result
if the polygon is malformed in some way.
Points equal to a given point will match, taking into account the fact
that longitudes converge at the poles.
Using the geospatial query constructors requires a valid geospatial
license key; without a valid license key, searches that include
geospatial queries will throw an exception.
|
Parameters:
$parent-element-name
:
One or more parent element QNames to match.
When multiple QNames are specified,
the query matches if any QName matches.
|
$child-element-names
:
One or more child element QNames to match.
When multiple QNames are specified, the query matches
if any QName matches; however, only the first matching latitude
child in any point instance will be checked. The element must specify
both latitude and longitude coordinates.
|
$regions
:
One or more geographic boxes, circles, polygons, or points. Where multiple
regions are specified, the query matches if any region matches.
|
$options
(optional):
Options to this query. The default is ().
Options include:
- "coordinate-system=wgs84"
- Use the WGS84 coordinate system.
- "units=miles"
- Distance (for circles) is measured in miles.
- "boundaries-included"
- Points on boxes', circles', and polygons' boundaries are counted as matching. This is the default.
- "boundaries-excluded"
- Points on boxes', circles', and polygons' boundaries are not counted as matching.
- "boundaries-latitude-excluded"
- Points on boxes' latitude boundaries are not counted as matching.
- "boundaries-longitude-excluded"
- Points on boxes' longitude boundaries are not counted as matching.
- "boundaries-south-excluded"
- Points on the boxes' southern boundaries are not counted as matching.
- "boundaries-west-excluded"
- Points on the boxes' western boundaries are not counted as matching.
- "boundaries-north-excluded"
- Points on the boxes' northern boundaries are not counted as matching.
- "boundaries-east-excluded"
- Points on the boxes' eastern boundaries are not counted as matching.
- "boundaries-circle-excluded"
- Points on circles' boundaries are not counted as matching.
- "cached"
- Cache the results of this query in the list cache.
- "uncached"
- Do not cache the results of this query in the list cache.
- "type=long-lat-point"
- Specifies the format for the point in the data as longitude first,
latitude second.
- "type=point"
- Specifies the format for the point in the data as latitude first,
longitude second. This is the default format.
- "synonym"
- Specifies that all of the terms in the $text parameter are
considered synonyms for scoring purposes. The result is that
occurrences of more than one of the synonyms are scored as if
there were more occurrences of the same term (as opposed to
having a separate term that contributes to score).
|
$weight
(optional):
A weight for this query. The default is 1.0.
This option is currently
ignored; geospatial queries do not contribute to the score.
|
|
Usage Notes:
The point value is expressed in the content of the element as a pair
of numbers, separated by whitespace and punctuation (excluding decimal points
and sign characters).
Point values and boundary specifications of boxes are given in degrees
relative to the WGS84 coordinate system. Southern latitudes and Western
longitudes take negative values. Longitudes will be wrapped to the range
(-180,+180) and latitudes will be clipped to the range (-90,+90).
If the northern boundary of a box is south of the southern boundary, no
points will match. However, longitudes wrap around the globe, so that if
the western boundary is east of the eastern boundary,
then the box crosses the anti-meridian.
Special handling occurs at the poles, as all longitudes exist at latitudes
+90 and -90.
If neither "cached" nor "uncached" is present, the default is "cached".
|
Example:
(: create a document with test data :)
xdmp:document-insert("/points.xml",
<root>
<item><point><pos>10.5 30.0</pos></point></item>
<item><point><pos>15.35 35.34</pos></point></item>
<item><point><pos>5.11 40.55</pos></point></item>
</root> );
cts:search(doc("/points.xml")//item,
cts:element-child-geospatial-query(xs:QName("point"), xs:QName("pos"),
cts:box(10.0, 35.0, 20.0, 40.0)))
(:
returns the following node:
<item><point><pos>15.35 35.34</pos></point></item>
:)
;
cts:search(doc("/points.xml")//item,
cts:element-child-geospatial-query(xs:QName("point"), xs:QName("pos"),
cts:box(10.0, 40.0, 20.0, 35.0)))
(:
returns the following node (wrapping around the Earth):
<item><point><pos>10.5 30.0</pos></point></item>
:)
;
cts:search(doc("/points.xml")//item,
cts:element-child-geospatial-query(xs:QName("point"), xs:QName("pos"),
cts:box(20.0, 35.0, 10.0, 40.0)))
(:
throws an error (latitudes do not wrap)
:)
;
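The "type=long-lat-point" option applies when the data stores longitude before latitude. The sketch below uses a hypothetical document URI and a single test point that reverses the coordinate order of the second item in the example above.

```xquery
xquery version "1.0-ml";

(: Hypothetical test document whose point is stored as
   "longitude latitude" rather than the default order. :)
xdmp:document-insert("/points-long-lat.xml",
  <root>
    <item><point><pos>35.34 15.35</pos></point></item>
  </root>);

xquery version "1.0-ml";

(: With "type=long-lat-point" the content is read as longitude 35.34,
   latitude 15.35, which falls inside the box (south 10, west 35,
   north 20, east 40), so the item should be returned. :)
cts:search(fn:doc("/points-long-lat.xml")//item,
  cts:element-child-geospatial-query(
    xs:QName("point"),
    xs:QName("pos"),
    cts:box(10.0, 35.0, 20.0, 40.0),
    "type=long-lat-point"))
```

Without the option, the same content would be read as latitude 35.34, longitude 15.35, which lies outside the box.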
|

cts:element-geospatial-query(
  $element-name as xs:QName*,
  $regions as cts:region*,
  [$options as xs:string*],
  [$weight as xs:double?]
) as cts:element-geospatial-query
Summary:
Returns a cts:query matching elements by name whose content
represents a point contained within the given geographic box, circle, or
polygon, or equal to the given point. Points that lie
between the southern boundary and the northern boundary of a box,
travelling northwards,
and between the western boundary and the eastern boundary of the box,
travelling eastwards, will match.
Points contained within the given radius of the center point of a circle will
match, using the curved distance on the surface of the Earth.
Points contained within the given polygon will match, using great circle arcs
over a spherical model of the Earth as edges. An error may result
if the polygon is malformed in some way.
Points equal to a given point will match, taking into account the fact
that longitudes converge at the poles.
Using the geospatial query constructors requires a valid geospatial
license key; without a valid license key, searches that include
geospatial queries will throw an exception.
|
Parameters:
$element-name
:
One or more element QNames to match.
When multiple QNames are specified,
the query matches if any QName matches.
|
$regions
:
One or more geographic boxes, circles, polygons, or points. Where multiple
regions are specified, the query matches if any region matches.
|
$options
(optional):
Options to this query. The default is ().
Options include:
- "coordinate-system=wgs84"
- Use the WGS84 coordinate system.
- "units=miles"
- Distance (for circles) is measured in miles.
- "boundaries-included"
- Points on boxes', circles', and polygons' boundaries are counted as matching. This is the default.
- "boundaries-excluded"
- Points on boxes', circles', and polygons' boundaries are not counted as matching.
- "boundaries-latitude-excluded"
- Points on boxes' latitude boundaries are not counted as matching.
- "boundaries-longitude-excluded"
- Points on boxes' longitude boundaries are not counted as matching.
- "boundaries-south-excluded"
- Points on the boxes' southern boundaries are not counted as matching.
- "boundaries-west-excluded"
- Points on the boxes' western boundaries are not counted as matching.
- "boundaries-north-excluded"
- Points on the boxes' northern boundaries are not counted as matching.
- "boundaries-east-excluded"
- Points on the boxes' eastern boundaries are not counted as matching.
- "boundaries-circle-excluded"
- Points on circles' boundaries are not counted as matching.
- "cached"
- Cache the results of this query in the list cache.
- "uncached"
- Do not cache the results of this query in the list cache.
- "type=long-lat-point"
- Specifies the format for the point in the data as longitude first,
latitude second.
- "type=point"
- Specifies the format for the point in the data as latitude first,
longitude second. This is the default format.
|
$weight
(optional):
A weight for this query. The default is 1.0. This option is currently
ignored; geospatial queries do not contribute to the score.
|
|
Usage Notes:
The point value is expressed in the content of the element as a pair
of numbers, separated by whitespace and punctuation (excluding decimal points
and sign characters).
Point values and boundary specifications of
boxes are given in degrees
relative to the WGS84 coordinate system. Southern latitudes and Western
longitudes take negative values. Longitudes will be wrapped to the range
(-180,+180) and latitudes will be clipped to the range (-90,+90).
If the northern boundary of a box is south of the southern boundary, no
points will match. However, longitudes wrap around the globe, so that if
the western boundary is east of the eastern boundary,
then the box crosses the anti-meridian.
Special handling occurs at the poles, as all longitudes exist at latitudes
+90 and -90.
If neither "cached" nor "uncached" is present, it specifies "cached".
|
Example:
(: create a document with test data :)
xdmp:document-insert("/points.xml",
<root>
<item><point>10.5, 30.0</point></item>
<item><point>15.35, 35.34</point></item>
<item><point>5.11, 40.55</point></item>
</root> );
cts:search(doc("/points.xml")//item,
cts:element-geospatial-query(xs:QName("point"), cts:box(10.0, 35.0, 20.0, 40.0)))
(:
returns the following node:
<item><point>15.35, 35.34</point></item>
:)
;
cts:search(doc("/points.xml")//item,
cts:element-geospatial-query(xs:QName("point"), cts:box(10.0, 40.0, 20.0, 35.0)))
(:
returns the following node (wrapping around the Earth):
<item><point>10.5, 30.0</point></item>
:)
;
cts:search(doc("/points.xml")//item,
cts:element-geospatial-query(xs:QName("point"), cts:box(20.0, 35.0, 10.0, 40.0)))
(:
throws an error (latitudes do not wrap)
:)
;
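The region types and options listed above can be combined in the same call. The following is a minimal sketch rather than part of the original examples: it assumes the /points.xml document inserted above, and the 500-mile radius, the circle center, and the second box are arbitrary values chosen for illustration. No result is shown for the circle query because it depends on the distance computation.

```xquery
(: Sketch: match points within an assumed 500-mile radius of (15.0, 35.0),
   excluding points that lie exactly on the circle's boundary. :)
cts:search(doc("/points.xml")//item,
  cts:element-geospatial-query(
    xs:QName("point"),
    cts:circle(500.0, cts:point(15.0, 35.0)),
    ("units=miles", "boundaries-circle-excluded")))
;
(: Sketch: two regions; an item matches if its point falls in either box.
   By the matching rules above, the first box should cover 15.35, 35.34
   and the second (illustrative) box should cover 5.11, 40.55. :)
cts:search(doc("/points.xml")//item,
  cts:element-geospatial-query(
    xs:QName("point"),
    (cts:box(10.0, 35.0, 20.0, 40.0),
     cts:box(0.0, 40.0, 10.0, 45.0))))
```

Passing several regions in one query is equivalent in intent to combining single-region queries with cts:or-query, but keeps the constraint in a single constructor.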
|
|
|
|
cts:element-pair-geospatial-query(
|
|
$element-name as xs:QName*,
|
|
$latitude-element-names as xs:QName*,
|
|
$longitude-element-names as xs:QName*,
|
|
$regions as cts:region*,
|
|
[$options as xs:string*],
|
|
[$weight as xs:double?]
|
| ) as cts:element-pair-geospatial-query |
|
 |
Summary:
Returns a cts:query matching elements by name that have
specific element children representing latitude and longitude values for
a point contained within the given geographic box, circle, or polygon, or
equal to the given point.
Points that lie
between the southern boundary and the northern boundary of a box,
travelling northwards,
and between the western boundary and the eastern boundary of the box,
travelling eastwards, will match.
Points contained within the given radius of the center point of a circle will
match, using the curved distance on the surface of the Earth.
Points contained within the given polygon will match, using great circle arcs
over a spherical model of the Earth as edges. An error may result
if the polygon is malformed in some way.
Points equal to a given point will match, taking into account the fact
that longitudes converge at the poles.
Using the geospatial query constructors requires a valid geospatial
license key; without a valid license key, searches that include
geospatial queries will throw an exception.
|
Parameters:
$element-name
:
One or more parent element QNames to match.
When multiple QNames are specified,
the query matches if any QName matches.
|
$latitude-element-names
:
One or more latitude element QNames to match.
When multiple QNames are specified, the query matches
if any QName matches; however, only the first matching latitude
child in any point instance will be checked.
|
$longitude-element-names
:
One or more longitude element QNames to match.
When multiple QNames are specified, the query matches
if any QName matches; however, only the first matching longitude
child in any point instance will be checked.
|
$regions
:
One or more geographic boxes, circles, polygons, or points. Where multiple
regions are specified, the query matches if any region matches.
|
$options
(optional):
Options to this query. The default is ().
Options include:
- "coordinate-system=wgs84"
- Use the WGS84 coordinate system.
- "units=miles"
- Distance (for circles) is measured in miles.
- "boundaries-included"
- Points on boxes', circles', and polygons' boundaries are counted as matching. This is the default.
- "boundaries-excluded"
- Points on boxes', circles', and polygons' boundaries are not counted as matching.
- "boundaries-latitude-excluded"
- Points on boxes' latitude boundaries are not counted as matching.
- "boundaries-longitude-excluded"
- Points on boxes' longitude boundaries are not counted as matching.
- "boundaries-south-excluded"
- Points on the boxes' southern boundaries are not counted as matching.
- "boundaries-west-excluded"
- Points on the boxes' western boundaries are not counted as matching.
- "boundaries-north-excluded"
- Points on the boxes' northern boundaries are not counted as matching.
- "boundaries-east-excluded"
- Points on the boxes' eastern boundaries are not counted as matching.
- "boundaries-circle-excluded"
- Points on circles' boundaries are not counted as matching.
- "cached"
- Cache the results of this query in the list cache.
- "uncached"
- Do not cache the results of this query in the list cache.
- "synonym"
- Specifies that all of the terms in the $text parameter are
considered synonyms for scoring purposes. The result is that
occurrences of more than one of the synonyms are scored as if
there are more occurrences of the same term (as opposed to
having a separate term that contributes to score).
|
$weight
(optional):
A weight for this query. The default is 1.0.
This option is currently
ignored; geospatial queries do not contribute to the score.
|
|
Usage Notes:
The point value is expressed in the content of the element as a pair
of numbers, separated by whitespace and punctuation (excluding decimal points
and sign characters).
Point values and boundary specifications of boxes are given in degrees
relative to the WGS84 coordinate system. Southern latitudes and Western
longitudes take negative values. Longitudes will be wrapped to the range
(-180,+180) and latitudes will be clipped to the range (-90,+90).
If the northern boundary of a box is south of the southern boundary, no
points will match. However, longitudes wrap around the globe, so that if
the western boundary is east of the eastern boundary,
then the box crosses the anti-meridian.
Special handling occurs at the poles, as all longitudes exist at latitudes
+90 and -90.
If neither "cached" nor "uncached" is present, it specifies "cached".
|
Example:
(: create a document with test data :)
xdmp:document-insert("/points.xml",
<root>
<item><point><lat>10.5</lat><long>30.0</long></point></item>
<item><point><lat>15.35</lat><long>35.34</long></point></item>
<item><point><lat>5.11</lat><long>40.55</long></point></item>
</root> );
cts:search(doc("/points.xml")//item,
cts:element-pair-geospatial-query(xs:QName("point"),
xs:QName("lat"), xs:QName("long"), cts:box(10.0, 35.0, 20.0, 40.0)))
(:
returns the following node:
<item><point><lat>15.35</lat><long>35.34</long></point></item>
:)
;
cts:search(doc("/points.xml")//item,
cts:element-pair-geospatial-query(xs:QName("point"),
xs:QName("lat"), xs:QName("long"), cts:box(10.0, 40.0, 20.0, 35.0)))
(:
returns the following node (wrapping around the Earth):
<item><point><lat>10.5</lat><long>30.0</long></point></item>
:)
;
cts:search(doc("/points.xml")//item,
cts:element-pair-geospatial-query(xs:QName("point"),
xs:QName("lat"), xs:QName("long"), cts:box(20.0, 35.0, 10.0, 40.0)))
(:
throws an error (latitudes do not wrap)
:)
;
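Because the latitude and longitude parameters each accept a sequence of QNames, one query can cover documents that name their coordinate children differently. The sketch below assumes the /points.xml document inserted above; the element names "latitude" and "longitude" are hypothetical alternatives added only for illustration and do not occur in that test data.

```xquery
(: Sketch: accept either <lat>/<long> or the hypothetical
   <latitude>/<longitude> children of <point>, and exclude points that
   lie exactly on the box boundaries. :)
cts:search(doc("/points.xml")//item,
  cts:element-pair-geospatial-query(
    xs:QName("point"),
    (xs:QName("lat"), xs:QName("latitude")),
    (xs:QName("long"), xs:QName("longitude")),
    cts:box(10.0, 35.0, 20.0, 40.0),
    "boundaries-excluded"))
```

With the test data above, this should select the same item as the first example, since the point 15.35, 35.34 lies strictly inside the box.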
|
|
|
|
cts:element-query(
|
|
$element-name as xs:QName*,
|
|
$query as cts:query
|
| ) as cts:element-query |
|
 |
Summary:
Returns a cts:query matching elements by name
with the content constrained by the given cts:query in the
second parameter.
Searches for matches in the specified element and all of its descendants.
If the specified query in the second parameter has any
cts:element-attribute-*-query constructors, it will search
attributes directly on the specified element and attributes on any
descendant elements (see the second example below).
|
Parameters:
$element-name
:
One or more element QNames to match.
When multiple QNames are specified,
the query matches if any QName matches.
|
$query
:
A query for the element to match. If a string
is entered, the string is treated as a cts:word-query of the
specified string.
|
|
Usage Notes:
Enabling both the word position and element position indexes ("word position"
and "element word position" in the database configuration screen of the
Admin Interface) will speed up query performance for many queries that use
cts:element-query. The position indexes enable MarkLogic
Server to eliminate many false-positive results, which can reduce
disk I/O and processing, thereby speeding the performance of many queries.
The amount of benefit will vary depending on your data.
|
Example:
cts:search(//module,
cts:element-query(
xs:QName("function"),
"MarkLogic Corporation"))
=> .. relevance-ordered sequence of 'module' elements
ancestors (or self) of elements with QName 'function'
and text content containing the phrase 'MarkLogic
Corporation'.
|
Example:
let $x := <a attr="something">hello</a>
return
cts:contains($x, cts:element-query(xs:QName("a"),
cts:and-query((
cts:element-attribute-word-query(xs:QName("a"),
xs:QName("attr"), "something"),
cts:word-query("hello")))))
(: returns true :)
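The descendant behavior described in the summary can be checked with an in-memory node. This is a small sketch using made-up element names "a" and "b"; going by the summary and the $query parameter description, both checks would be expected to return true.

```xquery
let $x := <a><b>hello</b></a>
return (
  (: a plain string is treated as a cts:word-query; the word is found
     in a descendant of <a>, not in <a> itself :)
  cts:contains($x, cts:element-query(xs:QName("a"), "hello")),
  (: nesting constructors narrows the match to <b> elements inside <a> :)
  cts:contains($x, cts:element-query(xs:QName("a"),
    cts:element-query(xs:QName("b"), cts:word-query("hello"))))
)
```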
|
|
|
|
cts:element-range-query(
|
|
$element-name as xs:QName*,
|
|
$operator as xs:string,
|
|
$value as xs:anyAtomicType*,
|
|
[$options as xs:string*],
|
|
[$weight as xs:double?]
|
| ) as cts:element-range-query |
|
 |
Summary:
Returns a cts:query matching elements by name with a
range-index entry that compares to a given value according to the
specified operator. Searches with the
cts:element-range-query
constructor require an element range index on the specified QName(s);
if there is no range index configured, then an exception is thrown.
|
Parameters:
$element-name
:
One or more element QNames to match.
When multiple QNames are specified,
the query matches if any QName matches.
|
$operator
:
A comparison operator.
Operators include:
- "<"
- Match range index values less than $value.
- "<="
- Match range index values less than or equal to $value.
- ">"
- Match range index values greater than $value.
- ">="
- Match range index values greater than or equal to $value.
- "="
- Match range index values equal to $value.
- "!="
- Match range index values not equal to $value.
|
$value
:
One or more element values to match.
When multiple values are specified,
the query matches if any value matches.
|
$options
(optional):
Options to this query. The default is ().
Options include:
- "collation=URI"
- Use the range index with the collation specified by
URI. If not specified, then the default collation
from the query is used. If a range index with the specified
collation does not exist, an error is thrown.
- "cached"
- Cache the results of this query in the list cache.
- "uncached"
- Do not cache the results of this query in the list cache.
- "min-occurs=number"
- Specifies the minimum number of occurrences required. If
fewer than this number of words occur, the fragment does not match.
The default is 1.
- "max-occurs=number"
- Specifies the maximum number of occurrences allowed. If
more than this number of words occur, the fragment does not match.
The default is unbounded.
- "synonym"
- Specifies that all of the terms in the $text parameter are
considered synonyms for scoring purposes. The result is that
occurrences of more than one of the synonyms are scored as if
there are more occurrences of the same term (as opposed to
having a separate term that contributes to score).
|
$weight
(optional):
A weight for this query. The default is 1.0. In the current release,
this option is ignored; range queries do not contribute to the score.
|
|
Usage Notes:
If you want to constrain on a range of values, you can combine multiple
cts:element-range-query constructors together
with cts:and-query or any of the other composable
cts:query constructors, as in the last part of the example
below.
If neither "cached" nor "uncached" is present, it specifies "cached".
Negative "min-occurs" or "max-occurs" values will be treated as 0 and
non-integral values will be rounded down. An error will be raised if
the "min-occurs" value is greater than the "max-occurs" value.
|
Example:
(: create a document with test data :)
xdmp:document-insert("/dates.xml",
<root>
<entry>
<date>2007-01-01</date>
<info>Some information.</info>
</entry>
<entry>
<date>2006-06-23</date>
<info>Some other information.</info>
</entry>
<entry>
<date>1971-12-23</date>
<info>Some different information.</info>
</entry>
</root>);
(:
requires an element (range) index of
type xs:date on "date"
:)
cts:search(doc("/dates.xml")/root/entry,
cts:element-range-query(xs:QName("date"), "<=",
xs:date("2000-01-01")))
(:
returns the following node:
<entry>
<date>1971-12-23</date>
<info>Some different information.</info>
</entry>
:)
;
(:
requires an element (range) index of
type xs:date on "date"
:)
cts:search(doc("/dates.xml")/root/entry,
cts:and-query((
cts:element-range-query(xs:QName("date"), ">",
xs:date("2006-01-01")),
cts:element-range-query(xs:QName("date"), "<",
xs:date("2008-01-01")))))
(:
returns the following 2 nodes:
<entry>
<date>2007-01-01</date>
<info>Some information.</info>
</entry>
<entry>
<date>2006-06-23</date>
<info>Some other information.</info>
</entry>
:)
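The $value parameter also accepts a sequence, in which case the query matches if any value matches. The following sketch reuses the /dates.xml document and the xs:date range index assumed by the examples above; the two dates are simply picked from that test data.

```xquery
(:
  requires an element (range) index of
  type xs:date on "date"
:)
(: Sketch: "=" with two values matches entries whose date equals either one.
   With the test data above, this should select the 2007-01-01 and
   1971-12-23 entries. :)
cts:search(doc("/dates.xml")/root/entry,
  cts:element-range-query(xs:QName("date"), "=",
    (xs:date("2007-01-01"), xs:date("1971-12-23"))))
```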
|
|
|
|
cts:element-value-query(
|
|
$element-name as xs:QName*,
|
|
$text as xs:string*,
|
|
[$options as xs:string*],
|
|
[$weight as xs:double?]
|
| ) as cts:element-value-query |
|
 |
Summary:
Returns a query matching elements by name with text content equal to a
given phrase. cts:element-value-query only matches against
simple elements (that is, elements that contain only text and have no element
children).
|
Parameters:
$element-name
:
One or more element QNames to match.
When multiple QNames are specified,
the query matches if any QName matches.
|
$text
:
One or more element values to match.
When multiple strings are specified,
the query matches if any string matches.
|
$options
(optional):
Options to this query. The default is ().
Options include:
- "case-sensitive"
- A case-sensitive query.
- "case-insensitive"
- A case-insensitive query.
- "diacritic-sensitive"
- A diacritic-sensitive query.
- "diacritic-insensitive"
- A diacritic-insensitive query.
- "punctuation-sensitive"
- A punctuation-sensitive query.
- "punctuation-insensitive"
- A punctuation-insensitive query.
- "whitespace-sensitive"
- A whitespace-sensitive query.
- "whitespace-insensitive"
- A whitespace-insensitive query.
- "stemmed"
- A stemmed query.
- "unstemmed"
- An unstemmed query.
- "wildcarded"
- A wildcarded query.
- "unwildcarded"
- An unwildcarded query.
- "exact"
- An exact match query. Shorthand for "case-sensitive",
"diacritic-sensitive", "punctuation-sensitive",
"whitespace-sensitive", "unstemmed", and "unwildcarded".
- "lang=iso639code"
- Specifies the language of the query. The iso639code
code portion is case-insensitive, and uses the languages
specified by
ISO 639.
The default is specified in the database configuration.
- "min-occurs=number"
- Specifies the minimum number of occurrences required. If
fewer than this number of words occur, the fragment does not match.
The default is 1.
- "max-occurs=number"
- Specifies the maximum number of occurrences allowed. If
more than this number of words occur, the fragment does not match.
The default is unbounded.
- "synonym"
- Specifies that all of the terms in the $text parameter are
considered synonyms for scoring purposes. The result is that
occurrences of more than one of the synonyms are scored as if
there are more occurrences of the same term (as opposed to
having a separate term that contributes to score).
|
$weight
(optional):
A weight for this query.
Higher weights move search results up in the relevance
order. The default is 1.0. The
absolute value of the weight should be less than or equal to 16 (that is,
between -16 and 16); weights greater than 16 will have the same effect as a
weight of 16.
Weights with an absolute value less than 0.0625 (between -0.0625 and
0.0625) are rounded to 0, which means that they do not contribute to the
score.
|
|
Usage Notes:
If neither "case-sensitive" nor "case-insensitive"
is present, $text is used to determine case sensitivity.
If $text contains no uppercase, it specifies "case-insensitive".
If $text contains uppercase, it specifies "case-sensitive".
If neither "diacritic-sensitive" nor "diacritic-insensitive"
is present, $text is used to determine diacritic sensitivity.
If $text contains no diacritics, it specifies "diacritic-insensitive".
If $text contains diacritics, it specifies "diacritic-sensitive".
If neither "punctuation-sensitive" nor "punctuation-insensitive"
is present, $text is used to determine punctuation sensitivity.
If $text contains no punctuation, it specifies "punctuation-insensitive".
If $text contains punctuation, it specifies "punctuation-sensitive".
If neither "whitespace-sensitive" nor "whitespace-insensitive"
is present, the query is "whitespace-insensitive".
If neither "stemmed" nor "unstemmed"
is present, the database configuration determines stemming.
If the database has "stemmed searches" enabled, it specifies "stemmed".
Otherwise it specifies "unstemmed".
If neither "wildcarded" nor "unwildcarded"
is present, the database configuration and $text determine wildcarding.
If the database has any wildcard indexes enabled ("three character
searches", "two character searches", "one character searches", or
"trailing wildcard searches") and if $text contains either of the
wildcard characters '?' or '*', it specifies "wildcarded".
Otherwise it specifies "unwildcarded".
If neither "stemmed" nor "unstemmed"
is present, then the database configuration determines if a query
is run as "stemmed" (stemmed searches enabled) or "unstemmed"
(word searches enabled and stemmed searches disabled).
If the query is a wildcard
query and is also a phrase query (contains two or more terms),
then any wildcard terms in the query will be "unstemmed".
Negative "min-occurs" or "max-occurs" values will be treated as 0 and
non-integral values will be rounded down. An error will be raised if
the "min-occurs" value is greater than the "max-occurs" value.
Note that the text content for the value in a
cts:element-value-query is treated the same as a phrase in a
cts:word-query, where the phrase is the element value.
Therefore, any wildcard and/or stemming rules are treated like a phrase.
For example, if you have an element value of "hello friend" with wildcarding
enabled for a query, a cts:element-value-query for "he*" will
not match because the wildcard matches do not span word boundaries, but a
cts:element-value-query for "hello *" will match. A search
for "*" will match, because a "*" wildcard by itself is defined to match
the value. Similarly, stemming rules are applied to each term, so a
search for "hello friends" would match when stemming is enabled for the query
because "friends" matches "friend". For an example, see the
fourth example below.
Similarly, because a "*" wildcard by itself is defined to match
the value, the following query will match any element with the
QName my-element, regardless of the wildcard indexes enabled in
the database configuration:
cts:element-value-query(xs:QName("my-element"), "*", "wildcarded")
|
Example:
cts:search(//module,
cts:element-value-query(
xs:QName("function"),
"MarkLogic Corporation"))
=> .. relevance-ordered sequence of 'module' element
ancestors of 'function' elements whose text
content equals 'MarkLogic Corporation'.
|
Example:
cts:search(//module,
cts:element-value-query(
xs:QName("function"),
"MarkLogic Corporation", "case-insensitive"))
=> .. relevance-ordered sequence of 'module' element
ancestors of 'function' elements whose text
content equals 'MarkLogic Corporation', or any other
case-shift like 'MARKLOGIC CorpoRation'.
|
Example:
cts:search(//module,
cts:and-query((
cts:element-value-query(
xs:QName("function"),
"MarkLogic Corporation",
"punctuation-insensitive", 0.5),
cts:element-value-query(
xs:QName("title"),
"Word Query"))))
=> .. relevance-ordered sequence of 'module' elements
which are ancestors of both:
(a) 'function' elements with text content equal to
'MarkLogic Corporation', ignoring embedded
punctuation,
AND
(b) 'title' elements with text content equal to
'Word Query', with the results of the first sub-query
given weight 0.5, and the results of the second
sub-query given the default weight 1.0. As a result,
the title phrase 'Word Query' counts more heavily
towards the relevance score.
|
Example:
let $node := <my-node>hello friend</my-node>
return (
cts:contains($node, cts:element-value-query(xs:QName('my-node'),
"hello friends", "stemmed")),
cts:contains($node, cts:element-value-query(xs:QName('my-node'),
"he*", "wildcarded")),
cts:contains($node, cts:element-value-query(xs:QName('my-node'),
"hello f*", "wildcarded"))
)
=> true
false
true
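The default case-sensitivity rule from the usage notes can be illustrated the same way. This is a sketch with a made-up element value; under the rules stated above (lowercase-only $text runs case-insensitive, $text containing uppercase runs case-sensitive), the three checks would be expected to yield true, false, and true respectively.

```xquery
let $node := <my-node>Hello Friend</my-node>
return (
  (: $text has no uppercase, so the query runs as "case-insensitive" :)
  cts:contains($node, cts:element-value-query(xs:QName("my-node"),
    "hello friend")),
  (: $text contains uppercase, so the query runs as "case-sensitive"
     and "Hello friend" does not equal "Hello Friend" :)
  cts:contains($node, cts:element-value-query(xs:QName("my-node"),
    "Hello friend")),
  (: an explicit option overrides the default derived from $text :)
  cts:contains($node, cts:element-value-query(xs:QName("my-node"),
    "Hello friend", "case-insensitive"))
)
```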
|
|
|
|
cts:element-word-query(
|
|
$element-name as xs:QName*,
|
|
$text as xs:string*,
|
|
[$options as xs:string*],
|
|
[$weight as xs:double?]
|
| ) as cts:element-word-query |
|
 |
Summary:
Returns a query matching elements by name with text content containing
a given phrase. Searches only through immediate text node children of
the specified element as well as any text node children of child elements
defined in the Admin Interface as element-word-query-throughs
or phrase-throughs; does not search through any other children of
the specified element.
|
Parameters:
$element-name
:
One or more element QNames to match.
When multiple QNames are specified,
the query matches if any QName matches.
|
$text
:
Some words or phrases to match.
When multiple strings are specified,
the query matches if any string matches.
|
$options
(optional):
Options to this query. The default is ().
Options include:
- "case-sensitive"
- A case-sensitive query.
- "case-insensitive"
- A case-insensitive query.
- "diacritic-sensitive"
- A diacritic-sensitive query.
- "diacritic-insensitive"
- A diacritic-insensitive query.
- "punctuation-sensitive"
- A punctuation-sensitive query.
- "punctuation-insensitive"
- A punctuation-insensitive query.
- "whitespace-sensitive"
- A whitespace-sensitive query.
- "whitespace-insensitive"
- A whitespace-insensitive query.
- "stemmed"
- A stemmed query.
- "unstemmed"
- An unstemmed query.
- "wildcarded"
- A wildcarded query.
- "unwildcarded"
- An unwildcarded query.
- "exact"
- An exact match query. Shorthand for "case-sensitive",
"diacritic-sensitive", "punctuation-sensitive",
"whitespace-sensitive", "unstemmed", and "unwildcarded".
- "lang=iso639code"
- Specifies the language of the query. The iso639code
code portion is case-insensitive, and uses the languages
specified by
ISO 639.
The default is specified in the database configuration.
- "distance-weight=number"
- A weight applied based on the minimum distance between matches
of this query. Higher weights add to the importance of
proximity (as opposed to term matches) when the relevance order is
calculated.
The default value is 0.0 (no impact of proximity). The
absolute value of the weight should be less than or equal to 16
(that is, between -16 and 16); weights greater than 16 will have the
same effect as a weight of 16.
This parameter has no effect if the
word positions
index is not enabled. This parameter has no effect on searches that
use score-simple or score-random (because those scoring algorithms
do not consider term frequency, proximity is irrelevant).
- "min-occurs=number"
- Specifies the minimum number of occurrences required. If
fewer than this number of words occur, the fragment does not match.
The default is 1.
- "max-occurs=number"
- Specifies the maximum number of occurrences allowed. If
more than this number of words occur, the fragment does not match.
The default is unbounded.
- "synonym"
- Specifies that all of the terms in the $text parameter are
considered synonyms for scoring purposes. The result is that
occurrences of more than one of the synonyms are scored as if
there are more occurrences of the same term (as opposed to
having a separate term that contributes to score).
|
$weight
(optional):
A weight for this query.
Higher weights move search results up in the relevance
order. The default is 1.0. The
absolute value of the weight should be less than or equal to 16 (that is,
between -16 and 16); weights greater than 16 will have the same effect as a
weight of 16.
Weights with an absolute value less than 0.0625 (between -0.0625 and
0.0625) are rounded to 0, which means that they do not contribute to the
score.
|
|
Usage Notes:
If neither "case-sensitive" nor "case-insensitive"
is present, $text is used to determine case sensitivity.
If $text contains no uppercase, it specifies "case-insensitive".
If $text contains uppercase, it specifies "case-sensitive".
If neither "diacritic-sensitive" nor "diacritic-insensitive"
is present, $text is used to determine diacritic sensitivity.
If $text contains no diacritics, it specifies "diacritic-insensitive".
If $text contains diacritics, it specifies "diacritic-sensitive".
If neither "punctuation-sensitive" nor "punctuation-insensitive"
is present, $text is used to determine punctuation sensitivity.
If $text contains no punctuation, it specifies "punctuation-insensitive".
If $text contains punctuation, it specifies "punctuation-sensitive".
If neither "whitespace-sensitive" nor "whitespace-insensitive"
is present, the query is "whitespace-insensitive".
If neither "stemmed" nor "unstemmed"
is present, the database configuration determines stemming.
If the database has "stemmed searches" enabled, it specifies "stemmed".
Otherwise it specifies "unstemmed".
If neither "wildcarded" nor "unwildcarded"
is present, the database configuration and $text determine wildcarding.
If the database has any wildcard indexes enabled ("three character
searches", "two character searches", "one character searches", or
"trailing wildcard searches") and if $text contains either of the
wildcard characters '?' or '*', it specifies "wildcarded".
Otherwise it specifies "unwildcarded".
If neither "stemmed" nor "unstemmed"
is present, then the database configuration determines if a query
is run as "stemmed" (stemmed searches enabled) or "unstemmed"
(word searches enabled and stemmed searches disabled).
If the query is a wildcard
query and is also a phrase query (contains two or more terms),
then any wildcard terms in the query will be "unstemmed".
Negative "min-occurs" or "max-occurs" values will be treated as 0 and
non-integral values will be rounded down. An error will be raised if
the "min-occurs" value is greater than the "max-occurs" value.
Relevance adjustment for the "distance-weight" option depends on
the closest proximity of any two matches of the query. For example,
cts:element-word-query(xs:QName("p"),("dog","cat"),("distance-weight=10"))
will adjust relevance based on the distance between the closest pair of
matches of either "dog" or "cat" within an element named "p"
(the pair may consist only of matches of
"dog", only of matches of "cat", or a match of "dog" and a match of "cat").
|
Example:
cts:search(//module,
cts:element-word-query(
xs:QName("function"),
"MarkLogic Corporation"))
=> .. relevance-ordered sequence of 'module' elements
ancestors (or self) of elements with QName 'function'
and text content containing the phrase 'MarkLogic
Corporation'.
|
Example:
cts:search(//module,
cts:element-word-query(
xs:QName("function"),
"MarkLogic Corporation", "case-insensitive"))
=> .. relevance-ordered sequence of 'module' elements
ancestors (or self) of elements with QName 'function'
and text content containing the phrase 'MarkLogic
Corporation',
or any other case-shift, like 'MARKLOGIC CorpoRation'.
|
Example:
cts:search(//module,
cts:and-query((
cts:element-word-query(
xs:QName("function"),
"MarkLogic Corporation",
("case-insensitive", "punctuation-insensitive"), 0.5),
cts:element-word-query(
xs:QName("title"),
"faster"))))
=> .. relevance-ordered sequence of 'module' element
ancestors (or self) of both:
(a) 'function' elements with text content containing
the phrase 'MarkLogic Corporation', ignoring case and
embedded punctuation,
AND
(b) 'title' elements containing the word 'faster',
with the results of the first sub-query given
weight 0.5, and the results of the second sub-query
given the default weight 1.0. As a result, the title
term 'faster' counts more towards the relevance
score.
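The difference between searching immediate text node children (cts:element-word-query) and searching all descendants (cts:element-query) can be seen on an in-memory node. This sketch uses made-up markup and assumes that "b" is not configured as an element-word-query-through or phrase-through; under that assumption and the summary above, the three checks would be expected to yield true, false, and true respectively.

```xquery
let $x := <p>hello <b>world</b></p>
return (
  (: "hello" is in an immediate text node child of <p> :)
  cts:contains($x, cts:element-word-query(xs:QName("p"), "hello")),
  (: "world" is only inside the child element <b>, which this query
     does not search through unless <b> is configured as a "through" :)
  cts:contains($x, cts:element-word-query(xs:QName("p"), "world")),
  (: cts:element-query searches <p> and all of its descendants :)
  cts:contains($x, cts:element-query(xs:QName("p"), "world"))
)
```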
|
|
|
|
cts:field-range-query(
|
|
$field-name as xs:string*,
|
|
$operator as xs:string,
|
|
$value as xs:anyAtomicType*,
|
|
[$options as xs:string*],
|
|
[$weight as xs:double?]
|
| ) as cts:field-range-query |
|
 |
Summary:
Returns a cts:query matching fields by name with a
range-index entry that compares to a given value according to the
specified operator. Searches with the
cts:field-range-query
constructor require a field range index on the specified field name(s);
if there is no range index configured, then an exception is thrown.
|
Parameters:
$field-name
:
One or more field names to match. When multiple field names are specified,
the query matches if any field name matches.
|
$operator
:
A comparison operator.
Operators include:
- "<"
- Match range index values less than $value.
- "<="
- Match range index values less than or equal to $value.
- ">"
- Match range index values greater than $value.
- ">="
- Match range index values greater than or equal to $value.
- "="
- Match range index values equal to $value.
- "!="
- Match range index values not equal to $value.
|
$value
:
One or more field values to match.
When multiple values are specified,
the query matches if any value matches.
|
$options
(optional):
Options to this query. The default is ().
Options include:
- "collation=URI"
- Use the range index with the collation specified by
URI. If not specified, then the default collation
from the query is used. If a range index with the specified
collation does not exist, an error is thrown.
- "cached"
- Cache the results of this query in the list cache.
- "uncached"
- Do not cache the results of this query in the list cache.
- "min-occurs=number"
- Specifies the minimum number of occurrences required. If
fewer than this number of words occur, the fragment does not match.
The default is 1.
- "max-occurs=number"
- Specifies the maximum number of occurrences allowed. If
more than this number of words occur, the fragment does not match.
The default is unbounded.
- "synonym"
- Specifies that all of the terms in the $value parameter are
considered synonyms for scoring purposes. The result is that
occurrences of more than one of the synonyms are scored as if
there are more occurrences of the same term (as opposed to
having a separate term that contributes to score).
|
$weight
(optional):
A weight for this query. The default is 1.0. In the current release,
this parameter is ignored; range queries do not contribute to the score.
|
|
Usage Notes:
If you want to constrain on a range of values, you can combine multiple
cts:field-range-query constructors together
with cts:and-query or any of the other composable
cts:query constructors.
If neither "cached" nor "uncached" is present, it specifies "cached".
Negative "min-occurs" or "max-occurs" values will be treated as 0 and
non-integral values will be rounded down. An error will be raised if
the "min-occurs" value is greater than the "max-occurs" value.
|
Example:
(: Insert a few documents with test data :)
let $content1 := <name><fname>John</fname><mname>Rob</mname><lname>Goldings</lname></name>
let $content2 := <name><fname>Jim</fname><mname>Ken</mname><lname>Kurla</lname></name>
let $content3 := <name><fname>Ooi</fname><mname>Ben</mname><lname>Fu</lname></name>
let $content4 := <name><fname>James</fname><mname>Rick</mname><lname>Tod</lname></name>
return (
xdmp:document-insert("/aname1.xml",$content1),
xdmp:document-insert("/aname2.xml",$content2),
xdmp:document-insert("/aname3.xml",$content3),
xdmp:document-insert("/aname4.xml",$content4));
(:
requires a field (range) index of
type xs:string on field "aname"
:)
cts:search(doc(),cts:field-range-query("aname",">","Jim Kurla"))
(:
returns the following:
<?xml version="1.0" encoding="UTF-8"?>
<name><fname>John</fname><mname>Rob</mname><lname>Goldings</lname></name>
<?xml version="1.0" encoding="UTF-8"?>
<name><fname>Ooi</fname><mname>Ben</mname><lname>Fu</lname></name>
:)
;
(:
requires a field (range) index of
type xs:string on field "aname"
:)
cts:contains(doc(),cts:field-range-query("aname",">","Jim Kurla"))
(:
returns "true".
:)
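|
Example:
The usage note on constraining a range of values can be sketched as
follows. This is a minimal sketch, assuming the same xs:string field
range index on the field "aname" and the test documents inserted in the
previous example; the bounds "Jim Kurla" and "Ooi Fu" are chosen only
for illustration.
cts:search(doc(),
cts:and-query((
cts:field-range-query("aname", ">=", "Jim Kurla"),
cts:field-range-query("aname", "<", "Ooi Fu"))))
(:
matches documents that have an "aname" range index value
greater than or equal to "Jim Kurla" and an "aname" range
index value less than "Ooi Fu", compared using the collation
of the range index.
:)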
|
|
|
|
cts:field-value-query(
|
|
$field-name as xs:string*,
|
|
$text as xs:string*,
|
|
[$options as xs:string*],
|
|
[$weight as xs:double?]
|
| ) as cts:field-value-query |
|
 |
Summary:
Returns a query matching text content containing a given value in the
specified field. If the specified field does not exist,
cts:field-value-query throws an exception.
If the specified field does not have the index setting
field value searches enabled, either for the database or
for the specified field, then a cts:search with a
cts:field-value-query throws an exception. A field
is a named object that specifies elements to include and exclude
from a search, and can include score weights for any included elements.
You create fields at the database level using the Admin Interface. For
details on fields, see the chapter on "Fields Database Settings" in the
Administrator's Guide.
|
Parameters:
$field-name
:
One or more field names to search over. If multiple field names are
supplied, the match can be in any of the specified fields (or-query
semantics).
|
$text
:
The value to match. If multiple strings are specified,
the query matches if any of the values match (or-query
semantics).
|
$options
(optional):
Options to this query. The default is ().
Options include:
- "case-sensitive"
- A case-sensitive query.
- "case-insensitive"
- A case-insensitive query.
- "diacritic-sensitive"
- A diacritic-sensitive query.
- "diacritic-insensitive"
- A diacritic-insensitive query.
- "punctuation-sensitive"
- A punctuation-sensitive query.
- "punctuation-insensitive"
- A punctuation-insensitive query.
- "whitespace-sensitive"
- A whitespace-sensitive query.
- "whitespace-insensitive"
- A whitespace-insensitive query.
- "stemmed"
- A stemmed query.
- "unstemmed"
- An unstemmed query.
- "wildcarded"
- A wildcarded query.
- "unwildcarded"
- An unwildcarded query.
- "exact"
- An exact match query. Shorthand for "case-sensitive",
"diacritic-sensitive", "punctuation-sensitive",
"whitespace-sensitive", "unstemmed", and "unwildcarded".
- "lang=iso639code"
- Specifies the language of the query. The iso639code
code portion is case-insensitive, and uses the languages
specified by
ISO 639.
The default is specified in the database configuration.
- "distance-weight=number"
- A weight applied based on the minimum distance between matches
of this query. Higher weights add to the importance of
proximity (as opposed to term matches) when the relevance order is
calculated.
The default value is 0.0 (no impact of proximity). The
absolute value of the weight should be less than or equal to 16
(that is, between -16 and 16); weights greater than 16 will have the
same effect as a weight of 16.
This parameter has no effect if the
word positions
index is not enabled. This parameter has no effect on searches that
use score-simple or score-random (because those scoring algorithms
do not consider term frequency, proximity is irrelevant).
- "min-occurs=number"
- Specifies the minimum number of occurrences required. If
fewer than this number of words occur, the fragment does not match.
The default is 1.
- "max-occurs=number"
- Specifies the maximum number of occurrences allowed. If
more than this number of words occur, the fragment does not match.
The default is unbounded.
- "synonym"
- Specifies that all of the terms in the $text parameter are
considered synonyms for scoring purposes. The result is that
occurrences of more than one of the synonyms are scored as if
there are more occurrences of the same term (as opposed to
having a separate term that contributes to score).
|
$weight
(optional):
A weight for this query.
Higher weights move search results up in the relevance
order. The default is 1.0. The
absolute value of the weight should be less than or equal to 16 (that is,
between -16 and 16); weights greater than 16 will have the same effect as a
weight of 16.
Weights with an absolute value less than 0.0625 (between -0.0625 and
0.0625) are rounded to 0, which means that they do not contribute to the
score.
|
|
Usage Notes:
If you use cts:near-query with
cts:field-value-query, the distance supplied in the near query
applies to the whole document, not just to the field. For example, if
you specify a near query with a distance of 3, it will return matches
when the values are within 3 words in the whole document,
even if some of those words are not in the specified field. For a code
example illustrating this behavior with a field query, see the second
example under cts:field-word-query.
Values are determined based on the words (tokens) of the values of elements
that are included in the field. Field values span all the included elements.
They cannot include the content of excluded elements (this is because
MarkLogic Server breaks out of the field when it encounters the excluded
element and starts the field again when it encounters the next included
element). Field values will also span included sibling elements.
If neither "case-sensitive" nor "case-insensitive"
is present, $text is used to determine case sensitivity.
If $text contains no uppercase, it specifies "case-insensitive".
If $text contains uppercase, it specifies "case-sensitive".
If neither "diacritic-sensitive" nor "diacritic-insensitive"
is present, $text is used to determine diacritic sensitivity.
If $text contains no diacritics, it specifies "diacritic-insensitive".
If $text contains diacritics, it specifies "diacritic-sensitive".
If neither "punctuation-sensitive" nor "punctuation-insensitive"
is present, $text is used to determine punctuation sensitivity.
If $text contains no punctuation, it specifies "punctuation-insensitive".
If $text contains punctuation, it specifies "punctuation-sensitive".
If neither "whitespace-sensitive" nor "whitespace-insensitive"
is present, the query is "whitespace-insensitive".
If neither "stemmed" nor "unstemmed"
is present, the database configuration determines stemming.
If the database has "stemmed searches" enabled, it specifies "stemmed".
Otherwise it specifies "unstemmed".
If neither "wildcarded" nor "unwildcarded"
is present, the database configuration and $text determine wildcarding.
If the database has any wildcard indexes enabled ("three character
searches", "two character searches", "one character searches", or
"trailing wildcard searches") and if $text contains either of the
wildcard characters '?' or '*', it specifies "wildcarded".
Otherwise it specifies "unwildcarded".
If neither "stemmed" nor "unstemmed"
is present, then the database configuration determines if a query
is run as "stemmed" (stemmed searches enabled) or "unstemmed"
(word searches enabled and stemmed searches disabled).
If the query is a wildcard
query and is also a phrase query (contains two or more terms),
then any wildcard terms in the query will be "unstemmed".
Negative "min-occurs" or "max-occurs" values will be treated as 0 and
non-integral values will be rounded down. An error will be raised if
the "min-occurs" value is greater than the "max-occurs" value.
|
Example:
let $contents := <Employee><name><fname>Jaz</fname><mname>Roy</mname><lname>Smith</lname></name></Employee>
return
cts:contains($contents,cts:field-value-query("myField","Jaz Roy Smith"))
=> checks whether the field myField has a value matching "Jaz Roy Smith"
in node $contents. The field must exist in the database against which
this query is evaluated. "myField" in this case includes element "name" and excludes "mname".
This expression returns false, because the content of the excluded "mname"
element is not part of the field value.
|
Example:
let $contents := <Employee><name><fname>Jaz</fname><mname>Roy</mname><lname>Smith</lname></name></Employee>
return
cts:contains($contents,cts:field-value-query("myField","Jaz Smith"))
=> Returns true.
|
Example:
In this query, the search is fully resolved in the index.
cts:search(fn:doc("/Employee/jaz.xml"),cts:field-value-query("myField","Jaz Smith"),"unfiltered")
=> Returns the doc which has field "myField" and a match with the value of the field.
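|
Example:
A sketch of overriding the default case sensitivity described in the
usage notes, assuming the same hypothetical field "myField" as in the
examples above. Because "Jaz Smith" contains uppercase characters, the
query would otherwise default to "case-sensitive"; the explicit option
makes it case-insensitive instead.
let $contents := <Employee><name><fname>Jaz</fname><mname>Roy</mname><lname>Smith</lname></name></Employee>
return
cts:contains($contents,cts:field-value-query("myField","Jaz Smith","case-insensitive"))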
|
|
|
|
cts:field-word-query(
|
|
$field-name as xs:string*,
|
|
$text as xs:string*,
|
|
[$options as xs:string*],
|
|
[$weight as xs:double?]
|
| ) as cts:field-word-query |
|
 |
Summary:
Returns a query matching text content containing a given phrase in the
specified field. If the specified field does not exist,
cts:field-word-query throws an exception. A field
is a named object that specifies elements to include and exclude
from a search, and can include score weights for any included elements.
You create fields at the database level using the Admin Interface. For
details on fields, see the chapter on "Fields Database Settings" in the
Administrator's Guide.
|
Parameters:
$field-name
:
One or more field names to search over. If multiple field names are
supplied, the match can be in any of the specified fields (or-query
semantics).
|
$text
:
The word or phrase to match. If multiple strings are specified,
the query matches if any of the words or phrases match (or-query
semantics).
|
$options
(optional):
Options to this query. The default is ().
Options include:
- "case-sensitive"
- A case-sensitive query.
- "case-insensitive"
- A case-insensitive query.
- "diacritic-sensitive"
- A diacritic-sensitive query.
- "diacritic-insensitive"
- A diacritic-insensitive query.
- "punctuation-sensitive"
- A punctuation-sensitive query.
- "punctuation-insensitive"
- A punctuation-insensitive query.
- "whitespace-sensitive"
- A whitespace-sensitive query.
- "whitespace-insensitive"
- A whitespace-insensitive query.
- "stemmed"
- A stemmed query.
- "unstemmed"
- An unstemmed query.
- "wildcarded"
- A wildcarded query.
- "unwildcarded"
- An unwildcarded query.
- "exact"
- An exact match query. Shorthand for "case-sensitive",
"diacritic-sensitive", "punctuation-sensitive",
"whitespace-sensitive", "unstemmed", and "unwildcarded".
- "lang=iso639code"
- Specifies the language of the query. The iso639code
code portion is case-insensitive, and uses the languages
specified by
ISO 639.
The default is specified in the database configuration.
- "distance-weight=number"
- A weight applied based on the minimum distance between matches
of this query. Higher weights add to the importance of
proximity (as opposed to term matches) when the relevance order is
calculated.
The default value is 0.0 (no impact of proximity). The
absolute value of the weight should be less than or equal to 16
(that is, between -16 and 16); weights greater than 16 will have the
same effect as a weight of 16.
This parameter has no effect if the
word positions
index is not enabled. This parameter has no effect on searches that
use score-simple or score-random (because those scoring algorithms
do not consider term frequency, proximity is irrelevant).
- "min-occurs=number"
- Specifies the minimum number of occurrences required. If
fewer than this number of words occur, the fragment does not match.
The default is 1.
- "max-occurs=number"
- Specifies the maximum number of occurrences allowed. If
more than this number of words occur, the fragment does not match.
The default is unbounded.
- "synonym"
- Specifies that all of the terms in the $text parameter are
considered synonyms for scoring purposes. The result is that
occurrences of more than one of the synonyms are scored as if
there are more occurrences of the same term (as opposed to
having a separate term that contributes to score).
|
$weight
(optional):
A weight for this query.
Higher weights move search results up in the relevance
order. The default is 1.0. The
absolute value of the weight should be less than or equal to 16 (that is,
between -16 and 16); weights greater than 16 will have the same effect as a
weight of 16.
Weights with an absolute value less than 0.0625 (between -0.0625 and
0.0625) are rounded to 0, which means that they do not contribute to the
score.
|
|
Usage Notes:
If you use cts:near-query with
cts:field-word-query, the distance supplied in the near query
applies to the whole document, not just to the field. For example, if
you specify a near query with a distance of 3, it will return matches
when the words or phrases are within 3 words in the whole document,
even if some of those words are not in the specified field. For a code
example illustrating this, see the second example
below.
Phrases are determined based on words being next to each other
(word positions with a distance of 1) and words being in the same
instance of the field. Because field word positions
are determined based on the fragment, not on the field, field phrases
cannot span excluded elements (this is because MarkLogic Server breaks
out of the field when it encounters the excluded element and starts a new
field when it encounters the next included element). Similarly, field
phrases will not span included sibling elements. The
second code example below illustrates this.
Field phrases will automatically phrase-through all child elements of
an included element, until they encounter an explicitly excluded
element. The third example below illustrates this.
An example of when this automatic phrase-through behavior might be
convenient is if you create a field that includes only the element
ABSTRACT. Then all child elements of ABSTRACT
are included in the field, and phrases would span all of the child
elements (that is, phrases would "phrase-through" all the child elements).
Negative "min-occurs" or "max-occurs" values will be treated as 0 and
non-integral values will be rounded down. An error will be raised if
the "min-occurs" value is greater than the "max-occurs" value.
|
Example:
cts:search(fn:doc(), cts:field-word-query("myField", "my phrase"))
=> a list of documents that contain the phrase
"my phrase" in the field "myField". The field
must exist in the database against which this query
is evaluated.
|
Example:
(:
Assume the database has a field named
"buzz" with the element "buzz"
included and the element "baz" excluded.
:)
let $x :=
<hello>word1 word2 word3
<buzz>word4 word5</buzz>
<baz>word6 word7 word8</baz>
<buzz>word9 word10</buzz>
</hello>
return (
cts:contains($x, cts:near-query(
(cts:field-word-query("buzz", "word5"),
cts:field-word-query("buzz", "word9")), 3)),
cts:contains($x, cts:near-query(
(cts:field-word-query("buzz", "word5"),
cts:field-word-query("buzz", "word9")), 4)),
cts:contains($x,
cts:field-word-query("buzz", "word5 word9")))
(:
Returns the sequence ("false", "true", "false").
The first part does not match because
the distance between "word5" and "word9"
is 4. This is because the distance is
calculated based on the whole node (if the
document was in a database, based on the
fragment), not based on the field. The
second part specifies a distance of 4, and
therefore matches and returns true. The third
part does not match because the phrase is
based on the entire node, not on the field,
and there are words between "word5" and "word9"
in the node (even though not in the field).
:)
|
Example:
(:
Assume the database has a field named
"buzz" with the element "buzz"
included and the element "baz" excluded.
:)
let $x :=
<hello>
<buzz>word1 word2
<gads>word3 word4 word5</gads>
<zukes>word6 word7 word8</zukes>
word9 word10
</buzz>
</hello>
return (
cts:contains($x,
cts:field-word-query("buzz", "word2 word3")))
(:
Returns "true" because the children of
"buzz" are not excluded, and are therefore
automatically phrased through.
:)
|
|
|
|
cts:near-query(
|
|
$queries as cts:query*,
|
|
[$distance as xs:double?],
|
|
[$options as xs:string*],
|
|
[$distance-weight as xs:double?]
|
| ) as cts:near-query |
|
 |
Summary:
Returns a query matching all of the specified queries, where
the matches occur within the specified distance from each other.
|
Parameters:
$queries
:
A sequence of queries to match.
|
$distance
(optional):
A distance, in number of words, between any two matching queries.
The results match if two queries match and the distance between the
two matches is equal to or less than the specified distance. A
distance of 0 matches when the text is the exact same text or when
there is overlapping text (see the third example below). A negative
distance is treated as 0. The default value is 10.
|
$options
(optional):
Options to this query. The default value is ().
Options include:
- "ordered"
- Any near-query matches must occur in the order of
the specified sub-queries.
- "unordered"
- Any near-query matches will satisfy the query,
regardless of the order they were specified.
|
$distance-weight
(optional):
A weight attributed to the distance for this query. Higher
weights add to the importance of distance (as opposed to term matches)
when the relevance order is calculated. The default value is 1.0. The
absolute value of the weight should be less than or equal to 16 (that is,
between -16 and 16); weights greater than 16 will have the same effect as a
weight of 16.
Weights with an absolute value less than 0.0625 (between -0.0625 and
0.0625) are rounded to 0, which means that they do not contribute to the
score. This parameter has no effect if the word positions
index is not enabled.
|
|
Usage Notes:
If the options parameter contains neither "ordered" nor "unordered",
then the default is "unordered".
The word positions index will speed the performance of
queries that use cts:near-query. The element word
positions index will speed the performance of element-queries
that use cts:near-query.
If you use cts:near-query with a field, the distance
specified is the distance in the whole document, not the distance
in the field. For example, if the distance between two words is 20 in
the document, but the distance is 10 if you look at a view of the document
that only includes the elements in a field, a cts:near-query
must have a distance of 20 or more to match; a distance of 10 would not
match.
If you use cts:near-query with
cts:field-word-query, the distance supplied in the near query
applies to the whole document, not just to the field. For details, see
cts:field-word-query.
Expressions using the ordered option are more efficient
than those using the unordered option, especially if they
specify many queries to match.
|
Example:
The following query searches for paragraphs containing
both "MarkLogic" and "Server" within 3 words of each
other, given the following paragraphs in a database:
<p>MarkLogic Server is an enterprise-class
database specifically built for content.</p>
<p>MarkLogic is an excellent XML Content Server.</p>
cts:search(//p,
cts:near-query(
(cts:word-query("MarkLogic"),
cts:word-query("Server")),
3))
=>
<p>MarkLogic Server is an enterprise-class
database specifically built for content.</p>
|
Example:
let $x := <p>Now is the winter of our discontent</p>
return
cts:contains($x, cts:near-query(
("discontent", "winter"),
3, "ordered"))
=> false because "discontent" comes after "winter"
let $x := <p>Now is the winter of our discontent</p>
return
cts:contains($x, cts:near-query(
("discontent", "winter"),
3, "unordered"))
=> true because the query specifies "unordered",
and it is still a match even though
"discontent" comes after "winter"
|
Example:
let $x := <p>Now is the winter of our discontent</p>
return
cts:contains($x, cts:near-query(
("is the winter", "winter of"),
0))
=> true because the phrases overlap
let $x := <p>Now is the winter of our discontent</p>
return
cts:contains($x, cts:near-query(
("is the winter", "of our"),
0))
=> false because the phrases do not overlap
(they have 1 word distance, not 0)
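|
Example:
A sketch combining the optional parameters described above: the
"ordered" option together with an explicit $distance-weight. The
distance of 5 and the weight of 2.0 are arbitrary values chosen for
illustration.
cts:search(//p,
cts:near-query(
(cts:word-query("MarkLogic"),
cts:word-query("Server")),
5, "ordered", 2.0))
(:
matches 'p' elements in which "MarkLogic" is followed by
"Server" within 5 words. The distance weight only influences
the relevance order, and has no effect if the word positions
index is not enabled.
:)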
|
|
|
|
cts:not-query(
|
|
$query as cts:query
|
| ) as cts:not-query |
|
 |
Summary:
Returns a query specifying the matches not specified by its sub-query.
|
Parameters:
$query
:
A negative query, specifying the search results
to filter out.
|
|
Usage Notes:
The cts:not-query constructor is fragment-based, so
it returns true only if the specified query does not produce a match
anywhere in a fragment. Therefore, a search using
cts:not-query is only guaranteed to be accurate if the underlying
query that is being negated is accurate from its index resolution (that is,
if the unfiltered results of the $query parameter to
cts:not-query are accurate). The accuracy of the index
resolution depends on many factors, such as the query, whether you search
at a fragment root (that is, whether the first parameter of
cts:search specifies an XPath that resolves to a fragment root),
the index options enabled on the database, the search options,
and other factors.
In cases where the $query parameter has false-positive matches,
the negation of the query can miss matches (have false negative matches).
In these cases,
searches with cts:not-query can miss results, even if those
searches are filtered.
|
Example:
cts:search(//PLAY,
cts:not-query(
cts:word-query("summer")))
=> ... sequence of 'PLAY' elements not containing
any text node with the word 'summer'.
|
Example:
let $doc :=
<doc>
<p n="1">Dogs, cats, and pigs</p>
<p n="2">Trees, frogs, and cats</p>
<p n="3">Dogs, alligators, and wolves</p>
</doc>
return
$doc//p[cts:contains(., cts:not-query("cat"))]
(: Returns the third p element (the one without
a "cat" term). Note that the
cts:contains forces the constraint to happen
in the filtering stage of the query. :)
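|
Example:
A sketch of combining cts:not-query with a positive query inside
cts:and-query, assuming the same 'PLAY' documents as in the first
example; the search terms are chosen only for illustration. The same
accuracy caveats from the usage notes apply to the negated sub-query.
cts:search(//PLAY,
cts:and-query((
cts:word-query("summer"),
cts:not-query(
cts:word-query("glorious")))))
(:
matches 'PLAY' elements containing the word 'summer' in
fragments that do not contain the word 'glorious'.
:)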
|
|
|
|
cts:registered-query(
|
|
$ids as xs:unsignedLong*,
|
|
[$options as xs:string*],
|
|
[$weight as xs:double?]
|
| ) as cts:registered-query |
|
 |
Summary:
Returns a query matching fragments specified by previously registered
queries (see cts:register). If a
registered query with the specified ID(s) is not found, then
a cts:search operation with an invalid
cts:registered-query throws an XDMP-UNREGISTERED
exception.
|
Parameters:
$ids
:
Some registered query identifiers.
|
$options
(optional):
Options to this query. The default is ().
Options include:
- "filtered"
- A filtered query (the default). Filtered queries
eliminate any false-positive results and properly resolve
cases where there are multiple candidate matches within the same
fragment, thereby guaranteeing
that the results fully satisfy the original
cts:query
item that was registered. This option is not available in
the 4.0 release.
- "unfiltered"
- An unfiltered query. Unfiltered registered queries
select fragments from the indexes that are candidates to satisfy
the
cts:query.
Depending on the original cts:query, the
structure of the documents in the database, and the configuration
of the database,
unfiltered registered queries may result in false-positive results
or in incorrect matches when there are multiple candidate matches
within the same fragment.
To avoid these problems, you should only use unfiltered queries
on top-level XPath expressions (for example, document nodes,
collections, directories) or on fragment roots. Using unfiltered
queries on complex XPath expressions or on XPath expressions that
traverse below a fragment root can result in unexpected results.
This option is required in the 4.0 release.
|
$weight
(optional):
A weight for this query.
Higher weights move search results up in the relevance
order. The default is 1.0. The
absolute value of the weight should be less than or equal to 16 (that is,
between -16 and 16); weights greater than 16 will have the same effect as a
weight of 16.
Weights with an absolute value less than 0.0625 (between -0.0625 and
0.0625) are rounded to 0, which means that they do not contribute to the
score.
|
|
Usage Notes:
If the options parameter does not contain "unfiltered",
then an error is returned, as the "unfiltered" option is required.
Registered queries are persisted as a soft state only; they can
become unregistered through an explicit direction (using
cts:deregister),
as a result of the cache growing too large, or because of a server restart.
Consequently, either your XQuery code or your middleware layer should handle
the case when an XDMP-UNREGISTERED exception occurs (for example, you can
wrap your cts:registered-query code in a try/catch block
or your Java or .NET code can catch and handle the exception).
|
Example:
cts:search(//function,
cts:registered-query(1234567890123456,"unfiltered"))
=> .. relevance-ordered sequence of 'function' elements
in any document that also matches the registered query
|
Example:
(: wrap the registered query in a try/catch :)
try {
cts:search(fn:doc(),cts:registered-query(995175721241192518,"unfiltered"))
} catch ($e) {
if ($e/error:code = "XDMP-UNREGISTERED")
then ("Retry this query with the following registered query ID: ",
cts:register(cts:word-query("hello*world","wildcarded")))
else $e
}
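|
Example:
A sketch of the full round trip, assuming the registration and the
search run in the same request: cts:register (referenced in the
summary above) returns the ID, which is then passed to
cts:registered-query. The word "hello" is chosen only for illustration.
let $id := cts:register(cts:word-query("hello"))
return
cts:search(fn:doc(), cts:registered-query($id, "unfiltered"))
(:
Because registered queries are persisted as a soft state only,
code that stores $id for later requests should still handle
the XDMP-UNREGISTERED exception as shown in the previous example.
:)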
|
|
|
|
cts:similar-query(
|
|
$nodes as node()*,
|
|
[$weight as xs:double?],
|
|
[$options as element()?]
|
| ) as cts:similar-query |
|
 |
Summary:
Returns a query matching nodes similar to the model nodes. It uses an
algorithm which finds the most "relevant" terms in the model nodes
(that is, the terms with the highest scores), and then creates a
query equivalent to a cts:or-query of those terms. By default
16 terms are used.
|
Parameters:
$nodes
:
Some model nodes.
|
$weight
(optional):
A weight for this query.
Higher weights move search results up in the relevance
order. The default is 1.0. The
absolute value of the weight should be less than or equal to 16 (that is,
between -16 and 16); weights greater than 16 will have the same effect as a
weight of 16.
Weights with an absolute value less than 0.0625 (between -0.0625 and
0.0625) are rounded to 0, which means that they do not contribute to the
score.
|
$options
(optional):
An XML representation of the options for defining which terms to
generate and how to evaluate them.
The options node must be in the cts:distinctive-terms
namespace. The following is a sample options node:
<options xmlns="cts:distinctive-terms">
<max-terms>20</max-terms>
</options>
See the
cts:distinctive-terms
options for the valid options to use with this function.
Note that enabling index settings that
are disabled in the database configuration will not affect the results,
as similar documents will not be found on the basis of terms that do
not exist in the actual database index.
|
|
Usage Notes:
As the number of fragments in a database grows, the results
of cts:similar-query become increasingly accurate.
For best results, there should be at least 10,000 fragments for 32-bit
systems, and 1,000 fragments for 64-bit systems.
|
Example:
cts:search(//function,
cts:similar-query((//function)[1]))
=> .. relevance-ordered sequence of 'function' element
ancestors (or self) of any node similar to the first
'function' element.
|
Example:
xdmp:estimate(
cts:search(//function,
cts:similar-query((//function)[1], (),
<options xmlns="cts:distinctive-terms">
<max-terms>20</max-terms>
<use-db-config>true</use-db-config>
</options>)))
=> the number of fragments containing any node similar
to the first 'function' element.
|
|
|
|
cts:word-query(
|
|
$text as xs:string*,
|
|
[$options as xs:string*],
|
|
[$weight as xs:double?]
|
| ) as cts:word-query |
|
 |
Summary:
Returns a query matching text content containing a given phrase.
|
Parameters:
$text
:
Some words or phrases to match.
When multiple strings are specified,
the query matches if any string matches.
|
$options
(optional):
Options to this query. The default is ().
Options include:
- "case-sensitive"
- A case-sensitive query.
- "case-insensitive"
- A case-insensitive query.
- "diacritic-sensitive"
- A diacritic-sensitive query.
- "diacritic-insensitive"
- A diacritic-insensitive query.
- "punctuation-sensitive"
- A punctuation-sensitive query.
- "punctuation-insensitive"
- A punctuation-insensitive query.
- "whitespace-sensitive"
- A whitespace-sensitive query.
- "whitespace-insensitive"
- A whitespace-insensitive query.
- "stemmed"
- A stemmed query.
- "unstemmed"
- An unstemmed query.
- "wildcarded"
- A wildcarded query.
- "unwildcarded"
- An unwildcarded query.
- "exact"
- An exact match query. Shorthand for "case-sensitive",
"diacritic-sensitive", "punctuation-sensitive",
"whitespace-sensitive", "unstemmed", and "unwildcarded".
- "lang=iso639code"
- Specifies the language of the query. The iso639code
code portion is case-insensitive, and uses the languages
specified by
ISO 639.
The default is specified in the database configuration.
- "distance-weight=number"
- A weight applied based on the minimum distance between matches
of this query. Higher weights add to the importance of
proximity (as opposed to term matches) when the relevance order is
calculated.
The default value is 0.0 (no impact of proximity). The
absolute value of the weight should be less than or equal to 16
(that is, between -16 and 16); weights greater than 16 will have the
same effect as a weight of 16.
This parameter has no effect if the
word positions
index is not enabled. This parameter has no effect on searches that
use score-simple or score-random (because those scoring algorithms
do not consider term frequency, proximity is irrelevant).
- "min-occurs=number"
- Specifies the minimum number of occurrences required. If
fewer than this number of words occur, the fragment does not match.
The default is 1.
- "max-occurs=number"
- Specifies the maximum number of occurrences allowed. If
more than this number of words occur, the fragment does not match.
The default is unbounded.
- "synonym"
- Specifies that all of the terms in the $text parameter are
considered synonyms for scoring purposes. The result is that
occurrences of more than one of the synonyms are scored as if
there are more occurrences of the same term (as opposed to
having a separate term that contributes to score).
|
$weight
(optional):
A weight for this query.
Higher weights move search results up in the relevance
order. The default is 1.0. The
absolute value of the weight should be less than or equal to 16 (between
-16 and 16); weights greater than 16 will have the same effect as a
weight of 16.
Weights with an absolute value less than 0.0625 (between -0.0625 and
0.0625) are rounded to 0, which means that they do not contribute to the
score.
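The three parameters can be combined as in the following minimal sketch. It assumes a database of plays with SPEECH elements, as in the examples for this function; the search terms and weight values are chosen only for illustration.

```xquery
xquery version "1.0-ml";

(: Sketch only: assumes a database of plays with SPEECH elements. :)
cts:search(//SPEECH,
  cts:or-query((
    (: Several strings in $text: the query matches if any string matches.
       The weight of 2.0 is above the default of 1.0. :)
    cts:word-query(("summer", "winter"), "case-insensitive", 2.0),
    (: Empty options sequence (the default), with a weight below 1.0. :)
    cts:word-query("glorious", (), 0.5)
  )))
```

Both weights lie inside the range described above: their absolute values are at least 0.0625 and at most 16.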
|
|
Usage Notes:
If neither "case-sensitive" nor "case-insensitive"
is present, $text is used to determine case sensitivity.
If $text contains no uppercase, it specifies "case-insensitive".
If $text contains uppercase, it specifies "case-sensitive".
If neither "diacritic-sensitive" nor "diacritic-insensitive"
is present, $text is used to determine diacritic sensitivity.
If $text contains no diacritics, it specifies "diacritic-insensitive".
If $text contains diacritics, it specifies "diacritic-sensitive".
If neither "punctuation-sensitive" nor "punctuation-insensitive"
is present, $text is used to determine punctuation sensitivity.
If $text contains no punctuation, it specifies "punctuation-insensitive".
If $text contains punctuation, it specifies "punctuation-sensitive".
If neither "whitespace-sensitive" nor "whitespace-insensitive"
is present, the query is "whitespace-insensitive".
If neither "wildcarded" nor "unwildcarded"
is present, the database configuration and $text determine wildcarding.
If the database has any wildcard indexes enabled ("three character
searches", "two character searches", "one character searches", or
"trailing wildcard searches") and if $text contains either of the
wildcard characters '?' or '*', it specifies "wildcarded".
Otherwise it specifies "unwildcarded".
If neither "stemmed" nor "unstemmed"
is present, then the database configuration determines if a query
is run as "stemmed" (stemmed searches enabled) or "unstemmed"
(word searches enabled and stemmed searches disabled).
If the query is a wildcard
query and is also a phrase query (contains two or more terms),
then any wildcard terms in the query will be "unstemmed".
Negative "min-occurs" or "max-occurs" values will be treated as 0 and
non-integral values will be rounded down. An error will be raised if
the "min-occurs" value is greater than the "max-occurs" value.
Relevance adjustment for the "distance-weight" option depends on
the closest proximity of any two matches of the query. For example,
cts:word-query(("dog","cat"),("distance-weight=10"))
will adjust relevance based on the distance between the closest pair of
matches of either "dog" or "cat" (the pair may consist only of matches of
"dog", only of matches of "cat", or a match of "dog" and a match of "cat").
|
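The case-sensitivity defaults described in the usage notes can be explored with cts:contains against an in-memory node. This is a minimal sketch: the $node element and the search strings are made up for illustration and do not come from any particular database.

```xquery
xquery version "1.0-ml";

(: Hypothetical in-memory node used only for this sketch. :)
let $node := <p>The Quick brown fox</p>
return (
  (: "quick" has no uppercase, so the query is treated as "case-insensitive". :)
  cts:contains($node, cts:word-query("quick")),
  (: "QUICK" contains uppercase, so the query is treated as "case-sensitive". :)
  cts:contains($node, cts:word-query("QUICK")),
  (: An explicit option overrides the default derived from $text. :)
  cts:contains($node, cts:word-query("QUICK", "case-insensitive"))
)
```

The same pattern can be used to check the diacritic and punctuation defaults by varying the search string and the corresponding option.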
Example:
cts:search(//function,
cts:word-query("MarkLogic Corporation"))
=> .. relevance-ordered sequence of 'function' element
ancestors (or self) of any node containing the phrase
'MarkLogic Corporation'.
|
Example:
cts:search(//function,
cts:word-query("MarkLogic Corporation",
"case-insensitive"))
=> .. relevance-ordered sequence of 'function'
element ancestors (or self) of any node containing
the phrase 'MarkLogic Corporation' or any other
case-shift like 'marklogic corporation',
'MARKLOGIC Corporation', etc.
|
Example:
cts:search(//SPEECH,
cts:word-query("to be, or not to be",
"punctuation-insensitive"))
=> .. relevance-ordered sequence of 'SPEECH'
element ancestors (or self) of any node
containing the phrase 'to be, or not to be',
ignoring punctuation.
|
|
|