cts:and-not-query(
  $positive-query as cts:query,
  $negative-query as cts:query
) as cts:and-not-query

Summary:
Returns a query specifying the set difference of
the matches specified by two sub-queries.
|
Parameters:
$positive-query
:
A positive query, specifying the search results
filtered in.
|
$negative-query
:
A negative query, specifying the search results
to filter out.
|
|
Usage Notes:
The cts:and-not-query constructor is fragment-based, so
a fragment matches only if the negative query does not produce a match
anywhere in that fragment. Therefore, a search using
cts:and-not-query is only guaranteed to be accurate if the
underlying query that is being negated is accurate from its index
resolution (that is,
if the unfiltered results of the $negative-query parameter to
cts:and-not-query are accurate). The accuracy of the index
resolution depends on many factors, such as the query, whether you search
at a fragment root (that is, whether the first parameter of
cts:search specifies an XPath that resolves to a fragment root),
the index options enabled on the database, and the search options.
In cases where the $negative-query parameter has false
positive matches,
the negation of the query can miss matches (have false negative matches).
In these cases,
searches with cts:and-not-query can miss results, even if those
searches are filtered.
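One way to gauge whether the negated query resolves accurately from the indexes is to compare its unfiltered and filtered result counts. The following is a minimal sketch, assuming PLAY content like that used in the example for this function; the variable names are illustrative. If the two counts differ, the unfiltered resolution of the negative query includes false positives, and a search with cts:and-not-query may miss results.

```xquery
xquery version "1.0-ml";
(: Sketch only: assumes PLAY documents like those in the example. :)
let $negative := cts:word-query("glorious")
let $unfiltered := fn:count(cts:search(//PLAY, $negative, "unfiltered"))
let $filtered := fn:count(cts:search(//PLAY, $negative, "filtered"))
return
  (: Equal counts suggest the index resolution is accurate for this query. :)
  ($unfiltered, $filtered, $unfiltered eq $filtered)
```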
|
Example:
cts:search(//PLAY,
cts:and-not-query(
cts:word-query("summer"),
cts:word-query("glorious")))
=> .. sequence of 'PLAY' elements containing some
text node with the word 'summer' BUT NOT containing
any text node with the word 'glorious'. This sequence
may be (in fact is) non-empty, but certainly does not
contain the PLAY element with:
PLAY/TITLE =
"The Tragedy of King Richard the Second"
since this play contains both 'glorious' and 'summer'.

cts:and-query(
  $queries as cts:query*,
  [$options as xs:string*]
) as cts:and-query

Summary:
Returns a query specifying the intersection
of the matches specified by the sub-queries.
|
Parameters:
$queries
:
A sequence of sub-queries.
|
$options
(optional):
Options to this query. The default is ().
Options include:
- "ordered"
- An ordered and-query, which specifies that the sub-query matches
must occur in the order of the specified sub-queries. For example,
if the sub-queries are "cat" and "dog", an ordered
query will only match fragments where both "cat" and "dog" occur,
and where "cat" comes before "dog" in the fragment.
- "unordered"
- An unordered and-query, which specifies that the sub-query matches
can occur in any order.
|
|
Usage Notes:
If the options parameter contains neither "ordered" nor "unordered",
then the default is "unordered".
If you specify the empty sequence for the queries parameter
to cts:and-query, you will get a match for every document in
the database. For example, the following query always returns true:
cts:contains(collection(), cts:and-query(()))
In order to match a cts:and-query, the matches
from each of the specified sub-queries must all occur in the same
fragment.
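As a sketch of the "ordered" option described above, assuming PLAY content like that used in the example for this function, the following matches only fragments where "cat" occurs before "dog". The search terms are taken from the option description and are illustrative.

```xquery
xquery version "1.0-ml";
(: Sketch only: the second argument to cts:and-query carries the options. :)
cts:search(//PLAY,
  cts:and-query((
    cts:word-query("cat"),
    cts:word-query("dog")),
    "ordered"))
```

Dropping the "ordered" option, or passing "unordered", accepts the two matches in either order.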
|
Example:
cts:search(//PLAY,
cts:and-query((
cts:word-query("to be or"),
cts:word-query("or not to be"))))
=> .. a sequence of 'PLAY' elements which are
ancestors (or self) of some node whose text content
contains the phrase 'to be or' AND some node
whose text content contains the phrase 'or not to be'.
With high probability this intersection contains only
one 'PLAY' element, namely,
PLAY/TITLE =
"The Tragedy of Hamlet, Prince of Denmark".

cts:arc-intersection(
  $p1 as cts:point,
  $p2 as cts:point,
  $q1 as cts:point,
  $q2 as cts:point,
  [$options as xs:string*]
) as cts:point

Summary:
Returns the point at the intersection of two arcs. If the arcs do
not intersect, or lie on the same great circle, or if either arc covers
more than 180 degrees, an error is raised.
|
Parameters:
$p1
:
The starting point of the first arc.
|
$p2
:
The ending point of the first arc.
|
$q1
:
The starting point of the second arc.
|
$q2
:
The ending point of the second arc.
|
$options
(optional):
Options for the operation. The default is ().
Options include:
- "coordinate-system=wgs84"
- Use the WGS84 coordinate system.
|
|
Example:
let $sf := cts:point(37, -122)
let $ny := cts:point(40, -73)
let $a := cts:point(35,-100)
let $b := cts:point(41,-70)
return
cts:arc-intersection($sf,$ny,$a,$b)
=> 40.458347,-76.203682

cts:avg(
  $arg as xs:anyAtomicType*
) as xs:anyAtomicType?

Summary:
Returns a frequency-weighted average of a sequence.
This function works like fn:avg except each item in the
sequence is multiplied by cts:frequency before summing.
|
Parameters:
$arg
:
The sequence of values to be averaged. The values should be the result of
a lexicon lookup.
|
|
Usage Notes:
This function is designed to take a sequence of values returned
by a lexicon function (for example, cts:element-values); if you
input non-lexicon values, the result will always be 0.
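The weighting can also be written out by hand. The following is a minimal sketch, assuming the same int range index as the example for this function; the variable names are illustrative. It multiplies each lexicon value by its cts:frequency and divides by the total frequency, which is the calculation the summary describes.

```xquery
xquery version "1.0-ml";
(: Sketch only: assumes an int range index on the "int" element. :)
let $values := cts:element-values(xs:QName("int"), (),
                 ("type=int", "item-frequency"))
(: Each value contributes once per occurrence counted by the lexicon. :)
let $weighted-sum := fn:sum(for $v in $values return $v * cts:frequency($v))
let $total := fn:sum(for $v in $values return cts:frequency($v))
return
  if ($total eq 0) then () else $weighted-sum div $total
```

With the sample data from the example, the value x occurs x times, so the item-frequency calculation is (1 + 4 + ... + 100) div (1 + 2 + ... + 10) = 385 div 55 = 7, which agrees with the first result shown there.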
|
Example:
xquery version "1.0-ml";
(:
This query assumes an int range index
is configured in the database. It
generates some sample data and then
performs the aggregation in a separate
transaction.
:)
for $x in 1 to 10
return
xdmp:document-insert(fn:concat($x, ".xml"),
<my-element>{
for $y in 1 to $x
return <int>{$x}</int>
}</my-element>);
cts:avg(cts:element-values(xs:QName("int"), (),
("type=int", "item-frequency"))),
cts:avg(cts:element-values(xs:QName("int"), (),
("type=int", "fragment-frequency")))
=>
7
5.5

cts:bounding-boxes(
  $region as cts:region,
  [$options as xs:string*]
) as cts:box*

Summary:
Returns a sequence of boxes that bound the given region.
|
Parameters:
$region
:
A geographic region (box, circle, polygon, or point).
|
$options
(optional):
Options for the operation. The default is ().
Options include:
- "coordinate-system=wgs84"
- Use the WGS84 coordinate system.
- "units=miles"
- Distance (for circles) is measured in miles.
- "box-percent=n"
- An integer between 0 and 100 (default is 100) that indicates what percentage of a polygon's bounding box slivers should be returned. Lower numbers give fewer, less accurate boxes; larger numbers give more, more accurate boxes.
- "boundaries-included"
- Points on boxes', circles', and polygons' boundaries are counted as matching. This is the default.
- "boundaries-excluded"
- Points on boxes', circles', and polygons' boundaries are not counted as matching.
- "boundaries-latitude-excluded"
- Points on boxes' latitude boundaries are not counted as matching.
- "boundaries-longitude-excluded"
- Points on boxes' longitude boundaries are not counted as matching.
- "boundaries-south-excluded"
- Points on the boxes' southern boundaries are not counted as matching.
- "boundaries-west-excluded"
- Points on the boxes' western boundaries are not counted as matching.
- "boundaries-north-excluded"
- Points on the boxes' northern boundaries are not counted as matching.
- "boundaries-east-excluded"
- Points on the boxes' eastern boundaries are not counted as matching.
- "boundaries-circle-excluded"
- Points on circles' boundary are not counted as matching.
|
|
Example:
cts:bounding-boxes(
cts:polygon("0,0 20,20 -10,18 5,5 0,0")
)
(: Returns two boxes:
[-10, 0, 5, 18.976505]
[5, 4.7157488, 20, 20]
:)
;
cts:bounding-boxes(
cts:polygon("0,0 20,20 -10,18 5,5 0,0"),
"box-percent=50"
)
(: Returns one box:
[-10, 0, 20, 20]
:)

cts:box-intersects(
  $box as cts:box,
  $region as cts:region*,
  [$options as xs:string*]
) as xs:boolean

Summary:
Returns true if the box intersects with a region.
|
Parameters:
$box
:
A geographic box.
|
$region
:
One or more geographic regions (boxes, circles, polygons, or points).
Where multiple regions are specified, return true if any region intersects
the box.
|
$options
(optional):
Options for the operation. The default is ().
Options include:
- "coordinate-system=wgs84"
- Use the WGS84 coordinate system.
- "units=miles"
- Distance (for circles) is measured in miles.
- "boundaries-included"
- Points on boxes', circles', and polygons' boundaries are counted as matching. This is the default.
- "boundaries-excluded"
- Points on boxes', circles', and polygons' boundaries are not counted as matching.
- "boundaries-latitude-excluded"
- Points on boxes' latitude boundaries are not counted as matching.
- "boundaries-longitude-excluded"
- Points on boxes' longitude boundaries are not counted as matching.
- "boundaries-south-excluded"
- Points on the boxes' southern boundaries are not counted as matching.
- "boundaries-west-excluded"
- Points on the boxes' western boundaries are not counted as matching.
- "boundaries-north-excluded"
- Points on the boxes' northern boundaries are not counted as matching.
- "boundaries-east-excluded"
- Points on the boxes' eastern boundaries are not counted as matching.
- "boundaries-circle-excluded"
- Points on circles' boundary are not counted as matching.
|
|
Example:
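A minimal sketch, assuming boxes are constructed with cts:box in south, west, north, east order. It tests one box against two regions: a box that overlaps it and a point that lies outside it. Per the $region description, the call is true if any of the regions intersects the box.

```xquery
xquery version "1.0-ml";
let $box := cts:box(0, 0, 10, 10)
let $regions := (
  cts:box(5, 5, 15, 15),   (: overlaps $box :)
  cts:point(20, 20)        (: lies outside $box :)
)
return cts:box-intersects($box, $regions)
```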

cts:circle-intersects(
  $circle as cts:circle,
  $region as cts:region*,
  [$options as xs:string*]
) as xs:boolean

Summary:
Returns true if the circle intersects with a region.
|
Parameters:
$circle
:
A geographic circle.
|
$region
:
One or more geographic regions (boxes, circles, polygons, or points).
Where multiple regions are specified, return true if any region intersects
the target circle.
|
$options
(optional):
Options for the operation. The default is ().
Options include:
- "coordinate-system=wgs84"
- Use the WGS84 coordinate system.
- "units=miles"
- Distance (for circles) is measured in miles.
- "boundaries-included"
- Points on boxes', circles', and polygons' boundaries are counted as matching. This is the default.
- "boundaries-excluded"
- Points on boxes', circles', and polygons' boundaries are not counted as matching.
- "boundaries-latitude-excluded"
- Points on boxes' latitude boundaries are not counted as matching.
- "boundaries-longitude-excluded"
- Points on boxes' longitude boundaries are not counted as matching.
- "boundaries-south-excluded"
- Points on the boxes' southern boundaries are not counted as matching.
- "boundaries-west-excluded"
- Points on the boxes' western boundaries are not counted as matching.
- "boundaries-north-excluded"
- Points on the boxes' northern boundaries are not counted as matching.
- "boundaries-east-excluded"
- Points on the boxes' eastern boundaries are not counted as matching.
- "boundaries-circle-excluded"
- Points on circles' boundary are not counted as matching.
|
|
Example:
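A minimal sketch, assuming circles are constructed with cts:circle from a radius and a center point. The coordinates are illustrative. It tests a circle against a small box drawn around the circle's center.

```xquery
xquery version "1.0-ml";
(: Radius of 10, measured in miles per the "units=miles" option. :)
let $circle := cts:circle(10, cts:point(37, -122))
(: A box (south, west, north, east) surrounding the center point. :)
let $region := cts:box(36.9, -122.1, 37.1, -121.9)
return cts:circle-intersects($circle, $region, "units=miles")
```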

cts:cluster(
  $nodes as node()*,
  [$options as element()?]
) as element(cts:clustering)

Summary:
Produces a set of clusters from a sequence of nodes. The nodes can be
any set of nodes, and are typically the result of a cts:search
operation.
|
Parameters:
$nodes
:
The sequence of nodes to cluster.
|
$options
(optional):
An XML representation of the options for defining the clustering
parameters. The options node must be in the cts:cluster
namespace. The following is a sample options node:
<options xmlns="cts:cluster">
<label-max-terms>4</label-max-terms>
<max-clusters>6</max-clusters>
<use-db-config>true</use-db-config>
</options>
The cts:cluster options include:
<hierarchical-levels>
- An integer specifying how many hierarchical cluster levels the clusterer
should return. The default is
1, which means no hierarchical
clusters are returned.
<label-max-terms>
- An integer specifying the maximum number of terms to use in constructing
a cluster label. The default is
3.
<label-ignore-words>
- A space-separated list of words that are to be excluded from cluster
label. The default is to not exclude any words.
<label-ignore-attributes>
- A boolean that indicates whether attribute terms should be excluded
from the cluster label. The default is to include terms from attributes.
<details>
- A boolean that indicates whether additional details on the terms
used in label generation are to be included in the output. See the
documentation on cts:distinctive-terms for details on the format of the
terms returned. The default is
false, meaning no such details
are given.
<min-clusters>
- An integer specifying a minimum number of desired clusters returned
(at any hierarchical level).
However, if no satisfactory clustering can be produced at a given level,
only one cluster will be returned, regardless of this setting.
The default is
3.
<max-clusters>
- An integer specifying a maximum number of clusters that can be returned
(at any hierarchical level). The default is
15.
<overlapping>
- A boolean indicating whether it is acceptable for nodes to be
assigned to more than one cluster. The default is
false.
<max-terms>
- An integer value specifying the maximum number of distinct terms to
use in calculating the cluster. The default is
200.
Increasing the value will increase the cost (in terms of both time
and memory) of calculating the clusters, but may improve the quality
of the clusters.
<algorithm>
- A value indicating which clustering algorithm to use, either
k-means or lsi. The default is
k-means. The LSI algorithm is significantly more expensive
to compute, both in terms of time and space.
<num-tries>
- Specifies the number of times to run the clusterer against
the specified data. The default is 1.
Because of the way the algorithms work, running
the clusterer multiple times will increase the number of terms, and
tends to improve the accuracy of the clusters. It does so at the
cost of performance, as each time it runs, it has to do more work.
<use-db-config>
- A boolean value indicating whether to use the current DB configuration
for determining which terms to use. The default is
false,
which means that the default set of options, as well as any indexing
options you specify in the options node, will be
used for calculating the clusters and their labels. When set to
true, any indexing options set in the context database
configuration (including any field settings) are used, as well as any
default settings that you have not explicitly turned off in the options
node.
The options element also includes indexing options in the
http://marklogic.com/xdmp/database namespace.
These control which terms to use. Note that the use of certain
options, such as fast-case-sensitive-searches, will not
impact final results unless the term vector size is limited with
the max-terms option. Other options, such as
phrase-throughs, will only generate terms if some
other option is also enabled (in this case
fast-phrase-searches).
The database options are the same as the database options shown for
cts:distinctive-terms.
|
|
Example:
cts:cluster(
cts:search(//MILITARY, cts:word-query("apache"))[1 to 100],
<options xmlns="cts:cluster" xmlns:db="http://marklogic.com/xdmp/database">
<hierarchical-levels>2</hierarchical-levels>
<overlapping>false</overlapping>
<label-max-terms>3</label-max-terms>
<max-clusters>100</max-clusters>
<label-ignore-words>of the on in at a an for from by and</label-ignore-words>
<db:stemmed-searches>advanced</db:stemmed-searches>
<db:fast-phrase-searches>true</db:fast-phrase-searches>
<db:fast-element-word-searches>true</db:fast-element-word-searches>
<db:fast-element-phrase-searches>true</db:fast-element-phrase-searches>
</options>)
==>
<clustering xmlns="http://marklogic.com/cts">
<cluster id="123456" label="apache helicopters" count="7" nodes="3 34 31 98 34 23 39"/>
<cluster id="374632" label="apache linux" count="6" nodes="1 378 56 23 93 6"/>
<cluster id="3452231" label="navajo codetalkers" count="8" nodes="44 87 32 77 50 12 13 15"/>
...
<cluster id="2234" parent="123456" label="AH-64" count="2" nodes="3 39"/>
<cluster id="34321" parent="123456" label="air force" count="5" nodes="34 31 98 34 23"/>
<cluster id="34523" parent="374632" label="HTTP" count="3" nodes="1 56 23"/>
<cluster id="968" parent="374632" label="LAMP" count="3" nodes="378 93 6"/>
<options xmlns="cts:cluster" xmlns:db="http://marklogic.com/xdmp/database">
<algorithm>k-means</algorithm>
<db:stemmed-searches>advanced</db:stemmed-searches>
<db:fast-element-word-searches>true</db:fast-element-word-searches>
<db:fast-element-phrase-searches>true</db:fast-element-phrase-searches>
<db:language>en</db:language>
<max-clusters>100</max-clusters>
<min-clusters>2</min-clusters>
<hierarchical-levels>2</hierarchical-levels>
<initialization>smart</initialization>
<label-max-terms>3</label-max-terms>
<num-tries>1</num-tries>
<score>logtfidf</score>
<use-db-config>false</use-db-config>
</options>
</clustering>
|
Example:
cts:cluster(
cts:search(//function, "foo"),
<options xmlns="cts:cluster">
<use-db-config>true</use-db-config>
</options>)
=> The cts:clustering element

cts:collection-match(
  $pattern as xs:string,
  [$options as xs:string*],
  [$query as cts:query?],
  [$quality-weight as xs:double?],
  [$forest-ids as xs:unsignedLong*]
) as xs:string*

Summary:
Returns values from the collection lexicon
that match the specified wildcard pattern.
This function requires the collection-lexicon database configuration
parameter to be enabled. If the collection-lexicon database-configuration
parameter is not enabled, an exception is thrown.
|
Parameters:
$pattern
:
Wildcard pattern to match.
|
$options
(optional):
Options. The default is ().
Options include:
- "case-sensitive"
- A case-sensitive match.
- "case-insensitive"
- A case-insensitive match.
- "diacritic-sensitive"
- A diacritic-sensitive match.
- "diacritic-insensitive"
- A diacritic-insensitive match.
- "ascending"
- URIs should be returned in ascending order.
- "descending"
- URIs should be returned in descending order.
- "any"
- URIs from any fragment should be included.
- "document"
- URIs from document fragments should be included.
- "properties"
- URIs from properties fragments should be included.
- "locks"
- URIs from locks fragments should be included.
- "frequency-order"
- URIs should be returned ordered by frequency.
- "item-order"
- URIs should be returned ordered by item.
- "limit=N"
- Return no more than N URIs.
- "skip=N"
- Skip over fragments selected by the
cts:query
to treat the Nth fragment as the first fragment.
URIs from skipped fragments are not included.
This option affects the number of fragments selected
by the cts:query to calculate frequencies.
Only applies when a $query parameter is specified.
- "sample=N"
- Return only URIs from the first N fragments after skip
selected by the
cts:query.
This option does not affect the number of fragments selected
by the cts:query to calculate frequencies.
Only applies when a $query parameter is specified.
- "truncate=N"
- Include only URIs from the first N fragments after skip
selected by the
cts:query.
This option also affects the number of fragments selected
by the cts:query to calculate frequencies.
Only applies when a $query parameter is specified.
- "score-logtfidf"
- Compute scores using the logtfidf method.
- "score-logtf"
- Compute scores using the logtf method.
- "score-simple"
- Compute scores using the simple method.
- "score-random"
- Compute scores using the random method.
- "checked"
- Word positions should be checked when resolving the query.
- "unchecked"
- Word positions should not be checked when resolving the query.
- "concurrent"
- Perform the work concurrently in another thread. This is a hint
to the query optimizer to help parallelize the lexicon work, allowing
the calling query to continue performing other work while the lexicon
processing occurs. This is especially useful in cases where multiple
lexicon calls occur in the same query (for example, resolving many
facets in a single query).
- "map"
- Return results as a single map:map value instead of as an
xs:string* sequence.
|
$query
(optional):
Only include URIs from fragments selected by the cts:query,
and compute frequencies from this set of included URIs.
The fragments are not filtered to ensure they match the query,
but instead selected in the same manner as
"unfiltered" cts:search operations. If a string
is entered, the string is treated as a cts:word-query of the
specified string.
|
$quality-weight
(optional):
A document quality weight to use when computing scores.
The default is 1.0.
|
$forest-ids
(optional):
A sequence of IDs of forests to which the search will be constrained.
An empty sequence means to search all forests in the database.
The default is ().
|
|
Usage Notes:
Only one of "frequency-order" or "item-order" may be specified
in the options parameter. If neither "frequency-order" nor "item-order"
is specified, then the default is "item-order".
Only one of "ascending" or "descending" may be specified
in the options parameter. If neither "ascending" nor "descending"
is specified, then the default is "ascending" if "item-order" is
specified, and "descending" if "frequency-order" is specified.
Only one of "any", "document", "properties", or "locks"
may be specified in the options parameter.
If none of "any", "document", "properties", or "locks" are specified
and there is a $query parameter, then the default is "document".
If there is no $query parameter then the default is "any".
Only one of the "score-logtfidf", "score-logtf", "score-simple",
or "score-random" options may be specified in the options parameter.
If none of "score-logtfidf", "score-logtf", "score-simple", or
"score-random" are specified, then the default is "score-logtfidf".
Only one of the "checked" or "unchecked" options may be specified
in the options parameter.
If neither "checked" nor "unchecked" are specified,
then the default is "checked".
If "sample=N" is not specified in the options parameter,
then all included URIs may be returned. If a $query parameter
is not present, then "sample=N" has no effect.
If "truncate=N" is not specified in the options parameter,
then URIs from all fragments selected by the $query parameter
are included. If a $query parameter is not present, then
"truncate=N" has no effect.
If neither "case-sensitive" nor "case-insensitive"
is present, $pattern is used to determine case sensitivity.
If $pattern contains no uppercase, it specifies "case-insensitive".
If $pattern contains uppercase, it specifies "case-sensitive".
If neither "diacritic-sensitive" nor "diacritic-insensitive"
is present, $pattern is used to determine diacritic sensitivity.
If $pattern contains no diacritics, it specifies "diacritic-insensitive".
If $pattern contains diacritics, it specifies "diacritic-sensitive".
|
Example:
cts:collection-match("collection*")
=> ("collection1", "collection2", ...)

cts:collections(
  [$start as xs:string?],
  [$options as xs:string*],
  [$query as cts:query?],
  [$quality-weight as xs:double?],
  [$forest-ids as xs:unsignedLong*]
) as xs:string*

Summary:
Returns values from the collection lexicon.
This function requires the collection-lexicon database configuration
parameter to be enabled. If the collection-lexicon database-configuration
parameter is not enabled, an exception is thrown.
|
Parameters:
$start
(optional):
A starting value. Return only this value and following values.
If the parameter is not in the lexicon, then it returns the values
beginning with the next value.
|
$options
(optional):
Options. The default is ().
Options include:
- "ascending"
- URIs should be returned in ascending order.
- "descending"
- URIs should be returned in descending order.
- "any"
- URIs from any fragment should be included.
- "document"
- URIs from document fragments should be included.
- "properties"
- URIs from properties fragments should be included.
- "locks"
- URIs from locks fragments should be included.
- "frequency-order"
- URIs should be returned ordered by frequency.
- "item-order"
- URIs should be returned ordered by item.
- "limit=N"
- Return no more than N URIs.
- "skip=N"
- Skip over fragments selected by the
cts:query
to treat the Nth fragment as the first fragment.
URIs from skipped fragments are not included.
This option affects the number of fragments selected
by the cts:query to calculate frequencies.
Only applies when a $query parameter is specified.
- "sample=N"
- Return only URIs from the first N fragments after skip
selected by the
cts:query.
This option does not affect the number of fragments selected
by the cts:query to calculate frequencies.
Only applies when a $query parameter is specified.
- "truncate=N"
- Include only URIs from the first N fragments after skip
selected by the
cts:query.
This option also affects the number of fragments selected
by the cts:query to calculate frequencies.
Only applies when a $query parameter is specified.
- "score-logtfidf"
- Compute scores using the logtfidf method.
- "score-logtf"
- Compute scores using the logtf method.
- "score-simple"
- Compute scores using the simple method.
- "score-random"
- Compute scores using the random method.
- "checked"
- Word positions should be checked when resolving the query.
- "unchecked"
- Word positions should not be checked when resolving the query.
- "concurrent"
- Perform the work concurrently in another thread. This is a hint
to the query optimizer to help parallelize the lexicon work, allowing
the calling query to continue performing other work while the lexicon
processing occurs. This is especially useful in cases where multiple
lexicon calls occur in the same query (for example, resolving many
facets in a single query).
- "map"
- Return results as a single map:map value instead of as an
xs:string* sequence.
|
$query
(optional):
Only include URIs from fragments selected by the cts:query,
and compute frequencies from this set of included URIs.
The fragments are not filtered to ensure they match the query,
but instead selected in the same manner as
"unfiltered" cts:search operations. If a string
is entered, the string is treated as a cts:word-query of the
specified string.
|
$quality-weight
(optional):
A document quality weight to use when computing scores.
The default is 1.0.
|
$forest-ids
(optional):
A sequence of IDs of forests to which the search will be constrained.
An empty sequence means to search all forests in the database.
The default is ().
|
|
Usage Notes:
Only one of "frequency-order" or "item-order" may be specified
in the options parameter. If neither "frequency-order" nor "item-order"
is specified, then the default is "item-order".
Only one of "ascending" or "descending" may be specified
in the options parameter. If neither "ascending" nor "descending"
is specified, then the default is "ascending" if "item-order" is
specified, and "descending" if "frequency-order" is specified.
Only one of "any", "document", "properties", or "locks"
may be specified in the options parameter.
If none of "any", "document", "properties", or "locks" are specified
and there is a $query parameter, then the default is "document".
If there is no $query parameter then the default is "any".
Only one of the "score-logtfidf", "score-logtf", "score-simple",
or "score-random" options may be specified in the options parameter.
If none of "score-logtfidf", "score-logtf", "score-simple", or
"score-random" are specified, then the default is "score-logtfidf".
Only one of the "checked" or "unchecked" options may be specified
in the options parameter.
If neither "checked" nor "unchecked" are specified,
then the default is "checked".
If "sample=N" is not specified in the options parameter,
then all included URIs may be returned. If a $query parameter
is not present, then "sample=N" has no effect.
If "truncate=N" is not specified in the options parameter,
then URIs from all fragments selected by the $query parameter
are included. If a $query parameter is not present, then
"truncate=N" has no effect.
|
Example:
cts:collections("aardvark")
=> ("aardvark", "aardvarks", ...)
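The ordering and limit options can be combined with cts:frequency to list the most heavily used collections. This is a minimal sketch, assuming the collection lexicon is enabled; per the usage notes, "frequency-order" defaults to descending order when neither "ascending" nor "descending" is given.

```xquery
xquery version "1.0-ml";
(: Sketch only: assumes the collection lexicon is enabled. :)
for $c in cts:collections((), ("frequency-order", "limit=10"))
return fn:concat($c, ": ", cts:frequency($c))
```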

cts:complex-polygon(
  $outer as cts:polygon,
  $inner as cts:polygon*
) as cts:complex-polygon

Summary:
Returns a geospatial complex polygon value.
|
Parameters:
$outer
:
The outer polygon.
|
$inner
:
The inner (hole) polygons.
|
|
Example:
cts:complex-polygon(
cts:polygon("0,0 10,0 10,10 0,10 0,0"),
cts:polygon("5,0 7,0 7,5 5,5 5,0"))

cts:complex-polygon-contains(
  $complex-polygon as cts:complex-polygon,
  $region as cts:region*,
  [$options as xs:string*]
) as xs:boolean

Summary:
Returns true if the complex-polygon contains a region.
|
Parameters:
$complex-polygon
:
A geographic complex polygon.
|
$region
:
One or more geographic regions (boxes, circles, polygons, or points).
Where multiple regions are specified, return true if any region is
contained in the target complex polygon.
|
$options
(optional):
Options for the operation. The default is ().
Options include:
- "coordinate-system=wgs84"
- Use the WGS84 coordinate system.
- "units=miles"
- Distance (for circles) is measured in miles.
- "boundaries-included"
- Points on boxes', circles', and complex-polygons' boundaries are counted as matching. This is the default.
- "boundaries-excluded"
- Points on boxes', circles', and complex-polygons' boundaries are not counted as matching.
- "boundaries-latitude-excluded"
- Points on boxes' latitude boundaries are not counted as matching.
- "boundaries-longitude-excluded"
- Points on boxes' longitude boundaries are not counted as matching.
- "boundaries-south-excluded"
- Points on the boxes' southern boundaries are not counted as matching.
- "boundaries-west-excluded"
- Points on the boxes' western boundaries are not counted as matching.
- "boundaries-north-excluded"
- Points on the boxes' northern boundaries are not counted as matching.
- "boundaries-east-excluded"
- Points on the boxes' eastern boundaries are not counted as matching.
- "boundaries-circle-excluded"
- Points on circles' boundary are not counted as matching.
|
|
Example:
let $cp :=
cts:complex-polygon(
cts:polygon("0,0 10,0 10,10 0,10 0,0"),
cts:polygon("5,0 7,0 7,5 5,5 5,0"))
let $poly :=
cts:polygon("6,8 6.5,8 6.5,9 6,9 6,8")
return cts:complex-polygon-contains($cp, $poly)
(: returns true :)

cts:complex-polygon-inner(
  $complexPolygon as cts:complex-polygon
) as cts:polygon*

Summary:
Returns a complex polygon's inner polygons.
|
Parameters:
$complexPolygon
:
The complex polygon.
|
|
Example:
let $node :=
<complexPolygon name="Arapahoe">POLYGON((
0.396982870000000E+02 -0.104935135000000E+03,
0.396965870000000E+02 -0.104938635000000E+03,
0.396965870000000E+02 -0.104938635000000E+03,
0.397110870000000E+02 -0.104931634000000E+03,
0.397066870000000E+02 -0.104926934000000E+03,
0.397012870000000E+02 -0.104932834000000E+03,
0.396971870000000E+02 -0.104928134000000E+03,
0.396965870000000E+02 -0.104928134000000E+03,
0.396965870000000E+02 -0.104928134000000E+03,
0.396965870000000E+02 -0.104931534000000E+03,
0.396966870000000E+02 -0.104934335000000E+03,
0.396966870000000E+02 -0.104934335000000E+03,
0.396981250000000E+02 -0.104934109000000E+03
),
(
0.396981250000000E+02 -0.104934109000000E+03,
0.397001130000000E+02 -0.104931652000000E+03,
0.397001870000000E+02 -0.104934034000000E+03,
0.396981250000000E+02 -0.104934109000000E+03
))
</complexPolygon>
return
cts:complex-polygon-inner(cts:parse-wkt(fn:data($node)))

cts:complex-polygon-intersects(
  $complex-polygon as cts:complex-polygon,
  $region as cts:region*,
  [$options as xs:string*]
) as xs:boolean

Summary:
Returns true if the complex-polygon intersects with a region.
|
Parameters:
$complex-polygon
:
A geographic complex-polygon.
|
$region
:
One or more geographic regions (boxes, circles, complex-polygons, or points).
Where multiple regions are specified, return true if any region intersects
the target complex-polygon.
|
$options
(optional):
Options for the operation. The default is ().
Options include:
- "coordinate-system=wgs84"
- Use the WGS84 coordinate system.
- "units=miles"
- Distance (for circles) is measured in miles.
- "boundaries-included"
- Points on boxes', circles', and complex-polygons' boundaries are counted as matching. This is the default.
- "boundaries-excluded"
- Points on boxes', circles', and complex-polygons' boundaries are not counted as matching.
- "boundaries-latitude-excluded"
- Points on boxes' latitude boundaries are not counted as matching.
- "boundaries-longitude-excluded"
- Points on boxes' longitude boundaries are not counted as matching.
- "boundaries-south-excluded"
- Points on the boxes' southern boundaries are not counted as matching.
- "boundaries-west-excluded"
- Points on the boxes' western boundaries are not counted as matching.
- "boundaries-north-excluded"
- Points on the boxes' northern boundaries are not counted as matching.
- "boundaries-east-excluded"
- Points on the boxes' eastern boundaries are not counted as matching.
- "boundaries-circle-excluded"
- Points on circles' boundary are not counted as matching.
|
|
Example:
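A minimal sketch of a call, using made-up coordinates. It assumes the cts:complex-polygon, cts:polygon, cts:point, and cts:box constructors, which are not described in this section; the argument orders shown for them are assumptions to check against their own reference entries.

```xquery
xquery version "1.0-ml";
(: Illustrative coordinates only: a square outer ring with a
   smaller square hole. Points are assumed to be (latitude, longitude). :)
let $outer := cts:polygon((
  cts:point(0, 0), cts:point(0, 10),
  cts:point(10, 10), cts:point(10, 0), cts:point(0, 0)))
let $inner := cts:polygon((
  cts:point(4, 4), cts:point(4, 6),
  cts:point(6, 6), cts:point(6, 4), cts:point(4, 4)))
(: Assumed constructor form: outer polygon first, then inner polygons. :)
let $complex := cts:complex-polygon($outer, $inner)
(: cts:box arguments are assumed to be south, west, north, east. :)
let $region := cts:box(1, 1, 2, 2)
return
  cts:complex-polygon-intersects($complex, $region,
    "boundaries-included")
```

Passing several regions in $region would return true if any one of them intersects the complex polygon, as described for that parameter.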
|
|
|
|
cts:complex-polygon-outer(
|
|
$complexPolygon as cts:complex-polygon
|
| ) as cts:polygon? |
|
 |
Summary:
Returns a complex polygon's outer polygon.
|
Parameters:
$complexPolygon
:
The complex polygon.
|
|
Example:
let $node :=
<complexPolygon name="Arapahoe">POLYGON((
0.396982870000000E+02 -0.104935135000000E+03,
0.396965870000000E+02 -0.104938635000000E+03,
0.396965870000000E+02 -0.104938635000000E+03,
0.397110870000000E+02 -0.104931634000000E+03,
0.397066870000000E+02 -0.104926934000000E+03,
0.397012870000000E+02 -0.104932834000000E+03,
0.396971870000000E+02 -0.104928134000000E+03,
0.396965870000000E+02 -0.104928134000000E+03,
0.396965870000000E+02 -0.104928134000000E+03,
0.396965870000000E+02 -0.104931534000000E+03,
0.396966870000000E+02 -0.104934335000000E+03,
0.396966870000000E+02 -0.104934335000000E+03,
0.396981250000000E+02 -0.104934109000000E+03
),
(
0.396981250000000E+02 -0.104934109000000E+03,
0.397001130000000E+02 -0.104931652000000E+03,
0.397001870000000E+02 -0.104934034000000E+03,
0.396981250000000E+02 -0.104934109000000E+03
))
</complexPolygon>
return
cts:complex-polygon-outer(cts:parse-wkt(fn:data($node)))
|
|
|
|
cts:count(
|
|
$arg as item()*,
|
|
[$maximum as xs:double]
|
| ) as xs:integer |
|
 |
Summary:
Returns a frequency-weighted count of a sequence.
This function works like fn:count except the count
of each item is multiplied by cts:frequency.
|
Parameters:
$arg
:
The sequence of items to count. The items should be the result of
a lexicon lookup.
|
$maximum
(optional):
The maximum value of the count to return. MarkLogic Server will stop
counting when the $maximum value is reached and return
the $maximum value.
|
|
Usage Notes:
This function is designed to take a sequence of values returned
by a lexicon function (for example, cts:element-values); if you
input non-lexicon values, the result will always be 0.
|
Example:
xquery version "1.0-ml";
(:
This query assumes an int range index
is configured in the database. It
generates some sample data and then
performs the aggregation in a separate
transaction.
:)
for $x in 1 to 10
return
xdmp:document-insert(fn:concat($x, ".xml"),
<my-element>{
for $y in 1 to $x
return <int>{$x}</int>
}</my-element>);
cts:count(cts:element-values(xs:QName("int"), (),
("type=int", "item-frequency"))),
cts:count(cts:element-values(xs:QName("int"), (),
("type=int", "fragment-frequency")))
=>
55
10
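The two results follow from how the sample data is built: document N holds N copies of <int>N</int>, so each of the ten distinct values occurs in exactly one fragment, while the item occurrences add up to 1 + 2 + ... + 10. A sketch of that arithmetic in plain XQuery, with no lexicon involved:

```xquery
(: Each value $x from 1 to 10 occurs $x times, all in one document. :)
let $item-total     := fn:sum(for $x in 1 to 10 return $x)
let $fragment-total := fn:count(1 to 10)
return ($item-total, $fragment-total)
(: => 55 10 :)
```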
|
|
|
|
cts:directory-query(
|
|
$uris as xs:string*,
|
|
[$depth as xs:string?]
|
| ) as cts:directory-query |
|
 |
Summary:
Returns a query matching documents in the directories with the given URIs.
|
Parameters:
$uris
:
One or more directory URIs.
|
$depth
(optional):
"1" for immediate children, "infinity" for all. If not supplied,
depth is "1".
|
|
Example:
cts:search(//function,
cts:directory-query(("/reports/","/analysis/"),"1"))
=> .. a sequence of 'function' elements in any document
in the directory "/reports/" or the directory "/analysis/".
|
Example:
cts:search(//function, cts:and-query(("repair",
cts:directory-query(("/reports/", "/analysis/"), "1"))))
=> .. relevance ordered sequence of 'function' elements in
any document that both contains the word "repair" and is
in either the directory "/reports/" or in the directory
"/analysis/".
|
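Both examples above pass "1", which restricts matches to immediate children of the directory. A sketch of the same kind of search with "infinity", which per the $depth description also reaches documents in subdirectories (the "/reports/" URI is reused from the examples above and is illustrative only):

```xquery
cts:search(//function,
  cts:directory-query("/reports/", "infinity"))
```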
|
|
|
cts:distinctive-terms(
|
|
$nodes as node()*,
|
|
[$options as element()?]
|
| ) as element(cts:class) |
|
 |
Summary:
Returns the most "relevant" terms in the model nodes (that is, the
terms with the highest scores).
|
Parameters:
$nodes
:
Some model nodes.
|
$options
(optional):
An XML representation of the options for defining which terms to
generate and how to evaluate them.
The options node must be in the cts:distinctive-terms
namespace. The following is a sample options node:
<options xmlns="cts:distinctive-terms">
<max-terms>20</max-terms>
</options>
The
cts:distinctive-terms options (which are also valid for
cts:similar-query, cts:train,
and cts:cluster)
include:
<max-terms>
- An integer defining the maximum number of distinctive terms to list
in the
cts:distinctive-terms output. The default is 16.
<min-val>
- A double specifying the minimum value a term can
have and still be considered a distinctive term. The default is 0.
<min-weight>
- A number specifying the minimum weighted term frequency a term can
have and still be considered a distinctive term. In general this value
will be either 0 (include unweighted terms) or 1 (don't include unweighted
terms). The default is 1.
<score>
- A string defining which scoring method to use in comparing the values
of the terms.
The default is
logtfidf. See the description of scoring
methods in the cts:search function for more details.
Possible values are:
logtfidf
- Compute scores using the logtfidf method.
logtf
- Compute scores using the logtf method.
simple
- Compute scores using the simple method.
<use-db-config>
- A boolean value indicating whether to use the current DB configuration
for determining which terms to use. The default is
true.
Setting the value to false means that the indexing
options in the options node will be used, as well as the default value
for any of the options not specified. This may be used to easily
target a small set of terms.
<complete>
- A boolean value indicating whether to return terms even if there is no
query associated with them. The default is false.
The options element also includes indexing options in the
http://marklogic.com/xdmp/database namespace.
These control which terms to use.
These database options include the following, shown here with
a db prefix to denote the
http://marklogic.com/xdmp/database namespace. The default
given for each option is the value used if use-db-config is set
to false:
<db:word-searches>
- Include terms for the words in the node. The default is 'false'.
<db:stemmed-searches>
- Define whether to include terms for the stems in the node, and at
what level of stemming:
off, basic,
advanced, or decompounding. The default is 'basic'.
<db:fast-case-sensitive-searches>
- Include terms for case-sensitive variations of the words in the
node. The default is 'false'.
<db:fast-diacritic-sensitive-searches>
- Include terms for diacritic-sensitive variations of the words in
the node. The default is 'false'.
<db:fast-phrase-searches> - Include
terms for two-word phrases in the node. The default is 'true'.
<db:phrase-throughs> - If phrase
terms are included, include terms for phrases that cross the given
elements. The default is to have no such elements.
<db:phrase-arounds> - If phrase
terms are included, include terms for phrases that skip over the
given elements. The default is to have no such elements.
<db:fast-element-word-searches>
- Include terms for words in particular elements. The default is 'true'.
<db:fast-element-phrase-searches>
- Include terms for phrases in particular elements. The default is 'true'.
<db:element-word-query-throughs>
- Include terms for words in sub-elements of the given elements. The default is to have no such elements.
<db:fast-element-character-searches>
- Include terms for characters in particular elements. The default is 'false'.
<db:range-element-indexes>
- Include terms for data values in specific elements. The default is to have no such indexes.
<db:range-field-indexes>
- Include terms for data values in specific fields. The default is to have no such indexes.
<db:range-element-attribute-indexes>
- Include terms for data values in specific attributes. The default is to have no such indexes.
<db:one-character-searches>
- Include terms for single characters. The default is 'false'.
<db:two-character-searches>
- Include terms for two-character sequences. The default is 'false'.
<db:three-character-searches>
- Include terms for three-character sequences. The default is 'false'.
<db:trailing-wildcard-searches>
- Include terms for trailing wildcards. The default is 'false'.
<db:fast-element-trailing-wildcard-searches>
- If trailing wildcard terms are included, include terms for
trailing wildcards by element. The default is 'false'.
<db:fields>
- Include terms for the defined fields. The default is to have no fields.
|
|
Usage Notes:
Output Format
The output of the function is a cts:class element containing a
sequence of cts:term elements. (This is the same as the weights
form of a class for the SVM classifier; see cts:train.) Each
cts:term element identifies the term ID as well as a score,
confidence, and fitness measure for the term, in addition to a
cts:query that corresponds to the term. The correspondence of
terms to queries is not precise: queries typically make use of multiple
terms, and not all terms correspond to a query. However, a search using the
query given for a term will match the model node that gave rise to it.
|
Example:
cts:distinctive-terms( fn:doc("book.xml"),
<options xmlns="cts:distinctive-terms"><max-terms>3</max-terms></options> )
=>
<cts:class name="dterms book.xml" offset="0" xmlns:cts="http://marklogic.com/cts">
<cts:term id="1230725848944963443" val="482" score="372" confidence="0.686441" fitness="0.781011">
<cts:element-word-query>
<cts:element>title</cts:element>
<cts:text xml:lang="en">the</cts:text>
<cts:option>case-insensitive</cts:option>
<cts:option>diacritic-insensitive</cts:option>
<cts:option>stemmed</cts:option>
<cts:option>unwildcarded</cts:option>
</cts:element-word-query>
</cts:term>
<cts:term id="2859044029148442125" val="435" score="662" confidence="0.922555" fitness="0.971371">
<cts:word-query>
<cts:text xml:lang="en">text</cts:text>
<cts:option>case-insensitive</cts:option>
<cts:option>diacritic-insensitive</cts:option>
<cts:option>stemmed</cts:option>
<cts:option>unwildcarded</cts:option>
</cts:word-query>
</cts:term>
<cts:term id="17835615465481541363" val="221" score="237" confidence="0.65647" fitness="0.781263">
<cts:word-query>
<cts:text xml:lang="en">of</cts:text>
<cts:option>case-insensitive</cts:option>
<cts:option>diacritic-insensitive</cts:option>
<cts:option>stemmed</cts:option>
<cts:option>unwildcarded</cts:option>
</cts:word-query>
</cts:term>
</cts:class>
|
Example:
cts:distinctive-terms(//title,
<options xmlns="cts:distinctive-terms">
<use-db-config>true</use-db-config>
</options>)
=> a cts:class element containing the 16 most distinctive query terms
|
Example:
cts:distinctive-terms(<foo>hello there you</foo>,
<options xmlns="cts:distinctive-terms"
xmlns:db="http://marklogic.com/xdmp/database">
<db:word-positions>true</db:word-positions>
</options>)
=> a cts:class element containing the 16 most distinctive query terms
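The options list describes setting use-db-config to false in order to target a small set of terms. A sketch of such an options node follows; the document URI is reused from the first example, the chosen values are illustrative, and the ordering of the child elements is an assumption, since the text above does not specify one.

```xquery
cts:distinctive-terms(fn:doc("book.xml"),
  <options xmlns="cts:distinctive-terms"
           xmlns:db="http://marklogic.com/xdmp/database">
    <max-terms>10</max-terms>
    <!-- Ignore the database configuration; options not listed
         here fall back to the defaults given in the option list. -->
    <use-db-config>false</use-db-config>
    <db:word-searches>true</db:word-searches>
    <db:stemmed-searches>off</db:stemmed-searches>
    <db:fast-phrase-searches>false</db:fast-phrase-searches>
  </options>)
```

With these settings the terms considered should come mainly from unstemmed single words, rather than from stems or two-word phrases.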
|
|
|
|
cts:element-attribute-pair-geospatial-boxes(
|
|
$parent-element-names as xs:QName*,
|
|
$latitude-names as xs:QName*,
|
|
$longitude-names as xs:QName*,
|
|
[$latitude-bounds as xs:double*],
|
|
[$longitude-bounds as xs:double*],
|
|
[$options as xs:string*],
|
|
[$query as cts:query?],
|
|
[$quality-weight as xs:double?],
|
|
[$forest-ids as xs:unsignedLong*]
|
| ) as cts:box* |
|
 |
Summary:
Returns boxes derived from the specified element point lexicon(s).
Element point lexicons are implemented using geospatial indexes; consequently
this function requires a geospatial element attribute pair index for each
parent element and attribute pair
specified in the function. If there is not a geospatial index configured for
each of the specified combinations, an exception is thrown.
The points are divided into box-shaped buckets. The $latitude-bounds and
$longitude-bounds parameters specify the number and the size of each
box-shaped bucket. All included points are bucketed, even those outside
the bounds. An empty sequence for both $latitude-bounds and
$longitude-bounds specifies one bucket, a single value for both specifies
four buckets, two values for both specify nine buckets, and so on.
For each non-empty bucket, a cts:box value is returned.
By default, the cts:box value is the minimum bounding box
of all the points in the bucket. If the "gridded" option is specified,
then if a bucket is bounded on a side, its corresponding
cts:box side is the bound.
Empty buckets return nothing unless the "empties" option is specified.
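The bucket counts quoted above (one, four, nine) are consistent with each axis being split into one more range than it has bounds. A sketch of that arithmetic in plain XQuery; local:bucket-count is a hypothetical helper, not part of the cts API, and the product rule for unequal numbers of bounds is an inference from the stated cases.

```xquery
(: Hypothetical helper: n bounds on an axis give n + 1 ranges,
   counting the two open-ended ranges outside the bounds. :)
declare function local:bucket-count(
  $latitude-bounds as xs:double*,
  $longitude-bounds as xs:double*
) as xs:integer
{
  (fn:count($latitude-bounds) + 1) * (fn:count($longitude-bounds) + 1)
};

local:bucket-count((), ()),             (: 1 :)
local:bucket-count(0, 0),               (: 4 :)
local:bucket-count((0, 10), (0, 10))    (: 9 :)
```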
|
Parameters:
$parent-element-names
:
One or more element QNames.
|
$latitude-names
:
One or more latitude attribute QNames.
|
$longitude-names
:
One or more longitude attribute QNames.
|
$latitude-bounds
(optional):
A sequence of latitude bounds.
The values must be in strictly ascending order, otherwise an exception
is thrown.
|
$longitude-bounds
(optional):
A sequence of longitude bounds.
The values must be in strictly ascending order, otherwise an exception
is thrown.
|
$options
(optional):
Options. The default is ().
Options include:
- "ascending"
- Boxes should be returned in ascending order.
- "descending"
- Boxes should be returned in descending order.
- "gridded"
- For each side that a bucket is bounded, return the corresponding
bound as the edge of the box, instead of the extremum from the
points in the bucket.
- "empties"
- Include fully-bounded ranges whose frequency is 0. Only
empty ranges that have
both their upper and lower bounds specified in the $latitude-bounds
and $longitude-bounds parameters are returned;
any empty ranges that are less than the first bound or greater than the
last bound are not returned. For example, if you specify 4 bounds
and there are no results for any of the bounds, 3 elements are
returned (not 5 elements).
- "any"
- Points from any fragment should be included.
- "document"
- Points from document fragments should be included.
- "properties"
- Points from properties fragments should be included.
- "locks"
- Points from locks fragments should be included.
- "frequency-order"
- Boxes should be returned ordered by frequency.
- "item-order"
- Boxes should be returned ordered by item.
- "fragment-frequency"
- Frequency should be the number of fragments with
an included point.
This option is used with
cts:frequency.
- "item-frequency"
- Frequency should be the number of occurrences of
an included point.
This option is used with
cts:frequency.
- "coordinate-system=URI"
- Use the lexicon with the coordinate system specified by
name.
- "limit=N"
- Return no more than N boxes.
- "skip=N"
- Skip over fragments selected by the
cts:query
to treat the Nth matching fragment as the first fragment.
Points from skipped fragments are not included.
This option affects the number of fragments selected
by the cts:query to calculate frequencies.
Only applies when a $query parameter is specified.
- "sample=N"
- Return only boxes for buckets with at least one point from
the first N fragments after skip selected by the
cts:query.
This option does not affect the number of fragments selected
by the cts:query to calculate frequencies.
Only applies when a $query parameter is specified.
- "truncate=N"
- Include only points from the first N fragments
after skip selected by the
cts:query.
This option affects the number of fragments selected
by the cts:query to calculate frequencies.
Only applies when a $query parameter is specified.
- "score-logtfidf"
- Compute scores using the logtfidf method.
- "score-logtf"
- Compute scores using the logtf method.
- "score-simple"
- Compute scores using the simple method.
- "score-random"
- Compute scores using the random method.
- "checked"
- Word positions should be checked when resolving the query.
- "unchecked"
- Word positions should not be checked when resolving the query.
- "concurrent"
- Perform the work concurrently in another thread. This is a hint
to the query optimizer to help parallelize the lexicon work, allowing
the calling query to continue performing other work while the lexicon
processing occurs. This is especially useful in cases where multiple
lexicon calls occur in the same query (for example, resolving many
facets in a single query).
- "map"
- Return results as a single map:map value instead of as a
cts:box* sequence.
|
$query
(optional):
Only include points in fragments selected by the cts:query,
and compute frequencies from this set of included points.
The points do not need to match the query, but they must occur in
fragments selected by the query.
The fragments are not filtered to ensure they match the query,
but instead selected in the same manner as
"unfiltered" cts:search operations.
|
$quality-weight
(optional):
A document quality weight to use when computing scores.
The default is 1.0.
|
$forest-ids
(optional):
A sequence of IDs of forests to which the search will be constrained.
An empty sequence means to search all forests in the database.
The default is ().
|
|
Usage Notes:
Only one of "frequency-order" or "item-order" may be specified
in the options parameter. If neither "frequency-order" nor "item-order"
is specified, then the default is "item-order".
Only one of "fragment-frequency" or "item-frequency" may be specified
in the options parameter. If neither "fragment-frequency" nor
"item-frequency" is specified, then the default is "fragment-frequency".
Only one of "ascending" or "descending" may be specified
in the options parameter. If neither "ascending" nor "descending"
is specified, then the default is "ascending" if "item-order" is
specified, and "descending" if "frequency-order" is specified.
Only one of "any", "document", "properties", or "locks"
may be specified in the options parameter.
If none of "any", "document", "properties", or "locks" are specified
and there is a $query parameter, then the default is "document".
If there is no $query parameter then the default is "any".
Only one of the "score-logtfidf", "score-logtf", "score-simple",
or "score-random" options may be specified in the options parameter.
If none of "score-logtfidf", "score-logtf", "score-simple", or
"score-random" are specified, then the default is "score-logtfidf".
Only one of the "checked" or "unchecked" options may be specified
in the options parameter.
If neither "checked" nor "unchecked" are specified,
then the default is "checked".
If "coordinate-system=name" is not specified in the options
parameter, then the default coordinate system is used. If a lexicon with
that coordinate system does not exist, an error is thrown.
If "sample=N" is not specified in the options parameter,
then all boxes with included points may be returned. If a $query
parameter is not present, then "sample=N" has no effect.
If "truncate=N" is not specified in the options parameter,
then points from all fragments selected by the $query parameter
are included. If a $query parameter is not present, then
"truncate=N" has no effect.
|
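A sketch of a call that combines the bounds and options described above. The element and attribute names (point, lat, long) are hypothetical, and the call assumes a matching element attribute pair geospatial index is configured; without one, an exception is thrown.

```xquery
xquery version "1.0-ml";
(: Hypothetical names: assumes a geospatial index on point/@lat
   and point/@long. :)
for $box in cts:element-attribute-pair-geospatial-boxes(
    xs:QName("point"), xs:QName("lat"), xs:QName("long"),
    (0, 10, 20),    (: latitude bounds, strictly ascending :)
    (30, 40),       (: longitude bounds, strictly ascending :)
    ("gridded", "item-frequency"))
return
  (: Pair each returned box with its frequency. :)
  fn:concat(fn:string($box), " : ", cts:frequency($box))
```

Because "empties" is not passed, only non-empty buckets are reported; "gridded" makes each bounded side of a box coincide with the supplied bound rather than with the extreme point in the bucket.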
|
|
|
cts:element-attribute-pair-geospatial-query(
|
|
$element-name as xs:QName*,
|
|
$latitude-attribute-names as xs:QName*,
|
|
$longitude-attribute-names as xs:QName*,
|
|
$regions as cts:region*,
|
|
[$options as xs:string*],
|
|
[$weight as xs:double?]
|
| ) as cts:element-attribute-pair-geospatial-query |
|
 |
Summary:
Returns a cts:query matching elements by name that have
specific attributes representing latitude and longitude values for
a point contained within the given geographic box, circle, or polygon,
or equal to the given point. Points that lie
between the southern boundary and the northern boundary of a box,
travelling northwards,
and between the western boundary and the eastern boundary of the box,
travelling eastwards, will match.
Points contained within the given radius of the center point of a circle will
match, using the curved distance on the surface of the Earth.
Points contained within the given polygon will match, using great circle arcs
over a spherical model of the Earth as edges. An error may result
if the polygon is malformed in some way.
Points equal to a given point will match, taking into account the fact
that longitudes converge at the poles.
Using the geospatial query constructors requires a valid geospatial
license key; without a valid license key, searches that include
geospatial queries will throw an exception.
|
Parameters:
$element-name
:
One or more parent element QNames to match.
When multiple QNames are specified,
the query matches if any QName matches.
|
$latitude-attribute-names
:
One or more latitude attribute QNames to match.
When multiple QNames are specified, the query matches
if any QName matches; however, only the first matching latitude
attribute in any point instance will be checked.
|
$longitude-attribute-names
:
One or more longitude attribute QNames to match.
When multiple QNames are specified, the query matches
if any QName matches; however, only the first matching longitude
attribute in any point instance will be checked.
|
$regions
:
One or more geographic boxes, circles, polygons, or points. Where
multiple regions
are specified, the query matches if any region matches.
|
$options
(optional):
Options to this query. The default is ().
Options include:
- "coordinate-system=wgs84"
- Use the WGS84 coordinate system.
- "units=miles"
- Distance (for circles) is measured in miles.
- "boundaries-included"
- Points on boxes', circles', and polygons' boundaries are counted as matching. This is the default.
- "boundaries-excluded"
- Points on boxes', circles', and polygons' boundaries are not counted as matching.
- "boundaries-latitude-excluded"
- Points on boxes' latitude boundaries are not counted as matching.
- "boundaries-longitude-excluded"
- Points on boxes' longitude boundaries are not counted as matching.
- "boundaries-south-excluded"
- Points on the boxes' southern boundaries are not counted as matching.
- "boundaries-west-excluded"
- Points on the boxes' western boundaries are not counted as matching.
- "boundaries-north-excluded"
- Points on the boxes' northern boundaries are not counted as matching.
- "boundaries-east-excluded"
- Points on the boxes' eastern boundaries are not counted as matching.
- "boundaries-circle-excluded"
- Points on circles' boundary are not counted as matching.
- "cached"
- Cache the results of this query in the list cache.
- "uncached"
- Do not cache the results of this query in the list cache.
- "synonym"
- Specifies that all of the terms in the $text parameter are
considered synonyms for scoring purposes. The result is that
occurrences of more than one of the synonyms are scored as if
there are more occurrences of the same term (as opposed to
having a separate term that contributes to score).
|
$weight
(optional):
A weight for this query. The default is 1.0.
This parameter is currently
ignored; geospatial queries do not contribute to the score.
|
|
Usage Notes:
The point value is expressed as the numerical values in the
textual content of the named attributes.
The point values and the boundary specifications are given in degrees
relative to the WGS84 coordinate system. Southern latitudes and Western
longitudes take negative values. Longitudes will be wrapped to the range
(-180,+180) and latitudes will be clipped to the range (-90,+90).
If the northern boundary of a box is south of the southern boundary,
no points will
match. However, longitudes wrap around the globe, so that if the western
boundary is east of the eastern boundary (that is, if the value of 'w' is
greater than the value of 'e'), then the box crosses the anti-meridian.
Special handling occurs at the poles, as all longitudes exist at latitudes
+90 and -90.
If neither "cached" nor "uncached" is specified, the default is "cached".
|
Example:
(: create a document with test data :)
xdmp:document-insert("/points.xml",
<root>
<item><point lat="10.5" long="30.0"/></item>
<item><point lat="15.35" long="35.34"/></item>
<item><point lat="5.11" long="40.55"/></item>
</root> );
cts:search(doc("/points.xml")//item,
cts:element-attribute-pair-geospatial-query(xs:QName("point"),
xs:QName("lat"), xs:QName("long"), cts:box(10.0, 35.0, 20.0, 40.0)))
(:
returns the following node:
<item><point lat="15.35" long="35.34"/></item>
:)
;
cts:search(doc("/points.xml")//item,
cts:element-attribute-pair-geospatial-query(xs:QName("point"),
xs:QName("lat"), xs:QName("long"), cts:box(10.0, 40.0, 20.0, 35.0)))
(:
returns the following nodes (wrapping around the Earth):
<item><point lat="10.5" long="30.0"/></item>
:)
;
cts:search(doc("/points.xml")//item,
cts:element-attribute-pair-geospatial-query(xs:QName("point"),
xs:QName("lat"), xs:QName("long"), cts:box(20.0, 35.0, 10.0, 40.0)))
(:
throws an error (latitudes do not wrap)
:)
;
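The first two searches above differ only in the order of the west and east arguments. A simplified model of the box test in plain XQuery, with boundaries included and no special handling at the poles; local:in-box is a hypothetical helper for illustration, not how the server evaluates the query.

```xquery
(: Hypothetical helper. Arguments after the point follow the
   south, west, north, east order used by cts:box above. :)
declare function local:in-box(
  $lat as xs:double, $lon as xs:double,
  $s as xs:double, $w as xs:double,
  $n as xs:double, $e as xs:double
) as xs:boolean
{
  let $lat-ok := $s le $lat and $lat le $n
  let $lon-ok :=
    if ($w le $e)
    then $w le $lon and $lon le $e      (: ordinary box :)
    else $lon ge $w or $lon le $e       (: crosses the anti-meridian :)
  return $lat-ok and $lon-ok
};

local:in-box(15.35, 35.34, 10.0, 35.0, 20.0, 40.0),  (: true :)
local:in-box(10.5, 30.0, 10.0, 40.0, 20.0, 35.0),    (: true, wrapped box :)
local:in-box(15.35, 35.34, 10.0, 40.0, 20.0, 35.0)   (: false :)
```

In this model a box whose south value exceeds its north value simply matches nothing, since latitudes do not wrap.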
|
|
|
|
cts:element-attribute-pair-geospatial-value-match(
|
|
$element-names as xs:QName*,
|
|
$latitude-names as xs:QName*,
|
|
$longitude-names as xs:QName*,
|
|
$pattern as xs:anyAtomicType,
|
|
[$options as xs:string*],
|
|
[$query as cts:query?],
|
|
[$quality-weight as xs:double?],
|
|
[$forest-ids as xs:unsignedLong*]
|
| ) as cts:point* |
|
 |
Summary:
Returns values from the specified element attribute pair geospatial value
lexicon(s)
that match the specified wildcard pattern. Element attribute pair
geospatial value lexicons
are implemented using geospatial indexes; consequently this function
requires an element attribute pair geospatial index for each combination
of elements and attributes specified in the
function. If there is not a geospatial index configured for each of the
specified combinations, then an exception is thrown.
|
Parameters:
$element-names
:
One or more element QNames.
|
$latitude-names
:
One or more latitude attribute QNames.
|
$longitude-names
:
One or more longitude attribute QNames.
|
$pattern
:
A pattern to match. The parameter type must match the lexicon type.
|
$options
(optional):
Options. The default is ().
Options include:
- "ascending"
- Values should be returned in ascending order.
- "descending"
- Values should be returned in descending order.
- "any"
- Values from any fragment should be included.
- "document"
- Values from document fragments should be included.
- "properties"
- Values from properties fragments should be included.
- "locks"
- Values from locks fragments should be included.
- "frequency-order"
- Values should be returned ordered by frequency.
- "item-order"
- Values should be returned ordered by item.
- "fragment-frequency"
- Frequency should be the number of fragments with
an included value.
This option is used with
cts:frequency.
- "item-frequency"
- Frequency should be the number of occurrences of
an included value.
This option is used with
cts:frequency.
- "coordinate-system=URI"
- Use the lexicon with the coordinate system specified by
name.
- "limit=N"
- Return no more than N values.
- "skip=N"
- Skip over fragments selected by the
cts:query
to treat the Nth matching fragment as the first fragment.
Values from skipped fragments are not included.
This option affects the number of fragments selected
by the cts:query to calculate frequencies.
Only applies when a $query parameter is specified.
- "sample=N"
- Return only values from the first N
fragments after skip selected by the
cts:query.
This option does not affect the number of fragments selected
by the cts:query to calculate frequencies.
Only applies when a $query parameter is specified.
- "truncate=N"
- Include only values from the first N fragments
after skip selected by the
cts:query.
This option affects the number of fragments selected
by the cts:query to calculate frequencies.
Only applies when a $query parameter is specified.
- "score-logtfidf"
- Compute scores using the logtfidf method.
- "score-logtf"
- Compute scores using the logtf method.
- "score-simple"
- Compute scores using the simple method.
- "score-random"
- Compute scores using the random method.
- "checked"
- Word positions should be checked when resolving the query.
- "unchecked"
- Word positions should not be checked when resolving the query.
- "concurrent"
- Perform the work concurrently in another thread. This is a hint
to the query optimizer to help parallelize the lexicon work, allowing
the calling query to continue performing other work while the lexicon
processing occurs. This is especially useful in cases where multiple
lexicon calls occur in the same query (for example, resolving many
facets in a single query).
- "map"
- Return results as a single map:map value instead of as a
cts:point* sequence.
|
$query
(optional):
Only include values in fragments selected by the cts:query,
and compute frequencies from this set of included values.
The values do not need to match the query, but they must occur in
fragments selected by the query.
The fragments are not filtered to ensure they match the query,
but instead selected in the same manner as
"unfiltered" cts:search operations.
|
$quality-weight
(optional):
A document quality weight to use when computing scores.
The default is 1.0.
|
$forest-ids
(optional):
A sequence of IDs of forests to which the search will be constrained.
An empty sequence means to search all forests in the database.
The default is ().
|
|
Usage Notes:
Only one of "frequency-order" or "item-order" may be specified
in the options parameter. If neither "frequency-order" nor "item-order"
is specified, then the default is "item-order".
Only one of "fragment-frequency" or "item-frequency" may be specified
in the options parameter. If neither "fragment-frequency" nor
"item-frequency" is specified, then the default is "fragment-frequency".
Only one of "ascending" or "descending" may be specified
in the options parameter. If neither "ascending" nor "descending"
is specified, then the default is "ascending" if "item-order" is
specified, and "descending" if "frequency-order" is specified.
Only one of "any", "document", "properties", or "locks"
may be specified in the options parameter.
If none of "any", "document", "properties", or "locks" are specified
and there is a $query parameter, then the default is "document".
If there is no $query parameter then the default is "any".
Only one of the "score-logtfidf", "score-logtf", "score-simple",
or "score-random" options may be specified in the options parameter.
If none of "score-logtfidf", "score-logtf", "score-simple", or
"score-random" are specified, then the default is "score-logtfidf".
Only one of the "checked" or "unchecked" options may be specified
in the options parameter.
If neither "checked" nor "unchecked" are specified,
then the default is "checked".
If "coordinate-system=name" is not specified in the options
parameter, then the default coordinate system is used. If a lexicon with
that coordinate system does not exist, an error is thrown.
If "sample=N" is not specified in the options parameter,
then all included values may be returned. If a $query parameter
is not present, then "sample=N" has no effect.
If "truncate=N" is not specified in the options parameter,
then values from all fragments selected by the $query parameter
are included. If a $query parameter is not present, then
"truncate=N" has no effect.
|
Example:
cts:element-attribute-pair-geospatial-value-match(
xs:QName("location"),xs:QName("lat"),xs:QName("long"),cts:point(10,20))
=> 10,20
|
|
|
|
cts:element-attribute-pair-geospatial-values(
|
|
$element-names as xs:QName*,
|
|
$latitude-names as xs:QName*,
|
|
$longitude-names as xs:QName*,
|
|
[$start as cts:point?],
|
|
[$options as xs:string*],
|
|
[$query as cts:query?],
|
|
[$quality-weight as xs:double?],
|
|
[$forest-ids as xs:unsignedLong*]
|
| ) as cts:point* |
|
 |
Summary:
Returns values from the specified element-attribute-pair geospatial value lexicon(s).
Element-attribute-pair geospatial value lexicons are implemented using geospatial
indexes;
consequently this function requires an element-attribute-pair geospatial index
for each of the combinations specified in the function.
If there is not a geospatial index configured for each of the specified
combinations, then an exception is thrown.
|
Parameters:
$element-names
:
One or more element QNames.
|
$latitude-names
:
One or more latitude attribute QNames.
|
$longitude-names
:
One or more longitude attribute QNames.
|
$start
(optional):
A starting value.
If the parameter value is not in the lexicon, then the values are
returned beginning with the next value.
|
$options
(optional):
Options. The default is ().
Options include:
- "ascending"
- Values should be returned in ascending order.
- "descending"
- Values should be returned in descending order.
- "any"
- Values from any fragment should be included.
- "document"
- Values from document fragments should be included.
- "properties"
- Values from properties fragments should be included.
- "locks"
- Values from locks fragments should be included.
- "frequency-order"
- Values should be returned ordered by frequency.
- "item-order"
- Values should be returned ordered by item.
- "fragment-frequency"
- Frequency should be the number of fragments with
an included value.
This option is used with
cts:frequency.
- "item-frequency"
- Frequency should be the number of occurrences of
an included value.
This option is used with
cts:frequency.
- "coordinate-system=name"
- Use the lexicon with the coordinate system specified by
name.
- "limit=N"
- Return no more than N values.
- "skip=N"
- Skip over fragments selected by the
cts:query
to treat the Nth matching fragment as the first fragment.
Values from skipped fragments are not included.
This option affects the number of fragments selected
by the cts:query to calculate frequencies.
Only applies when a $query parameter is specified.
- "sample=N"
- Return only values from the first N fragments after
skip selected by the
cts:query.
This option does not affect the number of fragments selected
by the cts:query to calculate frequencies.
Only applies when a $query parameter is specified.
- "truncate=N"
- Include only values from the first N fragments
after skip selected by the
cts:query.
This option affects the number of fragments selected
by the cts:query to calculate frequencies.
Only applies when a $query parameter is specified.
- "score-logtfidf"
- Compute scores using the logtfidf method.
- "score-logtf"
- Compute scores using the logtf method.
- "score-simple"
- Compute scores using the simple method.
- "score-random"
- Compute scores using the random method.
- "checked"
- Word positions should be checked when resolving the query.
- "unchecked"
- Word positions should not be checked when resolving the query.
- "concurrent"
- Perform the work concurrently in another thread. This is a hint
to the query optimizer to help parallelize the lexicon work, allowing
the calling query to continue performing other work while the lexicon
processing occurs. This is especially useful in cases where multiple
lexicon calls occur in the same query (for example, resolving many
facets in a single query).
- "map"
- Return results as a single map:map value instead of as a
cts:point* sequence.
|
$query
(optional):
Only include values in fragments selected by the cts:query,
and compute frequencies from this set of included values.
The values do not need to match the query, but they must occur in
fragments selected by the query.
The fragments are not filtered to ensure they match the query,
but instead selected in the same manner as
"unfiltered" cts:search operations.
|
$quality-weight
(optional):
A document quality weight to use when computing scores.
The default is 1.0.
|
$forest-ids
(optional):
A sequence of IDs of forests to which the search will be constrained.
An empty sequence means to search all forests in the database.
The default is ().
|
|
Usage Notes:
Only one of "frequency-order" or "item-order" may be specified
in the options parameter. If neither "frequency-order" nor "item-order"
is specified, then the default is "item-order".
Only one of "fragment-frequency" or "item-frequency" may be specified
in the options parameter. If neither "fragment-frequency" nor
"item-frequency" is specified, then the default is "fragment-frequency".
Only one of "ascending" or "descending" may be specified
in the options parameter. If neither "ascending" nor "descending"
is specified, then the default is "ascending" if "item-order" is
specified, and "descending" if "frequency-order" is specified.
Only one of "any", "document", "properties", or "locks"
may be specified in the options parameter.
If none of "any", "document", "properties", or "locks" are specified
and there is a $query parameter, then the default is "document".
If there is no $query parameter then the default is "any".
Only one of the "score-logtfidf", "score-logtf", "score-simple",
or "score-random" options may be specified in the options parameter.
If none of "score-logtfidf", "score-logtf", "score-simple", or
"score-random" are specified, then the default is "score-logtfidf".
Only one of the "checked" or "unchecked" options may be specified
in the options parameter.
If neither "checked" nor "unchecked" is specified,
then the default is "checked".
If "coordinate-system=name" is not specified in the options
parameter, then the default coordinate system is used. If a lexicon with
that coordinate system does not exist, an error is thrown.
If "sample=N" is not specified in the options parameter,
then all included values may be returned. If a $query parameter
is not present, then "sample=N" has no effect.
If "truncate=N" is not specified in the options parameter,
then values from all fragments selected by the $query parameter
are included. If a $query parameter is not present, then
"truncate=N" has no effect.
When multiple element and/or attribute QNames are specified,
then all possible element/attribute QName combinations are used
to select the matching values.
|
Example:
cts:element-attribute-pair-geospatial-values(
xs:QName("location"), xs:QName("lat"), xs:QName("long"), cts:point(0,0) )
=> (cts:point(0,0),cts:point(0,10),cts:point(0,20),...)
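As a further sketch (not from the original reference), the call below
combines the "frequency-order" and "limit=N" options and reads the
frequency of each returned value with cts:frequency. The "location",
"lat", and "long" names are assumed for illustration, and a matching
element-attribute-pair geospatial index is assumed to be configured:
(: hypothetical names; requires a matching geospatial index :)
for $point in cts:element-attribute-pair-geospatial-values(
  xs:QName("location"), xs:QName("lat"), xs:QName("long"),
  (), ("frequency-order", "limit=10"))
return
  <point lat="{cts:point-latitude($point)}"
    long="{cts:point-longitude($point)}"
    frequency="{cts:frequency($point)}"/>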
|
|
|
|
cts:element-attribute-range-query(
|
|
$element-name as xs:QName*,
|
|
$attribute-name as xs:QName*,
|
|
$operator as xs:string,
|
|
$value as xs:anyAtomicType*,
|
|
[$options as xs:string*],
|
|
[$weight as xs:double?]
|
| ) as cts:element-attribute-range-query |
|
 |
Summary:
Returns a cts:query matching element attributes by name whose
range-index entry satisfies the specified comparison with a given value. Searches with the
cts:element-attribute-range-query
constructor require an attribute range index on the specified QName(s);
if there is no range index configured, then an exception is thrown.
|
Parameters:
$element-name
:
One or more element QNames to match.
When multiple QNames are specified,
the query matches if any QName matches.
|
$attribute-name
:
One or more attribute QNames to match.
When multiple QNames are specified,
the query matches if any QName matches.
|
$operator
:
A comparison operator.
Operators include:
- "<"
- Match range index values less than $value.
- "<="
- Match range index values less than or equal to $value.
- ">"
- Match range index values greater than $value.
- ">="
- Match range index values greater than or equal to $value.
- "="
- Match range index values equal to $value.
- "!="
- Match range index values not equal to $value.
|
$value
:
Some values to match.
When multiple values are specified,
the query matches if any value matches.
|
$options
(optional):
Options to this query. The default is ().
Options include:
- "collation=URI"
- Use the range index with the collation specified by
URI. If not specified, then the default collation
from the query is used. If a range index with the specified
collation does not exist, an error is thrown.
- "cached"
- Cache the results of this query in the list cache.
- "uncached"
- Do not cache the results of this query in the list cache.
- "min-occurs=number"
- Specifies the minimum number of occurrences required. If
fewer than this number of occurrences are found, the fragment does not match.
The default is 1.
- "max-occurs=number"
- Specifies the maximum number of occurrences allowed. If
more than this number of occurrences are found, the fragment does not match.
The default is unbounded.
- "synonym"
- Specifies that all of the terms in the $value parameter are
considered synonyms for scoring purposes. The result is that
occurrences of more than one of the synonyms are scored as if
there are more occurrences of the same term (as opposed to
having a separate term that contributes to score).
|
$weight
(optional):
A weight for this query. The default is 1.0. In the current release,
this option is ignored; range queries do not contribute to the score.
|
|
Usage Notes:
If you want to constrain on a range of values, you can combine multiple
cts:element-attribute-range-query constructors together
with cts:and-query or other composable cts:query
constructors.
If neither "cached" nor "uncached" is present, the default is "cached".
Negative "min-occurs" or "max-occurs" values will be treated as 0 and
non-integral values will be rounded down. An error will be raised if
the "min-occurs" value is greater than the "max-occurs" value.
|
Example:
(: create a document with test data :)
xdmp:document-insert("/attributes.xml",
  <root>
    <entry sku="100">
      <product>apple</product>
    </entry>
    <entry sku="200">
      <product>orange</product>
    </entry>
    <entry sku="1000">
      <product>electric car</product>
    </entry>
  </root>) ;
(:
  requires an attribute (range) index of
  type xs:int on the "sku" attribute of
  the "entry" element
:)
cts:search(doc("/attributes.xml")/root/entry,
  cts:element-attribute-range-query(
    xs:QName("entry"), xs:QName("sku"), ">=",
    500))
(:
  returns the following node:
  <entry sku="1000">
    <product>electric car</product>
  </entry>
:)
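The usage notes describe constraining on a range of values by combining
several range queries. A minimal sketch of that pattern, run as a
separate query, might look like the following. It is not part of the
original example; it assumes the same test document and the same xs:int
attribute range index, and the bounds 150 and 500 are arbitrary:
(:
  intended to select entries whose "sku" is at least 150
  and less than 500; the bounds are illustrative only
:)
cts:search(doc("/attributes.xml")/root/entry,
  cts:and-query((
    cts:element-attribute-range-query(
      xs:QName("entry"), xs:QName("sku"), ">=", 150),
    cts:element-attribute-range-query(
      xs:QName("entry"), xs:QName("sku"), "<", 500))))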
|
|
|
|
cts:element-attribute-value-co-occurrences(
|
|
$element-name-1 as xs:QName,
|
|
$attribute-name-1 as xs:QName?,
|
|
$element-name-2 as xs:QName,
|
|
$attribute-name-2 as xs:QName?,
|
|
[$options as xs:string*],
|
|
[$query as cts:query?],
|
|
[$quality-weight as xs:double?],
|
|
[$forest-ids as xs:unsignedLong*]
|
| ) as element(cts:co-occurrence)* |
|
 |
Summary:
Returns value co-occurrences from the specified element or element-attribute
value lexicon(s).
Value lexicons are implemented using range indexes;
consequently this function requires a range index for each element or
element/attribute pair specified in the function.
If there is not a range index configured for each of the specified
element or element/attribute pairs, then an exception is thrown.
|
Parameters:
$element-name-1
:
An element QName.
|
$attribute-name-1
:
An attribute QName or empty sequence.
The empty sequence specifies an element lexicon.
|
$element-name-2
:
An element QName.
|
$attribute-name-2
:
An attribute QName or empty sequence.
The empty sequence specifies an element lexicon.
|
$options
(optional):
Options. The default is ().
Options include:
- "ascending"
- Co-occurrences should be returned in ascending order.
- "descending"
- Co-occurrences should be returned in descending order.
- "any"
- Co-occurrences from any fragment should be included.
- "document"
- Co-occurrences from document fragments should be included.
- "properties"
- Co-occurrences from properties fragments should be included.
- "locks"
- Co-occurrences from locks fragments should be included.
- "frequency-order"
- Co-occurrences should be returned ordered by frequency.
- "item-order"
- Co-occurrences should be returned ordered by item.
- "fragment-frequency"
- Frequency should be the number of fragments with
an included co-occurrence.
This option is used with
cts:frequency.
- "item-frequency"
- Frequency should be the number of occurrences of
an included co-occurrence.
This option is used with
cts:frequency.
- "type=type"
- For both lexicons, use the type specified by type
(int, unsignedInt, long, unsignedLong, float, double, decimal,
dateTime, time, date, gYearMonth, gYear, gMonth, gDay,
yearMonthDuration, dayTimeDuration, string, or anyURI)
- "type-1=type"
- For the first lexicon, use the type specified by type
(int, unsignedInt, long, unsignedLong, float, double, decimal,
dateTime, time, date, gYearMonth, gYear, gMonth, gDay,
yearMonthDuration, dayTimeDuration, string, or anyURI)
- "type-2=type"
- For the second lexicon, use the type specified by type
(int, unsignedInt, long, unsignedLong, float, double, decimal,
dateTime, time, date, gYearMonth, gYear, gMonth, gDay,
yearMonthDuration, dayTimeDuration, string, or anyURI)
- "collation=URI"
- For both lexicons, use the collation specified by
URI.
- "collation-1=URI"
- For the first lexicon, use the collation specified by
URI.
- "collation-2=URI"
- For the second lexicon, use the collation specified by
URI.
- "timezone=TZ"
- Return timezone sensitive values (dateTime, time, date,
gYearMonth, gYear, gMonth, and gDay) adjusted to the timezone
specified by TZ.
Example timezones: Z, -08:00, +01:00.
- "ordered"
- Include co-occurrences only when the value from the first lexicon
appears before the value from the second lexicon.
Requires that word positions be enabled for both lexicons.
- "proximity=N"
- Include co-occurrences only when the values appear within
N words of each other.
Requires that word positions be enabled for both lexicons.
- "limit=N"
- Return no more than N co-occurrences.
- "skip=N"
- Skip over fragments selected by the
cts:query
to treat the Nth fragment as the first fragment.
Co-occurrences from skipped fragments are not included.
This option affects the number of fragments selected
by the cts:query to calculate frequencies.
Only applies when a $query parameter is specified.
- "sample=N"
- Return only co-occurrences from the first N
fragments after skip selected by the
cts:query.
This option does not affect the number of fragments selected
by the cts:query to calculate frequencies.
Only applies when a $query parameter is specified.
- "truncate=N"
- Include only co-occurrences from the first N
fragments after skip selected by the
cts:query.
This option also affects the number of fragments selected
by the cts:query to calculate frequencies.
Only applies when a $query parameter is specified.
- "score-logtfidf"
- Compute scores using the logtfidf method.
- "score-logtf"
- Compute scores using the logtf method.
- "score-simple"
- Compute scores using the simple method.
- "score-random"
- Compute scores using the random method.
- "checked"
- Word positions should be checked when resolving the query.
- "unchecked"
- Word positions should not be checked when resolving the query.
- "concurrent"
- Perform the work concurrently in another thread. This is a hint
to the query optimizer to help parallelize the lexicon work, allowing
the calling query to continue performing other work while the lexicon
processing occurs. This is especially useful in cases where multiple
lexicon calls occur in the same query (for example, resolving many
facets in a single query).
- "map"
- Return results as a single map:map value instead of as an
element(cts:co-occurrence)* sequence.
|
$query
(optional):
Only include co-occurrences in fragments selected by the cts:query,
and compute frequencies from this set of included co-occurrences.
The co-occurrences do not need to match the query, but they must occur in
fragments selected by the query.
The fragments are not filtered to ensure they match the query,
but instead selected in the same manner as
"unfiltered" cts:search operations. If a string
is entered, the string is treated as a cts:word-query of the
specified string.
|
$quality-weight
(optional):
A document quality weight to use when computing scores.
The default is 1.0.
|
$forest-ids
(optional):
A sequence of IDs of forests to which the search will be constrained.
An empty sequence means to search all forests in the database.
The default is ().
|
|
Usage Notes:
Only one of "frequency-order" or "item-order" may be specified
in the options parameter. If neither "frequency-order" nor "item-order"
is specified, then the default is "item-order".
Only one of "fragment-frequency" or "item-frequency" may be specified
in the options parameter. If neither "fragment-frequency" nor
"item-frequency" is specified, then the default is "fragment-frequency".
Only one of "ascending" or "descending" may be specified
in the options parameter. If neither "ascending" nor "descending"
is specified, then the default is "ascending" if "item-order" is
specified, and "descending" if "frequency-order" is specified.
Only one of "any", "document", "properties", or "locks"
may be specified in the options parameter.
If none of "any", "document", "properties", or "locks" are specified
and there is a $query parameter, then the default is "document".
If there is no $query parameter then the default is "any".
Only one of the "score-logtfidf", "score-logtf", "score-simple",
or "score-random" options may be specified in the options parameter.
If none of "score-logtfidf", "score-logtf", "score-simple", or
"score-random" are specified, then the default is "score-logtfidf".
Only one of the "checked" or "unchecked" options may be specified
in the options parameter.
If neither "checked" nor "unchecked" is specified,
then the default is "checked".
If "collation=URI" is not specified in the options parameter,
then the default collation is used. If a lexicon with that collation
does not exist, an error is thrown.
If "sample=N" is not specified in the options parameter,
then all included co-occurrences may be returned.
If a $query parameter
is not present, then "sample=N" has no effect.
If "truncate=N" is not specified in the options parameter,
then co-occurrences from all fragments selected by the
$query parameter are included.
If a $query parameter is not present, then
"truncate=N" has no effect.
|
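Example:
The following is a minimal sketch rather than an example from the
original reference. It assumes string range indexes (using the default
collation) on the hypothetical "city" and "state" attributes of a
hypothetical "address" element, and lists the five most frequent
city/state co-occurrences with their frequencies:
(: hypothetical names; requires matching attribute range indexes :)
for $pair in cts:element-attribute-value-co-occurrences(
  xs:QName("address"), xs:QName("city"),
  xs:QName("address"), xs:QName("state"),
  ("frequency-order", "limit=5"))
return
  fn:concat(
    fn:string($pair/cts:value[1]), " / ",
    fn:string($pair/cts:value[2]), ": ",
    cts:frequency($pair))
|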
|
|
|
cts:element-attribute-value-geospatial-co-occurrences(
|
|
$element-name-1 as xs:QName,
|
|
$attribute-name-1 as xs:QName?,
|
|
$geo-element-name as xs:QName,
|
|
$child-name-1 as xs:QName?,
|
|
$child-name-2 as xs:QName?,
|
|
[$options as xs:string*],
|
|
[$query as cts:query?],
|
|
[$quality-weight as xs:double?],
|
|
[$forest-ids as xs:unsignedLong*]
|
| ) as element(cts:co-occurrence)* |
|
 |
Summary:
Returns value co-occurrences from the specified element-attribute
value lexicon with the specified geospatial lexicon.
Value lexicons are implemented using range indexes;
consequently this function requires a range index for the element and attribute
pair specified in the function.
If there is not a range index configured for the specified
element and attribute pair, then an exception is thrown.
Geospatial lexicons are implemented using geospatial indexes;
consequently this function requires a geospatial index for the
element/attribute combination specified in the function.
If there is not a geospatial index configured for the specified
element/attribute combination, then an exception is thrown.
|
Parameters:
$element-name-1
:
An element QName.
|
$attribute-name-1
:
An attribute QName.
|
$geo-element-name
:
An element QName.
|
$child-name-1
:
An element or attribute QName or empty sequence.
The empty sequence specifies an element geospatial lexicon.
|
$child-name-2
:
An element or attribute QName or empty sequence.
The empty sequence specifies either an element geospatial lexicon or an
element-child geospatial lexicon.
|
$options
(optional):
Options. The default is ().
Options include:
- "geospatial-format=format"
- Use the kind of geospatial lexicon specified by format
(element, element-child, element-pair, or element-attribute-pair).
If neither of the child QNames is specified, the default is
"element"; if only the first of the child QNames is specified,
the default is "element-child"; if both child QNames are specified,
the default is "element-pair". If the selection is not compatible
with the number of geospatial QNames specified, an error is raised.
- "ascending"
- Co-occurrences should be returned in ascending order.
- "descending"
- Co-occurrences should be returned in descending order.
- "any"
- Co-occurrences from any fragment should be included.
- "document"
- Co-occurrences from document fragments should be included.
- "properties"
- Co-occurrences from properties fragments should be included.
- "locks"
- Co-occurrences from locks fragments should be included.
- "frequency-order"
- Co-occurrences should be returned ordered by frequency.
- "item-order"
- Co-occurrences should be returned ordered by item.
- "fragment-frequency"
- Frequency should be the number of fragments with
an included co-occurrence.
This option is used with
cts:frequency.
- "item-frequency"
- Frequency should be the number of occurrences of
an included co-occurrence.
This option is used with
cts:frequency.
- "type=type"
- For the non-geospatial lexicon, use the type specified by
type (int, unsignedInt, long, unsignedLong, float, double,
decimal, dateTime, time, date, gYearMonth, gYear, gMonth, gDay,
yearMonthDuration, dayTimeDuration, string, or anyURI)
- "collation=URI"
- For the non-geospatial lexicon, use the collation specified by
URI.
- "coordinate-system=name"
- For the geospatial lexicons, use the coordinate system specified
by name.
- "timezone=TZ"
- Return timezone sensitive values (dateTime, time, date,
gYearMonth, gYear, gMonth, and gDay) adjusted to the timezone
specified by TZ.
Example timezones: Z, -08:00, +01:00.
- "ordered"
- Include co-occurrences only when the value from the first lexicon
appears before the value from the second lexicon.
Requires that word positions be enabled for both lexicons.
- "reversed"
- Consider the second lexicon as the first and vice versa.
- "proximity=N"
- Include co-occurrences only when the values appear within
N words of each other.
Requires that word positions be enabled for both lexicons.
- "limit=N"
- Return no more than N co-occurrences.
- "skip=N"
- Skip over fragments selected by the
cts:query
to treat the Nth matching fragment as the first fragment.
Co-occurrences from skipped fragments are not included.
This option affects the number of fragments selected
by the cts:query to calculate frequencies.
Only applies when a $query parameter is specified.
- "sample=N"
- Return only co-occurrences from the first N
fragments after skip selected by the
cts:query.
This option does not affect the number of fragments selected
by the cts:query to calculate frequencies.
Only applies when a $query parameter is specified.
- "truncate=N"
- Include only co-occurrences from the first N
fragments after skip selected by the
cts:query.
This option affects the number of fragments selected
by the cts:query to calculate frequencies.
Only applies when a $query parameter is specified.
- "score-logtfidf"
- Compute scores using the logtfidf method.
- "score-logtf"
- Compute scores using the logtf method.
- "score-simple"
- Compute scores using the simple method.
- "score-random"
- Compute scores using the random method.
- "checked"
- Word positions should be checked when resolving the query.
- "unchecked"
- Word positions should not be checked when resolving the query.
- "concurrent"
- Perform the work concurrently in another thread. This is a hint
to the query optimizer to help parallelize the lexicon work, allowing
the calling query to continue performing other work while the lexicon
processing occurs. This is especially useful in cases where multiple
lexicon calls occur in the same query (for example, resolving many
facets in a single query).
- "map"
- Return results as a single map:map value instead of as an
element(cts:co-occurrence)* sequence.
|
$query
(optional):
Only include co-occurrences in fragments selected by the cts:query,
and compute frequencies from this set of included co-occurrences.
The co-occurrences do not need to match the query, but they must occur in
fragments selected by the query.
The fragments are not filtered to ensure they match the query,
but instead selected in the same manner as
"unfiltered" cts:search operations.
|
$quality-weight
(optional):
A document quality weight to use when computing scores.
The default is 1.0.
|
$forest-ids
(optional):
A sequence of IDs of forests to which the search will be constrained.
An empty sequence means to search all forests in the database.
The default is ().
|
|
Usage Notes:
Only one of "frequency-order" or "item-order" may be specified
in the options parameter. If neither "frequency-order" nor "item-order"
is specified, then the default is "item-order".
Only one of "fragment-frequency" or "item-frequency" may be specified
in the options parameter. If neither "fragment-frequency" nor
"item-frequency" is specified, then the default is "fragment-frequency".
Only one of "ascending" or "descending" may be specified
in the options parameter. If neither "ascending" nor "descending"
is specified, then the default is "ascending" if "item-order" is
specified, and "descending" if "frequency-order" is specified.
Only one of "any", "document", "properties", or "locks"
may be specified in the options parameter.
If none of "any", "document", "properties", or "locks" are specified
and there is a $query parameter, then the default is "document".
If there is no $query parameter then the default is "any".
Only one of the "score-logtfidf", "score-logtf", "score-simple",
or "score-random" options may be specified in the options parameter.
If none of "score-logtfidf", "score-logtf", "score-simple", or
"score-random" are specified, then the default is "score-logtfidf".
Only one of the "checked" or "unchecked" options may be specified
in the options parameter.
If neither "checked" nor "unchecked" is specified,
then the default is "checked".
If "collation=URI" is not specified in the options parameter,
then the default collation is used. If a lexicon with that collation
does not exist, an error is thrown.
If "coordinate-system=name" is not specified in the options
parameter, then the default coordinate system is used. If a lexicon with
that coordinate system does not exist, an error is thrown.
If "sample=N" is not specified in the options parameter,
then all included co-occurrences may be returned.
If a $query parameter
is not present, then "sample=N" has no effect.
If "truncate=N" is not specified in the options parameter,
then co-occurrences from all fragments selected by the
$query parameter are included.
If a $query parameter is not present, then
"truncate=N" has no effect.
|
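Example:
The following is a sketch under stated assumptions, not an example from
the original reference. It assumes an xs:int attribute range index on
the hypothetical "sku" attribute of "entry" elements, and an
element-attribute-pair geospatial index on a hypothetical "location"
element with "lat" and "long" attributes. Because the default format
when both child QNames are given is "element-pair", the
"geospatial-format" option is set explicitly:
(: hypothetical names; requires the indexes described above :)
cts:element-attribute-value-geospatial-co-occurrences(
  xs:QName("entry"), xs:QName("sku"),
  xs:QName("location"), xs:QName("lat"), xs:QName("long"),
  ("geospatial-format=element-attribute-pair", "type=int", "limit=10"))
|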
|
|
|
cts:element-attribute-value-match(
|
|
$element-names as xs:QName*,
|
|
$attribute-names as xs:QName*,
|
|
$pattern as xs:anyAtomicType,
|
|
[$options as xs:string*],
|
|
[$query as cts:query?],
|
|
[$quality-weight as xs:double?],
|
|
[$forest-ids as xs:unsignedLong*]
|
| ) as xs:anyAtomicType* |
|
 |
Summary:
Returns values from the specified element-attribute value lexicon(s)
that match the specified pattern. Element-attribute value lexicons are
implemented using range indexes; consequently this function requires an
attribute range index for each of the element/attribute pairs specified
in the function. If there is not a range index configured for each of the
specified element/attribute pairs, then an exception is thrown.
|
Parameters:
$element-names
:
One or more element QNames.
|
$attribute-names
:
One or more attribute QNames.
|
$pattern
:
A pattern to match. The parameter type must match the lexicon type.
String parameters may include wildcard characters.
|
$options
(optional):
Options. The default is ().
Options include:
- "case-sensitive"
- A case-sensitive match.
- "case-insensitive"
- A case-insensitive match.
- "diacritic-sensitive"
- A diacritic-sensitive match.
- "diacritic-insensitive"
- A diacritic-insensitive match.
- "ascending"
- Values should be returned in ascending order.
- "descending"
- Values should be returned in descending order.
- "any"
- Values from any fragment should be included.
- "document"
- Values from document fragments should be included.
- "properties"
- Values from properties fragments should be included.
- "locks"
- Values from locks fragments should be included.
- "frequency-order"
- Values should be returned ordered by frequency.
- "item-order"
- Values should be returned ordered by item.
- "fragment-frequency"
- Frequency should be the number of fragments with
an included value.
This option is used with
cts:frequency.
- "item-frequency"
- Frequency should be the number of occurrences of
an included value.
This option is used with
cts:frequency.
- "type=type"
- Use the lexicon with the type specified by type
(int, unsignedInt, long, unsignedLong, float, double, decimal,
dateTime, time, date, gYearMonth, gYear, gMonth, gDay,
yearMonthDuration, dayTimeDuration, string, or anyURI)
- "collation=URI"
- Use the range index with the collation specified by
URI.
- "timezone=TZ"
- Return timezone sensitive values (dateTime, time, date,
gYearMonth, gYear, gMonth, and gDay) adjusted to the timezone
specified by TZ.
Example timezones: Z, -08:00, +01:00.
- "limit=N"
- Return no more than N values.
- "skip=N"
- Skip over fragments selected by the
cts:query
to treat the Nth fragment as the first fragment.
Values from skipped fragments are not included.
This option affects the number of fragments selected
by the cts:query to calculate frequencies.
Only applies when a $query parameter is specified.
- "sample=N"
- Return only values from the first N
fragments after skip selected by the
cts:query.
This option does not affect the number of fragments selected
by the cts:query to calculate frequencies.
Only applies when a $query parameter is specified.
- "truncate=N"
- Include only values from the first N fragments
after skip selected by the
cts:query.
This option also affects the number of fragments selected
by the cts:query to calculate frequencies.
Only applies when a $query parameter is specified.
- "score-logtfidf"
- Compute scores using the logtfidf method.
- "score-logtf"
- Compute scores using the logtf method.
- "score-simple"
- Compute scores using the simple method.
- "score-random"
- Compute scores using the random method.
- "checked"
- Word positions should be checked when resolving the query.
- "unchecked"
- Word positions should not be checked when resolving the query.
- "concurrent"
- Perform the work concurrently in another thread. This is a hint
to the query optimizer to help parallelize the lexicon work, allowing
the calling query to continue performing other work while the lexicon
processing occurs. This is especially useful in cases where multiple
lexicon calls occur in the same query (for example, resolving many
facets in a single query).
- "map"
- Return results as a single map:map value instead of as an
xs:anyAtomicType* sequence.
|
$query
(optional):
Only include values in fragments selected by the cts:query,
and compute frequencies from this set of included values.
The values do not need to match the query, but they must occur in
fragments selected by the query.
The fragments are not filtered to ensure they match the query,
but instead selected in the same manner as
"unfiltered" cts:search operations. If a string
is entered, the string is treated as a cts:word-query of the
specified string.
|
$quality-weight
(optional):
A document quality weight to use when computing scores.
The default is 1.0.
|
$forest-ids
(optional):
A sequence of IDs of forests to which the search will be constrained.
An empty sequence means to search all forests in the database.
The default is ().
|
|
Usage Notes:
Only one of "frequency-order" or "item-order" may be specified
in the options parameter. If neither "frequency-order" nor "item-order"
is specified, then the default is "item-order".
Only one of "fragment-frequency" or "item-frequency" may be specified
in the options parameter. If neither "fragment-frequency" nor
"item-frequency" is specified, then the default is "fragment-frequency".
Only one of "ascending" or "descending" may be specified
in the options parameter. If neither "ascending" nor "descending"
is specified, then the default is "ascending" if "item-order" is
specified, and "descending" if "frequency-order" is specified.
Only one of "any", "document", "properties", or "locks"
may be specified in the options parameter.
If none of "any", "document", "properties", or "locks" are specified
and there is a $query parameter, then the default is "document".
If there is no $query parameter then the default is "any".
Only one of the "score-logtfidf", "score-logtf", "score-simple",
or "score-random" options may be specified in the options parameter.
If none of "score-logtfidf", "score-logtf", "score-simple", or
"score-random" are specified, then the default is "score-logtfidf".
Only one of the "checked" or "unchecked" options may be specified
in the options parameter.
If neither "checked" nor "unchecked" are specified,
then the default is "checked".
If "collation=URI" is not specified in the options parameter,
then the default collation is used. If a range index with that collation
does not exist, an error is thrown.
If "sample=N" is not specified in the options parameter,
then all included values may be returned. If a $query parameter
is not present, then "sample=N" has no effect.
If "truncate=N" is not specified in the options parameter,
then values from all fragments selected by the $query parameter
are included. If a $query parameter is not present, then
"truncate=N" has no effect.
If neither "case-sensitive" nor "case-insensitive"
is present, $pattern is used to determine case sensitivity.
If $pattern contains no uppercase, it specifies "case-insensitive".
If $pattern contains uppercase, it specifies "case-sensitive".
If neither "diacritic-sensitive" nor "diacritic-insensitive"
is present, $pattern is used to determine diacritic sensitivity.
If $pattern contains no diacritics, it specifies "diacritic-insensitive".
If $pattern contains diacritics, it specifies "diacritic-sensitive".
When multiple element and/or attribute QNames are specified,
then all possible element/attribute QName combinations are used
to select the matching values.
|
Example:
cts:element-attribute-value-match(xs:QName("animals"),
xs:QName("name"),"aardvark*")
=> ("aardvark","aardvarks")
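The case-sensitivity defaults described in the usage notes can be overridden explicitly. The following is a minimal sketch, assuming the same attribute range index on animals/@name as the example above; the pattern and the "limit=10" value are illustrative only. Because the pattern contains an uppercase letter, it would default to "case-sensitive", so "case-insensitive" is passed to override that default.

```xquery
xquery version "1.0-ml";

(: Assumption: an attribute range index exists on animals/@name. :)
(: The pattern contains uppercase, so the default would be       :)
(: "case-sensitive"; the explicit option overrides that default. :)
cts:element-attribute-value-match(
  xs:QName("animals"),
  xs:QName("name"),
  "Aardvark*",
  ("case-insensitive", "limit=10"))
```

The values actually returned depend on the lexicon contents and on the collation of the range index.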
|
|
|
|
cts:element-attribute-value-query(
|
|
$element-name as xs:QName*,
|
|
$attribute-name as xs:QName*,
|
|
$text as xs:string*,
|
|
[$options as xs:string*],
|
|
[$weight as xs:double?]
|
| ) as cts:element-attribute-value-query |
|
 |
Summary:
Returns a query matching elements by name that have attributes by name
whose value equals a given phrase.
|
Parameters:
$element-name
:
One or more element QNames to match.
When multiple QNames are specified,
the query matches if any QName matches.
|
$attribute-name
:
One or more attribute QNames to match.
When multiple QNames are specified,
the query matches if any QName matches.
|
$text
:
One or more attribute values to match.
When multiple strings are specified,
the query matches if any string matches.
|
$options
(optional):
Options to this query. The default is ().
Options include:
- "case-sensitive"
- A case-sensitive query.
- "case-insensitive"
- A case-insensitive query.
- "diacritic-sensitive"
- A diacritic-sensitive query.
- "diacritic-insensitive"
- A diacritic-insensitive query.
- "punctuation-sensitive"
- A punctuation-sensitive query.
- "punctuation-insensitive"
- A punctuation-insensitive query.
- "whitespace-sensitive"
- A whitespace-sensitive query.
- "whitespace-insensitive"
- A whitespace-insensitive query.
- "stemmed"
- A stemmed query.
- "unstemmed"
- An unstemmed query.
- "wildcarded"
- A wildcarded query.
- "unwildcarded"
- An unwildcarded query.
- "exact"
- An exact match query. Shorthand for "case-sensitive",
"diacritic-sensitive", "punctuation-sensitive",
"whitespace-sensitive", "unstemmed", and "unwildcarded".
- "lang=iso639code"
- Specifies the language of the query. The iso639code
portion is case-insensitive, and uses the languages
specified by
ISO 639.
The default is specified in the database configuration.
- "min-occurs=number"
- Specifies the minimum number of occurrences required. If
fewer than this number of words occur, the fragment does not match.
The default is 1.
- "max-occurs=number"
- Specifies the maximum number of occurrences allowed. If
more than this number of words occur, the fragment does not match.
The default is unbounded.
- "synonym"
- Specifies that all of the terms in the $text parameter are
considered synonyms for scoring purposes. The result is that
occurrences of more than one of the synonyms are scored as if
there were more occurrences of the same term (as opposed to
having a separate term that contributes to score).
|
$weight
(optional):
A weight for this query.
Higher weights move search results up in the relevance
order. The default is 1.0. The
absolute value of the weight should be at most 16 (that is, the weight
should be between -16 and 16); weights greater than 16 will have the same
effect as a weight of 16.
Weights with an absolute value less than 0.0625 (between -0.0625 and
0.0625) are rounded to 0, which means that they do not contribute to the
score.
|
|
Usage Notes:
If neither "case-sensitive" nor "case-insensitive"
is present, $text is used to determine case sensitivity.
If $text contains no uppercase, it specifies "case-insensitive".
If $text contains uppercase, it specifies "case-sensitive".
If neither "diacritic-sensitive" nor "diacritic-insensitive"
is present, $text is used to determine diacritic sensitivity.
If $text contains no diacritics, it specifies "diacritic-insensitive".
If $text contains diacritics, it specifies "diacritic-sensitive".
If neither "punctuation-sensitive" nor "punctuation-insensitive"
is present, $text is used to determine punctuation sensitivity.
If $text contains no punctuation, it specifies "punctuation-insensitive".
If $text contains punctuation, it specifies "punctuation-sensitive".
If neither "whitespace-sensitive" nor "whitespace-insensitive"
is present, the query is "whitespace-insensitive".
If neither "stemmed" nor "unstemmed"
is present, the database configuration determines stemming.
If the database has "stemmed searches" enabled, it specifies "stemmed".
Otherwise (word searches enabled and stemmed searches disabled)
it specifies "unstemmed".
If the query is a wildcard
query and is also a phrase query (contains two or more terms),
then any wildcard terms in the query will be "unstemmed".
If neither "wildcarded" nor "unwildcarded"
is present, the database configuration and $text determine wildcarding.
If the database has any wildcard indexes enabled ("three character
searches", "two character searches", "one character searches", or
"trailing wildcard searches") and if $text contains either of the
wildcard characters '?' or '*', it specifies "wildcarded".
Otherwise it specifies "unwildcarded".
Negative "min-occurs" or "max-occurs" values will be treated as 0 and
non-integral values will be rounded down. An error will be raised if
the "min-occurs" value is greater than the "max-occurs" value.
When multiple element and/or attribute QNames are specified,
then all possible element/attribute QName combinations are used
to select the matching values.
|
Example:
cts:search(//module,
cts:element-attribute-value-query(
xs:QName("function"),
xs:QName("type"),
"MarkLogic Corporation"))
=> .. relevance-ordered sequence of 'module' element
ancestors (or self) of 'function' elements that have
an attribute 'type' whose value equals 'MarkLogic
Corporation'.
|
Example:
cts:search(//module,
cts:and-query((
cts:element-attribute-value-query(
xs:QName("function"),
xs:QName("type"),
"MarkLogic Corporation",
"punctuation-insensitive", 0.5),
cts:element-word-query(
xs:QName("title"),
"faster"))))
=> .. relevance-ordered sequence of 'module' element
ancestors (or self) of both:
(a) 'function' elements with attribute 'type' whose
value equals the string 'MarkLogic Corporation',
ignoring embedded punctuation,
AND
(b) 'title' elements whose text content contains the
word 'faster', with the results from (a) given
weight 0.5, and the results from (b) given default
weight 1.0.
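The $text and $options parameters each accept a sequence, so several alternative values and several options can be combined in one query. The following is a minimal sketch, assuming the same 'module' and 'function' documents as the examples above; the second attribute value and the weight of 2.0 are hypothetical and chosen only for illustration.

```xquery
xquery version "1.0-ml";

(: Matches 'function' elements whose 'type' attribute equals either :)
(: string. Both sensitivity options are stated explicitly instead   :)
(: of being inferred from $text, and the weight is raised to 2.0.   :)
cts:search(//module,
  cts:element-attribute-value-query(
    xs:QName("function"),
    xs:QName("type"),
    ("MarkLogic Corporation", "MarkLogic Corp."),
    ("case-insensitive", "punctuation-insensitive"),
    2.0))
```

Stating the options explicitly matters here because the second string contains both uppercase and punctuation, which would otherwise make the query case-sensitive and punctuation-sensitive.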
|
|
|
|
cts:element-attribute-value-ranges(
|
|
$element-names as xs:QName*,
|
|
$attribute-names as xs:QName*,
|
|
[$bounds as xs:anyAtomicType*],
|
|
[$options as xs:string*],
|
|
[$query as cts:query?],
|
|
[$quality-weight as xs:double?],
|
|
[$forest-ids as xs:unsignedLong*]
|
| ) as element(cts:range)* |
|
 |
Summary:
Returns value ranges from the specified element-attribute value lexicon(s).
Element-attribute value lexicons are implemented using indexes;
consequently this function requires an attribute range index
for each of the element/attribute pairs specified in the function.
If there is not a range index configured for each of the specified
element/attribute pairs, then an exception is thrown.
The values are divided into buckets. The $bounds parameter specifies
the number of buckets and the size of each bucket.
All included values are bucketed, even those less than the lowest bound
or greater than the highest bound. An empty sequence for $bounds specifies
one bucket, a single value specifies two buckets, two values specify
three buckets, and so on.
If you have string values and you pass a $bounds parameter
as in the following call:
cts:element-attribute-value-ranges(xs:QName("myElement"), xs:QName("myAttribute"), ("f", "m"))
The first bucket contains string values that are less than the
string f, the second bucket contains string values greater than
or equal to f but less than m, and the third bucket
contains string values that are greater than or equal to m.
For each non-empty bucket, a cts:range element is returned.
Each cts:range element has a cts:minimum child
and a cts:maximum child. If a bucket is bounded, its
cts:range element will also have a
cts:lower-bound child if it is bounded from below, and
a cts:upper-bound child if it is bounded from above.
Empty buckets return nothing unless the "empties" option is specified.
|
Parameters:
$element-names
:
One or more element QNames.
|
$attribute-names
:
One or more attribute QNames.
|
$bounds
(optional):
A sequence of range bounds.
The types must match the lexicon type.
The values must be in strictly ascending order.
|
$options
(optional):
Options. The default is ().
Options include:
- "ascending"
- Ranges should be returned in ascending order.
- "descending"
- Ranges should be returned in descending order.
- "empties"
- Include fully-bounded ranges whose frequency is 0. These ranges
will have no minimum or maximum value. Only empty ranges that have
both their upper and lower bounds specified in the $bounds
parameter are returned;
any empty ranges that are less than the first bound or greater than the
last bound are not returned. For example, if you specify 4 bounds
and there are no results for any of the bounds, 3 elements are
returned (not 5 elements).
- "any"
- Values from any fragment should be included.
- "document"
- Values from document fragments should be included.
- "properties"
- Values from properties fragments should be included.
- "locks"
- Values from locks fragments should be included.
- "frequency-order"
- Ranges should be returned ordered by frequency.
- "item-order"
- Ranges should be returned ordered by item.
- "fragment-frequency"
- Frequency should be the number of fragments with
an included value.
This option is used with
cts:frequency.
- "item-frequency"
- Frequency should be the number of occurrences of
an included value.
This option is used with
cts:frequency.
- "type=type"
- Use the lexicon with the type specified by type
(int, unsignedInt, long, unsignedLong, float, double, decimal,
dateTime, time, date, gYearMonth, gYear, gMonth, gDay,
yearMonthDuration, dayTimeDuration, string, or anyURI)
- "collation=URI"
- Use the range index with the collation specified by
URI.
- "timezone=TZ"
- Return timezone sensitive values (dateTime, time, date,
gYearMonth, gYear, gMonth, and gDay) adjusted to the timezone
specified by TZ.
Example timezones: Z, -08:00, +01:00.
- "limit=N"
- Return no more than N ranges.
- "skip=N"
- Skip over fragments selected by the
cts:query
to treat the Nth fragment as the first fragment.
Values from skipped fragments are not included.
This option affects the number of fragments selected
by the cts:query to calculate frequencies.
Only applies when a $query parameter is specified.
- "sample=N"
- Return only ranges for buckets with at least one value from
the first N fragments after skip selected by the
cts:query.
This option does not affect the number of fragments selected
by the cts:query to calculate frequencies.
Only applies when a $query parameter is specified.
- "truncate=N"
- Include only values from the first N fragments
after skip selected by the
cts:query.
This option also affects the number of fragments selected
by the cts:query to calculate frequencies.
Only applies when a $query parameter is specified.
- "score-logtfidf"
- Compute scores using the logtfidf method.
- "score-logtf"
- Compute scores using the logtf method.
- "score-simple"
- Compute scores using the simple method.
- "score-random"
- Compute scores using the random method.
- "checked"
- Word positions should be checked when resolving the query.
- "unchecked"
- Word positions should not be checked when resolving the query.
- "concurrent"
- Perform the work concurrently in another thread. This is a hint
to the query optimizer to help parallelize the lexicon work, allowing
the calling query to continue performing other work while the lexicon
processing occurs. This is especially useful in cases where multiple
lexicon calls occur in the same query (for example, resolving many
facets in a single query).
|
$query
(optional):
Only include values in fragments selected by the cts:query,
and compute frequencies from this set of included values.
The values do not need to match the query, but they must occur in
fragments selected by the query.
The fragments are not filtered to ensure they match the query,
but instead selected in the same manner as
"unfiltered" cts:search operations. If a string
is entered, the string is treated as a cts:word-query of the
specified string.
|
$quality-weight
(optional):
A document quality weight to use when computing scores.
The default is 1.0.
|
$forest-ids
(optional):
A sequence of IDs of forests to which the search will be constrained.
An empty sequence means to search all forests in the database.
The default is ().
|
|
Usage Notes:
Only one of "frequency-order" or "item-order" may be specified
in the options parameter. If neither "frequency-order" nor "item-order"
is specified, then the default is "item-order".
Only one of "fragment-frequency" or "item-frequency" may be specified
in the options parameter. If neither "fragment-frequency" nor
"item-frequency" is specified, then the default is "fragment-frequency".
Only one of "ascending" or "descending" may be specified
in the options parameter. If neither "ascending" nor "descending"
is specified, then the default is "ascending" if "item-order" is
specified, and "descending" if "frequency-order" is specified.
Only one of "any", "document", "properties", or "locks"
may be specified in the options parameter.
If none of "any", "document", "properties", or "locks" are specified
and there is a $query parameter, then the default is "document".
If there is no $query parameter then the default is "any".
Only one of the "score-logtfidf", "score-logtf", "score-simple",
or "score-random" options may be specified in the options parameter.
If none of "score-logtfidf", "score-logtf", "score-simple", or
"score-random" are specified, then the default is "score-logtfidf".
Only one of the "checked" or "unchecked" options may be specified
in the options parameter.
If neither "checked" nor "unchecked" are specified,
then the default is "checked".
If "collation=URI" is not specified in the options parameter,
then the default collation is used. If a range index with that collation
does not exist, an error is thrown.
If "sample=N" is not specified in the options parameter,
then ranges with all included values may be returned. If a
$query parameter is not present, then "sample=N"
has no effect.
If "truncate=N" is not specified in the options parameter,
then values from all fragments selected by the $query parameter
are included. If a $query parameter is not present, then
"truncate=N" has no effect.
When multiple element and/or attribute QNames are specified,
then all possible element/attribute QName combinations are used
to select the matching values.
|
Example:
(: Run the following to load data for this example.
Make sure you have an int element attribute
range index on my-node/@number. :)
for $x in (1 to 10)
return
xdmp:document-insert(fn:concat("/doc", fn:string($x), ".xml"),
<root><my-node number="{$x}"/></root>) ;
(: The following is based on the above setup :)
cts:element-attribute-value-ranges(xs:QName("my-node"),
xs:QName("number"), (5, 10, 15, 20), "empties")
=>
<cts:range xmlns:cts="http://marklogic.com/cts"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:xs="http://www.w3.org/2001/XMLSchema">
<cts:minimum xsi:type="xs:int">1</cts:minimum>
<cts:maximum xsi:type="xs:int">4</cts:maximum>
<cts:upper-bound xsi:type="xs:int">5</cts:upper-bound>
</cts:range>
<cts:range xmlns:cts="http://marklogic.com/cts"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:xs="http://www.w3.org/2001/XMLSchema">
<cts:minimum xsi:type="xs:int">5</cts:minimum>
<cts:maximum xsi:type="xs:int">9</cts:maximum>
<cts:lower-bound xsi:type="xs:int">5</cts:lower-bound>
<cts:upper-bound xsi:type="xs:int">10</cts:upper-bound>
</cts:range>
<cts:range xmlns:cts="http://marklogic.com/cts"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:xs="http://www.w3.org/2001/XMLSchema">
<cts:minimum xsi:type="xs:int">10</cts:minimum>
<cts:maximum xsi:type="xs:int">10</cts:maximum>
<cts:lower-bound xsi:type="xs:int">10</cts:lower-bound>
<cts:upper-bound xsi:type="xs:int">15</cts:upper-bound>
</cts:range>
<cts:range xmlns:cts="http://marklogic.com/cts"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:xs="http://www.w3.org/2001/XMLSchema">
<cts:lower-bound xsi:type="xs:int">15</cts:lower-bound>
<cts:upper-bound xsi:type="xs:int">20</cts:upper-bound>
</cts:range>
<cts:range xmlns:cts="http://marklogic.com/cts"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:xs="http://www.w3.org/2001/XMLSchema">
<cts:lower-bound xsi:type="xs:int">20</cts:lower-bound>
</cts:range>
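Each returned cts:range element can be passed to cts:frequency to obtain the count for its bucket. The following is a minimal sketch, assuming the same documents and the same int range index on my-node/@number as the example above; the two bounds and the label text are illustrative only.

```xquery
xquery version "1.0-ml";

(: Two bounds give three buckets: below 5, 5 up to 10, and 10 or above. :)
for $range in cts:element-attribute-value-ranges(
    xs:QName("my-node"),
    xs:QName("number"),
    (5, 10),
    "fragment-frequency")
return
  fn:concat(
    (: A bucket with no lower or upper bound is labelled "unbounded". :)
    fn:string(($range/cts:lower-bound, "unbounded")[1]),
    " to ",
    fn:string(($range/cts:upper-bound, "unbounded")[1]),
    ": ",
    fn:string(cts:frequency($range)))
```

With "fragment-frequency" the count is the number of fragments that contain a value in the bucket; "item-frequency" would count occurrences instead.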
|
|
|
|
cts:element-attribute-values(
|
|
$element-names as xs:QName*,
|
|
$attribute-names as xs:QName*,
|
|
[$start as xs:anyAtomicType?],
|
|
[$options as xs:string*],
|
|
[$query as cts:query?],
|
|
[$quality-weight as xs:double?],
|
|
[$forest-ids as xs:unsignedLong*]
|
| ) as xs:anyAtomicType* |
|
 |
Summary:
Returns values from the specified element-attribute value lexicon(s).
Element-attribute value lexicons are implemented using indexes;
consequently this function requires an attribute range index
for each of the element/attribute pairs specified in the function.
If there is not a range index configured for each of the specified
element/attribute pairs, then an exception is thrown.
|
Parameters:
$element-names
:
One or more element QNames.
|
$attribute-names
:
One or more attribute QNames.
|
$start
(optional):
A starting value. The parameter type must match the lexicon type.
If the parameter value is not in the lexicon, then the values are
returned beginning with the next value.
|
$options
(optional):
Options. The default is ().
Options include:
- "ascending"
- Values should be returned in ascending order.
- "descending"
- Values should be returned in descending order.
- "any"
- Values from any fragment should be included.
- "document"
- Values from document fragments should be included.
- "properties"
- Values from properties fragments should be included.
- "locks"
- Values from locks fragments should be included.
- "frequency-order"
- Values should be returned ordered by frequency.
- "item-order"
- Values should be returned ordered by item.
- "fragment-frequency"
- Frequency should be the number of fragments with
an included value.
This option is used with
cts:frequency.
- "item-frequency"
- Frequency should be the number of occurrences of
an included value.
This option is used with
cts:frequency.
- "type=type"
- Use the lexicon with the type specified by type
(int, unsignedInt, long, unsignedLong, float, double, decimal,
dateTime, time, date, gYearMonth, gYear, gMonth, gDay,
yearMonthDuration, dayTimeDuration, string, or anyURI)
- "collation=URI"
- Use the range index with the collation specified by
URI.
- "timezone=TZ"
- Return timezone sensitive values (dateTime, time, date,
gYearMonth, gYear, gMonth, and gDay) adjusted to the timezone
specified by TZ.
Example timezones: Z, -08:00, +01:00.
- "limit=N"
- Return no more than N values.
- "skip=N"
- Skip over fragments selected by the
cts:query
to treat the Nth fragment as the first fragment.
Values from skipped fragments are not included.
This option affects the number of fragments selected
by the cts:query to calculate frequencies.
Only applies when a $query parameter is specified.
- "sample=N"
- Return only values from the first N fragments
after skip selected by the
cts:query.
This option does not affect the number of fragments selected
by the cts:query to calculate frequencies.
Only applies when a $query parameter is specified.
- "truncate=N"
- Include only values from the first N fragments
after skip selected by the
cts:query.
This option also affects the number of fragments selected
by the cts:query to calculate frequencies.
Only applies when a $query parameter is specified.
- "score-logtfidf"
- Compute scores using the logtfidf method.
- "score-logtf"
- Compute scores using the logtf method.
- "score-simple"
- Compute scores using the simple method.
- "score-random"
- Compute scores using the random method.
- "checked"
- Word positions should be checked when resolving the query.
- "unchecked"
- Word positions should not be checked when resolving the query.
- "concurrent"
- Perform the work concurrently in another thread. This is a hint
to the query optimizer to help parallelize the lexicon work, allowing
the calling query to continue performing other work while the lexicon
processing occurs. This is especially useful in cases where multiple
lexicon calls occur in the same query (for example, resolving many
facets in a single query).
- "map"
- Return results as a single map:map value instead of as an
xs:anyAtomicType* sequence.
|
$query
(optional):
Only include values in fragments selected by the cts:query,
and compute frequencies from this set of included values.
The values do not need to match the query, but they must occur in
fragments selected by the query.
The fragments are not filtered to ensure they match the query,
but instead selected in the same manner as
"unfiltered" cts:search operations. If a string
is entered, the string is treated as a cts:word-query of the
specified string.
|
$quality-weight
(optional):
A document quality weight to use when computing scores.
The default is 1.0.
|
$forest-ids
(optional):
A sequence of IDs of forests to which the search will be constrained.
An empty sequence means to search all forests in the database.
The default is ().
|
|
Usage Notes:
Only one of "frequency-order" or "item-order" may be specified
in the options parameter. If neither "frequency-order" nor "item-order"
is specified, then the default is "item-order".
Only one of "fragment-frequency" or "item-frequency" may be specified
in the options parameter. If neither "fragment-frequency" nor
"item-frequency" is specified, then the default is "fragment-frequency".
Only one of "ascending" or "descending" may be specified
in the options parameter. If neither "ascending" nor "descending"
is specified, then the default is "ascending" if "item-order" is
specified, and "descending" if "frequency-order" is specified.
Only one of "any", "document", "properties", or "locks"
may be specified in the options parameter.
If none of "any", "document", "properties", or "locks" are specified
and there is a $query parameter, then the default is "document".
If there is no $query parameter then the default is "any".
Only one of the "score-logtfidf", "score-logtf", "score-simple",
or "score-random" options may be specified in the options parameter.
If none of "score-logtfidf", "score-logtf", "score-simple", or
"score-random" are specified, then the default is "score-logtfidf".
Only one of the "checked" or "unchecked" options may be specified
in the options parameter.
If neither "checked" nor "unchecked" are specified,
then the default is "checked".
If "collation=URI" is not specified in the options parameter,
then the default collation is used. If a range index with that collation
does not exist, an error is thrown.
If "sample=N" is not specified in the options parameter,
then all included values may be returned. If a $query parameter
is not present, then "sample=N" has no effect.
If "truncate=N" is not specified in the options parameter,
then values from all fragments selected by the $query parameter
are included. If a $query parameter is not present, then
"truncate=N" has no effect.
When multiple element and/or attribute QNames are specified,
then all possible element/attribute QName combinations are used
to select the matching values.
|
Example:
cts:element-attribute-values(xs:QName("animal"),
xs:QName("name"),
"aardvark")
=> ("aardvark","aardvarks","aardwolf",...)
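The ordering and frequency options can be combined with cts:frequency to list the most common values. The following is a minimal sketch, assuming the same attribute range index on animal/@name as the example above; the "limit=5" value and the output formatting are illustrative only.

```xquery
xquery version "1.0-ml";

(: An empty $start means values are returned from the beginning.   :)
(: "frequency-order" already defaults to "descending"; the option  :)
(: is spelled out here only to make the ordering explicit.         :)
for $name in cts:element-attribute-values(
    xs:QName("animal"),
    xs:QName("name"),
    (),
    ("frequency-order", "descending", "limit=5"))
return
  fn:concat(fn:string($name), " (", fn:string(cts:frequency($name)), ")")
```

Because neither "fragment-frequency" nor "item-frequency" is given, the counts use the default, "fragment-frequency".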
|
|
|
|
cts:element-attribute-word-match(
|
|
$element-names as xs:QName*,
|
|
$attribute-names as xs:QName*,
|
|
$pattern as xs:string,
|
|
[$options as xs:string*],
|
|
[$query as cts:query?],
|
|
[$quality-weight as xs:double?],
|
|
[$forest-ids as xs:unsignedLong*]
|
| ) as xs:string* |
|
 |
Summary:
Returns words from the specified element-attribute word lexicon(s) that
match a wildcard pattern. This function requires an element-attribute
word lexicon for each of the element/attribute pairs specified in the
function. If an element-attribute word lexicon is not
configured for every one of the specified element/attribute pairs, then
an exception is thrown.
|
Parameters:
$element-names
:
One or more element QNames.
|
$attribute-names
:
One or more attribute QNames.
|
$pattern
:
Wildcard pattern to match.
|
$options
(optional):
Options. The default is ().
Options include:
- "case-sensitive"
- A case-sensitive match.
- "case-insensitive"
- A case-insensitive match.
- "diacritic-sensitive"
- A diacritic-sensitive match.
- "diacritic-insensitive"
- A diacritic-insensitive match.
- "ascending"
- Words should be returned in ascending order.
- "descending"
- Words should be returned in descending order.
- "any"
- Words from any fragment should be included.
- "document"
- Words from document fragments should be included.
- "properties"
- Words from properties fragments should be included.
- "locks"
- Words from locks fragments should be included.
- "collation=URI"
- Use the lexicon with the collation specified by URI.
- "limit=N"
- Return no more than N words.
- "skip=N"
- Skip over fragments selected by the
cts:query
to treat the Nth fragment as the first fragment.
Words from skipped fragments are not included.
Only applies when a $query parameter is specified.
- "sample=N"
- Return only words from the first N fragments
after skip selected by the
cts:query.
Only applies when a $query parameter is specified.
- "truncate=N"
- Include only words from the first N fragments
after skip selected by the
cts:query.
Only applies when a $query parameter is specified.
- "score-logtfidf"
- Compute scores using the logtfidf method.
- "score-logtf"
- Compute scores using the logtf method.
- "score-simple"
- Compute scores using the simple method.
- "score-random"
- Compute scores using the random method.
- "checked"
- Word positions should be checked when resolving the query.
- "unchecked"
- Word positions should not be checked when resolving the query.
- "concurrent"
- Perform the work concurrently in another thread. This is a hint
to the query optimizer to help parallelize the lexicon work, allowing
the calling query to continue performing other work while the lexicon
processing occurs. This is especially useful in cases where multiple
lexicon calls occur in the same query (for example, resolving many
facets in a single query).
- "map"
- Return results as a single map:map value instead of as an
xs:string* sequence.
|
$query
(optional):
Only include words in fragments selected by the cts:query.
The words do not need to match the query, but the words must occur
in fragments selected by the query.
The fragments are not filtered to ensure they match the query,
but instead selected in the same manner as
"unfiltered" cts:search operations. If a string
is entered, the string is treated as a cts:word-query of the
specified string.
|
$quality-weight
(optional):
A document quality weight to use when computing scores.
The default is 1.0.
|
$forest-ids
(optional):
A sequence of IDs of forests to which the search will be constrained.
An empty sequence means to search all forests in the database.
The default is ().
|
|
Usage Notes:
Only one of "ascending" or "descending" may be specified
in the options parameter. If neither "ascending" nor "descending"
is specified, then the default is "ascending".
Only one of "any", "document", "properties", or "locks"
may be specified in the options parameter.
If none of "any", "document", "properties", or "locks" are specified
and there is a $query parameter, then the default is "document".
If there is no $query parameter then the default is "any".
Only one of the "score-logtfidf", "score-logtf", "score-simple",
or "score-random" options may be specified in the options parameter.
If none of "score-logtfidf", "score-logtf", "score-simple", or
"score-random" are specified, then the default is "score-logtfidf".
Only one of the "checked" or "unchecked" options may be specified
in the options parameter.
If neither "checked" nor "unchecked" are specified,
then the default is "checked".
If "collation=URI" is not specified in the options parameter,
then the default collation is used. If a lexicon with that collation
does not exist, an error is thrown.
If "sample=N" is not specified in the options parameter,
then all included words may be returned. If a $query parameter
is not present, then "sample=N" has no effect.
If "truncate=N" is not specified in the options parameter,
then words from all fragments selected by the $query parameter
are included. If a $query parameter is not present, then
"truncate=N" has no effect.
If neither "case-sensitive" nor "case-insensitive"
is present, $pattern is used to determine case sensitivity.
If $pattern contains no uppercase, it specifies "case-insensitive".
If $pattern contains uppercase, it specifies "case-sensitive".
If neither "diacritic-sensitive" nor "diacritic-insensitive"
is present, $pattern is used to determine diacritic sensitivity.
If $pattern contains no diacritics, it specifies "diacritic-insensitive".
If $pattern contains diacritics, it specifies "diacritic-sensitive".
When multiple element and/or attribute QNames are specified,
then all possible element/attribute QName combinations are used
to select the matching values.
|
Example:
cts:element-attribute-word-match(xs:QName("animals"),
xs:QName("name"),"aardvark*")
=> ("aardvark","aardvarks")
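The optional $query parameter restricts the match to words that occur in fragments selected by a query. The following is a minimal sketch, assuming an element-attribute word lexicon on animals/@name; the pattern, the "limit=20" value, and the constraining word "nocturnal" are hypothetical.

```xquery
xquery version "1.0-ml";

(: Only words from fragments selected by the word query are returned. :)
(: The fragments are selected unfiltered, as described for $query.    :)
cts:element-attribute-word-match(
  xs:QName("animals"),
  xs:QName("name"),
  "aard*",
  ("case-insensitive", "limit=20"),
  cts:word-query("nocturnal"))
```

Since a $query parameter is present and no fragment-type option is given, the default fragment type is "document".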
|
|
|
|
cts:element-attribute-word-query(
|
|
$element-name as xs:QName*,
|
|
$attribute-name as xs:QName*,
|
|
$text as xs:string*,
|
|
[$options as xs:string*],
|
|
[$weight as xs:double?]
|
| ) as cts:element-attribute-word-query |
|
 |
Summary:
Returns a query matching elements by name
that have attributes by name
whose value contains a given phrase.
|
Parameters:
$element-name
:
One or more element QNames to match.
When multiple QNames are specified,
the query matches if any QName matches.
|
$attribute-name
:
One or more attribute QNames to match.
When multiple QNames are specified,
the query matches if any QName matches.
|
$text
:
Some words or phrases to match.
When multiple strings are specified,
the query matches if any string matches.
|
$options
(optional):
Options to this query. The default is ().
Options include:
- "case-sensitive"
- A case-sensitive query.
- "case-insensitive"
- A case-insensitive query.
- "diacritic-sensitive"
- A diacritic-sensitive query.
- "diacritic-insensitive"
- A diacritic-insensitive query.
- "punctuation-sensitive"
- A punctuation-sensitive query.
- "punctuation-insensitive"
- A punctuation-insensitive query.
- "whitespace-sensitive"
- A whitespace-sensitive query.
- "whitespace-insensitive"
- A whitespace-insensitive query.
- "stemmed"
- A stemmed query.
- "unstemmed"
- An unstemmed query.
- "wildcarded"
- A wildcarded query.
- "unwildcarded"
- An unwildcarded query.
- "exact"
- An exact match query. Shorthand for "case-sensitive",
"diacritic-sensitive", "punctuation-sensitive",
"whitespace-sensitive", "unstemmed", and "unwildcarded".
- "lang=iso639code"
- Specifies the language of the query. The iso639code
portion is case-insensitive, and uses the languages
specified by
ISO 639.
The default is specified in the database configuration.
- "min-occurs=number"
- Specifies the minimum number of occurrences required. If
fewer than this number of words occur, the fragment does not match.
The default is 1.
- "max-occurs=number"
- Specifies the maximum number of occurrences allowed. If
more than this number of words occur, the fragment does not match.
The default is unbounded.
- "synonym"
- Specifies that all of the terms in the $text parameter are
considered synonyms for scoring purposes. The result is that
occurrences of more than one of the synonyms are scored as if
there are more occurrences of the same term (as opposed to
having a separate term that contributes to score).
|
$weight
(optional):
A weight for this query.
Higher weights move search results up in the relevance
order. The default is 1.0. The
absolute value of the weight should be at most 16 (that is, the weight
should be between -16 and 16); weights greater than 16 have the same
effect as a weight of 16.
Weights with an absolute value less than 0.0625 (between -0.0625 and
0.0625) are rounded to 0, which means that they do not contribute to the
score.
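The weight rules above can be modeled in a few lines. This is a minimal Python sketch, not MarkLogic code: `effective_weight` is a hypothetical helper, and the symmetric handling of weights below -16 is an assumption, since the text only states the behavior for weights greater than 16.

```python
def effective_weight(weight: float = 1.0) -> float:
    """Model the documented treatment of the $weight parameter."""
    # Documented: an absolute value below 0.0625 is rounded to 0,
    # so the query does not contribute to the score.
    if abs(weight) < 0.0625:
        return 0.0
    # Documented: weights greater than 16 behave like 16.
    # Assumption: weights below -16 are treated symmetrically as -16.
    return max(-16.0, min(16.0, weight))
```

For example, under these assumptions a weight of 0.5 is used as given, while a weight of 100 has the same effect as 16.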
|
|
Usage Notes:
If neither "case-sensitive" nor "case-insensitive"
is present, $text is used to determine case sensitivity.
If $text contains no uppercase, it specifies "case-insensitive".
If $text contains uppercase, it specifies "case-sensitive".
If neither "diacritic-sensitive" nor "diacritic-insensitive"
is present, $text is used to determine diacritic sensitivity.
If $text contains no diacritics, it specifies "diacritic-insensitive".
If $text contains diacritics, it specifies "diacritic-sensitive".
If neither "punctuation-sensitive" nor "punctuation-insensitive"
is present, $text is used to determine punctuation sensitivity.
If $text contains no punctuation, it specifies "punctuation-insensitive".
If $text contains punctuation, it specifies "punctuation-sensitive".
If neither "whitespace-sensitive" nor "whitespace-insensitive"
is present, the query is "whitespace-insensitive".
If neither "stemmed" nor "unstemmed"
is present, the database configuration determines stemming.
If the database has "stemmed searches" enabled, it specifies "stemmed".
Otherwise it specifies "unstemmed".
If neither "wildcarded" nor "unwildcarded"
is present, the database configuration and $text determine wildcarding.
If the database has any wildcard indexes enabled ("three character
searches", "two character searches", "one character searches", or
"trailing wildcard searches") and if $text contains either of the
wildcard characters '?' or '*', it specifies "wildcarded".
Otherwise it specifies "unwildcarded".
If neither "stemmed" nor "unstemmed"
is present, then the database configuration determines if a query
is run as "stemmed" (stemmed searches enabled) or "unstemmed"
(word searches enabled and stemmed searches disabled).
If the query is a wildcard
query and is also a phrase query (contains two or more terms),
then any wildcard terms in the query will be "unstemmed".
Negative "min-occurs" or "max-occurs" values will be treated as 0 and
non-integral values will be rounded down. An error will be raised if
the "min-occurs" value is greater than the "max-occurs" value.
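The defaulting rules in these usage notes can be sketched as follows. This is an illustrative Python model rather than MarkLogic code: `implied_sensitivity` and `normalize_occurs` are hypothetical names, and the use of Unicode categories to detect uppercase letters, diacritics, and punctuation is an assumption, since the server's own character classification is not described here.

```python
import math
import unicodedata
from typing import Dict, Optional, Tuple


def implied_sensitivity(text: str) -> Dict[str, str]:
    """Options implied by $text when none are given explicitly."""
    # Assumption: a diacritic shows up as a combining mark after NFD.
    decomposed = unicodedata.normalize("NFD", text)
    has_upper = any(ch.isupper() for ch in text)
    has_diacritic = any(unicodedata.category(ch) == "Mn" for ch in decomposed)
    # Assumption: punctuation means any Unicode "P*" category character.
    has_punct = any(unicodedata.category(ch).startswith("P") for ch in text)
    return {
        "case": "case-sensitive" if has_upper else "case-insensitive",
        "diacritic": ("diacritic-sensitive" if has_diacritic
                      else "diacritic-insensitive"),
        "punctuation": ("punctuation-sensitive" if has_punct
                        else "punctuation-insensitive"),
        # Documented: whitespace defaults to insensitive regardless of $text.
        "whitespace": "whitespace-insensitive",
    }


def normalize_occurs(min_occurs: float = 1,
                     max_occurs: Optional[float] = None
                     ) -> Tuple[int, Optional[int]]:
    """Apply the documented min-occurs/max-occurs adjustments."""
    def clean(value: float) -> int:
        # Non-integral values are rounded down; negatives become 0.
        return max(0, math.floor(value))

    low = clean(min_occurs)
    high = None if max_occurs is None else clean(max_occurs)  # None = unbounded
    if high is not None and low > high:
        raise ValueError("min-occurs is greater than max-occurs")
    return low, high
```

Note that under this model the wildcard characters '*' and '?' count as punctuation; whether the server treats them that way when inferring punctuation sensitivity is not stated in this section.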
|
Example:
cts:search(//module,
cts:element-attribute-word-query(
xs:QName("function"),
xs:QName("type"),
"MarkLogic Corporation"))
=> .. relevance-ordered sequence of 'module' element
ancestors of 'function' elements that have a 'type'
attribute whose value contains the phrase
'MarkLogic Corporation'.
|
Example:
cts:search(//module,
cts:element-attribute-word-query(
xs:QName("function"),
xs:QName("type"),
"MarkLogic Corporation", "case-insensitive"))
=> .. relevance-ordered sequence of 'module' element
ancestors of 'function' elements that have a 'type'
attribute whose value contains the phrase
'MarkLogic Corporation', or any other case-shift,
like 'MARKLOGIC CorpoRation'.
|
Example:
cts:search(//module,
cts:and-query((
cts:element-attribute-word-query(
xs:QName("function"),
xs:QName("type"),
"MarkLogic Corporation",
"punctuation-insensitive", 0.5),
cts:element-word-query(
xs:QName("title"),
"faster"))))
=> .. relevance-ordered sequence of 'module' element
ancestors of both:
(a) 'function' elements with 'type' attribute whose value
contains the phrase 'MarkLogic Corporation',
ignoring embedded punctuation,
AND
(b) 'title' elements whose text content contains the
term 'faster',
with the results of the first query given weight 0.5,
as opposed to the default 1.0 for the second query.
|
|
|
|
cts:element-attribute-words(
|
|
$element-names as xs:QName*,
|
|
$attribute-names as xs:QName*,
|
|
[$start as xs:string?],
|
|
[$options as xs:string*],
|
|
[$query as cts:query?],
|
|
[$quality-weight as xs:double?],
|
|
[$forest-ids as xs:unsignedLong*]
|
| ) as xs:string* |
|
 |
Summary:
Returns words from the specified element-attribute word lexicon(s).
This function requires an element-attribute word lexicon for each of the
element/attribute pairs specified in the function. If there is not an
element/attribute word lexicon configured for any of the specified
element/attribute pairs, then an exception is thrown. The words are
returned in collation order.
|
Parameters:
$element-names
:
One or more element QNames.
|
$attribute-names
:
One or more attribute QNames.
|
$start
(optional):
A starting word. Returns only this word and any following words
from the lexicon. If the parameter is not in the lexicon, then it
returns the words beginning with the next word.
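The $start behavior described above amounts to a lower-bound lookup in a sorted list. A minimal Python sketch, assuming the lexicon is already sorted and using Python's default string ordering as a stand-in for the lexicon's collation (`words_from` is a hypothetical helper):

```python
from bisect import bisect_left
from typing import List, Optional


def words_from(lexicon: List[str], start: Optional[str] = None) -> List[str]:
    """Return `start` and everything after it from a sorted lexicon."""
    if start is None:
        return list(lexicon)
    # bisect_left finds `start` itself when present; otherwise it finds
    # the position of the next word in sort order.
    return lexicon[bisect_left(lexicon, start):]
```

With a lexicon of "aardvark", "aardvarks", "aardwolf", and "zebra", a start of "aardvarks" begins at that word, while a start of "aardvarl" (not in the lexicon) begins at "aardwolf".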
|
$options
(optional):
Options. The default is ().
Options include:
- "ascending"
- Words should be returned in ascending order.
- "descending"
- Words should be returned in descending order.
- "any"
- Words from any fragment should be included.
- "document"
- Words from document fragments should be included.
- "properties"
- Words from properties fragments should be included.
- "locks"
- Words from locks fragments should be included.
- "collation=URI"
- Use the lexicon with the collation specified by URI.
- "limit=N"
- Return no more than N words.
- "skip=N"
- Skip over fragments selected by the
cts:query
to treat the Nth fragment as the first fragment.
Words from skipped fragments are not included.
Only applies when a $query parameter is specified.
- "sample=N"
- Return only words from the first N fragments after
skip selected by the
cts:query.
Only applies when a $query parameter is specified.
- "truncate=N"
- Include only words from the first N fragments after
skip selected by the
cts:query.
Only applies when a $query parameter is specified.
- "score-logtfidf"
- Compute scores using the logtfidf method.
- "score-logtf"
- Compute scores using the logtf method.
- "score-simple"
- Compute scores using the simple method.
- "score-random"
- Compute scores using the random method.
- "checked"
- Word positions should be checked when resolving the query.
- "unchecked"
- Word positions should not be checked when resolving the query.
- "concurrent"
- Perform the work concurrently in another thread. This is a hint
to the query optimizer to help parallelize the lexicon work, allowing
the calling query to continue performing other work while the lexicon
processing occurs. This is especially useful in cases where multiple
lexicon calls occur in the same query (for example, resolving many
facets in a single query).
- "map"
- Return results as a single map:map value instead of as an
xs:string* sequence.
|
$query
(optional):
Only include words in fragments selected by the cts:query.
The words do not need to match the query, but the words must occur
in fragments selected by the query.
The fragments are not filtered to ensure they match the query,
but instead selected in the same manner as
"unfiltered" cts:search operations. If a string
is entered, the string is treated as a cts:word-query of the
specified string.
|
$quality-weight
(optional):
A document quality weight to use when computing scores.
The default is 1.0.
|
$forest-ids
(optional):
A sequence of IDs of forests to which the search will be constrained.
An empty sequence means to search all forests in the database.
The default is ().
|
|
Usage Notes:
Only one of "ascending" or "descending" may be specified
in the options parameter. If neither "ascending" nor "descending"
is specified, then the default is "ascending".
Only one of "any", "document", "properties", or "locks"
may be specified in the options parameter.
If none of "any", "document", "properties", or "locks" are specified
and there is a $query parameter, then the default is "document".
If there is no $query parameter then the default is "any".
Only one of the "score-logtfidf", "score-logtf", "score-simple",
or "score-random" options may be specified in the options parameter.
If none of "score-logtfidf", "score-logtf", "score-simple", or
"score-random" are specified, then the default is "score-logtfidf".
Only one of the "checked" or "unchecked" options may be specified
in the options parameter.
If neither "checked" nor "unchecked" are specified,
then the default is "checked".
If "collation=URI" is not specified in the options parameter,
then the default collation is used. If a lexicon with that collation
does not exist, an error is thrown.
If "sample=N" is not specified in the options parameter,
then all included words may be returned. If a $query parameter
is not present, then "sample=N" has no effect.
If "truncate=N" is not specified in the options parameter,
then words from all fragments selected by the $query parameter
are included. If a $query parameter is not present, then
"truncate=N" has no effect.
When multiple element and/or attribute QNames are specified,
then all possible element/attribute QName combinations are used
to select the matching values.
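To make the last note concrete, the set of element/attribute pairs consulted is the cross product of the two QName sequences. A small Python illustration, using plain strings in place of QNames (`lexicon_pairs` is a hypothetical helper):

```python
from itertools import product
from typing import List, Sequence, Tuple


def lexicon_pairs(element_names: Sequence[str],
                  attribute_names: Sequence[str]) -> List[Tuple[str, str]]:
    """Every element/attribute combination, in element-major order."""
    return list(product(element_names, attribute_names))
```

Two element names and two attribute names therefore give four pairs, and per the summary above each of those pairs needs its own element-attribute word lexicon.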
|
Example:
cts:element-attribute-words(xs:QName("animal"),
xs:QName("name"),
"aardvark")
=> ("aardvark","aardvarks","aardwolf",...)
|
|
|
|
cts:element-child-geospatial-boxes(
|
|
$parent-element-names as xs:QName*,
|
|
$child-element-names as xs:QName*,
|
|
[$latitude-bounds as xs:double*],
|
|
[$longitude-bounds as xs:double*],
|
|
[$options as xs:string*],
|
|
[$query as cts:query?],
|
|
[$quality-weight as xs:double?],
|
|
[$forest-ids as xs:unsignedLong*]
|
| ) as cts:box* |
|
 |
Summary:
Returns boxes derived from the specified element point lexicon(s).
Element point lexicons are implemented using geospatial indexes; consequently
this function requires an element child geospatial index for each element
specified in the function. If there is not a geospatial index configured for
each of the specified element/child combinations, an exception is thrown.
The points are divided into box-shaped buckets. The $latitude-bounds and
$longitude-bounds parameters specify the number and the size of each
box-shaped bucket. All included points are bucketed, even those outside
the bounds. An empty sequence for both $latitude-bounds and
$longitude-bounds specifies one bucket, a single value for both specifies
four buckets, two values for both specify nine buckets, and so on.
For each non-empty bucket, a cts:box value is returned.
By default, the cts:box value is the minimum bounding box
of all the points in the bucket. If the "gridded" option is specified,
then if a bucket is bounded on a side, its corresponding
cts:box side is the bound.
Empty buckets return nothing unless the "empties" option is specified.
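The bucketing described in this summary can be sketched in Python. This is a model of the default (non-gridded) behavior only, not MarkLogic code. The helper names are hypothetical, and placing a point that lies exactly on a bound into the higher bucket is an assumption; the text does not say which side such a point falls on.

```python
from bisect import bisect_right
from collections import defaultdict
from typing import Dict, Iterable, Sequence, Tuple

Point = Tuple[float, float]              # (latitude, longitude)
Box = Tuple[float, float, float, float]  # (south, west, north, east)


def _check_ascending(bounds: Sequence[float], name: str) -> None:
    # Documented: bounds must be in strictly ascending order.
    if any(a >= b for a, b in zip(bounds, bounds[1:])):
        raise ValueError(name + " must be strictly ascending")


def bucket_count(lat_bounds: Sequence[float],
                 lon_bounds: Sequence[float]) -> int:
    """n bounds split an axis into n + 1 ranges."""
    return (len(lat_bounds) + 1) * (len(lon_bounds) + 1)


def bounding_boxes(points: Iterable[Point],
                   lat_bounds: Sequence[float] = (),
                   lon_bounds: Sequence[float] = ()
                   ) -> Dict[Tuple[int, int], Box]:
    """Minimum bounding box of the points in each non-empty bucket."""
    _check_ascending(lat_bounds, "latitude bounds")
    _check_ascending(lon_bounds, "longitude bounds")
    buckets = defaultdict(list)
    for lat, lon in points:
        # Points outside the outermost bounds still land in an edge bucket.
        key = (bisect_right(lat_bounds, lat), bisect_right(lon_bounds, lon))
        buckets[key].append((lat, lon))
    boxes = {}
    for key, members in buckets.items():
        lats = [p[0] for p in members]
        lons = [p[1] for p in members]
        boxes[key] = (min(lats), min(lons), max(lats), max(lons))
    return boxes
```

With no bounds on either axis there is a single bucket, so all points collapse into one bounding box; one bound on each axis gives up to four boxes, and two bounds on each axis give up to nine.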
|
Parameters:
$parent-element-names
:
One or more element QNames.
|
$child-element-names
:
One or more element QNames.
|
$latitude-bounds
(optional):
A sequence of latitude bounds.
The values must be in strictly ascending order, otherwise an exception
is thrown.
|
$longitude-bounds
(optional):
A sequence of longitude bounds.
The values must be in strictly ascending order, otherwise an exception
is thrown.
|
$options
(optional):
Options. The default is ().
Options include:
- "ascending"
- Boxes should be returned in ascending order.
- "descending"
- Boxes should be returned in descending order.
- "gridded"
- For each side that a bucket is bounded, return the corresponding
bound as the edge of the box, instead of the extremum from the
points in the bucket.
- "empties"
- Include fully-bounded ranges whose frequency is 0. Only
empty ranges that have
both their upper and lower bounds specified in the $latitude-bounds and
$longitude-bounds parameters are returned;
any empty ranges that are less than the first bound or greater than the
last bound are not returned. For example, if you specify 4 bounds
and there are no results for any of the bounds, 3 elements are
returned (not 5 elements).
- "any"
- Points from any fragment should be included.
- "document"
- Points from document fragments should be included.
- "properties"
- Points from properties fragments should be included.
- "locks"
- Points from locks fragments should be included.
- "frequency-order"
- Boxes should be returned ordered by frequency.
- "item-order"
- Boxes should be returned ordered by item.
- "fragment-frequency"
- Frequency should be the number of fragments with
an included point.
This option is used with
cts:frequency.
- "item-frequency"
- Frequency should be the number of occurrences of
an included point.
This option is used with
cts:frequency.
- "coordinate-system=name"
- Use the lexicon with the coordinate system specified by
name.
- "limit=N"
- Return no more than N boxes.
- "skip=N"
- Skip over fragments selected by the
cts:query
to treat the Nth matching fragment as the first fragment.
Points from skipped fragments are not included.
This option affects the number of fragments selected
by the cts:query to calculate frequencies.
Only applies when a $query parameter is specified.
- "sample=N"
- Return only boxes for buckets with at least one point from
the first N fragments after skip selected by the
cts:query.
This option does not affect the number of fragments selected
by the cts:query to calculate frequencies.
Only applies when a $query parameter is specified.
- "truncate=N"
- Include only points from the first N fragments
after skip selected by the
cts:query.
This option affects the number of fragments selected
by the cts:query to calculate frequencies.
Only applies when a $query parameter is specified.
- "type=long-lat-point"
- Specifies the format for the point in the data as longitude first,
latitude second.
- "type=point"
- Specifies the format for the point in the data as latitude first,
longitude second. This is the default format.
- "score-logtfidf"
- Compute scores using the logtfidf method.
- "score-logtf"
- Compute scores using the logtf method.
- "score-simple"
- Compute scores using the simple method.
- "score-random"
- Compute scores using the random method.
- "checked"
- Word positions should be checked when resolving the query.
- "unchecked"
- Word positions should not be checked when resolving the query.
- "concurrent"
- Perform the work concurrently in another thread. This is a hint
to the query optimizer to help parallelize the lexicon work, allowing
the calling query to continue performing other work while the lexicon
processing occurs. This is especially useful in cases where multiple
lexicon calls occur in the same query (for example, resolving many
facets in a single query).
- "map"
- Return results as a single map:map value instead of as a
cts:box* sequence.
|
$query
(optional):
Only include points in fragments selected by the cts:query,
and compute frequencies from this set of included points.
The points do not need to match the query, but they must occur in
fragments selected by the query.
The fragments are not filtered to ensure they match the query,
but instead selected in the same manner as
"unfiltered" cts:search operations.
|
$quality-weight
(optional):
A document quality weight to use when computing scores.
The default is 1.0.
|
$forest-ids
(optional):
A sequence of IDs of forests to which the search will be constrained.
An empty sequence means to search all forests in the database.
The default is ().
|
|
Usage Notes:
Only one of "frequency-order" or "item-order" may be specified
in the options parameter. If neither "frequency-order" nor "item-order"
is specified, then the default is "item-order".
Only one of "fragment-frequency" or "item-frequency" may be specified
in the options parameter. If neither "fragment-frequency" nor
"item-frequency" is specified, then the default is "fragment-frequency".
Only one of "ascending" or "descending" may be specified
in the options parameter. If neither "ascending" nor "descending"
is specified, then the default is "ascending" if "item-order" is
specified, and "descending" if "frequency-order" is specified.
Only one of "any", "document", "properties", or "locks"
may be specified in the options parameter.
If none of "any", "document", "properties", or "locks" are specified
and there is a $query parameter, then the default is "document".
If there is no $query parameter then the default is "any".
Only one of the "score-logtfidf", "score-logtf", "score-simple",
or "score-random" options may be specified in the options parameter.
If none of "score-logtfidf", "score-logtf", "score-simple", or
"score-random" are specified, then the default is "score-logtfidf".
Only one of the "checked" or "unchecked" options may be specified
in the options parameter.
If neither "checked" nor "unchecked" are specified,
then the default is "checked".
If "coordinate-system=name" is not specified in the options
parameter, then the default coordinate system is used. If a lexicon with
that coordinate system does not exist, an error is thrown.
If "sample=N" is not specified in the options parameter,
then all boxes with included points may be returned. If a $query
parameter is not present, then "sample=N" has no effect.
If "truncate=N" is not specified in the options parameter,
then points from all fragments selected by the $query parameter
are included. If a $query parameter is not present, then
"truncate=N" has no effect.
|
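The mutual-exclusion and defaulting rules in the usage notes for cts:element-child-geospatial-boxes can be summarized as a small resolver. The following Python sketch is a model of those rules only. `resolve_options` and its dictionary keys are hypothetical, and raising an error when two options from the same group are given is an assumption about how "only one of" is enforced.

```python
from typing import Dict, Iterable, Optional, Sequence


def _pick(options: Iterable[str], choices: Sequence[str]) -> Optional[str]:
    """Return the one option given from a mutually exclusive group."""
    found = {o for o in options if o in choices}
    if len(found) > 1:
        raise ValueError("only one of " + ", ".join(choices) + " may be specified")
    return next(iter(found)) if found else None


def resolve_options(options: Sequence[str], has_query: bool) -> Dict[str, str]:
    """Fill in the documented defaults for the lexicon options."""
    order = _pick(options, ("frequency-order", "item-order")) or "item-order"
    frequency = (_pick(options, ("fragment-frequency", "item-frequency"))
                 or "fragment-frequency")
    # The default direction depends on the ordering in effect.
    direction = (_pick(options, ("ascending", "descending"))
                 or ("ascending" if order == "item-order" else "descending"))
    # The default fragment scope depends on whether $query was supplied.
    scope = (_pick(options, ("any", "document", "properties", "locks"))
             or ("document" if has_query else "any"))
    score = (_pick(options, ("score-logtfidf", "score-logtf",
                             "score-simple", "score-random"))
             or "score-logtfidf")
    checking = _pick(options, ("checked", "unchecked")) or "checked"
    return {"order": order, "frequency": frequency, "direction": direction,
            "scope": scope, "score": score, "checking": checking}
```

For instance, passing only "frequency-order" together with a $query resolves to descending order over document fragments, while an empty options sequence with no $query resolves to ascending item order over any fragment.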
|
|
|
cts:element-child-geospatial-query(
|
|
$parent-element-name as xs:QName*,
|
|
$child-element-names as xs:QName*,
|
|
$regions as cts:region*,
|
|
[$options as xs:string*],
|
|
[$weight as xs:double?]
|
| ) as cts:element-child-geospatial-query |
|
 |
Summary:
Returns a cts:query matching elements by name that have
specific element children representing latitude and longitude values for
a point contained within the given geographic box, circle, or polygon, or
equal to the given point. Points that lie
between the southern boundary and the northern boundary of a box,
travelling northwards,
and between the western boundary and the eastern boundary of the box,
travelling eastwards, will match.
Points contained within the given radius of the center point of a circle will
match, using the curved distance on the surface of the Earth.
Points contained within the given polygon will match, using great circle arcs
over a spherical model of the Earth as edges. An error may result
if the polygon is malformed in some way.
Points equal to a given point will match, taking into account the fact
that longitudes converge at the poles.
Using the geospatial query constructors requires a valid geospatial
license key; without a valid license key, searches that include
geospatial queries will throw an exception.
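As an illustration of "curved distance on the surface of the Earth" for circles, the following Python sketch uses the haversine formula on a sphere. It is not MarkLogic's implementation: the mean Earth radius of 3958.8 miles and the helper names are assumptions, and the server's exact Earth model is not stated in this section.

```python
import math
from typing import Tuple

EARTH_RADIUS_MILES = 3958.8  # assumption: mean radius of a spherical Earth


def great_circle_miles(lat1: float, lon1: float,
                       lat2: float, lon2: float) -> float:
    """Haversine distance in miles between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    # Clamp to guard against rounding slightly above 1.
    return 2 * EARTH_RADIUS_MILES * math.asin(min(1.0, math.sqrt(a)))


def point_in_circle(point: Tuple[float, float],
                    center: Tuple[float, float],
                    radius_miles: float,
                    boundaries_included: bool = True) -> bool:
    """True if `point` lies within `radius_miles` of `center`."""
    distance = great_circle_miles(point[0], point[1], center[0], center[1])
    if boundaries_included:
        return distance <= radius_miles
    return distance < radius_miles
```

Half a degree of latitude is roughly 35 miles under this model, so a point at (10.5, 30.0) falls inside a 50-mile circle centered at (10.0, 30.0) but outside a 10-mile one.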
|
Parameters:
$parent-element-name
:
One or more parent element QNames to match.
When multiple QNames are specified,
the query matches if any QName matches.
|
$child-element-names
:
One or more child element QNames to match.
When multiple QNames are specified, the query matches
if any QName matches; however, only the first matching latitude
child in any point instance will be checked. The element must specify
both latitude and longitude coordinates.
|
$regions
:
One or more geographic boxes, circles, polygons, or points. Where multiple
regions are specified, the query matches if any region matches.
|
$options
(optional):
Options to this query. The default is ().
Options include:
- "coordinate-system=wgs84"
- Use the WGS84 coordinate system.
- "units=miles"
- Distance (for circles) is measured in miles.
- "boundaries-included"
- Points on boxes', circles', and polygons' boundaries are counted as matching. This is the default.
- "boundaries-excluded"
- Points on boxes', circles', and polygons' boundaries are not counted as matching.
- "boundaries-latitude-excluded"
- Points on boxes' latitude boundaries are not counted as matching.
- "boundaries-longitude-excluded"
- Points on boxes' longitude boundaries are not counted as matching.
- "boundaries-south-excluded"
- Points on the boxes' southern boundaries are not counted as matching.
- "boundaries-west-excluded"
- Points on the boxes' western boundaries are not counted as matching.
- "boundaries-north-excluded"
- Points on the boxes' northern boundaries are not counted as matching.
- "boundaries-east-excluded"
- Points on the boxes' eastern boundaries are not counted as matching.
- "boundaries-circle-excluded"
- Points on circles' boundary are not counted as matching.
- "cached"
- Cache the results of this query in the list cache.
- "uncached"
- Do not cache the results of this query in the list cache.
- "type=long-lat-point"
- Specifies the format for the point in the data as longitude first,
latitude second.
- "type=point"
- Specifies the format for the point in the data as latitude first,
longitude second. This is the default format.
- "synonym"
- Specifies that all of the terms in the $regions parameter are
considered synonyms for scoring purposes. The result is that
occurrences of more than one of the synonyms are scored as if
there are more occurrences of the same term (as opposed to
having a separate term that contributes to score).
|
$weight
(optional):
A weight for this query. The default is 1.0.
This option is currently
ignored; geospatial queries do not contribute to the score.
|
|
Usage Notes:
The point value is expressed in the content of the element as a pair
of numbers, separated by whitespace and punctuation (excluding decimal points
and sign characters).
Point values and boundary specifications of boxes are given in degrees
relative to the WGS84 coordinate system. Southern latitudes and Western
longitudes take negative values. Longitudes will be wrapped to the range
(-180,+180) and latitudes will be clipped to the range (-90,+90).
If the northern boundary of a box is south of the southern boundary, no
points will match. However, longitudes wrap around the globe, so that if
the western boundary is east of the eastern boundary,
then the box crosses the anti-meridian.
Special handling occurs at the poles, as all longitudes exist at latitudes
+90 and -90.
If neither "cached" nor "uncached" is present, the default is "cached".
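The point format and the box rules in these usage notes can be modeled with two small functions. This Python sketch is illustrative only: `parse_point` and `point_in_box` are hypothetical helpers, the number pattern does not handle exponent notation, boundaries are treated as included (the documented default), and a box whose northern boundary is south of its southern boundary simply matches nothing, as the usage notes state.

```python
import re
from typing import Tuple

# A signed decimal number; any other characters act as separators.
_NUMBER = re.compile(r"[+-]?(?:\d+\.?\d*|\.\d+)")


def parse_point(text: str, long_lat: bool = False) -> Tuple[float, float]:
    """Parse element content into (latitude, longitude).

    long_lat=False models "type=point" (latitude first, the default);
    long_lat=True models "type=long-lat-point" (longitude first).
    """
    numbers = [float(n) for n in _NUMBER.findall(text)]
    if len(numbers) != 2:
        raise ValueError("expected exactly two numbers in " + repr(text))
    first, second = numbers
    return (second, first) if long_lat else (first, second)


def point_in_box(lat: float, lon: float, south: float, west: float,
                 north: float, east: float) -> bool:
    """Box test with longitude wrap-around and boundaries included."""
    if north < south:
        return False  # latitudes do not wrap, so nothing matches
    if not (south <= lat <= north):
        return False
    if west <= east:
        return west <= lon <= east
    # West is east of east: the box crosses the anti-meridian.
    return lon >= west or lon <= east
```

Applied to the three `pos` values used in the example for this function ("10.5 30.0", "15.35 35.34", "5.11 40.55"), the box (10.0, 35.0, 20.0, 40.0) selects only the second point, and the wrapped box (10.0, 40.0, 20.0, 35.0) selects only the first.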
|
Example:
(: create a document with test data :)
xdmp:document-insert("/points.xml",
  <root>
    <item><point><pos>10.5 30.0</pos></point></item>
    <item><point><pos>15.35 35.34</pos></point></item>
    <item><point><pos>5.11 40.55</pos></point></item>
  </root> );
cts:search(doc("/points.xml")//item,
  cts:element-child-geospatial-query(xs:QName("point"), xs:QName("pos"),
    cts:box(10.0, 35.0, 20.0, 40.0)))
(:
  returns the following node:
  <item><point><pos>15.35 35.34</pos></point></item>
:)
;
cts:search(doc("/points.xml")//item,
  cts:element-child-geospatial-query(xs:QName("point"), xs:QName("pos"),
    cts:box(10.0, 40.0, 20.0, 35.0)))
(:
  returns the following nodes (wrapping around the Earth):
  <item><point><pos>10.5 30.0</pos></point></item>
:)
;
cts:search(doc("/points.xml")//item,
  cts:element-child-geospatial-query(xs:QName("point"), xs:QName("pos"),
    cts:box(20.0, 35.0, 10.0, 40.0)))
(:
  throws an error (latitudes do not wrap)
:)
;
|
|
|
|
cts:element-child-geospatial-value-match(
|
|
$element-names as xs:QName*,
|
|
$child-names as xs:QName*,
|
|
$pattern as xs:anyAtomicType,
|
|
[$options as xs:string*],
|
|
[$query as cts:query?],
|
|
[$quality-weight as xs:double?],
|
|
[$forest-ids as xs:unsignedLong*]
|
| ) as cts:point* |
|
 |
Summary:
Returns values from the specified element child geospatial value lexicon(s)
that match the specified wildcard pattern. Element
child geospatial value lexicons
are implemented using geospatial indexes; consequently this function
requires an element child geospatial index for each element and child
specified in the
function. If there is not a geospatial index configured for each of the
specified element/child combinations, then an exception is thrown.
|
Parameters:
$element-names
:
One or more element QNames.
|
$child-names
:
One or more child element QNames.
|
$pattern
:
A pattern to match. The parameter type must match the lexicon type.
|
$options
(optional):
Options. The default is ().
Options include:
- "ascending"
- Values should be returned in ascending order.
- "descending"
- Values should be returned in descending order.
- "any"
- Values from any fragment should be included.
- "document"
- Values from document fragments should be included.
- "properties"
- Values from properties fragments should be included.
- "locks"
- Values from locks fragments should be included.
- "frequency-order"
- Values should be returned ordered by frequency.
- "item-order"
- Values should be returned ordered by item.
- "fragment-frequency"
- Frequency should be the number of fragments with
an included value.
This option is used with
cts:frequency.
- "item-frequency"
- Frequency should be the number of occurrences of
an included value.
This option is used with
cts:frequency.
- "coordinate-system=name"
- Use the lexicon with the coordinate system specified by
name.
- "limit=N"
- Return no more than N values.
- "skip=N"
- Skip over fragments selected by the
cts:query
to treat the Nth matching fragment as the first fragment.
Values from skipped fragments are not included.
This option affects the number of fragments selected
by the cts:query to calculate frequencies.
Only applies when a $query parameter is specified.
- "sample=N"
- Return only values from the first N
fragments after skip selected by the
cts:query.
This option does not affect the number of fragments selected
by the cts:query to calculate frequencies.
Only applies when a $query parameter is specified.
- "truncate=N"
- Include only values from the first N fragments
after skip selected by the
cts:query.
This option affects the number of fragments selected
by the cts:query to calculate frequencies.
Only applies when a $query parameter is specified.
- "type=long-lat-point"
- Specifies the format for the point in the data as longitude first,
latitude second.
- "type=point"
- Specifies the format for the point in the data as latitude first,
longitude second. This is the default format.
- "score-logtfidf"
- Compute scores using the logtfidf method.
- "score-logtf"
- Compute scores using the logtf method.
- "score-simple"
- Compute scores using the simple method.
- "score-random"
- Compute scores using the random method.
- "checked"
- Word positions should be checked when resolving the query.
- "unchecked"
- Word positions should not be checked when resolving the query.
- "concurrent"
- Perform the work concurrently in another thread. This is a hint
to the query optimizer to help parallelize the lexicon work, allowing
the calling query to continue performing other work while the lexicon
processing occurs. This is especially useful in cases where multiple
lexicon calls occur in the same query (for example, resolving many
facets in a single query).
- "map"
- Return results as a single map:map value instead of as a
cts:point* sequence.
|
$query
(optional):
Only include values in fragments selected by the cts:query,
and compute frequencies from this set of included values.
The values do not need to match the query, but they must occur in
fragments selected by the query.
The fragments are not filtered to ensure they match the query,
but instead selected in the same manner as
"unfiltered" cts:search operations.
|
$quality-weight
(optional):
A document quality weight to use when computing scores.
The default is 1.0.
|
$forest-ids
(optional):
A sequence of IDs of forests to which the search will be constrained.
An empty sequence means to search all forests in the database.
The default is ().
|
|
Usage Notes:
Only one of "frequency-order" or "item-order" may be specified
in the options parameter. If neither "frequency-order" nor "item-order"
is specified, then the default is "item-order".
Only one of "fragment-frequency" or "item-frequency" may be specified
in the options parameter. If neither "fragment-frequency" nor
"item-frequency" is specified, then the default is "fragment-frequency".
Only one of "ascending" or "descending" may be specified
in the options parameter. If neither "ascending" nor "descending"
is specified, then the default is "ascending" if "item-order" is
specified, and "descending" if "frequency-order" is specified.
Only one of "any", "document", "properties", or "locks"
may be specified in the options parameter.
If none of "any", "document", "properties", or "locks" are specified
and there is a $query parameter, then the default is "document".
If there is no $query parameter then the default is "any".
Only one of the "score-logtfidf", "score-logtf", "score-simple",
or "score-random" options may be specified in the options parameter.
If none of "score-logtfidf", "score-logtf", "score-simple", or
"score-random" are specified, then the default is "score-logtfidf".
Only one of the "checked" or "unchecked" options may be specified
in the options parameter.
If neither "checked" nor "unchecked" are specified,
then the default is "checked".
If "coordinate-system=name" is not specified in the options
parameter, then the default coordinate system is used. If a lexicon with
that coordinate system does not exist, an error is thrown.
If "sample=N" is not specified in the options parameter,
then all included values may be returned. If a $query parameter
is not present, then "sample=N" has no effect.
If "truncate=N" is not specified in the options parameter,
then values from all fragments selected by the $query parameter
are included. If a $query parameter is not present, then
"truncate=N" has no effect.
|
Example:
cts:element-child-geospatial-value-match(
xs:QName("location"),xs:QName("pos"),cts:point(10,20))
=> 10,20
|
|
|
|
cts:element-child-geospatial-values(
|
|
$element-names as xs:QName*,
|
|
$child-names as xs:QName*,
|
|
[$start as cts:point?],
|
|
[$options as xs:string*],
|
|
[$query as cts:query?],
|
|
[$quality-weight as xs:double?],
|
|
[$forest-ids as xs:unsignedLong*]
|
| ) as cts:point* |
|
 |
Summary:
Returns values from the specified element-child geospatial value lexicon(s).
Element-child geospatial value lexicons are implemented using geospatial
indexes;
consequently this function requires an element-child geospatial index
for each of the element/child pairs specified in the function.
If there is not a geospatial index configured for each of the specified
element/child pairs, then an exception is thrown.
|
Parameters:
$element-names
:
One or more element QNames.
|
$child-names
:
One or more child element QNames.
|
$start
(optional):
A starting value.
If the parameter value is not in the lexicon, then the values are
returned beginning with the next value.
|
$options
(optional):
Options. The default is ().
Options include:
- "ascending"
- Values should be returned in ascending order.
- "descending"
- Values should be returned in descending order.
- "any"
- Values from any fragment should be included.
- "document"
- Values from document fragments should be included.
- "properties"
- Values from properties fragments should be included.
- "locks"
- Values from locks fragments should be included.
- "frequency-order"
- Values should be returned ordered by frequency.
- "item-order"
- Values should be returned ordered by item.
- "fragment-frequency"
- Frequency should be the number of fragments with
an included value.
This option is used with
cts:frequency.
- "item-frequency"
- Frequency should be the number of occurrences of
an included value.
This option is used with
cts:frequency.
- "coordinate-system=URI"
- Use the lexicon with the coordinate system specified by
name.
- "limit=N"
- Return no more than N values.
- "skip=N"
- Skip over fragments selected by the
cts:query
to treat the Nth matching fragment as the first fragment.
Values from skipped fragments are not included.
This option affects the number of fragments selected
by the cts:query to calculate frequencies.
Only applies when a $query parameter is specified.
- "sample=N"
- Return only values from the first N fragments after
skip selected by the
cts:query.
This option does not affect the number of fragments selected
by the cts:query to calculate frequencies.
Only applies when a $query parameter is specified.
- "truncate=N"
- Include only values from the first N fragments
after skip selected by the
cts:query.
This option affects the number of fragments selected
by the cts:query to calculate frequencies.
Only applies when a $query parameter is specified.
- "type=long-lat-point"
- Specifies the format for the point in the data as longitude first,
latitude second.
- "type=point"
- Specifies the format for the point in the data as latitude first,
longitude second. This is the default format.
- "score-logtfidf"
- Compute scores using the logtfidf method.
- "score-logtf"
- Compute scores using the logtf method.
- "score-simple"
- Compute scores using the simple method.
- "score-random"
- Compute scores using the random method.
- "checked"
- Word positions should be checked when resolving the query.
- "unchecked"
- Word positions should not be checked when resolving the query.
- "concurrent"
- Perform the work concurrently in another thread. This is a hint
to the query optimizer to help parallelize the lexicon work, allowing
the calling query to continue performing other work while the lexicon
processing occurs. This is especially useful in cases where multiple
lexicon calls occur in the same query (for example, resolving many
facets in a single query).
- "map"
- Return results as a single map:map value instead of as a
cts:point* sequence.
|
$query
(optional):
Only include values in fragments selected by the cts:query,
and compute frequencies from this set of included values.
The values do not need to match the query, but they must occur in
fragments selected by the query.
The fragments are not filtered to ensure they match the query,
but instead selected in the same manner as
"unfiltered" cts:search operations.
|
$quality-weight
(optional):
A document quality weight to use when computing scores.
The default is 1.0.
|
$forest-ids
(optional):
A sequence of IDs of forests to which the search will be constrained.
An empty sequence means to search all forests in the database.
The default is ().
|
|
Usage Notes:
Only one of "frequency-order" or "item-order" may be specified
in the options parameter. If neither "frequency-order" nor "item-order"
is specified, then the default is "item-order".
Only one of "fragment-frequency" or "item-frequency" may be specified
in the options parameter. If neither "fragment-frequency" nor
"item-frequency" is specified, then the default is "fragment-frequency".
Only one of "ascending" or "descending" may be specified
in the options parameter. If neither "ascending" nor "descending"
is specified, then the default is "ascending" if "item-order" is
specified, and "descending" if "frequency-order" is specified.
Only one of "any", "document", "properties", or "locks"
may be specified in the options parameter.
If none of "any", "document", "properties", or "locks" are specified
and there is a $query parameter, then the default is "document".
If there is no $query parameter then the default is "any".
Only one of the "score-logtfidf", "score-logtf", "score-simple",
or "score-random" options may be specified in the options parameter.
If none of "score-logtfidf", "score-logtf", "score-simple", or
"score-random" are specified, then the default is "score-logtfidf".
Only one of the "checked" or "unchecked" options may be specified
in the options parameter.
If neither "checked" nor "unchecked" is specified,
then the default is "checked".
If "coordinate-system=name" is not specified in the options
parameter, then the default coordinate system is used. If a lexicon with
that coordinate system does not exist, an error is thrown.
If "sample=N" is not specified in the options parameter,
then all included values may be returned. If a $query parameter
is not present, then "sample=N" has no effect.
If "truncate=N" is not specified in the options parameter,
then values from all fragments selected by the $query parameter
are included. If a $query parameter is not present, then
"truncate=N" has no effect.
When multiple element and/or child QNames are specified,
then all possible element/child QName combinations are used
to select the matching values.
|
Example:
cts:element-child-geospatial-values(
xs:QName("location"), xs:QName("position"), cts:point(0,0) )
=> (cts:point(0,0),cts:point(0,10),cts:point(0,20),...)
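As a hedged sketch of how the ordering and frequency options combine, the following query asks for the ten most frequent points and pairs each with its count from cts:frequency. It assumes an element-child geospatial index on the hypothetical location/position pair used in the example above; the option values are illustrative only.

```xquery
xquery version "1.0-ml";

(: Assumption: an element-child geospatial index exists for
   parent "location" and child "position". :)
for $pt in cts:element-child-geospatial-values(
             xs:QName("location"), xs:QName("position"),
             (),  (: no $start value: begin at the first lexicon value :)
             ("frequency-order", "item-frequency", "limit=10"))
return fn:concat(fn:string($pt), " occurs ", cts:frequency($pt), " time(s)")
```

Because "frequency-order" is given without "ascending" or "descending", the usage notes above imply that the values come back in descending order of frequency.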
|
|
|
|
cts:element-geospatial-boxes(
|
|
$element-names as xs:QName*,
|
|
[$latitude-bounds as xs:double*],
|
|
[$longitude-bounds as xs:double*],
|
|
[$options as xs:string*],
|
|
[$query as cts:query?],
|
|
[$quality-weight as xs:double?],
|
|
[$forest-ids as xs:unsignedLong*]
|
| ) as cts:box* |
|
 |
Summary:
Returns boxes derived from the specified element point lexicon(s).
Element point lexicons are implemented using geospatial indexes; consequently
this function requires an element geospatial index for each element
specified in the function. If there is not a geospatial index configured for
each of the specified elements, an exception is thrown.
The points are divided into box-shaped buckets. The $latitude-bounds and
$longitude-bounds parameters specify the number and the size of each
box-shaped bucket. All included points are bucketed, even those outside
the bounds. An empty sequence for both $latitude-bounds and
$longitude-bounds specifies one bucket, a single value for both specifies
four buckets, two values for both specify nine buckets, and so on.
For each non-empty bucket, a cts:box value is returned.
By default, the cts:box value is the minimum bounding box
of all the points in the bucket. If the "gridded" option is specified,
then if a bucket is bounded on a side, its corresponding
cts:box side is the bound.
Empty buckets return nothing unless the "empties" option is specified.
|
Parameters:
$element-names
:
One or more element QNames.
|
$latitude-bounds
(optional):
A sequence of latitude bounds.
The values must be in strictly ascending order, otherwise an exception
is thrown.
|
$longitude-bounds
(optional):
A sequence of longitude bounds.
The values must be in strictly ascending order, otherwise an exception
is thrown.
|
$options
(optional):
Options. The default is ().
Options include:
- "ascending"
- Boxes should be returned in ascending order.
- "descending"
- Boxes should be returned in descending order.
- "gridded"
- For each side that a bucket is bounded, return the corresponding
bound as the edge of the box, instead of the extremum from the
points in the bucket.
- "empties"
- Include fully-bounded ranges whose frequency is 0. Only
empty ranges that have
both their upper and lower bounds specified in the $latitude-bounds and
$longitude-bounds parameters are returned;
any empty ranges that are less than the first bound or greater than the
last bound are not returned. For example, if you specify 4 bounds
and there are no results for any of the bounds, 3 elements are
returned (not 5 elements).
- "any"
- Points from any fragment should be included.
- "document"
- Points from document fragments should be included.
- "properties"
- Points from properties fragments should be included.
- "locks"
- Points from locks fragments should be included.
- "frequency-order"
- Boxes should be returned ordered by frequency.
- "item-order"
- Boxes should be returned ordered by item.
- "fragment-frequency"
- Frequency should be the number of fragments with
an included point.
This option is used with
cts:frequency.
- "item-frequency"
- Frequency should be the number of occurrences of
an included point.
This option is used with
cts:frequency.
- "coordinate-system=URI"
- Use the lexicon with the coordinate system specified by
name.
- "limit=N"
- Return no more than N boxes.
- "skip=N"
- Skip over fragments selected by the
cts:query
to treat the Nth matching fragment as the first fragment.
Points from skipped fragments are not included.
This option affects the number of fragments selected
by the cts:query to calculate frequencies.
Only applies when a $query parameter is specified.
- "sample=N"
- Return only boxes for buckets with at least one point from
the first N fragments after skip selected by the
cts:query.
This option does not affect the number of fragments selected
by the cts:query to calculate frequencies.
Only applies when a $query parameter is specified.
- "truncate=N"
- Include only points from the first N fragments
after skip selected by the
cts:query.
This option affects the number of fragments selected
by the cts:query to calculate frequencies.
Only applies when a $query parameter is specified.
- "type=long-lat-point"
- Specifies the format for the point in the data as longitude first,
latitude second.
- "type=point"
- Specifies the format for the point in the data as latitude first,
longitude second. This is the default format.
- "score-logtfidf"
- Compute scores using the logtfidf method.
- "score-logtf"
- Compute scores using the logtf method.
- "score-simple"
- Compute scores using the simple method.
- "score-random"
- Compute scores using the random method.
- "checked"
- Word positions should be checked when resolving the query.
- "unchecked"
- Word positions should not be checked when resolving the query.
- "concurrent"
- Perform the work concurrently in another thread. This is a hint
to the query optimizer to help parallelize the lexicon work, allowing
the calling query to continue performing other work while the lexicon
processing occurs. This is especially useful in cases where multiple
lexicon calls occur in the same query (for example, resolving many
facets in a single query).
- "map"
- Return results as a single map:map value instead of as a
cts:box* sequence.
|
$query
(optional):
Only include points in fragments selected by the cts:query,
and compute frequencies from this set of included points.
The points do not need to match the query, but they must occur in
fragments selected by the query.
The fragments are not filtered to ensure they match the query,
but instead selected in the same manner as
"unfiltered" cts:search operations.
|
$quality-weight
(optional):
A document quality weight to use when computing scores.
The default is 1.0.
|
$forest-ids
(optional):
A sequence of IDs of forests to which the search will be constrained.
An empty sequence means to search all forests in the database.
The default is ().
|
|
Usage Notes:
Only one of "frequency-order" or "item-order" may be specified
in the options parameter. If neither "frequency-order" nor "item-order"
is specified, then the default is "item-order".
Only one of "fragment-frequency" or "item-frequency" may be specified
in the options parameter. If neither "fragment-frequency" nor
"item-frequency" is specified, then the default is "fragment-frequency".
Only one of "ascending" or "descending" may be specified
in the options parameter. If neither "ascending" nor "descending"
is specified, then the default is "ascending" if "item-order" is
specified, and "descending" if "frequency-order" is specified.
Only one of "any", "document", "properties", or "locks"
may be specified in the options parameter.
If none of "any", "document", "properties", or "locks" are specified
and there is a $query parameter, then the default is "document".
If there is no $query parameter then the default is "any".
Only one of the "score-logtfidf", "score-logtf", "score-simple",
or "score-random" options may be specified in the options parameter.
If none of "score-logtfidf", "score-logtf", "score-simple", or
"score-random" are specified, then the default is "score-logtfidf".
Only one of the "checked" or "unchecked" options may be specified
in the options parameter.
If neither "checked" nor "unchecked" is specified,
then the default is "checked".
If "coordinate-system=name" is not specified in the options
parameter, then the default coordinate system is used. If a lexicon with
that coordinate system does not exist, an error is thrown.
If "sample=N" is not specified in the options parameter,
then all boxes with included points may be returned. If a $query
parameter is not present, then "sample=N" has no effect.
If "truncate=N" is not specified in the options parameter,
then points from all fragments selected by the $query parameter
are included. If a $query parameter is not present, then
"truncate=N" has no effect.
|
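A minimal sketch of the bucketing described in the summary for cts:element-geospatial-boxes, assuming an element geospatial index on a hypothetical element named point. The bound values are arbitrary and chosen only for illustration.

```xquery
xquery version "1.0-ml";

(: Assumption: an element geospatial index (type "point") exists
   on a hypothetical element named "point". :)
cts:element-geospatial-boxes(
  xs:QName("point"),
  (10.0, 20.0),            (: two latitude bounds :)
  (30.0, 40.0),            (: two longitude bounds :)
  ("gridded", "empties"))
```

Two bounds on each axis give 3 x 3 = 9 buckets, as the summary states. With "gridded", any side of a bucket that lies on a bound is reported as that bound rather than as the extremum of the points, and "empties" asks for fully bounded empty buckets as well.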
|
|
|
cts:element-geospatial-query(
|
|
$element-name as xs:QName*,
|
|
$regions as cts:region*,
|
|
[$options as xs:string*],
|
|
[$weight as xs:double?]
|
| ) as cts:element-geospatial-query |
|
 |
Summary:
Returns a cts:query matching elements by name whose content
represents a point contained within the given geographic box, circle, or
polygon, or equal to the given point. Points that lie
between the southern boundary and the northern boundary of a box,
travelling northwards,
and between the western boundary and the eastern boundary of the box,
travelling eastwards, will match.
Points contained within the given radius of the center point of a circle will
match, using the curved distance on the surface of the Earth.
Points contained within the given polygon will match, using great circle arcs
over a spherical model of the Earth as edges. An error may result
if the polygon is malformed in some way.
Points equal to a given point will match, taking into account the fact
that longitudes converge at the poles.
Using the geospatial query constructors requires a valid geospatial
license key; without a valid license key, searches that include
geospatial queries will throw an exception.
|
Parameters:
$element-name
:
One or more element QNames to match.
When multiple QNames are specified,
the query matches if any QName matches.
|
$regions
:
One or more geographic boxes, circles, polygons, or points. Where multiple
regions are specified, the query matches if any region matches.
|
$options
(optional):
Options to this query. The default is ().
Options include:
- "coordinate-system=wgs84"
- Use the WGS84 coordinate system.
- "units=miles"
- Distance (for circles) is measured in miles.
- "boundaries-included"
- Points on boxes', circles', and polygons' boundaries are counted as matching. This is the default.
- "boundaries-excluded"
- Points on boxes', circles', and polygons' boundaries are not counted as matching.
- "boundaries-latitude-excluded"
- Points on boxes' latitude boundaries are not counted as matching.
- "boundaries-longitude-excluded"
- Points on boxes' longitude boundaries are not counted as matching.
- "boundaries-south-excluded"
- Points on the boxes' southern boundaries are not counted as matching.
- "boundaries-west-excluded"
- Points on the boxes' western boundaries are not counted as matching.
- "boundaries-north-excluded"
- Points on the boxes' northern boundaries are not counted as matching.
- "boundaries-east-excluded"
- Points on the boxes' eastern boundaries are not counted as matching.
- "boundaries-circle-excluded"
- Points on circles' boundaries are not counted as matching.
- "cached"
- Cache the results of this query in the list cache.
- "uncached"
- Do not cache the results of this query in the list cache.
- "type=long-lat-point"
- Specifies the format for the point in the data as longitude first,
latitude second.
- "type=point"
- Specifies the format for the point in the data as latitude first,
longitude second. This is the default format.
|
$weight
(optional):
A weight for this query. The default is 1.0. This parameter is currently
ignored; geospatial queries do not contribute to the score.
|
|
Usage Notes:
The point value is expressed in the content of the element as a pair
of numbers, separated by whitespace and punctuation (excluding decimal points
and sign characters).
Point values and boundary specifications of
boxes are given in degrees
relative to the WGS84 coordinate system. Southern latitudes and Western
longitudes take negative values. Longitudes will be wrapped to the range
(-180,+180) and latitudes will be clipped to the range (-90,+90).
If the northern boundary of a box is south of the southern boundary, no
points will match. However, longitudes wrap around the globe, so that if
the western boundary is east of the eastern boundary,
then the box crosses the anti-meridian.
Special handling occurs at the poles, as all longitudes exist at latitudes
+90 and -90.
If neither "cached" nor "uncached" is present, the default is "cached".
|
Example:
(: create a document with test data :)
xdmp:document-insert("/points.xml",
  <root>
    <item><point>10.5, 30.0</point></item>
    <item><point>15.35, 35.34</point></item>
    <item><point>5.11, 40.55</point></item>
  </root> );
cts:search(doc("/points.xml")//item,
  cts:element-geospatial-query(xs:QName("point"), cts:box(10.0, 35.0, 20.0, 40.0)))
(:
  returns the following node:
    <item><point>15.35, 35.34</point></item>
:)
;
cts:search(doc("/points.xml")//item,
  cts:element-geospatial-query(xs:QName("point"), cts:box(10.0, 40.0, 20.0, 35.0)))
(:
  returns the following nodes (wrapping around the Earth):
    <item><point>10.5, 30.0</point></item>
:)
;
cts:search(doc("/points.xml")//item,
  cts:element-geospatial-query(xs:QName("point"), cts:box(20.0, 35.0, 10.0, 40.0)))
(:
  throws an error (latitudes do not wrap)
:)
;
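The $regions and $options parameters can be combined as in the following sketch, which reuses the /points.xml test document from the example above. The radius, center, and box values are arbitrary illustrations rather than recommended settings.

```xquery
xquery version "1.0-ml";

(: Assumption: /points.xml has been inserted as in the example above. :)
cts:search(doc("/points.xml")//item,
  cts:element-geospatial-query(
    xs:QName("point"),
    (cts:circle(50, cts:point(15.0, 35.0)),   (: radius 50, center lat 15, long 35 :)
     cts:box(5.0, 40.0, 6.0, 41.0)),          (: south, west, north, east :)
    ("units=miles", "boundaries-excluded")))
```

Because multiple regions are given, an item matches if its point falls in either region. The "boundaries-excluded" option drops points that lie exactly on a region boundary.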
|
|
|
|
cts:element-geospatial-value-match(
|
|
$element-names as xs:QName*,
|
|
$pattern as xs:anyAtomicType,
|
|
[$options as xs:string*],
|
|
[$query as cts:query?],
|
|
[$quality-weight as xs:double?],
|
|
[$forest-ids as xs:unsignedLong*]
|
| ) as cts:point* |
|
 |
Summary:
Returns values from the specified element geospatial value lexicon(s)
that match the specified wildcard pattern. Element geospatial value lexicons
are implemented using geospatial indexes; consequently this function
requires an element geospatial index for each element specified in the
function. If there is not a geospatial index configured for each of the
specified elements, then an exception is thrown.
|
Parameters:
$element-names
:
One or more element QNames.
|
$pattern
:
A pattern to match. The parameter type must match the lexicon type.
|
$options
(optional):
Options. The default is ().
Options include:
- "ascending"
- Values should be returned in ascending order.
- "descending"
- Values should be returned in descending order.
- "any"
- Values from any fragment should be included.
- "document"
- Values from document fragments should be included.
- "properties"
- Values from properties fragments should be included.
- "locks"
- Values from locks fragments should be included.
- "frequency-order"
- Values should be returned ordered by frequency.
- "item-order"
- Values should be returned ordered by item.
- "fragment-frequency"
- Frequency should be the number of fragments with
an included value.
This option is used with
cts:frequency.
- "item-frequency"
- Frequency should be the number of occurrences of
an included value.
This option is used with
cts:frequency.
- "coordinate-system=URI"
- Use the lexicon with the coordinate system specified by
name.
- "limit=N"
- Return no more than N values.
- "skip=N"
- Skip over fragments selected by the
cts:query
to treat the Nth matching fragment as the first fragment.
Values from skipped fragments are not included.
This option affects the number of fragments selected
by the cts:query to calculate frequencies.
Only applies when a $query parameter is specified.
- "sample=N"
- Return only values from the first N
fragments after skip selected by the
cts:query.
This option does not affect the number of fragments selected
by the cts:query to calculate frequencies.
Only applies when a $query parameter is specified.
- "truncate=N"
- Include only values from the first N fragments
after skip selected by the
cts:query.
This option affects the number of fragments selected
by the cts:query to calculate frequencies.
Only applies when a $query parameter is specified.
- "type=long-lat-point"
- Specifies the format for the point in the data as longitude first,
latitude second.
- "type=point"
- Specifies the format for the point in the data as latitude first,
longitude second. This is the default format.
- "score-logtfidf"
- Compute scores using the logtfidf method.
- "score-logtf"
- Compute scores using the logtf method.
- "score-simple"
- Compute scores using the simple method.
- "score-random"
- Compute scores using the random method.
- "checked"
- Word positions should be checked when resolving the query.
- "unchecked"
- Word positions should not be checked when resolving the query.
- "concurrent"
- Perform the work concurrently in another thread. This is a hint
to the query optimizer to help parallelize the lexicon work, allowing
the calling query to continue performing other work while the lexicon
processing occurs. This is especially useful in cases where multiple
lexicon calls occur in the same query (for example, resolving many
facets in a single query).
- "map"
- Return results as a single map:map value instead of as a
cts:point* sequence.
|
$query
(optional):
Only include values in fragments selected by the cts:query,
and compute frequencies from this set of included values.
The values do not need to match the query, but they must occur in
fragments selected by the query.
The fragments are not filtered to ensure they match the query,
but instead selected in the same manner as
"unfiltered" cts:search operations.
|
$quality-weight
(optional):
A document quality weight to use when computing scores.
The default is 1.0.
|
$forest-ids
(optional):
A sequence of IDs of forests to which the search will be constrained.
An empty sequence means to search all forests in the database.
The default is ().
|
|
Usage Notes:
Only one of "frequency-order" or "item-order" may be specified
in the options parameter. If neither "frequency-order" nor "item-order"
is specified, then the default is "item-order".
Only one of "fragment-frequency" or "item-frequency" may be specified
in the options parameter. If neither "fragment-frequency" nor
"item-frequency" is specified, then the default is "fragment-frequency".
Only one of "ascending" or "descending" may be specified
in the options parameter. If neither "ascending" nor "descending"
is specified, then the default is "ascending" if "item-order" is
specified, and "descending" if "frequency-order" is specified.
Only one of "any", "document", "properties", or "locks"
may be specified in the options parameter.
If none of "any", "document", "properties", or "locks" are specified
and there is a $query parameter, then the default is "document".
If there is no $query parameter then the default is "any".
Only one of the "score-logtfidf", "score-logtf", "score-simple",
or "score-random" options may be specified in the options parameter.
If none of "score-logtfidf", "score-logtf", "score-simple", or
"score-random" are specified, then the default is "score-logtfidf".
Only one of the "checked" or "unchecked" options may be specified
in the options parameter.
If neither "checked" nor "unchecked" is specified,
then the default is "checked".
If "coordinate-system=name" is not specified in the options
parameter, then the default coordinate system is used. If a lexicon with
that coordinate system does not exist, an error is thrown.
If "sample=N" is not specified in the options parameter,
then all included values may be returned. If a $query parameter
is not present, then "sample=N" has no effect.
If "truncate=N" is not specified in the options parameter,
then values from all fragments selected by the $query parameter
are included. If a $query parameter is not present, then
"truncate=N" has no effect.
|
Example:
cts:element-geospatial-value-match(xs:QName("point"),cts:point(10,20))
=> 10,20
|
|
|
|
cts:element-geospatial-values(
|
|
$element-names as xs:QName*,
|
|
[$start as cts:point?],
|
|
[$options as xs:string*],
|
|
[$query as cts:query?],
|
|
[$quality-weight as xs:double?],
|
|
[$forest-ids as xs:unsignedLong*]
|
| ) as cts:point* |
|
 |
Summary:
Returns values from the specified element geospatial value lexicon(s).
Geospatial value lexicons are implemented using geospatial indexes;
consequently this function requires an element geospatial index for each
element specified
in the function. If there is not a geospatial index configured for each
of the specified elements, an exception is thrown.
|
Parameters:
$element-names
:
One or more element QNames.
|
$start
(optional):
A starting value.
If the parameter value is not in the lexicon, then the values are
returned beginning with the next value.
|
$options
(optional):
Options. The default is ().
Options include:
- "ascending"
- Values should be returned in ascending order.
- "descending"
- Values should be returned in descending order.
- "any"
- Values from any fragment should be included.
- "document"
- Values from document fragments should be included.
- "properties"
- Values from properties fragments should be included.
- "locks"
- Values from locks fragments should be included.
- "frequency-order"
- Values should be returned ordered by frequency.
- "item-order"
- Values should be returned ordered by item.
- "fragment-frequency"
- Frequency should be the number of fragments with
an included value.
This option is used with
cts:frequency.
- "item-frequency"
- Frequency should be the number of occurrences of
an included value.
This option is used with
cts:frequency.
- "coordinate-system=URI"
- Use the lexicon with the coordinate system specified by
name.
- "skip=N"
- Skip over fragments selected by the
cts:query
to treat the Nth matching fragment as the first fragment.
Values from skipped fragments are not included.
This option affects the number of fragments selected
by the cts:query to calculate frequencies.
Only applies when a $query parameter is specified.
- "sample=N"
- Return only values from the first N
fragments after skip selected by the
cts:query.
This option does not affect the number of fragments selected
by the cts:query to calculate frequencies.
Only applies when a $query parameter is specified.
- "truncate=N"
- Include only values from the first N fragments
after skip selected by the
cts:query.
This option affects the number of fragments selected
by the cts:query to calculate frequencies.
Only applies when a $query parameter is specified.
- "type=long-lat-point"
- Specifies the format for the point in the data as longitude first,
latitude second.
- "type=point"
- Specifies the format for the point in the data as latitude first,
longitude second. This is the default format.
- "score-logtfidf"
- Compute scores using the logtfidf method.
- "score-logtf"
- Compute scores using the logtf method.
- "score-simple"
- Compute scores using the simple method.
- "score-random"
- Compute scores using the random method.
- "checked"
- Word positions should be checked when resolving the query.
- "unchecked"
- Word positions should not be checked when resolving the query.
- "concurrent"
- Perform the work concurrently in another thread. This is a hint
to the query optimizer to help parallelize the lexicon work, allowing
the calling query to continue performing other work while the lexicon
processing occurs. This is especially useful in cases where multiple
lexicon calls occur in the same query (for example, resolving many
facets in a single query).
- "map"
- Return results as a single map:map value instead of as a
cts:point* sequence.
|
$query
(optional):
Only include values in fragments selected by the cts:query,
and compute frequencies from this set of included values.
The values do not need to match the query, but they must occur in
fragments selected by the query.
The fragments are not filtered to ensure they match the query,
but instead selected in the same manner as
"unfiltered" cts:search operations.
|
$quality-weight
(optional):
A document quality weight to use when computing scores.
The default is 1.0.
|
$forest-ids
(optional):
A sequence of IDs of forests to which the search will be constrained.
An empty sequence means to search all forests in the database.
The default is ().
|
|
Usage Notes:
Only one of "frequency-order" or "item-order" may be specified
in the options parameter. If neither "frequency-order" nor "item-order"
is specified, then the default is "item-order".
Only one of "fragment-frequency" or "item-frequency" may be specified
in the options parameter. If neither "fragment-frequency" nor
"item-frequency" is specified, then the default is "fragment-frequency".
Only one of "ascending" or "descending" may be specified
in the options parameter. If neither "ascending" nor "descending"
is specified, then the default is "ascending" if "item-order" is
specified, and "descending" if "frequency-order" is specified.
Only one of "any", "document", "properties", or "locks"
may be specified in the options parameter.
If none of "any", "document", "properties", or "locks" are specified
and there is a $query parameter, then the default is "document".
If there is no $query parameter then the default is "any".
Only one of the "score-logtfidf", "score-logtf", "score-simple",
or "score-random" options may be specified in the options parameter.
If none of "score-logtfidf", "score-logtf", "score-simple", or
"score-random" are specified, then the default is "score-logtfidf".
Only one of the "checked" or "unchecked" options may be specified
in the options parameter.
If neither "checked" nor "unchecked" is specified,
then the default is "checked".
If "coordinate-system=name" is not specified in the options
parameter, then the default coordinate system is used. If a lexicon with
that coordinate system does not exist, an error is thrown.
If "sample=N" is not specified in the options parameter,
then all included values may be returned. If a $query parameter
is not present, then "sample=N" has no effect.
If "truncate=N" is not specified in the options parameter,
then values from all fragments selected by the $query parameter
are included. If a $query parameter is not present, then
"truncate=N" has no effect.
|
Example:
cts:element-geospatial-values(xs:QName("point"),cts:point(0,0))
=> (cts:point(0,0),cts:point(0,10),cts:point(0,20),...)
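A sketch of restricting the lexicon scan with the $query parameter, assuming the same element geospatial index on the point element as the example above. The word query term and the truncate value are made up for illustration.

```xquery
xquery version "1.0-ml";

(: Assumption: an element geospatial index exists on "point". :)
cts:element-geospatial-values(
  xs:QName("point"),
  (),                                (: no $start value :)
  ("descending", "truncate=1000"),
  cts:word-query("harbor"))          (: hypothetical constraining query :)
```

Only points from fragments selected by the word query are returned, and because a $query is supplied the default fragment scope is "document". The fragments are selected unfiltered, so the result can include points from fragments that a filtered search would reject.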
|
|
|
|
cts:element-pair-geospatial-boxes(
|
|
$parent-element-names as xs:QName*,
|
|
$latitude-names as xs:QName*,
|
|
$longitude-names as xs:QName*,
|
|
[$latitude-bounds as xs:double*],
|
|
[$longitude-bounds as xs:double*],
|
|
[$options as xs:string*],
|
|
[$query as cts:query?],
|
|
[$quality-weight as xs:double?],
|
|
[$forest-ids as xs:unsignedLong*]
|
| ) as cts:box* |
|
 |
Summary:
Returns boxes derived from the specified element point lexicon(s).
Element point lexicons are implemented using geospatial indexes; consequently
this function requires a geospatial element pair index for each
parent and pair of child elements
specified in the function. If there is not a geospatial index configured for
each of the specified combinations, an exception is thrown.
The points are divided into box-shaped buckets. The $latitude-bounds and
$longitude-bounds parameters specify the number and the size of each
box-shaped bucket. All included points are bucketed, even those outside
the bounds. An empty sequence for both $latitude-bounds and
$longitude-bounds specifies one bucket, a single value for both specifies
four buckets, two values for both specify nine buckets, and so on.
For each non-empty bucket, a cts:box value is returned.
By default, the cts:box value is the minimum bounding box
of all the points in the bucket. If the "gridded" option is specified,
then if a bucket is bounded on a side, its corresponding
cts:box side is the bound.
Empty buckets return nothing unless the "empties" option is specified.
|
Parameters:
$parent-element-names
:
One or more element QNames.
|
$latitude-names
:
One or more element QNames.
|
$longitude-names
:
One or more element QNames.
|
$latitude-bounds
(optional):
A sequence of latitude bounds.
The values must be in strictly ascending order, otherwise an exception
is thrown.
|
$longitude-bounds
(optional):
A sequence of longitude bounds.
The values must be in strictly ascending order, otherwise an exception
is thrown.
|
$options
(optional):
Options. The default is ().
Options include:
- "ascending"
- Boxes should be returned in ascending order.
- "descending"
- Boxes should be returned in descending order.
- "gridded"
- For each side on which a bucket is bounded, return the corresponding
bound as the edge of the box, instead of the extremum from the
points in the bucket.
- "empties"
- Include fully-bounded ranges whose frequency is 0. Only
empty ranges that have
both their upper and lower bounds specified in the $latitude-bounds
and $longitude-bounds parameters are returned;
any empty ranges that are less than the first bound or greater than the
last bound are not returned. For example, if you specify 4 bounds
and there are no results for any of the bounds, 3 elements are
returned (not 5 elements).
- "any"
- Points from any fragment should be included.
- "document"
- Points from document fragments should be included.
- "properties"
- Points from properties fragments should be included.
- "locks"
- Points from locks fragments should be included.
- "frequency-order"
- Boxes should be returned ordered by frequency.
- "item-order"
- Boxes should be returned ordered by item.
- "fragment-frequency"
- Frequency should be the number of fragments with
an included point.
This option is used with
cts:frequency.
- "item-frequency"
- Frequency should be the number of occurrences of
an included point.
This option is used with
cts:frequency.
- "coordinate-system=name"
- Use the lexicon with the coordinate system specified by
name.
- "limit=N"
- Return no more than N boxes.
- "skip=N"
- Skip over fragments selected by the
cts:query
to treat the Nth matching fragment as the first fragment.
Points from skipped fragments are not included.
This option affects the number of fragments selected
by the cts:query to calculate frequencies.
Only applies when a $query parameter is specified.
- "sample=N"
- Return only boxes for buckets with at least one point from
the first N fragments after skip selected by the
cts:query.
This option does not affect the number of fragments selected
by the cts:query to calculate frequencies.
Only applies when a $query parameter is specified.
- "truncate=N"
- Include only points from the first N fragments
after skip selected by the
cts:query.
This option affects the number of fragments selected
by the cts:query to calculate frequencies.
Only applies when a $query parameter is specified.
- "score-logtfidf"
- Compute scores using the logtfidf method.
- "score-logtf"
- Compute scores using the logtf method.
- "score-simple"
- Compute scores using the simple method.
- "score-random"
- Compute scores using the random method.
- "checked"
- Word positions should be checked when resolving the query.
- "unchecked"
- Word positions should not be checked when resolving the query.
- "concurrent"
- Perform the work concurrently in another thread. This is a hint
to the query optimizer to help parallelize the lexicon work, allowing
the calling query to continue performing other work while the lexicon
processing occurs. This is especially useful in cases where multiple
lexicon calls occur in the same query (for example, resolving many
facets in a single query).
- "map"
- Return results as a single map:map value instead of as a
cts:box* sequence.
|
$query
(optional):
Only include points in fragments selected by the cts:query,
and compute frequencies from this set of included points.
The points do not need to match the query, but they must occur in
fragments selected by the query.
The fragments are not filtered to ensure they match the query,
but instead selected in the same manner as
"unfiltered" cts:search operations.
|
$quality-weight
(optional):
A document quality weight to use when computing scores.
The default is 1.0.
|
$forest-ids
(optional):
A sequence of IDs of forests to which the search will be constrained.
An empty sequence means to search all forests in the database.
The default is ().
|
|
Usage Notes:
Only one of "frequency-order" or "item-order" may be specified
in the options parameter. If neither "frequency-order" nor "item-order"
is specified, then the default is "item-order".
Only one of "fragment-frequency" or "item-frequency" may be specified
in the options parameter. If neither "fragment-frequency" nor
"item-frequency" is specified, then the default is "fragment-frequency".
Only one of "ascending" or "descending" may be specified
in the options parameter. If neither "ascending" nor "descending"
is specified, then the default is "ascending" if "item-order" is
specified, and "descending" if "frequency-order" is specified.
Only one of "any", "document", "properties", or "locks"
may be specified in the options parameter.
If none of "any", "document", "properties", or "locks" are specified
and there is a $query parameter, then the default is "document".
If there is no $query parameter then the default is "any".
Only one of the "score-logtfidf", "score-logtf", "score-simple",
or "score-random" options may be specified in the options parameter.
If none of "score-logtfidf", "score-logtf", "score-simple", or
"score-random" are specified, then the default is "score-logtfidf".
Only one of the "checked" or "unchecked" options may be specified
in the options parameter.
If neither "checked" nor "unchecked" is specified,
then the default is "checked".
If "coordinate-system=name" is not specified in the options
parameter, then the default coordinate system is used. If a lexicon with
that coordinate system does not exist, an error is thrown.
If "sample=N" is not specified in the options parameter,
then all boxes with included points may be returned. If a $query
parameter is not present, then "sample=N" has no effect.
If "truncate=N" is not specified in the options parameter,
then points from all fragments selected by the $query parameter
are included. If a $query parameter is not present, then
"truncate=N" has no effect.
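The bucketing described in the summary can be sketched as follows. This is only an illustration under stated assumptions: the element names "point", "lat", and "long" are hypothetical, and a geospatial element pair index on that combination is assumed to exist. Two bounds on each axis give the nine buckets mentioned above, and "gridded" asks for bucket bounds rather than minimum bounding boxes where a bound exists.

```xquery
xquery version "1.0-ml";

(: Hypothetical names: assumes a geospatial element pair index
   on parent "point" with children "lat" and "long". :)
for $box in cts:element-pair-geospatial-boxes(
    xs:QName("point"), xs:QName("lat"), xs:QName("long"),
    (-30.0, 30.0),    (: two latitude bounds :)
    (-60.0, 60.0),    (: two longitude bounds :)
    ("gridded", "fragment-frequency"))
return
  (: one group of five values per non-empty bucket :)
  (cts:box-south($box), cts:box-west($box),
   cts:box-north($box), cts:box-east($box),
   cts:frequency($box))
```

Because "empties" is not specified, buckets without points are not expected to appear in the result.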
|
|
|
|
cts:element-pair-geospatial-query(
|
|
$element-name as xs:QName*,
|
|
$latitude-element-names as xs:QName*,
|
|
$longitude-element-names as xs:QName*,
|
|
$regions as cts:region*,
|
|
[$options as xs:string*],
|
|
[$weight as xs:double?]
|
| ) as cts:element-pair-geospatial-query |
|
 |
Summary:
Returns a cts:query matching elements by name that have
specific element children representing latitude and longitude values for
a point contained within the given geographic box, circle, or polygon, or
equal to the given point.
Points that lie
between the southern boundary and the northern boundary of a box,
travelling northwards,
and between the western boundary and the eastern boundary of the box,
travelling eastwards, will match.
Points contained within the given radius of the center point of a circle will
match, using the curved distance on the surface of the Earth.
Points contained within the given polygon will match, using great circle arcs
over a spherical model of the Earth as edges. An error may result
if the polygon is malformed in some way.
Points equal to a given point will match, taking into account the fact
that longitudes converge at the poles.
Using the geospatial query constructors requires a valid geospatial
license key; without a valid license key, searches that include
geospatial queries will throw an exception.
|
Parameters:
$element-name
:
One or more parent element QNames to match.
When multiple QNames are specified,
the query matches if any QName matches.
|
$latitude-element-names
:
One or more latitude element QNames to match.
When multiple QNames are specified, the query matches
if any QName matches; however, only the first matching latitude
child in any point instance will be checked.
|
$longitude-element-names
:
One or more longitude element QNames to match.
When multiple QNames are specified, the query matches
if any QName matches; however, only the first matching longitude
child in any point instance will be checked.
|
$regions
:
One or more geographic boxes, circles, polygons, or points. Where multiple
regions are specified, the query matches if any region matches.
|
$options
(optional):
Options to this query. The default is ().
Options include:
- "coordinate-system=wgs84"
- Use the WGS84 coordinate system.
- "units=miles"
- Distance (for circles) is measured in miles.
- "boundaries-included"
- Points on boxes', circles', and polygons' boundaries are counted as matching. This is the default.
- "boundaries-excluded"
- Points on boxes', circles', and polygons' boundaries are not counted as matching.
- "boundaries-latitude-excluded"
- Points on boxes' latitude boundaries are not counted as matching.
- "boundaries-longitude-excluded"
- Points on boxes' longitude boundaries are not counted as matching.
- "boundaries-south-excluded"
- Points on the boxes' southern boundaries are not counted as matching.
- "boundaries-west-excluded"
- Points on the boxes' western boundaries are not counted as matching.
- "boundaries-north-excluded"
- Points on the boxes' northern boundaries are not counted as matching.
- "boundaries-east-excluded"
- Points on the boxes' eastern boundaries are not counted as matching.
- "boundaries-circle-excluded"
- Points on circles' boundaries are not counted as matching.
- "cached"
- Cache the results of this query in the list cache.
- "uncached"
- Do not cache the results of this query in the list cache.
- "synonym"
- Specifies that all of the terms in the $regions parameter are
considered synonyms for scoring purposes. The result is that
occurrences of more than one of the synonyms are scored as if
there are more occurrences of the same term (as opposed to
having a separate term that contributes to score).
|
$weight
(optional):
A weight for this query. The default is 1.0.
This parameter is currently
ignored; geospatial queries do not contribute to the score.
|
|
Usage Notes:
The point value is expressed in the content of the element as a pair
of numbers, separated by whitespace and punctuation (excluding decimal points
and sign characters).
Point values and boundary specifications of boxes are given in degrees
relative to the WGS84 coordinate system. Southern latitudes and Western
longitudes take negative values. Longitudes will be wrapped to the range
(-180,+180) and latitudes will be clipped to the range (-90,+90).
If the northern boundary of a box is south of the southern boundary, no
points will match. However, longitudes wrap around the globe, so that if
the western boundary is east of the eastern boundary,
then the box crosses the anti-meridian.
Special handling occurs at the poles, as all longitudes exist at latitudes
+90 and -90.
If neither "cached" nor "uncached" is present, the default is "cached".
|
Example:
(: create a document with test data :)
xdmp:document-insert("/points.xml",
<root>
<item><point><lat>10.5</lat><long>30.0</long></point></item>
<item><point><lat>15.35</lat><long>35.34</long></point></item>
<item><point><lat>5.11</lat><long>40.55</long></point></item>
</root> );
cts:search(doc("/points.xml")//item,
cts:element-pair-geospatial-query(xs:QName("point"),
xs:QName("lat"), xs:QName("long"), cts:box(10.0, 35.0, 20.0, 40.0)))
(:
returns the following node:
<item><point><lat>15.35</lat><long>35.34</long></point></item>
:)
;
cts:search(doc("/points.xml")//item,
cts:element-pair-geospatial-query(xs:QName("point"),
xs:QName("lat"), xs:QName("long"), cts:box(10.0, 40.0, 20.0, 35.0)))
(:
returns the following node (wrapping around the Earth):
<item><point><lat>10.5</lat><long>30.0</long></point></item>
:)
;
cts:search(doc("/points.xml")//item,
cts:element-pair-geospatial-query(xs:QName("point"),
xs:QName("lat"), xs:QName("long"), cts:box(20.0, 35.0, 10.0, 40.0)))
(:
throws an error (latitudes do not wrap)
:)
;
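Circle regions and the options parameter are not covered by the examples above. A minimal sketch, reusing the hypothetical "/points.xml" test document from the first example, might look like this. The radius of 100 and the choice of center are illustrative only.

```xquery
xquery version "1.0-ml";

(: Search within 100 miles of the point (10.5, 30.0),
   not counting points that lie exactly on the circle boundary. :)
cts:search(doc("/points.xml")//item,
  cts:element-pair-geospatial-query(xs:QName("point"),
    xs:QName("lat"), xs:QName("long"),
    cts:circle(100, cts:point(10.5, 30.0)),
    ("units=miles", "boundaries-excluded")))
```

Since "boundaries-excluded" applies to boxes, circles, and polygons alike, "boundaries-circle-excluded" could be used instead when only circle boundaries should be excluded.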
|
|
|
|
cts:element-pair-geospatial-value-match(
|
|
$element-names as xs:QName*,
|
|
$latitude-names as xs:QName*,
|
|
$longitude-names as xs:QName*,
|
|
$pattern as xs:anyAtomicType,
|
|
[$options as xs:string*],
|
|
[$query as cts:query?],
|
|
[$quality-weight as xs:double?],
|
|
[$forest-ids as xs:unsignedLong*]
|
| ) as cts:point* |
|
 |
Summary:
Returns values from the specified element pair geospatial value lexicon(s)
that match the specified wildcard pattern. Element pair
geospatial value lexicons
are implemented using geospatial indexes; consequently this function
requires an element pair geospatial index for each combination of elements
specified in the
function. If there is not a geospatial index configured for each of the
specified combinations, then an exception is thrown.
|
Parameters:
$element-names
:
One or more element QNames.
|
$latitude-names
:
One or more latitude element QNames.
|
$longitude-names
:
One or more longitude element QNames.
|
$pattern
:
A pattern to match. The parameter type must match the lexicon type.
|
$options
(optional):
Options. The default is ().
Options include:
- "ascending"
- Values should be returned in ascending order.
- "descending"
- Values should be returned in descending order.
- "any"
- Values from any fragment should be included.
- "document"
- Values from document fragments should be included.
- "properties"
- Values from properties fragments should be included.
- "locks"
- Values from locks fragments should be included.
- "frequency-order"
- Values should be returned ordered by frequency.
- "item-order"
- Values should be returned ordered by item.
- "fragment-frequency"
- Frequency should be the number of fragments with
an included value.
This option is used with
cts:frequency.
- "item-frequency"
- Frequency should be the number of occurrences of
an included value.
This option is used with
cts:frequency.
- "coordinate-system=name"
- Use the lexicon with the coordinate system specified by
name.
- "limit=N"
- Return no more than N values.
- "skip=N"
- Skip over fragments selected by the
cts:query
to treat the Nth matching fragment as the first fragment.
Values from skipped fragments are not included.
This option affects the number of fragments selected
by the cts:query to calculate frequencies.
Only applies when a $query parameter is specified.
- "sample=N"
- Return only values from the first N
fragments after skip selected by the
cts:query.
This option does not affect the number of fragments selected
by the cts:query to calculate frequencies.
Only applies when a $query parameter is specified.
- "truncate=N"
- Include only values from the first N fragments
after skip selected by the
cts:query.
This option affects the number of fragments selected
by the cts:query to calculate frequencies.
Only applies when a $query parameter is specified.
- "score-logtfidf"
- Compute scores using the logtfidf method.
- "score-logtf"
- Compute scores using the logtf method.
- "score-simple"
- Compute scores using the simple method.
- "score-random"
- Compute scores using the random method.
- "checked"
- Word positions should be checked when resolving the query.
- "unchecked"
- Word positions should not be checked when resolving the query.
- "concurrent"
- Perform the work concurrently in another thread. This is a hint
to the query optimizer to help parallelize the lexicon work, allowing
the calling query to continue performing other work while the lexicon
processing occurs. This is especially useful in cases where multiple
lexicon calls occur in the same query (for example, resolving many
facets in a single query).
- "map"
- Return results as a single map:map value instead of as a
cts:point* sequence.
|
$query
(optional):
Only include values in fragments selected by the cts:query,
and compute frequencies from this set of included values.
The values do not need to match the query, but they must occur in
fragments selected by the query.
The fragments are not filtered to ensure they match the query,
but instead selected in the same manner as
"unfiltered" cts:search operations.
|
$quality-weight
(optional):
A document quality weight to use when computing scores.
The default is 1.0.
|
$forest-ids
(optional):
A sequence of IDs of forests to which the search will be constrained.
An empty sequence means to search all forests in the database.
The default is ().
|
|
Usage Notes:
Only one of "frequency-order" or "item-order" may be specified
in the options parameter. If neither "frequency-order" nor "item-order"
is specified, then the default is "item-order".
Only one of "fragment-frequency" or "item-frequency" may be specified
in the options parameter. If neither "fragment-frequency" nor
"item-frequency" is specified, then the default is "fragment-frequency".
Only one of "ascending" or "descending" may be specified
in the options parameter. If neither "ascending" nor "descending"
is specified, then the default is "ascending" if "item-order" is
specified, and "descending" if "frequency-order" is specified.
Only one of "any", "document", "properties", or "locks"
may be specified in the options parameter.
If none of "any", "document", "properties", or "locks" are specified
and there is a $query parameter, then the default is "document".
If there is no $query parameter then the default is "any".
Only one of the "score-logtfidf", "score-logtf", "score-simple",
or "score-random" options may be specified in the options parameter.
If none of "score-logtfidf", "score-logtf", "score-simple", or
"score-random" are specified, then the default is "score-logtfidf".
Only one of the "checked" or "unchecked" options may be specified
in the options parameter.
If neither "checked" nor "unchecked" is specified,
then the default is "checked".
If "coordinate-system=name" is not specified in the options
parameter, then the default coordinate system is used. If a lexicon with
that coordinate system does not exist, an error is thrown.
If "sample=N" is not specified in the options parameter,
then all included values may be returned. If a $query parameter
is not present, then "sample=N" has no effect.
If "truncate=N" is not specified in the options parameter,
then values from all fragments selected by the $query parameter
are included. If a $query parameter is not present, then
"truncate=N" has no effect.
|
Example:
cts:element-pair-geospatial-value-match(
xs:QName("location"),xs:QName("lat"),xs:QName("long"),cts:point(10,20))
=> 10,20
|
|
|
|
cts:element-pair-geospatial-values(
|
|
$element-names as xs:QName*,
|
|
$latitude-names as xs:QName*,
|
|
$longitude-names as xs:QName*,
|
|
[$start as cts:point?],
|
|
[$options as xs:string*],
|
|
[$query as cts:query?],
|
|
[$quality-weight as xs:double?],
|
|
[$forest-ids as xs:unsignedLong*]
|
| ) as cts:point* |
|
 |
Summary:
Returns values from the specified element-pair geospatial value lexicon(s).
Element-pair geospatial value lexicons are implemented using geospatial
indexes;
consequently this function requires an element-pair geospatial index
for each of the combinations specified in the function.
If there is not a geospatial index configured for each of the specified
combinations, then an exception is thrown.
|
Parameters:
$element-names
:
One or more element QNames.
|
$latitude-names
:
One or more latitude element QNames.
|
$longitude-names
:
One or more longitude element QNames.
|
$start
(optional):
A starting value.
If the parameter value is not in the lexicon, then the values are
returned beginning with the next value.
|
$options
(optional):
Options. The default is ().
Options include:
- "ascending"
- Values should be returned in ascending order.
- "descending"
- Values should be returned in descending order.
- "any"
- Values from any fragment should be included.
- "document"
- Values from document fragments should be included.
- "properties"
- Values from properties fragments should be included.
- "locks"
- Values from locks fragments should be included.
- "frequency-order"
- Values should be returned ordered by frequency.
- "item-order"
- Values should be returned ordered by item.
- "fragment-frequency"
- Frequency should be the number of fragments with
an included value.
This option is used with
cts:frequency.
- "item-frequency"
- Frequency should be the number of occurrences of
an included value.
This option is used with
cts:frequency.
- "coordinate-system=name"
- Use the lexicon with the coordinate system specified by
name.
- "limit=N"
- Return no more than N values.
- "skip=N"
- Skip over fragments selected by the
cts:query
to treat the Nth matching fragment as the first fragment.
Values from skipped fragments are not included.
This option affects the number of fragments selected
by the cts:query to calculate frequencies.
Only applies when a $query parameter is specified.
- "sample=N"
- Return only values from the first N fragments after
skip selected by the
cts:query.
This option does not affect the number of fragments selected
by the cts:query to calculate frequencies.
Only applies when a $query parameter is specified.
- "truncate=N"
- Include only values from the first N fragments
after skip selected by the
cts:query.
This option affects the number of fragments selected
by the cts:query to calculate frequencies.
Only applies when a $query parameter is specified.
- "score-logtfidf"
- Compute scores using the logtfidf method.
- "score-logtf"
- Compute scores using the logtf method.
- "score-simple"
- Compute scores using the simple method.
- "score-random"
- Compute scores using the random method.
- "checked"
- Word positions should be checked when resolving the query.
- "unchecked"
- Word positions should not be checked when resolving the query.
- "concurrent"
- Perform the work concurrently in another thread. This is a hint
to the query optimizer to help parallelize the lexicon work, allowing
the calling query to continue performing other work while the lexicon
processing occurs. This is especially useful in cases where multiple
lexicon calls occur in the same query (for example, resolving many
facets in a single query).
- "map"
- Return results as a single map:map value instead of as a
cts:point* sequence.
|
$query
(optional):
Only include values in fragments selected by the cts:query,
and compute frequencies from this set of included values.
The values do not need to match the query, but they must occur in
fragments selected by the query.
The fragments are not filtered to ensure they match the query,
but instead selected in the same manner as
"unfiltered" cts:search operations.
|
$quality-weight
(optional):
A document quality weight to use when computing scores.
The default is 1.0.
|
$forest-ids
(optional):
A sequence of IDs of forests to which the search will be constrained.
An empty sequence means to search all forests in the database.
The default is ().
|
|
Usage Notes:
Only one of "frequency-order" or "item-order" may be specified
in the options parameter. If neither "frequency-order" nor "item-order"
is specified, then the default is "item-order".
Only one of "fragment-frequency" or "item-frequency" may be specified
in the options parameter. If neither "fragment-frequency" nor
"item-frequency" is specified, then the default is "fragment-frequency".
Only one of "ascending" or "descending" may be specified
in the options parameter. If neither "ascending" nor "descending"
is specified, then the default is "ascending" if "item-order" is
specified, and "descending" if "frequency-order" is specified.
Only one of "any", "document", "properties", or "locks"
may be specified in the options parameter.
If none of "any", "document", "properties", or "locks" are specified
and there is a $query parameter, then the default is "document".
If there is no $query parameter then the default is "any".
Only one of the "score-logtfidf", "score-logtf", "score-simple",
or "score-random" options may be specified in the options parameter.
If none of "score-logtfidf", "score-logtf", "score-simple", or
"score-random" are specified, then the default is "score-logtfidf".
Only one of the "checked" or "unchecked" options may be specified
in the options parameter.
If neither "checked" nor "unchecked" is specified,
then the default is "checked".
If "coordinate-system=name" is not specified in the options
parameter, then the default coordinate system is used. If a lexicon with
that coordinate system does not exist, an error is thrown.
If "sample=N" is not specified in the options parameter,
then all included values may be returned. If a $query parameter
is not present, then "sample=N" has no effect.
If "truncate=N" is not specified in the options parameter,
then values from all fragments selected by the $query parameter
are included. If a $query parameter is not present, then
"truncate=N" has no effect.
When multiple element and/or child QNames are specified,
then all possible element/child QName combinations are used
to select the matching values.
|
Example:
cts:element-pair-geospatial-values(
xs:QName("location"), xs:QName("lat"), xs:QName("long"), cts:point(0,0) )
=> (cts:point(0,0),cts:point(0,10),cts:point(0,20),...)
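The frequency options are meant to be used together with cts:frequency. A rough sketch of that pairing follows; it assumes the hypothetical element names "location", "lat", and "long" and a matching element-pair geospatial index. The empty sequence is passed for $start so that no starting value is imposed.

```xquery
xquery version "1.0-ml";

(: Hypothetical names: assumes an element-pair geospatial index
   on "location" with children "lat" and "long". :)
for $pt in cts:element-pair-geospatial-values(
    xs:QName("location"), xs:QName("lat"), xs:QName("long"),
    (),                                (: no starting value :)
    ("frequency-order", "limit=10"))
return
  (cts:point-latitude($pt), cts:point-longitude($pt),
   cts:frequency($pt))
```

With "frequency-order" and no explicit direction, the usage notes above indicate that the default order is "descending", so the most frequent points should come first.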
|
|
|
|
cts:element-query(
|
|
$element-name as xs:QName*,
|
|
$query as cts:query
|
| ) as cts:element-query |
|
 |
Summary:
Returns a cts:query matching elements by name
with the content constrained by the given cts:query in the
second parameter.
Searches for matches in the specified element and all of its descendants.
If the specified query in the second parameter has any
cts:element-attribute-*-query constructors, it will search
attributes directly on the specified element and attributes on any
descendant elements (see the second example below).
|
Parameters:
$element-name
:
One or more element QNames to match.
When multiple QNames are specified,
the query matches if any QName matches.
|
$query
:
A query for the element to match. If a string
is entered, the string is treated as a cts:word-query of the
specified string.
|
|
Usage Notes:
Enabling both the word position and element position indexes ("word position"
and "element word position" in the database configuration screen of the
Admin Interface) will speed up query performance for many queries that use
cts:element-query. The position indexes enable MarkLogic
Server to eliminate many false-positive results, which can reduce
disk I/O and processing, thereby speeding the performance of many queries.
The amount of benefit will vary depending on your data.
|
Example:
cts:search(//module,
cts:element-query(
xs:QName("function"),
"MarkLogic Corporation"))
=> .. relevance-ordered sequence of 'module' elements
ancestors (or self) of elements with QName 'function'
and text content containing the phrase 'MarkLogic
Corporation'.
|
Example:
let $x := <a attr="something">hello</a>
return
cts:contains($x, cts:element-query(xs:QName("a"),
cts:and-query((
cts:element-attribute-word-query(xs:QName("a"),
xs:QName("attr"), "something"),
cts:word-query("hello")))))
(: returns true :)
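The parameter description notes that, when several QNames are given, the query matches if any of them matches. A small sketch of that behavior on an in-memory node follows; the element names are made up for illustration.

```xquery
xquery version "1.0-ml";

(: Illustrative node: only "title" contains the word "hello",
   and there is no "summary" element at all. :)
let $x := <doc><title>hello world</title><body>goodbye</body></doc>
return
  cts:contains($x,
    cts:element-query(
      (xs:QName("title"), xs:QName("summary")),
      cts:word-query("hello")))
```

The query is expected to match through the "title" element even though no "summary" element is present.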
|
|
|
|
cts:element-range-query(
|
|
$element-name as xs:QName*,
|
|
$operator as xs:string,
|
|
$value as xs:anyAtomicType*,
|
|
[$options as xs:string*],
|
|
[$weight as xs:double?]
|
| ) as cts:element-range-query |
|
 |
Summary:
Returns a cts:query matching elements by name with a
range-index entry that compares to a given value according to the
specified operator. Searches with the
cts:element-range-query
constructor require an element range index on the specified QName(s);
if there is no range index configured, then an exception is thrown.
|
Parameters:
$element-name
:
One or more element QNames to match.
When multiple QNames are specified,
the query matches if any QName matches.
|
$operator
:
A comparison operator.
Operators include:
- "<"
- Match range index values less than $value.
- "<="
- Match range index values less than or equal to $value.
- ">"
- Match range index values greater than $value.
- ">="
- Match range index values greater than or equal to $value.
- "="
- Match range index values equal to $value.
- "!="
- Match range index values not equal to $value.
|
$value
:
One or more element values to match.
When multiple values are specified,
the query matches if any value matches.
|
$options
(optional):
Options to this query. The default is ().
Options include:
- "collation=URI"
- Use the range index with the collation specified by
URI. If not specified, then the default collation
from the query is used. If a range index with the specified
collation does not exist, an error is thrown.
- "cached"
- Cache the results of this query in the list cache.
- "uncached"
- Do not cache the results of this query in the list cache.
- "min-occurs=number"
- Specifies the minimum number of occurrences required. If
fewer than this number of words occur, the fragment does not match.
The default is 1.
- "max-occurs=number"
- Specifies the maximum number of occurrences allowed. If
more than this number of words occur, the fragment does not match.
The default is unbounded.
- "synonym"
- Specifies that all of the terms in the $value parameter are
considered synonyms for scoring purposes. The result is that
occurrences of more than one of the synonyms are scored as if
there are more occurrences of the same term (as opposed to
having a separate term that contributes to score).
|
$weight
(optional):
A weight for this query. The default is 1.0. In the current release,
this parameter is ignored; range queries do not contribute to the score.
|
|
Usage Notes:
If you want to constrain on a range of values, you can combine multiple
cts:element-range-query constructors together
with cts:and-query or any of the other composable
cts:query constructors, as in the last part of the example
below.
If neither "cached" nor "uncached" is present, the default is "cached".
Negative "min-occurs" or "max-occurs" values will be treated as 0 and
non-integral values will be rounded down. An error will be raised if
the "min-occurs" value is greater than the "max-occurs" value.
|
Example:
(: create a document with test data :)
xdmp:document-insert("/dates.xml",
<root>
<entry>
<date>2007-01-01</date>
<info>Some information.</info>
</entry>
<entry>
<date>2006-06-23</date>
<info>Some other information.</info>
</entry>
<entry>
<date>1971-12-23</date>
<info>Some different information.</info>
</entry>
</root>);
(:
requires an element (range) index of
type xs:date on "date"
:)
cts:search(doc("/dates.xml")/root/entry,
cts:element-range-query(xs:QName("date"), "<=",
xs:date("2000-01-01")))
(:
returns the following node:
<entry>
<date>1971-12-23</date>
<info>Some different information.</info>
</entry>
:)
;
(:
requires an element (range) index of
type xs:date on "date"
:)
cts:search(doc("/dates.xml")/root/entry,
cts:and-query((
cts:element-range-query(xs:QName("date"), ">",
xs:date("2006-01-01")),
cts:element-range-query(xs:QName("date"), "<",
xs:date("2008-01-01")))))
(:
returns the following 2 nodes:
<entry>
<date>2007-01-01</date>
<info>Some information.</info>
</entry>
<entry>
<date>2006-06-23</date>
<info>Some other information.</info>
</entry>
:)
|
|
|
|
cts:element-value-co-occurrences(
|
|
$element-name-1 as xs:QName,
|
|
$element-name-2 as xs:QName,
|
|
[$options as xs:string*],
|
|
[$query as cts:query?],
|
|
[$quality-weight as xs:double?],
|
|
[$forest-ids as xs:unsignedLong*]
|
| ) as element(cts:co-occurrence)* |
|
 |
Summary:
Returns value co-occurrences (that is, pairs of values, both of which appear
in the same fragment) from the specified element value lexicon(s). The
values are returned as an XML element with two children, each child
containing one of the co-occurring values. You can use
cts:frequency on each item returned to find how many times
the pair occurs.
Value lexicons are implemented using range indexes; consequently
this function requires an element range index for each element specified
in the function, and the range index must have range value positions
set to true. If there is not a range index configured for each
of the specified elements, or if range value positions is not
enabled for any of the range indexes, an exception is thrown.
|
Parameters:
$element-name-1
:
An element QName.
|
$element-name-2
:
An element QName.
|
$options
(optional):
Options. The default is ().
Options include:
- "ascending"
- Co-occurrences should be returned in ascending order.
- "descending"
- Co-occurrences should be returned in descending order.
- "any"
- Co-occurrences from any fragment should be included.
- "document"
- Co-occurrences from document fragments should be included.
- "properties"
- Co-occurrences from properties fragments should be included.
- "locks"
- Co-occurrences from locks fragments should be included.
- "frequency-order"
- Co-occurrences should be returned ordered by frequency.
- "item-order"
- Co-occurrences should be returned ordered by item.
- "fragment-frequency"
- Frequency should be the number of fragments with
an included co-occurrence.
This option is used with
cts:frequency.
- "item-frequency"
- Frequency should be the number of occurrences of
an included co-occurrence.
This option is used with
cts:frequency.
- "type=type"
- For both lexicons, use the type specified by type
(int, unsignedInt, long, unsignedLong, float, double, decimal,
dateTime, time, date, gYearMonth, gYear, gMonth, gDay,
yearMonthDuration, dayTimeDuration, string, or anyURI)
- "type-1=type"
- For the first lexicon, use the type specified by type
(int, unsignedInt, long, unsignedLong, float, double, decimal,
dateTime, time, date, gYearMonth, gYear, gMonth, gDay,
yearMonthDuration, dayTimeDuration, string, or anyURI)
- "type-2=type"
- For the second lexicon, use the type specified by type
(int, unsignedInt, long, unsignedLong, float, double, decimal,
dateTime, time, date, gYearMonth, gYear, gMonth, gDay,
yearMonthDuration, dayTimeDuration, string, or anyURI)
- "collation=URI"
- For both lexicons, use the collation specified by
URI.
- "collation-1=URI"
- For the first lexicon, use the collation specified by
URI.
- "collation-2=URI"
- For the second lexicon, use the collation specified by
URI.
- "timezone=TZ"
- Return timezone sensitive values (dateTime, time, date,
gYearMonth, gYear, gMonth, and gDay) adjusted to the timezone
specified by TZ.
Example timezones: Z, -08:00, +01:00.
- "ordered"
- Include co-occurrences only when the value from the first lexicon
appears before the value from the second lexicon.
Requires that word positions be enabled for both lexicons.
- "proximity=N"
- Include co-occurrences only when the values appear within
N words of each other.
Requires that word positions be enabled for both lexicons.
- "limit=N"
- Return no more than N co-occurrences.
- "skip=N"
- Skip over fragments selected by the
cts:query
to treat the Nth fragment as the first fragment.
Co-occurrences from skipped fragments are not included.
This option affects the number of fragments selected
by the cts:query to calculate frequencies.
Only applies when a $query parameter is specified.
- "sample=N"
- Return only co-occurrences from the first N
fragments after skip selected by the
cts:query.
This option does not affect the number of fragments selected
by the cts:query to calculate frequencies.
Only applies when a $query parameter is specified.
- "truncate=N"
- Include only co-occurrences from the first N
fragments after skip selected by the
cts:query.
This option also affects the number of fragments selected
by the cts:query to calculate frequencies.
Only applies when a $query parameter is specified.
- "score-logtfidf"
- Compute scores using the logtfidf method.
- "score-logtf"
- Compute scores using the logtf method.
- "score-simple"
- Compute scores using the simple method.
- "score-random"
- Compute scores using the random method.
- "checked"
- Word positions should be checked when resolving the query.
- "unchecked"
- Word positions should not be checked when resolving the query.
- "concurrent"
- Perform the work concurrently in another thread. This is a hint
to the query optimizer to help parallelize the lexicon work, allowing
the calling query to continue performing other work while the lexicon
processing occurs. This is especially useful in cases where multiple
lexicon calls occur in the same query (for example, resolving many
facets in a single query).
- "map"
- Return results as a single map:map value instead of as an
element(cts:co-occurrence)* sequence.
|
$query
(optional):
Only include co-occurrences in fragments selected by the cts:query,
and compute frequencies from this set of included co-occurrences.
The co-occurrences do not need to match the query, but they must occur in
fragments selected by the query.
The fragments are not filtered to ensure they match the query,
but instead selected in the same manner as
"unfiltered" cts:search operations. If a string
is entered, the string is treated as a cts:word-query of the
specified string.
|
$quality-weight
(optional):
A document quality weight to use when computing scores.
The default is 1.0.
|
$forest-ids
(optional):
A sequence of IDs of forests to which the search will be constrained.
An empty sequence means to search all forests in the database.
The default is ().
|
|
Usage Notes:
Only one of "frequency-order" or "item-order" may be specified
in the options parameter. If neither "frequency-order" nor "item-order"
is specified, then the default is "item-order".
Only one of "fragment-frequency" or "item-frequency" may be specified
in the options parameter. If neither "fragment-frequency" nor
"item-frequency" is specified, then the default is "fragment-frequency".
Only one of "ascending" or "descending" may be specified
in the options parameter. If neither "ascending" nor "descending"
is specified, then the default is "ascending" if "item-order" is
specified, and "descending" if "frequency-order" is specified.
Only one of "any", "document", "properties", or "locks"
may be specified in the options parameter.
If none of "any", "document", "properties", or "locks" are specified
and there is a $query parameter, then the default is "document".
If there is no $query parameter then the default is "any".
Only one of the "score-logtfidf", "score-logtf", "score-simple",
or "score-random" options may be specified in the options parameter.
If none of "score-logtfidf", "score-logtf", "score-simple", or
"score-random" are specified, then the default is "score-logtfidf".
Only one of the "checked" or "unchecked" options may be specified
in the options parameter.
If neither "checked" nor "unchecked" is specified,
then the default is "checked".
If "collation=URI" is not specified in the options parameter,
then the default collation is used. If a lexicon with that collation
does not exist, an error is thrown.
If "sample=N" is not specified in the options parameter,
then all included co-occurrences may be returned.
If a $query parameter
is not present, then "sample=N" has no effect.
If "truncate=N" is not specified in the options parameter,
then co-occurrences from all fragments selected by the
$query parameter are included.
If a $query parameter is not present, then
"truncate=N" has no effect.
|
Example:
(:
this query has the database fragmented on SPEECH and
finds the first 3 SPEAKERs that co-occur in a SPEECH
:)
cts:element-value-co-occurrences(
xs:QName("SPEAKER"),xs:QName("SPEAKER"),
("frequency-order","ordered"),
cts:document-query("hamlet.xml"))[1 to 3]
=>
<cts:co-occurrence xmlns:cts="http://marklogic.com/cts"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:xs="http://www.w3.org/2001/XMLSchema">
<cts:value xsi:type="xs:string">MARCELLUS</cts:value>
<cts:value xsi:type="xs:string">BERNARDO</cts:value>
</cts:co-occurrence>
<cts:co-occurrence xmlns:cts="http://marklogic.com/cts"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:xs="http://www.w3.org/2001/XMLSchema">
<cts:value xsi:type="xs:string">ROSENCRANTZ</cts:value>
<cts:value xsi:type="xs:string">GUILDENSTERN</cts:value>
</cts:co-occurrence>
<cts:co-occurrence xmlns:cts="http://marklogic.com/cts"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:xs="http://www.w3.org/2001/XMLSchema">
<cts:value xsi:type="xs:string">HORATIO</cts:value>
<cts:value xsi:type="xs:string">MARCELLUS</cts:value>
</cts:co-occurrence>
|
Example:
(:
this query has the database fragmented on SPEECH and
finds SPEAKERs that co-occur in a SPEECH, returned
as a map
:)
cts:element-value-co-occurrences(
xs:QName("SPEAKER"),xs:QName("SPEAKER"),
("frequency-order","ordered", "map"),
cts:document-query("hamlet.xml"))
=>
map:map(
<map:map xmlns:xs="http://www.w3.org/2001/XMLSchema"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:map="http://marklogic.com/xdmp/map">
<map:entry key="HORATIO">
<map:value xsi:type="xs:string">MARCELLUS</map:value>
</map:entry>
<map:entry key="CORNELIUS">
<map:value xsi:type="xs:string">VOLTIMAND</map:value>
</map:entry>
<map:entry key="MARCELLUS">
<map:value xsi:type="xs:string">BERNARDO</map:value>
<map:value xsi:type="xs:string">HORATIO</map:value>
</map:entry>
<map:entry key="ROSENCRANTZ">
<map:value xsi:type="xs:string">GUILDENSTERN</map:value>
</map:entry>
</map:map>)
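|
Example:
The summary above notes that cts:frequency can be applied to each returned co-occurrence. The following is a minimal sketch of that pattern, reusing the SPEAKER lexicon and the "hamlet.xml" document from the examples above. It assumes a string range index on SPEAKER with range value positions enabled; the layout of the returned strings is only illustrative.

```xquery
xquery version "1.0-ml";

(: Sketch: list up to ten SPEAKER pairs, most frequent first, together
   with the number of fragments in which each pair occurs.
   Assumes a string range index on SPEAKER with range value positions on. :)
for $pair in cts:element-value-co-occurrences(
               xs:QName("SPEAKER"), xs:QName("SPEAKER"),
               ("frequency-order", "fragment-frequency", "limit=10"),
               cts:document-query("hamlet.xml"))
return
  fn:concat(
    fn:string($pair/cts:value[1]), " + ",
    fn:string($pair/cts:value[2]), ": ",
    fn:string(cts:frequency($pair)))
```

Replacing "fragment-frequency" with "item-frequency" should make cts:frequency report the number of occurrences rather than the number of fragments, as described in the options list.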
|
|
|
|
cts:element-value-geospatial-co-occurrences(
|
|
$element-name-1 as xs:QName,
|
|
$geo-element-name as xs:QName,
|
|
$child-name-1 as xs:QName?,
|
|
$child-name-2 as xs:QName?,
|
|
[$options as xs:string*],
|
|
[$query as cts:query?],
|
|
[$quality-weight as xs:double?],
|
|
[$forest-ids as xs:unsignedLong*]
|
| ) as element(cts:co-occurrence)* |
|
 |
Summary:
Returns value co-occurrences from the specified element
value lexicon with the specified geospatial lexicon.
Value lexicons are implemented using range indexes;
consequently this function requires a range index for the element
specified in the function.
If there is not a range index configured for the specified
element, then an exception is thrown.
Geospatial lexicons are implemented using geospatial indexes;
consequently this function requires a geospatial index for the
element/attribute combination specified in the function.
If there is not a geospatial index configured for the specified
element/attribute combination, then an exception is thrown.
|
Parameters:
$element-name-1
:
An element QName.
|
$geo-element-name
:
An element QName.
|
$child-name-1
:
An element or attribute QName or empty sequence.
The empty sequence specifies an element geospatial lexicon.
|
$child-name-2
:
An element or attribute QName or empty sequence.
The empty sequence specifies either an element lexicon or an
element-child geospatial lexicon.
|
$options
(optional):
Options. The default is ().
Options include:
- "geospatial-format=format"
- Use the kind of geospatial lexicon specified by format
(element, element-child, element-pair, or element-attribute-pair).
If neither of the child QNames is specified, the default is
"element"; if only the first of the child QNames is specified,
the default is "element-child"; if both child QNames are specified,
the default is "element-pair". If the selection is not compatible
with the number of geospatial QNames specified, an error is raised.
- "ascending"
- Co-occurrences should be returned in ascending order.
- "descending"
- Co-occurrences should be returned in descending order.
- "any"
- Co-occurrences from any fragment should be included.
- "document"
- Co-occurrences from document fragments should be included.
- "properties"
- Co-occurrences from properties fragments should be included.
- "locks"
- Co-occurrences from locks fragments should be included.
- "frequency-order"
- Co-occurrences should be returned ordered by frequency.
- "item-order"
- Co-occurrences should be returned ordered by item.
- "fragment-frequency"
- Frequency should be the number of fragments with
an included co-occurrence.
This option is used with
cts:frequency.
- "item-frequency"
- Frequency should be the number of occurrences of
an included co-occurrence.
This option is used with
cts:frequency.
- "type=type"
- For the non-geospatial lexicon, use the type specified by
type (int, unsignedInt, long, unsignedLong, float, double,
decimal, dateTime, time, date, gYearMonth, gYear, gMonth, gDay,
yearMonthDuration, dayTimeDuration, string, or anyURI)
- "type-2=type"
- For the geospatial lexicon, use the type specified by
type (point or long-lat-point)
- "collation=URI"
- For the non-geospatial lexicon, use the collation specified by
URI.
- "coordinate-system=name"
- For the geospatial lexicons, use the coordinate system specified
by name.
- "timezone=TZ"
- Return timezone sensitive values (dateTime, time, date,
gYearMonth, gYear, gMonth, and gDay) adjusted to the timezone
specified by TZ.
Example timezones: Z, -08:00, +01:00.
- "ordered"
- Include co-occurrences only when the value from the first lexicon
appears before the value from the second lexicon.
Requires that word positions be enabled for both lexicons.
- "reversed"
- Consider the second lexicon as the first and vice versa.
- "proximity=N"
- Include co-occurrences only when the values appear within
N words of each other.
Requires that word positions be enabled for both lexicons.
- "limit=N"
- Return no more than N co-occurrences.
- "skip=N"
- Skip over fragments selected by the
cts:query
to treat the Nth matching fragment as the first fragment.
Co-occurrences from skipped fragments are not included.
This option affects the number of fragments selected
by the cts:query to calculate frequencies.
Only applies when a $query parameter is specified.
- "sample=N"
- Return only co-occurrences from the first N
fragments after skip selected by the
cts:query.
This option does not affect the number of fragments selected
by the cts:query to calculate frequencies.
Only applies when a $query parameter is specified.
- "truncate=N"
- Include only co-occurrences from the first N
fragments after skip selected by the
cts:query.
This option affects the number of fragments selected
by the cts:query to calculate frequencies.
Only applies when a $query parameter is specified.
- "score-logtfidf"
- Compute scores using the logtfidf method.
- "score-logtf"
- Compute scores using the logtf method.
- "score-simple"
- Compute scores using the simple method.
- "score-random"
- Compute scores using the random method.
- "checked"
- Word positions should be checked when resolving the query.
- "unchecked"
- Word positions should not be checked when resolving the query.
- "concurrent"
- Perform the work concurrently in another thread. This is a hint
to the query optimizer to help parallelize the lexicon work, allowing
the calling query to continue performing other work while the lexicon
processing occurs. This is especially useful in cases where multiple
lexicon calls occur in the same query (for example, resolving many
facets in a single query).
- "map"
- Return results as a single map:map value instead of as an
element(cts:co-occurrence)* sequence.
|
$query
(optional):
Only include co-occurrences in fragments selected by the cts:query,
and compute frequencies from this set of included co-occurrences.
The co-occurrences do not need to match the query, but they must occur in
fragments selected by the query.
The fragments are not filtered to ensure they match the query,
but instead selected in the same manner as
"unfiltered" cts:search operations.
|
$quality-weight
(optional):
A document quality weight to use when computing scores.
The default is 1.0.
|
$forest-ids
(optional):
A sequence of IDs of forests to which the search will be constrained.
An empty sequence means to search all forests in the database.
The default is ().
|
|
Usage Notes:
Only one of "frequency-order" or "item-order" may be specified
in the options parameter. If neither "frequency-order" nor "item-order"
is specified, then the default is "item-order".
Only one of "fragment-frequency" or "item-frequency" may be specified
in the options parameter. If neither "fragment-frequency" nor
"item-frequency" is specified, then the default is "fragment-frequency".
Only one of "ascending" or "descending" may be specified
in the options parameter. If neither "ascending" nor "descending"
is specified, then the default is "ascending" if "item-order" is
specified, and "descending" if "frequency-order" is specified.
Only one of "any", "document", "properties", or "locks"
may be specified in the options parameter.
If none of "any", "document", "properties", or "locks" are specified
and there is a $query parameter, then the default is "document".
If there is no $query parameter then the default is "any".
Only one of the "score-logtfidf", "score-logtf", "score-simple",
or "score-random" options may be specified in the options parameter.
If none of "score-logtfidf", "score-logtf", "score-simple", or
"score-random" are specified, then the default is "score-logtfidf".
Only one of the "checked" or "unchecked" options may be specified
in the options parameter.
If neither "checked" nor "unchecked" is specified,
then the default is "checked".
If "collation=URI" is not specified in the options parameter,
then the default collation is used. If a lexicon with that collation
does not exist, an error is thrown.
If "coordinate-system=name" is not specified in the options
parameter, then the default coordinate system is used. If a lexicon with
that coordinate system does not exist, an error is thrown.
If "sample=N" is not specified in the options parameter,
then all included co-occurrences may be returned.
If a $query parameter
is not present, then "sample=N" has no effect.
If "truncate=N" is not specified in the options parameter,
then co-occurrences from all fragments selected by the
$query parameter are included.
If a $query parameter is not present, then
"truncate=N" has no effect.
|
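Example:
The following is a minimal sketch of a call that pairs an element value lexicon with an element geospatial lexicon. The element names "city" and "location" are hypothetical and do not come from this reference. The sketch assumes a string range index on "city" and a geospatial element index on "location", each configured with whatever position settings the co-occurrence lookup requires.

```xquery
xquery version "1.0-ml";

(: Sketch only: "city" and "location" are hypothetical element names.
   Passing () for both child names leaves the geospatial format at its
   default of "element", as described under "geospatial-format=format". :)
for $pair in cts:element-value-geospatial-co-occurrences(
               xs:QName("city"),
               xs:QName("location"),
               (), (),
               ("frequency-order", "limit=5"))
return
  fn:concat(
    fn:string($pair/cts:value[1]), " @ ",
    fn:string($pair/cts:value[2]), ": ",
    fn:string(cts:frequency($pair)))
```

To use an element-pair or element-attribute-pair lexicon instead, supply both child QNames and, where the default does not apply, set "geospatial-format=format" explicitly.
|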
|
|
|
cts:element-value-match(
|
|
$element-names as xs:QName*,
|
|
$pattern as xs:anyAtomicType,
|
|
[$options as xs:string*],
|
|
[$query as cts:query?],
|
|
[$quality-weight as xs:double?],
|
|
[$forest-ids as xs:unsignedLong*]
|
| ) as xs:anyAtomicType* |
|
 |
Summary:
Returns values from the specified element value lexicon(s)
that match the specified wildcard pattern. Element value lexicons
are implemented using range indexes; consequently this function
requires an element range index for each element specified in the
function. If there is not a range index configured for each of the
specified elements, then an exception is thrown.
|
Parameters:
$element-names
:
One or more element QNames.
|
$pattern
:
A pattern to match. The parameter type must match the lexicon type.
String parameters may include wildcard characters.
|
$options
(optional):
Options. The default is ().
Options include:
- "case-sensitive"
- A case-sensitive match.
- "case-insensitive"
- A case-insensitive match.
- "diacritic-sensitive"
- A diacritic-sensitive match.
- "diacritic-insensitive"
- A diacritic-insensitive match.
- "ascending"
- Values should be returned in ascending order.
- "descending"
- Values should be returned in descending order.
- "any"
- Values from any fragment should be included.
- "document"
- Values from document fragments should be included.
- "properties"
- Values from properties fragments should be included.
- "locks"
- Values from locks fragments should be included.
- "frequency-order"
- Values should be returned ordered by frequency.
- "item-order"
- Values should be returned ordered by item.
- "fragment-frequency"
- Frequency should be the number of fragments with
an included value.
This option is used with
cts:frequency.
- "item-frequency"
- Frequency should be the number of occurrences of
an included value.
This option is used with
cts:frequency.
- "type=type"
- Use the lexicon with the type specified by type
(int, unsignedInt, long, unsignedLong, float, double, decimal,
dateTime, time, date, gYearMonth, gYear, gMonth, gDay,
yearMonthDuration, dayTimeDuration, string, or anyURI)
- "collation=URI"
- Use the range index with the collation specified by
URI.
- "timezone=TZ"
- Return timezone sensitive values (dateTime, time, date,
gYearMonth, gYear, gMonth, and gDay) adjusted to the timezone
specified by TZ.
Example timezones: Z, -08:00, +01:00.
- "limit=N"
- Return no more than N values.
- "skip=N"
- Skip over fragments selected by the
cts:query
to treat the Nth fragment as the first fragment.
Values from skipped fragments are not included.
This option affects the number of fragments selected
by the cts:query to calculate frequencies.
Only applies when a $query parameter is specified.
- "sample=N"
- Return only values from the first N
fragments after skip selected by the
cts:query.
This option does not affect the number of fragments selected
by the cts:query to calculate frequencies.
Only applies when a $query parameter is specified.
- "truncate=N"
- Include only values from the first N fragments
after skip selected by the
cts:query.
This option also affects the number of fragments selected
by the cts:query to calculate frequencies.
Only applies when a $query parameter is specified.
- "score-logtfidf"
- Compute scores using the logtfidf method.
- "score-logtf"
- Compute scores using the logtf method.
- "score-simple"
- Compute scores using the simple method.
- "score-random"
- Compute scores using the random method.
- "checked"
- Word positions should be checked when resolving the query.
- "unchecked"
- Word positions should not be checked when resolving the query.
- "concurrent"
- Perform the work concurrently in another thread. This is a hint
to the query optimizer to help parallelize the lexicon work, allowing
the calling query to continue performing other work while the lexicon
processing occurs. This is especially useful in cases where multiple
lexicon calls occur in the same query (for example, resolving many
facets in a single query).
- "map"
- Return results as a single map:map value instead of as an
xs:anyAtomicType* sequence.
|
$query
(optional):
Only include values in fragments selected by the cts:query,
and compute frequencies from this set of included values.
The values do not need to match the query, but they must occur in
fragments selected by the query.
The fragments are not filtered to ensure they match the query,
but instead selected in the same manner as
"unfiltered" cts:search operations. If a string
is entered, the string is treated as a cts:word-query of the
specified string.
|
$quality-weight
(optional):
A document quality weight to use when computing scores.
The default is 1.0.
|
$forest-ids
(optional):
A sequence of IDs of forests to which the search will be constrained.
An empty sequence means to search all forests in the database.
The default is ().
|
|
Usage Notes:
Only one of "frequency-order" or "item-order" may be specified
in the options parameter. If neither "frequency-order" nor "item-order"
is specified, then the default is "item-order".
Only one of "fragment-frequency" or "item-frequency" may be specified
in the options parameter. If neither "fragment-frequency" nor
"item-frequency" is specified, then the default is "fragment-frequency".
Only one of "ascending" or "descending" may be specified
in the options parameter. If neither "ascending" nor "descending"
is specified, then the default is "ascending" if "item-order" is
specified, and "descending" if "frequency-order" is specified.
Only one of "any", "document", "properties", or "locks"
may be specified in the options parameter.
If none of "any", "document", "properties", or "locks" are specified
and there is a $query parameter, then the default is "document".
If there is no $query parameter then the default is "any".
Only one of the "score-logtfidf", "score-logtf", "score-simple",
or "score-random" options may be specified in the options parameter.
If none of "score-logtfidf", "score-logtf", "score-simple", or
"score-random" are specified, then the default is "score-logtfidf".
Only one of the "checked" or "unchecked" options may be specified
in the options parameter.
If neither "checked" nor "unchecked" is specified,
then the default is "checked".
If "collation=URI" is not specified in the options parameter,
then the default collation is used. If a range index with that collation
does not exist, an error is thrown.
If "sample=N" is not specified in the options parameter,
then all included values may be returned. If a $query parameter
is not present, then "sample=N" has no effect.
If "truncate=N" is not specified in the options parameter,
then values from all fragments selected by the $query parameter
are included. If a $query parameter is not present, then
"truncate=N" has no effect.
If neither "case-sensitive" nor "case-insensitive"
is present, $pattern is used to determine case sensitivity.
If $pattern contains no uppercase, it specifies "case-insensitive".
If $pattern contains uppercase, it specifies "case-sensitive".
If neither "diacritic-sensitive" nor "diacritic-insensitive"
is present, $pattern is used to determine diacritic sensitivity.
If $pattern contains no diacritics, it specifies "diacritic-insensitive".
If $pattern contains diacritics, it specifies "diacritic-sensitive".
|
Example:
cts:element-value-match(xs:QName("animal"),"aardvark*")
=> ("aardvark","aardvarks")
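|
Example:
The options and cts:frequency can be combined with a wildcard pattern. The following is a minimal sketch that reuses the "animal" lexicon from the example above and assumes it is backed by a string range index; the layout of the returned strings is only illustrative.

```xquery
xquery version "1.0-ml";

(: Sketch: up to five values starting with "aard", matched without regard
   to case, most frequent first, each followed by its frequency.
   Assumes a string range index on the "animal" element. :)
for $value in cts:element-value-match(
                xs:QName("animal"), "aard*",
                ("case-insensitive", "frequency-order", "limit=5"))
return
  fn:concat(fn:string($value), " (", fn:string(cts:frequency($value)), ")")
```

Because no "fragment-frequency" or "item-frequency" option is given, the frequency defaults to the fragment count, per the usage notes.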
|
|
|
|
cts:element-value-query(
|
|
$element-name as xs:QName*,
|
|
$text as xs:string*,
|
|
[$options as xs:string*],
|
|
[$weight as xs:double?]
|
| ) as cts:element-value-query |
|
 |
Summary:
Returns a query matching elements by name with text content equal to a
given phrase. cts:element-value-query only matches against
simple elements (that is, elements that contain only text and have no element
children).
|
Parameters:
$element-name
:
One or more element QNames to match.
When multiple QNames are specified,
the query matches if any QName matches.
|
$text
:
One or more element values to match.
When multiple strings are specified,
the query matches if any string matches.
|
$options
(optional):
Options to this query. The default is ().
Options include:
- "case-sensitive"
- A case-sensitive query.
- "case-insensitive"
- A case-insensitive query.
- "diacritic-sensitive"
- A diacritic-sensitive query.
- "diacritic-insensitive"
- A diacritic-insensitive query.
- "punctuation-sensitive"
- A punctuation-sensitive query.
- "punctuation-insensitive"
- A punctuation-insensitive query.
- "whitespace-sensitive"
- A whitespace-sensitive query.
- "whitespace-insensitive"
- A whitespace-insensitive query.
- "stemmed"
- A stemmed query.
- "unstemmed"
- An unstemmed query.
- "wildcarded"
- A wildcarded query.
- "unwildcarded"
- An unwildcarded query.
- "exact"
- An exact match query. Shorthand for "case-sensitive",
"diacritic-sensitive", "punctuation-sensitive",
"whitespace-sensitive", "unstemmed", and "unwildcarded".
- "lang=iso639code"
- Specifies the language of the query. The iso639code
code portion is case-insensitive, and uses the languages
specified by
ISO 639.
The default is specified in the database configuration.
- "min-occurs=number"
- Specifies the minimum number of occurrences required. If
fewer than this number of words occur, the fragment does not match.
The default is 1.
- "max-occurs=number"
- Specifies the maximum number of occurrences allowed. If
more than this number of words occur, the fragment does not match.
The default is unbounded.
- "synonym"
- Specifies that all of the terms in the $text parameter are
considered synonyms for scoring purposes. The result is that
occurrences of more than one of the synonyms are scored as if
there are more occurrences of the same term (as opposed to
having a separate term that contributes to score).
|
$weight
(optional):
A weight for this query.
Higher weights move search results up in the relevance
order. The default is 1.0. The
absolute value of the weight should be less than or equal to 16 (that is,
the weight should be between -16 and 16); weights greater than 16 have
the same effect as a weight of 16.
Weights with an absolute value less than 0.0625 (between -0.0625 and
0.0625) are rounded to 0, which means that they do not contribute to the
score.
|
|
Usage Notes:
If neither "case-sensitive" nor "case-insensitive"
is present, $text is used to determine case sensitivity.
If $text contains no uppercase, it specifies "case-insensitive".
If $text contains uppercase, it specifies "case-sensitive".
If neither "diacritic-sensitive" nor "diacritic-insensitive"
is present, $text is used to determine diacritic sensitivity.
If $text contains no diacritics, it specifies "diacritic-insensitive".
If $text contains diacritics, it specifies "diacritic-sensitive".
If neither "punctuation-sensitive" nor "punctuation-insensitive"
is present, $text is used to determine punctuation sensitivity.
If $text contains no punctuation, it specifies "punctuation-insensitive".
If $text contains punctuation, it specifies "punctuation-sensitive".
If neither "whitespace-sensitive" nor "whitespace-insensitive"
is present, the query is "whitespace-insensitive".
If neither "stemmed" nor "unstemmed"
is present, the database configuration determines stemming.
If the database has "stemmed searches" enabled, it specifies "stemmed".
Otherwise it specifies "unstemmed".
If neither "wildcarded" nor "unwildcarded"
is present, the database configuration and $text determine wildcarding.
If the database has any wildcard indexes enabled ("three character
searches", "two character searches", "one character searches", or
"trailing wildcard searches") and if $text contains either of the
wildcard characters '?' or '*', it specifies "wildcarded".
Otherwise it specifies "unwildcarded".
If neither "stemmed" nor "unstemmed"
is present, then the database configuration determines if a query
is run as "stemmed" (stemmed searches enabled) or "unstemmed"
(word searches enabled and stemmed searches disabled).
If the query is a wildcard
query and is also a phrase query (contains two or more terms),
then any wildcard terms in the query will be "unstemmed".
Negative "min-occurs" or "max-occurs" values will be treated as 0 and
non-integral values will be rounded down. An error will be raised if
the "min-occurs" value is greater than the "max-occurs" value.
Note that the text content for the value in a
cts:element-value-query is treated the same as a phrase in a
cts:word-query, where the phrase is the element value.
Therefore, any wildcard and/or stemming rules are treated like a phrase.
For example, if you have an element value of "hello friend" with wildcarding
enabled for a query, a cts:element-value-query for "he*" will
not match because the wildcard matches do not span word boundaries, but a
cts:element-value-query for "hello *" will match. A search
for "*" will match, because a "*" wildcard by itself is defined to match
the value. Similarly, stemming rules are applied to each term, so a
search for "hello friends" would match when stemming is enabled for the query
because "friends" matches "friend". For an example, see the
fourth example below.
Similarly, because a "*" wildcard by itself is defined to match
the value, the following query will match any element with the
QName my-element, regardless of the wildcard indexes enabled in
the database configuration:
cts:element-value-query(xs:QName("my-element"), "*", "wildcarded")
|
Example:
cts:search(//module,
cts:element-value-query(
xs:QName("function"),
"MarkLogic Corporation"))
=> .. relevance-ordered sequence of 'module' element
ancestors of 'function' elements whose text
content equals 'MarkLogic Corporation'.
|
Example:
cts:search(//module,
cts:element-value-query(
xs:QName("function"),
"MarkLogic Corporation", "case-insensitive"))
=> .. relevance-ordered sequence of 'module' element
ancestors of 'function' elements whose text
content equals 'MarkLogic Corporation', or any other
case-shift like 'MARKLOGIC CorpoRation'.
|
Example:
cts:search(//module,
cts:and-query((
cts:element-value-query(
xs:QName("function"),
"MarkLogic Corporation",
"punctuation-insensitive", 0.5),
cts:element-value-query(
xs:QName("title"),
"Word Query"))))
=> .. relevance-ordered sequence of 'module' elements
which are ancestors of both:
(a) 'function' elements with text content equal to
'MarkLogic Corporation', ignoring embedded
punctuation,
AND
(b) 'title' elements with text content equal to
'Word Query', with the results of the first sub-query
given weight 0.5, and the results of the second
sub-query given the default weight 1.0. As a result,
the title phrase 'Word Query' counts more heavily
towards the relevance score.
|
Example:
let $node := <my-node>hello friend</my-node>
return (
cts:contains($node, cts:element-value-query(xs:QName('my-node'),
"hello friends", "stemmed")),
cts:contains($node, cts:element-value-query(xs:QName('my-node'),
"he*", "wildcarded")),
cts:contains($node, cts:element-value-query(xs:QName('my-node'),
"hello f*", "wildcarded"))
)
=> true
false
true
|
|
|
|
cts:element-value-ranges(
|
|
$element-names as xs:QName*,
|
|
[$bounds as xs:anyAtomicType*],
|
|
[$options as xs:string*],
|
|
[$query as cts:query?],
|
|
[$quality-weight as xs:double?],
|
|
[$forest-ids as xs:unsignedLong*]
|
| ) as element(cts:range)* |
|
 |
Summary:
Returns value ranges from the specified element value lexicon(s).
Value lexicons are implemented using range indexes; consequently this
function requires an element range index for each element specified
in the function. If there is not a range index configured for each
of the specified elements, an exception is thrown.
The values are divided into buckets. The $bounds parameter specifies
the number of buckets and the size of each bucket.
All included values are bucketed, even those less than the lowest bound
or greater than the highest bound. An empty sequence for $bounds specifies
one bucket, a single value specifies two buckets, two values specify
three buckets, and so on.
If you have string values and you pass a $bounds parameter
as in the following call:
cts:element-value-ranges(xs:QName("myElement"), ("f", "m"))
The first bucket contains string values that are less than the
string f, the second bucket contains string values greater than
or equal to f but less than m, and the third bucket
contains string values that are greater than or equal to m.
For each non-empty bucket, a cts:range element is returned.
Each cts:range element has a cts:minimum child
and a cts:maximum child. If a bucket is bounded, its
cts:range element will also have a
cts:lower-bound child if it is bounded from below, and
a cts:upper-bound element if it is bounded from above.
Empty buckets return nothing unless the "empties" option is specified.
|
Parameters:
$element-names
:
One or more element QNames.
|
$bounds
(optional):
A sequence of range bounds.
The types must match the lexicon type.
The values must be in strictly ascending order, otherwise an exception
is thrown.
|
$options
(optional):
Options. The default is ().
Options include:
- "ascending"
- Ranges should be returned in ascending order.
- "descending"
- Ranges should be returned in descending order.
- "empties"
- Include fully-bounded ranges whose frequency is 0. These ranges
will have no minimum or maximum value. Only empty ranges that have
both their upper and lower bounds specified in the $bounds
parameter are returned;
any empty ranges that are less than the first bound or greater than the
last bound are not returned. For example, if you specify 4 bounds
and there are no results for any of the bounds, 3 elements are
returned (not 5 elements).
- "any"
- Values from any fragment should be included.
- "document"
- Values from document fragments should be included.
- "properties"
- Values from properties fragments should be included.
- "locks"
- Values from locks fragments should be included.
- "frequency-order"
- Ranges should be returned ordered by frequency.
- "item-order"
- Ranges should be returned ordered by item.
- "fragment-frequency"
- Frequency should be the number of fragments with
an included value.
This option is used with
cts:frequency.
- "item-frequency"
- Frequency should be the number of occurrences of
an included value.
This option is used with
cts:frequency.
- "type=type"
- Use the lexicon with the type specified by type
(int, unsignedInt, long, unsignedLong, float, double, decimal,
dateTime, time, date, gYearMonth, gYear, gMonth, gDay,
yearMonthDuration, dayTimeDuration, string, or anyURI)
- "collation=URI"
- Use the lexicon with the collation specified by URI.
- "timezone=TZ"
- Return timezone sensitive values (dateTime, time, date,
gYearMonth, gYear, gMonth, and gDay) adjusted to the timezone
specified by TZ.
Example timezones: Z, -08:00, +01:00.
- "limit=N"
- Return no more than N ranges.
- "skip=N"
- Skip over fragments selected by the
cts:query
to treat the Nth fragment as the first fragment.
Values from skipped fragments are not included.
This option affects the number of fragments selected
by the cts:query to calculate frequencies.
Only applies when a $query parameter is specified.
- "sample=N"
- Return only ranges for buckets with at least one value from
the first N fragments after skip selected by the
cts:query.
This option does not affect the number of fragments selected
by the cts:query to calculate frequencies.
Only applies when a $query parameter is specified.
- "truncate=N"
- Include only values from the first N fragments
after skip selected by the
cts:query.
This option also affects the number of fragments selected
by the cts:query to calculate frequencies.
Only applies when a $query parameter is specified.
- "score-logtfidf"
- Compute scores using the logtfidf method.
- "score-logtf"
- Compute scores using the logtf method.
- "score-simple"
- Compute scores using the simple method.
- "score-random"
- Compute scores using the random method.
- "checked"
- Word positions should be checked when resolving the query.
- "unchecked"
- Word positions should not be checked when resolving the query.
- "concurrent"
- Perform the work concurrently in another thread. This is a hint
to the query optimizer to help parallelize the lexicon work, allowing
the calling query to continue performing other work while the lexicon
processing occurs. This is especially useful in cases where multiple
lexicon calls occur in the same query (for example, resolving many
facets in a single query).
|
$query
(optional):
Only include values in fragments selected by the cts:query,
and compute frequencies from this set of included values.
The values do not need to match the query, but they must occur in
fragments selected by the query.
The fragments are not filtered to ensure they match the query,
but instead selected in the same manner as
"unfiltered" cts:search operations. If a string
is entered, the string is treated as a cts:word-query of the
specified string.
|
$quality-weight
(optional):
A document quality weight to use when computing scores.
The default is 1.0.
|
$forest-ids
(optional):
A sequence of IDs of forests to which the search will be constrained.
An empty sequence means to search all forests in the database.
The default is ().
|
|
Usage Notes:
Only one of "frequency-order" or "item-order" may be specified
in the options parameter. If neither "frequency-order" nor "item-order"
is specified, then the default is "item-order".
Only one of "fragment-frequency" or "item-frequency" may be specified
in the options parameter. If neither "fragment-frequency" nor
"item-frequency" is specified, then the default is "fragment-frequency".
Only one of "ascending" or "descending" may be specified
in the options parameter. If neither "ascending" nor "descending"
is specified, then the default is "ascending" if "item-order" is
specified, and "descending" if "frequency-order" is specified.
Only one of "any", "document", "properties", or "locks"
may be specified in the options parameter.
If none of "any", "document", "properties", or "locks" are specified
and there is a $query parameter, then the default is "document".
If there is no $query parameter then the default is "any".
Only one of the "score-logtfidf", "score-logtf", "score-simple",
or "score-random" options may be specified in the options parameter.
If none of "score-logtfidf", "score-logtf", "score-simple", or
"score-random" are specified, then the default is "score-logtfidf".
Only one of the "checked" or "unchecked" options may be specified
in the options parameter.
If neither "checked" nor "unchecked" is specified,
then the default is "checked".
If "collation=URI" is not specified in the options parameter,
then the default collation is used. If a lexicon with that collation
does not exist, an error is thrown.
If "sample=N" is not specified in the options parameter,
then ranges with all included values may be returned. If a
$query parameter is not present, then "sample=N"
has no effect.
If "truncate=N" is not specified in the options parameter,
then values from all fragments selected by the $query parameter
are included. If a $query parameter is not present, then
"truncate=N" has no effect.
|
Example:
(: Run the following to load data for this example.
Make sure you have an int element range index on
number. :)
for $x in (1 to 10)
return
xdmp:document-insert(fn:concat("/doc", fn:string($x), ".xml"),
<root><number>{$x}</number></root>) ;
(: The following is based on the above setup :)
cts:element-value-ranges(xs:QName("number"),
(5, 10, 15, 20), "empties")
=>
<cts:range xmlns:cts="http://marklogic.com/cts"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:xs="http://www.w3.org/2001/XMLSchema">
<cts:minimum xsi:type="xs:int">1</cts:minimum>
<cts:maximum xsi:type="xs:int">4</cts:maximum>
<cts:upper-bound xsi:type="xs:int">5</cts:upper-bound>
</cts:range>
<cts:range xmlns:cts="http://marklogic.com/cts"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:xs="http://www.w3.org/2001/XMLSchema">
<cts:minimum xsi:type="xs:int">5</cts:minimum>
<cts:maximum xsi:type="xs:int">9</cts:maximum>
<cts:lower-bound xsi:type="xs:int">5</cts:lower-bound>
<cts:upper-bound xsi:type="xs:int">10</cts:upper-bound>
</cts:range>
<cts:range xmlns:cts="http://marklogic.com/cts"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:xs="http://www.w3.org/2001/XMLSchema">
<cts:minimum xsi:type="xs:int">10</cts:minimum>
<cts:maximum xsi:type="xs:int">10</cts:maximum>
<cts:lower-bound xsi:type="xs:int">10</cts:lower-bound>
<cts:upper-bound xsi:type="xs:int">15</cts:upper-bound>
</cts:range>
<cts:range xmlns:cts="http://marklogic.com/cts"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:xs="http://www.w3.org/2001/XMLSchema">
<cts:lower-bound xsi:type="xs:int">15</cts:lower-bound>
<cts:upper-bound xsi:type="xs:int">20</cts:upper-bound>
</cts:range>
<cts:range xmlns:cts="http://marklogic.com/cts"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:xs="http://www.w3.org/2001/XMLSchema">
<cts:lower-bound xsi:type="xs:int">20</cts:lower-bound>
</cts:range>
|
Example:
(: This query assumes the database is fragmented on SPEECH;
it finds four ranges of SPEAKERs. :)
cts:element-value-ranges(xs:QName("SPEAKER"),("F","N","S"));
=>
<cts:range xmlns:cts="http://marklogic.com/cts"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:xs="http://www.w3.org/2001/XMLSchema">
<cts:minimum xsi:type="xs:string">All</cts:minimum>
<cts:maximum xsi:type="xs:string">Danes</cts:maximum>
<cts:upper-bound xsi:type="xs:string">F</cts:upper-bound>
</cts:range>
<cts:range xmlns:cts="http://marklogic.com/cts"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:xs="http://www.w3.org/2001/XMLSchema">
<cts:minimum xsi:type="xs:string">First Ambassador</cts:minimum>
<cts:maximum xsi:type="xs:string">Messenger</cts:maximum>
<cts:lower-bound xsi:type="xs:string">F</cts:lower-bound>
<cts:upper-bound xsi:type="xs:string">N</cts:upper-bound>
</cts:range>
<cts:range xmlns:cts="http://marklogic.com/cts"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:xs="http://www.w3.org/2001/XMLSchema">
<cts:minimum xsi:type="xs:string">OPHELIA</cts:minimum>
<cts:maximum xsi:type="xs:string">ROSENCRANTZ</cts:maximum>
<cts:lower-bound xsi:type="xs:string">N</cts:lower-bound>
<cts:upper-bound xsi:type="xs:string">S</cts:upper-bound>
</cts:range>
<cts:range xmlns:cts="http://marklogic.com/cts"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:xs="http://www.w3.org/2001/XMLSchema">
<cts:minimum xsi:type="xs:string">Second Clown</cts:minimum>
<cts:maximum xsi:type="xs:string">VOLTIMAND</cts:maximum>
<cts:lower-bound xsi:type="xs:string">S</cts:lower-bound>
</cts:range>
|
Example:
(: this is the same query as above, but it is getting the counts
of the number of SPEAKERs for each bucket :)
for $bucket in cts:element-value-ranges(xs:QName("SPEAKER"),("F","N","S"))
return cts:frequency($bucket);
=>
9602
11329
5167
4983
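The bucket contents and counts can also be read together from each returned cts:range element. The following is a minimal sketch, assuming the int element range index on number and the documents loaded in the first example for this function; the string it builds for each bucket is only an illustration.

```xquery
xquery version "1.0-ml";
(: Sketch only: assumes an int element range index on "number",
   as in the setup for the first example. :)
for $range in cts:element-value-ranges(xs:QName("number"), (5, 10))
return
  fn:concat(
    (: smallest and largest values actually found in this bucket :)
    fn:string($range/cts:minimum), " to ",
    fn:string($range/cts:maximum), ": ",
    (: frequency of the bucket, as in the cts:frequency example :)
    fn:string(cts:frequency($range)))
```

Because "empties" is not specified, only non-empty buckets are returned, so each cts:range here has both a cts:minimum and a cts:maximum child.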
|
|
|
|
cts:element-values(
|
|
$element-names as xs:QName*,
|
|
[$start as xs:anyAtomicType?],
|
|
[$options as xs:string*],
|
|
[$query as cts:query?],
|
|
[$quality-weight as xs:double?],
|
|
[$forest-ids as xs:unsignedLong*]
|
| ) as xs:anyAtomicType* |
|
 |
Summary:
Returns values from the specified element value lexicon(s).
Value lexicons are implemented using range indexes; consequently this
function requires an element range index for each element specified
in the function. If there is not a range index configured for each
of the specified elements, an exception is thrown.
|
Parameters:
$element-names
:
One or more element QNames.
|
$start
(optional):
A starting value. The parameter type must match the lexicon type.
If the parameter value is not in the lexicon, then the values are
returned beginning with the next value.
|
$options
(optional):
Options. The default is ().
Options include:
- "ascending"
- Values should be returned in ascending order.
- "descending"
- Values should be returned in descending order.
- "any"
- Values from any fragment should be included.
- "document"
- Values from document fragments should be included.
- "properties"
- Values from properties fragments should be included.
- "locks"
- Values from locks fragments should be included.
- "frequency-order"
- Values should be returned ordered by frequency.
- "item-order"
- Values should be returned ordered by item.
- "fragment-frequency"
- Frequency should be the number of fragments with
an included value.
This option is used with
cts:frequency.
- "item-frequency"
- Frequency should be the number of occurrences of
an included value.
This option is used with
cts:frequency.
- "type=type"
- Use the lexicon with the type specified by type
(int, unsignedInt, long, unsignedLong, float, double, decimal,
dateTime, time, date, gYearMonth, gYear, gMonth, gDay,
yearMonthDuration, dayTimeDuration, string, or anyURI)
- "collation=URI"
- Use the lexicon with the collation specified by URI.
- "timezone=TZ"
- Return timezone sensitive values (dateTime, time, date,
gYearMonth, gYear, gMonth, and gDay) adjusted to the timezone
specified by TZ.
Example timezones: Z, -08:00, +01:00.
- "limit=N"
- Return no more than N values.
- "skip=N"
- Skip over fragments selected by the
cts:query
to treat the Nth fragment as the first fragment.
Values from skipped fragments are not included.
This option affects the number of fragments selected
by the cts:query to calculate frequencies.
Only applies when a $query parameter is specified.
- "sample=N"
- Return only values from the first N
fragments after skip selected by the
cts:query.
This option does not affect the number of fragments selected
by the cts:query to calculate frequencies.
Only applies when a $query parameter is specified.
- "truncate=N"
- Include only values from the first N fragments
after skip selected by the
cts:query.
This option also affects the number of fragments selected
by the cts:query to calculate frequencies.
Only applies when a $query parameter is specified.
- "score-logtfidf"
- Compute scores using the logtfidf method.
- "score-logtf"
- Compute scores using the logtf method.
- "score-simple"
- Compute scores using the simple method.
- "score-random"
- Compute scores using the random method.
- "checked"
- Word positions should be checked when resolving the query.
- "unchecked"
- Word positions should not be checked when resolving the query.
- "concurrent"
- Perform the work concurrently in another thread. This is a hint
to the query optimizer to help parallelize the lexicon work, allowing
the calling query to continue performing other work while the lexicon
processing occurs. This is especially useful in cases where multiple
lexicon calls occur in the same query (for example, resolving many
facets in a single query).
- "map"
- Return results as a single map:map value instead of as an
xs:anyAtomicType* sequence.
|
$query
(optional):
Only include values in fragments selected by the cts:query,
and compute frequencies from this set of included values.
The values do not need to match the query, but they must occur in
fragments selected by the query.
The fragments are not filtered to ensure they match the query,
but instead selected in the same manner as
"unfiltered" cts:search operations. If a string
is entered, the string is treated as a cts:word-query of the
specified string.
|
$quality-weight
(optional):
A document quality weight to use when computing scores.
The default is 1.0.
|
$forest-ids
(optional):
A sequence of IDs of forests to which the search will be constrained.
An empty sequence means to search all forests in the database.
The default is ().
|
|
Usage Notes:
Only one of "frequency-order" or "item-order" may be specified
in the options parameter. If neither "frequency-order" nor "item-order"
is specified, then the default is "item-order".
Only one of "fragment-frequency" or "item-frequency" may be specified
in the options parameter. If neither "fragment-frequency" nor
"item-frequency" is specified, then the default is "fragment-frequency".
Only one of "ascending" or "descending" may be specified
in the options parameter. If neither "ascending" nor "descending"
is specified, then the default is "ascending" if "item-order" is
specified, and "descending" if "frequency-order" is specified.
Only one of "any", "document", "properties", or "locks"
may be specified in the options parameter.
If none of "any", "document", "properties", or "locks" are specified
and there is a $query parameter, then the default is "document".
If there is no $query parameter then the default is "any".
Only one of the "score-logtfidf", "score-logtf", "score-simple",
or "score-random" options may be specified in the options parameter.
If none of "score-logtfidf", "score-logtf", "score-simple", or
"score-random" are specified, then the default is "score-logtfidf".
Only one of the "checked" or "unchecked" options may be specified
in the options parameter.
If neither "checked" nor "unchecked" is specified,
then the default is "checked".
If "collation=URI" is not specified in the options parameter,
then the default collation is used. If a lexicon with that collation
does not exist, an error is thrown.
If "sample=N" is not specified in the options parameter,
then all included values may be returned. If a $query parameter
is not present, then "sample=N" has no effect.
If "truncate=N" is not specified in the options parameter,
then values from all fragments selected by the $query parameter
are included. If a $query parameter is not present, then
"truncate=N" has no effect.
|
Example:
cts:element-values(xs:QName("animal"),"aardvark")
=> ("aardvark","aardvarks","aardwolf",...)
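The "frequency-order" and "limit=N" options described above can be combined with cts:frequency to list the most common values. This is a minimal sketch, assuming the same string element range index on animal as the example above; passing the empty sequence for $start is a choice made here so that no starting value is imposed.

```xquery
xquery version "1.0-ml";
(: Sketch only: assumes a string element range index on "animal",
   as in the example above. :)
for $value in cts:element-values(
                xs:QName("animal"),
                (),                                (: no $start value :)
                ("frequency-order", "limit=10"))   (: most frequent first :)
return fn:concat($value, ": ", fn:string(cts:frequency($value)))
```

With "frequency-order" and neither "ascending" nor "descending" specified, the usage notes give "descending" as the default, so the most frequent values come first.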
|
|
|
|
cts:element-word-match(
|
|
$element-names as xs:QName*,
|
|
$pattern as xs:string,
|
|
[$options as xs:string*],
|
|
[$query as cts:query?],
|
|
[$quality-weight as xs:double?],
|
|
[$forest-ids as xs:unsignedLong*]
|
| ) as xs:string* |
|
 |
Summary:
Returns words from the specified element word lexicon(s) that match
a wildcard pattern. This function requires an element word lexicon
configured for each of the specified elements in the function. If there
is not an element word lexicon configured for any of the specified
elements, an exception is thrown.
|
Parameters:
$element-names
:
One or more element QNames.
|
$pattern
:
Wildcard pattern to match.
|
$options
(optional):
Options. The default is ().
Options include:
- "case-sensitive"
- A case-sensitive match.
- "case-insensitive"
- A case-insensitive match.
- "diacritic-sensitive"
- A diacritic-sensitive match.
- "diacritic-insensitive"
- A diacritic-insensitive match.
- "ascending"
- Words should be returned in ascending order.
- "descending"
- Words should be returned in descending order.
- "any"
- Words from any fragment should be included.
- "document"
- Words from document fragments should be included.
- "properties"
- Words from properties fragments should be included.
- "locks"
- Words from locks fragments should be included.
- "collation=URI"
- Use the lexicon with the collation specified by URI.
- "limit=N"
- Return no more than N words.
- "skip=N"
- Skip over fragments selected by the
cts:query
to treat the Nth fragment as the first fragment.
Words from skipped fragments are not included.
Only applies when a $query parameter is specified.
- "sample=N"
- Return only words from the first N fragments after skip
selected by the
cts:query.
Only applies when a $query parameter is specified.
- "truncate=N"
- Include only words from the first N fragments after skip
selected by the
cts:query.
Only applies when a $query parameter is specified.
- "score-logtfidf"
- Compute scores using the logtfidf method.
- "score-logtf"
- Compute scores using the logtf method.
- "score-simple"
- Compute scores using the simple method.
- "score-random"
- Compute scores using the random method.
- "checked"
- Word positions should be checked when resolving the query.
- "unchecked"
- Word positions should not be checked when resolving the query.
- "concurrent"
- Perform the work concurrently in another thread. This is a hint
to the query optimizer to help parallelize the lexicon work, allowing
the calling query to continue performing other work while the lexicon
processing occurs. This is especially useful in cases where multiple
lexicon calls occur in the same query (for example, resolving many
facets in a single query).
- "map"
- Return results as a single map:map value instead of as an
xs:string* sequence.
|
$query
(optional):
Only include words in fragments selected by the cts:query.
The words do not need to match the query, but the words must occur
in fragments selected by the query.
The fragments are not filtered to ensure they match the query,
but instead selected in the same manner as
"unfiltered" cts:search operations. If a string
is entered, the string is treated as a cts:word-query of the
specified string.
|
$quality-weight
(optional):
A document quality weight to use when computing scores.
The default is 1.0.
|
$forest-ids
(optional):
A sequence of IDs of forests to which the search will be constrained.
An empty sequence means to search all forests in the database.
The default is ().
|
|
Usage Notes:
Only one of "ascending" or "descending" may be specified
in the options parameter. If neither "ascending" nor "descending"
is specified, then the default is "ascending".
Only one of "any", "document", "properties", or "locks"
may be specified in the options parameter.
If none of "any", "document", "properties", or "locks" are specified
and there is a $query parameter, then the default is "document".
If there is no $query parameter then the default is "any".
Only one of the "score-logtfidf", "score-logtf", "score-simple",
or "score-random" options may be specified in the options parameter.
If none of "score-logtfidf", "score-logtf", "score-simple", or
"score-random" are specified, then the default is "score-logtfidf".
Only one of the "checked" or "unchecked" options may be specified
in the options parameter.
If neither "checked" nor "unchecked" is specified,
then the default is "checked".
If "collation=URI" is not specified in the options parameter,
then the default collation is used. If a lexicon with that collation
does not exist, an error is thrown.
If "sample=N" is not specified in the options parameter,
then all included words may be returned. If a $query parameter
is not present, then "sample=N" has no effect.
If "truncate=N" is not specified in the options parameter,
then words from all fragments selected by the $query parameter
are included. If a $query parameter is not present, then
"truncate=N" has no effect.
If neither "case-sensitive" nor "case-insensitive"
is present, $pattern is used to determine case sensitivity.
If $pattern contains no uppercase, it specifies "case-insensitive".
If $pattern contains uppercase, it specifies "case-sensitive".
If neither "diacritic-sensitive" nor "diacritic-insensitive"
is present, $pattern is used to determine diacritic sensitivity.
If $pattern contains no diacritics, it specifies "diacritic-insensitive".
If $pattern contains diacritics, it specifies "diacritic-sensitive".
Only words that can be matched with cts:element-word-query are included.
That is, the function returns only words present in immediate text node
children of the specified element, as well as in any text node children of
child elements defined in the Admin Interface as element-word-query-throughs
or phrase-throughs.
|
Example:
cts:element-word-match(xs:QName("animal"),"aardvark*")
=> ("aardvark","aardvarks")
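The options and the $query parameter can narrow the match further. The following is a minimal sketch, assuming the same element word lexicon on animal as the example above; the cts:word-query term "africa" is a hypothetical constraint chosen only for illustration.

```xquery
xquery version "1.0-ml";
(: Sketch only: assumes an element word lexicon on "animal", as in the
   example above. The word "africa" is an illustrative constraint. :)
cts:element-word-match(
  xs:QName("animal"),
  "aard*",
  (: explicit case handling, reverse order, at most five words :)
  ("case-insensitive", "descending", "limit=5"),
  (: only words from fragments selected (unfiltered) by this query :)
  cts:word-query("africa"))
```

Because a $query parameter is present and none of "any", "document", "properties", or "locks" is given, the usage notes indicate that words are taken from document fragments.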
|
|
|
|
cts:element-word-query(
|
|
$element-name as xs:QName*,
|
|
$text as xs:string*,
|
|
[$options as xs:string*],
|
|
[$weight as xs:double?]
|
| ) as cts:element-word-query |
|
 |
Summary:
Returns a query matching elements by name with text content containing
a given phrase. Searches only through immediate text node children of
the specified element as well as any text node children of child elements
defined in the Admin Interface as element-word-query-throughs
or phrase-throughs; does not search through any other children of
the specified element.
|
Parameters:
$element-name
:
One or more element QNames to match.
When multiple QNames are specified,
the query matches if any QName matches.
|
$text
:
Some words or phrases to match.
When multiple strings are specified,
the query matches if any string matches.
|
$options
(optional):
Options to this query. The default is ().
Options include:
- "case-sensitive"
- A case-sensitive query.
- "case-insensitive"
- A case-insensitive query.
- "diacritic-sensitive"
- A diacritic-sensitive query.
- "diacritic-insensitive"
- A diacritic-insensitive query.
- "punctuation-sensitive"
- A punctuation-sensitive query.
- "punctuation-insensitive"
- A punctuation-insensitive query.
- "whitespace-sensitive"
- A whitespace-sensitive query.
- "whitespace-insensitive"
- A whitespace-insensitive query.
- "stemmed"
- A stemmed query.
- "unstemmed"
- An unstemmed query.
- "wildcarded"
- A wildcarded query.
- "unwildcarded"
- An unwildcarded query.
- "exact"
- An exact match query. Shorthand for "case-sensitive",
"diacritic-sensitive", "punctuation-sensitive",
"whitespace-sensitive", "unstemmed", and "unwildcarded".
- "lang=iso639code"
- Specifies the language of the query. The iso639code
code portion is case-insensitive, and uses the languages
specified by
ISO 639.
The default is specified in the database configuration.
- "distance-weight=number"
- A weight applied based on the minimum distance between matches
of this query. Higher weights add to the importance of
proximity (as opposed to term matches) when the relevance order is
calculated.
The default value is 0.0 (no impact of proximity). The
absolute value of the weight should be less than or equal to 16
(that is, between -16 and 16); weights greater than 16 will have the
same effect as a weight of 16.
This parameter has no effect if the
word positions
index is not enabled. This parameter has no effect on searches that
use score-simple or score-random (because those scoring algorithms
do not consider term frequency, proximity is irrelevant).
- "min-occurs=number"
- Specifies the minimum number of occurrences required. If
fewer than this number of words occur, the fragment does not match.
The default is 1.
- "max-occurs=number"
- Specifies the maximum number of occurrences allowed. If
more than this number of words occur, the fragment does not match.
The default is unbounded.
- "synonym"
- Specifies that all of the terms in the $text parameter are
considered synonyms for scoring purposes. The result is that
occurrences of more than one of the synonyms are scored as if
there are more occurrences of the same term (as opposed to
having a separate term that contributes to score).
|
$weight
(optional):
A weight for this query.
Higher weights move search results up in the relevance
order. The default is 1.0. The
absolute value of the weight should be less than or equal to 16 (that is,
between -16 and 16); weights greater than 16 will have the same effect as a
weight of 16.
Weights whose absolute value is less than 0.0625 (between -0.0625 and
0.0625) are rounded to 0, which means that they do not contribute to the
score.
|
|
Usage Notes:
If neither "case-sensitive" nor "case-insensitive"
is present, $text is used to determine case sensitivity.
If $text contains no uppercase, it specifies "case-insensitive".
If $text contains uppercase, it specifies "case-sensitive".
If neither "diacritic-sensitive" nor "diacritic-insensitive"
is present, $text is used to determine diacritic sensitivity.
If $text contains no diacritics, it specifies "diacritic-insensitive".
If $text contains diacritics, it specifies "diacritic-sensitive".
If neither "punctuation-sensitive" nor "punctuation-insensitive"
is present, $text is used to determine punctuation sensitivity.
If $text contains no punctuation, it specifies "punctuation-insensitive".
If $text contains punctuation, it specifies "punctuation-sensitive".
If neither "whitespace-sensitive" nor "whitespace-insensitive"
is present, the query is "whitespace-insensitive".
If neither "stemmed" nor "unstemmed"
is present, the database configuration determines stemming.
If the database has "stemmed searches" enabled, it specifies "stemmed".
Otherwise it specifies "unstemmed".
If neither "wildcarded" nor "unwildcarded"
is present, the database configuration and $text determine wildcarding.
If the database has any wildcard indexes enabled ("three character
searches", "two character searches", "one character searches", or
"trailing wildcard searches") and if $text contains either of the
wildcard characters '?' or '*', it specifies "wildcarded".
Otherwise it specifies "unwildcarded".
If neither "stemmed" nor "unstemmed"
is present, then the database configuration determines if a query
is run as "stemmed" (stemmed searches enabled) or "unstemmed"
(word searches enabled and stemmed searches disabled).
If the query is a wildcard
query and is also a phrase query (contains two or more terms),
then any wildcard terms in the query will be "unstemmed".
Negative "min-occurs" or "max-occurs" values will be treated as 0 and
non-integral values will be rounded down. An error will be raised if
the "min-occurs" value is greater than the "max-occurs" value.
Relevance adjustment for the "distance-weight" option depends on
the closest proximity of any two matches of the query. For example,
cts:element-word-query(xs:QName("p"),("dog","cat"),("distance-weight=10"))
will adjust relevance based on the distance between the closest pair of
matches of either "dog" or "cat" within an element named "p"
(the pair may consist only of matches of
"dog", only of matches of "cat", or a match of "dog" and a match of "cat").
|
Example:
cts:search(//module,
cts:element-word-query(
xs:QName("function"),
"MarkLogic Corporation"))
=> .. relevance-ordered sequence of 'module' elements
ancestors (or self) of elements with QName 'function'
and text content containing the phrase 'MarkLogic
Corporation'.
|
Example:
cts:search(//module,
cts:element-word-query(
xs:QName("function"),
"MarkLogic Corporation", "case-insensitive"))
=> .. relevance-ordered sequence of 'module' elements
ancestors (or self) of elements with QName 'function'
and text content containing the phrase 'MarkLogic
Corporation',
or any other case-shift, like 'MARKLOGIC corporation'.
|
Example:
cts:search(//module,
cts:and-query((
cts:element-word-query(
xs:QName("function"),
"MarkLogic Corporation",
("case-insensitive", "punctuation-insensitive"), 0.5),
cts:element-word-query(
xs:QName("title"),
"faster"))))
=> .. relevance-ordered sequence of 'module' element
ancestors (or self) of both:
(a) 'function' elements with text content containing
the phrase 'MarkLogic Corporation', ignoring case and
embedded punctuation,
AND
(b) 'title' elements containing the word 'faster',
with the results of the first sub-query given
weight 0.5, and the results of the second sub-query
given the default weight 1.0. As a result, the title
term 'faster' counts more towards the relevance
score.
|
|
|
|
cts:element-words(
|
|
$element-names as xs:QName*,
|
|
[$start as xs:string?],
|
|
[$options as xs:string*],
|
|
[$query as cts:query?],
|
|
[$quality-weight as xs:double?],
|
|
[$forest-ids as xs:unsignedLong*]
|
| ) as xs:string* |
|
 |
Summary:
Returns words from the specified element word lexicon. This function
requires an element word lexicon for each of the elements specified in the
function. If any of the specified elements does not have an element word
lexicon configured, an exception is thrown. The words are
returned in collation order.
|
Parameters:
$element-names
:
One or more element QNames.
|
$start
(optional):
A starting word. Returns only this word and any following words
from the lexicon. If the parameter is not in the lexicon, then it
returns the words beginning with the next word.
|
$options
(optional):
Options. The default is ().
Options include:
- "ascending"
- Words should be returned in ascending order.
- "descending"
- Words should be returned in descending order.
- "any"
- Words from any fragment should be included.
- "document"
- Words from document fragments should be included.
- "properties"
- Words from properties fragments should be included.
- "locks"
- Words from locks fragments should be included.
- "collation=URI"
- Use the lexicon with the collation specified by URI.
- "limit=N"
- Return no more than N words.
- "skip=N"
- Skip over fragments selected by the
cts:query
to treat the Nth fragment as the first fragment.
Words from skipped fragments are not included.
Only applies when a $query parameter is specified.
- "sample=N"
- Return only words from the first N fragments after skip
selected by the
cts:query.
Only applies when a $query parameter is specified.
- "truncate=N"
- Include only words from the first N fragments after skip
selected by the
cts:query.
Only applies when a $query parameter is specified.
- "score-logtfidf"
- Compute scores using the logtfidf method.
- "score-logtf"
- Compute scores using the logtf method.
- "score-simple"
- Compute scores using the simple method.
- "score-random"
- Compute scores using the random method.
- "checked"
- Word positions should be checked when resolving the query.
- "unchecked"
- Word positions should not be checked when resolving the query.
- "concurrent"
- Perform the work concurrently in another thread. This is a hint
to the query optimizer to help parallelize the lexicon work, allowing
the calling query to continue performing other work while the lexicon
processing occurs. This is especially useful in cases where multiple
lexicon calls occur in the same query (for example, resolving many
facets in a single query).
- "map"
- Return results as a single map:map value instead of as an
xs:string* sequence.
|
$query
(optional):
Only include words in fragments selected by the cts:query.
The words do not need to match the query, but the words must occur
in fragments selected by the query.
The fragments are not filtered to ensure they match the query,
but instead selected in the same manner as
"unfiltered" cts:search operations. If a string
is entered, the string is treated as a cts:word-query of the
specified string.
|
$quality-weight
(optional):
A document quality weight to use when computing scores.
The default is 1.0.
|
$forest-ids
(optional):
A sequence of IDs of forests to which the search will be constrained.
An empty sequence means to search all forests in the database.
The default is ().
|
|
Usage Notes:
Only one of "ascending" or "descending" may be specified
in the options parameter. If neither "ascending" nor "descending"
is specified, then the default is "ascending".
Only one of "any", "document", "properties", or "locks"
may be specified in the options parameter.
If none of "any", "document", "properties", or "locks" are specified
and there is a $query parameter, then the default is "document".
If there is no $query parameter then the default is "any".
Only one of the "score-logtfidf", "score-logtf", "score-simple",
or "score-random" options may be specified in the options parameter.
If none of "score-logtfidf", "score-logtf", "score-simple", or
"score-random" are specified, then the default is "score-logtfidf".
Only one of the "checked" or "unchecked" options may be specified
in the options parameter.
If neither "checked" nor "unchecked" are specified,
then the default is "checked".
If "collation=URI" is not specified in the options parameter,
then the default collation is used. If a lexicon with that collation
does not exist, an error is thrown.
If "sample=N" is not specified in the options parameter,
then all included words may be returned. If a $query parameter
is not present, then "sample=N" has no effect.
If "truncate=N" is not specified in the options parameter,
then words from all fragments selected by the $query parameter
are included. If a $query parameter is not present, then
"truncate=N" has no effect.
Only words that can be matched with element-word-query are included.
That is, only words present in immediate text node children of the
specified element, as well as any text node children of child elements
defined in the Admin Interface as element-word-query-throughs or
phrase-throughs, are included.
|
Example:
cts:element-words(xs:QName("animal"),"aardvark")
=> ("aardvark","aardvarks","aardwolf",...)
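The options and $query parameters described above can be combined to restrict which fragments contribute words. The following is a minimal sketch, assuming an element word lexicon is configured on an element named "animal" (as in the example above) and that some documents contain the word "zoo"; the search term is purely illustrative:

```xquery
xquery version "1.0-ml";
(: Sketch only: assumes an element word lexicon on "animal". :)
cts:element-words(
  xs:QName("animal"),
  "aardvark",                  (: start at this word, or the next one present :)
  ("document", "limit=10"),    (: document fragments only, at most 10 words :)
  cts:word-query("zoo"))       (: only fragments selected by this query :)
```

Because the fragments are selected in the same manner as an unfiltered search, the words returned may come from fragments that a filtered search would have rejected.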
|
|
|
|
cts:entity-highlight(
|
|
$node as node(),
|
|
$expr as item()*
|
| ) as node() |
|
 |
Summary:
Returns a copy of the node, replacing any entities found
with the specified expression. You can use this function
to easily highlight any entities in an XML document in an arbitrary manner.
If you do not need fine-grained control of the XML markup returned,
you can use the entity:enrich XQuery module function instead.
A valid entity enrichment license key is required
to use cts:entity-highlight;
without a valid license key, it throws an exception. If you
have a valid license for entity enrichment, you can entity enrich text
in English and in any other languages for which you have a valid license
key. For languages for which you do not have a valid license key,
cts:entity-highlight finds no entities for text in that
language.
|
Parameters:
$node
:
A node to run entity highlight on. The node must be either a document node
or an element node; it cannot be a text node.
|
$expr
:
An expression with which to replace each match. You can use the
variables $cts:text, $cts:node,
$cts:start, $cts:action,
$cts:entity-type, and $cts:normalized-text
(described below) in the expression.
|
|
Usage Notes:
In addition to a valid Entity Enrichment license key, this function
requires that you have installed the Entity Enrichment package. For
details on installing the Entity Enrichment package, see the
Installation Guide and the "Marking Up Documents With
Entity Enrichment" chapter of the Search Developer's Guide.
There are six built-in variables to represent an entity match.
These variables can be used inline in the expression parameter.
$cts:text as xs:string
The matched text.
$cts:node as text()
The node containing the matched text.
$cts:start as xs:integer
The string-length position of the first character of
$cts:text in $cts:node. Therefore, the following
always returns true:
fn:substring($cts:node, $cts:start,
fn:string-length($cts:text)) eq $cts:text
$cts:action as xs:string
Use xdmp:set on this to specify what should happen
next
- "continue"
- (default) Walk the next match.
If there are no more matches, return all evaluation results.
- "skip"
- Skip walking any more matches and return all evaluation results.
- "break"
- Stop walking matches and return all evaluation results.
$cts:entity-type as xs:string
The type of the matching entity.
$cts:normalized-text as xs:string
The normalized entity text (only applicable for some
languages).
The following are the entity types returned from the
$cts:entity-type built-in variable (in alphabetical order):
FACILITY
- A place used as a facility.
GPE
- Geo-political entity. Differs from location because it has a
person-made aspect to it (for example, California is a GPE because
its boundaries were defined by a government).
IDENTIFIER:CREDIT_CARD_NUM
- A number identifying a credit card number.
IDENTIFIER:DISTANCE
- A number identifying a distance.
IDENTIFIER:EMAIL
- Identifies an email address.
IDENTIFIER:LATITUDE_LONGITUDE
- Latitude and longitude coordinates.
IDENTIFIER:MONEY
- Identifies currency (dollars, euros, and so on).
IDENTIFIER:NUMBER
- Identifies a number.
IDENTIFIER:PERSONAL_ID_NUM
- A number identifying a social security number or other ID
number.
IDENTIFIER:PHONE_NUMBER
- A number identifying a telephone number.
IDENTIFIER:URL
- Identifies a web site address (URL).
IDENTIFIER:UTM
- Identifies Universal Transverse Mercator coordinates.
LOCATION
- A geographic location (Mount Everest, for example).
NATIONALITY
- The nationality of someone or something (for example, American).
ORGANIZATION
- An organization.
PERSON
- A person.
RELIGION
- A religion.
TEMPORAL:DATE
- Date-related.
TEMPORAL:TIME
- Time-related.
TITLE
- Appellation or honorific associated with a person.
URL
- A URL on the world wide web.
UTM
- A point in the Universal Transverse Mercator (UTM)
coordinate system.
|
Example:
let $myxml := <node>George Washington never visited Norway.
If he had a Social Security number,
it might be 000-00-0001.</node>
return
cts:entity-highlight($myxml,
element { fn:replace($cts:entity-type, ":", "-") } { $cts:text })
=>
<node>
<PERSON>George Washington</PERSON> never visited <GPE>Norway</GPE>.
If he had a Social Security number, it might be
<IDENTIFIER-PERSONAL_ID_NUM>000-00-0001</IDENTIFIER-PERSONAL_ID_NUM>.
</node>
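The built-in variables can also be used to treat entity types differently and to stop processing early. The following is a minimal sketch, assuming a valid Entity Enrichment license and installation as described in the usage notes; the 'b' wrapper element is an arbitrary choice for illustration:

```xquery
xquery version "1.0-ml";
(: Sketch only: requires a licensed Entity Enrichment installation. :)
let $myxml := <node>George Washington never visited Norway.</node>
return
  cts:entity-highlight($myxml,
    if ($cts:entity-type eq "PERSON")
    then (
      (: Wrap the first PERSON match, then stop walking matches. :)
      <b>{ $cts:text }</b>,
      xdmp:set($cts:action, "break")
    )
    (: Any other entity type is replaced by its own unmodified text. :)
    else $cts:text)
```

Setting $cts:action with xdmp:set is the mechanism described in the usage notes; which matches are reported depends on the languages and entity types your license covers.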
|
|
|
|
cts:field-range-query(
|
|
$field-name as xs:string*,
|
|
$operator as xs:string,
|
|
$value as xs:anyAtomicType*,
|
|
[$options as xs:string*],
|
|
[$weight as xs:double?]
|
| ) as cts:field-range-query |
|
 |
Summary:
Returns a cts:query matching fields by name with a
range-index entry that compares to a given value according to the
specified operator. Searches with the
cts:field-range-query
constructor require a field range index on the specified field name(s);
if there is no range index configured, then an exception is thrown.
|
Parameters:
$field-name
:
One or more field names to match. When multiple field names are specified,
the query matches if any field name matches.
|
$operator
:
A comparison operator.
Operators include:
- "<"
- Match range index values less than $value.
- "<="
- Match range index values less than or equal to $value.
- ">"
- Match range index values greater than $value.
- ">="
- Match range index values greater than or equal to $value.
- "="
- Match range index values equal to $value.
- "!="
- Match range index values not equal to $value.
|
$value
:
One or more field values to match.
When multiple values are specified,
the query matches if any value matches.
|
$options
(optional):
Options to this query. The default is ().
Options include:
- "collation=URI"
- Use the range index with the collation specified by
URI. If not specified, then the default collation
from the query is used. If a range index with the specified
collation does not exist, an error is thrown.
- "cached"
- Cache the results of this query in the list cache.
- "uncached"
- Do not cache the results of this query in the list cache.
- "min-occurs=number"
- Specifies the minimum number of occurrences required. If
fewer than this number of occurrences are found, the fragment does not match.
The default is 1.
- "max-occurs=number"
- Specifies the maximum number of occurrences allowed. If
more than this number of occurrences are found, the fragment does not match.
The default is unbounded.
- "synonym"
- Specifies that all of the terms in the $value parameter are
considered synonyms for scoring purposes. The result is that
occurrences of more than one of the synonyms are scored as if
there are more occurrences of the same term (as opposed to
having a separate term that contributes to score).
|
$weight
(optional):
A weight for this query. The default is 1.0. In the current release,
this option is ignored; range queries do not contribute to the score.
|
|
Usage Notes:
If you want to constrain on a range of values, you can combine multiple
cts:field-range-query constructors together
with cts:and-query or any of the other composable
cts:query constructors.
If neither "cached" nor "uncached" is present, it specifies "cached".
Negative "min-occurs" or "max-occurs" values will be treated as 0 and
non-integral values will be rounded down. An error will be raised if
the "min-occurs" value is greater than the "max-occurs" value.
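To illustrate constraining on a range of values as described above, here is a minimal sketch that combines two constructors with cts:and-query. It assumes a field named "aname" with a string field range index; the bounds "J" and "K" are arbitrary illustrative values:

```xquery
xquery version "1.0-ml";
(: Sketch only: assumes a field "aname" with an xs:string range index. :)
cts:search(fn:doc(),
  cts:and-query((
    (: lower bound, inclusive :)
    cts:field-range-query("aname", ">=", "J"),
    (: upper bound, exclusive :)
    cts:field-range-query("aname", "<", "K"))))
```

Each bound is a separate range query, so a different operator can be chosen independently for each end of the range.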
|
Example:
(: Insert a few documents with test data :)
let $content1 := <name><fname>John</fname><mname>Rob</mname><lname>Goldings</lname></name>
let $content2 := <name><fname>Jim</fname><mname>Ken</mname><lname>Kurla</lname></name>
let $content3 := <name><fname>Ooi</fname><mname>Ben</mname><lname>Fu</lname></name>
let $content4 := <name><fname>James</fname><mname>Rick</mname><lname>Tod</lname></name>
return (
xdmp:document-insert("/aname1.xml",$content1),
xdmp:document-insert("/aname2.xml",$content2),
xdmp:document-insert("/aname3.xml",$content3),
xdmp:document-insert("/aname4.xml",$content4));
(:
requires a field (range) index of
type xs:string on field "aname"
:)
cts:search(doc(),cts:field-range-query("aname",">","Jim Kurla"))
(:
returns the following:
<?xml version="1.0" encoding="UTF-8"?>
<name><fname>John</fname><mname>Rob</mname><lname>Goldings</lname></name>
<?xml version="1.0" encoding="UTF-8"?>
<name><fname>Ooi</fname><mname>Ben</mname><lname>Fu</lname></name>
:)
;
(:
requires a field (range) index of
type xs:string on field "aname"
:)
cts:contains(doc(),cts:field-range-query("aname",">","Jim Kurla"))
(:
returns "true".
:)
|
|
|
|
cts:field-value-co-occurrences(
|
|
$field-name-1 as xs:string,
|
|
$field-name-2 as xs:string,
|
|
[$options as xs:string*],
|
|
[$query as cts:query?],
|
|
[$quality-weight as xs:double?],
|
|
[$forest-ids as xs:unsignedLong*]
|
| ) as element(cts:co-occurrence)* |
|
 |
Summary:
Returns value co-occurrences (that is, pairs of values, both of which appear
in the same fragment) from the specified field value lexicon(s). The
values are returned as an XML element with two children, each child
containing one of the co-occurring values. You can use
cts:frequency on each item returned to find how many times
the pair occurs.
Value lexicons are implemented using range indexes; consequently
this function requires a field range index for each field specified
in the function, and the range index must have range value positions
set to true. If there is not a range index configured for each
of the specified fields, or if range value positions are not
enabled for any of the range indexes, an exception is thrown.
|
Parameters:
$field-name-1
:
The name of the first field, as a string.
|
$field-name-2
:
The name of the second field, as a string.
|
$options
(optional):
Options. The default is ().
Options include:
- "ascending"
- Co-occurrences should be returned in ascending order.
- "descending"
- Co-occurrences should be returned in descending order.
- "any"
- Co-occurrences from any fragment should be included.
- "document"
- Co-occurrences from document fragments should be included.
- "properties"
- Co-occurrences from properties fragments should be included.
- "locks"
- Co-occurrences from locks fragments should be included.
- "frequency-order"
- Co-occurrences should be returned ordered by frequency.
- "item-order"
- Co-occurrences should be returned ordered by item.
- "fragment-frequency"
- Frequency should be the number of fragments with
an included co-occurrence.
This option is used with
cts:frequency.
- "item-frequency"
- Frequency should be the number of occurrences of
an included co-occurrence.
This option is used with
cts:frequency.
- "type=type"
- For both lexicons, use the type specified by type
(int, unsignedInt, long, unsignedLong, float, double, decimal,
dateTime, time, date, gYearMonth, gYear, gMonth, gDay,
yearMonthDuration, dayTimeDuration, string, or anyURI)
- "type-1=type"
- For the first lexicon, use the type specified by type
(int, unsignedInt, long, unsignedLong, float, double, decimal,
dateTime, time, date, gYearMonth, gYear, gMonth, gDay,
yearMonthDuration, dayTimeDuration, string, or anyURI)
- "type-2=type"
- For the second lexicon, use the type specified by type
(int, unsignedInt, long, unsignedLong, float, double, decimal,
dateTime, time, date, gYearMonth, gYear, gMonth, gDay,
yearMonthDuration, dayTimeDuration, string, or anyURI)
- "collation=URI"
- For both lexicons, use the collation specified by
URI.
- "collation-1=URI"
- For the first lexicon, use the collation specified by
URI.
- "collation-2=URI"
- For the second lexicon, use the collation specified by
URI.
- "timezone=TZ"
- Return timezone sensitive values (dateTime, time, date,
gYearMonth, gYear, gMonth, and gDay) adjusted to the timezone
specified by TZ.
Example timezones: Z, -08:00, +01:00.
- "ordered"
- Include co-occurrences only when the value from the first lexicon
appears before the value from the second lexicon.
Requires that word positions be enabled for both lexicons.
- "proximity=N"
- Include co-occurrences only when the values appear within
N words of each other.
Requires that word positions be enabled for both lexicons.
- "limit=N"
- Return no more than N co-occurrences.
- "skip=N"
- Skip over fragments selected by the
cts:query
to treat the Nth fragment as the first fragment.
Co-occurrences from skipped fragments are not included.
This option affects the number of fragments selected
by the cts:query to calculate frequencies.
Only applies when a $query parameter is specified.
- "sample=N"
- Return only co-occurrences from the first N
fragments after skip selected by the
cts:query.
This option does not affect the number of fragments selected
by the cts:query to calculate frequencies.
Only applies when a $query parameter is specified.
- "truncate=N"
- Include only co-occurrences from the first N
fragments after skip selected by the
cts:query.
This option also affects the number of fragments selected
by the cts:query to calculate frequencies.
Only applies when a $query parameter is specified.
- "score-logtfidf"
- Compute scores using the logtfidf method.
- "score-logtf"
- Compute scores using the logtf method.
- "score-simple"
- Compute scores using the simple method.
- "score-random"
- Compute scores using the random method.
- "checked"
- Word positions should be checked when resolving the query.
- "unchecked"
- Word positions should not be checked when resolving the query.
- "concurrent"
- Perform the work concurrently in another thread. This is a hint
to the query optimizer to help parallelize the lexicon work, allowing
the calling query to continue performing other work while the lexicon
processing occurs. This is especially useful in cases where multiple
lexicon calls occur in the same query (for example, resolving many
facets in a single query).
- "map"
- Return results as a single map:map value instead of as an
element(cts:co-occurrence)* sequence.
|
$query
(optional):
Only include co-occurrences in fragments selected by the cts:query,
and compute frequencies from this set of included co-occurrences.
The co-occurrences do not need to match the query, but they must occur in
fragments selected by the query.
The fragments are not filtered to ensure they match the query,
but instead selected in the same manner as
"unfiltered" cts:search operations. If a string
is entered, the string is treated as a cts:word-query of the
specified string.
|
$quality-weight
(optional):
A document quality weight to use when computing scores.
The default is 1.0.
|
$forest-ids
(optional):
A sequence of IDs of forests to which the search will be constrained.
An empty sequence means to search all forests in the database.
The default is ().
|
|
Usage Notes:
Only one of "frequency-order" or "item-order" may be specified
in the options parameter. If neither "frequency-order" nor "item-order"
is specified, then the default is "item-order".
Only one of "fragment-frequency" or "item-frequency" may be specified
in the options parameter. If neither "fragment-frequency" nor
"item-frequency" is specified, then the default is "fragment-frequency".
Only one of "ascending" or "descending" may be specified
in the options parameter. If neither "ascending" nor "descending"
is specified, then the default is "ascending" if "item-order" is
specified, and "descending" if "frequency-order" is specified.
Only one of "any", "document", "properties", or "locks"
may be specified in the options parameter.
If none of "any", "document", "properties", or "locks" are specified
and there is a $query parameter, then the default is "document".
If there is no $query parameter then the default is "any".
Only one of the "score-logtfidf", "score-logtf", "score-simple",
or "score-random" options may be specified in the options parameter.
If none of "score-logtfidf", "score-logtf", "score-simple", or
"score-random" are specified, then the default is "score-logtfidf".
Only one of the "checked" or "unchecked" options may be specified
in the options parameter.
If neither "checked" nor "unchecked" are specified,
then the default is "checked".
If "collation=URI" is not specified in the options parameter,
then the default collation is used. If a lexicon with that collation
does not exist, an error is thrown.
If "sample=N" is not specified in the options parameter,
then all included co-occurrences may be returned.
If a $query parameter
is not present, then "sample=N" has no effect.
If "truncate=N" is not specified in the options parameter,
then co-occurrences from all fragments selected by the
$query parameter are included.
If a $query parameter is not present, then
"truncate=N" has no effect.
|
Example:
(: Suppose we insert these two documents in the database.
Document 1:
<doc>
<name1>
<i11>John</i11><e12>Smith</e12><i13>Griffith</i13>
</name1>
<name2>
<i21>Will</i21><e22>Tim</e22><i23>Shields</i23>
</name2>
</doc>
Document 2:
<doc>
<name1>
<i11>Will<e12>Frank</e12>Shields</i11>
</name1>
<name2>
<i21>John<e22>Tim</e22>Griffith</i21>
</name2>
</doc>
:)
(: Now suppose we have two fields aname1 and aname2 defined on the database.
The field aname1 includes element "name1" and excludes "e12".
The field aname2 includes element "name2" and excludes "e22".
Both fields have field range indexes configured with positions ON.
:)
cts:field-value-co-occurrences("aname1","aname2")
=>
<cts:co-occurrence
xmlns:cts="http://marklogic.com/cts"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:xs="http://www.w3.org/2001/XMLSchema">
<cts:value xsi:type="xs:string">John Griffith</cts:value>
<cts:value xsi:type="xs:string">Will Shields</cts:value>
</cts:co-occurrence>
<cts:co-occurrence
xmlns:cts="http://marklogic.com/cts"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:xs="http://www.w3.org/2001/XMLSchema">
<cts:value xsi:type="xs:string">Will Shields</cts:value>
<cts:value xsi:type="xs:string">John Griffith</cts:value>
</cts:co-occurrence>
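As the summary notes, cts:frequency can be applied to each returned item. The following is a minimal sketch, assuming the same "aname1" and "aname2" fields and range indexes as in the example above; the 'pair' element and its 'count' attribute are hypothetical names chosen for this illustration:

```xquery
xquery version "1.0-ml";
(: Sketch only: assumes fields "aname1" and "aname2" with field range
   indexes that have positions enabled. :)
for $pair in cts:field-value-co-occurrences(
               "aname1", "aname2",
               ("frequency-order", "limit=10"))
return
  (: Report each pair of values together with its frequency. :)
  <pair count="{ cts:frequency($pair) }">{
    fn:string-join($pair/cts:value/fn:string(.), " / ")
  }</pair>
```

With "frequency-order" and no explicit direction, the usage notes for this function indicate that results default to descending order, so the most frequent pairs come first.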
|
Example:
(: Here is another example that finds co-occurrences between a field value
and an element value using the cts:element-value-co-occurrences() API. :)
(: Suppose we have the following document in the database/fragment. :)
<doc>
<person>
<name>
<first-name>Will</first-name>
<middle-name>Frank</middle-name>
<last-name>Shields</last-name>
</name>
<address>
<ZIP>92341</ZIP>
</address>
<phoneNumber>650-472-4444</phoneNumber>
</person>
<person>
<name>
<first-name>John</first-name>
<middle-name>Tim</middle-name>
<last-name>Hearst</last-name>
</name>
<address>
<ZIP>96345</ZIP>
</address>
<phoneNumber>750-947-5555</phoneNumber>
</person>
</doc>
(: This database has element range indexes defined on elements
ZIP and phoneNumber. Positions are set to true on the range indexes.
There is a field named "aname" defined on this database,
which excludes element middle-name.
A string range index is configured on the field "aname".
Positions are set to true on the database.
In the following query we are using lexicons on field values of
"aname" and element value "ZIP" to determine value co-occurrences.
However, notice that the field is being treated as if it were an
element with a MarkLogic predefined namespace
"http://marklogic.com/fields".
:)
declare namespace my="http://marklogic.com/fields";
cts:element-value-co-occurrences(xs:QName("ZIP"),xs:QName("my:aname"))
=>
<cts:co-occurrence
xmlns:cts="http://marklogic.com/cts"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:xs="http://www.w3.org/2001/XMLSchema">
<cts:value xsi:type="xs:int">68645</cts:value>
<cts:value xsi:type="xs:string">Jill Tom Lawless</cts:value>
</cts:co-occurrence>
<cts:co-occurrence
xmlns:cts="http://marklogic.com/cts"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:xs="http://www.w3.org/2001/XMLSchema">
<cts:value xsi:type="xs:int">68645</cts:value>
<cts:value xsi:type="xs:string">Nancy Smith Finkman</cts:value>
</cts:co-occurrence>
<cts:co-occurrence
xmlns:cts="http://marklogic.com/cts"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:xs="http://www.w3.org/2001/XMLSchema">
<cts:value xsi:type="xs:int">92341</cts:value>
<cts:value xsi:type="xs:string">John Tim Hearst</cts:value>
</cts:co-occurrence>
<cts:co-occurrence
xmlns:cts="http://marklogic.com/cts"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:xs="http://www.w3.org/2001/XMLSchema">
<cts:value xsi:type="xs:int">92341</cts:value>
<cts:value xsi:type="xs:string">Will Frank Shields</cts:value>
</cts:co-occurrence>
<cts:co-occurrence
xmlns:cts="http://marklogic.com/cts"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:xs="http://www.w3.org/2001/XMLSchema">
<cts:value xsi:type="xs:int">93452</cts:value>
<cts:value xsi:type="xs:string">Jill Tom Lawless</cts:value>
</cts:co-occurrence>
<cts:co-occurrence
xmlns:cts="http://marklogic.com/cts"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:xs="http://www.w3.org/2001/XMLSchema">
<cts:value xsi:type="xs:int">93452</cts:value>
<cts:value xsi:type="xs:string">Nancy Smith Finkman</cts:value>
</cts:co-occurrence>
<cts:co-occurrence
xmlns:cts="http://marklogic.com/cts"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:xs="http://www.w3.org/2001/XMLSchema">
<cts:value xsi:type="xs:int">96345</cts:value>
<cts:value xsi:type="xs:string">John Tim Hearst</cts:value>
</cts:co-occurrence>
<cts:co-occurrence
xmlns:cts="http://marklogic.com/cts"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:xs="http://www.w3.org/2001/XMLSchema">
<cts:value xsi:type="xs:int">96345</cts:value>
<cts:value xsi:type="xs:string">Will Frank Shields</cts:value>
</cts:co-occurrence>
|
|
|
|
cts:field-value-match(
|
|
$field-names as xs:string*,
|
|
$pattern as xs:anyAtomicType,
|
|
[$options as xs:string*],
|
|
[$query as cts:query?],
|
|
[$quality-weight as xs:double?],
|
|
[$forest-ids as xs:unsignedLong*]
|
| ) as xs:anyAtomicType* |
|
 |
Summary:
Returns values from the specified field value lexicon(s)
that match the specified wildcard pattern. Field value lexicons
are implemented using range indexes; consequently this function
requires a field range index for each field specified in the
function. If there is not a range index configured for each of the
specified fields, then an exception is thrown.
|
Parameters:
$field-names
:
One or more field names.
|
$pattern
:
A pattern to match. The parameter type must match the lexicon type.
String parameters may include wildcard characters.
|
$options
(optional):
Options. The default is ().
Options include:
- "case-sensitive"
- A case-sensitive match.
- "case-insensitive"
- A case-insensitive match.
- "diacritic-sensitive"
- A diacritic-sensitive match.
- "diacritic-insensitive"
- A diacritic-insensitive match.
- "ascending"
- Values should be returned in ascending order.
- "descending"
- Values should be returned in descending order.
- "any"
- Values from any fragment should be included.
- "document"
- Values from document fragments should be included.
- "properties"
- Values from properties fragments should be included.
- "locks"
- Values from locks fragments should be included.
- "frequency-order"
- Values should be returned ordered by frequency.
- "item-order"
- Values should be returned ordered by item.
- "fragment-frequency"
- Frequency should be the number of fragments with
an included value.
This option is used with
cts:frequency.
- "item-frequency"
- Frequency should be the number of occurrences of
an included value.
This option is used with
cts:frequency.
- "type=type"
- Use the lexicon with the type specified by type
(int, unsignedInt, long, unsignedLong, float, double, decimal,
dateTime, time, date, gYearMonth, gYear, gMonth, gDay,
yearMonthDuration, dayTimeDuration, string, or anyURI)
- "collation=URI"
- Use the range index with the collation specified by
URI.
- "timezone=TZ"
- Return timezone sensitive values (dateTime, time, date,
gYearMonth, gYear, gMonth, and gDay) adjusted to the timezone
specified by TZ.
Example timezones: Z, -08:00, +01:00.
- "limit=N"
- Return no more than N values.
- "skip=N"
- Skip over fragments selected by the
cts:query
to treat the Nth fragment as the first fragment.
Values from skipped fragments are not included.
This option affects the number of fragments selected
by the cts:query to calculate frequencies.
Only applies when a $query parameter is specified.
- "sample=N"
- Return only values from the first N
fragments after skip selected by the
cts:query.
This option does not affect the number of fragments selected
by the cts:query to calculate frequencies.
Only applies when a $query parameter is specified.
- "truncate=N"
- Include only values from the first N fragments
after skip selected by the
cts:query.
This option also affects the number of fragments selected
by the cts:query to calculate frequencies.
Only applies when a $query parameter is specified.
- "score-logtfidf"
- Compute scores using the logtfidf method.
- "score-logtf"
- Compute scores using the logtf method.
- "score-simple"
- Compute scores using the simple method.
- "score-random"
- Compute scores using the random method.
- "checked"
- Word positions should be checked when resolving the query.
- "unchecked"
- Word positions should not be checked when resolving the query.
- "concurrent"
- Perform the work concurrently in another thread. This is a hint
to the query optimizer to help parallelize the lexicon work, allowing
the calling query to continue performing other work while the lexicon
processing occurs. This is especially useful in cases where multiple
lexicon calls occur in the same query (for example, resolving many
facets in a single query).
- "map"
- Return results as a single map:map value instead of as an
xs:anyAtomicType* sequence.
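The "fragment-frequency" and "item-frequency" options in the list above differ only in what they count. The following Python sketch is a rough model of that difference, not MarkLogic code: the function name frequencies and the representation of each fragment as a list of values are assumptions made for illustration.

```python
from collections import Counter


def frequencies(fragments, mode="fragment-frequency"):
    """Count values across fragments (each fragment is a list of values)."""
    counts = Counter()
    for values in fragments:
        if mode == "item-frequency":
            # Every occurrence of a value counts.
            counts.update(values)
        else:
            # A value counts at most once per fragment.
            counts.update(set(values))
    return counts


fragments = [["a", "a", "b"], ["a"], ["b", "b", "b"]]
print(frequencies(fragments))
print(frequencies(fragments, "item-frequency"))
```

With the sample data, "a" appears in two fragments but three times overall, which is the distinction the two options draw.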
|
$query
(optional):
Only include values in fragments selected by the cts:query,
and compute frequencies from this set of included values.
The values do not need to match the query, but they must occur in
fragments selected by the query.
The fragments are not filtered to ensure they match the query,
but instead selected in the same manner as
"unfiltered" cts:search operations. If a string
is entered, the string is treated as a cts:word-query of the
specified string.
|
$quality-weight
(optional):
A document quality weight to use when computing scores.
The default is 1.0.
|
$forest-ids
(optional):
A sequence of IDs of forests to which the search will be constrained.
An empty sequence means to search all forests in the database.
The default is ().
|
|
Usage Notes:
Only one of "frequency-order" or "item-order" may be specified
in the options parameter. If neither "frequency-order" nor "item-order"
is specified, then the default is "item-order".
Only one of "fragment-frequency" or "item-frequency" may be specified
in the options parameter. If neither "fragment-frequency" nor
"item-frequency" is specified, then the default is "fragment-frequency".
Only one of "ascending" or "descending" may be specified
in the options parameter. If neither "ascending" nor "descending"
is specified, then the default is "ascending" if "item-order" is
specified, and "descending" if "frequency-order" is specified.
Only one of "any", "document", "properties", or "locks"
may be specified in the options parameter.
If none of "any", "document", "properties", or "locks" are specified
and there is a $query parameter, then the default is "document".
If there is no $query parameter then the default is "any".
Only one of the "score-logtfidf", "score-logtf", "score-simple",
or "score-random" options may be specified in the options parameter.
If none of "score-logtfidf", "score-logtf", "score-simple", or
"score-random" are specified, then the default is "score-logtfidf".
Only one of the "checked" or "unchecked" options may be specified
in the options parameter.
If neither "checked" nor "unchecked" are specified,
then the default is "checked".
If "collation=URI" is not specified in the options parameter,
then the default collation is used. If a range index with that collation
does not exist, an error is thrown.
If "sample=N" is not specified in the options parameter,
then all included values may be returned. If a $query parameter
is not present, then "sample=N" has no effect.
If "truncate=N" is not specified in the options parameter,
then values from all fragments selected by the $query parameter
are included. If a $query parameter is not present, then
"truncate=N" has no effect.
If neither "case-sensitive" nor "case-insensitive"
is present, $pattern is used to determine case sensitivity.
If $pattern contains no uppercase, it specifies "case-insensitive".
If $pattern contains uppercase, it specifies "case-sensitive".
If neither "diacritic-sensitive" nor "diacritic-insensitive"
is present, $pattern is used to determine diacritic sensitivity.
If $pattern contains no diacritics, it specifies "diacritic-insensitive".
If $pattern contains diacritics, it specifies "diacritic-sensitive".
|
Example:
cts:field-value-match("aname","Jim *")
=> "Jim Kurla"
|
|
|
|
cts:field-value-query(
|
|
$field-name as xs:string*,
|
|
$text as xs:string*,
|
|
[$options as xs:string*],
|
|
[$weight as xs:double?]
|
| ) as cts:field-value-query |
|
 |
Summary:
Returns a query matching text content containing a given value in the
specified field. If the specified field does not exist,
cts:field-value-query throws an exception.
If the specified field does not have the index setting
field value searches enabled, either for the database or
for the specified field, then a cts:search with a
cts:field-value-query throws an exception. A field
is a named object that specifies elements to include and exclude
from a search, and can include score weights for any included elements.
You create fields at the database level using the Admin Interface. For
details on fields, see the chapter on "Fields Database Settings" in the
Administrator's Guide.
|
Parameters:
$field-name
:
One or more field names to search over. If multiple field names are
supplied, the match can be in any of the specified fields (or-query
semantics).
|
$text
:
The value to match. If multiple strings are specified,
the query matches if any of the values match (or-query
semantics).
|
$options
(optional):
Options to this query. The default is ().
Options include:
- "case-sensitive"
- A case-sensitive query.
- "case-insensitive"
- A case-insensitive query.
- "diacritic-sensitive"
- A diacritic-sensitive query.
- "diacritic-insensitive"
- A diacritic-insensitive query.
- "punctuation-sensitive"
- A punctuation-sensitive query.
- "punctuation-insensitive"
- A punctuation-insensitive query.
- "whitespace-sensitive"
- A whitespace-sensitive query.
- "whitespace-insensitive"
- A whitespace-insensitive query.
- "stemmed"
- A stemmed query.
- "unstemmed"
- An unstemmed query.
- "wildcarded"
- A wildcarded query.
- "unwildcarded"
- An unwildcarded query.
- "exact"
- An exact match query. Shorthand for "case-sensitive",
"diacritic-sensitive", "punctuation-sensitive",
"whitespace-sensitive", "unstemmed", and "unwildcarded".
- "lang=iso639code"
- Specifies the language of the query. The iso639code
code portion is case-insensitive, and uses the languages
specified by
ISO 639.
The default is specified in the database configuration.
- "distance-weight=number"
- A weight applied based on the minimum distance between matches
of this query. Higher weights add to the importance of
proximity (as opposed to term matches) when the relevance order is
calculated.
The default value is 0.0 (no impact of proximity). The
absolute value of the weight should be less than or equal to 16
(that is, the weight should be between -16 and 16); weights greater
than 16 have the same effect as a weight of 16.
This parameter has no effect if the
word positions
index is not enabled. This parameter has no effect on searches that
use score-simple or score-random (because those scoring algorithms
do not consider term frequency, proximity is irrelevant).
- "min-occurs=number"
- Specifies the minimum number of occurrences required. If
fewer than this number of words occur, the fragment does not match.
The default is 1.
- "max-occurs=number"
- Specifies the maximum number of occurrences allowed. If
more than this number of words occur, the fragment does not match.
The default is unbounded.
- "synonym"
- Specifies that all of the terms in the $text parameter are
considered synonyms for scoring purposes. The result is that
occurrences of more than one of the synonyms are scored as if
there are more occurrences of the same term (as opposed to
having a separate term that contributes to score).
|
$weight
(optional):
A weight for this query.
Higher weights move search results up in the relevance
order. The default is 1.0. The
absolute value of the weight should be less than or equal to 16 (that is,
the weight should be between -16 and 16); weights greater than 16 have
the same effect as a weight of 16.
Weights with an absolute value less than 0.0625 (between -0.0625 and
0.0625) are rounded to 0, which means that they do not contribute to the
score.
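The limits on $weight described above can be modeled in a few lines. This is a hedged Python sketch, not MarkLogic code: the function name effective_weight is hypothetical, and treating weights below -16 like -16 is an assumption made by symmetry with the documented upper limit.

```python
def effective_weight(weight: float = 1.0) -> float:
    """Model the documented treatment of the $weight parameter."""
    # Weights whose absolute value is below 0.0625 are rounded to 0,
    # so they do not contribute to the score.
    if abs(weight) < 0.0625:
        return 0.0
    # Weights greater than 16 behave like 16. Clamping at -16 as well
    # is an assumption; the text only states the upper case explicitly.
    return max(-16.0, min(16.0, weight))


for w in (0.05, 2.5, 40.0):
    print(w, effective_weight(w))
```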
|
|
Usage Notes:
If you use cts:near-query with
cts:field-value-query, the distance supplied in the near query
applies to the whole document, not just to the field. For example, if
you specify a near query with a distance of 3, it will return matches
when the values are within 3 words in the whole document.
For a code example illustrating this, see the second example
below.
Values are determined based on the words (tokens) in the values of elements
that are included in the field. Field values span all the included elements.
They cannot span excluded elements (this is because MarkLogic Server breaks
out of the field when it encounters the excluded element and starts the
field again when it encounters the next included element). Field
values will also span included sibling elements.
If neither "case-sensitive" nor "case-insensitive"
is present, $text is used to determine case sensitivity.
If $text contains no uppercase, it specifies "case-insensitive".
If $text contains uppercase, it specifies "case-sensitive".
If neither "diacritic-sensitive" nor "diacritic-insensitive"
is present, $text is used to determine diacritic sensitivity.
If $text contains no diacritics, it specifies "diacritic-insensitive".
If $text contains diacritics, it specifies "diacritic-sensitive".
If neither "punctuation-sensitive" nor "punctuation-insensitive"
is present, $text is used to determine punctuation sensitivity.
If $text contains no punctuation, it specifies "punctuation-insensitive".
If $text contains punctuation, it specifies "punctuation-sensitive".
If neither "whitespace-sensitive" nor "whitespace-insensitive"
is present, the query is "whitespace-insensitive".
If neither "stemmed" nor "unstemmed"
is present, the database configuration determines stemming.
If the database has "stemmed searches" enabled, it specifies "stemmed".
Otherwise it specifies "unstemmed".
If neither "wildcarded" nor "unwildcarded"
is present, the database configuration and $text determine wildcarding.
If the database has any wildcard indexes enabled ("three character
searches", "two character searches", "one character searches", or
"trailing wildcard searches") and if $text contains either of the
wildcard characters '?' or '*', it specifies "wildcarded".
Otherwise it specifies "unwildcarded".
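The defaulting rules above can be summarized as a small decision procedure. The Python sketch below is an illustrative model only: the function names, the stemmed_searches and wildcard_indexes flags (standing in for the database configuration), and the use of Unicode character properties to detect diacritics and punctuation are all assumptions. In particular, this sketch counts the wildcard characters '?' and '*' as punctuation, which may not match the server's behavior.

```python
import unicodedata

EXACT = ("case-sensitive", "diacritic-sensitive", "punctuation-sensitive",
         "whitespace-sensitive", "unstemmed", "unwildcarded")


def has_diacritics(text):
    # Approximation: a combining mark appears after canonical decomposition.
    return any(unicodedata.combining(ch)
               for ch in unicodedata.normalize("NFD", text))


def has_punctuation(text):
    # Approximation: Unicode general categories starting with "P".
    return any(unicodedata.category(ch).startswith("P") for ch in text)


def resolve_options(text, options=(), stemmed_searches=False,
                    wildcard_indexes=False):
    """Return the six effective sensitivity options for $text."""
    opts = set(options)
    if "exact" in opts:
        # "exact" is shorthand for the six options listed in EXACT.
        opts.update(EXACT)

    def choose(on, off, default_on):
        # An explicit option wins; otherwise fall back to the default rule.
        if on in opts:
            return on
        if off in opts:
            return off
        return on if default_on else off

    wild = wildcard_indexes and any(ch in "?*" for ch in text)
    return [
        choose("case-sensitive", "case-insensitive",
               any(ch.isupper() for ch in text)),
        choose("diacritic-sensitive", "diacritic-insensitive",
               has_diacritics(text)),
        choose("punctuation-sensitive", "punctuation-insensitive",
               has_punctuation(text)),
        choose("whitespace-sensitive", "whitespace-insensitive", False),
        choose("stemmed", "unstemmed", stemmed_searches),
        choose("wildcarded", "unwildcarded", wild),
    ]


print(resolve_options("Jaz Smith"))
```

Under these assumptions, "Jaz Smith" resolves to a case-sensitive query because it contains uppercase characters, while "jaz smith" would be case-insensitive.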
If neither "stemmed" nor "unstemmed"
is present, then the database configuration determines if a query
is run as "stemmed" (stemmed searches enabled) or "unstemmed"
(word searches enabled and stemmed searches disabled).
If the query is a wildcard
query and is also a phrase query (contains two or more terms),
then any wildcard terms in the query will be "unstemmed".
Negative "min-occurs" or "max-occurs" values will be treated as 0 and
non-integral values will be rounded down. An error will be raised if
the "min-occurs" value is greater than the "max-occurs" value.
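The adjustments applied to "min-occurs" and "max-occurs" can be sketched as follows. This Python model is an assumption-laden illustration: the function name normalize_occurs is hypothetical, None stands for the unbounded default of "max-occurs", and comparing the two values after they have been adjusted (rather than before) is a guess.

```python
import math


def normalize_occurs(min_occurs=1, max_occurs=None):
    """Return (min, max) after adjustment; max of None means unbounded."""

    def clean(n):
        # Non-integral values are rounded down; negatives are treated as 0.
        return max(0, math.floor(n))

    lo = clean(min_occurs)
    hi = None if max_occurs is None else clean(max_occurs)
    # Assumption: the comparison uses the adjusted values.
    if hi is not None and lo > hi:
        raise ValueError("min-occurs is greater than max-occurs")
    return lo, hi


print(normalize_occurs(2.9, 5))
print(normalize_occurs(-3))
```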
|
Example:
let $contents := <Employee><name><fname>Jaz</fname><mname>Roy</mname><lname>Smith</lname></name></Employee>
return
cts:contains($contents,cts:field-value-query("myField","Jaz Roy Smith"))
=> Checks whether the field myField has a value matching "Jaz Roy Smith"
in node $contents. The field must exist in the database against which
this query is evaluated. "myField" in this case includes element "name" and excludes "mname". This expression returns false.
|
Example:
let $contents := <Employee><name><fname>Jaz</fname><mname>Roy</mname><lname>Smith</lname></name></Employee>
return
cts:contains($contents,cts:field-value-query("myField","Jaz Smith"))
=> Returns true.
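The two examples above differ only in whether the text of the excluded element is part of the field value. The Python sketch below is a simplified model of that outcome, not a description of how MarkLogic Server computes field values: the function name field_value, the tag-name matching, and joining words with a single space are assumptions chosen to be consistent with the examples.

```python
import xml.etree.ElementTree as ET


def field_value(root, included, excluded):
    """Join the words of included elements, skipping excluded ones."""
    words = []

    def walk(elem, inside):
        # Entering an excluded element leaves the field; entering an
        # included element (re)enters it. Other elements inherit.
        if elem.tag in excluded:
            inside = False
        elif elem.tag in included:
            inside = True
        if inside and elem.text:
            words.extend(elem.text.split())
        for child in elem:
            walk(child, inside)
            # Text after a child belongs to the current element.
            if inside and child.tail:
                words.extend(child.tail.split())

    walk(root, False)
    return " ".join(words)


doc = ET.fromstring(
    "<Employee><name><fname>Jaz</fname><mname>Roy</mname>"
    "<lname>Smith</lname></name></Employee>")
print(field_value(doc, included={"name"}, excluded={"mname"}))
```

With "name" included and "mname" excluded, the modeled value contains "Jaz" and "Smith" but not "Roy", which matches the results of the first two examples.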
|
Example:
In this query, the search is fully resolved in the index.
cts:search(fn:doc("/Employee/jaz.xml"),cts:field-value-query("myField","Jaz Smith"),"unfiltered")
=> Returns the doc which has field "myField" and a match with the value of the field.
|
|
|
|
cts:field-value-ranges(
|
|
$field-names as xs:string*,
|
|
[$bounds as xs:anyAtomicType*],
|
|
[$options as xs:string*],
|
|
[$query as cts:query?],
|
|
[$quality-weight as xs:double?],
|
|
[$forest-ids as xs:unsignedLong*]
|
| ) as element(cts:range)* |
|
 |
Summary:
Returns value ranges from the specified field value lexicon(s).
Value lexicons are implemented using range indexes; consequently this
function requires a field range index for each field specified
in the function. If there is not a range index configured for each
of the specified fields, an exception is thrown.
The values are divided into buckets. The $bounds parameter specifies
the number of buckets and the size of each bucket.
All included values are bucketed, even those less than the lowest bound
or greater than the highest bound. An empty sequence for $bounds specifies
one bucket, a single value specifies two buckets, two values specify
three buckets, and so on.
If you have string values and you pass a $bounds parameter
as in the following call:
cts:field-value-ranges("myField", ("f", "m"))
The first bucket contains string values that are less than the
string f, the second bucket contains string values greater than
or equal to f but less than m, and the third bucket
contains string values that are greater than or equal to m.
For each non-empty bucket, a cts:range element is returned.
Each cts:range element has a cts:minimum child
and a cts:maximum child. If a bucket is bounded, its
cts:range element will also have a
cts:lower-bound child if it is bounded from below, and
a cts:upper-bound element if it is bounded from above.
Empty buckets return nothing unless the "empties" option is specified.
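The bucketing rules in this summary can be modeled outside the server. The following Python sketch is an illustration only: the function name value_ranges and the dictionary output are hypothetical stand-ins for the cts:range elements, values are compared by code point rather than by a collation, and the handling of "empties" follows the description of that option in the options list.

```python
from bisect import bisect_right


def value_ranges(values, bounds, empties=False):
    """Bucket values by bounds and describe each returned bucket."""
    buckets = [[] for _ in range(len(bounds) + 1)]
    for v in values:
        # A value equal to a bound goes into the bucket above that bound.
        buckets[bisect_right(bounds, v)].append(v)

    ranges = []
    for i, bucket in enumerate(buckets):
        lower = bounds[i - 1] if i > 0 else None
        upper = bounds[i] if i < len(bounds) else None
        fully_bounded = lower is not None and upper is not None
        # Empty buckets are dropped unless "empties" is set and the
        # bucket has both a lower and an upper bound.
        if not bucket and not (empties and fully_bounded):
            continue
        r = {}
        if bucket:
            r["minimum"], r["maximum"] = min(bucket), max(bucket)
        if lower is not None:
            r["lower-bound"] = lower
        if upper is not None:
            r["upper-bound"] = upper
        ranges.append(r)
    return ranges


names = ["John Goldings", "Jim Kurla", "Ooi Fu", "James Tod",
         "Anthony Flemings", "Charles Winter", "Nancy Schmidt",
         "Robert Hanson"]
for r in value_ranges(names, ["A", "J", "O"]):
    print(r)
```

Three bounds define four buckets here, but nothing sorts below "A", so only three ranges are produced.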
|
Parameters:
$field-names
:
One or more field names.
|
$bounds
(optional):
A sequence of range bounds.
The types must match the lexicon type.
The values must be in strictly ascending order, otherwise an exception
is thrown.
|
$options
(optional):
Options. The default is ().
Options include:
- "ascending"
- Ranges should be returned in ascending order.
- "descending"
- Ranges should be returned in descending order.
- "empties"
- Include fully-bounded ranges whose frequency is 0. These ranges
will have no minimum or maximum value. Only empty ranges that have
both their upper and lower bounds specified in the $bounds
parameter are returned;
any empty ranges that are less than the first bound or greater than the
last bound are not returned. For example, if you specify 4 bounds
and there are no results for any of the bounds, 3 elements are
returned (not 5 elements).
- "any"
- Values from any fragment should be included.
- "document"
- Values from document fragments should be included.
- "properties"
- Values from properties fragments should be included.
- "locks"
- Values from locks fragments should be included.
- "frequency-order"
- Ranges should be returned ordered by frequency.
- "item-order"
- Ranges should be returned ordered by item.
- "fragment-frequency"
- Frequency should be the number of fragments with
an included value.
This option is used with
cts:frequency.
- "item-frequency"
- Frequency should be the number of occurrences of
an included value.
This option is used with
cts:frequency.
- "type=type"
- Use the lexicon with the type specified by type
(int, unsignedInt, long, unsignedLong, float, double, decimal,
dateTime, time, date, gYearMonth, gYear, gMonth, gDay,
yearMonthDuration, dayTimeDuration, string, or anyURI)
- "collation=URI"
- Use the lexicon with the collation specified by URI.
- "timezone=TZ"
- Return timezone sensitive values (dateTime, time, date,
gYearMonth, gYear, gMonth, and gDay) adjusted to the timezone
specified by TZ.
Example timezones: Z, -08:00, +01:00.
- "limit=N"
- Return no more than N ranges.
- "skip=N"
- Skip over fragments selected by the
cts:query
to treat the Nth fragment as the first fragment.
Values from skipped fragments are not included.
This option affects the number of fragments selected
by the cts:query to calculate frequencies.
Only applies when a $query parameter is specified.
- "sample=N"
- Return only ranges for buckets with at least one value from
the first N fragments after skip selected by the
cts:query.
This option does not affect the number of fragments selected
by the cts:query to calculate frequencies.
Only applies when a $query parameter is specified.
- "truncate=N"
- Include only values from the first N fragments
after skip selected by the
cts:query.
This option also affects the number of fragments selected
by the cts:query to calculate frequencies.
Only applies when a $query parameter is specified.
- "score-logtfidf"
- Compute scores using the logtfidf method.
- "score-logtf"
- Compute scores using the logtf method.
- "score-simple"
- Compute scores using the simple method.
- "score-random"
- Compute scores using the random method.
- "checked"
- Word positions should be checked when resolving the query.
- "unchecked"
- Word positions should not be checked when resolving the query.
- "concurrent"
- Perform the work concurrently in another thread. This is a hint
to the query optimizer to help parallelize the lexicon work, allowing
the calling query to continue performing other work while the lexicon
processing occurs. This is especially useful in cases where multiple
lexicon calls occur in the same query (for example, resolving many
facets in a single query).
|
$query
(optional):
Only include values in fragments selected by the cts:query,
and compute frequencies from this set of included values.
The values do not need to match the query, but they must occur in
fragments selected by the query.
The fragments are not filtered to ensure they match the query,
but instead selected in the same manner as
"unfiltered" cts:search operations. If a string
is entered, the string is treated as a cts:word-query of the
specified string.
|
$quality-weight
(optional):
A document quality weight to use when computing scores.
The default is 1.0.
|
$forest-ids
(optional):
A sequence of IDs of forests to which the search will be constrained.
An empty sequence means to search all forests in the database.
The default is ().
|
|
Usage Notes:
Only one of "frequency-order" or "item-order" may be specified
in the options parameter. If neither "frequency-order" nor "item-order"
is specified, then the default is "item-order".
Only one of "fragment-frequency" or "item-frequency" may be specified
in the options parameter. If neither "fragment-frequency" nor
"item-frequency" is specified, then the default is "fragment-frequency".
Only one of "ascending" or "descending" may be specified
in the options parameter. If neither "ascending" nor "descending"
is specified, then the default is "ascending" if "item-order" is
specified, and "descending" if "frequency-order" is specified.
Only one of "any", "document", "properties", or "locks"
may be specified in the options parameter.
If none of "any", "document", "properties", or "locks" are specified
and there is a $query parameter, then the default is "document".
If there is no $query parameter then the default is "any".
Only one of the "score-logtfidf", "score-logtf", "score-simple",
or "score-random" options may be specified in the options parameter.
If none of "score-logtfidf", "score-logtf", "score-simple", or
"score-random" are specified, then the default is "score-logtfidf".
Only one of the "checked" or "unchecked" options may be specified
in the options parameter.
If neither "checked" nor "unchecked" are specified,
then the default is "checked".
If "collation=URI" is not specified in the options parameter,
then the default collation is used. If a lexicon with that collation
does not exist, an error is thrown.
If "sample=N" is not specified in the options parameter,
then ranges with all included values may be returned. If a
$query parameter is not present, then "sample=N"
has no effect.
If "truncate=N" is not specified in the options parameter,
then values from all fragments selected by the $query parameter
are included. If a $query parameter is not present, then
"truncate=N" has no effect.
|
Example:
(: Run the following to load data for this example.
Make sure you have a string field range index on
field aname. :)
let $content1 := <name><fname>John</fname><mname>Rob</mname><lname>Goldings</lname></name>
let $content2 := <name><fname>Jim</fname><mname>Ken</mname><lname>Kurla</lname></name>
let $content3 := <name><fname>Ooi</fname><mname>Ben</mname><lname>Fu</lname></name>
let $content4 := <name><fname>James</fname><mname>Rick</mname><lname>Tod</lname></name>
let $content5 := <name><fname>Anthony</fname><mname>Rob</mname><lname>Flemings</lname></name>
let $content6 := <name><fname>Charles</fname><mname>Ken</mname><lname>Winter</lname></name>
let $content7 := <name><fname>Nancy</fname><mname>Ben</mname><lname>Schmidt</lname></name>
let $content8 := <name><fname>Robert</fname><mname>Rick</mname><lname>Hanson</lname></name>
return (
xdmp:document-insert("/aname1.xml",$content1),
xdmp:document-insert("/aname2.xml",$content2),
xdmp:document-insert("/aname3.xml",$content3),
xdmp:document-insert("/aname4.xml",$content4),
xdmp:document-insert("/aname5.xml",$content5),
xdmp:document-insert("/aname6.xml",$content6),
xdmp:document-insert("/aname7.xml",$content7),
xdmp:document-insert("/aname8.xml",$content8)
)
(: The following is based on the above setup :)
cts:field-value-ranges("aname",("A","J","O"));
=>
<cts:range xmlns:cts="http://marklogic.com/cts" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xs="http://www.w3.org/2001/XMLSchema">
<cts:minimum xsi:type="xs:string">Anthony Flemings</cts:minimum>
<cts:maximum xsi:type="xs:string">Charles Winter</cts:maximum>
<cts:lower-bound xsi:type="xs:string">A</cts:lower-bound>
<cts:upper-bound xsi:type="xs:string">J</cts:upper-bound>
</cts:range>
<cts:range xmlns:cts="http://marklogic.com/cts" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xs="http://www.w3.org/2001/XMLSchema">
<cts:minimum xsi:type="xs:string">James Tod</cts:minimum>
<cts:maximum xsi:type="xs:string">Nancy Schmidt</cts:maximum>
<cts:lower-bound xsi:type="xs:string">J</cts:lower-bound>
<cts:upper-bound xsi:type="xs:string">O</cts:upper-bound>
</cts:range>
<cts:range xmlns:cts="http://marklogic.com/cts" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xs="http://www.w3.org/2001/XMLSchema">
<cts:minimum xsi:type="xs:string">Ooi Fu</cts:minimum>
<cts:maximum xsi:type="xs:string">Robert Hanson</cts:maximum>
<cts:lower-bound xsi:type="xs:string">O</cts:lower-bound>
</cts:range>
|
|
|
|
cts:field-values(
|
|
$field-names as xs:string*,
|
|
[$start as xs:anyAtomicType?],
|
|
[$options as xs:string*],
|
|
[$query as cts:query?],
|
|
[$quality-weight as xs:double?],
|
|
[$forest-ids as xs:unsignedLong*]
|
| ) as xs:anyAtomicType* |
|
 |
Summary:
Returns values from the specified field value lexicon(s).
Value lexicons are implemented using range indexes; consequently this
function requires a field range index for each field specified
in the function. If there is not a range index configured for each
of the specified fields, an exception is thrown.
|
Parameters:
$field-names
:
One or more field names.
|
$start
(optional):
A starting value. The parameter type must match the lexicon type.
If the parameter value is not in the lexicon, then the values are
returned beginning with the next value.
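The behavior of $start can be pictured as a lookup in a sorted list. This Python sketch is only a model: the function name values_from is hypothetical, it covers ascending order only, and it compares strings by code point rather than by a collation.

```python
from bisect import bisect_left


def values_from(lexicon, start=None):
    """Return lexicon values in ascending order, beginning at start."""
    ordered = sorted(lexicon)
    if start is None:
        return ordered
    # bisect_left finds start itself when present, otherwise the
    # position of the next larger value.
    return ordered[bisect_left(ordered, start):]


names = ["John Goldings", "Jim Kurla", "Ooi Fu", "James Tod"]
print(values_from(names, "John Goldings"))
print(values_from(names, "K"))
```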
|
$options
(optional):
Options. The default is ().
Options include:
- "ascending"
- Values should be returned in ascending order.
- "descending"
- Values should be returned in descending order.
- "any"
- Values from any fragment should be included.
- "document"
- Values from document fragments should be included.
- "properties"
- Values from properties fragments should be included.
- "locks"
- Values from locks fragments should be included.
- "frequency-order"
- Values should be returned ordered by frequency.
- "item-order"
- Values should be returned ordered by item.
- "fragment-frequency"
- Frequency should be the number of fragments with
an included value.
This option is used with
cts:frequency.
- "item-frequency"
- Frequency should be the number of occurrences of
an included value.
This option is used with
cts:frequency.
- "type=type"
- Use the lexicon with the type specified by type
(int, unsignedInt, long, unsignedLong, float, double, decimal,
dateTime, time, date, gYearMonth, gYear, gMonth, gDay,
yearMonthDuration, dayTimeDuration, string, or anyURI)
- "collation=URI"
- Use the lexicon with the collation specified by URI.
- "timezone=TZ"
- Return timezone sensitive values (dateTime, time, date,
gYearMonth, gYear, gMonth, and gDay) adjusted to the timezone
specified by TZ.
Example timezones: Z, -08:00, +01:00.
- "limit=N"
- Return no more than N values.
- "skip=N"
- Skip over fragments selected by the
cts:query
to treat the Nth fragment as the first fragment.
Values from skipped fragments are not included.
This option affects the number of fragments selected
by the cts:query to calculate frequencies.
Only applies when a $query parameter is specified.
- "sample=N"
- Return only values from the first N
fragments after skip selected by the
cts:query.
This option does not affect the number of fragments selected
by the cts:query to calculate frequencies.
Only applies when a $query parameter is specified.
- "truncate=N"
- Include only values from the first N fragments
after skip selected by the
cts:query.
This option also affects the number of fragments selected
by the cts:query to calculate frequencies.
Only applies when a $query parameter is specified.
- "score-logtfidf"
- Compute scores using the logtfidf method.
- "score-logtf"
- Compute scores using the logtf method.
- "score-simple"
- Compute scores using the simple method.
- "score-random"
- Compute scores using the random method.
- "checked"
- Word positions should be checked when resolving the query.
- "unchecked"
- Word positions should not be checked when resolving the query.
- "concurrent"
- Perform the work concurrently in another thread. This is a hint
to the query optimizer to help parallelize the lexicon work, allowing
the calling query to continue performing other work while the lexicon
processing occurs. This is especially useful in cases where multiple
lexicon calls occur in the same query (for example, resolving many
facets in a single query).
- "map"
- Return results as a single map:map value instead of as an
xs:anyAtomicType* sequence.
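The "sample=N" and "truncate=N" options in the list above both look at the first N fragments, but only "truncate=N" changes the frequencies. The Python sketch below models that difference under several assumptions: the function name lexicon_values is hypothetical, fragments are given as lists of values in selection order with any "skip=N" already applied, frequencies are fragment frequencies, and values are returned in code point order.

```python
from collections import Counter


def lexicon_values(fragments, sample=None, truncate=None):
    """Return (value, frequency) pairs for the selected fragments."""
    # truncate=N limits the fragments used for values and frequencies.
    if truncate is not None:
        fragments = fragments[:truncate]
    # Frequencies are computed over every remaining fragment.
    freq = Counter(v for values in fragments for v in set(values))
    # sample=N only limits which values are returned.
    sampled = fragments if sample is None else fragments[:sample]
    returned = sorted({v for values in sampled for v in values})
    return [(v, freq[v]) for v in returned]


fragments = [["a"], ["a", "b"], ["b"], ["a", "c"]]
print(lexicon_values(fragments, sample=1))
print(lexicon_values(fragments, truncate=1))
```

In this model, sampling one fragment still reports the frequency of "a" across all four fragments, while truncating to one fragment reports a frequency of 1.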
|
$query
(optional):
Only include values in fragments selected by the cts:query,
and compute frequencies from this set of included values.
The values do not need to match the query, but they must occur in
fragments selected by the query.
The fragments are not filtered to ensure they match the query,
but instead selected in the same manner as
"unfiltered" cts:search operations. If a string
is entered, the string is treated as a cts:word-query of the
specified string.
|
$quality-weight
(optional):
A document quality weight to use when computing scores.
The default is 1.0.
|
$forest-ids
(optional):
A sequence of IDs of forests to which the search will be constrained.
An empty sequence means to search all forests in the database.
The default is ().
|
|
Usage Notes:
Only one of "frequency-order" or "item-order" may be specified
in the options parameter. If neither "frequency-order" nor "item-order"
is specified, then the default is "item-order".
Only one of "fragment-frequency" or "item-frequency" may be specified
in the options parameter. If neither "fragment-frequency" nor
"item-frequency" is specified, then the default is "fragment-frequency".
Only one of "ascending" or "descending" may be specified
in the options parameter. If neither "ascending" nor "descending"
is specified, then the default is "ascending" if "item-order" is
specified, and "descending" if "frequency-order" is specified.
Only one of "any", "document", "properties", or "locks"
may be specified in the options parameter.
If none of "any", "document", "properties", or "locks" are specified
and there is a $query parameter, then the default is "document".
If there is no $query parameter then the default is "any".
Only one of the "score-logtfidf", "score-logtf", "score-simple",
or "score-random" options may be specified in the options parameter.
If none of "score-logtfidf", "score-logtf", "score-simple", or
"score-random" are specified, then the default is "score-logtfidf".
Only one of the "checked" or "unchecked" options may be specified
in the options parameter.
If neither "checked" nor "unchecked" are specified,
then the default is "checked".
If "collation=URI" is not specified in the options parameter,
then the default collation is used. If a lexicon with that collation
does not exist, an error is thrown.
If "sample=N" is not specified in the options parameter,
then all included values may be returned. If a $query parameter
is not present, then "sample=N" has no effect.
If "truncate=N" is not specified in the options parameter,
then values from all fragments selected by the $query parameter
are included. If a $query parameter is not present, then
"truncate=N" has no effect.
|
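The option defaults described in the usage notes can be gathered into one lookup. This Python sketch is a model for illustration: the function name resolve_lexicon_options and the dictionary keys are hypothetical, and raising an error when two mutually exclusive options are given is an assumption about how the "only one of" rule is enforced.

```python
def resolve_lexicon_options(options=(), has_query=False):
    """Return the effective choice for each mutually exclusive group."""
    opts = set(options)

    def one_of(group, default):
        chosen = [o for o in group if o in opts]
        if len(chosen) > 1:
            # Assumption: conflicting options are rejected.
            raise ValueError("only one of %s may be specified"
                             % ", ".join(group))
        return chosen[0] if chosen else default

    order = one_of(("frequency-order", "item-order"), "item-order")
    return {
        "order": order,
        "frequency": one_of(("fragment-frequency", "item-frequency"),
                            "fragment-frequency"),
        # The default direction depends on the chosen order.
        "direction": one_of(
            ("ascending", "descending"),
            "descending" if order == "frequency-order" else "ascending"),
        # The default fragment type depends on whether $query is given.
        "fragments": one_of(("any", "document", "properties", "locks"),
                            "document" if has_query else "any"),
        "score": one_of(("score-logtfidf", "score-logtf", "score-simple",
                         "score-random"), "score-logtfidf"),
        "checking": one_of(("checked", "unchecked"), "checked"),
    }


print(resolve_lexicon_options(["frequency-order"], has_query=True))
```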
Example:
cts:field-values("my_field","John Goldings")
=> ("John Goldings","Ooi Fu",...)
|
|
|
|
cts:field-word-match(
|
|
$field-names as xs:string*,
|
|
$pattern as xs:string,
|
|
[$options as xs:string*],
|
|
[$query as cts:query?],
|
|
[$quality-weight as xs:double?],
|
|
[$forest-ids as xs:unsignedLong*]
|
| ) as xs:string* |
|
 |
Summary:
Returns words from the specified field word lexicon(s) that match
a wildcard pattern. This function requires a field word lexicon
configured for each of the specified fields in the function. If there
is not a field word lexicon configured for any of the specified
fields, an exception is thrown.
|
Parameters:
$field-names
:
One or more field names.
|
$pattern
:
Wildcard pattern to match.
|
$options
(optional):
Options. The default is ().
Options include:
- "case-sensitive"
- A case-sensitive match.
- "case-insensitive"
- A case-insensitive match.
- "diacritic-sensitive"
- A diacritic-sensitive match.
- "diacritic-insensitive"
- A diacritic-insensitive match.
- "ascending"
- Words should be returned in ascending order.
- "descending"
- Words should be returned in descending order.
- "any"
- Words from any fragment should be included.
- "document"
- Words from document fragments should be included.
- "properties"
- Words from properties fragments should be included.
- "locks"
- Words from locks fragments should be included.
- "collation=URI"
- Use the lexicon with the collation specified by URI.
- "limit=N"
- Return no more than N words.
- "skip=N"
- Skip over fragments selected by the
cts:query
to treat the Nth matching fragment as the first fragment.
Words from skipped fragments are not included.
Only applies when a $query parameter is specified.
- "sample=N"
- Return only words from the first N fragments after
skip selected by the
cts:query.
Only applies when a $query parameter is specified.
- "truncate=N"
- Include only words from the first N fragments after
skip selected by the
cts:query.
Only applies when a $query parameter is specified.
- "score-logtfidf"
- Compute scores using the logtfidf method.
- "score-logtf"
- Compute scores using the logtf method.
- "score-simple"
- Compute scores using the simple method.
- "score-random"
- Compute scores using the random method.
- "checked"
- Word positions should be checked when resolving the query.
- "unchecked"
- Word positions should not be checked when resolving the query.
- "concurrent"
- Perform the work concurrently in another thread. This is a hint
to the query optimizer to help parallelize the lexicon work, allowing
the calling query to continue performing other work while the lexicon
processing occurs. This is especially useful in cases where multiple
lexicon calls occur in the same query (for example, resolving many
facets in a single query).
- "map"
- Return results as a single map:map value instead of as an
xs:string* sequence.
|
$query
(optional):
Only include words in fragments selected by the cts:query.
The words do not need to match the query, but the words must occur
in fragments selected by the query.
The fragments are not filtered to ensure they match the query,
but instead selected in the same manner as
"unfiltered" cts:search operations. If a string
is entered, the string is treated as a cts:word-query of the
specified string.
|
$quality-weight
(optional):
A document quality weight to use when computing scores.
The default is 1.0.
|
$forest-ids
(optional):
A sequence of IDs of forests to which the search will be constrained.
An empty sequence means to search all forests in the database.
The default is ().
|
|
Usage Notes:
Only one of "ascending" or "descending" may be specified
in the options parameter. If neither "ascending" nor "descending"
is specified, then the default is "ascending".
Only one of "any", "document", "properties", or "locks"
may be specified in the options parameter.
If none of "any", "document", "properties", or "locks" are specified
and there is a $query parameter, then the default is "document".
If there is no $query parameter then the default is "any".
Only one of the "score-logtfidf", "score-logtf", "score-simple",
or "score-random" options may be specified in the options parameter.
If none of "score-logtfidf", "score-logtf", "score-simple", or
"score-random" are specified, then the default is "score-logtfidf".
Only one of the "checked" or "unchecked" options may be specified
in the options parameter.
If neither "checked" nor "unchecked" are specified,
then the default is "checked".
If "collation=URI" is not specified in the options parameter,
then the default collation is used. If a lexicon with that collation
does not exist, an error is thrown.
If "sample=N" is not specified in the options parameter,
then all included words may be returned. If a $query parameter
is not present, then "sample=N" has no effect.
If "truncate=N" is not specified in the options parameter,
then words from all fragments selected by the $query parameter
are included. If a $query parameter is not present, then
"truncate=N" has no effect.
If neither "case-sensitive" nor "case-insensitive"
is present, $pattern is used to determine case sensitivity.
If $pattern contains no uppercase, it specifies "case-insensitive".
If $pattern contains uppercase, it specifies "case-sensitive".
If neither "diacritic-sensitive" nor "diacritic-insensitive"
is present, $pattern is used to determine diacritic sensitivity.
If $pattern contains no diacritics, it specifies "diacritic-insensitive".
If $pattern contains diacritics, it specifies "diacritic-sensitive".
Only words that can be matched with field-word-query are included.
That is, only words present in immediate text node children of the
specified field as well as any text node children of child fields
defined in the Admin Interface as field-word-query-throughs or
phrase-throughs.
|
Example:
cts:field-word-match("animal","aardvark*")
=> ("aardvark","aardvarks")
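The options described above can be combined. The following is a minimal sketch, assuming the same "animal" field with a field word lexicon configured; the pattern and the limit are illustrative values only. It asks for at most five matching words, returned in descending order.

```xquery
xquery version "1.0-ml";
(: Assumes a field named "animal" exists and has a field word lexicon. :)
(: "descending" reverses the default ascending order; "limit=5" caps the result. :)
cts:field-word-match("animal", "aard*", ("descending", "limit=5"))
```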
|
|
|
|
cts:field-word-query(
|
|
$field-name as xs:string*,
|
|
$text as xs:string*,
|
|
[$options as xs:string*],
|
|
[$weight as xs:double?]
|
| ) as cts:field-word-query |
|
 |
Summary:
Returns a query matching text content containing a given phrase in the
specified field. If the specified field does not exist,
cts:field-word-query throws an exception. A field
is a named object that specifies elements to include and exclude
from a search, and can include score weights for any included elements.
You create fields at the database level using the Admin Interface. For
details on fields, see the chapter on "Fields Database Settings" in the
Administrator's Guide.
|
Parameters:
$field-name
:
One or more field names to search over. If multiple field names are
supplied, the match can be in any of the specified fields (or-query
semantics).
|
$text
:
The word or phrase to match. If multiple strings are specified,
the query matches if any of the words or phrases match (or-query
semantics).
|
$options
(optional):
Options to this query. The default is ().
Options include:
- "case-sensitive"
- A case-sensitive query.
- "case-insensitive"
- A case-insensitive query.
- "diacritic-sensitive"
- A diacritic-sensitive query.
- "diacritic-insensitive"
- A diacritic-insensitive query.
- "punctuation-sensitive"
- A punctuation-sensitive query.
- "punctuation-insensitive"
- A punctuation-insensitive query.
- "whitespace-sensitive"
- A whitespace-sensitive query.
- "whitespace-insensitive"
- A whitespace-insensitive query.
- "stemmed"
- A stemmed query.
- "unstemmed"
- An unstemmed query.
- "wildcarded"
- A wildcarded query.
- "unwildcarded"
- An unwildcarded query.
- "exact"
- An exact match query. Shorthand for "case-sensitive",
"diacritic-sensitive", "punctuation-sensitive",
"whitespace-sensitive", "unstemmed", and "unwildcarded".
- "lang=iso639code"
- Specifies the language of the query. The iso639code
code portion is case-insensitive, and uses the languages
specified by
ISO 639.
The default is specified in the database configuration.
- "distance-weight=number"
- A weight applied based on the minimum distance between matches
of this query. Higher weights add to the importance of
proximity (as opposed to term matches) when the relevance order is
calculated.
The default value is 0.0 (no impact of proximity). The
absolute value of the weight should be less than or equal to 16
(between -16 and 16); weights greater than 16 have the
same effect as a weight of 16.
This parameter has no effect if the
word positions
index is not enabled. This parameter has no effect on searches that
use score-simple or score-random (because those scoring algorithms
do not consider term frequency, proximity is irrelevant).
- "min-occurs=number"
- Specifies the minimum number of occurrences required. If
fewer than this number of words occur, the fragment does not match.
The default is 1.
- "max-occurs=number"
- Specifies the maximum number of occurrences allowed. If
more than this number of words occur, the fragment does not match.
The default is unbounded.
- "synonym"
- Specifies that all of the terms in the $text parameter are
considered synonyms for scoring purposes. The result is that
occurrences of more than one of the synonyms are scored as if
there are more occurrences of the same term (as opposed to
having a separate term that contributes to score).
|
$weight
(optional):
A weight for this query.
Higher weights move search results up in the relevance
order. The default is 1.0. The
absolute value of the weight should be less than or equal to 16 (between
-16 and 16); weights greater than 16 have the same effect as a
weight of 16.
Weights with an absolute value less than 0.0625 (between -0.0625 and
0.0625) are rounded to 0, which means that they do not contribute to the
score.
|
|
Usage Notes:
If you use cts:near-query with
cts:field-word-query, the distance supplied in the near query
applies to the whole document, not just to the field. For example, if
you specify a near query with a distance of 3, it will return matches
when the words or phrases are within 3 words in the whole document,
even if some of those words are not in the specified field. For a code
example illustrating this, see the second example
below.
Phrases are determined based on words being next to each other
(word positions with a distance of 1) and words being in the same
instance of the field. Because field word positions
are determined based on the fragment, not on the field, field phrases
cannot span excluded elements (this is because MarkLogic Server breaks
out of the field when it encounters the excluded element and starts a new
field when it encounters the next included element). Similarly, field
phrases will not span included sibling elements. The
second code example below illustrates this.
Field phrases will automatically phrase-through all child elements of
an included element, until it encounters an explicitly excluded
element. The third example below illustrates this.
An example of when this automatic phrase-through behavior might be
convenient is if you create a field that includes only the element
ABSTRACT. Then all child elements of ABSTRACT
are included in the field, and phrases would span all of the child
elements (that is, phrases would "phrase-through" all the child elements).
Negative "min-occurs" or "max-occurs" values will be treated as 0 and
non-integral values will be rounded down. An error will be raised if
the "min-occurs" value is greater than the "max-occurs" value.
|
Example:
cts:search(fn:doc(), cts:field-word-query("myField", "my phrase"))
=> a list of documents that contain the phrase
"my phrase" in the field "myField". The field
must exist in the database against which this query
is evaluated.
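The optional parameters can be supplied in the same call. The following is a sketch only, assuming the same "myField" field exists; the second phrase, the options, and the weight value are illustrative choices, not recommendations.

```xquery
xquery version "1.0-ml";
(: Assumes a field named "myField" exists in the database. :)
cts:search(fn:doc(),
  cts:field-word-query(
    "myField",
    ("my phrase", "another phrase"),    (: matches if either phrase occurs :)
    ("case-insensitive", "unstemmed"),  (: options described above :)
    2.0))                               (: weight; the default is 1.0 :)
```

Because multiple $text values have or-query semantics, a document needs to contain only one of the two phrases in the field to match.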
|
Example:
(:
Assume the database has a field named
"buzz" with the element "buzz"
included and the element "baz" excluded.
:)
let $x :=
<hello>word1 word2 word3
<buzz>word4 word5</buzz>
<baz>word6 word7 word8</baz>
<buzz>word9 word10</buzz>
</hello>
return (
cts:contains($x, cts:near-query(
(cts:field-word-query("buzz", "word5"),
cts:field-word-query("buzz", "word9")), 3)),
cts:contains($x, cts:near-query(
(cts:field-word-query("buzz", "word5"),
cts:field-word-query("buzz", "word9")), 4)),
cts:contains($x,
cts:field-word-query("buzz", "word5 word9")))
(:
Returns the sequence ("false", "true", "false").
The first part does not match because
the distance between "word5" and "word9"
is 4. This is because the distance is
calculated based on the whole node (if the
document was in a database, based on the
fragment), not based on the field. The
second part specifies a distance of 4, and
therefore matches and returns true. The third
part does not match because the phrase is
based on the entire node, not on the field,
and there are words between "word5" and "word9"
in the node (even though not in the field).
:)
|
Example:
(:
Assume the database has a field named
"buzz" with the element "buzz"
included and the element "baz" excluded.
:)
let $x :=
<hello>
<buzz>word1 word2
<gads>word3 word4 word5</gads>
<zukes>word6 word7 word8</zukes>
word9 word10
</buzz>
</hello>
return (
cts:contains($x,
cts:field-word-query("buzz", "word2 word3")))
(:
Returns "true" because the children of
"buzz" are not excluded, and are therefore
automatically phrased through.
:)
|
|
|
|
cts:field-words(
|
|
$field-names as xs:string*,
|
|
[$start as xs:string?],
|
|
[$options as xs:string*],
|
|
[$query as cts:query?],
|
|
[$quality-weight as xs:double?],
|
|
[$forest-ids as xs:unsignedLong*]
|
| ) as xs:string* |
|
 |
Summary:
Returns words from the specified field word lexicon. This function
requires a field word lexicon for each of the fields specified in the
function. If there is not a field word lexicon configured for any
of the specified fields, an exception is thrown. The words are
returned in collation order.
|
Parameters:
$field-names
:
One or more field names.
|
$start
(optional):
A starting word. Returns only this word and any following words
from the lexicon. If the parameter is not in the lexicon, then it
returns the words beginning with the next word.
|
$options
(optional):
Options. The default is ().
Options include:
- "ascending"
- Words should be returned in ascending order.
- "descending"
- Words should be returned in descending order.
- "any"
- Words from any fragment should be included.
- "document"
- Words from document fragments should be included.
- "properties"
- Words from properties fragments should be included.
- "locks"
- Words from locks fragments should be included.
- "collation=URI"
- Use the lexicon with the collation specified by URI.
- "limit=N"
- Return no more than N words.
- "skip=N"
- Skip over fragments selected by the
cts:query
to treat the Nth fragment as the first fragment.
Words from skipped fragments are not included.
Only applies when a $query parameter is specified.
- "sample=N"
- Return only words from the first N fragments after
skip selected by the
cts:query.
Only applies when a $query parameter is specified.
- "truncate=N"
- Include only words from the first N fragments after
skip selected by the
cts:query.
Only applies when a $query parameter is specified.
- "score-logtfidf"
- Compute scores using the logtfidf method.
- "score-logtf"
- Compute scores using the logtf method.
- "score-simple"
- Compute scores using the simple method.
- "score-random"
- Compute scores using the random method.
- "checked"
- Word positions should be checked when resolving the query.
- "unchecked"
- Word positions should not be checked when resolving the query.
- "concurrent"
- Perform the work concurrently in another thread. This is a hint
to the query optimizer to help parallelize the lexicon work, allowing
the calling query to continue performing other work while the lexicon
processing occurs. This is especially useful in cases where multiple
lexicon calls occur in the same query (for example, resolving many
facets in a single query).
- "map"
- Return results as a single map:map value instead of as an
xs:string* sequence.
|
$query
(optional):
Only include words in fragments selected by the cts:query.
The words do not need to match the query, but the words must occur
in fragments selected by the query.
The fragments are not filtered to ensure they match the query,
but instead selected in the same manner as
"unfiltered" cts:search operations. If a string
is entered, the string is treated as a cts:word-query of the
specified string.
|
$quality-weight
(optional):
A document quality weight to use when computing scores.
The default is 1.0.
|
$forest-ids
(optional):
A sequence of IDs of forests to which the search will be constrained.
An empty sequence means to search all forests in the database.
The default is ().
|
|
Usage Notes:
Only one of "ascending" or "descending" may be specified
in the options parameter. If neither "ascending" nor "descending"
is specified, then the default is "ascending".
Only one of "any", "document", "properties", or "locks"
may be specified in the options parameter.
If none of "any", "document", "properties", or "locks" are specified
and there is a $query parameter, then the default is "document".
If there is no $query parameter then the default is "any".
Only one of the "score-logtfidf", "score-logtf", "score-simple",
or "score-random" options may be specified in the options parameter.
If none of "score-logtfidf", "score-logtf", "score-simple", or
"score-random" are specified, then the default is "score-logtfidf".
Only one of the "checked" or "unchecked" options may be specified
in the options parameter.
If neither "checked" nor "unchecked" are specified,
then the default is "checked".
If "collation=URI" is not specified in the options parameter,
then the default collation is used. If a lexicon with that collation
does not exist, an error is thrown.
If "sample=N" is not specified in the options parameter,
then all included words may be returned. If a $query parameter
is not present, then "sample=N" has no effect.
If "truncate=N" is not specified in the options parameter,
then words from all fragments selected by the $query parameter
are included. If a $query parameter is not present, then
"truncate=N" has no effect.
Only words that can be matched with field-word-query are included.
That is, only words present in immediate text node children of the
specified field as well as any text node children of child fields
defined in the Admin Interface as field-word-query-throughs or
phrase-throughs.
|
Example:
cts:field-words("animal","aardvark")
=> ("aardvark","aardvarks","aardwolf",...)
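The $query parameter restricts the lexicon lookup to words from selected fragments. The following is a minimal sketch, assuming the same "animal" field with a field word lexicon; the constraining word "africa" and the limit are hypothetical values chosen for illustration.

```xquery
xquery version "1.0-ml";
(: Assumes a field named "animal" exists and has a field word lexicon. :)
(: Returns up to 10 words, starting at "aardvark", from document fragments :)
(: selected (unfiltered) by the word query. :)
cts:field-words(
  "animal",
  "aardvark",
  ("document", "limit=10"),
  cts:word-query("africa"))
```

The returned words do not themselves need to match "africa"; they only need to occur in fragments that the query selects.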
|
|
|
|
cts:frequency(
|
|
$value as item()
|
| ) as xs:integer |
|
 |
Summary:
Returns an integer representing the number of times in which a particular
value occurs in a value lexicon lookup (for example,
cts:element-values). When using the
fragment-frequency lexicon option, cts:frequency
returns the number of fragments in which the lexicon value occurs. When
using the item-frequency lexicon option,
cts:frequency returns the total number of times
in which the lexicon value occurs in each item.
|
Parameters:
$value
:
A value returned from a value lexicon lookup (for example,
cts:element-values).
|
|
Usage Notes:
You must have a Range index configured to use the value lexicon APIs
(cts:element-values, cts:element-value-match,
cts:element-attribute-values, or
cts:element-attribute-value-match).
If the value specified is not from a value lexicon lookup,
cts:frequency returns a frequency of 0.
The frequency returned from cts:frequency is fragment-based
by default (using the default fragment-frequency option in the
lexicon API). If there are multiple occurrences of the value in any given
fragment, the frequency is still one per fragment when using
fragment-frequency. Therefore, if the value
returned is 13, it means that the value occurs in 13 fragments.
If you want the total frequency instead of the fragment-based frequency
(that is, the total number of occurrences of the value in the items specified
in the cts:query option of the lexicon API),
you must specify the item-frequency option to the lexicon
API value input to cts:frequency. For example, the second
example below specifies an item-frequency and a
cts:document-query in the lexicon
API, so the item frequency is how many times each speaker speaks in the
play (because the constraining query is a document query of hamlet.xml, which
contains the whole play).
|
Example:
<results>{
let $x := cts:element-values(xs:QName("SPEAKER"),"",(),
cts:document-query("/shakespeare/plays/hamlet.xml"))
for $speaker in $x
return
(
<result>
<SPEAKER>{$speaker}</SPEAKER>
<NUMBER-OF-SPEECHES>{cts:frequency($speaker)}</NUMBER-OF-SPEECHES>
</result>
)
}</results>
=> Returns the names of the speakers in Hamlet
with the number of times they speak. If the
play is fragmented at the SCENE level, then
it returns the number of scenes in which each
speaker speaks.
|
Example:
<results>{
let $x := cts:element-values(xs:QName("SPEAKER"),
"", "item-frequency",
cts:document-query("/shakespeare/plays/hamlet.xml"))
for $speaker in $x
return
(
<result>
<SPEAKER>{$speaker}</SPEAKER>
<NUMBER-OF-SPEECHES>
{cts:frequency($speaker)}
</NUMBER-OF-SPEECHES>
</result>
)
}</results>
=> Returns the names of the speakers in Hamlet
with the number of times they speak. Returns
the total times they speak, regardless
of fragmentation.
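A related sketch, assuming the same range index on SPEAKER and the same document URI as the examples above: the "frequency-order" and "limit=N" lexicon options can be used so that only the most frequent values are returned, with cts:frequency reporting the count for each. The limit of 5 is an illustrative value.

```xquery
xquery version "1.0-ml";
(: Assumes an element range index on SPEAKER and that the document exists. :)
for $speaker in cts:element-values(
    xs:QName("SPEAKER"), "",
    ("item-frequency", "frequency-order", "descending", "limit=5"),
    cts:document-query("/shakespeare/plays/hamlet.xml"))
return
  (: cts:frequency must be called on the value as returned by the lexicon :)
  fn:concat($speaker, ": ", cts:frequency($speaker))
```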
|
|
|
|
cts:geospatial-co-occurrences(
|
|
$geo-element-name-1 as xs:QName,
|
|
$child-1-name-1 as xs:QName?,
|
|
$child-1-name-2 as xs:QName?,
|
|
$geo-element-name-2 as xs:QName,
|
|
$child-2-name-1 as xs:QName?,
|
|
$child-2-name-2 as xs:QName?,
|
|
[$options as xs:string*],
|
|
[$query as cts:query?],
|
|
[$quality-weight as xs:double?],
|
|
[$forest-ids as xs:unsignedLong*]
|
| ) as element(cts:co-occurrence)* |
|
 |
Summary:
Returns value co-occurrences from the geospatial lexicons.
Geospatial lexicons are implemented using geospatial indexes;
consequently this function requires a geospatial index for each
combination of elements and attributes specified in the function.
If there is not a geospatial index configured for the specified
element/attribute combination, then an exception is thrown.
|
Parameters:
$geo-element-name-1
:
An element QName.
|
$child-1-name-1
:
An element or attribute QName or empty sequence.
The empty sequence specifies an element geospatial lexicon.
|
$child-1-name-2
:
An element or attribute QName or empty sequence.
The empty sequence specifies either an element lexicon or an
element-child geospatial lexicon.
|
$geo-element-name-2
:
An element QName.
|
$child-2-name-1
:
An element or attribute QName or empty sequence.
The empty sequence specifies an element geospatial lexicon.
|
$child-2-name-2
:
An element or attribute QName or empty sequence.
The empty sequence specifies either an element lexicon or an
element-child geospatial lexicon.
|
$options
(optional):
Options. The default is ().
Options include:
- "geospatial-format=format"
- For both geospatial lexicons, use the kind of geospatial lexicon
specified by format
(element, element-child, element-pair, or element-attribute-pair).
If neither of the child QNames is specified, the default is
"element"; if only the first of the child QNames is specified,
the default is "element-child"; if both child QNames are specified,
the default is "element-pair". If the selection is not compatible
with the number of geospatial QNames specified, an error is raised.
- "geospatial-format-1=format"
- For the first geospatial lexicon, use the kind of geospatial lexicon
specified by format
(element, element-child, element-pair, or element-attribute-pair).
If neither of the child QNames is specified, the default is
"element"; if only the first of the child QNames is specified,
the default is "element-child"; if both child QNames are specified,
the default is "element-pair". If the selection is not compatible
with the number of geospatial QNames specified, an error is raised.
- "geospatial-format-2=format"
For the second geospatial lexicon, use the kind of geospatial
lexicon specified by format
(element, element-child, element-pair, or element-attribute-pair).
If neither of the child QNames is specified, the default is
"element"; if only the first of the child QNames is specified,
the default is "element-child"; if both child QNames are specified,
the default is "element-pair". If the selection is not compatible
with the number of geospatial QNames specified, an error is raised.
- "ascending"
- Co-occurrences should be returned in ascending order.
- "descending"
- Co-occurrences should be returned in descending order.
- "any"
- Co-occurrences from any fragment should be included.
- "document"
- Co-occurrences from document fragments should be included.
- "properties"
- Co-occurrences from properties fragments should be included.
- "locks"
- Co-occurrences from locks fragments should be included.
- "frequency-order"
- Co-occurrences should be returned ordered by frequency.
- "item-order"
- Co-occurrences should be returned ordered by item.
- "fragment-frequency"
- Frequency should be the number of fragments with
an included co-occurrence.
This option is used with
cts:frequency.
- "item-frequency"
- Frequency should be the number of occurrences of
an included co-occurrence.
This option is used with
cts:frequency.
- "coordinate-system=URI"
- For both geospatial lexicons, use the coordinate system specified by
name.
- "coordinate-system-1=URI"
- For the first geospatial lexicon, use the coordinate system
specified by name.
- "coordinate-system-2=URI"
- For the second geospatial lexicon, use the coordinate system
specified by name.
- "ordered"
- Include co-occurrences only when the value from the first lexicon
appears before the value from the second lexicon.
Requires that word positions be enabled for both lexicons.
- "reversed"
- Consider the second lexicon as the first and vice versa.
- "proximity=N"
- Include co-occurrences only when the values appear within
N words of each other.
Requires that word positions be enabled for both lexicons.
- "limit=N"
- Return no more than N co-occurrences.
- "skip=N"
- Skip over fragments selected by the
cts:query
to treat the Nth matching fragment as the first fragment.
Co-occurrences from skipped fragments are not included.
This option affects the number of fragments selected
by the cts:query to calculate frequencies.
Only applies when a $query parameter is specified.
- "sample=N"
- Return only co-occurrences from the first N
fragments after skip selected by the
cts:query.
This option does not affect the number of fragments selected
by the cts:query to calculate frequencies.
Only applies when a $query parameter is specified.
- "truncate=N"
- Include only co-occurrences from the first N
fragments after skip selected by the
cts:query.
This option affects the number of fragments selected
by the cts:query to calculate frequencies.
Only applies when a $query parameter is specified.
- "score-logtfidf"
- Compute scores using the logtfidf method.
- "score-logtf"
- Compute scores using the logtf method.
- "score-simple"
- Compute scores using the simple method.
- "score-random"
- Compute scores using the random method.
- "checked"
- Word positions should be checked when resolving the query.
- "unchecked"
- Word positions should not be checked when resolving the query.
- "concurrent"
- Perform the work concurrently in another thread. This is a hint
to the query optimizer to help parallelize the lexicon work, allowing
the calling query to continue performing other work while the lexicon
processing occurs. This is especially useful in cases where multiple
lexicon calls occur in the same query (for example, resolving many
facets in a single query).
- "map"
- Return results as a single map:map value instead of as an
element(cts:co-occurrence)* sequence.
|
$query
(optional):
Only include co-occurrences in fragments selected by the cts:query,
and compute frequencies from this set of included co-occurrences.
The co-occurrences do not need to match the query, but they must occur in
fragments selected by the query.
The fragments are not filtered to ensure they match the query,
but instead selected in the same manner as
"unfiltered" cts:search operations.
|
$quality-weight
(optional):
A document quality weight to use when computing scores.
The default is 1.0.
|
$forest-ids
(optional):
A sequence of IDs of forests to which the search will be constrained.
An empty sequence means to search all forests in the database.
The default is ().
|
|
Usage Notes:
Only one of "frequency-order" or "item-order" may be specified
in the options parameter. If neither "frequency-order" nor "item-order"
is specified, then the default is "item-order".
Only one of "fragment-frequency" or "item-frequency" may be specified
in the options parameter. If neither "fragment-frequency" nor
"item-frequency" is specified, then the default is "fragment-frequency".
Only one of "ascending" or "descending" may be specified
in the options parameter. If neither "ascending" nor "descending"
is specified, then the default is "ascending" if "item-order" is
specified, and "descending" if "frequency-order" is specified.
Only one of "any", "document", "properties", or "locks"
may be specified in the options parameter.
If none of "any", "document", "properties", or "locks" are specified
and there is a $query parameter, then the default is "document".
If there is no $query parameter then the default is "any".
Only one of the "score-logtfidf", "score-logtf", "score-simple",
or "score-random" options may be specified in the options parameter.
If none of "score-logtfidf", "score-logtf", "score-simple", or
"score-random" are specified, then the default is "score-logtfidf".
Only one of the "checked" or "unchecked" options may be specified
in the options parameter.
If neither "checked" nor "unchecked" are specified,
then the default is "checked".
If "coordinate-system=name" is not specified in the options
parameter, then the default coordinate system is used. If a lexicon with
that coordinate system does not exist, an error is thrown.
If "sample=N" is not specified in the options parameter,
then all included co-occurrences may be returned.
If a $query parameter
is not present, then "sample=N" has no effect.
If "truncate=N" is not specified in the options parameter,
then co-occurrences from all fragments selected by the
$query parameter are included.
If a $query parameter is not present, then
"truncate=N" has no effect.
|
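Example:
The following is a sketch only, not taken from the original reference. It assumes two hypothetical element-pair geospatial indexes: one on "origin" and one on "destination", each with "lat" and "long" child elements. These element names are assumptions made for illustration and must be replaced with names that have geospatial indexes in your database.

```xquery
xquery version "1.0-ml";
(: Hypothetical element names; each pair needs an element-pair :)
(: geospatial index configured, otherwise an exception is thrown. :)
for $co in cts:geospatial-co-occurrences(
    xs:QName("origin"), xs:QName("lat"), xs:QName("long"),
    xs:QName("destination"), xs:QName("lat"), xs:QName("long"),
    ("frequency-order", "limit=10"))
return
  <pair frequency="{cts:frequency($co)}">{$co}</pair>
```

With "frequency-order" and no explicit direction, the usage notes above say the default is "descending", so the most frequent co-occurrences would be expected first.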
|
|
|
cts:highlight(
|
|
$node as node(),
|
|
$query as cts:query,
|
|
$expr as item()*
|
| ) as node() |
|
 |
Summary:
Returns a copy of the node, replacing any text matching the query
with the specified expression. You can use this function
to easily highlight any text found in a query. Unlike
fn:replace and other XQuery string functions that match
literal text, cts:highlight matches every term that
matches the search, including stemmed matches or matches with
different capitalization.
|
Parameters:
$node
:
A node to highlight. The node must be either a document node
or an element node; it cannot be a text node.
|
$query
:
A query specifying the text to highlight. If a string
is entered, the string is treated as a cts:word-query of the
specified string.
|
$expr
:
An expression with which to replace each match. You can use the
variables $cts:text, $cts:node,
$cts:queries, $cts:start, and
$cts:action (described below) in the expression.
|
|
Usage Notes:
There are five built-in variables to represent a query match.
These variables can be used inline in the expression parameter.
$cts:text as xs:string
The matched text.
$cts:node as text()
The node containing the matched text.
$cts:queries as cts:query*
The matching queries.
$cts:start as xs:integer
The string-length position of the first character of
$cts:text in $cts:node. Therefore, the following
always returns true:
fn:substring($cts:node, $cts:start,
fn:string-length($cts:text)) eq $cts:text
$cts:action as xs:string
Use xdmp:set on this to specify what should happen
next
- "continue"
- (default) Walk the next match.
If there are no more matches, return all evaluation results.
- "skip"
- Skip walking any more matches and return all evaluation results.
- "break"
- Stop walking matches and return all evaluation results.
You cannot use cts:highlight to highlight results matching
cts:similar-query and cts:element-attribute-*-query
items. Using cts:highlight with these queries will
return the nodes without any highlighting.
You can also use cts:highlight as a general search
and replace function. The specified expression will replace any matching
text. For example, you could replace the word "hello" with "goodbye"
in a query similar to the following:
cts:highlight($node, "hello", "goodbye")
Because the expressions can be any XQuery expression, they can be very
simple like the above example or they can be extremely complex.
|
Example:
To highlight "MarkLogic" with bold in the following paragraph:
let $x := <p>MarkLogic Server is an enterprise-class
database specifically built for content.</p>
return
cts:highlight($x, "MarkLogic", <b>{$cts:text}</b>)
Returns:
<p><b>MarkLogic</b> Server is an enterprise-class
database specifically built for content.</p>
|
Example:
Given the following document with the URI "hellogoodbye.xml":
<root>
<a>It starts with hello and ends with goodbye.</a>
</root>
The following query will highlight the word "hello" in
blue, and everything else in red.
cts:highlight(doc("hellogoodbye.xml"),
cts:and-query((cts:word-query("hello"),
cts:word-query("goodbye"))),
if (cts:word-query-text($cts:queries) eq "hello")
then (<font color="blue">{$cts:text}</font>)
else (<font color="red">{$cts:text}</font>))
returns:
<root>
<a>It starts with <font color="blue">hello</font>
and ends with <font color="red">goodbye</font>.</a>
</root>
|
Example:
for $x in cts:search(collection(), "MarkLogic")
return
cts:highlight($x, "MarkLogic", <b>{$cts:text}</b>)
returns all of the nodes that contain "MarkLogic",
placing bold markup around the matched words.
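The $cts:action variable described in the usage notes can be set from within the expression. The following is a hedged sketch, with an invented sample paragraph, that is intended to highlight only the first match by setting the action to "break" after the first replacement is produced.

```xquery
xquery version "1.0-ml";
(: The sample paragraph is made up for illustration. :)
let $x := <p>MarkLogic Server is built by MarkLogic.</p>
return
  cts:highlight($x, "MarkLogic",
    (
      <b>{$cts:text}</b>,
      (: xdmp:set returns the empty sequence, so only the <b> element :)
      (: contributes to the replacement; "break" stops walking matches. :)
      xdmp:set($cts:action, "break")
    ))
```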
|
|
|
|
cts:linestring(
|
|
$vertices as cts:point*
|
| ) as cts:linestring |
|
 |
Summary:
Returns a geospatial linestring value.
|
Parameters:
$vertices
:
The vertices (waypoints) of the linestring, given in order.
|
|
Example:
let $points := (cts:point(0.373899653086420E+02, -0.122078578406509E+03),
cts:point(0.373765400000000E+02, -0.122063772000000E+03),
cts:point(0.373781400000000E+02, -0.122067972000000E+03),
cts:point(0.373825650000000E+02, -0.122068365000000E+03),
cts:point(0.373797400000000E+02, -0.122072172000000E+03),
cts:point(0.373899400000000E+02, -0.122092573000000E+03) )
return
cts:linestring($points)
|
|
|
|
cts:near-query(
|
|
$queries as cts:query*,
|
|
[$distance as xs:double?],
|
|
[$options as xs:string*],
|
|
[$distance-weight as xs:double?]
|
| ) as cts:near-query |
|
 |
Summary:
Returns a query matching all of the specified queries, where
the matches occur within the specified distance from each other.
|
Parameters:
$queries
:
A sequence of queries to match.
|
$distance
(optional):
A distance, in number of words, between any two matching queries.
The results match if two queries match and the distance between the
two matches is equal to or less than the specified distance. A
distance of 0 matches when the text is the exact same text or when
there is overlapping text (see the third example below). A negative
distance is treated as 0. The default value is 10.
|
$options
(optional):
Options to this query. The default value is ().
Options include:
- "ordered"
- Any near-query matches must occur in the order of
the specified sub-queries.
- "unordered"
- Any near-query matches will satisfy the query,
regardless of the order they were specified.
|
$distance-weight
(optional):
A weight attributed to the distance for this query. Higher
weights add to the importance of distance (as opposed to term matches)
when the relevance order is calculated. The default value is 1.0. The
absolute value of the weight should be less than or equal to 16 (that is,
between -16 and 16); weights greater than 16 have the same effect as a
weight of 16.
Weights with an absolute value less than 0.0625 (between -0.0625 and
0.0625) are rounded to 0, which means that they do not contribute to the
score. This parameter has no effect if the word positions
index is not enabled.
|
|
Usage Notes:
If the options parameter contains neither "ordered" nor "unordered",
then the default is "unordered".
The word positions index will speed the performance of
queries that use cts:near-query. The element word
positions index will speed the performance of element-queries
that use cts:near-query.
If you use cts:near-query with a field, the distance
specified is the distance in the whole document, not the distance
in the field. For example, if the distance between two words is 20 in
the document, but the distance is 10 if you look at a view of the document
that only includes the elements in a field, a cts:near-query
must have a distance of 20 or more to match; a distance of 10 would not
match.
If you use cts:near-query with
cts:field-word-query, the distance supplied in the near query
applies to the whole document, not just to the field. For details, see
cts:field-word-query.
Expressions using the ordered option are more efficient
than those using the unordered option, especially if they
specify many queries to match.
|
Example:
The following query searches for paragraphs containing
both "MarkLogic" and "Server" within 3 words of each
other, given the following paragraphs in a database:
<p>MarkLogic Server is an enterprise-class
database specifically built for content.</p>
<p>MarkLogic is an excellent XML Content Server.</p>
cts:search(//p,
cts:near-query(
(cts:word-query("MarkLogic"),
cts:word-query("Server")),
3))
=>
<p>MarkLogic Server is an enterprise-class
database specifically built for content.</p>
|
Example:
let $x := <p>Now is the winter of our discontent</p>
return
cts:contains($x, cts:near-query(
("discontent", "winter"),
3, "ordered"))
=> false because "discontent" comes after "winter"
let $x := <p>Now is the winter of our discontent</p>
return
cts:contains($x, cts:near-query(
("discontent", "winter"),
3, "unordered"))
=> true because the query specifies "unordered",
and it is still a match even though
"discontent" comes after "winter"
|
Example:
let $x := <p>Now is the winter of our discontent</p>
return
cts:contains($x, cts:near-query(
("is the winter", "winter of"),
0))
=> true because the phrases overlap
let $x := <p>Now is the winter of our discontent</p>
return
cts:contains($x, cts:near-query(
("is the winter", "of our"),
0))
=> false because the phrases do not overlap
(they have 1 word distance, not 0)
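|
Example:
The following is a minimal sketch of the optional $distance-weight
parameter. It assumes the paragraphs from the first example and an enabled
word positions index; the weight of 8.0 is an arbitrary illustrative value,
not a recommendation.

```xquery
(: Rank paragraphs where "MarkLogic" and "Server" occur within 3 words
   of each other, giving proximity extra influence on the score.
   The distance-weight of 8.0 is illustrative only. :)
for $p in cts:search(//p,
            cts:near-query(
              (cts:word-query("MarkLogic"),
               cts:word-query("Server")),
              3, "unordered", 8.0))
return <hit score="{cts:score($p)}">{fn:string($p)}</hit>
```

Reporting cts:score alongside each hit makes it easier to compare the
effect of different weights on the same data.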
|
|
|
|
cts:not-query(
|
|
$query as cts:query
|
| ) as cts:not-query |
|
 |
Summary:
Returns a query specifying the matches not specified by its sub-query.
|
Parameters:
$query
:
A negative query, specifying the search results
to filter out.
|
|
Usage Notes:
The cts:not-query constructor is fragment-based, so
it returns true only if the specified query does not produce a match
anywhere in a fragment. Therefore, a search using
cts:not-query is only guaranteed to be accurate if the underlying
query that is being negated is accurate from its index resolution (that is,
if the unfiltered results of the $query parameter to
cts:not-query are accurate). The accuracy of the index
resolution depends on many factors, such as the query, whether you search
at a fragment root (that is, whether the first parameter of
cts:search specifies an XPath that resolves to a fragment root),
the index options enabled on the database, the search options,
and other factors.
In cases where the $query parameter has false-positive matches,
the negation of the query can miss matches (have false negative matches).
In these cases,
searches with cts:not-query can miss results, even if those
searches are filtered.
|
Example:
cts:search(//PLAY,
cts:not-query(
cts:word-query("summer")))
=> ... sequence of 'PLAY' elements not containing
any text node with the word 'summer'.
|
Example:
let $doc :=
<doc>
<p n="1">Dogs, cats, and pigs</p>
<p n="2">Trees, frogs, and cats</p>
<p n="3">Dogs, alligators, and wolves</p>
</doc>
return
$doc//p[cts:contains(., cts:not-query("cat"))]
(: Returns the third p element (the one without
a "cat" term). Note that the
cts:contains forces the constraint to happen
in the filtering stage of the query. :)
|
|
|
|
cts:polygon(
|
|
$vertices as cts:point*
|
| ) as cts:polygon |
|
 |
Summary:
Returns a geospatial polygon value.
|
Parameters:
$vertices
:
The vertices of the polygon, given in order. No edge may cover
more than 180 degrees of either latitude or longitude.
The polygon as a whole may not encompass both
poles. These constraints are necessary to ensure an unambiguous
interpretation of the polygon. There must be at least three vertices.
The first vertex should be identical to the last vertex to close the
polygon.
|
|
Example:
(: this polygon approximates the 94041 zip code :)
let $points := (cts:point(0.373899653086420E+02, -0.122078578406509E+03),
cts:point(0.373765400000000E+02, -0.122063772000000E+03),
cts:point(0.373781400000000E+02, -0.122067972000000E+03),
cts:point(0.373825650000000E+02, -0.122068365000000E+03),
cts:point(0.373797400000000E+02, -0.122072172000000E+03),
cts:point(0.373899400000000E+02, -0.122092573000000E+03),
cts:point(0.373941400000000E+02, -0.122095573000000E+03),
cts:point(0.373966400000000E+02, -0.122094173000000E+03),
cts:point(0.373958400000000E+02, -0.122092373000000E+03),
cts:point(0.374004400000000E+02, -0.122091273000000E+03),
cts:point(0.374004400000000E+02, -0.122091273000000E+03),
cts:point(0.373873400000000E+02, -0.122057872000000E+03),
cts:point(0.373873400000000E+02, -0.122057872000000E+03),
cts:point(0.373854400000000E+02, -0.122052672000000E+03),
cts:point(0.373833400000000E+02, -0.122053372000000E+03),
cts:point(0.373819400000000E+02, -0.122057572000000E+03),
cts:point(0.373775400000000E+02, -0.122060872000000E+03),
cts:point(0.373765400000000E+02, -0.122063772000000E+03) )
return
cts:polygon($points)
|
|
|
|
cts:polygon-contains(
|
|
$polygon as cts:polygon,
|
|
$region as cts:region*,
|
|
[$options as xs:string*]
|
| ) as xs:boolean |
|
 |
Summary:
Returns true if the polygon contains a region.
|
Parameters:
$polygon
:
A geographic polygon.
|
$region
:
One or more geographic regions (boxes, circles, polygons, or points).
Where multiple regions are specified, the function returns true if the
polygon contains any of them.
|
$options
(optional):
Options for the operation. The default is ().
Options include:
- "coordinate-system=wgs84"
- Use the WGS84 coordinate system.
- "units=miles"
- Distance (for circles) is measured in miles.
- "boundaries-included"
- Points on boxes', circles', and polygons' boundaries are counted as matching. This is the default.
- "boundaries-excluded"
- Points on boxes', circles', and polygons' boundaries are not counted as matching.
- "boundaries-latitude-excluded"
- Points on boxes' latitude boundaries are not counted as matching.
- "boundaries-longitude-excluded"
- Points on boxes' longitude boundaries are not counted as matching.
- "boundaries-south-excluded"
- Points on the boxes' southern boundaries are not counted as matching.
- "boundaries-west-excluded"
- Points on the boxes' western boundaries are not counted as matching.
- "boundaries-north-excluded"
- Points on the boxes' northern boundaries are not counted as matching.
- "boundaries-east-excluded"
- Points on the boxes' eastern boundaries are not counted as matching.
- "boundaries-circle-excluded"
- Points on circles' boundary are not counted as matching.
|
|
Example:
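A minimal sketch of a containment test. The square polygon and the test
point are made-up coordinates chosen only for illustration; points are
given as (latitude, longitude).

```xquery
(: A hypothetical square polygon; the first and last vertices are
   identical so that the polygon is closed. :)
let $polygon := cts:polygon((
  cts:point(37.0, -123.0),
  cts:point(37.0, -122.0),
  cts:point(38.0, -122.0),
  cts:point(38.0, -123.0),
  cts:point(37.0, -123.0)))
(: A hypothetical point that lies inside the square. :)
let $point := cts:point(37.5, -122.5)
return
  cts:polygon-contains($polygon, $point, "boundaries-included")
```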
|
|
|
|
cts:polygon-intersects(
|
|
$polygon as cts:polygon,
|
|
$region as cts:region*,
|
|
[$options as xs:string*]
|
| ) as xs:boolean |
|
 |
Summary:
Returns true if the polygon intersects with a region.
|
Parameters:
$polygon
:
A geographic polygon.
|
$region
:
One or more geographic regions (boxes, circles, polygons, or points).
Where multiple regions are specified, return true if any region intersects
the target polygon.
|
$options
(optional):
Options for the operation. The default is ().
Options include:
- "coordinate-system=wgs84"
- Use the WGS84 coordinate system.
- "units=miles"
- Distance (for circles) is measured in miles.
- "boundaries-included"
- Points on boxes', circles', and polygons' boundaries are counted as matching. This is the default.
- "boundaries-excluded"
- Points on boxes', circles', and polygons' boundaries are not counted as matching.
- "boundaries-latitude-excluded"
- Points on boxes' latitude boundaries are not counted as matching.
- "boundaries-longitude-excluded"
- Points on boxes' longitude boundaries are not counted as matching.
- "boundaries-south-excluded"
- Points on the boxes' southern boundaries are not counted as matching.
- "boundaries-west-excluded"
- Points on the boxes' western boundaries are not counted as matching.
- "boundaries-north-excluded"
- Points on the boxes' northern boundaries are not counted as matching.
- "boundaries-east-excluded"
- Points on the boxes' eastern boundaries are not counted as matching.
- "boundaries-circle-excluded"
- Points on circles' boundary are not counted as matching.
|
|
Example:
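A minimal sketch of an intersection test. The square polygon and the box
are made-up coordinates chosen only for illustration; the box is assumed
to be given as (south, west, north, east).

```xquery
(: A hypothetical square polygon; the first and last vertices are
   identical so that the polygon is closed. :)
let $polygon := cts:polygon((
  cts:point(37.0, -123.0),
  cts:point(37.0, -122.0),
  cts:point(38.0, -122.0),
  cts:point(38.0, -123.0),
  cts:point(37.0, -123.0)))
(: A hypothetical box that overlaps the north-east corner of the square. :)
let $box := cts:box(37.5, -122.5, 38.5, -121.5)
return
  cts:polygon-intersects($polygon, $box)
```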
|
|
|
|
cts:polygon-vertices(
|
|
$polygon as cts:polygon
|
| ) as cts:point* |
|
 |
Summary:
Returns a polygon's vertices.
The first vertex and last vertex will always be the same.
|
Parameters:
$polygon
:
The polygon.
|
|
Example:
let $node :=
<polygon zip="94041">
0.373899653086420E+02, -0.122078578406509E+03
0.373765400000000E+02, -0.122063772000000E+03
0.373781400000000E+02, -0.122067972000000E+03
0.373825650000000E+02, -0.122068365000000E+03
0.373797400000000E+02, -0.122072172000000E+03
0.373899400000000E+02, -0.122092573000000E+03
0.373941400000000E+02, -0.122095573000000E+03
0.373966400000000E+02, -0.122094173000000E+03
0.373958400000000E+02, -0.122092373000000E+03
0.374004400000000E+02, -0.122091273000000E+03
0.374004400000000E+02, -0.122091273000000E+03
0.373873400000000E+02, -0.122057872000000E+03
0.373873400000000E+02, -0.122057872000000E+03
0.373854400000000E+02, -0.122052672000000E+03
0.373833400000000E+02, -0.122053372000000E+03
0.373819400000000E+02, -0.122057572000000E+03
0.373775400000000E+02, -0.122060872000000E+03
0.373765400000000E+02, -0.122063772000000E+03
</polygon>
return
cts:polygon-vertices(cts:polygon(fn:data($node)))
|
|
|
|
cts:registered-query(
|
|
$ids as xs:unsignedLong*,
|
|
[$options as xs:string*],
|
|
[$weight as xs:double?]
|
| ) as cts:registered-query |
|
 |
Summary:
Returns a query matching fragments specified by previously registered
queries (see cts:register). If a
registered query with the specified ID(s) is not found, then
a cts:search operation with an invalid
cts:registered-query throws an XDMP-UNREGISTERED
exception.
|
Parameters:
$ids
:
Some registered query identifiers.
|
$options
(optional):
Options to this query. The default is ().
Options include:
- "filtered"
- A filtered query (the default). Filtered queries
eliminate any false-positive results and properly resolve
cases where there are multiple candidate matches within the same
fragment, thereby guaranteeing
that the results fully satisfy the original
cts:query
item that was registered. This option is not available in
the 4.0 release.
- "unfiltered"
- An unfiltered query. Unfiltered registered queries
select fragments from the indexes that are candidates to satisfy
the
cts:query.
Depending on the original cts:query, the
structure of the documents in the database, and the configuration
of the database,
unfiltered registered queries may result in false-positive results
or in incorrect matches when there are multiple candidate matches
within the same fragment.
To avoid these problems, you should only use unfiltered queries
on top-level XPath expressions (for example, document nodes,
collections, directories) or on fragment roots. Using unfiltered
queries on complex XPath expressions or on XPath expressions that
traverse below a fragment root can result in unexpected results.
This option is required in the 4.0 release.
|
$weight
(optional):
A weight for this query.
Higher weights move search results up in the relevance
order. The default is 1.0. The
absolute value of the weight should be less than or equal to 16 (that is,
between -16 and 16); weights greater than 16 have the same effect as a
weight of 16.
Weights with an absolute value less than 0.0625 (between -0.0625 and
0.0625) are rounded to 0, which means that they do not contribute to the
score.
|
|
Usage Notes:
If the options parameter does not contain "unfiltered",
then an error is returned, as the "unfiltered" option is required.
Registered queries are persisted as a soft state only; they can
become unregistered through an explicit direction (using
cts:deregister),
as a result of the cache growing too large, or because of a server restart.
Consequently, either your XQuery code or your middleware layer should handle
the case when an XDMP-UNREGISTERED exception occurs (for example, you can
wrap your cts:registered-query code in a try/catch block
or your Java or .NET code can catch and handle the exception).
|
Example:
cts:search(//function,
cts:registered-query(1234567890123456,"unfiltered"))
=> .. relevance-ordered sequence of 'function' elements
in any document that also matches the registered query
|
Example:
(: wrap the registered query in a try/catch :)
try {
cts:search(fn:doc(),cts:registered-query(995175721241192518,"unfiltered"))
} catch ($e) {
if ($e/err:code = "XDMP-UNREGISTERED")
then ("Retry this query with the following registered query ID: ",
cts:register(cts:word-query("hello*world","wildcarded")))
else $e
}
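|
Example:
As a sketch of a common pattern, the following registers a query and uses
the returned ID in the same expression, so that the ID does not need to be
hard-coded. The wildcarded word query is only an illustration; any
cts:query could be registered instead.

```xquery
(: cts:register returns the ID of the registered query. :)
let $id := cts:register(cts:word-query("hello*world", "wildcarded"))
return
  cts:search(fn:doc(), cts:registered-query($id, "unfiltered"))
```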
|
|
|
|
cts:remainder(
|
|
[$node as node()]
|
| ) as xs:integer |
|
 |
Summary:
Returns an estimated search result size for a node,
or of the context node if no node is provided.
The search result size for a node is the number of fragments remaining
(including the current node) in the result sequence containing the node.
This is useful to quickly estimate the size of a search result sequence,
without using fn:count() or xdmp:estimate().
|
Parameters:
$node
(optional):
A node. Typically this is an item in the result sequence of a
cts:search operation. If you specify the first item
from a cts:search expression,
then cts:remainder will return an estimate of the number
of fragments that match that expression.
|
|
Usage Notes:
This function makes it efficient to estimate the size of a search result
and execute that search in the same query. If you only need an estimate of
the size of a search but do not need to run the search, then
xdmp:estimate is more efficient.
To return the estimated size of a search with cts:remainder,
use the first item of a cts:search result sequence as the
parameter to cts:remainder. For example, the following
query returns the estimated number of fragments that contain the word
"dog":
cts:remainder(cts:search(collection(), "dog")[1])
When you put the position predicate on the cts:search result
sequence, MarkLogic Server will filter all of the false-positive results
up to the specified position, but not the false-positive results beyond
the specified
position. Because of this, when you increase the position number in the
parameter, the result from cts:remainder might decrease
by a larger number than the increase in position number, or it might not
decrease at all. For example, if
the query above returned 10, then the following query might return 9, it
might return 10, or it might return less than 9, depending on how the
results are dispersed throughout different fragments:
cts:remainder(cts:search(collection(), "dog")[2])
If you run cts:remainder on a constructed node, it always
returns 0; it is primarily intended to run on nodes that are retrieved
from the database (an item from a cts:search result or an
item from the result of an XPath expression that searches through the
database).
|
Example:
let $x := cts:search(collection(), "dog")
return
(cts:remainder($x[1]), $x)
=> Returns the estimated number of items in the search
for "dog" followed by the results of the search.
|
Example:
xdmp:document-insert("/test.xml", <a>my test</a>);
for $x in cts:search(collection(),"my test")
return cts:remainder($x) => 1
|
Example:
for $a in cts:search(collection(),"my test")
where $a[cts:remainder() eq 1]
return xdmp:node-uri($a) => /test.xml
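|
Example:
The following sketch shows one way to combine the estimate with the first
page of results, as described in the usage notes. The page size of 10 and
the wording of the label are arbitrary choices for illustration.

```xquery
let $results := cts:search(collection(), "dog")
(: Guard against an empty result, since there is then no first node
   to pass to cts:remainder. :)
let $estimate :=
  if (fn:exists($results))
  then cts:remainder($results[1])
  else 0
return
  (fn:concat("About ", $estimate, " results"),
   $results[1 to 10])
```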
|
|
|
|
cts:search(
|
|
$expression as node()*,
|
|
$query as cts:query?,
|
|
[$options as xs:string*],
|
|
[$quality-weight as xs:double?],
|
|
[$forest-ids as xs:unsignedLong*]
|
| ) as node()* |
|
 |
Summary:
Returns a relevance-ordered sequence of nodes specified by a given query.
|
Parameters:
$expression
:
An expression to be searched.
This must be an inline fully searchable path expression.
|
$query
:
A cts:query specifying the search to perform. If a string
is entered, the string is treated as a cts:word-query of the
specified string.
|
$options
(optional):
Options to this search. The default is ().
Options include:
- "filtered"
A filtered search (the default). Filtered searches
eliminate any false-positive matches and properly resolve cases where
there are multiple candidate matches within the same fragment.
Filtered search results fully satisfy the specified
cts:query.
- "unfiltered"
An unfiltered search. An unfiltered search
selects fragments from the indexes that are candidates to satisfy
the specified cts:query, and then it returns
a single node from within each fragment that satisfies the specified
searchable path expression. Unfiltered searches are useful because
of the performance they afford when jumping deep into the
result set (for example, when paginating a long result set and
jumping to the 1,000,000th result). However, depending on the
searchable path expression, the
cts:query specified, the structure of the documents in
the database, and the configuration of the database, unfiltered
searches may yield false-positive results being included in the
search results. Unfiltered searches may also result in missed
matches or in incorrect matches, especially when there are
multiple candidate matches within a single fragment.
To avoid these problems, you should only use unfiltered searches
on top-level XPath expressions (for example, document nodes,
collections, directories) or on fragment roots. Using unfiltered
searches on complex XPath expressions or on XPath expressions that
traverse below a fragment root can result in unexpected results.
- "score-logtfidf"
Compute scores using the logtfidf method (the default scoring
method). This uses the formula:
log(term frequency) * (inverse document frequency)
- "score-logtf"
Compute scores using the logtf method. This does not take into
account how many documents have the term and uses the formula:
log(term frequency)
- "score-simple"
Compute scores using the simple method. The score-simple
method gives a score of 8*weight for each matching term in the
cts:query expression. It does not matter how
many times a given term matches (that is, the term
frequency does not matter); each match contributes 8*weight
to the score. For example, the following query (assume the
default weight of 1) would give a score of 8 for
any fragment with one or more matches for "hello", a score of 16
for any fragment that also has one or more matches for "goodbye",
or a score of zero for fragments that have no matches for
either term:
cts:or-query(("hello", "goodbye"))
- "score-random"
Compute scores using the random method. The score-random
method gives a random value to the score. You can use this
to randomly choose fragments matching a query.
- "checked"
Word positions are checked (the default) when resolving
the query. Checked searches eliminate false-positive matches for
phrases during the index resolution phase of search processing.
- "unchecked"
Word positions are not checked when resolving the
query. Unchecked searches do not take into account word positions
and can lead to false-positive matches during the index resolution
phase of search processing. This setting is useful
for debugging, but not recommended for normal use.
|
$quality-weight
(optional):
A document quality weight to use when computing scores.
The default is 1.0.
|
$forest-ids
(optional):
A sequence of IDs of forests to which the search will be constrained.
An empty sequence means to search all forests in the database.
The default is (). You can use cts:search with this
parameter and an empty cts:and-query to specify a
forest-specific XPath statement (see the third
example below). If you
use this to constrain an XPath to one or more forests, you should set
the quality-weight to zero to keep the XPath document
order.
|
|
Usage Notes:
Queries that use cts:search require that the XPath expression
searched is fully searchable. A fully searchable path is one that
has no steps that are unsearchable and whose last step is searchable.
You can use the
xdmp:query-trace() function to see if the path is fully
searchable. If there are no entries in the xdmp:query-trace()
output indicating that a step is unsearchable, and if the last step
is searchable, then that path is fully
searchable. Queries that use cts:search on unsearchable
XPath expressions will fail with an error message. You can often make
the path expressions fully searchable by rewriting the query or adding
new indexes.
Each node that cts:search returns has a score with which
it is associated. To access the score, use the cts:score
function. The nodes are returned in relevance order (most relevant to least
relevant), where more relevant nodes have a higher score.
Only one of the "filtered" or "unfiltered" options may be specified
in the options parameter. If neither "filtered" nor "unfiltered" is
specified, then the default is "filtered".
Only one of the "score-logtfidf", "score-logtf", "score-simple",
or "score-random" options may be specified in the options parameter.
If none of "score-logtfidf", "score-logtf", "score-simple", or
"score-random" are specified, then the default is "score-logtfidf".
Only one of the "checked" or "unchecked" options may be specified
in the options parameter. If neither "checked" nor "unchecked" is
specified, then the default is "checked".
If the cts:query specified is the empty string (equivalent
to cts:word-query("")), then the search returns the empty
sequence.
|
Example:
cts:search(//SPEECH,
cts:word-query("with flowers"))
=> ... a sequence of 'SPEECH' element ancestors (or self)
of any node containing the phrase 'with flowers'.
|
Example:
cts:search(collection("self-help")/book,
cts:element-query(xs:QName("title"), "meditation"),
"score-simple", 1.0, (xdmp:forest("prod"),xdmp:forest("preview")))
=> ... a sequence of book elements matching the XPath
expression which are members of the "self-help"
collection, reside in the "prod" or "preview" forests and
contain "meditation" in the title element, using the
"score-simple" option.
|
Example:
cts:search(/some/xpath, cts:and-query(()), (), 0.0,
xdmp:forest("myForest"))
=> ... a sequence of /some/xpath elements that are
in the forest named "myForest". Note the
empty and-query, which matches all documents (and
scores them all the same) and the quality-weight
of 0, which together make each result have a score
of 0, which keeps the results in document order.
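|
Example:
As noted in the usage notes, each node returned by cts:search has an
associated score that cts:score exposes. The following sketch reports the
score and document URI of each hit; the hit element and its score attribute
are hypothetical names used only for this illustration.

```xquery
(: Results arrive in relevance order, so the scores are non-increasing. :)
for $speech in cts:search(//SPEECH, cts:word-query("with flowers"))
return
  <hit score="{cts:score($speech)}">{xdmp:node-uri($speech)}</hit>
```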
|
|
|
|
cts:shortest-distance(
|
|
$p1 as cts:point,
|
|
$region as cts:region+,
|
|
[$options as xs:string*]
|
| ) as xs:double |
|
 |
Summary:
Returns the great circle distance (in miles) between a point and a
region. The region is defined by a cts:region.
|
Parameters:
$p1
:
The first point.
|
$region
:
A region such as a circle, box, polygon, linestring, or complex-polygon.
For compatibility with previous versions, a sequence of points
is interpreted as a sequence of arcs (defined pairwise) and the
distance returned is the shortest distance to one of those points.
If the first
parameter is a point within the region specified in this parameter,
then cts:shortest-distance returns 0. If the point
specified in the first parameter is not in the region specified in this
parameter, then cts:shortest-distance returns the
shortest distance to the boundary of the region.
|
$options
(optional):
Options for the operation. The default is ().
Options include:
- "coordinate-system=wgs84"
- Use the WGS84 coordinate system.
- "units=miles"
- Distance is measured in miles.
|
|
Example:
cts:shortest-distance(
cts:point(37.494965, -122.267654),
cts:linestring((cts:point(40.720921, -74.008878),
cts:point(38.950224, -77.019714)))
)
=> 2431.82739813132, which is the shortest distance (in miles)
between San Carlos, CA and an arc between New York City and
Washington DC.
|
|
|
|
cts:similar-query(
|
|
$nodes as node()*,
|
|
[$weight as xs:double?],
|
|
[$options as element()?]
|
| ) as cts:similar-query |
|
 |
Summary:
Returns a query matching nodes similar to the model nodes. It uses an
algorithm which finds the most "relevant" terms in the model nodes
(that is, the terms with the highest scores), and then creates a
query equivalent to a cts:or-query of those terms. By default
16 terms are used.
|
Parameters:
$nodes
:
Some model nodes.
|
$weight
(optional):
A weight for this query.
Higher weights move search results up in the relevance
order. The default is 1.0. The
absolute value of the weight should be less than or equal to 16 (that is,
between -16 and 16); weights greater than 16 have the same effect as a
weight of 16.
Weights with an absolute value less than 0.0625 (between -0.0625 and
0.0625) are rounded to 0, which means that they do not contribute to the
score.
|
$options
(optional):
An XML representation of the options for defining which terms to
generate and how to evaluate them.
The options node must be in the cts:distinctive-terms
namespace. The following is a sample options node:
<options xmlns="cts:distinctive-terms">
<max-terms>20</max-terms>
</options>
See the
cts:distinctive-terms
options for the valid options to use with this function.
Note that enabling index settings that
are disabled in the database configuration will not affect the results,
as similar documents will not be found on the basis of terms that do
not exist in the actual database index.
|
|
Usage Notes:
As the number of fragments in a database grows, the results
of cts:similar-query become increasingly accurate.
For best results, there should be at least 10,000 fragments for 32-bit
systems, and 1,000 fragments for 64-bit systems.
|
Example:
cts:search(//function,
cts:similar-query((//function)[1]))
=> .. relevance-ordered sequence of 'function' element
ancestors (or self) of any node similar to the first
'function' element.
|
Example:
xdmp:estimate(
cts:search(//function,
cts:similar-query((//function)[1], (),
<options xmlns="cts:distinctive-terms">
<max-terms>20</max-terms>
<use-db-config>true</use-db-config>
</options>)))
=> the number of fragments containing any node similar
to the first 'function' element.
|
|
|
|
cts:sum(
|
|
$arg as xs:anyAtomicType*,
|
|
[$zero as xs:anyAtomicType?]
|
| ) as xs:anyAtomicType? |
|
 |
Summary:
Returns a frequency-weighted sum of a sequence.
This function works like fn:sum except each item in the
sequence is multiplied by cts:frequency before summing.
|
Parameters:
$arg
:
The sequence of values to be summed. The values should be the result of
a lexicon lookup.
|
$zero
(optional):
The value to return as zero if the input sequence is the empty sequence.
|
|
Usage Notes:
The cts:frequency of the result is the sum of the
frequencies of the sequence.
This function is designed to take a sequence of values returned
by a lexicon function (for example, cts:element-values); if you
input non-lexicon values, the result will always be 0.
|
Example:
xquery version "1.0-ml";
(:
This query assumes an int range index
is configured in the database. It
generates some sample data and then
performs the aggregation in a separate
transaction.
:)
for $x in 1 to 10
return
xdmp:document-insert(fn:concat($x, ".xml"),
<my-element>{
for $y in 1 to $x
return <int>{$x}</int>
}</my-element>);
cts:sum(cts:element-values(xs:QName("int"), (),
("type=int", "item-frequency"))),
cts:sum(cts:element-values(xs:QName("int"), (),
("type=int", "fragment-frequency")))
=>
385
55
|
|
|
|
cts:tokenize(
|
|
$text as xs:string,
|
|
[$language as xs:string?]
|
| ) as cts:token* |
|
 |
Summary:
Tokenizes text into words, punctuation, and spaces. Returns output in
the type cts:token, which has subtypes
cts:word, cts:punctuation, and
cts:space, all of which are subtypes of
xs:string.
|
Parameters:
$text
:
A word or phrase to tokenize.
|
$language
(optional):
A language to use for tokenization. If not supplied, it uses the
database default language.
|
|
Usage Notes:
When you tokenize a string with cts:tokenize, each word is
represented by an instance of
cts:word, each punctuation character
is represented by an instance of cts:punctuation,
each set of adjacent spaces is represented by an instance of
cts:space, and each set of adjacent line breaks
is represented by an instance of cts:space.
Unlike the standard XQuery function fn:tokenize,
cts:tokenize returns words, punctuation, and spaces
as different types. You can therefore use a typeswitch to handle each type
differently. For example, you can use cts:tokenize to remove
all punctuation from a string, or create logic to test for the type and
return different things for different types, as shown in the first
two examples below.
You can use xdmp:describe to show how a given string will be
tokenized. When run on the results of cts:tokenize, the
xdmp:describe function returns the types and the values
for each token. For a sample of this pattern, see the third example below.
|
Example:
(: Remove all punctuation :)
let $string := "The red, blue, green, and orange
balloons were launched!"
let $noPunctuation :=
for $token in cts:tokenize($string)
return
typeswitch ($token)
case $token as cts:punctuation return ""
case $token as cts:word return $token
case $token as cts:space return $token
default return ()
return string-join($noPunctuation, "")
=> The red blue green and orange
balloons were launched
|
Example:
(: Insert the string "XX" before and after
all punctuation tokens :)
let $string := "The red, blue, green, and orange
balloons were launched!"
let $tokens := cts:tokenize($string)
return string-join(
for $x in $tokens
return if ($x instance of cts:punctuation)
then (concat("XX",
$x, "XX"))
else ($x) , "")
=> The redXX,XX blueXX,XX greenXX,XX and orange
balloons were launchedXX!XX
|
Example:
(: show the types and tokens for a string :)
xdmp:describe(cts:tokenize("blue, green"))
=> (cts:word("blue"), cts:punctuation(","),
cts:space(" "), cts:word("green"))
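|
Example:
Because each token carries its own type, a predicate can select one kind of
token directly. The following sketch keeps only the word tokens and passes
the optional $language parameter explicitly; the choice of "en" assumes
English text and is for illustration only.

```xquery
(: Keep only the word tokens, dropping punctuation and spaces. :)
let $string := "The red, blue, green, and orange balloons were launched!"
return
  cts:tokenize($string, "en")[. instance of cts:word]
```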
|
|
|
|
cts:uri-match(
|
|
$pattern as xs:string,
|
|
[$options as xs:string*],
|
|
[$query as cts:query?],
|
|
[$quality-weight as xs:double?],
|
|
[$forest-ids as xs:unsignedLong*]
|
| ) as xs:string* |
|
 |
Summary:
Returns values from the URI lexicon
that match the specified wildcard pattern.
This function requires the uri-lexicon database configuration
parameter to be enabled. If the uri-lexicon database-configuration
parameter is not enabled, an exception is thrown.
|
Parameters:
$pattern
:
Wildcard pattern to match.
|
$options
(optional):
Options. The default is ().
Options include:
- "case-sensitive"
- A case-sensitive match.
- "case-insensitive"
- A case-insensitive match.
- "diacritic-sensitive"
- A diacritic-sensitive match.
- "diacritic-insensitive"
- A diacritic-insensitive match.
- "ascending"
- URIs should be returned in ascending order.
- "descending"
- URIs should be returned in descending order.
- "any"
- URIs from any fragment should be included.
- "document"
- URIs from document fragments should be included.
- "properties"
- URIs from properties fragments should be included.
- "locks"
- URIs from locks fragments should be included.
- "frequency-order"
- URIs should be returned ordered by frequency.
- "item-order"
- URIs should be returned ordered by item.
- "limit=N"
- Return no more than N URIs.
- "skip=N"
- Skip over fragments selected by the
cts:query
to treat the Nth fragment as the first fragment.
URIs from skipped fragments are not included.
This option affects the number of fragments selected
by the cts:query to calculate frequencies.
Only applies when a $query parameter is specified.
- "sample=N"
- Return only URIs from the first N fragments after
skip selected by the
cts:query.
This option does not affect the number of fragments selected
by the cts:query to calculate frequencies.
Only applies when a $query parameter is specified.
- "truncate=N"
- Include only URIs from the first N fragments after
skip selected by the
cts:query.
This option also affects the number of fragments selected
by the cts:query to calculate frequencies.
Only applies when a $query parameter is specified.
- "score-logtfidf"
- Compute scores using the logtfidf method.
- "score-logtf"
- Compute scores using the logtf method.
- "score-simple"
- Compute scores using the simple method.
- "score-random"
- Compute scores using the random method.
- "checked"
- Word positions should be checked when resolving the query.
- "unchecked"
- Word positions should not be checked when resolving the query.
- "concurrent"
- Perform the work concurrently in another thread. This is a hint
to the query optimizer to help parallelize the lexicon work, allowing
the calling query to continue performing other work while the lexicon
processing occurs. This is especially useful in cases where multiple
lexicon calls occur in the same query (for example, resolving many
facets in a single query).
- "map"
- Return results as a single map:map value instead of as an
xs:string* sequence.
|
$query
(optional):
Only include URIs from fragments selected by the cts:query,
and compute frequencies from this set of included URIs.
The fragments are not filtered to ensure they match the query,
but instead selected in the same manner as
"unfiltered" cts:search operations. If a string
is entered, the string is treated as a cts:word-query of the
specified string.
|
$quality-weight
(optional):
A document quality weight to use when computing scores.
The default is 1.0.
|
$forest-ids
(optional):
A sequence of IDs of forests to which the search will be constrained.
An empty sequence means to search all forests in the database.
The default is ().
|
|
Usage Notes:
Only one of "frequency-order" or "item-order" may be specified
in the options parameter. If neither "frequency-order" nor "item-order"
is specified, then the default is "item-order".
Only one of "ascending" or "descending" may be specified
in the options parameter. If neither "ascending" nor "descending"
is specified, then the default is "ascending" if "item-order" is
specified, and "descending" if "frequency-order" is specified.
Only one of "any", "document", "properties", or "locks"
may be specified in the options parameter.
If none of "any", "document", "properties", or "locks" are specified
and there is a $query parameter, then the default is "document".
If there is no $query parameter then the default is "any".
Only one of the "score-logtfidf", "score-logtf", "score-simple",
or "score-random" options may be specified in the options parameter.
If none of "score-logtfidf", "score-logtf", "score-simple", or
"score-random" are specified, then the default is "score-logtfidf".
Only one of the "checked" or "unchecked" options may be specified
in the options parameter.
If neither "checked" nor "unchecked" are specified,
then the default is "checked".
If "sample=N" is not specified in the options parameter,
then all included URIs may be returned. If a $query parameter
is not present, then "sample=N" has no effect.
If "truncate=N" is not specified in the options parameter,
then URIs from all fragments selected by the $query parameter
are included. If a $query parameter is not present, then
"truncate=N" has no effect.
If neither "case-sensitive" nor "case-insensitive"
is present, $pattern is used to determine case sensitivity.
If $pattern contains no uppercase, it specifies "case-insensitive".
If $pattern contains uppercase, it specifies "case-sensitive".
If neither "diacritic-sensitive" nor "diacritic-insensitive"
is present, $pattern is used to determine diacritic sensitivity.
If $pattern contains no diacritics, it specifies "diacritic-insensitive".
If $pattern contains diacritics, it specifies "diacritic-sensitive".
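The case-sensitivity defaults described above can be sketched as follows. The URI patterns are hypothetical, and the calls assume the URI lexicon is enabled:

```xquery
xquery version "1.0-ml";
(: The URI patterns here are hypothetical. :)
(
  (: No uppercase in the pattern: treated as "case-insensitive". :)
  cts:uri-match("/reports/*.xml"),
  (: Uppercase in the pattern: treated as "case-sensitive". :)
  cts:uri-match("/Reports/*.xml"),
  (: An explicit option is used instead of the inference from the pattern. :)
  cts:uri-match("/Reports/*.xml", "case-insensitive")
)
```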
|
Example:
cts:uri-match("http://foo.com/*.html")
=> ("http://foo.com/bar.html", "http://foo.com/baz/bork.html", ...)
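A further sketch combines a pattern with options and a constraining query. The pattern and search term are hypothetical, and the call assumes the URI lexicon is enabled:

```xquery
xquery version "1.0-ml";
(: Hypothetical pattern and search term; requires the URI lexicon. :)
cts:uri-match(
  "/articles/*.xml",
  ("document", "descending", "limit=10"),
  cts:word-query("lexicon"))
```

This should return at most 10 matching URIs, in descending order, taken from document fragments that the query selects in the same unfiltered manner described for the $query parameter.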
|
|
|
|
cts:uris(
|
|
[$start as xs:string?],
|
|
[$options as xs:string*],
|
|
[$query as cts:query?],
|
|
[$quality-weight as xs:double?],
|
|
[$forest-ids as xs:unsignedLong*]
|
| ) as xs:string* |
|
 |
Summary:
Returns values from the URI lexicon.
This function requires the uri-lexicon database configuration
parameter to be enabled. If the uri-lexicon database-configuration
parameter is not enabled, an exception is thrown.
|
Parameters:
$start
(optional):
A starting value. Return only this value and following values. If
the empty string, return all values. If the parameter is not in
the lexicon, then it returns the values beginning with the next
value.
|
$options
(optional):
Options. The default is ().
Options include:
- "ascending"
- URIs should be returned in ascending order.
- "descending"
- URIs should be returned in descending order.
- "any"
- URIs from any fragment should be included.
- "document"
- URIs from document fragments should be included.
- "properties"
- URIs from properties fragments should be included.
- "locks"
- URIs from locks fragments should be included.
- "frequency-order"
- URIs should be returned ordered by frequency.
- "item-order"
- URIs should be returned ordered by item.
- "limit=N"
- Return no more than N URIs.
- "skip=N"
- Skip over fragments selected by the
cts:query
to treat the Nth fragment as the first fragment.
URIs from skipped fragments are not included.
This option affects the number of fragments selected
by the cts:query to calculate frequencies.
Only applies when a $query parameter is specified.
- "sample=N"
- Return only URIs from the first N fragments after
skip selected by the
cts:query.
This option does not affect the number of fragments selected
by the cts:query to calculate frequencies.
Only applies when a $query parameter is specified.
- "truncate=N"
- Include only URIs from the first N fragments after
skip selected by the
cts:query.
This option also affects the number of fragments selected
by the cts:query to calculate frequencies.
Only applies when a $query parameter is specified.
- "score-logtfidf"
- Compute scores using the logtfidf method.
- "score-logtf"
- Compute scores using the logtf method.
- "score-simple"
- Compute scores using the simple method.
- "score-random"
- Compute scores using the random method.
- "checked"
- Word positions should be checked when resolving the query.
- "unchecked"
- Word positions should not be checked when resolving the query.
- "concurrent"
- Perform the work concurrently in another thread. This is a hint
to the query optimizer to help parallelize the lexicon work, allowing
the calling query to continue performing other work while the lexicon
processing occurs. This is especially useful in cases where multiple
lexicon calls occur in the same query (for example, resolving many
facets in a single query).
- "map"
- Return results as a single map:map value instead of as an
xs:string* sequence.
|
$query
(optional):
Only include URIs from fragments selected by the cts:query,
and compute frequencies from this set of included URIs.
The fragments are not filtered to ensure they match the query,
but instead selected in the same manner as
"unfiltered" cts:search operations. If a string
is entered, the string is treated as a cts:word-query of the
specified string.
|
$quality-weight
(optional):
A document quality weight to use when computing scores.
The default is 1.0.
|
$forest-ids
(optional):
A sequence of IDs of forests to which the search will be constrained.
An empty sequence means to search all forests in the database.
The default is ().
|
|
Usage Notes:
Only one of "frequency-order" or "item-order" may be specified
in the options parameter. If neither "frequency-order" nor "item-order"
is specified, then the default is "item-order".
Only one of "ascending" or "descending" may be specified
in the options parameter. If neither "ascending" nor "descending"
is specified, then the default is "ascending" if "item-order" is
specified, and "descending" if "frequency-order" is specified.
Only one of "any", "document", "properties", or "locks"
may be specified in the options parameter.
If none of "any", "document", "properties", or "locks" are specified
and there is a $query parameter, then the default is "document".
If there is no $query parameter then the default is "any".
Only one of the "score-logtfidf", "score-logtf", "score-simple",
or "score-random" options may be specified in the options parameter.
If none of "score-logtfidf", "score-logtf", "score-simple", or
"score-random" are specified, then the default is "score-logtfidf".
Only one of the "checked" or "unchecked" options may be specified
in the options parameter.
If neither "checked" nor "unchecked" are specified,
then the default is "checked".
If "sample=N" is not specified in the options parameter,
then all included URIs may be returned. If a $query parameter
is not present, then "sample=N" has no effect.
If "truncate=N" is not specified in the options parameter,
then URIs from all fragments selected by the $query parameter
are included. If a $query parameter is not present, then
"truncate=N" has no effect.
|
Example:
cts:uris("http://foo.com/")
=> ("http://foo.com/", "http://foo.com/bar.html", ...)
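The options and $query parameters can be combined with a starting value. The sketch below uses a hypothetical starting URI and search term, and assumes the URI lexicon is enabled:

```xquery
xquery version "1.0-ml";
(: "/articles/" and the search term are hypothetical. :)
(: Up to five URIs, in item order, starting at "/articles/" or the next value. :)
let $first-page := cts:uris("/articles/", ("item-order", "limit=5"))
(: Up to five URIs of document fragments selected (unfiltered) by the query. :)
let $matching := cts:uris("", ("document", "limit=5"), cts:word-query("lexicon"))
return ($first-page, $matching)
```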
|
|
|
|
cts:walk(
|
|
$node as node(),
|
|
$query as cts:query,
|
|
$expr as item()*
|
| ) as item()* |
|
 |
Summary:
Walks a node, evaluating an expression with any text matching a query.
It returns a sequence of all the values returned by the expression
evaluations. This is similar to cts:highlight in how it
evaluates its expression, but it is different in what it returns.
|
Parameters:
$node
:
A node to walk. The node must be either a document node
or an element node; it cannot be a text node.
|
$query
:
A query specifying the text on which to evaluate the expression.
If a string is entered, the string is treated as a
cts:word-query of the specified string.
|
$expr
:
An expression to evaluate with matching text. You can use the
variables $cts:text, $cts:node,
$cts:queries, $cts:start, and
$cts:action (described below) in the expression.
|
|
Usage Notes:
There are five built-in variables to represent a query match.
These variables can be used inline in the expression parameter.
$cts:text as xs:string
The matched text.
$cts:node as text()
The node containing the matched text.
$cts:queries as cts:query*
The matching queries.
$cts:start as xs:integer
The string-length position of the first character of
$cts:text in $cts:node. Therefore, the following
always returns true:
fn:substring($cts:node, $cts:start,
fn:string-length($cts:text)) eq $cts:text
$cts:action as xs:string
Use xdmp:set on this to specify what should happen
next
- "continue"
- (default) Walk the next match.
If there are no more matches, return all evaluation results.
- "skip"
- Skip walking any more matches and return all evaluation results.
- "break"
- Stop walking matches and return all evaluation results.
You cannot use cts:walk to walk results matching
cts:similar-query and cts:element-attribute-*-query
items.
Because the expression can be any XQuery expression, it can be as
simple as the first example below or extremely complex.
|
Example:
(:
Return all text nodes containing matches to the query "the".
:)
let $x := <p>the quick brown fox <b>jumped</b> over the lazy dog's back</p>
return cts:walk($x, "the", $cts:node)
=>
(text{"the quick brown fox "}, text{" over the lazy dog's back"})
|
Example:
xquery version "1.0-ml";
(:
Do not show any more matches that occur after
$threshold characters.
:)
let $x := <p>This is 1, this is 2, this is 3, this is 4, this is 5.</p>
let $pos := 1
let $threshold := 20
return
cts:walk($x, "this is",
(if ( $pos gt $threshold )
then xdmp:set($cts:action, "break")
else ($cts:text, xdmp:set($pos, $cts:start)) ) )
=>
("This is", "this is", "this is")
|
Example:
xquery version "1.0-ml";
(:
Show the first two matches.
:)
let $x := <p>This is 1, this is 2, this is 3, this is 4, this is 5.</p>
let $match := 0
let $threshold := 2
return
cts:walk($x, "this is",
(if ( $match ge $threshold )
then xdmp:set($cts:action, "break")
else ($cts:text, xdmp:set($match, $match + 1)) ) )
=>
("This is", "this is")
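The relationship between $cts:start, $cts:text, and $cts:node stated in the Usage Notes can be checked directly. This is a minimal sketch that reuses the sample paragraph from the first example:

```xquery
xquery version "1.0-ml";
(:
  For every match, test that the matched text is found in its
  text node at the position reported by $cts:start.
:)
let $x := <p>the quick brown fox <b>jumped</b> over the lazy dog's back</p>
return
  cts:walk($x, "the",
    fn:substring($cts:node, $cts:start,
      fn:string-length($cts:text)) eq $cts:text)
```

According to the Usage Notes, each evaluation should yield true, one value per match.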
|
|
|
|
cts:word-match(
|
|
$pattern as xs:string,
|
|
[$options as xs:string*],
|
|
[$query as cts:query?],
|
|
[$quality-weight as xs:double?],
|
|
[$forest-ids as xs:unsignedLong*]
|
| ) as xs:string* |
|
 |
Summary:
Returns words from the word lexicon that match the wildcard pattern.
This function requires the word lexicon to be enabled. If the word
lexicon is not enabled, an exception is thrown.
|
Parameters:
$pattern
:
A wildcard pattern to match.
|
$options
(optional):
Options. The default is ().
Options include:
- "case-sensitive"
- A case-sensitive match.
- "case-insensitive"
- A case-insensitive match.
- "diacritic-sensitive"
- A diacritic-sensitive match.
- "diacritic-insensitive"
- A diacritic-insensitive match.
- "ascending"
- Words should be returned in ascending order.
- "descending"
- Words should be returned in descending order.
- "any"
- Words from any fragment should be included.
- "document"
- Words from document fragments should be included.
- "properties"
- Words from properties fragments should be included.
- "locks"
- Words from locks fragments should be included.
- "collation=URI"
- Use the lexicon with the collation specified by URI.
- "limit=N"
- Return no more than N words.
- "skip=N"
- Skip over fragments selected by the
cts:query
to treat the Nth fragment as the first fragment.
Words from skipped fragments are not included.
Only applies when a $query parameter is specified.
- "sample=N"
- Return only words from the first N fragments after skip
selected by the
cts:query.
Only applies when a $query parameter is specified.
- "truncate=N"
- Include only words from the first N fragments after skip
selected by the
cts:query.
Only applies when a $query parameter is specified.
- "score-logtfidf"
- Compute scores using the logtfidf method.
- "score-logtf"
- Compute scores using the logtf method.
- "score-simple"
- Compute scores using the simple method.
- "score-random"
- Compute scores using the random method.
- "checked"
- Word positions should be checked when resolving the query.
- "unchecked"
- Word positions should not be checked when resolving the query.
- "concurrent"
- Perform the work concurrently in another thread. This is a hint
to the query optimizer to help parallelize the lexicon work, allowing
the calling query to continue performing other work while the lexicon
processing occurs. This is especially useful in cases where multiple
lexicon calls occur in the same query (for example, resolving many
facets in a single query).
- "map"
- Return results as a single map:map value instead of as an
xs:string* sequence.
|
$query
(optional):
Only include words in fragments selected by the cts:query.
The words do not need to match the query, but the words must occur
in fragments selected by the query.
The fragments are not filtered to ensure they match the query,
but instead selected in the same manner as
"unfiltered" cts:search operations. If a string
is entered, the string is treated as a cts:word-query of the
specified string.
|
$quality-weight
(optional):
A document quality weight to use when computing scores.
The default is 1.0.
|
$forest-ids
(optional):
A sequence of IDs of forests to which the search will be constrained.
An empty sequence means to search all forests in the database.
The default is ().
|
|
Usage Notes:
Only one of "ascending" or "descending" may be specified
in the options parameter. If neither "ascending" nor "descending"
is specified, then the default is "ascending".
Only one of "any", "document", "properties", or "locks"
may be specified in the options parameter.
If none of "any", "document", "properties", or "locks" are specified
and there is a $query parameter, then the default is "document".
If there is no $query parameter then the default is "any".
Only one of the "score-logtfidf", "score-logtf", "score-simple",
or "score-random" options may be specified in the options parameter.
If none of "score-logtfidf", "score-logtf", "score-simple", or
"score-random" are specified, then the default is "score-logtfidf".
Only one of the "checked" or "unchecked" options may be specified
in the options parameter.
If neither "checked" nor "unchecked" are specified,
then the default is "checked".
If "collation=URI" is not specified in the options parameter,
then the default collation is used. If a lexicon with that collation
does not exist, an error is thrown.
If "sample=N" is not specified in the options parameter,
then all included words may be returned. If a $query parameter
is not present, then "sample=N" has no effect.
If "truncate=N" is not specified in the options parameter,
then words from all fragments selected by the $query parameter
are included. If a $query parameter is not present, then
"truncate=N" has no effect.
If neither "case-sensitive" nor "case-insensitive"
is present, $pattern is used to determine case sensitivity.
If $pattern contains no uppercase, it specifies "case-insensitive".
If $pattern contains uppercase, it specifies "case-sensitive".
If neither "diacritic-sensitive" nor "diacritic-insensitive"
is present, $pattern is used to determine diacritic sensitivity.
If $pattern contains no diacritics, it specifies "diacritic-insensitive".
If $pattern contains diacritics, it specifies "diacritic-sensitive".
|
Example:
cts:word-match("aardvark*")
=> ("aardvark","aardvarks")
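Options can narrow and order the matches. The following sketch uses a hypothetical pattern and assumes the word lexicon is enabled:

```xquery
xquery version "1.0-ml";
(: Up to five words matching the pattern, ignoring case, in descending order. :)
cts:word-match("aard*", ("case-insensitive", "descending", "limit=5"))
```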
|
|
|
|
cts:word-query(
|
|
$text as xs:string*,
|
|
[$options as xs:string*],
|
|
[$weight as xs:double?]
|
| ) as cts:word-query |
|
 |
Summary:
Returns a query matching text content containing a given phrase.
|
Parameters:
$text
:
Some words or phrases to match.
When multiple strings are specified,
the query matches if any string matches.
|
$options
(optional):
Options to this query. The default is ().
Options include:
- "case-sensitive"
- A case-sensitive query.
- "case-insensitive"
- A case-insensitive query.
- "diacritic-sensitive"
- A diacritic-sensitive query.
- "diacritic-insensitive"
- A diacritic-insensitive query.
- "punctuation-sensitive"
- A punctuation-sensitive query.
- "punctuation-insensitive"
- A punctuation-insensitive query.
- "whitespace-sensitive"
- A whitespace-sensitive query.
- "whitespace-insensitive"
- A whitespace-insensitive query.
- "stemmed"
- A stemmed query.
- "unstemmed"
- An unstemmed query.
- "wildcarded"
- A wildcarded query.
- "unwildcarded"
- An unwildcarded query.
- "exact"
- An exact match query. Shorthand for "case-sensitive",
"diacritic-sensitive", "punctuation-sensitive",
"whitespace-sensitive", "unstemmed", and "unwildcarded".
- "lang=iso639code"
- Specifies the language of the query. The iso639code
code portion is case-insensitive, and uses the languages
specified by
ISO 639.
The default is specified in the database configuration.
- "distance-weight=number"
- A weight applied based on the minimum distance between matches
of this query. Higher weights add to the importance of
proximity (as opposed to term matches) when the relevance order is
calculated.
The default value is 0.0 (no impact of proximity). The
weight should be between -16 and 16, inclusive; weights greater
than 16 will have the same effect as a weight of 16.
This parameter has no effect if the
word positions
index is not enabled. This parameter has no effect on searches that
use score-simple or score-random (because those scoring algorithms
do not consider term frequency, proximity is irrelevant).
- "min-occurs=number"
- Specifies the minimum number of occurrences required. If
fewer than this number of words occur, the fragment does not match.
The default is 1.
- "max-occurs=number"
- Specifies the maximum number of occurrences allowed. If
more than this number of words occur, the fragment does not match.
The default is unbounded.
- "synonym"
- Specifies that all of the terms in the $text parameter are
considered synonyms for scoring purposes. The result is that
occurrences of more than one of the synonyms are scored as if
there are more occurrences of the same term (as opposed to
having a separate term that contributes to score).
|
$weight
(optional):
A weight for this query.
Higher weights move search results up in the relevance
order. The default is 1.0. The
weight should be between -16 and 16, inclusive; weights greater
than 16 will have the same effect as a weight of 16.
Weights with an absolute value less than 0.0625 (between -0.0625 and
0.0625) are rounded to 0, which means that they do not contribute to the
score.
|
|
Usage Notes:
If neither "case-sensitive" nor "case-insensitive"
is present, $text is used to determine case sensitivity.
If $text contains no uppercase, it specifies "case-insensitive".
If $text contains uppercase, it specifies "case-sensitive".
If neither "diacritic-sensitive" nor "diacritic-insensitive"
is present, $text is used to determine diacritic sensitivity.
If $text contains no diacritics, it specifies "diacritic-insensitive".
If $text contains diacritics, it specifies "diacritic-sensitive".
If neither "punctuation-sensitive" nor "punctuation-insensitive"
is present, $text is used to determine punctuation sensitivity.
If $text contains no punctuation, it specifies "punctuation-insensitive".
If $text contains punctuation, it specifies "punctuation-sensitive".
If neither "whitespace-sensitive" nor "whitespace-insensitive"
is present, the query is "whitespace-insensitive".
If neither "wildcarded" nor "unwildcarded"
is present, the database configuration and $text determine wildcarding.
If the database has any wildcard indexes enabled ("three character
searches", "two character searches", "one character searches", or
"trailing wildcard searches") and if $text contains either of the
wildcard characters '?' or '*', it specifies "wildcarded".
Otherwise it specifies "unwildcarded".
If neither "stemmed" nor "unstemmed"
is present, then the database configuration determines if a query
is run as "stemmed" (stemmed searches enabled) or "unstemmed"
(word searches enabled and stemmed searches disabled).
If the query is a wildcard
query and is also a phrase query (contains two or more terms),
then any wildcard terms in the query will be "unstemmed".
Negative "min-occurs" or "max-occurs" values will be treated as 0 and
non-integral values will be rounded down. An error will be raised if
the "min-occurs" value is greater than the "max-occurs" value.
Relevance adjustment for the "distance-weight" option depends on
the closest proximity of any two matches of the query. For example,
cts:word-query(("dog","cat"),("distance-weight=10"))
will adjust relevance based on the distance between the closest pair of
matches of either "dog" or "cat" (the pair may consist only of matches of
"dog", only of matches of "cat", or a match of "dog" and a match of "cat").
|
Example:
cts:search(//function,
cts:word-query("MarkLogic Corporation"))
=> .. relevance-ordered sequence of 'function' element
ancestors (or self) of any node containing the phrase
'MarkLogic Corporation'.
|
Example:
cts:search(//function,
cts:word-query("MarkLogic Corporation",
"case-insensitive"))
=> .. relevance-ordered sequence of 'function'
element ancestors (or self) of any node containing
the phrase 'MarkLogic Corporation' or any other
case-shift like 'marklogic corporation',
'MARKLOGIC Corporation', etc.
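When no case option is given, the Usage Notes say that $text determines case sensitivity. The sketch below constructs three queries to illustrate that rule, using the phrase from the examples above:

```xquery
xquery version "1.0-ml";
(
  (: No uppercase in $text: case-insensitive by default. :)
  cts:word-query("marklogic corporation"),
  (: Uppercase in $text: case-sensitive by default. :)
  cts:word-query("MarkLogic Corporation"),
  (: An explicit option is used instead of the inference from $text. :)
  cts:word-query("MarkLogic Corporation", "case-insensitive")
)
```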
|
Example:
cts:search(//SPEECH,
cts:word-query("to be, or not to be",
"punctuation-insensitive"))
=> .. relevance-ordered sequence of 'SPEECH'
element ancestors (or self) of any node
containing the phrase 'to be, or not to be',
ignoring punctuation.
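The "min-occurs" option and the $weight parameter can be used together. This is a sketch only; the search term and weight are chosen for illustration:

```xquery
xquery version "1.0-ml";
(:
  Match SPEECH fragments in which "summer" occurs at least twice,
  ignoring case, and give this query a weight of 2.0.
:)
cts:search(//SPEECH,
  cts:word-query("summer",
    ("case-insensitive", "min-occurs=2"),
    2.0))
```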
|
|
|
|
cts:words(
|
|
[$start as xs:string?],
|
|
[$options as xs:string*],
|
|
[$query as cts:query?],
|
|
[$quality-weight as xs:double?],
|
|
[$forest-ids as xs:unsignedLong*]
|
| ) as xs:string* |
|
 |
Summary:
Returns words from the word lexicon. This function requires the word
lexicon to be enabled. If the word lexicon is not enabled, an
exception is thrown. The words are returned in collation order.
|
Parameters:
$start
(optional):
A starting word. Returns only this word and any following words
from the lexicon. If the parameter is not in the lexicon, then it
returns the words beginning with the next word.
|
$options
(optional):
Options. The default is ().
Options include:
- "ascending"
- Words should be returned in ascending order.
- "descending"
- Words should be returned in descending order.
- "any"
- Words from any fragment should be included.
- "document"
- Words from document fragments should be included.
- "properties"
- Words from properties fragments should be included.
- "locks"
- Words from locks fragments should be included.
- "collation=URI"
- Use the lexicon with the collation specified by URI.
- "limit=N"
- Return no more than N words.
- "skip=N"
- Skip over fragments selected by the
cts:query
to treat the Nth fragment as the first fragment.
Words from skipped fragments are not included.
Only applies when a $query parameter is specified.
- "sample=N"
- Return only words from the first N fragments after skip
selected by the
cts:query.
Only applies when a $query parameter is specified.
- "truncate=N"
- Include only words from the first N fragments after skip
selected by the
cts:query.
Only applies when a $query parameter is specified.
- "score-logtfidf"
- Compute scores using the logtfidf method.
- "score-logtf"
- Compute scores using the logtf method.
- "score-simple"
- Compute scores using the simple method.
- "score-random"
- Compute scores using the random method.
- "checked"
- Word positions should be checked when resolving the query.
- "unchecked"
- Word positions should not be checked when resolving the query.
- "concurrent"
- Perform the work concurrently in another thread. This is a hint
to the query optimizer to help parallelize the lexicon work, allowing
the calling query to continue performing other work while the lexicon
processing occurs. This is especially useful in cases where multiple
lexicon calls occur in the same query (for example, resolving many
facets in a single query).
- "map"
- Return results as a single map:map value instead of as an
xs:string* sequence.
|
$query
(optional):
Only include words in fragments selected by the cts:query.
The words do not need to match the query, but the words must occur
in fragments selected by the query.
The fragments are not filtered to ensure they match the query,
but instead selected in the same manner as
"unfiltered" cts:search operations. If a string
is entered, the string is treated as a cts:word-query of the
specified string.
|
$quality-weight
(optional):
A document quality weight to use when computing scores.
The default is 1.0.
|
$forest-ids
(optional):
A sequence of IDs of forests to which the search will be constrained.
An empty sequence means to search all forests in the database.
The default is ().
|
|
Usage Notes:
Only one of "ascending" or "descending" may be specified
in the options parameter. If neither "ascending" nor "descending"
is specified, then the default is "ascending".
Only one of "any", "document", "properties", or "locks"
may be specified in the options parameter.
If none of "any", "document", "properties", or "locks" are specified
and there is a $query parameter, then the default is "document".
If there is no $query parameter then the default is "any".
Only one of the "score-logtfidf", "score-logtf", "score-simple",
or "score-random" options may be specified in the options parameter.
If none of "score-logtfidf", "score-logtf", "score-simple", or
"score-random" are specified, then the default is "score-logtfidf".
Only one of the "checked" or "unchecked" options may be specified
in the options parameter.
If neither "checked" nor "unchecked" are specified,
then the default is "checked".
If "collation=URI" is not specified in the options parameter,
then the default collation is used. If a lexicon with that collation
does not exist, an error is thrown.
If "sample=N" is not specified in the options parameter,
then all included words may be returned. If a $query parameter
is not present, then "sample=N" has no effect.
If "truncate=N" is not specified in the options parameter,
then words from all fragments selected by the $query parameter
are included. If a $query parameter is not present, then
"truncate=N" has no effect.
|
Example:
cts:words("aardvark")
=> ("aardvark","aardvarks","aardwolf",...)
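A starting word can be combined with options and a constraining query. The sketch below uses a hypothetical starting word and search term, and assumes the word lexicon is enabled:

```xquery
xquery version "1.0-ml";
(:
  Up to three words, starting at "aard" (or at the next word if "aard"
  is not in the lexicon), taken from fragments selected by the query.
:)
cts:words("aard", ("limit=3"), cts:word-query("zoo"))
```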
|
|
|