ibm_watson.discovery_v2 module
IBM Watson™ Discovery for IBM Cloud Pak for Data is a cognitive search and content analytics engine that you can add to applications to identify patterns, trends and actionable insights to drive better decision-making. Securely unify structured and unstructured data with pre-enriched content, and use a simplified query language to eliminate the need for manual filtering of results.
-
class
DiscoveryV2
(version, authenticator=None)[source]¶ Bases:
ibm_cloud_sdk_core.base_service.BaseService
The Discovery V2 service.
-
default_service_url
= None¶
-
list_collections
(project_id, **kwargs)[source]¶ List collections.
Lists existing collections for the specified project.
- Parameters
- Returns
A DetailedResponse containing the result, headers and HTTP status code.
- Return type
DetailedResponse
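A minimal sketch of reading a list_collections result. It assumes `discovery` is an already-authenticated DiscoveryV2 instance and that the dict returned by `get_result()` is shaped like ListCollectionsResponse (a `collections` list whose items carry `collection_id` and `name`). The helper name `collection_index` is hypothetical and not part of the SDK.

```python
def collection_index(result):
    """Map collection_id -> name from a list_collections result dict."""
    index = {}
    for item in result.get("collections") or []:
        # Both attributes are documented as optional, so skip entries without an ID.
        collection_id = item.get("collection_id")
        if collection_id is not None:
            index[collection_id] = item.get("name")
    return index


# Usage, assuming `discovery` and `project_id` already exist:
# result = discovery.list_collections(project_id).get_result()
# print(collection_index(result))
```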
-
query
(project_id, *, collection_ids=None, filter=None, query=None, natural_language_query=None, aggregation=None, count=None, return_=None, offset=None, sort=None, highlight=None, spelling_suggestions=None, table_results=None, suggested_refinements=None, passages=None, **kwargs)[source]¶ Query a project.
By using this method, you can construct queries. For details, see the [Discovery documentation](https://cloud.ibm.com/docs/services/discovery-data?topic=discovery-data-query-concepts).
- Parameters
project_id (str) – The ID of the project. This information can be found from the deploy page of the Discovery administrative tooling.
collection_ids (list[str]) – (optional) A comma-separated list of collection IDs to be queried against.
filter (str) – (optional) A cacheable query that excludes documents that don’t mention the query content. Filter searches are better for metadata-type searches and for assessing the concepts in the data set.
query (str) – (optional) A query search returns all documents in your data set with full enrichments and full text, but with the most relevant documents listed first. Use a query search when you want to find the most relevant search results.
natural_language_query (str) – (optional) A natural language query that returns relevant documents by utilizing training data and natural language understanding.
aggregation (str) – (optional) An aggregation search that returns an exact answer by combining query search with filters. Useful for applications to build lists, tables, and time series. For a full list of possible aggregations, see the Query reference.
count (int) – (optional) Number of results to return.
return_ (list[str]) – (optional) A list of the fields in the document hierarchy to return. If this parameter is not specified, then all top-level fields are returned.
offset (int) – (optional) The number of query results to skip at the beginning. For example, if the total number of results that are returned is 10 and the offset is 8, it returns the last two results.
sort (str) – (optional) A comma-separated list of fields in the document to sort on. You can optionally specify a sort direction by prefixing the field with - for descending or + for ascending. Ascending is the default sort direction if no prefix is specified. This parameter cannot be used in the same query as the bias parameter.
highlight (bool) – (optional) When true, a highlight field is returned for each result which contains the fields which match the query with <em></em> tags around the matching query terms.
spelling_suggestions (bool) – (optional) When true and the natural_language_query parameter is used, the natural_language_query parameter is spell checked. The most likely correction is returned in the suggested_query field of the response (if one exists).
table_results (QueryLargeTableResults) – (optional) Configuration for table retrieval.
suggested_refinements (QueryLargeSuggestedRefinements) – (optional) Configuration for suggested refinements.
passages (QueryLargePassages) – (optional) Configuration for passage retrieval.
headers (dict) – A dict containing the request headers
- Returns
A DetailedResponse containing the result, headers and HTTP status code.
- Return type
DetailedResponse
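The count, offset, sort, and return_ parameters described above can be assembled ahead of the call. The sketch below is one possible way to do that; the helper names `build_sort` and `build_query_kwargs` and the field names in the usage comment are hypothetical, and `discovery` is assumed to be an authenticated DiscoveryV2 instance.

```python
def build_sort(fields):
    """Build the comma-separated `sort` string from (field, descending) pairs."""
    parts = []
    for name, descending in fields:
        # "-" requests descending order; no prefix means ascending (the default).
        parts.append(("-" if descending else "") + name)
    return ",".join(parts)


def build_query_kwargs(text, page=0, page_size=10, sort_fields=(), return_fields=None):
    """Assemble keyword arguments for DiscoveryV2.query, omitting unset options."""
    kwargs = {
        "natural_language_query": text,
        "count": page_size,
        "offset": page * page_size,  # number of results to skip at the beginning
    }
    if sort_fields:
        kwargs["sort"] = build_sort(sort_fields)
    if return_fields:
        kwargs["return_"] = list(return_fields)
    return kwargs


# Usage, assuming `discovery` and `project_id` already exist:
# kwargs = build_query_kwargs("apple harvest", page=1,
#                             sort_fields=[("publication_date", True)])
# result = discovery.query(project_id, **kwargs).get_result()
```

Keeping the argument assembly in a plain function makes the paging arithmetic easy to check without calling the service.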
-
get_autocompletion
(project_id, prefix, *, collection_ids=None, field=None, count=None, **kwargs)[source]¶ Get Autocomplete Suggestions.
Returns completion query suggestions for the specified prefix.
- Parameters
project_id (str) – The ID of the project. This information can be found from the deploy page of the Discovery administrative tooling.
prefix (str) – The prefix to use for autocompletion. For example, the prefix Ho could autocomplete to Hot, Housing, or How do I upgrade.
collection_ids (list[str]) – (optional) Comma-separated list of the collection IDs. If this parameter is not specified, all collections in the project are used.
field (str) – (optional) The field in the result documents that autocompletion suggestions are identified from.
count (int) – (optional) The number of autocompletion suggestions to return.
headers (dict) – A dict containing the request headers
- Returns
A DetailedResponse containing the result, headers and HTTP status code.
- Return type
DetailedResponse
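A short sketch of fetching suggestions for a prefix. It assumes `discovery` is an authenticated DiscoveryV2 instance and that the result dict follows the Completions shape (a `completions` list of strings); the wrapper name `fetch_completions` is hypothetical.

```python
def fetch_completions(discovery, project_id, prefix, count=5):
    """Return the suggestions for `prefix`, or an empty list when none are present."""
    response = discovery.get_autocompletion(project_id, prefix, count=count)
    result = response.get_result()
    # `completions` is documented as optional, so fall back to an empty list.
    return result.get("completions") or []


# Usage, assuming `discovery` and `project_id` already exist:
# for suggestion in fetch_completions(discovery, project_id, "Ho"):
#     print(suggestion)
```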
-
query_notices
(project_id, *, filter=None, query=None, natural_language_query=None, count=None, offset=None, **kwargs)[source]¶ Query system notices.
Queries for notices (errors or warnings) that might have been generated by the system. Notices are generated when ingesting documents and performing relevance training.
- Parameters
project_id (str) – The ID of the project. This information can be found from the deploy page of the Discovery administrative tooling.
filter (str) – (optional) A cacheable query that excludes documents that don’t mention the query content. Filter searches are better for metadata-type searches and for assessing the concepts in the data set.
query (str) – (optional) A query search returns all documents in your data set with full enrichments and full text, but with the most relevant documents listed first.
natural_language_query (str) – (optional) A natural language query that returns relevant documents by utilizing training data and natural language understanding.
count (int) – (optional) Number of results to return. The maximum for the count and offset values together in any one query is 10000.
offset (int) – (optional) The number of query results to skip at the beginning. For example, if the total number of results that are returned is 10 and the offset is 8, it returns the last two results. The maximum for the count and offset values together in any one query is 10000.
headers (dict) – A dict containing the request headers
- Returns
A DetailedResponse containing the result, headers and HTTP status code.
- Return type
DetailedResponse
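Because count and offset together may not exceed 10000 in any one query, a paging loop has to stop at that ceiling. The following sketch computes the (offset, count) pairs to request; the helper name `notice_pages` is hypothetical, and `page_size` is assumed to be a positive integer.

```python
MAX_WINDOW = 10000  # documented ceiling for count + offset in a single query


def notice_pages(total, page_size=100):
    """Yield (offset, count) pairs that cover `total` results within the ceiling."""
    reachable = min(total, MAX_WINDOW)
    offset = 0
    while offset < reachable:
        # Shrink the last page so offset + count never exceeds the ceiling.
        count = min(page_size, reachable - offset)
        yield offset, count
        offset += count


# Usage, assuming `discovery` and `project_id` already exist and
# `total` was read from the matching_results of a first query:
# for offset, count in notice_pages(total):
#     page = discovery.query_notices(project_id, count=count, offset=offset).get_result()
```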
-
list_fields
(project_id, *, collection_ids=None, **kwargs)[source]¶ List fields.
Gets a list of the unique fields (and their types) stored in the specified collections.
- Parameters
project_id (str) – The ID of the project. This information can be found from the deploy page of the Discovery administrative tooling.
collection_ids (list[str]) – (optional) Comma-separated list of the collection IDs. If this parameter is not specified, all collections in the project are used.
headers (dict) – A dict containing the request headers
- Returns
A DetailedResponse containing the result, headers and HTTP status code.
- Return type
DetailedResponse
-
get_component_settings
(project_id, **kwargs)[source]¶ Configuration settings for components.
Returns default configuration settings for components.
- Parameters
- Returns
A DetailedResponse containing the result, headers and HTTP status code.
- Return type
DetailedResponse
-
add_document
(project_id, collection_id, *, file=None, filename=None, file_content_type=None, metadata=None, x_watson_discovery_force=None, **kwargs)[source]¶ Add a document.
Add a document to a collection with optional metadata. Returns immediately after the system has accepted the document for processing.
- The user must provide document content, metadata, or both. If the request is missing both document content and metadata, it is rejected.
- The user can set the Content-Type parameter on the file part to indicate the media type of the document. If the Content-Type parameter is missing or is one of the generic media types (for example, application/octet-stream), then the service attempts to automatically detect the document’s media type.
- The following field names are reserved and are filtered out if present after normalization: id, score, highlight, and any field with the prefix of: _, +, or -
- Fields with empty name values after normalization are filtered out before indexing.
- Fields containing the following characters after normalization are filtered out before indexing: # and ,
If the document is uploaded to a collection that has its data shared with another collection, the X-Watson-Discovery-Force header must be set to true.
Note: Documents can be added with a specific document_id by using the /v2/projects/{project_id}/collections/{collection_id}/documents method.
Note: This operation only works on collections created to accept direct file uploads. It cannot be used to modify a collection that connects to an external source such as Microsoft SharePoint.
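The field-name filtering rules described for add_document can be approximated locally to predict which top-level fields will be dropped. This is only a sketch: the normalization step is not specified here, so the code assumes the names it receives are already normalized, and the helper names `is_filtered_out` and `surviving_fields` are hypothetical.

```python
RESERVED_NAMES = {"id", "score", "highlight"}
RESERVED_PREFIXES = ("_", "+", "-")
FORBIDDEN_CHARS = ("#", ",")


def is_filtered_out(name):
    """Return True if an (already normalized) field name would be filtered out."""
    if not name:
        return True  # empty names are dropped before indexing
    if name in RESERVED_NAMES:
        return True
    if name.startswith(RESERVED_PREFIXES):
        return True
    return any(char in name for char in FORBIDDEN_CHARS)


def surviving_fields(document):
    """Keep only the top-level fields that pass the documented name rules."""
    return {key: value for key, value in document.items() if not is_filtered_out(key)}
```

Running a check like this before upload can explain why a field present in the source document does not appear in query results.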
- Parameters
project_id (str) – The ID of the project. This information can be found from the deploy page of the Discovery administrative tooling.
collection_id (str) – The ID of the collection.
file (file) – (optional) The content of the document to ingest. The maximum supported file size when adding a file to a collection is 50 megabytes; the maximum supported file size when testing a configuration is 1 megabyte. Files larger than the supported size are rejected.
filename (str) – (optional) The filename for file.
file_content_type (str) – (optional) The content type of file.
metadata (str) – (optional) The maximum supported metadata file size is 1 MB. Metadata parts larger than 1 MB are rejected. Example: {"Creator": "Johnny Appleseed", "Subject": "Apples"}
x_watson_discovery_force (bool) – (optional) When true, the uploaded document is added to the collection even if the data for that collection is shared with other collections.
headers (dict) – A dict containing the request headers
- Returns
A DetailedResponse containing the result, headers and HTTP status code.
- Return type
DetailedResponse
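Since metadata is passed as a string and parts larger than 1 MB are rejected, it can help to serialize and size-check it before uploading. The sketch below assumes that "1 MB" means 1,000,000 bytes of UTF-8 text, which this page does not confirm; the helper name `encode_metadata` and the file name in the usage comment are hypothetical.

```python
import json

MAX_METADATA_BYTES = 1_000_000  # assumption: "1 MB" read as 10**6 bytes


def encode_metadata(metadata):
    """Serialize a metadata dict to JSON and reject it if it exceeds the size limit."""
    encoded = json.dumps(metadata)
    size = len(encoded.encode("utf-8"))
    if size > MAX_METADATA_BYTES:
        raise ValueError(f"metadata is {size} bytes; limit is {MAX_METADATA_BYTES}")
    return encoded


# Usage, assuming `discovery`, `project_id`, and `collection_id` already exist:
# with open("report.pdf", "rb") as handle:
#     response = discovery.add_document(
#         project_id, collection_id,
#         file=handle, filename="report.pdf",
#         file_content_type="application/pdf",
#         metadata=encode_metadata({"Creator": "Johnny Appleseed", "Subject": "Apples"}),
#     )
#     print(response.get_result())
```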
-
update_document
(project_id, collection_id, document_id, *, file=None, filename=None, file_content_type=None, metadata=None, x_watson_discovery_force=None, **kwargs)[source]¶ Update a document.
Replace an existing document or add a document with a specified document_id. Starts ingesting a document with optional metadata. If the document is uploaded to a collection that has its data shared with another collection, the X-Watson-Discovery-Force header must be set to true.
Note: When uploading a new document with this method, it automatically replaces any document stored with the same document_id if it exists.
Note: This operation only works on collections created to accept direct file uploads. It cannot be used to modify a collection that connects to an external source such as Microsoft SharePoint.
- Parameters
project_id (str) – The ID of the project. This information can be found from the deploy page of the Discovery administrative tooling.
collection_id (str) – The ID of the collection.
document_id (str) – The ID of the document.
file (file) – (optional) The content of the document to ingest. The maximum supported file size when adding a file to a collection is 50 megabytes; the maximum supported file size when testing a configuration is 1 megabyte. Files larger than the supported size are rejected.
filename (str) – (optional) The filename for file.
file_content_type (str) – (optional) The content type of file.
metadata (str) – (optional) The maximum supported metadata file size is 1 MB. Metadata parts larger than 1 MB are rejected. Example: {"Creator": "Johnny Appleseed", "Subject": "Apples"}
x_watson_discovery_force (bool) – (optional) When true, the uploaded document is added to the collection even if the data for that collection is shared with other collections.
headers (dict) – A dict containing the request headers
- Returns
A DetailedResponse containing the result, headers and HTTP status code.
- Return type
DetailedResponse
-
delete_document
(project_id, collection_id, document_id, *, x_watson_discovery_force=None, **kwargs)[source]¶ Delete a document.
If the given document ID is invalid, or if the document is not found, then a success response is returned (HTTP status code 200) with the status set to ‘deleted’.
Note: This operation only works on collections created to accept direct file uploads. It cannot be used to modify a collection that connects to an external source such as Microsoft SharePoint.
- Parameters
project_id (str) – The ID of the project. This information can be found from the deploy page of the Discovery administrative tooling.
collection_id (str) – The ID of the collection.
document_id (str) – The ID of the document.
x_watson_discovery_force (bool) – (optional) When true, the uploaded document is added to the collection even if the data for that collection is shared with other collections.
headers (dict) – A dict containing the request headers
- Returns
A DetailedResponse containing the result, headers and HTTP status code.
- Return type
DetailedResponse
-
list_training_queries
(project_id, **kwargs)[source]¶ List training queries.
List the training queries for the specified project.
- Parameters
- Returns
A DetailedResponse containing the result, headers and HTTP status code.
- Return type
DetailedResponse
-
delete_training_queries
(project_id, **kwargs)[source]¶ Delete training queries.
Removes all training queries for the specified project.
- Parameters
- Returns
A DetailedResponse containing the result, headers and HTTP status code.
- Return type
DetailedResponse
-
create_training_query
(project_id, natural_language_query, examples, *, filter=None, **kwargs)[source]¶ Create training query.
Add a query to the training data for this project. The query can contain a filter and natural language query.
- Parameters
project_id (str) – The ID of the project. This information can be found from the deploy page of the Discovery administrative tooling.
natural_language_query (str) – The natural text query for the training query.
examples (list[TrainingExample]) – Array of training examples.
filter (str) – (optional) The filter used on the collection before the natural_language_query is applied.
headers (dict) – A dict containing the request headers
- Returns
A DetailedResponse containing the result, headers and HTTP status code.
- Return type
DetailedResponse
-
get_training_query
(project_id, query_id, **kwargs)[source]¶ Get a training data query.
Get details for a specific training data query, including the query string and all examples.
- Parameters
- Returns
A DetailedResponse containing the result, headers and HTTP status code.
- Return type
DetailedResponse
-
update_training_query
(project_id, query_id, natural_language_query, examples, *, filter=None, **kwargs)[source]¶ Update a training query.
Updates an existing training query and its examples.
- Parameters
project_id (str) – The ID of the project. This information can be found from the deploy page of the Discovery administrative tooling.
query_id (str) – The ID of the query used for training.
natural_language_query (str) – The natural text query for the training query.
examples (list[TrainingExample]) – Array of training examples.
filter (str) – (optional) The filter used on the collection before the natural_language_query is applied.
headers (dict) – A dict containing the request headers
- Returns
A DetailedResponse containing the result, headers and HTTP status code.
- Return type
DetailedResponse
-
-
class
AddDocumentEnums
[source]¶ Bases:
object
-
class
FileContentType
[source]¶ Bases:
enum.Enum
The content type of file.
-
APPLICATION_JSON
= 'application/json'¶
-
APPLICATION_MSWORD
= 'application/msword'¶
-
APPLICATION_VND_OPENXMLFORMATS_OFFICEDOCUMENT_WORDPROCESSINGML_DOCUMENT
= 'application/vnd.openxmlformats-officedocument.wordprocessingml.document'¶
-
APPLICATION_PDF
= 'application/pdf'¶
-
TEXT_HTML
= 'text/html'¶
-
APPLICATION_XHTML_XML
= 'application/xhtml+xml'¶
-
-
class
UpdateDocumentEnums
[source]¶ Bases:
object
-
class
FileContentType
[source]¶ Bases:
enum.Enum
The content type of file.
-
APPLICATION_JSON
= 'application/json'¶
-
APPLICATION_MSWORD
= 'application/msword'¶
-
APPLICATION_VND_OPENXMLFORMATS_OFFICEDOCUMENT_WORDPROCESSINGML_DOCUMENT
= 'application/vnd.openxmlformats-officedocument.wordprocessingml.document'¶
-
APPLICATION_PDF
= 'application/pdf'¶
-
TEXT_HTML
= 'text/html'¶
-
APPLICATION_XHTML_XML
= 'application/xhtml+xml'¶
-
-
class
Collection
(*, collection_id=None, name=None)[source]¶ Bases:
object
A collection for storing documents.
- Attr str collection_id
(optional) The unique identifier of the collection.
- Attr str name
(optional) The name of the collection.
-
class
Completions
(*, completions=None)[source]¶ Bases:
object
An object containing an array of autocompletion suggestions.
- Attr list[str] completions
(optional) Array of autocomplete suggestions based on the provided prefix.
-
class
ComponentSettingsAggregation
(*, name=None, label=None, multiple_selections_allowed=None, visualization_type=None)[source]¶ Bases:
object
Display settings for aggregations.
- Attr str name
(optional) Identifier used to map aggregation settings to aggregation configuration.
- Attr str label
(optional) User-friendly alias for the aggregation.
- Attr bool multiple_selections_allowed
(optional) Whether users are allowed to select more than one of the aggregation terms.
- Attr str visualization_type
(optional) Type of visualization to use when rendering the aggregation.
-
class
ComponentSettingsFieldsShown
(*, body=None, title=None)[source]¶ Bases:
object
Fields shown in the results section of the UI.
- Attr ComponentSettingsFieldsShownBody body
(optional) Body label.
- Attr ComponentSettingsFieldsShownTitle title
(optional) Title label.
-
class
ComponentSettingsFieldsShownBody
(*, use_passage=None, field=None)[source]¶ Bases:
object
Body label.
- Attr bool use_passage
(optional) Use the whole passage as the body.
- Attr str field
(optional) Use a specific field as the body.
-
class
ComponentSettingsFieldsShownTitle
(*, field=None)[source]¶ Bases:
object
Title label.
- Attr str field
(optional) Use a specific field as the title.
-
class
ComponentSettingsResponse
(*, fields_shown=None, autocomplete=None, structured_search=None, results_per_page=None, aggregations=None)[source]¶ Bases:
object
A response containing the default component settings.
- Attr ComponentSettingsFieldsShown fields_shown
(optional) Fields shown in the results section of the UI.
- Attr bool autocomplete
(optional) Whether or not autocomplete is enabled.
- Attr bool structured_search
(optional) Whether or not structured search is enabled.
- Attr int results_per_page
(optional) Number of results shown per page.
- Attr list[ComponentSettingsAggregation] aggregations
(optional) A list of component setting aggregations.
-
class
DeleteDocumentResponse
(*, document_id=None, status=None)[source]¶ Bases:
object
Information returned when a document is deleted.
- Attr str document_id
(optional) The unique identifier of the document.
- Attr str status
(optional) Status of the document. A deleted document has the status deleted.
-
class
DocumentAccepted
(*, document_id=None, status=None)[source]¶ Bases:
object
Information returned after an uploaded document is accepted.
- Attr str document_id
(optional) The unique identifier of the ingested document.
- Attr str status
(optional) Status of the document in the ingestion process. A status of processing is returned for documents that are ingested with a version date before 2019-01-01. The pending status is returned for all others.
-
class
DocumentAttribute
(*, type=None, text=None, location=None)[source]¶ Bases:
object
List of document attributes.
- Attr str type
(optional) The type of attribute.
- Attr str text
(optional) The text associated with the attribute.
- Attr TableElementLocation location
(optional) The numeric location of the identified element in the document, represented with two integers labeled begin and end.
-
class
Field
(*, field=None, type=None, collection_id=None)[source]¶ Bases:
object
Object containing field details.
- Attr str field
(optional) The name of the field.
- Attr str type
(optional) The type of the field.
- Attr str collection_id
(optional) The collection ID of the collection where the field was found.
-
class
ListCollectionsResponse
(*, collections=None)[source]¶ Bases:
object
Response object containing an array of collection details.
- Attr list[Collection] collections
(optional) An array containing information about each collection in the project.
-
class
ListFieldsResponse
(*, fields=None)[source]¶ Bases:
object
The list of fetched fields. The fields are returned using a fully qualified name format; however, the format differs slightly from that used by the query operations.
Fields which contain nested objects are assigned a type of “nested”.
Fields which belong to a nested object are prefixed with .properties (for example, warnings.properties.severity means that the warnings object has a property called severity).
- Attr list[Field] fields
(optional) An array containing information about each field in the collections.
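The `.properties` naming used by list_fields can be converted back into a plain object path with a small helper. This sketch assumes that every `.properties.` segment marks a nesting level (so a nested property that is itself literally named `properties` would be misread); the helper name `nested_path` is hypothetical.

```python
def nested_path(list_fields_name):
    """Convert a list_fields name into object path segments, dropping `.properties` markers."""
    # Collapse each ".properties." marker into a plain "." separator, then split.
    return list_fields_name.replace(".properties.", ".").split(".")


# Usage, assuming `result` is the dict returned by
# discovery.list_fields(project_id).get_result():
# for entry in result.get("fields") or []:
#     print(entry.get("type"), nested_path(entry.get("field", "")))
```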
-
class
Notice
(*, notice_id=None, created=None, document_id=None, collection_id=None, query_id=None, severity=None, step=None, description=None)[source]¶ Bases:
object
A notice produced for the collection.
- Attr str notice_id
(optional) Identifies the notice. Many notices might have the same ID. This field exists so that user applications can programmatically identify a notice and take automatic corrective action. Typical notice IDs include: index_failed, index_failed_too_many_requests, index_failed_incompatible_field, index_failed_cluster_unavailable, ingestion_timeout, ingestion_error, bad_request, internal_error, missing_model, unsupported_model, smart_document_understanding_failed_incompatible_field, smart_document_understanding_failed_internal_error, smart_document_understanding_failed_warning, smart_document_understanding_page_error, smart_document_understanding_page_warning. Note: This is not a complete list; other values might be returned.
- Attr datetime created
(optional) The creation date of the collection in the format yyyy-MM-dd’T’HH:mm:ss.SSS’Z’.
- Attr str document_id
(optional) Unique identifier of the document.
- Attr str collection_id
(optional) Unique identifier of the collection.
- Attr str query_id
(optional) Unique identifier of the query used for relevance training.
- Attr str severity
(optional) Severity level of the notice.
- Attr str step
(optional) Ingestion or training step in which the notice occurred.
- Attr str description
(optional) The description of the notice.
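Because many notices can share the same notice_id, a summary by ID and severity is often the first step toward automated handling. The sketch below assumes a query_notices result dict shaped like QueryNoticesResponse; the helper name `summarize_notices` is hypothetical, and the severity strings in the usage comment are illustrative values only.

```python
from collections import Counter


def summarize_notices(result):
    """Count notices per (severity, notice_id) pair in a query_notices result dict."""
    counts = Counter()
    for notice in result.get("notices") or []:
        # Both attributes are optional, so missing values are counted under None.
        counts[(notice.get("severity"), notice.get("notice_id"))] += 1
    return counts


# Usage, assuming `discovery` and `project_id` already exist:
# summary = summarize_notices(discovery.query_notices(project_id).get_result())
# for (severity, notice_id), n in summary.most_common():
#     print(severity, notice_id, n)
```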
-
class
QueryAggregation
(type)[source]¶ Bases:
object
An abstract aggregation type produced by Discovery to analyze the input provided.
- Attr str type
The type of aggregation command used. Options include: term, histogram, timeslice, nested, filter, min, max, sum, average, unique_count, and top_hits.
-
class
QueryCalculationAggregation
(type, field, *, value=None)[source]¶ Bases:
object
Returns a scalar calculation across all documents for the field specified. Possible calculations include min, max, sum, average, and unique_count.
- Attr str field
The field to perform the calculation on.
- Attr float value
(optional) The value of the calculation.
-
class
QueryFilterAggregation
(type, match, matching_results, *, aggregations=None)[source]¶ Bases:
object
A modifier that will narrow down the document set of the sub aggregations it precedes.
- Attr str match
The filter written in Discovery Query Language syntax applied to the documents before sub aggregations are run.
- Attr int matching_results
Number of documents matching the filter.
- Attr list[QueryAggregation] aggregations
(optional) An array of sub aggregations.
-
class
QueryHistogramAggregation
(type, field, interval, *, results=None)[source]¶ Bases:
object
Numeric interval segments to categorize documents by using field values from a single numeric field to describe the category.
- Attr str field
The numeric field name used to create the histogram.
- Attr int interval
The size of the sections the results are split into.
- Attr list[QueryHistogramAggregationResult] results
(optional) Array of numeric intervals.
-
class
QueryHistogramAggregationResult
(key, matching_results, *, aggregations=None)[source]¶ Bases:
object
Histogram numeric interval result.
- Attr int key
The value of the upper bound for the numeric segment.
- Attr int matching_results
Number of documents with the specified key as the upper bound.
- Attr list[QueryAggregation] aggregations
(optional) An array of sub aggregations.
-
class
QueryLargePassages
(*, enabled=None, per_document=None, max_per_document=None, fields=None, count=None, characters=None)[source]¶ Bases:
object
Configuration for passage retrieval.
- Attr bool enabled
(optional) A passages query that returns the most relevant passages from the results.
- Attr bool per_document
(optional) When true, passages will be returned within their respective result.
- Attr int max_per_document
(optional) Maximum number of passages to return per result.
- Attr list[str] fields
(optional) A list of fields that passages are drawn from. If this parameter is not specified, then all top-level fields are included.
- Attr int count
(optional) The maximum number of passages to return. The search returns fewer passages if the requested total is not found. The default is 10. The maximum is 100.
- Attr int characters
(optional) The approximate number of characters that any one passage will have.
-
class
QueryLargeSuggestedRefinements
(*, enabled=None, count=None)[source]¶ Bases:
object
Configuration for suggested refinements.
- Attr bool enabled
(optional) Whether to perform suggested refinements.
- Attr int count
(optional) Maximum number of suggested refinement texts to be returned. The default is 10. The maximum is 100.
-
class
QueryLargeTableResults
(*, enabled=None, count=None)[source]¶ Bases:
object
Configuration for table retrieval.
- Attr bool enabled
(optional) Whether to enable table retrieval.
- Attr int count
(optional) Maximum number of tables to return.
-
class
QueryNestedAggregation
(type, path, matching_results, *, aggregations=None)[source]¶ Bases:
object
A restriction that alters the document set used for the sub aggregations it precedes to the nested documents found in the specified field.
- Attr str path
The path to the document field to scope sub aggregations to.
- Attr int matching_results
Number of nested documents found in the specified field.
- Attr list[QueryAggregation] aggregations
(optional) An array of sub aggregations.
-
class
QueryNoticesResponse
(*, matching_results=None, notices=None)[source]¶ Bases:
object
Object containing notice query results.
- Attr int matching_results
(optional) The number of matching results.
- Attr list[Notice] notices
(optional) Array of document results that match the query.
-
class
QueryResponse
(*, matching_results=None, results=None, aggregations=None, retrieval_details=None, suggested_query=None, suggested_refinements=None, table_results=None)[source]¶ Bases:
object
A response containing the documents and aggregations for the query.
- Attr int matching_results
(optional) The number of matching results for the query.
- Attr list[QueryResult] results
(optional) Array of document results for the query.
- Attr list[QueryAggregation] aggregations
(optional) Array of aggregations for the query.
- Attr RetrievalDetails retrieval_details
(optional) An object containing retrieval type information.
- Attr str suggested_query
(optional) Suggested correction to the submitted natural_language_query value.
- Attr list[QuerySuggestedRefinement] suggested_refinements
(optional) Array of suggested refinements.
- Attr list[QueryTableResult] table_results
(optional) Array of table results.
-
class
QueryResult
(document_id, result_metadata, *, metadata=None, document_passages=None, **kwargs)[source]¶ Bases:
object
Result document for the specified query.
- Attr str document_id
The unique identifier of the document.
- Attr dict metadata
(optional) Metadata of the document.
- Attr QueryResultMetadata result_metadata
Metadata of a query result.
- Attr list[QueryResultPassage] document_passages
(optional) Passages returned by Discovery.
-
class
QueryResultMetadata
(collection_id, *, document_retrieval_source=None, confidence=None)[source]¶ Bases:
object
Metadata of a query result.
- Attr str document_retrieval_source
(optional) The document retrieval source that produced this search result.
- Attr str collection_id
The collection ID associated with this training data set.
- Attr float confidence
(optional) The confidence score for the given result. Calculated based on how relevant the result is estimated to be. confidence can range from 0.0 to 1.0. The higher the number, the more relevant the document. The confidence value for a result was calculated using the model specified in the document_retrieval_strategy field of the result set. This field is only returned if the natural_language_query parameter is specified in the query.
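Since confidence ranges from 0.0 to 1.0 and is only returned when natural_language_query is used, client code usually has to tolerate its absence. A minimal sketch of filtering a query result by confidence follows; the helper name `confident_results` and the 0.5 default threshold are hypothetical choices, not SDK behavior.

```python
def confident_results(result, threshold=0.5):
    """Return the results whose confidence is present and at least `threshold`."""
    kept = []
    for document in result.get("results") or []:
        confidence = (document.get("result_metadata") or {}).get("confidence")
        # Results without a confidence value (no natural_language_query) are skipped.
        if confidence is not None and confidence >= threshold:
            kept.append(document)
    return kept
```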
-
class
QueryResultPassage
(*, passage_text=None, start_offset=None, end_offset=None, field=None)[source]¶ Bases:
object
A passage query result.
- Attr str passage_text
(optional) The content of the extracted passage.
- Attr int start_offset
(optional) The position of the first character of the extracted passage in the originating field.
- Attr int end_offset
(optional) The position of the last character of the extracted passage in the originating field.
- Attr str field
(optional) The label of the field from which the passage has been extracted.
-
class
QuerySuggestedRefinement
(*, text=None)[source]¶ Bases:
object
A suggested additional query term or terms used to filter results.
- Attr str text
(optional) The text used to filter.
-
class
QueryTableResult
(*, table_id=None, source_document_id=None, collection_id=None, table_html=None, table_html_offset=None, table=None)[source]¶ Bases:
object
A table whose content or context matches a search query.
- Attr str table_id
(optional) The identifier for the retrieved table.
- Attr str source_document_id
(optional) The identifier of the document the table was retrieved from.
- Attr str collection_id
(optional) The identifier of the collection the table was retrieved from.
- Attr str table_html
(optional) HTML snippet of the table info.
- Attr int table_html_offset
(optional) The offset of the table HTML snippet in the original document HTML.
- Attr TableResultTable table
(optional) Full table object retrieved from Table Understanding Enrichment.
-
class
QueryTermAggregation
(type, field, *, count=None, results=None)[source]¶ Bases:
object
Returns the top values for the field specified.
- Attr str field
The field in the document used to generate top values from.
- Attr int count
(optional) The number of top values returned.
- Attr list[QueryTermAggregationResult] results
(optional) Array of top values for the field.
-
class
QueryTermAggregationResult
(key, matching_results, *, aggregations=None)[source]¶ Bases:
object
Top value result for the term aggregation.
- Attr str key
Value of the field with a non-zero frequency in the document set.
- Attr int matching_results
Number of documents containing the ‘key’.
- Attr list[QueryAggregation] aggregations
(optional) An array of sub aggregations.
-
class
QueryTimesliceAggregation
(type, field, interval, *, results=None)[source]¶ Bases:
object
A specialized histogram aggregation that uses dates to create interval segments.
- Attr str field
The date field name used to create the timeslice.
- Attr str interval
The date interval value. Valid values are seconds, minutes, hours, days, weeks, and years.
- Attr list[QueryTimesliceAggregationResult] results
(optional) Array of aggregation results.
-
class
QueryTimesliceAggregationResult
(key_as_string, key, matching_results, *, aggregations=None)[source]¶ Bases:
object
A timeslice interval segment.
- Attr str key_as_string
String date value of the upper bound for the timeslice interval in ISO-8601 format.
- Attr int key
Numeric date value of the upper bound for the timeslice interval in UNIX milliseconds since epoch.
- Attr int matching_results
Number of documents with the specified key as the upper bound.
- Attr list[QueryAggregation] aggregations
(optional) An array of sub aggregations.
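Because key is expressed in milliseconds rather than seconds, it has to be divided by 1000 before it is handed to the standard library. The following sketch shows one way to do that; the helper name timeslice_key_to_datetime and the sample result are hypothetical, and the result is assumed to be a plain dictionary.

```python
from datetime import datetime, timezone


def timeslice_key_to_datetime(key_ms: int) -> datetime:
    """Convert a timeslice key (UNIX milliseconds since epoch) to an aware UTC datetime."""
    return datetime.fromtimestamp(key_ms / 1000, tz=timezone.utc)


# Hypothetical timeslice result; keys mirror the attributes documented above.
sample_result = {
    "key_as_string": "2020-01-01T00:00:00.000Z",
    "key": 1577836800000,
    "matching_results": 3,
}

upper_bound = timeslice_key_to_datetime(sample_result["key"])
print(upper_bound.isoformat())
```

Using an explicit UTC timezone avoids the result depending on the local timezone of the machine that runs the code.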
-
class
QueryTopHitsAggregation
(type, size, *, hits=None)[source]¶ Bases:
object
Returns the top documents ranked by the score of the query.
- Attr int size
The number of documents to return.
- Attr QueryTopHitsAggregationResult hits
(optional) A query response containing the matching documents for the preceding aggregations.
-
class
QueryTopHitsAggregationResult
(matching_results, *, hits=None)[source]¶ Bases:
object
A query response containing the matching documents for the preceding aggregations.
- Attr int matching_results
Number of matching results.
- Attr list[dict] hits
(optional) An array of the document results.
-
class
RetrievalDetails
(*, document_retrieval_strategy=None)[source]¶ Bases:
object
An object containing retrieval type information.
- Attr str document_retrieval_strategy
(optional) Identifies the document retrieval strategy used for this query. relevancy_training indicates that the results were returned using a relevancy trained model.
Note: If a trained collection is queried but the trained model is not used to return results, the document_retrieval_strategy is listed as untrained.
-
class
DocumentRetrievalStrategyEnum
[source]¶ Bases:
enum.Enum
Identifies the document retrieval strategy used for this query. relevancy_training indicates that the results were returned using a relevancy trained model.
Note: If a trained collection is queried but the trained model is not used to return results, the document_retrieval_strategy is listed as untrained.
-
UNTRAINED
= 'untrained'¶
-
RELEVANCY_TRAINING
= 'relevancy_training'¶
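A caller may want to know whether a trained model actually produced the results. The sketch below checks the strategy string against the two values listed above. It uses a local stand-in enum (RetrievalStrategy) rather than the SDK class so that it runs without the SDK installed; the stand-in, the helper name, and the sample dictionaries are assumptions for illustration.

```python
from enum import Enum


class RetrievalStrategy(Enum):
    """Local stand-in mirroring the two documented strategy values."""
    UNTRAINED = "untrained"
    RELEVANCY_TRAINING = "relevancy_training"


def used_trained_model(retrieval_details: dict) -> bool:
    """Return True only when the results came from a relevancy trained model."""
    strategy = retrieval_details.get("document_retrieval_strategy")
    return strategy == RetrievalStrategy.RELEVANCY_TRAINING.value


# Hypothetical retrieval details, as plain dictionaries.
print(used_trained_model({"document_retrieval_strategy": "relevancy_training"}))
print(used_trained_model({"document_retrieval_strategy": "untrained"}))
```

A missing document_retrieval_strategy is treated the same as untrained, since the attribute is optional.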
-
-
class
TableBodyCells
(*, cell_id=None, location=None, text=None, row_index_begin=None, row_index_end=None, column_index_begin=None, column_index_end=None, row_header_ids=None, row_header_texts=None, row_header_texts_normalized=None, column_header_ids=None, column_header_texts=None, column_header_texts_normalized=None, attributes=None)[source]¶ Bases:
object
Cells that are not table header, column header, or row header cells.
- Attr str cell_id
(optional) The unique ID of the cell in the current table.
- Attr TableElementLocation location
(optional) The numeric location of the identified element in the document, represented with two integers labeled begin and end.
- Attr str text
(optional) The textual contents of this cell from the input document without associated markup content.
- Attr int row_index_begin
(optional) The begin index of this cell’s row location in the current table.
- Attr int row_index_end
(optional) The end index of this cell’s row location in the current table.
- Attr int column_index_begin
(optional) The begin index of this cell’s column location in the current table.
- Attr int column_index_end
(optional) The end index of this cell’s column location in the current table.
- Attr list[TableRowHeaderIds] row_header_ids
(optional) A list of table row header ids.
- Attr list[TableRowHeaderTexts] row_header_texts
(optional) A list of table row header texts.
- Attr list[TableRowHeaderTextsNormalized] row_header_texts_normalized
(optional) A list of table row header texts normalized.
- Attr list[TableColumnHeaderIds] column_header_ids
(optional) A list of table column header ids.
- Attr list[TableColumnHeaderTexts] column_header_texts
(optional) A list of table column header texts.
- Attr list[TableColumnHeaderTextsNormalized] column_header_texts_normalized
(optional) A list of table column header texts normalized.
- Attr list[DocumentAttribute] attributes
(optional) A list of document attributes.
-
class
TableCellKey
(*, cell_id=None, location=None, text=None)[source]¶ Bases:
object
A key in a key-value pair.
- Attr str cell_id
(optional) The unique ID of the key in the table.
- Attr TableElementLocation location
(optional) The numeric location of the identified element in the document, represented with two integers labeled begin and end.
- Attr str text
(optional) The text content of the table cell without HTML markup.
-
class
TableCellValues
(*, cell_id=None, location=None, text=None)[source]¶ Bases:
object
A value in a key-value pair.
- Attr str cell_id
(optional) The unique ID of the value in the table.
- Attr TableElementLocation location
(optional) The numeric location of the identified element in the document, represented with two integers labeled begin and end.
- Attr str text
(optional) The text content of the table cell without HTML markup.
-
class
TableColumnHeaderIds
(*, id=None)[source]¶ Bases:
object
An array of values, each being the id value of a column header that is applicable to the current cell.
- Attr str id
(optional) The id value of a column header.
-
class
TableColumnHeaderTexts
(*, text=None)[source]¶ Bases:
object
An array of values, each being the text value of a column header that is applicable to the current cell.
- Attr str text
(optional) The text value of a column header.
-
class
TableColumnHeaderTextsNormalized
(*, text_normalized=None)[source]¶ Bases:
object
If you provide customization input, the normalized version of the column header texts according to the customization; otherwise, the same value as column_header_texts.
- Attr str text_normalized
(optional) The normalized version of a column header text.
-
class
TableColumnHeaders
(*, cell_id=None, location=None, text=None, text_normalized=None, row_index_begin=None, row_index_end=None, column_index_begin=None, column_index_end=None)[source]¶ Bases:
object
Column-level cells, each applicable as a header to other cells in the same column as itself, of the current table.
- Attr str cell_id
(optional) The unique ID of the cell in the current table.
- Attr object location
(optional) The location of the column header cell in the current table as defined by its begin and end offsets, respectively, in the input document.
- Attr str text
(optional) The textual contents of this cell from the input document without associated markup content.
- Attr str text_normalized
(optional) If you provide customization input, the normalized version of the cell text according to the customization; otherwise, the same value as text.
- Attr int row_index_begin
(optional) The begin index of this cell’s row location in the current table.
- Attr int row_index_end
(optional) The end index of this cell’s row location in the current table.
- Attr int column_index_begin
(optional) The begin index of this cell’s column location in the current table.
- Attr int column_index_end
(optional) The end index of this cell’s column location in the current table.
-
class
TableElementLocation
(begin, end)[source]¶ Bases:
object
The numeric location of the identified element in the document, represented with two integers labeled begin and end.
- Attr int begin
The element’s begin index.
- Attr int end
The element’s end index.
-
class
TableHeaders
(*, cell_id=None, location=None, text=None, row_index_begin=None, row_index_end=None, column_index_begin=None, column_index_end=None)[source]¶ Bases:
object
The contents of the current table’s header.
- Attr str cell_id
(optional) The unique ID of the cell in the current table.
- Attr object location
(optional) The location of the table header cell in the current table as defined by its begin and end offsets, respectively, in the input document.
- Attr str text
(optional) The textual contents of the cell from the input document without associated markup content.
- Attr int row_index_begin
(optional) The begin index of this cell’s row location in the current table.
- Attr int row_index_end
(optional) The end index of this cell’s row location in the current table.
- Attr int column_index_begin
(optional) The begin index of this cell’s column location in the current table.
- Attr int column_index_end
(optional) The end index of this cell’s column location in the current table.
-
class
TableKeyValuePairs
(*, key=None, value=None)[source]¶ Bases:
object
Key-value pairs detected across cell boundaries.
- Attr TableCellKey key
(optional) A key in a key-value pair.
- Attr list[TableCellValues] value
(optional) A list of values in a key-value pair.
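Since each pair holds a single key and a list of values, a common first step is to collapse the pairs into a mapping from key text to value texts. This is a minimal sketch, assuming the pairs are plain dictionaries with the attribute names shown above; the helper name key_value_pairs_to_dict and the sample data are hypothetical.

```python
from typing import Dict, List, Optional


def key_value_pairs_to_dict(pairs: Optional[List[Dict]]) -> Dict[str, List[str]]:
    """Map each key's text to the list of texts of its values."""
    collected: Dict[str, List[str]] = {}
    for pair in pairs or []:
        key_text = (pair.get("key") or {}).get("text")
        if key_text is None:
            continue  # every attribute is optional, so skip pairs without key text
        values = collected.setdefault(key_text, [])
        for value in pair.get("value") or []:
            if value.get("text") is not None:
                values.append(value["text"])
    return collected


# Hypothetical key-value pairs detected in a table.
sample_pairs = [
    {"key": {"cell_id": "k1", "text": "Total"},
     "value": [{"cell_id": "v1", "text": "42"}, {"cell_id": "v2", "text": "USD"}]},
    {"key": {"cell_id": "k2", "text": "Date"}, "value": []},
    {"value": [{"text": "orphan"}]},
]

lookup = key_value_pairs_to_dict(sample_pairs)
print(lookup)
```

Keys that repeat across pairs have their values merged into one list, which may or may not be what a given application wants.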
-
class
TableResultTable
(*, location=None, text=None, section_title=None, title=None, table_headers=None, row_headers=None, column_headers=None, key_value_pairs=None, body_cells=None, contexts=None)[source]¶ Bases:
object
Full table object retrieved from Table Understanding Enrichment.
- Attr TableElementLocation location
(optional) The numeric location of the identified element in the document, represented with two integers labeled begin and end.
- Attr str text
(optional) The textual contents of the current table from the input document without associated markup content.
- Attr TableTextLocation section_title
(optional) Text and associated location within a table.
- Attr TableTextLocation title
(optional) Text and associated location within a table.
- Attr list[TableHeaders] table_headers
(optional) An array of table-level cells that apply as headers to all the other cells in the current table.
- Attr list[TableRowHeaders] row_headers
(optional) An array of row-level cells, each applicable as a header to other cells in the same row as itself, of the current table.
- Attr list[TableColumnHeaders] column_headers
(optional) An array of column-level cells, each applicable as a header to other cells in the same column as itself, of the current table.
- Attr list[TableKeyValuePairs] key_value_pairs
(optional) An array of key-value pairs identified in the current table.
- Attr list[TableBodyCells] body_cells
(optional) An array of the cells of the current table that are neither table header, column header, nor row header cells, with their corresponding row and column header associations.
- Attr list[TableTextLocation] contexts
(optional) An array of lists of textual entries across the document related to the current table being parsed.
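The row and column indexes on each body cell are enough to lay the cells out as a two-dimensional grid. The sketch below places each cell at its begin indexes only and makes no assumption about whether indexes start at zero or whether end indexes are inclusive. The helper name body_cells_to_grid and the sample cells are hypothetical, and the cells are assumed to be plain dictionaries.

```python
from typing import Dict, List, Optional


def body_cells_to_grid(body_cells: Optional[List[Dict]]) -> List[List[str]]:
    """Arrange body cell texts into rows and columns by their begin indexes."""
    placed = {}
    for cell in body_cells or []:
        row = cell.get("row_index_begin")
        col = cell.get("column_index_begin")
        if row is None or col is None:
            continue  # indexes are optional, so skip cells that cannot be placed
        placed[(row, col)] = cell.get("text") or ""
    if not placed:
        return []
    rows = sorted({r for r, _ in placed})
    cols = sorted({c for _, c in placed})
    # Positions with no cell become empty strings.
    return [[placed.get((r, c), "") for c in cols] for r in rows]


# Hypothetical body cells from a small table.
sample_cells = [
    {"cell_id": "c1", "text": "10", "row_index_begin": 1, "column_index_begin": 1},
    {"cell_id": "c2", "text": "20", "row_index_begin": 1, "column_index_begin": 2},
    {"cell_id": "c3", "text": "30", "row_index_begin": 2, "column_index_begin": 1},
]

grid = body_cells_to_grid(sample_cells)
for line in grid:
    print(line)
```

Cells that span several rows or columns appear only once here; expanding them over their full range would require knowing how the end indexes are defined.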
-
class
TableRowHeaderIds
(*, id=None)[source]¶ Bases:
object
An array of values, each being the id value of a row header that is applicable to this body cell.
- Attr str id
(optional) The id value of a row header.
-
class
TableRowHeaderTexts
(*, text=None)[source]¶ Bases:
object
An array of values, each being the text value of a row header that is applicable to this body cell.
- Attr str text
(optional) The text value of a row header.
-
class
TableRowHeaderTextsNormalized
(*, text_normalized=None)[source]¶ Bases:
object
If you provide customization input, the normalized version of the row header texts according to the customization; otherwise, the same value as row_header_texts.
- Attr str text_normalized
(optional) The normalized version of a row header text.
-
class
TableRowHeaders
(*, cell_id=None, location=None, text=None, text_normalized=None, row_index_begin=None, row_index_end=None, column_index_begin=None, column_index_end=None)[source]¶ Bases:
object
Row-level cells, each applicable as a header to other cells in the same row as itself, of the current table.
- Attr str cell_id
(optional) The unique ID of the cell in the current table.
- Attr TableElementLocation location
(optional) The numeric location of the identified element in the document, represented with two integers labeled begin and end.
- Attr str text
(optional) The textual contents of this cell from the input document without associated markup content.
- Attr str text_normalized
(optional) If you provide customization input, the normalized version of the cell text according to the customization; otherwise, the same value as text.
- Attr int row_index_begin
(optional) The begin index of this cell’s row location in the current table.
- Attr int row_index_end
(optional) The end index of this cell’s row location in the current table.
- Attr int column_index_begin
(optional) The begin index of this cell’s column location in the current table.
- Attr int column_index_end
(optional) The end index of this cell’s column location in the current table.
-
class
TableTextLocation
(*, text=None, location=None)[source]¶ Bases:
object
Text and associated location within a table.
- Attr str text
(optional) The text retrieved.
- Attr TableElementLocation location
(optional) The numeric location of the identified element in the document, represented with two integers labeled begin and end.
-
class
TrainingExample
(document_id, collection_id, relevance, *, created=None, updated=None)[source]¶ Bases:
object
Object containing example response details for a training query.
- Attr str document_id
The document ID associated with this training example.
- Attr str collection_id
The collection ID associated with this training example.
- Attr int relevance
The relevance of the training example.
- Attr date created
(optional) The date and time the example was created.
- Attr date updated
(optional) The date and time the example was updated.
-
class
TrainingQuery
(natural_language_query, examples, *, query_id=None, filter=None, created=None, updated=None)[source]¶ Bases:
object
Object containing training query details.
- Attr str query_id
(optional) The query ID associated with the training query.
- Attr str natural_language_query
The natural text query for the training query.
- Attr str filter
(optional) The filter used on the collection before the natural_language_query is applied.
- Attr date created
(optional) The date and time the query was created.
- Attr date updated
(optional) The date and time the query was updated.
- Attr list[TrainingExample] examples
Array of training examples.
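The required and optional attributes of a training query and its examples can be sketched as a small builder that checks the required fields before assembling a dictionary. This is an illustration only: the helper name build_training_query, the sample IDs, and the sample relevance value are hypothetical, and the dictionary is simply shaped after the attribute names above.

```python
from typing import Dict, List, Optional


def build_training_query(natural_language_query: str,
                         examples: List[Dict],
                         filter: Optional[str] = None) -> Dict:
    """Assemble a training query dictionary after checking the required fields."""
    if not natural_language_query:
        raise ValueError("natural_language_query is required")
    required = ("document_id", "collection_id", "relevance")
    for example in examples:
        missing = [name for name in required if name not in example]
        if missing:
            raise ValueError("training example is missing: " + ", ".join(missing))
        if not isinstance(example["relevance"], int):
            raise TypeError("relevance must be an int")
    payload = {
        "natural_language_query": natural_language_query,
        "examples": list(examples),
    }
    if filter is not None:
        payload["filter"] = filter  # optional, so only included when given
    return payload


# Hypothetical IDs and relevance value.
training_query = build_training_query(
    "how do I reset my password",
    [{"document_id": "doc-1", "collection_id": "coll-1", "relevance": 1}],
)
print(training_query)
```

The created, updated, and query_id attributes are left out because they are optional on the classes above.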