ibm_watson.discovery_v2 module¶

IBM Watson™ Discovery is a cognitive search and content analytics engine that you can add to applications to identify patterns, trends and actionable insights to drive better decision-making. Securely unify structured and unstructured data with pre-enriched content, and use a simplified query language to eliminate the need for manual filtering of results.

class DiscoveryV2(version: str, authenticator: ibm_cloud_sdk_core.authenticators.authenticator.Authenticator = None, service_name: str = 'discovery')[source]¶

Bases: ibm_cloud_sdk_core.base_service.BaseService

The Discovery V2 service.

DEFAULT_SERVICE_URL = 'https://api.us-south.discovery.watson.cloud.ibm.com'¶

DEFAULT_SERVICE_NAME = 'discovery'¶
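A minimal sketch of constructing the client, assuming IAM API-key authentication through `IAMAuthenticator` from `ibm_cloud_sdk_core`. The helper name `make_discovery_client`, the version date `2020-08-30`, and the credentials passed in are illustrative assumptions, not part of this module.

```python
def make_discovery_client(api_key, service_url, version="2020-08-30"):
    """Build a DiscoveryV2 client (hypothetical helper, not part of the SDK)."""
    # Imported lazily so this sketch can be loaded without the SDK installed.
    from ibm_cloud_sdk_core.authenticators import IAMAuthenticator
    from ibm_watson import DiscoveryV2

    authenticator = IAMAuthenticator(api_key)
    discovery = DiscoveryV2(version=version, authenticator=authenticator)
    # Override DEFAULT_SERVICE_URL when the instance lives in another region.
    discovery.set_service_url(service_url)
    return discovery
```

The later sketches on this page take such a client as their `discovery` argument.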

list_collections(project_id: str, **kwargs) → ibm_cloud_sdk_core.detailed_response.DetailedResponse[source]¶

List collections.

Lists existing collections for the specified project.

Parameters

project_id (str) – The ID of the project. This information can be found from the deploy page of the Discovery administrative tooling.
headers (dict) – A dict containing the request headers

Returns

A DetailedResponse containing the result, headers and HTTP status code.

Return type

DetailedResponse
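As a rough illustration, the result of `list_collections` can be reduced to a name-to-ID lookup. The helper name is hypothetical, and the response keys `collections`, `name`, and `collection_id` are assumptions about the JSON body rather than something this page states.

```python
def collection_ids_by_name(discovery, project_id):
    """Map collection name -> collection_id for one project (hypothetical helper)."""
    response = discovery.list_collections(project_id=project_id)
    result = response.get_result()  # parsed JSON body as a dict
    # "collections", "name" and "collection_id" are assumed response keys.
    return {c["name"]: c["collection_id"] for c in result.get("collections", [])}
```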

create_collection(project_id: str, name: str, *, description: str = None, language: str = None, enrichments: List[CollectionEnrichment] = None, **kwargs) → ibm_cloud_sdk_core.detailed_response.DetailedResponse[source]¶

Create a collection.

Create a new collection in the specified project.

Parameters

project_id (str) – The ID of the project. This information can be found from the deploy page of the Discovery administrative tooling.
name (str) – The name of the collection.
description (str) – (optional) A description of the collection.
language (str) – (optional) The language of the collection.
enrichments (List[CollectionEnrichment]) – (optional) An array of enrichments that are applied to this collection.
headers (dict) – A dict containing the request headers

Returns

A DetailedResponse containing the result, headers and HTTP status code.

Return type

DetailedResponse
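One possible pattern, sketched under assumptions, is to create a collection only when no collection with that name exists yet. `ensure_collection` is a hypothetical helper, and the response keys `collections`, `name`, and `collection_id` are assumed.

```python
def ensure_collection(discovery, project_id, name, description=None, language=None):
    """Return the ID of the named collection, creating it if absent (sketch)."""
    existing = discovery.list_collections(project_id=project_id).get_result()
    for collection in existing.get("collections", []):
        if collection.get("name") == name:
            return collection["collection_id"]
    # No match found: create the collection with the optional attributes.
    created = discovery.create_collection(
        project_id=project_id,
        name=name,
        description=description,
        language=language,
    ).get_result()
    return created["collection_id"]
```

This check-then-create sequence is not atomic, so two concurrent callers could still create duplicates.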

get_collection(project_id: str, collection_id: str, **kwargs) → ibm_cloud_sdk_core.detailed_response.DetailedResponse[source]¶

Get collection.

Get details about the specified collection.

Parameters

project_id (str) – The ID of the project. This information can be found from the deploy page of the Discovery administrative tooling.
collection_id (str) – The ID of the collection.
headers (dict) – A dict containing the request headers

Returns

A DetailedResponse containing the result, headers and HTTP status code.

Return type

DetailedResponse

update_collection(project_id: str, collection_id: str, *, name: str = None, description: str = None, enrichments: List[CollectionEnrichment] = None, **kwargs) → ibm_cloud_sdk_core.detailed_response.DetailedResponse[source]¶

Update a collection.

Updates the specified collection’s name, description, and enrichments.

Parameters

project_id (str) – The ID of the project. This information can be found from the deploy page of the Discovery administrative tooling.
collection_id (str) – The ID of the collection.
name (str) – (optional) The name of the collection.
description (str) – (optional) A description of the collection.
enrichments (List[CollectionEnrichment]) – (optional) An array of enrichments that are applied to this collection.
headers (dict) – A dict containing the request headers

Returns

A DetailedResponse containing the result, headers and HTTP status code.

Return type

DetailedResponse

delete_collection(project_id: str, collection_id: str, **kwargs) → ibm_cloud_sdk_core.detailed_response.DetailedResponse[source]¶

Delete a collection.

Deletes the specified collection from the project. All documents stored in the specified collection and not shared are also deleted.

Parameters

project_id (str) – The ID of the project. This information can be found from the deploy page of the Discovery administrative tooling.
collection_id (str) – The ID of the collection.
headers (dict) – A dict containing the request headers

Returns

A DetailedResponse containing the result, headers and HTTP status code.

Return type

DetailedResponse

query(project_id: str, *, collection_ids: List[str] = None, filter: str = None, query: str = None, natural_language_query: str = None, aggregation: str = None, count: int = None, return_: List[str] = None, offset: int = None, sort: str = None, highlight: bool = None, spelling_suggestions: bool = None, table_results: Optional[ibm_watson.discovery_v2.QueryLargeTableResults] = None, suggested_refinements: Optional[ibm_watson.discovery_v2.QueryLargeSuggestedRefinements] = None, passages: Optional[ibm_watson.discovery_v2.QueryLargePassages] = None, **kwargs) → ibm_cloud_sdk_core.detailed_response.DetailedResponse[source]¶

Query a project.

By using this method, you can construct queries. For details, see the [Discovery documentation](https://cloud.ibm.com/docs/discovery-data?topic=discovery-data-query-concepts). The default query parameters are defined by the settings for this project. See the [Discovery documentation](https://cloud.ibm.com/docs/discovery-data?topic=discovery-data-project-defaults) for an overview of the standard default settings, and see [the Projects API documentation](#create-project) for details about how to set custom default query settings.

Parameters

project_id (str) – The ID of the project. This information can be found from the deploy page of the Discovery administrative tooling.
collection_ids (List[str]) – (optional) A comma-separated list of collection IDs to be queried against.
filter (str) – (optional) A cacheable query that excludes documents that don’t mention the query content. Filter searches are better for metadata-type searches and for assessing the concepts in the data set.
query (str) – (optional) A query search returns all documents in your data set with full enrichments and full text, but with the most relevant documents listed first. Use a query search when you want to find the most relevant search results.
natural_language_query (str) – (optional) A natural language query that returns relevant documents by utilizing training data and natural language understanding.
aggregation (str) – (optional) An aggregation search that returns an exact answer by combining query search with filters. Useful for applications to build lists, tables, and time series. For a full list of possible aggregations, see the Query reference.
count (int) – (optional) Number of results to return.
return_ (List[str]) – (optional) A list of the fields in the document hierarchy to return. If this parameter is not specified, then all top-level fields are returned.
offset (int) – (optional) The number of query results to skip at the beginning. For example, if the total number of results that are returned is 10 and the offset is 8, it returns the last two results.
sort (str) – (optional) A comma-separated list of fields in the document to sort on. You can optionally specify a sort direction by prefixing the field with - for descending or + for ascending. Ascending is the default sort direction if no prefix is specified. This parameter cannot be used in the same query as the bias parameter.
highlight (bool) – (optional) When true, a highlight field is returned for each result. It contains the fields that match the query, with <em></em> tags around the matching query terms.
spelling_suggestions (bool) – (optional) When true and the natural_language_query parameter is used, the natural_language_query parameter is spell checked. The most likely correction is returned in the suggested_query field of the response (if one exists).
table_results (QueryLargeTableResults) – (optional) Configuration for table retrieval.
suggested_refinements (QueryLargeSuggestedRefinements) – (optional) Configuration for suggested refinements.
passages (QueryLargePassages) – (optional) Configuration for passage retrieval.
headers (dict) – A dict containing the request headers

Returns

A DetailedResponse containing the result, headers and HTTP status code.

Return type

DetailedResponse
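The `count` and `offset` parameters can be combined to page through results. The generator below is a sketch only: `iter_query_results` is a hypothetical name, and the response keys `results` and `matching_results` are assumptions about the JSON body.

```python
def iter_query_results(discovery, project_id, text, page_size=10, max_results=50):
    """Yield results for a natural language query, one page at a time (sketch)."""
    offset = 0
    while offset < max_results:
        # Never ask for more than the caller's overall cap allows.
        count = min(page_size, max_results - offset)
        result = discovery.query(
            project_id=project_id,
            natural_language_query=text,
            count=count,
            offset=offset,
        ).get_result()
        page = result.get("results", [])
        yield from page
        offset += len(page)
        # Stop when the service has nothing more to return.
        if not page or offset >= result.get("matching_results", 0):
            break
```

Each iteration issues one request, so a small `page_size` trades latency for smaller responses.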

get_autocompletion(project_id: str, prefix: str, *, collection_ids: List[str] = None, field: str = None, count: int = None, **kwargs) → ibm_cloud_sdk_core.detailed_response.DetailedResponse[source]¶

Get Autocomplete Suggestions.

Returns completion query suggestions for the specified prefix.

Parameters

project_id (str) – The ID of the project. This information can be found from the deploy page of the Discovery administrative tooling.
prefix (str) – The prefix to use for autocompletion. For example, the prefix Ho could autocomplete to Hot, Housing, or How do I upgrade.
collection_ids (List[str]) – (optional) Comma-separated list of the collection IDs. If this parameter is not specified, all collections in the project are used.
field (str) – (optional) The field in the result documents that autocompletion suggestions are identified from.
count (int) – (optional) The number of autocompletion suggestions to return.
headers (dict) – A dict containing the request headers

Returns

A DetailedResponse containing the result, headers and HTTP status code.

Return type

DetailedResponse
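A short sketch of fetching suggestions for a search box follows. The helper name `suggest` is hypothetical, and the response key `completions` is an assumption about the JSON body.

```python
def suggest(discovery, project_id, prefix, count=5):
    """Return up to `count` completion strings for `prefix` (sketch)."""
    result = discovery.get_autocompletion(
        project_id=project_id,
        prefix=prefix,
        count=count,
    ).get_result()
    # "completions" is the assumed key holding the list of suggestions.
    return result.get("completions", [])
```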

query_notices(project_id: str, *, filter: str = None, query: str = None, natural_language_query: str = None, count: int = None, offset: int = None, **kwargs) → ibm_cloud_sdk_core.detailed_response.DetailedResponse[source]¶

Query system notices.

Queries for notices (errors or warnings) that might have been generated by the system. Notices are generated when ingesting documents and performing relevance training.

Parameters

project_id (str) – The ID of the project. This information can be found from the deploy page of the Discovery administrative tooling.
filter (str) – (optional) A cacheable query that excludes documents that don’t mention the query content. Filter searches are better for metadata-type searches and for assessing the concepts in the data set.
query (str) – (optional) A query search returns all documents in your data set with full enrichments and full text, but with the most relevant documents listed first.
natural_language_query (str) – (optional) A natural language query that returns relevant documents by utilizing training data and natural language understanding.
count (int) – (optional) Number of results to return. The maximum for the count and offset values together in any one query is 10000.
offset (int) – (optional) The number of query results to skip at the beginning. For example, if the total number of results that are returned is 10 and the offset is 8, it returns the last two results. The maximum for the count and offset values together in any one query is 10000.
headers (dict) – A dict containing the request headers

Returns

A DetailedResponse containing the result, headers and HTTP status code.

Return type

DetailedResponse
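Because `count` and `offset` together may not exceed 10000 in one query, a caller can clamp the page size before sending the request. The sketch below is a hypothetical client-side guard, not behavior of the SDK itself.

```python
MAX_WINDOW = 10_000  # documented ceiling for count + offset in one query


def clamp_count(count, offset, limit=MAX_WINDOW):
    """Shrink `count` so that count + offset stays within `limit` (sketch)."""
    if count < 0 or offset < 0:
        raise ValueError("count and offset must be non-negative")
    # When offset is already at or past the limit, no results can be requested.
    return max(0, min(count, limit - offset))
```

For example, with an offset of 9950 a requested count of 100 is reduced to 50.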

list_fields(project_id: str, *, collection_ids: List[str] = None, **kwargs) → ibm_cloud_sdk_core.detailed_response.DetailedResponse[source]¶

List fields.

Gets a list of the unique fields (and their types) stored in the specified collections.

Parameters

project_id (str) – The ID of the project. This information can be found from the deploy page of the Discovery administrative tooling.
collection_ids (List[str]) – (optional) Comma-separated list of the collection IDs. If this parameter is not specified, all collections in the project are used.
headers (dict) – A dict containing the request headers

Returns

A DetailedResponse containing the result, headers and HTTP status code.

Return type

DetailedResponse
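As an illustration, the field list can be grouped by type to see which fields are, say, text or dates. The helper name is hypothetical, and the response keys `fields`, `field`, and `type` are assumptions about the JSON body.

```python
from collections import defaultdict


def fields_by_type(discovery, project_id, collection_ids=None):
    """Group the unique field names of a project by their type (sketch)."""
    result = discovery.list_fields(
        project_id=project_id,
        collection_ids=collection_ids,
    ).get_result()
    grouped = defaultdict(set)
    for entry in result.get("fields", []):
        # The same field can appear once per collection; the set removes repeats.
        grouped[entry["type"]].add(entry["field"])
    return {field_type: sorted(names) for field_type, names in grouped.items()}
```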

get_component_settings(project_id: str, **kwargs) → ibm_cloud_sdk_core.detailed_response.DetailedResponse[source]¶

List component settings.

Returns default configuration settings for components.

Parameters

project_id (str) – The ID of the project. This information can be found from the deploy page of the Discovery administrative tooling.
headers (dict) – A dict containing the request headers

Returns

A DetailedResponse containing the result, headers and HTTP status code.

Return type

DetailedResponse

add_document(project_id: str, collection_id: str, *, file: BinaryIO = None, filename: str = None, file_content_type: str = None, metadata: str = None, x_watson_discovery_force: bool = None, **kwargs) → ibm_cloud_sdk_core.detailed_response.DetailedResponse[source]¶

Add a document.

Add a document to a collection with optional metadata.

Returns immediately after the system has accepted the document for processing.

The user must provide document content, metadata, or both. If the request is missing both document content and metadata, it is rejected.

The user can set the Content-Type parameter on the file part to indicate the media type of the document. If the Content-Type parameter is missing or is one of the generic media types (for example, application/octet-stream), then the service attempts to automatically detect the document’s media type.

The following field names are reserved and will be filtered out if present after normalization: id, score, highlight, and any field with the prefix of: _, +, or -

Fields with empty name values after normalization are filtered out before indexing.

Fields containing the following characters after normalization are filtered out before indexing: # and ,

If the document is uploaded to a collection that has its data shared with another collection, the X-Watson-Discovery-Force header must be set to true.

Note: Documents can be added with a specific document_id by using the _/v2/projects/{project_id}/collections/{collection_id}/documents method. Note: This operation only works on collections created to accept direct file uploads. It cannot be used to modify a collection that connects to an external source such as Microsoft SharePoint.

Parameters

project_id (str) – The ID of the project. This information can be found from the deploy page of the Discovery administrative tooling.
collection_id (str) – The ID of the collection.
file (BinaryIO) – (optional) The content of the document to ingest. The maximum supported file size when adding a file to a collection is 50 megabytes; the maximum supported file size when testing a configuration is 1 megabyte. Files larger than the supported size are rejected.
filename (str) – (optional) The filename for file.
file_content_type (str) – (optional) The content type of file.
metadata (str) – (optional) The maximum supported metadata file size is 1 MB. Metadata parts larger than 1 MB are rejected. Example: {"Creator": "Johnny Appleseed", "Subject": "Apples"}.
x_watson_discovery_force (bool) – (optional) When true, the uploaded document is added to the collection even if the data for that collection is shared with other collections.
headers (dict) – A dict containing the request headers

Returns

A DetailedResponse containing the result, headers and HTTP status code.

Return type

DetailedResponse
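The upload rules above can be sketched as a small wrapper that opens the file in binary mode, serializes the metadata to a JSON string, and checks the documented size limits before calling `add_document`. The helper name is hypothetical, and interpreting "megabyte" as 1024 × 1024 bytes is an assumption.

```python
import json
import os

MAX_FILE_BYTES = 50 * 1024 * 1024   # 50 MB document limit (binary megabytes assumed)
MAX_METADATA_BYTES = 1024 * 1024    # 1 MB metadata limit (binary megabytes assumed)


def upload_document(discovery, project_id, collection_id, path,
                    metadata=None, content_type=None):
    """Upload one file with optional metadata and return the parsed result (sketch)."""
    if os.path.getsize(path) > MAX_FILE_BYTES:
        raise ValueError("file exceeds the 50 MB limit")
    # The metadata parameter is a str, so the dict is serialized to JSON first.
    metadata_json = json.dumps(metadata) if metadata is not None else None
    if metadata_json is not None and len(metadata_json.encode("utf-8")) > MAX_METADATA_BYTES:
        raise ValueError("metadata exceeds the 1 MB limit")
    with open(path, "rb") as handle:
        response = discovery.add_document(
            project_id=project_id,
            collection_id=collection_id,
            file=handle,
            filename=os.path.basename(path),
            file_content_type=content_type,  # None lets the service detect the type
            metadata=metadata_json,
        )
    return response.get_result()
```

The method returns once the document is accepted for processing, so the result reflects acceptance rather than completed indexing.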

update_document(project_id: str, collection_id: str, document_id: str, *, file: BinaryIO = None, filename: str = None, file_content_type: str = None, metadata: str = None, x_watson_discovery_force: bool = None, **kwargs) → ibm_cloud_sdk_core.detailed_response.DetailedResponse[source]¶

Update a document.

Replace an existing document or add a document with a specified document_id. Starts ingesting a document with optional metadata. If the document is uploaded to a collection that has its data shared with another collection, the X-Watson-Discovery-Force header must be set to true. Note: When uploading a new document with this method, it automatically replaces any document stored with the same document_id, if one exists. Note: This operation only works on collections created to accept direct file uploads. It cannot be used to modify a collection that connects to an external source such as Microsoft SharePoint. Note: If an uploaded document is segmented, all segments will be overwritten, even if the updated version of the document has fewer segments.

Parameters

project_id (str) – The ID of the project. This information can be found from the deploy page of the Discovery administrative tooling.
collection_id (str) – The ID of the collection.
document_id (str) – The ID of the document.
file (BinaryIO) – (optional) The content of the document to ingest. The maximum supported file size when adding a file to a collection is 50 megabytes; the maximum supported file size when testing a configuration is 1 megabyte. Files larger than the supported size are rejected.
filename (str) – (optional) The filename for file.
file_content_type (str) – (optional) The content type of file.
metadata (str) – (optional) The maximum supported metadata file size is 1 MB. Metadata parts larger than 1 MB are rejected. Example: {"Creator": "Johnny Appleseed", "Subject": "Apples"}.
x_watson_discovery_force (bool) – (optional) When true, the uploaded document is added to the collection even if the data for that collection is shared with other collections.
headers (dict) – A dict containing the request headers

Returns

A DetailedResponse containing the result, headers and HTTP status code.

Return type

DetailedResponse

delete_document(project_id: str, collection_id: str, document_id: str, *, x_watson_discovery_force: bool = None, **kwargs) → ibm_cloud_sdk_core.detailed_response.DetailedResponse[source]¶

Delete a document.

If the given document ID is invalid, or if the document is not found, then a success response is returned (HTTP status code 200) with the status set to ‘deleted’. Note: This operation only works on collections created to accept direct file uploads. It cannot be used to modify a collection that connects to an external source such as Microsoft SharePoint. Note: Segments of an uploaded document cannot be deleted individually. Delete all segments by deleting using the parent_document_id of a segment result.

Parameters

project_id (str) – The ID of the project. This information can be found from the deploy page of the Discovery administrative tooling.
collection_id (str) – The ID of the collection.
document_id (str) – The ID of the document.
x_watson_discovery_force (bool) – (optional) When true, the document is deleted even if the data for that collection is shared with other collections.
headers (dict) – A dict containing the request headers

Returns

A DetailedResponse containing the result, headers and HTTP status code.

Return type

DetailedResponse

list_training_queries(project_id: str, **kwargs) → ibm_cloud_sdk_core.detailed_response.DetailedResponse[source]¶

List training queries.

List the training queries for the specified project.

Parameters

project_id (str) – The ID of the project. This information can be found from the deploy page of the Discovery administrative tooling.
headers (dict) – A dict containing the request headers

Returns

A DetailedResponse containing the result, headers and HTTP status code.

Return type

DetailedResponse

delete_training_queries(project_id: str, **kwargs) → ibm_cloud_sdk_core.detailed_response.DetailedResponse[source]¶

Delete training queries.

Removes all training queries for the specified project.

Parameters

project_id (str) – The ID of the project. This information can be found from the deploy page of the Discovery administrative tooling.
headers (dict) – A dict containing the request headers

Returns

A DetailedResponse containing the result, headers and HTTP status code.

Return type

DetailedResponse

create_training_query(project_id: str, natural_language_query: str, examples: List[TrainingExample], *, filter: str = None, **kwargs) → ibm_cloud_sdk_core.detailed_response.DetailedResponse[source]¶

Create training query.

Add a query to the training data for this project. The query can contain a filter and natural language query.

Parameters

project_id (str) – The ID of the project. This information can be found from the deploy page of the Discovery administrative tooling.
natural_language_query (str) – The natural text query for the training query.
examples (List[TrainingExample]) – Array of training examples.
filter (str) – (optional) The filter used on the collection before the natural_language_query is applied.
headers (dict) – A dict containing the request headers

Returns

A DetailedResponse containing the result, headers and HTTP status code.

Return type

DetailedResponse
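A sketch of building the `examples` list and submitting a training query follows. It assumes that `TrainingExample` in `ibm_watson.discovery_v2` accepts `document_id`, `collection_id`, and an integer `relevance`, and that the response carries a `query_id` key; the helper name and the tuple layout of `rated_documents` are hypothetical.

```python
def add_training_query(discovery, project_id, question, rated_documents):
    """Create a training query from (collection_id, document_id, relevance) tuples."""
    # Imported lazily so this sketch can be loaded without the SDK installed.
    from ibm_watson.discovery_v2 import TrainingExample

    examples = [
        TrainingExample(document_id=doc_id, collection_id=coll_id, relevance=relevance)
        for coll_id, doc_id, relevance in rated_documents
    ]
    result = discovery.create_training_query(
        project_id=project_id,
        natural_language_query=question,
        examples=examples,
    ).get_result()
    # "query_id" is the assumed key identifying the stored training query.
    return result.get("query_id")
```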

get_training_query(project_id: str, query_id: str, **kwargs) → ibm_cloud_sdk_core.detailed_response.DetailedResponse[source]¶

Get a training data query.

Get details for a specific training data query, including the query string and all examples.

Parameters

project_id (str) – The ID of the project. This information can be found from the deploy page of the Discovery administrative tooling.
query_id (str) – The ID of the query used for training.
headers (dict) – A dict containing the request headers

Returns

A DetailedResponse containing the result, headers and HTTP status code.

Return type

DetailedResponse

update_training_query(project_id: str, query_id: str, natural_language_query: str, examples: List[TrainingExample], *, filter: str = None, **kwargs) → ibm_cloud_sdk_core.detailed_response.DetailedResponse[source]¶

Update a training query.

Updates an existing training query and its examples.

Parameters

project_id (str) – The ID of the project. This information can be found from the deploy page of the Discovery administrative tooling.
query_id (str) – The ID of the query used for training.
natural_language_query (str) – The natural text query for the training query.
examples (List[TrainingExample]) – Array of training examples.
filter (str) – (optional) The filter used on the collection before the natural_language_query is applied.
headers (dict) – A dict containing the request headers

Returns

A DetailedResponse containing the result, headers and HTTP status code.

Return type

DetailedResponse

analyze_document(project_id: str, collection_id: str, *, file: BinaryIO = None, filename: str = None, file_content_type: str = None, metadata: str = None, **kwargs) → ibm_cloud_sdk_core.detailed_response.DetailedResponse[source]¶

Analyze a Document.

Process a document using the specified collection’s settings and return it for realtime use. Note: Documents processed using this method are not added to the specified collection. Note: This method is only supported on IBM Cloud Pak for Data instances of Discovery.

Parameters

project_id (str) – The ID of the project. This information can be found from the deploy page of the Discovery administrative tooling.
collection_id (str) – The ID of the collection.
file (BinaryIO) – (optional) The content of the document to ingest. The maximum supported file size when adding a file to a collection is 50 megabytes; the maximum supported file size when testing a configuration is 1 megabyte. Files larger than the supported size are rejected.
filename (str) – (optional) The filename for file.
file_content_type (str) – (optional) The content type of file.
metadata (str) – (optional) The maximum supported metadata file size is 1 MB. Metadata parts larger than 1 MB are rejected. Example: {"Creator": "Johnny Appleseed", "Subject": "Apples"}.
headers (dict) – A dict containing the request headers

Returns

A DetailedResponse containing the result, headers and HTTP status code.

Return type

DetailedResponse

list_enrichments(project_id: str, **kwargs) → ibm_cloud_sdk_core.detailed_response.DetailedResponse[source]¶

List Enrichments.

List the enrichments available to this project.

Parameters

project_id (str) – The ID of the project. This information can be found from the deploy page of the Discovery administrative tooling.
headers (dict) – A dict containing the request headers

Returns

A DetailedResponse containing the result, headers and HTTP status code.

Return type

DetailedResponse

create_enrichment(project_id: str, enrichment: ibm_watson.discovery_v2.CreateEnrichment, *, file: BinaryIO = None, **kwargs) → ibm_cloud_sdk_core.detailed_response.DetailedResponse[source]¶

Create an enrichment.

Create an enrichment for use with the specified project.

Parameters

project_id (str) – The ID of the project. This information can be found from the deploy page of the Discovery administrative tooling.
enrichment (CreateEnrichment) –
file (BinaryIO) – (optional) The enrichment file to upload.
headers (dict) – A dict containing the request headers

Returns

A DetailedResponse containing the result, headers and HTTP status code.

Return type

DetailedResponse

get_enrichment(project_id: str, enrichment_id: str, **kwargs) → ibm_cloud_sdk_core.detailed_response.DetailedResponse[source]¶

Get enrichment.

Get details about a specific enrichment.

Parameters

project_id (str) – The ID of the project. This information can be found from the deploy page of the Discovery administrative tooling.
enrichment_id (str) – The ID of the enrichment.
headers (dict) – A dict containing the request headers

Returns

A DetailedResponse containing the result, headers and HTTP status code.

Return type

DetailedResponse

update_enrichment(project_id: str, enrichment_id: str, name: str, *, description: str = None, **kwargs) → ibm_cloud_sdk_core.detailed_response.DetailedResponse[source]¶

Update an enrichment.

Updates an existing enrichment’s name and description.

Parameters

project_id (str) – The ID of the project. This information can be found from the deploy page of the Discovery administrative tooling.
enrichment_id (str) – The ID of the enrichment.
name (str) – A new name for the enrichment.
description (str) – (optional) A new description for the enrichment.
headers (dict) – A dict containing the request headers

Returns

A DetailedResponse containing the result, headers and HTTP status code.

Return type

DetailedResponse

delete_enrichment(project_id: str, enrichment_id: str, **kwargs) → ibm_cloud_sdk_core.detailed_response.DetailedResponse[source]¶

Delete an enrichment.

Deletes an existing enrichment from the specified project. Note: Only enrichments that have been manually created can be deleted.

Parameters

project_id (str) – The ID of the project. This information can be found from the deploy page of the Discovery administrative tooling.
enrichment_id (str) – The ID of the enrichment.
headers (dict) – A dict containing the request headers

Returns

A DetailedResponse containing the result, headers and HTTP status code.

Return type

DetailedResponse

list_projects(**kwargs) → ibm_cloud_sdk_core.detailed_response.DetailedResponse[source]¶

List projects.

Lists existing projects for this instance.

Parameters

headers (dict) – A dict containing the request headers

Returns

A DetailedResponse containing the result, headers and HTTP status code.

Return type

DetailedResponse
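Since most methods take a `project_id`, a caller may want to resolve it from a project name. The sketch below assumes the response keys `projects`, `name`, and `project_id`; the helper name is hypothetical.

```python
def find_project_id(discovery, name):
    """Return the ID of the uniquely named project, or None if absent (sketch)."""
    result = discovery.list_projects().get_result()
    matches = [p["project_id"] for p in result.get("projects", []) if p.get("name") == name]
    if len(matches) > 1:
        # Names are not assumed to be unique, so refuse to guess.
        raise LookupError("more than one project is named %r" % name)
    return matches[0] if matches else None
```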

create_project(name: str, type: str, *, default_query_parameters: Optional[ibm_watson.discovery_v2.DefaultQueryParams] = None, **kwargs) → ibm_cloud_sdk_core.detailed_response.DetailedResponse[source]¶

Create a Project.

Create a new project for this instance.

Parameters

name (str) – The human readable name of this project.
type (str) – The project type of this project.
default_query_parameters (DefaultQueryParams) – (optional) Default query parameters for this project.
headers (dict) – A dict containing the request headers

Returns

A DetailedResponse containing the result, headers and HTTP status code.

Return type

DetailedResponse

get_project(project_id: str, **kwargs) → ibm_cloud_sdk_core.detailed_response.DetailedResponse[source]¶

Get project.

Get details on the specified project.

Parameters

project_id (str) – The ID of the project. This information can be found from the deploy page of the Discovery administrative tooling.
headers (dict) – A dict containing the request headers

Returns

A DetailedResponse containing the result, headers and HTTP status code.

Return type

DetailedResponse

update_project(project_id: str, *, name: str = None, **kwargs) → ibm_cloud_sdk_core.detailed_response.DetailedResponse[source]¶

Update a project.

Update the specified project’s name.

Parameters

project_id (str) – The ID of the project. This information can be found from the deploy page of the Discovery administrative tooling.
name (str) – (optional) The new name to give this project.
headers (dict) – A dict containing the request headers

Returns

A DetailedResponse containing the result, headers and HTTP status code.

Return type

DetailedResponse

delete_project(project_id: str, **kwargs) → ibm_cloud_sdk_core.detailed_response.DetailedResponse[source]¶

Delete a project.

Deletes the specified project. Important: Deleting a project deletes everything that is part of the specified project, including all collections.

Parameters

project_id (str) – The ID of the project. This information can be found from the deploy page of the Discovery administrative tooling.
headers (dict) – A dict containing the request headers

Returns

A DetailedResponse containing the result, headers and HTTP status code.

Return type

DetailedResponse

delete_user_data(customer_id: str, **kwargs) → ibm_cloud_sdk_core.detailed_response.DetailedResponse[source]¶

Delete labeled data.

Deletes all data associated with a specified customer ID. The method has no effect if no data is associated with the customer ID. You associate a customer ID with data by passing the X-Watson-Metadata header with a request that passes data. For more information about personal data and customer IDs, see [Information security](https://cloud.ibm.com/docs/discovery-data?topic=discovery-data-information-security#information-security). Note: This method is only supported on IBM Cloud instances of Discovery.

Parameters

customer_id (str) – The customer ID for which all data is to be deleted.
headers (dict) – A dict containing the request headers

Returns

A DetailedResponse containing the result, headers and HTTP status code.

Return type

DetailedResponse
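A sketch of the customer ID workflow, assuming the X-Watson-Metadata header takes the form `customer_id=<id>` and is passed through the `headers` keyword argument that every method accepts. Both helper names are hypothetical.

```python
def customer_headers(customer_id):
    """Build the header that labels a request's data with a customer ID (sketch)."""
    # The "customer_id=<id>" value format is an assumption about the service.
    return {"X-Watson-Metadata": "customer_id=%s" % customer_id}


def forget_customer(discovery, customer_id):
    """Delete all data labeled with `customer_id` and return the HTTP status code."""
    response = discovery.delete_user_data(customer_id=customer_id)
    return response.get_status_code()
```

A labeled upload could then look like `discovery.add_document(project_id, collection_id, file=handle, headers=customer_headers("cust-42"))`, where `cust-42` is a placeholder.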

class AddDocumentEnums[source]¶