
SpeechToText Class

This class wraps the Watson Speech to Text service.
Inheritance Hierarchy
System.Object
  IBM.Watson.DeveloperCloud.Services.SpeechToText.v1.SpeechToText

Namespace:  IBM.Watson.DeveloperCloud.Services.SpeechToText.v1
Assembly:  unity-documentation (in unity-documentation.exe) Version: 1.0.0.0 (1.0.0.0)
Syntax
C#
public class SpeechToText : IWatsonService

The SpeechToText type exposes the following members.

Constructors
  Name | Description
Public method SpeechToText
Speech to Text constructor.
Top
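A minimal construction sketch in C#. This assumes the constructor accepts a Credentials instance (as the Credentials property below suggests); the username, password, and endpoint URL are placeholders, not values from this document:

```csharp
using IBM.Watson.DeveloperCloud.Services.SpeechToText.v1;
using IBM.Watson.DeveloperCloud.Utilities;

// Placeholder credentials; substitute your own service values and endpoint.
Credentials credentials = new Credentials("<username>", "<password>", "https://stream.watsonplatform.net/speech-to-text/api");
SpeechToText speechToText = new SpeechToText(credentials);
```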
Properties
  Name | Description
Public property AcousticCustomizationId
Specifies the Globally Unique Identifier (GUID) of a custom acoustic model that is to be used for all requests sent over the connection. The base model of the custom acoustic model must match the value of the model parameter. By default, no custom acoustic model is used. For more information, see https://console.bluemix.net/docs/services/speech-to-text/custom.html.
Public property AudioSent
True if AudioData has been sent and we are recognizing speech.
Public property Credentials
Gets and sets the credentials of the service. Replaces the default endpoint if one is defined.
Public property CustomizationId
Specifies the Globally Unique Identifier (GUID) of a custom language model that is to be used for all requests sent over the connection. The base model of the custom language model must match the value of the model parameter. By default, no custom language model is used. For more information, see https://console.bluemix.net/docs/services/speech-to-text/custom.html.
Public property CustomizationWeight
Specifies the weight the service gives to words from a specified custom language model compared to those from the base model for all requests sent over the connection. Specify a value between 0.0 and 1.0; the default value is 0.3. For more information, see https://console.bluemix.net/docs/services/speech-to-text/language-use.html#weight.
Public property DetectSilence
If true, then we will try not to send silent audio clips to the server. This can save bandwidth when no sound is happening.
Public property EnableInterimResults
If true, then we will get interim results while recognizing. The user will then need to check the Final flag on the results.
Public property EnableTimestamps
True to return timestamps of words with results.
Public property EnableWordConfidence
True to return word confidence with results.
Public property InactivityTimeout
NON-MULTIPART ONLY: The time in seconds after which, if only silence (no speech) is detected in submitted audio, the connection is closed with a 400 error. Useful for stopping audio submission from a live microphone when a user simply walks away. Use -1 for infinity.
Public property IsListening
True if StartListening() has been called.
Public property Keywords
NON-MULTIPART ONLY: Array of keyword strings to spot in the audio. Each keyword string can include one or more tokens. Keywords are spotted only in the final hypothesis, not in interim results. Omit the parameter or specify an empty array if you do not need to spot keywords.
Public property KeywordsThreshold
NON-MULTIPART ONLY: Confidence value that is the lower bound for spotting a keyword. A word is considered to match a keyword if its confidence is greater than or equal to the threshold. Specify a probability between 0 and 1 inclusive. No keyword spotting is performed if you omit the parameter. If you specify a threshold, you must also specify one or more keywords.
Public property LoadFile
Set this property to override the internal file loading of this class.
Public property MaxAlternatives
Returns the maximum number of alternatives returned by recognize.
Public property OnError
This delegate is invoked when an error occurs.
Public property ProfanityFilter
NON-MULTIPART ONLY: If true (the default), filters profanity from all output except for keyword results by replacing inappropriate words with a series of asterisks. Set the parameter to false to return results with no censoring. Applies to US English transcription only.
Public property RecognizeModel
This property controls which recognize model we use when making recognize requests of the server.
Public property SilenceThreshold
A value from 0.0 to 1.0 that determines what is considered silence. If the absolute value of the audio level is below this value, the audio is considered silence.
Public property SmartFormatting
NON-MULTIPART ONLY: If true, converts dates, times, series of digits and numbers, phone numbers, currency values, and Internet addresses into more readable, conventional representations in the final transcript of a recognition request. If false (the default), no formatting is performed. Applies to US English transcription only.
Public property SpeakerLabels
NON-MULTIPART ONLY: Indicates whether labels that identify which words were spoken by which participants in a multi-person exchange are to be included in the response. If true, speaker labels are returned; if false (the default), they are not. Speaker labels can be returned only for the following language models: en-US_NarrowbandModel, en-US_BroadbandModel, es-ES_NarrowbandModel, es-ES_BroadbandModel, ja-JP_NarrowbandModel, and ja-JP_BroadbandModel. Setting speaker_labels to true forces the timestamps parameter to be true, regardless of whether you specify false for the parameter.
Public property StreamMultipart
If true, sets the `Transfer-Encoding` request header to `chunked`, causing the audio to be streamed to the service. By default, audio is sent all at once as a one-shot delivery. See https://console.bluemix.net/docs/services/speech-to-text/input.html#transmission.
Public property Url
Gets and sets the endpoint URL for the service.
Public property WordAlternativesThreshold
NON-MULTIPART ONLY: Confidence value that is the lower bound for identifying a hypothesis as a possible word alternative (also known as "Confusion Networks"). An alternative word is considered if its confidence is greater than or equal to the threshold. Specify a probability between 0 and 1 inclusive. No alternative words are computed if you omit the parameter.
Top
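Several of the properties above are typically set before starting recognition. A minimal configuration sketch; the property names come from the table above, but the literal values and exact numeric types are illustrative assumptions:

```csharp
speechToText.RecognizeModel = "en-US_BroadbandModel"; // base model for recognize requests
speechToText.DetectSilence = true;                    // skip sending silent clips
speechToText.SilenceThreshold = 0.03f;                // audio levels below this count as silence
speechToText.EnableInterimResults = true;             // check the Final flag on results
speechToText.EnableTimestamps = true;                 // return word timestamps
speechToText.Keywords = new string[] { "watson", "unity" }; // keywords to spot (final hypothesis only)
speechToText.KeywordsThreshold = 0.5f;                // minimum confidence for a keyword match
```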
Methods
  Name | Description
Public method AddAcousticResource
Adds an audio resource to a custom acoustic model.
Public method AddCustomCorpus
Overload method for AddCustomCorpus that takes string training data.
Public method AddCustomWords(SpeechToText.SuccessCallback<Boolean>, SpeechToText.FailCallback, String, Words, Dictionary<String, Object>)
Adds one or more custom words to a custom language model. The service populates the words resource for a custom model with out-of-vocabulary (OOV) words found in each corpus added to the model. You can use this method to add additional words or to modify existing words in the words resource. Only the owner of a custom model can use this method to add or modify custom words associated with the model. Adding or modifying custom words does not affect the custom model until you train the model on the new data by using the POST /v1/customizations/{customization_id}/train method. You add custom words by providing a Words object, which is an array of Word objects, one per word. You must use the object's word parameter to identify the word that is to be added. You can also provide one or both of the optional sounds_like and display_as fields for each word. The sounds_like field provides an array of one or more pronunciations for the word. Use the parameter to specify how the word can be pronounced by users. Use the parameter for words that are difficult to pronounce, foreign words, acronyms, and so on. For example, you might specify that the word IEEE can sound like i triple e. You can specify a maximum of five sounds-like pronunciations for a word, and each pronunciation must adhere to the following rules: Use English alphabetic characters: a-z and A-Z. To pronounce a single letter, use the letter followed by a period, for example, N.C.A.A. for the word NCAA. Use real or made-up words that are pronounceable in the native language, for example, shuchensnie for the word Sczcesny. Substitute equivalent English letters for non-English letters, for example, s for ç or ny for ñ. Substitute non-accented letters for accented letters, for example, a for à or e for è. Use the spelling of numbers, for example, seventy-five for 75. You can include multiple words separated by spaces, but the service enforces a maximum of 40 total characters not including spaces.
The display_as field provides a different way of spelling the word in a transcript. Use the parameter when you want the word to appear different from its usual representation or from its spelling in corpora training data. For example, you might indicate that the word IBM (trademark) is to be displayed as IBM™. If you add a custom word that already exists in the words resource for the custom model, the new definition overrides the existing data for the word. If the service encounters an error with the input data, it returns a failure code and does not add any of the words to the words resource. The call returns an HTTP 201 response code if the input data is valid. It then asynchronously pre-processes the words to add them to the model's words resource. The time that it takes for the analysis to complete depends on the number of new words that you add but is generally faster than adding a corpus or training a model. You can use the GET /v1/customizations/{customization_id}/words or GET /v1/customizations/{customization_id}/words/{word_name} method to review the words that you add. Words with an invalid sounds_like field include an error field that describes the problem. You can use other words methods to correct errors, eliminate typos, and modify how words are pronounced as needed. Note: This method is currently a beta release that is available for US English only.
Public method AddCustomWords(SpeechToText.SuccessCallback<Boolean>, SpeechToText.FailCallback, String, String, Dictionary<String, Object>)
Adds one or more custom words to a custom language model. The service populates the words resource for a custom model with out-of-vocabulary (OOV) words found in each corpus added to the model. You can use this method to add additional words or to modify existing words in the words resource. Only the owner of a custom model can use this method to add or modify custom words associated with the model. Adding or modifying custom words does not affect the custom model until you train the model on the new data by using the POST /v1/customizations/{customization_id}/train method. You add custom words by providing a Words object, which is an array of Word objects, one per word. You must use the object's word parameter to identify the word that is to be added. You can also provide one or both of the optional sounds_like and display_as fields for each word. The sounds_like field provides an array of one or more pronunciations for the word. Use the parameter to specify how the word can be pronounced by users. Use the parameter for words that are difficult to pronounce, foreign words, acronyms, and so on. For example, you might specify that the word IEEE can sound like i triple e. You can specify a maximum of five sounds-like pronunciations for a word, and each pronunciation must adhere to the following rules: Use English alphabetic characters: a-z and A-Z. To pronounce a single letter, use the letter followed by a period, for example, N.C.A.A. for the word NCAA. Use real or made-up words that are pronounceable in the native language, for example, shuchensnie for the word Sczcesny. Substitute equivalent English letters for non-English letters, for example, s for ç or ny for ñ. Substitute non-accented letters for accented letters, for example, a for à or e for è. Use the spelling of numbers, for example, seventy-five for 75. You can include multiple words separated by spaces, but the service enforces a maximum of 40 total characters not including spaces.
The display_as field provides a different way of spelling the word in a transcript. Use the parameter when you want the word to appear different from its usual representation or from its spelling in corpora training data. For example, you might indicate that the word IBM (trademark) is to be displayed as IBM™. If you add a custom word that already exists in the words resource for the custom model, the new definition overrides the existing data for the word. If the service encounters an error with the input data, it returns a failure code and does not add any of the words to the words resource. The call returns an HTTP 201 response code if the input data is valid. It then asynchronously pre-processes the words to add them to the model's words resource. The time that it takes for the analysis to complete depends on the number of new words that you add but is generally faster than adding a corpus or training a model. You can use the GET /v1/customizations/{customization_id}/words or GET /v1/customizations/{customization_id}/words/{word_name} method to review the words that you add. Words with an invalid sounds_like field include an error field that describes the problem. You can use other words methods to correct errors, eliminate typos, and modify how words are pronounced as needed. Note: This method is currently a beta release that is available for US English only.
Public method AddCustomWords(SpeechToText.SuccessCallback<Boolean>, SpeechToText.FailCallback, String, Boolean, String, Dictionary<String, Object>)
Adds one or more custom words to a custom language model. The service populates the words resource for a custom model with out-of-vocabulary (OOV) words found in each corpus added to the model. You can use this method to add additional words or to modify existing words in the words resource. Only the owner of a custom model can use this method to add or modify custom words associated with the model. Adding or modifying custom words does not affect the custom model until you train the model on the new data by using the POST /v1/customizations/{customization_id}/train method. You add custom words by providing a Words object, which is an array of Word objects, one per word. You must use the object's word parameter to identify the word that is to be added. You can also provide one or both of the optional sounds_like and display_as fields for each word. The sounds_like field provides an array of one or more pronunciations for the word. Use the parameter to specify how the word can be pronounced by users. Use the parameter for words that are difficult to pronounce, foreign words, acronyms, and so on. For example, you might specify that the word IEEE can sound like i triple e. You can specify a maximum of five sounds-like pronunciations for a word, and each pronunciation must adhere to the following rules: Use English alphabetic characters: a-z and A-Z. To pronounce a single letter, use the letter followed by a period, for example, N.C.A.A. for the word NCAA. Use real or made-up words that are pronounceable in the native language, for example, shuchensnie for the word Sczcesny. Substitute equivalent English letters for non-English letters, for example, s for ç or ny for ñ. Substitute non-accented letters for accented letters, for example, a for à or e for è. Use the spelling of numbers, for example, seventy-five for 75. You can include multiple words separated by spaces, but the service enforces a maximum of 40 total characters not including spaces.
The display_as field provides a different way of spelling the word in a transcript. Use the parameter when you want the word to appear different from its usual representation or from its spelling in corpora training data. For example, you might indicate that the word IBM (trademark) is to be displayed as IBM™. If you add a custom word that already exists in the words resource for the custom model, the new definition overrides the existing data for the word. If the service encounters an error with the input data, it returns a failure code and does not add any of the words to the words resource. The call returns an HTTP 201 response code if the input data is valid. It then asynchronously pre-processes the words to add them to the model's words resource. The time that it takes for the analysis to complete depends on the number of new words that you add but is generally faster than adding a corpus or training a model. You can use the GET /v1/customizations/{customization_id}/words or GET /v1/customizations/{customization_id}/words/{word_name} method to review the words that you add. Words with an invalid sounds_like field include an error field that describes the problem. You can use other words methods to correct errors, eliminate typos, and modify how words are pronounced as needed. Note: This method is currently a beta release that is available for US English only.
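As a sketch of the Words payload described above, assuming the SDK's Words and Word model classes expose lowercase word, sounds_like, and display_as members matching the service fields; the callback names and customizationId are placeholders:

```csharp
// Hypothetical payload: one Word with a sounds-like pronunciation, per the rules above.
Words words = new Words()
{
    words = new Word[]
    {
        new Word()
        {
            word = "IEEE",
            sounds_like = new string[] { "i triple e" },
            display_as = "IEEE"
        }
    }
};

// OnAddWordsSuccess / OnAddWordsFail are placeholder callbacks.
speechToText.AddCustomWords(OnAddWordsSuccess, OnAddWordsFail, customizationId, words);
```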
Public method CreateAcousticCustomization
Creates a custom acoustic model.
Public method CreateCustomization
Creates a new custom language model for a specified base language model. The custom language model can be used only with the base language model for which it is created. The new model is owned by the individual whose service credentials are used to create it. Note: This method is currently a beta release that is available for US English only.
Public method DeleteAcousticCustomization
Deletes a custom acoustic model.
Public method DeleteAcousticResource
Deletes an audio resource from a custom acoustic model.
Public method DeleteCustomCorpus
Deletes an existing corpus from a custom language model. The service removes any out-of-vocabulary (OOV) words associated with the corpus from the custom model's words resource unless they were also added by another corpus or they have been modified in some way with the POST /v1/customizations/{customization_id}/words or PUT /v1/customizations/{customization_id}/words/{word_name} method. Removing a corpus does not affect the custom model until you train the model with the POST /v1/customizations/{customization_id}/train method. Only the owner of a custom model can use this method to delete a corpus from the model. Note: This method is currently a beta release that is available for US English only.
Public method DeleteCustomization
Deletes an existing custom language model. Only the owner of a custom model can use this method to delete the model. Note: This method is currently a beta release that is available for US English only.
Public method DeleteCustomWord
Deletes a custom word from a custom language model. You can remove any word that you added to the custom model's words resource via any means. However, if the word also exists in the service's base vocabulary, the service removes only the custom pronunciation for the word; the word remains in the base vocabulary. Removing a custom word does not affect the custom model until you train the model with the POST /v1/customizations/{customization_id}/train method. Only the owner of a custom model can use this method to delete a word from the model. Note: This method is currently a beta release that is available for US English only.
Public method GetCustomAcousticModel
Lists information about a custom acoustic model.
Public method GetCustomAcousticModels
Lists information about all custom acoustic models.
Public method GetCustomAcousticResource
Lists information about an audio resource for a custom acoustic model.
Public method GetCustomAcousticResources
Lists information about all audio resources for a custom acoustic model.
Public method GetCustomCorpora
Lists information about all corpora that have been added to the specified custom language model. The information includes the total number of words and out-of-vocabulary (OOV) words, name, and status of each corpus. Only the owner of a custom model can use this method to list the model's corpora. Note: This method is currently a beta release that is available for US English only.
Public method GetCustomCorpus
Lists information about a corpus that has been added to the specified custom language model. The information includes the total number of words and out-of-vocabulary (OOV) words, the name, and the status of the corpus. Only the owner of a custom model can use this method to query a corpus of the model. Note: This method is currently a beta release that is available for US English only.
Public method GetCustomization
Lists information about a custom language model. Only the owner of a custom model can use this method to query information about the model. Note: This method is currently a beta release that is available for US English only.
Public method GetCustomizations
Lists information about all custom language models that are owned by the calling user. Use the language query parameter to see all custom models for the specified language; omit the parameter to see all custom models for all languages. Note: This method is currently a beta release that is available for US English only.
Public method GetCustomWord
Lists information about a custom word from a custom language model. Only the owner of a custom model can use this method to query a word from the model. Note: This method is currently a beta release that is available for US English only.
Public method GetCustomWords
Lists information about all custom words from a custom language model. You can list all words from the custom model's words resource, only custom words that were added or modified by the user, or only OOV words that were extracted from corpora. Only the owner of a custom model can use this method to query the words from the model. Note: This method is currently a beta release that is available for US English only.
Public method GetModel
This function retrieves a specified language model.
Public method GetModels
This function retrieves all the language models that the user may use by setting the RecognizeModel public property.
Public method OnListen
This function should be invoked with AudioData input after the StartListening() method has been invoked. The user should continue to invoke this function until they are ready to call StopListening(); typically, microphone input is sent to this function.
Public method Recognize(SpeechToText.SuccessCallback<SpeechRecognitionEvent>, SpeechToText.FailCallback, AudioClip, Dictionary<String, Object>)
This function POSTs the given audio clip to the recognize endpoint and converts speech into text. It should be used only on AudioClips under 4 MB once they have been converted into WAV format. Use StartListening() for continuous recognition of speech.
Public method Recognize(SpeechToText.SuccessCallback<SpeechRecognitionEvent>, SpeechToText.FailCallback, Byte[], String, Dictionary<String, Object>)
This function POSTs the given audio data to the recognize endpoint and converts speech into text. It should be used only on audio data under 4 MB once it has been converted into WAV format. Use StartListening() for continuous recognition of speech.
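A hedged one-shot recognition sketch for the AudioClip overload. The callback shapes are inferred from the success and fail callback types in the signatures above and may differ by SDK version; audioClip is a placeholder AudioClip under the 4 MB limit:

```csharp
void OnRecognizeResult(SpeechRecognitionEvent results, Dictionary<string, object> customData)
{
    // Inspect results here; check the Final flag if interim results are enabled.
}

void OnRecognizeFail(RESTConnector.Error error, Dictionary<string, object> customData)
{
    // Handle the service error here.
}

// audioClip must be under 4 MB once converted to WAV.
speechToText.Recognize(OnRecognizeResult, OnRecognizeFail, audioClip);
```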
Public method ResetAcousticCustomization
Resets a custom acoustic model.
Public method ResetCustomization
Resets a custom language model by removing all corpora and words from the model. Resetting a custom model initializes the model to its state when it was first created. Metadata such as the name and language of the model are preserved. Only the owner of a custom model can use this method to reset the model. Note: This method is currently a beta release that is available for US English only.
Public method StartListening
This starts the service listening; it will invoke the callback for any recognized speech. OnListen() must be called by the user to queue audio data to send to the service. StopListening() should be called when you want to stop listening.
Public method StopListening
Invoke this function to stop this service from listening.
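The streaming lifecycle described by StartListening(), OnListen(), and StopListening() might be sketched as follows. The microphone capture that produces each AudioData chunk is omitted, and OnRecognize is a placeholder callback:

```csharp
// Begin a listening session; recognized speech is delivered to OnRecognize.
speechToText.StartListening(OnRecognize);

// Call repeatedly with captured microphone audio while listening.
speechToText.OnListen(audioData);

// End the session when the user is done speaking.
speechToText.StopListening();
```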
Public method TrainAcousticCustomization
Trains a custom acoustic model.
Public method TrainCustomization
Initiates the training of a custom language model with new corpora, words, or both. After adding training data to the custom model with the corpora or words methods, use this method to begin the actual training of the model on the new data. You can specify whether the custom model is to be trained with all words from its words resource or only with words that were added or modified by the user. Only the owner of a custom model can use this method to train the model. This method is asynchronous and can take on the order of minutes to complete depending on the amount of data on which the service is being trained and the current load on the service. The method returns an HTTP 200 response code to indicate that the training process has begun. You can monitor the status of the training by using the GET /v1/customizations/{customization_id} method to poll the model's status. Use a loop to check the status every 10 seconds. The method returns a Customization object that includes status and progress fields. A status of available means that the custom model is trained and ready to use. If training is in progress, the progress field indicates the progress of the training as a percentage complete. Note: For this beta release, the progress field does not reflect the current progress of the training. The field changes from 0 to 100 when training is complete. Training can fail to start for the following reasons: No training data (corpora or words) have been added to the custom model. Pre-processing of corpora to generate a list of out-of-vocabulary (OOV) words is not complete. Pre-processing of words to validate or auto-generate sounds-like pronunciations is not complete. One or more words that were added to the custom model have invalid sounds-like pronunciations that you must fix. Note: This method is currently a beta release that is available for US English only.
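The 10-second polling loop suggested above could be sketched as a Unity coroutine. The Customization type, its status field, and the callback parameter shapes are assumptions based on the description; OnFail is a placeholder fail callback:

```csharp
IEnumerator PollTrainingStatus(string customizationId)
{
    bool available = false;
    while (!available)
    {
        speechToText.GetCustomization((Customization customization, Dictionary<string, object> customData) =>
        {
            // A status of "available" means training has completed.
            available = customization.status == "available";
        }, OnFail, customizationId);

        // Check the status every 10 seconds, as the description recommends.
        yield return new WaitForSeconds(10f);
    }
}
```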
Public method UpgradeCustomization
Upgrades a custom language model to the latest release level of the Speech to Text service. The method bases the upgrade on the latest trained data stored for the custom model. If the corpora or words for the model have changed since the model was last trained, you must use the POST /v1/customizations/{customization_id}/train method to train the model on the new data. Only the owner of a custom model can use this method to upgrade the model. Note: This method is not currently implemented. It will be added in a future release of the API.
Top
See Also