RecognitionSettings Structure Reference


                    
                    
                    contentType

The format of the audio data. Endianness is automatically detected by the Speech to Text service. For more information about the supported formats, visit: https://cloud.ibm.com/docs/services/speech-to-text/audio-formats.html

Declaration

Swift

public var contentType: String?


                    
                    
                    customizationWeight

If you specify a customization ID when you open the connection, you can use the customization weight to tell the service how much weight to give to words from the custom language model compared to those from the base model for the current request.

Declaration

Swift

public var customizationWeight: Double?


                    
                    
                    inactivityTimeout

The number of seconds after which the connection is to time out due to inactivity. Use -1 to set the timeout to infinity. The default is 30 seconds.

Declaration

Swift

public var inactivityTimeout: Int?
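
The three cases described above (unset, -1, and a positive number of seconds) can be illustrated with a small sketch. The function `describeInactivityTimeout` is a hypothetical helper written for this example; it is not part of the SDK.

```swift
// Hypothetical helper: turns an inactivityTimeout value into a readable
// description, following the rules stated in the property documentation.
func describeInactivityTimeout(_ value: Int?) -> String {
    switch value {
    case .none:
        return "30 seconds (service default)"   // property left unset
    case .some(-1):
        return "no timeout"                     // -1 means infinity
    case .some(let seconds):
        return "\(seconds) seconds"
    }
}

print(describeInactivityTimeout(nil))
print(describeInactivityTimeout(-1))
print(describeInactivityTimeout(120))
```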


                    
                    
                    keywords

An array of keyword strings to be matched in the input audio. By default, the service does not perform any keyword spotting.

Declaration

Swift

public var keywords: [String]?


                    
                    
                    keywordsThreshold

A minimum level of confidence that the service must have to report a matching keyword in the input audio. The threshold must be a probability value between 0 and 1 inclusive. A match must have at least the specified confidence to be returned. If you specify a valid threshold, you must also specify at least one keyword. By default, the service does not perform any keyword spotting.

Declaration

Swift

public var keywordsThreshold: Double?
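
The constraints above (a threshold between 0 and 1 inclusive, and at least one keyword whenever a threshold is given) could be checked on the client before a request is sent. The following is a minimal sketch under that assumption; `KeywordSpottingError` and `validateKeywordSpotting` are hypothetical names introduced here, not SDK API.

```swift
// Hypothetical error type for this sketch; not defined by the SDK.
enum KeywordSpottingError: Error, Equatable {
    case thresholdOutOfRange(Double)
    case thresholdWithoutKeywords
}

// Hypothetical client-side check of the documented keyword-spotting rules.
func validateKeywordSpotting(keywords: [String]?, threshold: Double?) throws {
    // With no threshold there is nothing to validate.
    guard let threshold = threshold else { return }
    // The threshold must be a probability between 0 and 1 inclusive.
    guard (0.0...1.0).contains(threshold) else {
        throw KeywordSpottingError.thresholdOutOfRange(threshold)
    }
    // A valid threshold requires at least one keyword.
    guard let keywords = keywords, !keywords.isEmpty else {
        throw KeywordSpottingError.thresholdWithoutKeywords
    }
}

do {
    try validateKeywordSpotting(keywords: ["watson"], threshold: 0.5)
    print("keyword spotting settings are consistent")
} catch {
    print("invalid keyword spotting settings: \(error)")
}
```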


                    
                    
                    maxAlternatives

The maximum number of alternative transcriptions to receive. The default is 1.

Declaration

Swift

public var maxAlternatives: Int?


                    
                    
                    interimResults

If true, then interim results (i.e. results that are not final) will be received for the transcription. The default is false.

Declaration

Swift

public var interimResults: Bool?


                    
                    
                    wordAlternativesThreshold

A minimum level of confidence that the service must have to report a hypothesis for a word from the input audio. The threshold must be a probability value between 0 and 1 inclusive. A hypothesis must have at least the specified confidence to be returned as a word alternative. By default, the service does not return any word alternatives.

Declaration

Swift

public var wordAlternativesThreshold: Double?


                    
                    
                    wordConfidence

If true, then a confidence score will be received for each word of the transcription. The default is false.

Declaration

Swift

public var wordConfidence: Bool?


                    
                    
                    timestamps

If true, then per-word start and end times relative to the beginning of the audio will be received. The default is false.

Declaration

Swift

public var timestamps: Bool?


                    
                    
                    filterProfanity

If true, then profanity will be censored from the service’s output, obscuring each occurrence with a set of asterisks. The default is true.

Declaration

Swift

public var filterProfanity: Bool?


                    
                    
                    smartFormatting

Indicates whether dates, times, series of digits and numbers, phone numbers, currency values, and Internet addresses are to be converted into more readable, conventional representations in the final transcript of a recognition request. If true, smart formatting is performed; if false (the default), no formatting is performed. Applies to US English transcription only.

Declaration

Swift

public var smartFormatting: Bool?


                    
                    
                    speakerLabels

If true, then speaker labels will be returned for each timestamp. The default is false.

Declaration

Swift

public var speakerLabels: Bool?


                    
                    
                    grammarName

The name of a grammar that is to be used with the recognition request. If you specify a grammar, you must also use the language_customization_id parameter to specify the name of the custom language model for which the grammar is defined. The service recognizes only strings that are recognized by the specified grammar; it does not recognize other custom words from the model’s words resource. See Grammars.

Declaration

Swift

public var grammarName: String?


                    
                    
                    redaction

If true, the service redacts, or masks, numeric data from final transcripts. The feature redacts any number that has three or more consecutive digits by replacing each digit with an X character. It is intended to redact sensitive numeric data, such as credit card numbers. By default, the service performs no redaction. When you enable redaction, the service automatically enables smart formatting, regardless of whether you explicitly disable that feature. To ensure maximum security, the service also disables keyword spotting (ignores the keywords and keywords_threshold parameters) and returns only a single final transcript (forces the max_alternatives parameter to be 1). Note: Applies to US English, Japanese, and Korean transcription only. See Numeric redaction.

Declaration

Swift

public var redaction: Bool?
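
Because redaction overrides several other parameters on the service side, it can help to see which values actually take effect. The sketch below models the documented overrides locally. `TranscriptOptions` and `effectiveOptions(for:)` are hypothetical stand-ins written for this example; the real overrides are applied by the service, not by the SDK.

```swift
// Hypothetical stand-in holding only the properties that redaction affects.
struct TranscriptOptions {
    var redaction: Bool?
    var smartFormatting: Bool?
    var keywords: [String]?
    var keywordsThreshold: Double?
    var maxAlternatives: Int?
}

// Models the documented service behavior when redaction is enabled.
func effectiveOptions(for requested: TranscriptOptions) -> TranscriptOptions {
    // Without redaction, the requested values are used unchanged.
    guard requested.redaction == true else { return requested }
    var effective = requested
    effective.smartFormatting = true   // enabled even if explicitly disabled
    effective.keywords = nil           // keyword spotting is ignored
    effective.keywordsThreshold = nil
    effective.maxAlternatives = 1      // only a single final transcript
    return effective
}

let requested = TranscriptOptions(redaction: true,
                                  smartFormatting: false,
                                  keywords: ["account"],
                                  keywordsThreshold: 0.6,
                                  maxAlternatives: 3)
let effective = effectiveOptions(for: requested)
print(effective.smartFormatting == true, effective.maxAlternatives == 1)
```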


                    
                    
                    processingMetrics

Undocumented

Declaration

Swift

public var processingMetrics: Bool?


                    
                    
                    processingMetricsInterval

Undocumented

Declaration

Swift

public var processingMetricsInterval: Double?


                    
                    
                    audioMetrics

Undocumented

Declaration

Swift

public var audioMetrics: Bool?


                    
                    
                    endOfPhraseSilenceTime

Specifies the duration, in seconds, of the pause interval at which the service splits a transcript into multiple final results. If the service detects pauses or extended silence before it reaches the end of the audio stream, its response can include multiple final results. Silence indicates a point at which the speaker pauses between spoken words or phrases. Specify a value for the pause interval in the range of 0.0 to 120.0.

A value greater than 0 specifies the interval that the service is to use for speech recognition.
A value of 0 indicates that the service is to use the default interval. It is equivalent to omitting the parameter. The default pause interval for most languages is 0.8 seconds; the default for Chinese is 0.6 seconds. See End of phrase silence time.

Declaration

Swift

public var endOfPhraseSilenceTime: Double?
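
The rules for the pause interval can be summarized in a short sketch: an omitted value or 0 falls back to the default, and any other value must lie within 0.0 to 120.0. The function `effectivePauseInterval` and its `isChinese` flag are hypothetical, introduced only to illustrate the documented defaults.

```swift
// Hypothetical helper: returns the pause interval the service would use,
// or nil if the requested value is outside the documented range.
func effectivePauseInterval(_ requested: Double?, isChinese: Bool = false) -> Double? {
    // Documented defaults: 0.8 seconds for most languages, 0.6 for Chinese.
    let defaultInterval = isChinese ? 0.6 : 0.8
    // Omitting the parameter selects the default interval.
    guard let value = requested else { return defaultInterval }
    // Values must be in the range 0.0 to 120.0.
    guard (0.0...120.0).contains(value) else { return nil }
    // A value of 0 is equivalent to omitting the parameter.
    return value == 0 ? defaultInterval : value
}

print(effectivePauseInterval(1.5) ?? "out of range")
print(effectivePauseInterval(0, isChinese: true) ?? "out of range")
```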


                    
                    
                    splitTranscriptAtPhraseEnd

If true, directs the service to split the transcript into multiple final results based on semantic features of the input, for example, at the conclusion of meaningful phrases such as sentences. The service bases its understanding of semantic features on the base language model that you use with a request. Custom language models and grammars can also influence how and where the service splits a transcript. By default, the service splits transcripts based solely on the pause interval. See Split transcript at phrase.

Declaration

Swift

public var splitTranscriptAtPhraseEnd: Bool?


                    
                    
                    speechDetectorSensitivity

The sensitivity of speech activity detection that the service is to perform. Use the parameter to suppress word insertions from music, coughing, and other non-speech events. The service biases the audio it passes for speech recognition by evaluating the input audio against prior models of speech and non-speech activity. Specify a value between 0.0 and 1.0:

0.0 suppresses all audio (no speech is transcribed).
0.5 (the default) provides a reasonable compromise for the level of sensitivity.
1.0 suppresses no audio (speech detection sensitivity is disabled).

The values increase on a monotonic curve. See Speech Activity Detection.

Declaration

Swift

public var speechDetectorSensitivity: Double?


                    
                    
                    backgroundAudioSuppression

The level to which the service is to suppress background audio based on its volume to prevent it from being transcribed as speech. Use the parameter to suppress side conversations or background noise. Specify a value in the range of 0.0 to 1.0:

0.0 (the default) provides no suppression (background audio suppression is disabled).
0.5 provides a reasonable level of audio suppression for general usage.
1.0 suppresses all audio (no audio is transcribed).

The values increase on a monotonic curve. See Speech Activity Detection.

Declaration

Swift

public var backgroundAudioSuppression: Double?
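
Both speechDetectorSensitivity and backgroundAudioSuppression accept values from 0.0 to 1.0. One way to stay within that range is to clamp values before assigning them, as in this sketch. `DetectionTuning` and `clampedToUnitRange` are hypothetical names used for illustration; the struct only mirrors the two documented properties so that the example compiles without the SDK.

```swift
// Hypothetical stand-in holding just the two documented properties.
struct DetectionTuning {
    var speechDetectorSensitivity: Double?
    var backgroundAudioSuppression: Double?
}

// Clamps a value into the documented 0.0...1.0 range.
func clampedToUnitRange(_ value: Double) -> Double {
    return min(max(value, 0.0), 1.0)
}

var tuning = DetectionTuning()
tuning.speechDetectorSensitivity = clampedToUnitRange(0.5)   // the documented default
tuning.backgroundAudioSuppression = clampedToUnitRange(1.4)  // clamped to 1.0

print(tuning.speechDetectorSensitivity ?? 0.5, tuning.backgroundAudioSuppression ?? 0.0)
```

Note that a clamped value of 1.0 for backgroundAudioSuppression suppresses all audio, so rejecting out-of-range input may be preferable to clamping in some applications.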


                    
                    
                    lowLatency

Undocumented

Declaration

Swift

public var lowLatency: Bool?


                    
                    
                    init(contentType:)

Initialize a RecognitionSettings object to set the parameters of a Watson Speech to Text recognition request.

Declaration

Swift

public init(contentType: String? = nil)

Parameters


                                contentType

The format of the audio data. Endianness is automatically detected by the Speech to Text service. For more information about the supported formats, visit: https://cloud.ibm.com/docs/services/speech-to-text/audio-formats.html

Return Value

An initialized RecognitionSettings object with the given contentType. Configure additional parameters for the recognition request by directly modifying the returned object’s properties.
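
The initialize-then-configure pattern described above might look like the following sketch. So that it compiles without the SDK, the struct here is a local stand-in that mirrors only a handful of the declarations documented on this page. The "audio/wav" content type and the keyword strings are illustrative values chosen for the example.

```swift
// Local stand-in mirroring a subset of the documented declarations,
// so that the sketch compiles on its own. This is not the SDK source.
struct RecognitionSettings {
    var contentType: String?
    var interimResults: Bool?
    var keywords: [String]?
    var keywordsThreshold: Double?
    var timestamps: Bool?

    init(contentType: String? = nil) {
        self.contentType = contentType
    }
}

// Create the settings with a content type, then modify properties directly.
var settings = RecognitionSettings(contentType: "audio/wav") // example value
settings.interimResults = true
settings.timestamps = true
settings.keywords = ["watson", "speech"]   // illustrative keywords
settings.keywordsThreshold = 0.5

print(settings.contentType ?? "no content type")
```

Properties that are left as nil are not specified by the client, so the defaults described for each property apply.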