The OPENAITRANSCRIPTION workflow application lets you interact with an OpenAI audio model to transcribe an audio file.
The application sends the audio file to OpenAI to transcribe its content.
Application logs are available. To set the log level, set the `OpenAITranscriptionLogLevel` parameter in the `web.config` file to `0` to deactivate logs, `1` for error logs, `2` for information logs, or `3` for debug logs; the default value is `0`.

Parameter | Type | Direction | Description |
---|---|---|---|
`FILE` | `FILE` | IN | The audio file to transcribe |
`API_KEY` | `TEXT` | IN | OpenAI API key. By default, this value comes from the `OpenAiApiKey` parameter in the `web.config` file. |
`URL` | `TEXT` | IN | API endpoint; defaults to `https://api.openai.com/v1/audio/transcriptions` |
`MODEL` | `TEXT` | IN | ID of the model to use; defaults to `whisper-1` |
`TEMPERATURE` | `NUMERIC` | IN | Sampling temperature, between `0` and `1`; defaults to `1`. Higher values (e.g. `0.8`) will make the output more random, while lower values (e.g. `0.2`) will make it more focused and deterministic. |
`LANGUAGE` | `TEXT` | IN | The language of the input audio. Supplying the input language in ISO-639-1 format will improve accuracy and latency. |
`PROMPT` | `TEXT` | IN | Optional text to guide the model's style or continue a previous audio segment; the prompt should match the audio language |
`VERBOSE_OUTPUT` | `TEXT` | IN | Specifies (`Y` or `N`) if the output should be verbose; defaults to `N` |
`WORDS_OUTPUT` | `TEXT` | IN | Specifies (`Y` or `N`) if the verbose output should include detailed words; defaults to `N` |
`RESULT_WORDS_SEPARATOR` | `TEXT` | IN | Separator used to separate the word list; defaults to `,` (comma) |
`APP_RESPONSE_IGNORE_ERROR` | `TEXT` | IN | Specifies (`Y` or `N`) if errors should be ignored; defaults to `N`. In case of error, if the parameter has `Y` as its value, the error will be ignored and defined OUT parameters (`APP_RESPONSE_STATUS` or `APP_RESPONSE_CONTENT`) will be mapped. Otherwise, an exception will be thrown. |
`TEXT` | `TEXT` | OUT | The transcription text |
`RESULT` | `TEXT` | OUT | The transcription result call |
`RESULT_DURATION` | `TEXT` | OUT | Audio duration (only if verbose) |
`RESULT_WORDS` | `TEXT` | OUT | Transcription words separated by `RESULT_WORDS_SEPARATOR` (only if word output and verbose enabled) |
`RESULT_WORDS_COUNT` | `NUMERIC` | OUT | Transcription word count (only if word output and verbose enabled) |
`RESULT_LANGUAGE` | `TEXT` | OUT | Transcription language (only if verbose) |
`APP_RESPONSE_STATUS` | `TEXT` | OUT | Response status code |
`APP_RESPONSE_CONTENT` | `TEXT` | OUT | Response payload or error message |
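
To make the relationship between the IN and OUT parameters more concrete, the following Python sketch shows one way the parameters could map to the form fields of the OpenAI transcription endpoint and back to the OUT values. This is not the application's actual implementation: the function names `build_form_fields` and `map_word_outputs` are hypothetical, and the mapping of `VERBOSE_OUTPUT` to `response_format=verbose_json` and of `WORDS_OUTPUT` to `timestamp_granularities[]=word` is an assumption based on the public OpenAI API, not something this page confirms.

```python
from typing import Optional


def build_form_fields(
    model: str = "whisper-1",
    temperature: float = 1,
    language: Optional[str] = None,
    prompt: Optional[str] = None,
    verbose_output: str = "N",
    words_output: str = "N",
) -> dict:
    """Build the non-file form fields for a transcription request.

    Hypothetical helper: the parameter-to-field mapping is assumed.
    """
    if not 0 <= temperature <= 1:
        raise ValueError("TEMPERATURE must be between 0 and 1")

    fields = {"model": model, "temperature": str(temperature)}
    if language:
        fields["language"] = language  # ISO-639-1 code, e.g. "en"
    if prompt:
        fields["prompt"] = prompt
    if verbose_output == "Y":
        # Assumption: verbose output corresponds to the verbose JSON format.
        fields["response_format"] = "verbose_json"
        if words_output == "Y":
            # Word-level detail is only requested together with verbose output.
            fields["timestamp_granularities[]"] = "word"
    return fields


def map_word_outputs(response: dict, separator: str = ",") -> dict:
    """Derive OUT values from a verbose JSON response (assumed shape)."""
    words = [entry["word"] for entry in response.get("words", [])]
    return {
        "TEXT": response.get("text", ""),
        "RESULT_LANGUAGE": response.get("language"),
        "RESULT_DURATION": response.get("duration"),
        "RESULT_WORDS": separator.join(words),
        "RESULT_WORDS_COUNT": len(words),
    }
```

In this sketch, word-level fields are only added when both flags are `Y`, which mirrors the "only if word output and verbose enabled" condition on `RESULT_WORDS` and `RESULT_WORDS_COUNT`.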