OPENAITRANSCRIPTION

Overview

The OPENAITRANSCRIPTION workflow application lets you interact with an OpenAI audio model to transcribe an audio file.

How it works

  • The application sends the audio file to OpenAI to transcribe its content.

  • Application logs are available. To choose the log level, set the OpenAITranscriptionLogLevel parameter in the web.config file to 0 (logs deactivated), 1 (error logs), 2 (information logs), or 3 (debug logs); the default value is 0.
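As an illustration of the log level setting, here is a minimal Python sketch that reads OpenAITranscriptionLogLevel from a web.config file. It assumes the parameter is stored as an `<add key="..." value="..." />` entry under `<appSettings>`, which is the usual ASP.NET layout but is not confirmed by this page; the sample fragment and the `read_log_level` helper are hypothetical.

```python
import xml.etree.ElementTree as ET

# Log level values as documented above.
LOG_LEVELS = {0: "none", 1: "error", 2: "information", 3: "debug"}

# Hypothetical web.config fragment; the <appSettings>/<add> layout is an assumption.
SAMPLE_WEB_CONFIG = """<configuration>
  <appSettings>
    <add key="OpenAITranscriptionLogLevel" value="2" />
  </appSettings>
</configuration>"""


def read_log_level(config_xml: str) -> int:
    """Return the configured log level, falling back to the documented default of 0."""
    root = ET.fromstring(config_xml)
    for entry in root.iter("add"):
        if entry.get("key") == "OpenAITranscriptionLogLevel":
            value = int(entry.get("value", "0"))
            if value not in LOG_LEVELS:
                raise ValueError(f"Unsupported log level: {value}")
            return value
    return 0  # parameter absent: logs are deactivated by default


level = read_log_level(SAMPLE_WEB_CONFIG)
print(level, LOG_LEVELS[level])  # 2 information
```

Returning 0 when the key is missing mirrors the documented default, so a configuration without the parameter behaves as if logging were switched off.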

Required parameter

| Parameter | Type | Direction | Description |
|---|---|---|---|
| FILE | FILE | IN | The audio file to transcribe |

Optional parameters

| Parameter | Type | Direction | Description |
|---|---|---|---|
| API_KEY | TEXT | IN | OpenAI API key; by default, this value comes from the OpenAiApiKey parameter in the web.config file |
| URL | TEXT | IN | API endpoint; defaults to https://api.openai.com/v1/audio/transcriptions |
| MODEL | TEXT | IN | ID of the model to use; defaults to whisper-1 |
| TEMPERATURE | NUMERIC | IN | Sampling temperature, between 0 and 1; defaults to 1. Higher values (e.g. 0.8) make the output more random, while lower values (e.g. 0.2) make it more focused and deterministic. |
| LANGUAGE | TEXT | IN | The language of the input audio. Supplying the input language in ISO 639-1 format (e.g. en) improves accuracy and latency. |
| PROMPT | TEXT | IN | Optional text to guide the model's style or continue a previous audio segment; the prompt should match the audio language |
| VERBOSE_OUTPUT | TEXT | IN | Specifies (Y or N) whether the output should be verbose; defaults to N |
| WORDS_OUTPUT | TEXT | IN | Specifies (Y or N) whether the verbose output should include word-level details; defaults to N |
| RESULT_WORDS_SEPARATOR | TEXT | IN | Separator used to delimit the word list; defaults to , (comma) |
| APP_RESPONSE_IGNORE_ERROR | TEXT | IN | Specifies (Y or N) whether errors should be ignored; defaults to N. If an error occurs and this parameter is set to Y, the error is ignored and the defined OUT parameters (APP_RESPONSE_STATUS or APP_RESPONSE_CONTENT) are mapped; otherwise, an exception is thrown. |
| TEXT | TEXT | OUT | The transcription text |
| RESULT | TEXT | OUT | The result of the transcription call |
| RESULT_DURATION | TEXT | OUT | Audio duration (only if verbose output is enabled) |
| RESULT_WORDS | TEXT | OUT | Transcription words separated by RESULT_WORDS_SEPARATOR (only if both verbose output and word output are enabled) |
| RESULT_WORDS_COUNT | NUMERIC | OUT | Transcription word count (only if both verbose output and word output are enabled) |
| RESULT_LANGUAGE | TEXT | OUT | Transcription language (only if verbose output is enabled) |
| APP_RESPONSE_STATUS | TEXT | OUT | Response status code |
| APP_RESPONSE_CONTENT | TEXT | OUT | Response payload or error message |
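
To make the relationship between the IN and OUT parameters more concrete, the following Python sketch shows one way they could map onto an OpenAI transcription request and response. It is not the application's actual implementation. The helper names `build_form_fields` and `map_outputs` are hypothetical, the mapping of VERBOSE_OUTPUT to the `verbose_json` response format and of WORDS_OUTPUT to `timestamp_granularities[]` is an assumption, and the sample response is made-up data. RESULT is left out because this page does not specify its exact content.

```python
import json


def build_form_fields(params: dict) -> dict:
    """Translate IN parameters into multipart form fields (the file part is sent separately)."""
    verbose = params.get("VERBOSE_OUTPUT", "N") == "Y"
    fields = {
        "model": params.get("MODEL", "whisper-1"),
        "temperature": str(params.get("TEMPERATURE", 1)),
        # Assumption: verbose output corresponds to OpenAI's verbose_json format.
        "response_format": "verbose_json" if verbose else "json",
    }
    for name, field in (("LANGUAGE", "language"), ("PROMPT", "prompt")):
        if params.get(name):
            fields[field] = params[name]
    # Assumption: word details are requested through timestamp_granularities[].
    if verbose and params.get("WORDS_OUTPUT", "N") == "Y":
        fields["timestamp_granularities[]"] = "word"
    return fields


def map_outputs(status: int, body: str, params: dict) -> dict:
    """Map an HTTP status and response body onto the OUT parameters listed above."""
    out = {"APP_RESPONSE_STATUS": str(status), "APP_RESPONSE_CONTENT": body}
    if status != 200:
        # With APP_RESPONSE_IGNORE_ERROR=Y only the status and content are mapped.
        if params.get("APP_RESPONSE_IGNORE_ERROR", "N") == "Y":
            return out
        raise RuntimeError(f"Transcription failed with status {status}: {body}")
    data = json.loads(body)
    out["TEXT"] = data.get("text", "")
    if params.get("VERBOSE_OUTPUT", "N") == "Y":
        out["RESULT_DURATION"] = str(data.get("duration", ""))
        out["RESULT_LANGUAGE"] = data.get("language", "")
        if params.get("WORDS_OUTPUT", "N") == "Y":
            words = [w["word"] for w in data.get("words", [])]
            separator = params.get("RESULT_WORDS_SEPARATOR", ",")
            out["RESULT_WORDS"] = separator.join(words)
            out["RESULT_WORDS_COUNT"] = len(words)
    return out


# Made-up verbose response, used only to exercise the mapping.
params = {"VERBOSE_OUTPUT": "Y", "WORDS_OUTPUT": "Y", "LANGUAGE": "en"}
sample_body = json.dumps({
    "text": "Hello world",
    "language": "english",
    "duration": 1.5,
    "words": [
        {"word": "Hello", "start": 0.0, "end": 0.6},
        {"word": "world", "start": 0.7, "end": 1.5},
    ],
})
print(build_form_fields(params))
print(map_outputs(200, sample_body, params)["RESULT_WORDS"])  # Hello,world
```

Keeping the request construction and the output mapping as pure functions means the sketch runs without a network call or an API key; an actual call would post these fields together with the audio file to the URL parameter.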