WebSocket (realtime)
Clients must provide an access-token as a query parameter to authenticate with the WebSocket server. Use the REST API to get an access-token; see the Obtain an access-token API.
To open a WebSocket connection, obtain an access-token and provide it as a query parameter in the URL. By default, the API only accepts secured connections via wss.
The access-token from the REST API is for one-time use. A new one must be issued for every new connection.
access-token - Your access token.
language - Two language models are currently available: eng for English and kor for Korean.
final-only - Set to true if you only want the final result (default: false).
content-type - Only needed if your audio source is recorded with a microphone. The most common options are:
16 kHz, Mono: audio/x-raw,+layout=(string)interleaved,+rate=(int)16000,+format=(string)S16LE,+channels=(int)1
44.1 kHz, Mono: audio/x-raw,+layout=(string)interleaved,+rate=(int)44100,+format=(string)S16LE,+channels=(int)1
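The query parameters above can be assembled into a connection URL roughly as follows. This is a minimal sketch, not the service's official client: the host and path in BASE_URL are hypothetical placeholders, build_url is an illustrative helper name, and the content-type value is assumed to be URL-ready exactly as documented (with "+" standing for a space), so it is appended verbatim.

```python
from urllib.parse import urlencode

# Hypothetical endpoint: the real host and path are not given in this document.
BASE_URL = "wss://api.example.com/stt"

# Content-type value for 16 kHz mono microphone audio, copied from the list above.
RAW_16K_MONO = (
    "audio/x-raw,+layout=(string)interleaved,+rate=(int)16000,"
    "+format=(string)S16LE,+channels=(int)1"
)


def build_url(access_token, language="eng", final_only=False, content_type=None):
    """Return a wss URL carrying the documented query parameters."""
    params = {
        "access-token": access_token,
        "language": language,
        "final-only": "true" if final_only else "false",
    }
    # urlencode escapes reserved characters in the token and other values.
    query = urlencode(params)
    if content_type is not None:
        # Assumption: the documented value is already URL-ready, so it is
        # appended as-is instead of being escaped a second time.
        query += "&content-type=" + content_type
    return BASE_URL + "?" + query
```

Because the access-token is for one-time use, a fresh token would be fetched and a new URL built before every connection attempt.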
It is recommended to read the HTTP status code of the handshake response and handle possible errors. The handshake response can return the following status codes:
101 - OK
400 - Missing mandatory parameter
401 - Invalid access-token
403 - Free usage exceeded and no credit card available
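One way to act on the handshake status codes listed above is a small lookup that turns any non-101 response into an exception. This is only a sketch: check_handshake is an illustrative name, and how the status code is obtained depends on the WebSocket client library in use.

```python
# Status codes and reasons as listed in this document.
HANDSHAKE_STATUS = {
    101: "OK",
    400: "Missing mandatory parameter",
    401: "Invalid access-token",
    403: "Free usage exceeded and no credit card available",
}


def check_handshake(status_code):
    """Return None for 101; raise ConnectionError for any other status."""
    if status_code == 101:
        return None
    # Codes outside the documented list get a generic reason.
    reason = HANDSHAKE_STATUS.get(status_code, "Unexpected status")
    raise ConnectionError(f"Handshake failed with HTTP {status_code}: {reason}")
```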
Once a connection is established, the client can start sending audio data (e.g. a file or a microphone recording) as binary messages. Most common audio file formats are supported, such as .mp3, .flac, .wav, .ogg, .oga, .mp4, and others.
After all audio data has been sent to the server, the client should send a text message with the content EOS through the same connection. This message tells the server that the audio transmission is complete.
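The send sequence described above (binary audio first, then the text message EOS) can be sketched as a generator that is independent of any particular WebSocket library. The name audio_frames and the default chunk size of 8192 bytes are assumptions made for illustration; the document does not prescribe a chunk size.

```python
def audio_frames(audio, chunk_size=8192):
    """Yield `audio` as bytes chunks, followed by the text message "EOS"."""
    for start in range(0, len(audio), chunk_size):
        # Each bytes chunk is meant to be sent as a binary message.
        yield audio[start:start + chunk_size]
    # The final str item is meant to be sent as a text message.
    yield "EOS"
```

With a client whose send method transmits bytes as binary messages and str as text messages, the loop would be `for frame in audio_frames(data): ws.send(frame)`, where `ws` stands for an already opened connection.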
Transcribed text of the audio will be sent back to the client in real time. The format of the transcription object depends on the parameters the client used to establish the connection.
transcript: The transcribed text.
final: Flag indicating whether the result is final or partial.

transcript: The transcribed text.
likelihood: Likelihood of the transcribed text.
word-alignment:
final: Flag indicating whether the result is final or partial.
segment-start: Start time of this segment in seconds.
segment-length: Length of this segment in seconds.
total-length: Length of all segments in seconds.
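A client could map incoming results onto a single type that covers both field lists above. The sketch below assumes that each result arrives as a JSON text message whose keys are exactly the field names listed; the document does not state the wire format, so treat that as an assumption. Transcription and parse_transcription are illustrative names, and the structure of word-alignment is not described here, so it is passed through untouched.

```python
import json
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class Transcription:
    transcript: str
    final: bool
    # The remaining fields appear only in the longer result format.
    likelihood: Optional[float] = None
    word_alignment: Any = None
    segment_start: Optional[float] = None
    segment_length: Optional[float] = None
    total_length: Optional[float] = None


def parse_transcription(message: str) -> Transcription:
    """Parse one result message (assumed to be JSON) into a Transcription."""
    data = json.loads(message)
    return Transcription(
        transcript=data["transcript"],
        final=bool(data["final"]),
        likelihood=data.get("likelihood"),
        word_alignment=data.get("word-alignment"),
        segment_start=data.get("segment-start"),
        segment_length=data.get("segment-length"),
        total_length=data.get("total-length"),
    )
```

Using optional fields keeps one parser for both shapes, so the caller only needs to check `final` and whether the extra fields are present.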
The client should not close the connection manually; the server treats a client-initiated close as an error.
The server automatically closes the connection after the last result has been transmitted to the client.
The server closes the WebSocket connection to the client with one of the following status codes. Consider checking the status code for error handling on the client side or when reporting issues.
Status Code - Reason
1000 - Success
1006 - Client closed connection abnormally
1007 - Unsupported audio type or bad quality
1011 - Terminating connection because of unexpected server error
1013 - No free capacity
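The close codes above can be checked on the client side with a small lookup. This is a sketch under stated assumptions: describe_close and should_retry are illustrative names, and treating 1013 (no free capacity) as worth retrying later is an assumption of this example, not a rule stated by the service.

```python
# Close codes and reasons as listed in this document.
CLOSE_REASONS = {
    1000: "Success",
    1006: "Client closed connection abnormally",
    1007: "Unsupported audio type or bad quality",
    1011: "Terminating connection because of unexpected server error",
    1013: "No free capacity",
}


def describe_close(code):
    """Return (ok, reason) for a close code; ok is True only for 1000."""
    reason = CLOSE_REASONS.get(code, "Unknown close code")
    return code == 1000, reason


def should_retry(code):
    """Assumption of this sketch: only a capacity shortage is retried."""
    return code == 1013
```

Since the access-token is for one-time use, any retry would start with obtaining a new token before reconnecting.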