WebSocket Interface | Mod9 ASR Engine

Mod9 ASR WebSocket Interface

The Mod9 ASR WebSocket Interface is a higher-level interface than the protocol described in the TCP reference documentation, enabling client-side code in web browsers to access the full functionality of the ASR Engine.

Because it is intended for communication over the public Internet, a WebSocket connection should usually be protected with an encryption layer (i.e. using wss:// instead of ws:// in production). The provided WebSocket server can be deployed in a Docker container that loads (and optionally registers) an SSL certificate to encrypt communication.
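
As a minimal sketch of the ws:// versus wss:// distinction, a client could refuse unencrypted URIs unless they point at the local machine. The helper name is_acceptable_uri and the localhost exception are illustrative assumptions, not part of the Mod9 SDK.

```python
from urllib.parse import urlsplit

# Hypothetical helper for illustration; not part of the Mod9 ASR Python SDK.
LOCAL_HOSTS = {"localhost", "127.0.0.1", "::1"}


def is_acceptable_uri(uri: str) -> bool:
    """Return True for wss:// URIs, or for ws:// URIs that target this machine."""
    parts = urlsplit(uri)
    if parts.scheme == "wss":
        return True  # encrypted transport
    if parts.scheme == "ws":
        # Assumption: unencrypted transport is tolerated only for local development.
        return parts.hostname in LOCAL_HOSTS
    return False  # not a WebSocket URI at all
```

A client might call this check before opening a connection and fail early if it returns False.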

Quick start

Try websocket-demo.html to use the WebSocket interface directly from this browser.

Inspect the source code for that HTML page, noting in particular these lines of embedded JS:

function createWebSocket({                        // Line 164
  const websocket = new WebSocket(uri);           // Line 173

  websocket.onopen = async () => {                // Line 199
    websocket.send(optionsJSON);                  // Line 233

  websocket.onmessage = async event => {          // Line 246
    const replyJSON = await event.data.text();    // Line 247

function closeWebSocket(websocket) {              // Line 344
  websocket.send(emptyMessage);                   // Line 351

function startStreamingAudio(                     // Line 365
  audioSenderNode.port.onmessage = event => {     // Line 397
    webSocket.send(event.data);                   // Line 398

The Mod9 ASR WebSocket interface enables the straightforward code above to communicate with the ASR Engine.

Protocol

[Figure: Diagram of ASR Engine protocol over WebSocket]

The protocol for communicating with an ASR Engine indirectly via the WebSocket interface is very similar to communicating with the ASR Engine directly via its custom application-level protocol over TCP.

  1. The client establishes a WebSocket connection with the server, enabling duplex communication in browsers.

  2. In its first WebSocket message, the client sends a JSON-formatted object indicating request options. Unlike the protocol over TCP, this does not need to be formatted as a single line terminated by a newline.

  3. Next, two processes may happen concurrently:

    1. The server will send one or more WebSocket messages to the client, each a single-line JSON-formatted object. These replies will be formatted exactly as in the protocol over TCP, except without any newlines.

    2. Depending on the specified request options, the client may send audio data to the server. The data bytes should be sent in non-empty WebSocket messages.

  4. The client should terminate its audio data by sending an empty WebSocket message. The request may also be terminated as in the protocol over TCP, e.g. timing out or sending an end-of-file byte sequence.

  5. The WebSocket server will reply with a final message and close the WebSocket connection.
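
The client side of the steps above can be sketched without any network code, as the sequence of WebSocket messages a client would send for a request that streams audio. The function names, the 4096-byte chunk size, and the choice of an empty binary message as the terminator are assumptions made for illustration; only the ordering (options first, non-empty audio messages, then an empty message) comes from the protocol description.

```python
import json
from typing import Iterator, Union


def client_messages(options: dict, audio: bytes,
                    chunk_size: int = 4096) -> Iterator[Union[str, bytes]]:
    """Yield the messages a client sends: options, audio chunks, empty terminator."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    # Step 2: the first message is the JSON-formatted request options.
    yield json.dumps(options)
    # Step 3.2: audio data is sent in non-empty messages.
    for start in range(0, len(audio), chunk_size):
        yield audio[start:start + chunk_size]
    # Step 4: an empty message terminates the audio data.
    # Assumption: the terminator is sent as an empty binary message.
    yield b""


def parse_reply(message: Union[str, bytes]) -> dict:
    """Decode one server reply, which is a single JSON-formatted object."""
    if isinstance(message, bytes):
        message = message.decode("utf-8")
    return json.loads(message)
```

Each yielded item would be passed to the send method of whichever WebSocket library is in use, while replies are read concurrently and passed through parse_reply.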

Server deployment

The WebSocket server can be run via Docker (recommended) or as a standalone Python application.

Recommended: deploy WebSocket server in a Docker container

Similar to the REST API, use the http-engine entrypoint command to run the WebSocket server locally:

# Runs WebSocket server at ws://localhost:8080 (and also REST API at http://localhost:8080/rest/api)
docker run -it --rm -p 8080:80 mod9/asr http-engine

Or use the https-engine entrypoint command on a remote server, which interactively registers an SSL certificate:

# Runs WebSocket server at wss://example.com (and https://example.com/rest/api, with encrypted transport).
docker run -it --rm -p 80:80 -p 443:443 mod9/asr https-engine

Alternative: Run WebSocket server separately from Engine

Install the Mod9 ASR Python SDK, including a WebSocket server

The WebSocket server is distributed within the Mod9 ASR Python SDK, which can be installed from PyPI:

pip3 install mod9-asr

This will install mod9-asr-websocket-server in pip's local scripts directory; it might need to be added to your PATH:

export PATH=~/.local/bin:${PATH}
which mod9-asr-websocket-server

Connect to the Mod9 ASR Engine

A Mod9 ASR Engine server is expected to be run locally (i.e. at localhost), listening for TCP connections on port 9900. These defaults may be reconfigured with the MOD9_ASR_ENGINE_HOST and MOD9_ASR_ENGINE_PORT environment variables.

Follow the Python SDK's instructions to connect to the Mod9 ASR Engine and set the environment variables accordingly.

Alternatively, the WebSocket server can be passed command-line arguments to override the environment variables:

mod9-asr-websocket-server --engine-host=$HOST --engine-port=$PORT
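
The precedence described above (command-line arguments, then environment variables, then the localhost:9900 defaults) can be sketched as follows. This is an illustration of the documented behavior, not the SDK's actual implementation; the function name resolve_engine_address and its parameters are hypothetical.

```python
import os
from typing import Mapping, Optional, Tuple

DEFAULT_ENGINE_HOST = "localhost"
DEFAULT_ENGINE_PORT = 9900


def resolve_engine_address(cli_host: Optional[str] = None,
                           cli_port: Optional[int] = None,
                           env: Optional[Mapping[str, str]] = None) -> Tuple[str, int]:
    """Pick the Engine address: CLI arguments, else environment, else defaults."""
    if env is None:
        env = os.environ
    host = cli_host or env.get("MOD9_ASR_ENGINE_HOST") or DEFAULT_ENGINE_HOST
    port = cli_port or env.get("MOD9_ASR_ENGINE_PORT") or DEFAULT_ENGINE_PORT
    # Environment variables are strings, so normalize the port to an integer.
    return host, int(port)
```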

Configure WebSocket server host and port

The WebSocket server will listen at host address 127.0.0.1 and port 9980 by default; these may be set by command-line arguments. For example, to allow external access on a standard HTTP port (which may require root permissions):

mod9-asr-websocket-server --host=0.0.0.0 --port=80

Command-line WebSocket client

The Mod9 ASR Python SDK also installs a command-line WebSocket client that can facilitate development:

pip3 install mod9-asr

The arguments to the tool are the WebSocket server URI and JSON-encoded request options. For example:

mod9-asr-websocket-client wss://mod9.io '{"command": "ping"}'

Audio data may be relayed from stdin:

curl -sL mod9.io/hi.wav | mod9-asr-websocket-client wss://mod9.io

In this case, the default request options '{"command": "recognize"}' are implied.

To stream live audio from your microphone (using sox) to a remote WebSocket server:

sox -dqV1 -traw -r16000 -c1 -b16 - | mod9-asr-websocket-client wss://mod9.io '{"format":"raw","rate":16000}'
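
For a sense of scale, the raw format in the sox command above (16000 samples per second, 1 channel, 16 bits per sample) amounts to 32000 bytes of audio per second. The short sketch below works through that arithmetic; the function names are illustrative only.

```python
def pcm_bytes_per_second(rate: int = 16000, channels: int = 1, bits: int = 16) -> int:
    """Data rate of uncompressed PCM audio, in bytes per second."""
    return rate * channels * (bits // 8)


def chunk_duration_ms(chunk_bytes: int, rate: int = 16000,
                      channels: int = 1, bits: int = 16) -> float:
    """How many milliseconds of audio a message of chunk_bytes carries."""
    return 1000.0 * chunk_bytes / pcm_bytes_per_second(rate, channels, bits)


print(pcm_bytes_per_second())   # 32000
print(chunk_duration_ms(3200))  # 100.0
```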

Note that the examples above are similar to using nc, a command-line TCP client:

nc mod9.io 9900 <<< '{"command": "ping"}'
curl -sL mod9.io/hi.wav | nc mod9.io 9900
(echo '{"format":"raw","rate":16000}'; sox -dqV1 -traw -r16000 -c1 -b16 - ) | nc mod9.io 9900

The critical distinction is that this communication to the server's port 9900 is over unencrypted TCP transport, whereas the communication to wss://mod9.io was using the WebSocket protocol with encryption (on port 443).
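
Apart from transport and encryption, the two interfaces differ in framing: over TCP the request options occupy a single newline-terminated line and replies are newline-delimited, whereas each WebSocket message already carries exactly one JSON object. A minimal sketch of the TCP-side framing, with hypothetical helper names, might look like this:

```python
import json
from typing import List, Tuple


def tcp_request(options: dict, audio: bytes = b"") -> bytes:
    """Frame a TCP request: one line of JSON options, a newline, then audio bytes."""
    # json.dumps without an indent argument emits a single line.
    header = json.dumps(options) + "\n"
    return header.encode("utf-8") + audio


def split_tcp_replies(buffer: bytes) -> Tuple[List[dict], bytes]:
    """Split received bytes into complete JSON replies and any unfinished remainder."""
    *lines, remainder = buffer.split(b"\n")
    replies = [json.loads(line) for line in lines if line]
    return replies, remainder
```

With the WebSocket interface neither helper is needed, since message boundaries are provided by the protocol itself.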


©2019-2022 Mod9 Technologies (Engine 1.9.5 : Python SDK 1.11.6)