@aws-sdk/client-transcribe-streaming

aws · 266.3k · Apache-2.0 · 3.799.0 · TypeScript support: included

AWS SDK for JavaScript Transcribe Streaming Client for Node.js, Browser and React Native

Introduction

Amazon Transcribe streaming enables you to send an audio stream and receive back a stream of text in real time. The API makes it easy for developers to add real-time speech-to-text capability to their applications. It can be used for a variety of purposes. For example:

  • Streaming transcriptions can generate real-time subtitles for live broadcast media.
  • Lawyers can make real-time annotations on top of streaming transcriptions during courtroom depositions.
  • Video game chat can be transcribed in real time so that hosts can moderate content or run real-time analysis.
  • Streaming transcriptions can provide assistance to the hearing impaired.

The JavaScript SDK Transcribe Streaming client encapsulates the API in a JavaScript library that can run in browsers, Node.js, and potentially React Native. By default, the client uses an HTTP/2 connection on Node.js and a WebSocket connection in browsers and React Native.

Installing

To install this package, add or install @aws-sdk/client-transcribe-streaming using your favorite package manager:

  • npm install @aws-sdk/client-transcribe-streaming
  • yarn add @aws-sdk/client-transcribe-streaming
  • pnpm add @aws-sdk/client-transcribe-streaming

Getting Started

In the sections below, we explain the library through an example that uses the startStreamTranscription method to transcribe English speech to text.

If you haven't already, please read the root README for guidance on creating a sample application and on installation. After installation, you can import the Transcribe Streaming client in index.js like this:

// ES5 example
const { TranscribeStreamingClient, StartStreamTranscriptionCommand } = require("@aws-sdk/client-transcribe-streaming");

If require is not available on the platform you are working on (for example, browsers), you can import the client like this:

// ES6+ example
import {
  TranscribeStreamingClient,
  StartStreamTranscriptionCommand,
} from "@aws-sdk/client-transcribe-streaming";

Constructing the Service Client

You can create a service client as shown below:

const client = new TranscribeStreamingClient({
  region,
  credentials,
});
// region and credentials are optional in Node.js

Acquire Speech Stream

The Transcribe Streaming client accepts streaming speech input as an async iterable. You can construct one either from an async generator or from an object that implements Symbol.asyncIterator to emit binary chunks.

Here's an example using an async generator:

const audioSource = async function* () {
  await device.start();
  while (device.ends !== true) {
    const chunk = await device.read();
    yield chunk; /* yield binary chunk */
  }
};

Then you need to wrap each binary chunk in the audio chunk shape that the SDK recognizes:

const audioStream = async function* () {
  for await (const chunk of audioSource()) {
    yield { AudioEvent: { AudioChunk: chunk } };
  }
};
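
The other option mentioned above, implementing the async iteration protocol by hand through Symbol.asyncIterator, can be sketched as follows. This is a minimal illustration rather than part of the SDK: createChunkSource and toAudioEvents are hypothetical helper names, and the in-memory array stands in for a real audio device.

```javascript
// Hypothetical in-memory audio source: an object that implements
// Symbol.asyncIterator by hand instead of using an async generator.
function createChunkSource(chunks) {
  return {
    [Symbol.asyncIterator]() {
      let index = 0;
      return {
        // next() resolves to { value, done }, as the async iteration protocol requires
        async next() {
          if (index < chunks.length) {
            return { value: chunks[index++], done: false };
          }
          return { value: undefined, done: true };
        },
      };
    },
  };
}

// Wrap each binary chunk in the audio event shape used throughout this guide
async function* toAudioEvents(source) {
  for await (const chunk of source) {
    yield { AudioEvent: { AudioChunk: chunk } };
  }
}
```

Because for await...of only relies on Symbol.asyncIterator, toAudioEvents works the same way whether the source is an async generator, a hand-written iterable like this one, or a Node.js readable stream.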

Acquire from Node.js Stream API

In Node.js, you will mostly acquire speech through the Stream API, from an HTTP request or a device. A readable stream in Node.js (>= 10.0.0) is itself an async iterable, so you can supply the stream to the SDK input without explicit conversion. You only need to wrap each chunk in the audio chunk shape that the SDK recognizes:

const audioSource = req; //Incoming message
const audioStream = async function* () {
  for await (const payloadChunk of audioSource) {
    yield { AudioEvent: { AudioChunk: payloadChunk } };
  }
};

If you don't limit the chunk size on the client side, for example when streaming from fs, you might see a "The chunk is too big" error from Transcribe Streaming. You can solve it by setting the highWaterMark:

const { PassThrough } = require("stream");
const { createReadStream } = require("fs");
const audioSource = createReadStream("path/to/speech.wav");
const audioPayloadStream = new PassThrough({ highWaterMark: 1 * 1024 }); // Stream chunk less than 1 KB
audioSource.pipe(audioPayloadStream);
const audioStream = async function* () {
  for await (const payloadChunk of audioPayloadStream) {
    yield { AudioEvent: { AudioChunk: payloadChunk } };
  }
};
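
If you prefer to enforce the size limit explicitly rather than rely on stream buffering settings, one possible approach is to slice each chunk before wrapping it. The sketch below is an assumption-laden illustration, not SDK functionality: limitChunkSize and toAudioEvents are hypothetical helpers, the chunks are assumed to be Buffer or Uint8Array instances, and maxBytes is a limit you choose yourself.

```javascript
// Split arbitrarily large binary chunks into pieces of at most maxBytes.
// maxBytes is a caller-chosen limit, not a value mandated by the service.
async function* limitChunkSize(source, maxBytes) {
  for await (const chunk of source) {
    for (let offset = 0; offset < chunk.length; offset += maxBytes) {
      // subarray() returns a view on the same memory, so no bytes are copied
      yield chunk.subarray(offset, offset + maxBytes);
    }
  }
}

// Wrap each size-limited piece in the audio event shape used in this guide
async function* toAudioEvents(source, maxBytes) {
  for await (const piece of limitChunkSize(source, maxBytes)) {
    yield { AudioEvent: { AudioChunk: piece } };
  }
}
```

For example, toAudioEvents(createReadStream("path/to/speech.wav"), 1024) would yield audio events of at most 1 KB each, regardless of how large the chunks emitted by the file stream are.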

Depending on the audio source, you may need to PCM-encode your audio chunks.

Acquire from Browsers

The Transcribe Streaming SDK client also supports streaming from browsers. You can acquire microphone data through the getUserMedia API. Note that this API is supported by only a subset of browsers. Here's a code snippet that acquires a microphone audio stream using microphone-stream:

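const mic = require("microphone-stream");
// Depending on the package version, the class may instead be exposed as require("microphone-stream").default
const micStream = new mic();
// this part should be put into an async function
micStream.setStream(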
  await window.navigator.mediaDevices.getUserMedia({
    video: false,
    audio: true,
  })
);
const audioStream = async function* () {
  for await (const chunk of micStream) {
    yield { AudioEvent: { AudioChunk: pcmEncodeChunk(chunk) /* pcm Encoding is optional depending on the source */ } };
  }
};

You can find a full front-end example here.

PCM encoding

The examples in this guide send PCM-encoded audio to the Transcribe Streaming service. If your audio source is not already PCM-encoded, you need to PCM-encode the chunks. Here's an example:

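const pcmEncodeChunk = (chunk) => {
  // Convert the raw microphone chunk into Float32 samples in the range [-1, 1]
  const input = mic.toRaw(chunk);
  // 16-bit PCM needs 2 bytes per sample
  const buffer = new ArrayBuffer(input.length * 2);
  const view = new DataView(buffer);
  for (let i = 0, offset = 0; i < input.length; i++, offset += 2) {
    // Clamp the sample, then scale it to a signed 16-bit little-endian integer
    const s = Math.max(-1, Math.min(1, input[i]));
    view.setInt16(offset, s < 0 ? s * 0x8000 : s * 0x7fff, true);
  }
  return Buffer.from(buffer);
};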
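
The same conversion can also be written without microphone-stream or the Node.js Buffer global, which may be convenient in browsers that have no Buffer polyfill. The following is a sketch under the assumption that your audio source already provides Float32 samples in the range [-1, 1]; floatTo16BitPcm is a hypothetical name, not an SDK function.

```javascript
// Convert Float32 samples in [-1, 1] to 16-bit signed little-endian PCM bytes.
function floatTo16BitPcm(samples) {
  const bytes = new Uint8Array(samples.length * 2); // 2 bytes per sample
  const view = new DataView(bytes.buffer);
  for (let i = 0; i < samples.length; i++) {
    // Clamp out-of-range samples before scaling
    const s = Math.max(-1, Math.min(1, samples[i]));
    // Negative samples scale by 0x8000, positive ones by 0x7fff; true = little-endian
    view.setInt16(i * 2, s < 0 ? s * 0x8000 : s * 0x7fff, true);
  }
  return bytes;
}
```

The returned Uint8Array can be used as the AudioChunk value in the same way as the Buffer in the example above.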

Send the Speech Stream

const command = new StartStreamTranscriptionCommand({
  // The language code for the input audio, for example en-GB, en-US, es-US, fr-CA, or fr-FR
  LanguageCode: "en-US",
  // The encoding used for the input audio. This guide uses pcm.
  MediaEncoding: "pcm",
  // The sample rate of the input audio in Hertz. We suggest that you use 8000 Hz for low-quality audio and 16000 Hz for
  // high-quality audio. The sample rate must match the sample rate of the audio you send.
  MediaSampleRateHertz: 44100,
  AudioStream: audioStream(),
});
const response = await client.send(command);

Handling Text Stream

If the request succeeds, you will get a response containing the streaming transcript. Just like the input speech stream, the transcript stream is an async iterable that emits partial transcripts. Here is a code snippet that accesses the transcripts:

// This snippet should be put into an async function
for await (const event of response.TranscriptResultStream) {
  if (event.TranscriptEvent) {
    // Get multiple possible results
    const results = event.TranscriptEvent.Transcript.Results;
    // Print all the possible transcripts
    results.forEach((result) => {
      (result.Alternatives || []).forEach((alternative) => {
        const transcript = alternative.Items.map((item) => item.Content).join(" ");
        console.log(transcript);
      });
    });
  }
}
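
Because the stream emits partial transcripts, the same sentence may be printed several times as it is refined. If you only want finalized text, one possible approach is to skip results that are still partial. The sketch below assumes each result carries the IsPartial flag and each alternative carries a Transcript string, as described in the service API; collectFinalTranscripts is a hypothetical helper, and mockTranscriptStream is a made-up stand-in for response.TranscriptResultStream used purely for illustration.

```javascript
// Collect the text of finalized results only, skipping partial ones.
async function collectFinalTranscripts(transcriptResultStream) {
  const finalTranscripts = [];
  for await (const event of transcriptResultStream) {
    const results = event.TranscriptEvent?.Transcript?.Results ?? [];
    for (const result of results) {
      if (result.IsPartial) continue; // a later event will supersede this result
      const best = (result.Alternatives ?? [])[0];
      if (best?.Transcript) finalTranscripts.push(best.Transcript);
    }
  }
  return finalTranscripts;
}

// Hypothetical stand-in for response.TranscriptResultStream, for illustration only
async function* mockTranscriptStream() {
  yield {
    TranscriptEvent: {
      Transcript: { Results: [{ IsPartial: true, Alternatives: [{ Transcript: "hello" }] }] },
    },
  };
  yield {
    TranscriptEvent: {
      Transcript: { Results: [{ IsPartial: false, Alternatives: [{ Transcript: "hello world" }] }] },
    },
  };
  yield { TranscriptEvent: { Transcript: { Results: [] } } };
}
```

With a real response, you would call collectFinalTranscripts(response.TranscriptResultStream) inside an async function.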

Pipe Transcripts Stream

In Node.js, you can easily pipe this TranscriptResultStream to other destinations with the Readable.from API:

const { Readable } = require("stream");
const transcriptsStream = Readable.from(response.TranscriptResultStream);
transcriptsStream.pipe(/* some destinations */);
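
Note that Readable.from creates an object-mode stream by default, so each item is a transcript event object rather than bytes. If the destination expects text, such as a file, you may need a conversion step in between. A minimal sketch, assuming each alternative carries a Transcript string; createTranscriptTextTransform and pipeTranscriptText are hypothetical helpers, not SDK functions:

```javascript
const { Readable, Transform } = require("stream");
const { pipeline } = require("stream/promises");

// Transform with an object-mode writable side and a byte-oriented readable side:
// it accepts transcript events and emits one line of text per alternative.
function createTranscriptTextTransform() {
  return new Transform({
    writableObjectMode: true,
    transform(event, _encoding, callback) {
      const results = event.TranscriptEvent?.Transcript?.Results ?? [];
      for (const result of results) {
        for (const alternative of result.Alternatives ?? []) {
          if (alternative.Transcript) this.push(`${alternative.Transcript}\n`);
        }
      }
      callback();
    },
  });
}

// Pipe the events through the transform into a writable destination.
// The returned promise settles when the pipeline finishes or fails.
function pipeTranscriptText(transcriptResultStream, destination) {
  return pipeline(
    Readable.from(transcriptResultStream),
    createTranscriptTextTransform(),
    destination
  );
}
```

For example, await pipeTranscriptText(response.TranscriptResultStream, fs.createWriteStream("transcript.txt")) would write one transcript per line to a file.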

Error Handling

If you are using async...await style code, you can catch errors with a try...catch block. Two categories of exceptions can be thrown:

  • Immediate exceptions thrown before transcription starts, such as signature exceptions, invalid parameter exceptions, and network errors;
  • Streaming exceptions that happen after transcription has started, such as InternalFailureException or ConflictException.

For immediate exceptions, the SDK client retries the request if the error is retryable, as network errors are. You can configure the client's retry behavior to suit your needs.

For streaming exceptions, the client cannot retry the request automatically because the streaming transcription has already started. The client throws these exceptions, and you can handle the stream accordingly.

Here's an example of error handling flow:

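// InternalFailureException and ConflictException are exported by @aws-sdk/client-transcribe-streaming.
// handleResponse is your own function that consumes response.TranscriptResultStream.
try {
  const response = await client.send(command);
  await handleResponse(response);
} catch (e) {
  if (e instanceof InternalFailureException) {
    /* handle InternalFailureException */
  } else if (e instanceof ConflictException) {
    /* handle ConflictException */
  } else {
    throw e; // rethrow errors this block does not handle
  }
} finally {
  /* clean resources like input stream */
}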
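
Since the client does not retry streaming exceptions for you, one way to handle them is to start a fresh session when a recoverable error interrupts the stream. The sketch below is only one possible design, not SDK behavior: transcribeWithRestart, runSession, and isRecoverable are hypothetical names, and deciding which errors count as recoverable is left to your application.

```javascript
// Run one transcription session and start a new one when an error that the
// caller considers recoverable interrupts it.
// - runSession(attempt): hypothetical function that builds a fresh audio stream
//   and command, sends it, and consumes the response
// - isRecoverable(error): caller-supplied predicate
// - maxSessions: upper bound on how many sessions are started in total
async function transcribeWithRestart(runSession, isRecoverable, maxSessions = 3) {
  for (let attempt = 1; ; attempt++) {
    try {
      return await runSession(attempt);
    } catch (error) {
      // Give up when the session budget is spent or the error is not recoverable
      if (attempt >= maxSessions || !isRecoverable(error)) throw error;
      // Otherwise loop around and start a new session
    }
  }
}
```

A predicate such as (e) => e.name === "InternalFailureException" could be passed as isRecoverable, assuming the thrown error exposes the exception name in its name property. Each session needs its own audio stream, because an async iterable that has already been consumed cannot be replayed.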

Notes for React Native

This package is compatible with React Native (>= 0.60). However, it has not been tested with any React Native libraries that convert microphone recordings into streaming data. Community input on integrating streaming microphone data is welcome.

Thank you for reading this guide. If you want to know more about how streams are encoded or how the connection is established, please refer to the Service API guide.

Contributing

This client code is generated automatically. Any modifications will be overwritten the next time the @aws-sdk/client-transcribe-streaming package is updated. To contribute to the client, you can check our generate clients scripts.

License

This SDK is distributed under the Apache License, Version 2.0. See LICENSE for more information.

Change Log

All notable changes to this project will be documented in this file. See Conventional Commits for commit guidelines.

3.6.0 (2021-02-20)

Bug Fixes

  • revert publish v3.5.1-0 (#2058) (af25697)
  • client-kinesis-video-signaling: remove retry headers (#1963) (8b35943)
  • clients: remove unsupported CORS retry headers in new services (#2041) (82df9d3)
  • credential-provider-ini: refactor provider options interfaces (#2048) (34cecf1)
  • credential-provider-node: read config and credentials files only once (#2045) (7db14b1)
  • deps: add @aws-sdk/middleware-sdk-rds in DocDB and Neptune (#2042) (a0068f3)
  • lib-storage: fix typo in Upload.intialize (initialize) (#2025) (16214be)
  • middleware-sdk-ec2: add undeclared dependency @aws-sdk/protocol-http (#2043) (6e562ba)

Features

  • client-sso*: remove auth dependencies if client doesn't need (#2037) (f1e190c)
  • lib-storage: rewrite lib-storage upload (#2039) (2bd8f6a)

3.5.0 (2021-02-12)

Bug Fixes

  • util-dynamodb: state options.wrapNumbers on BigInt error in unmarshall (#2015) (d1c548e)
  • util-dynamodb: unmarshall small numbers or those in scientific notation (#2017) (80a8094)
  • util-user-agent-browser: use default import from bowser (#1991) (d2e8d4f)

Features

  • s3-request-presigner: automatically add host header (#1988) (cd50eeb)
  • util-dynamodb: marshall JavaScript Maps (#2010) (569b572)
  • util-dynamodb: support marshalling for Object.create (#1974) (a008d23)
  • add S3 and S3Control customizations for custom endpoints (#1993) (96c1b99)

3.4.1 (2021-01-29)

Features

  • use git-sync action to sync with private mirror (#1965) (10ab6a1)

3.4.0 (2021-01-28)

Bug Fixes

  • allow packages/types in gitignore (#1942) (b4b6fad)
  • credential-provider-cognito-identity: remove duplicate declarationDir (#1944) (d75488a)
  • generate-clients: call mergeManifest when constructor.name is Object (#1937) (601c03b)

Features

  • middleware-stack: allow adding middleware to override an existing one (#1964) (9c21f14), closes #1883
  • util-dynamodb: add option to convert class instance to map (#1969) (1783c69)
  • run prettier in parallel in generate-clients (#1949) (878617a)
  • use downlevel-dts to generate TS 3.4 compatible types (#1943) (63ad215)

3.3.0 (2021-01-14)

Bug Fixes

  • clients: export explicit dependencies on @aws-sdk/types (#1902) (96f1087)
  • clients: lowercase all header names in serializer (#1892) (1308721)
  • url-parser: merge browser and node url parser, add rn url parser (#1903) (99be092)
  • util-waiters: waiters should call operation once before entering waiting (#1915) (2a6ac11)

Features

  • clients: update README with documentation, usage and more (#1907) (03be111)

3.2.0 (2021-01-09)

Bug Fixes

  • lib-storage: chunk from readable only when defined (#1886) (4cdc08a)
  • s3-request-presigner: not to throw when get signed urls concurrently (#1884) (741bb99)
  • stop adding command mw repeatedly in resolveMiddleware() (#1883) (d4c302b)
  • readme: npm downloads tag (#1870) (1f8baf3)
  • readme: remove duplicate @aws-sdk (#1873) (85ae915)
  • readme: use latest for npm version badge in template (#1871) (80b57a7)
  • readme: use latest in npm version tag (#1872) (b8542d8)
  • util-user-agent-*: move @aws-sdk/types to devDependencies (#1879) (ea39ca6)
  • util-waiter: expose minDelay and maxDelay for waiters (#1839) (25cb359)

Features

  • use lock-threads GH action for inactive issues/PRs (#1881) (fc22682)
  • util-dynamodb: enable undefined values removal in marshall (#1840) (314d3b3)

3.1.0 (2020-12-23)

Bug Fixes

  • clients: default region and credential provider (#1834) (bc79ab5)
  • clients: populate sdkId in serviceId and default to use arnNamespace as signingName (#1786) (0011af2)
  • clients: remove retry headers for several services (#1789) (fc98d2d)
  • clients: update endpoint provider (#1824) (64d2210)
  • clients: use signing name from auth sigv4 trait (#1835) (e539302)
  • codegen: strip names from enums (#1837) (0711503)
  • lib-storage: cleanup stream listeners to prevent memory leak (3d36682)
  • middleware-user-agent: add middleware to final step of build (#1833) (e7dce39)
  • signature-v4: add secrets to signing key cache key (#1776) (8785ad4)
  • util-waiter: fix compiling error with waiter (#1812) (ca1f0d6), closes #1803
  • log requestId, extendedRequestId, cfId in $metadata (#1819) (f2a47e8)

Features

  • credential-provider-node: use credential_process from profile (#1773) (842e2a0), closes #1772
  • standardize user agent value (#1775) (388b180)
  • cucumber: use waiters in integration tests (#1792) (e151aee)
  • middleware-logger: log clientName, commandName, input, output (#1788) (4f9e56f)

3.0.0 (2020-12-15)

1.0.0-rc.10 (2020-12-15)

1.0.0-rc.9 (2020-12-11)

Bug Fixes

  • codegen: import SENSITIVE_STRING only when used (#1761) (9296283)
  • middleware-sdk-sqs: call next() exactly once in sendMessageMiddleware (#1752) (dc63e37)
  • shared-ini-file-loader: ignore prohibited profile name (#1764) (a209082)

1.0.0-rc.8 (2020-12-05)

Bug Fixes

  • client-s3: fix union serialization (#1730) (6437e24)
  • client-sts: disable auth for public assumeRole commands (#1706) (891eae2)
  • codegen: checkstyle errors in AddBuiltinPlugins.java (#1731) (48c02f4)
  • middleware-sdk-sqs: Fix MD5 verification on SendMessageBatch. (#1666) (049f45e)
  • s3-request-presigner: skip hoisting SSE headers (#1701) (1ec70ff)

Features

  • update clients as of 11/20/2020 (#1711) (e932876)
  • update clients as of 11/30/2020 (#1734) (a1e8036)
  • update clients as of 12/3/2020 (#1741) (58383dc)
  • invalid-dependency: add invalidAsyncFunction which rejects with an Error (#1719) (c4c046e)

1.0.0-rc.7 (2020-11-20)

Bug Fixes

  • abort-controller: make AbortSignal WHATWG Spec compliant (#1699) (723ec4d)
  • codegen: add aws-iam-traits dependency (#1686) (d6fb1f6)
  • fetch-http-handler: omit body for HEAD/GET methods (#1698) (778b305)
  • node-http-handler: throw TimeoutError for Node.js timeouts (#1693) (96f61bb)
  • change paginators to export paginateOperationName (#1692) (6d02935)

Features

  • update clients as of 11/18/2020 (#1700) (8adfed1)
  • api-reference: add typedoc plugins for api reference (#1694) (2cb016f)
  • ci: add GitHub Action to test codegen (#1684) (41e9359)
  • node-http-handler: update timeout code and tests (#1691) (9e58bbb)
  • service-error-classification: add 429 response as Throttling (#1690) (9a62c0a)

BREAKING CHANGES

  • change paginators to export paginateOperationName to be consistent with verb nouns across AWS

1.0.0-rc.6 (2020-11-13)

Features

  • update clients as of 11/13 (#1676) (2d934c9)
  • codegen: add script to copy models from local directory (#1675) (028a362)

1.0.0-rc.5 (2020-11-09)

Bug Fixes

  • codegen for paginator send commands (#1667) (13f3347)
  • node-http-handler: set maxSockets above 0 in test (#1660) (706768d)
  • package.json: migrate @aws-sdk/types into devDependencies codegen (#1658) (eb50962)
  • migrate dev types for packages and lib into package.json devDeps (#1654) (16d7030)

1.0.0-rc.4 (2020-10-31)

Bug Fixes

  • log requestId, extendedRequestId, cfId in metadata (#1640) (3a2f617)
  • client-timestream-*: use correct endpoint prefix (#1643) (f329821)
  • credential-provider-cognito-identity: return identityId as part of cognitoIdentityPool (#1635) (de75f7e)
  • util-format-url: remove headers or path from input (#1639) (db7aa08)

1.0.0-rc.3 (2020-10-27)

Bug Fixes

  • client-cognito-identity: remove auth for UnlinkIdentity (#1621) (c32e5f3)
  • codegen: skip awsAuthPlugin when optionalAuth trait is set (#1622) (785272b)
  • generate-clients: Invoke prettier relative to client-generation (#1614) (a4136ab)

Features

  • polly-request-presigner: add presigned getSynthesizeSpeechUrl() (#1612) (2c9fd94)
  • update client description to add keywords (#1631) (93fc586)

1.0.0-rc.2 (2020-10-22)

Bug Fixes

  • storage: add version and downloads badges (#1599) (230d030)
  • throw 3XX redirection as errors explicitly (#1591) (76f83f1)

1.0.0-rc.1 (2020-10-19)

Bug Fixes

  • node-http-handler: fix type error sending Uint8Array as payload (#1561) (7bf03fc)
