whisper-node-ts

sunshineLixun66, MIT license, version 0.0.16, TypeScript support: included

Node bindings for OpenAI's Whisper. Optimized for CPU.

OpenAI, Whisper, CPP, C++, Whisper, Bindings, Transcript, Transcriber, Audio, Speech, Speech-to-Text, STT, TTS, SRT, diarization

Node.js bindings for OpenAI's Whisper.

Based on whisper-node.

Features

  • Output transcripts to JSON (also .txt .srt .vtt)
  • Optimized for CPU (Including Apple Silicon ARM)
  • Timestamp precision down to a single word

Installation

  1. Add the dependency to your project
npm install whisper-node-ts
  2. Download the Whisper model of your choice
npx whisper-node-ts download

Usage

import whisper from "whisper-node-ts";

const transcript = await whisper("example/sample.wav");

console.log(transcript); // output: [ {start,end,speech} ]

Output (JSON)

[
  {
    start: "00:00:14.310", // time stamp begin
    end: "00:00:16.480", // time stamp end
    speech: "howdy" // transcription
  }
];
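
The timestamps in this output are strings, so computing durations or sorting by time needs a small conversion step. The following is a minimal sketch, assuming every timestamp follows the "HH:MM:SS.mmm" pattern shown above. `TranscriptLine`, `timestampToMs`, and `durationMs` are hypothetical names used for illustration and are not part of whisper-node-ts.

```typescript
// Shape of one transcript entry, mirroring the JSON output shown above.
// The interface name is an assumption, not a type exported by the package.
interface TranscriptLine {
  start: string; // "HH:MM:SS.mmm"
  end: string; // "HH:MM:SS.mmm"
  speech: string;
}

// Hypothetical helper: convert "HH:MM:SS.mmm" to whole milliseconds.
// Working in integer milliseconds avoids floating-point rounding.
function timestampToMs(stamp: string): number {
  const match = /^(\d+):(\d{2}):(\d{2})\.(\d{3})$/.exec(stamp);
  if (!match) {
    throw new Error(`Unexpected timestamp format: ${stamp}`);
  }
  const [, h, m, s, ms] = match;
  return ((Number(h) * 60 + Number(m)) * 60 + Number(s)) * 1000 + Number(ms);
}

// Hypothetical helper: length of one transcript entry in milliseconds.
function durationMs(line: TranscriptLine): number {
  return timestampToMs(line.end) - timestampToMs(line.start);
}

const sample: TranscriptLine = {
  start: "00:00:14.310",
  end: "00:00:16.480",
  speech: "howdy",
};

console.log(durationMs(sample)); // 2170
```

Throwing on an unexpected format is a deliberate choice here: if the output format ever differs from the sample, the failure is visible instead of producing NaN durations.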

Usage with Additional Options

import whisper from 'whisper-node-ts';

const filePath = "example/sample.wav"; // required

const options = {
  modelName: "tiny.en",                   // default
  modelPath: "/custom/path/to/model.bin", // use model in a custom directory
  whisperOptions: {
    gen_file_txt: false,      // outputs .txt file
    gen_file_subtitle: false, // outputs .srt file
    gen_file_vtt: false,      // outputs .vtt file
    timestamp_size: 10,       // amount of dialogue per timestamp pair
    word_timestamps: true     // timestamp for every word
  }
}

const transcript = await whisper(filePath, options);
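
With word_timestamps enabled, the result presumably contains one entry per word, so getting readable text back means joining the entries. The sketch below rests on that assumption and on the { start, end, speech } shape from the output section. `SpeechEntry` and `joinSpeech` are hypothetical names, not part of whisper-node-ts.

```typescript
// Minimal entry shape needed for joining; only `speech` is read here.
// The type name is an assumption, not something the package exports.
interface SpeechEntry {
  start: string;
  end: string;
  speech: string;
}

// Hypothetical helper: merge per-word entries into one line of text.
// Entries are trimmed, and empty ones are skipped so stray whitespace
// in the transcript does not produce double spaces.
function joinSpeech(entries: SpeechEntry[]): string {
  return entries
    .map((entry) => entry.speech.trim())
    .filter((word) => word.length > 0)
    .join(" ");
}

// Made-up sample data for illustration only.
const words: SpeechEntry[] = [
  { start: "00:00:14.310", end: "00:00:14.900", speech: "howdy" },
  { start: "00:00:14.900", end: "00:00:15.600", speech: " partner " },
  { start: "00:00:15.600", end: "00:00:15.600", speech: "" },
];

console.log(joinSpeech(words)); // "howdy partner"
```

In real use, the array passed to `joinSpeech` would be the `transcript` value returned by the call above, provided it matches this shape.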

Roadmap

  • [x] Support projects not using TypeScript
  • [x] Allow custom directory for storing models
  • [ ] Config files as alternative to model download CLI
  • [ ] Remove path, shelljs and prompt-sync packages for browser, React Native Expo, and WebAssembly compatibility
  • [ ] fluent-ffmpeg to support more audio formats
  • [ ] pyannote diarization for speaker names
  • [ ] Implement WhisperX as optional alternative model for diarization and higher-precision timestamps (as alternative to the C++ version)

Modifying whisper-node-ts

npm run dev - runs nodemon and tsc on '/src/test.ts'

npm run build - runs tsc, outputs to '/dist', and gives shell execute permission to 'dist/download.js'