

large-models-interface

chenxingqiang | 32 | MIT | 1.0.0

A comprehensive, unified interface for all types of AI models - natural language, vision, audio, and video. Supports 51 providers with dynamic model discovery and multi-modal capabilities.

large-models, multi-modal-ai, ai-interface, llm-interface, vision-models, audio-models, video-models, natural-language-processing, computer-vision, speech-recognition, text-to-speech, image-generation, ai21-studio, ai21, ailayer, aimlapi, anyscale, anthropic, microsoft-azure-ai, cloudflare-ai, cohere, corcel, deepinfra, deepseek, fireworks-ai, forefront-ai, friendliai, google-gemini, gooseai, groq, hugging-face, hugging-face-inference, hyperbee-ai, lamini, llama-cpp, mistral-ai, monster-api, neets-ai, novita-ai, nvidia-ai, octoai, ollama, openai, perplexity-ai, reka-ai, replicate, shuttle-ai, theb-ai, together-ai, voyage-ai, watsonx-ai, writer, zhipu-ai, baidu-ernie, alibaba-qwen, bytedance-doubao, tencent-hunyuan, iflytek-spark, baichuan-ai, moonshot-kimi, minimax-ai, stepfun-ai, yi-ai, xai-grok, coze-ai, siliconflow, chinese-llm, chinese-ai, dynamic-model-discovery, unified-ai-interface




License: MIT Built with Node.js

Maintained by chenxingqiang

Introduction

Large Models Interface is a comprehensive npm module designed to streamline interactions with various AI model providers in your Node.js applications. Our mission is to provide a unified interface for all types of large models, making it simple to switch between providers and leverage the best models for your specific needs.

🎯 Our Vision: Universal access to all kinds of large AI models through a single, consistent interface.

🇨🇳 Special Focus on Chinese AI Ecosystem: We prioritize comprehensive support for leading Chinese AI providers including Baidu, Alibaba, ByteDance, Tencent, iFLYTEK, and emerging players, making this the most China-friendly international AI interface.

🚀 Multi-Modal AI Support

We are building the most comprehensive interface for modern AI models:

  • 🗣️ Natural Language Models - Chat completion, text generation, and language understanding
  • 🖼️ Vision Models - Image analysis, generation, and vision-language tasks
  • 🎵 Audio Models - Speech recognition, synthesis, and audio processing
  • 🎬 Video Models - Video analysis, generation, and multimodal video understanding
  • 🧠 Specialized Models - Code generation, embeddings, and domain-specific AI

The Large Models Interface package currently offers comprehensive support for 51 language model providers and hundreds of models, with active development to expand into all AI modalities. This extensive and growing coverage ensures maximum flexibility in choosing the best models for your applications.

🌟 Current Support: 51 Providers & Hundreds of Models

🗣️ Natural Language Models (Current)

🌍 Global Leading Providers

International: OpenAI, Anthropic, Google Gemini, Mistral AI, Groq, DeepSeek, Hugging Face, NVIDIA AI, xAI, Coze, and 30+ more providers.

Supported Global Providers: AI21 Studio, AiLAYER, AIMLAPI, Anyscale, Anthropic, Cloudflare AI, Cohere, Corcel, Coze, DeepInfra, DeepSeek, Fireworks AI, Forefront AI, FriendliAI, Google Gemini, GooseAI, Groq, Hugging Face Inference, HyperBee AI, Lamini, LLaMA.CPP, Mistral AI, Monster API, Neets.ai, Novita AI, NVIDIA AI, OctoAI, Ollama, OpenAI, Perplexity AI, Reka AI, Replicate, Shuttle AI, SiliconFlow, TheB.ai, Together AI, Voyage AI, Watsonx AI, Writer, xAI, and Zhipu AI.

🇨🇳 Chinese AI Ecosystem

Leading Chinese Providers: 百度文心一言 (Baidu ERNIE), 阿里通义千问 (Alibaba Qwen), 字节跳动豆包 (ByteDance Doubao), 讯飞星火 (iFLYTEK Spark), 智谱 ChatGLM, 腾讯混元 (Tencent Hunyuan), and more.

Chinese Providers (Currently Supported):

🚧 Coming Soon: Multi-Modal Expansion

  • 🖼️ Vision Models - Image understanding, OCR, visual question answering
  • 🎵 Audio Models - Speech-to-text, text-to-speech, audio generation
  • 🎬 Video Models - Video analysis, captioning, generation
  • 🧠 Specialized Models - Code completion, scientific computing, domain-specific AI

Our roadmap includes expanding across all AI modalities, with dynamic model discovery to automatically support the latest releases.


Detailed Provider List

Core Features

🎯 Universal AI Interface

  • Unified API: LLMInterface.sendMessage provides a single, consistent interface to interact with 51 AI model providers
  • Multi-Modal Ready: Designed to support text, vision, audio, and video models through the same interface
  • Dynamic Model Discovery: Automatically detects and supports newly released models without code updates
  • 🇨🇳 China-First Design: Comprehensive support for Chinese AI ecosystem with native language examples and documentation

🚀 Advanced Capabilities

  • Chat Completion & Streaming: Full support for chat completion, streaming, and embeddings with intelligent failover
  • Smart Model Selection: Automatically choose the best model based on task type and requirements
  • Response Caching: Intelligent caching system to reduce costs and improve performance
  • Graceful Error Handling: Robust retry mechanisms with exponential backoff
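
As an illustration of retrying with exponential backoff, here is a minimal sketch. The names backoffDelay and withRetry and the default delays are assumptions made for this example; they are not part of the package's API and may differ from how the package retries internally.

```javascript
// Hypothetical helpers for illustration; not part of large-models-interface.

// Delay before retry number `attempt` (0-based): baseMs * 2^attempt, capped at maxMs.
function backoffDelay(attempt, baseMs = 200, maxMs = 5000) {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

// Calls `fn` until it resolves, allowing up to `retries` additional attempts.
// `sleep` can be injected (for example in tests) to avoid real waiting.
async function withRetry(fn, { retries = 3, baseMs = 200, maxMs = 5000, sleep } = {}) {
  const wait = sleep ?? ((ms) => new Promise((resolve) => setTimeout(resolve, ms)));
  let lastError;
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      return await fn(attempt);
    } catch (error) {
      lastError = error;
      // Wait only if another attempt will follow.
      if (attempt < retries) await wait(backoffDelay(attempt, baseMs, maxMs));
    }
  }
  throw lastError;
}
```

A call such as withRetry(() => LLMInterface.sendMessage('openai', prompt)) would then retry a failed request after roughly 200 ms, 400 ms, and 800 ms under these assumed defaults.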

🔧 Developer Experience

  • Dynamic Module Loading: Lazy loading of provider interfaces to minimize resource usage
  • JSON Output & Repair: Native JSON output support with automatic repair for malformed responses
  • Extensible Architecture: Easy integration of new providers and model types
  • Type Safety: Full TypeScript support for better development experience
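
Lazy loading of provider interfaces can be pictured with a small registry that builds a provider handler only on first use. This is a sketch under assumptions: createLazyRegistry and the factory functions are hypothetical names, and the package's own loader may work differently.

```javascript
// Hypothetical registry: maps a provider name to a factory that builds its handler.
// In a real package, each factory would typically wrap a require() of that provider's module.
function createLazyRegistry(factories) {
  const loaded = new Map();
  return {
    // Returns the handler for `name`, building it the first time it is requested.
    get(name) {
      if (!loaded.has(name)) {
        if (!Object.prototype.hasOwnProperty.call(factories, name)) {
          throw new Error(`Unknown provider: ${name}`);
        }
        loaded.set(name, factories[name]());
      }
      return loaded.get(name);
    },
    // Names of the providers that have been built so far.
    loadedNames() {
      return [...loaded.keys()];
    },
  };
}
```

With this shape, an application that only ever talks to one provider pays the setup cost for that provider alone.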

🌐 Future-Ready Architecture

  • Modality Expansion: Built to seamlessly integrate vision, audio, and video models
  • Provider Agnostic: Switch between providers without changing your application code
  • Auto-Discovery: Continuously updated model registry for the latest AI capabilities
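
Because every provider is reached through the same call shape, application code can fall back from one provider to another without changes elsewhere. The sketch below assumes a hypothetical helper, sendWithFallback, and takes the send function as a parameter so it does not depend on any particular library behavior.

```javascript
// `send` is any function shaped like (provider, message) => Promise<result>;
// it stands in for a call such as LLMInterface.sendMessage.
// Tries each provider in order and returns the first successful result.
async function sendWithFallback(send, providers, message) {
  const errors = [];
  for (const provider of providers) {
    try {
      const result = await send(provider, message);
      return { provider, result };
    } catch (error) {
      errors.push({ provider, error });
    }
  }
  // Every provider failed: surface all collected errors on one exception.
  const failure = new Error(`All providers failed: ${providers.join(', ')}`);
  failure.causes = errors;
  throw failure;
}
```

For example, sendWithFallback((p, m) => LLMInterface.sendMessage(p, m), ['openai', 'groq'], prompt) would try OpenAI first and Groq second.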

Dependencies

The project relies on several npm packages and APIs. Here are the primary dependencies:

  • axios: For making HTTP requests (used for various HTTP AI APIs).
  • @google/generative-ai: SDK for interacting with the Google Gemini API.
  • dotenv: For managing environment variables. Used by test cases.
  • jsonrepair: Used to repair invalid JSON responses.
  • loglevel: A minimal, lightweight logging library with level-based logging and filtering.
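
The role of a JSON repair step can be sketched as "parse, and only repair if parsing fails". In this example parseWithRepair is a hypothetical helper, and the repair function is passed in so the sketch stays self-contained; it is not a description of how the package wires in jsonrepair.

```javascript
// `repair` is any (string) => string function that tries to fix malformed JSON.
// Returns the parsed value plus a flag showing whether a repair was needed.
function parseWithRepair(text, repair) {
  try {
    return { value: JSON.parse(text), repaired: false };
  } catch {
    // Parsing failed: repair the text and parse again (this may still throw).
    return { value: JSON.parse(repair(text)), repaired: true };
  }
}
```

With the jsonrepair package installed, its exported jsonrepair function can be passed as the second argument: parseWithRepair(raw, jsonrepair).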

The following optional packages can be added to extend LLMInterface's caching capabilities:

  • flat-cache: A simple JSON based cache.
  • cache-manager: An extendible cache module that supports various backends including Redis, MongoDB, File System, Memcached, Sqlite, and more.
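
To show what response caching buys, here is a minimal in-memory sketch with a time-to-live. createCachedSender, the ttlMs option, and the Map-based store are assumptions for illustration; the package's own cache options and backends are described in its usage documentation.

```javascript
// Hypothetical wrapper: caches results of `send` in memory for `ttlMs` milliseconds.
// A backend such as flat-cache or cache-manager would take the place of the Map.
function createCachedSender(send, { ttlMs = 60000, now = Date.now } = {}) {
  const cache = new Map();
  return async function cachedSend(provider, message, options = {}) {
    // Identical provider, message, and options map to the same cache entry.
    const key = JSON.stringify([provider, message, options]);
    const hit = cache.get(key);
    if (hit && hit.expires > now()) return hit.value;
    const value = await send(provider, message, options);
    cache.set(key, { value, expires: now() + ttlMs });
    return value;
  };
}
```

Repeating the same prompt within the time-to-live then costs no additional API call.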

Installation

To install the Large Models Interface npm module, use npm:

npm install large-models-interface

Quick Start

  • Looking for API Keys? This document provides helpful links.
  • Detailed usage documentation is available here.
  • Various examples are also available to help you get started.
  • A breakdown of model aliases is available here.
  • A breakdown of embeddings model aliases is available here.
  • If you still want more examples, you may wish to review the test cases for further examples.

Usage

First, import LLMInterface using the CommonJS require syntax. The examples that follow use await, so they need to run inside an async function.

const { LLMInterface } = require('large-models-interface');

🌍 Global Providers Example

LLMInterface.setApiKey({ openai: process.env.OPENAI_API_KEY });

try {
  const response = await LLMInterface.sendMessage(
    'openai',
    'Explain the importance of low latency LLMs.',
  );
} catch (error) {
  console.error(error);
}

🇨🇳 Chinese Providers Example

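// Zhipu AI (智谱 ChatGLM)
LLMInterface.setApiKey({ zhipuai: process.env.ZHIPUAI_API_KEY });

// Prompt: "Please explain the importance of large language models in Chinese natural language processing."
const zhipuResponse = await LLMInterface.sendMessage(
  'zhipuai',
  '请解释大语言模型在中文自然语言处理中的重要性',
  { model: 'glm-4' }
);

// Baidu ERNIE (百度文心一言)
LLMInterface.setApiKey({ baidu: process.env.BAIDU_API_KEY });

// Prompt: "Please help me write a passage about the development of artificial intelligence."
const baiduResponse = await LLMInterface.sendMessage(
  'baidu',
  '请帮我写一段关于人工智能发展的文章',
  { model: 'ernie-4.0-8k' }
);

// Alibaba Qwen (阿里通义千问)
LLMInterface.setApiKey({ alibaba: process.env.ALIBABA_API_KEY });

// Prompt: "Please give an overview of the history of artificial intelligence."
const qwenResponse = await LLMInterface.sendMessage(
  'alibaba',
  '请介绍一下人工智能的发展历程',
  { model: 'qwen-turbo' }
);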

If you prefer, you can use a one-liner that passes the provider and API key together, skipping the LLMInterface.setApiKey() step.

const response = await LLMInterface.sendMessage(
  ['openai', process.env.OPENAI_API_KEY],
  'Explain the importance of low latency LLMs.',
);
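
The two accepted shapes for the first argument, a provider name or a [provider, apiKey] pair, can be reduced to one form before a request is made. The sketch below is only an illustration of that idea; normalizeProvider and storedKeys are hypothetical names, not the package's internals.

```javascript
// Accepts 'openai' or ['openai', 'my-key'] and returns { provider, apiKey }.
// `storedKeys` stands in for keys registered earlier (for example via setApiKey).
function normalizeProvider(arg, storedKeys = {}) {
  if (Array.isArray(arg)) {
    const [provider, apiKey] = arg;
    return { provider, apiKey };
  }
  // Plain provider name: look the key up among the stored ones (may be undefined).
  return { provider: arg, apiKey: storedKeys[arg] };
}
```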

Passing a more complex message object is just as simple. The same rules apply:

const message = {
  model: 'gpt-4o-mini',
  messages: [
    { role: 'system', content: 'You are a helpful assistant.' },
    { role: 'user', content: 'Explain the importance of low latency LLMs.' },
  ],
};

try {
  const response = await LLMInterface.sendMessage('openai', message, {
    max_tokens: 150,
  });
} catch (error) {
  console.error(error);
}

LLMInterfaceSendMessage and LLMInterfaceStreamMessage are still available and will remain so until version 3.

Running Tests

The project includes tests for each LLM handler. To run the tests, use the following command:

npm test

The comprehensive test suite covers all 51 providers with proper API key validation and graceful skipping when credentials are not available.
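
Skipping tests when credentials are absent comes down to checking the environment before a provider's tests run. The following sketch assumes a hypothetical helper, partitionByCredentials, and a caller-supplied map from provider name to environment variable; it does not reproduce the project's actual test setup.

```javascript
// Hypothetical helper: splits providers into those with an API key set and those without.
// `envVarByProvider` maps a provider name to the environment variable holding its key.
function partitionByCredentials(envVarByProvider, env = process.env) {
  const runnable = [];
  const skipped = [];
  for (const [provider, envVar] of Object.entries(envVarByProvider)) {
    if (env[envVar]) {
      runnable.push(provider);
    } else {
      skipped.push(provider);
    }
  }
  return { runnable, skipped };
}
```

A test file could then run the suites for the runnable providers and mark the skipped ones as skipped instead of failing them.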

🗓️ Roadmap

Phase 1: Enhanced Language Models (Completed)

  • [x] Dynamic Model Discovery - Auto-detect latest models from all providers
  • [x] Chinese AI Providers Integration:
    • [x] 百度文心一言 (Baidu ERNIE) - ERNIE-4.0, ERNIE-3.5 series
    • [x] 阿里通义千问 (Alibaba Qwen) - Qwen2.5, Qwen-Turbo, Qwen-Plus
    • [x] 字节跳动豆包 (ByteDance Doubao) - Doubao-pro, Doubao-lite series
    • [x] 讯飞星火 (iFLYTEK Spark) - Spark-4.0, Spark-3.5 models
    • [x] 腾讯混元 (Tencent Hunyuan) - Hunyuan-large, Hunyuan-pro
    • [x] 月之暗面 (Moonshot AI) - Moonshot-v1 series
    • [x] 百川大模型 (Baichuan AI) - Baichuan2 series
    • [x] 零一万物 (01.AI) - Yi-34B, Yi-6B series
    • [x] 阶跃星辰 (StepFun) - Step-1V, Step-2 models
  • [x] New Global Providers - xAI Grok, SiliconFlow, Coze
  • [x] Enhanced Embeddings - Voyage AI, improved embedding support

🖼️ Phase 2: Vision Models (Next)

  • [ ] Image Understanding - GPT-4V, Claude Vision, Gemini Vision
  • [ ] Image Generation - DALL-E, Midjourney, Stable Diffusion
  • [ ] OCR & Document AI - Advanced document processing capabilities
  • [ ] Visual Question Answering - Multi-modal reasoning

🎵 Phase 3: Audio Models (Future)

  • [ ] Speech Recognition - Whisper, Azure Speech, Google Speech-to-Text
  • [ ] Text-to-Speech - ElevenLabs, Azure TTS, OpenAI TTS
  • [ ] Audio Generation - Music generation, sound effects
  • [ ] Real-time Audio - Streaming audio processing

🎬 Phase 4: Video & Advanced AI (Future)

  • [ ] Video Understanding - Video analysis, captioning, content moderation
  • [ ] Video Generation - AI video creation and editing
  • [ ] Multi-modal Reasoning - Cross-modal understanding and generation
  • [ ] Specialized AI - Scientific computing, code generation, domain-specific models

🚀 Submit your feature requests and suggestions!

Contribute

Contributions to this project are welcome. Please fork the repository and submit a pull request with your changes or improvements.

Acknowledgments

This project is based on and extends the excellent llm-interface project. We thank the original authors for their foundational work.

License

This project is licensed under the MIT License - see the LICENSE file for details.

Author: chenxingqiang
GitHub: chenxingqiang
