chonkie

chonkie-inc · 2k · MIT · 0.3.0 · TypeScript support: included

🦛 CHONK your texts in TS with Chonkie! ✨ The no-nonsense lightweight and efficient chunking library.

chonkie, chunking, splitting, retrieval, vector-search, vector-database, vector-embedding, semantic-search, LLM, AI

🦛 chonkie-ts ✨

🦛 CHONK your texts in TypeScript with Chonkie!✨ The no-nonsense lightweight and efficient chunking library.

Installation • Usage • Chunkers • Acknowledgements • Citation

We built chonkie-ts while developing a TypeScript web app that needed fast, on-the-fly text chunking for RAG applications. After trying several existing libraries, we found them either too heavy or not flexible enough for our needs. chonkie-ts is a port of the original chonkie library, with added type safety and a few extra features to make it more useful for TypeScript developers!

🚀 Feature-rich: All the CHONKs you'd ever need
✨ Easy to use: Install, Import, CHONK
⚡ Fast: CHONK at the max speed of TypeScript! tssssooooooom
🪶 Light-weight: No bloat, just CHONK
🦛 Cute CHONK mascot: psst it's a pygmy hippo btw
❤️ Moto Moto's favorite TypeScript library

Chonkie is a chunking library that "just works" ✨

[!NOTE] This library is not a binding but a TypeScript port of the original chonkie library, which is written in Python. It is still under active development and has not yet reached feature parity with the original. Please bear with us! 🫂

📦 Installation

Simply install Chonkie using npm:

npm install chonkie

Chonkie believes in minimal default dependencies and maximum flexibility, so many of its dependencies are optional and you can opt out of them if you don't need them. For the minimal install, run:

npm install chonkie --omit=optional

Learn more about the optional dependencies in the DOCS.md file.

📚 Usage

Chonkie is a simple, easy-to-use library for chunking text in any project that needs it. The example below creates a TokenChunker with default options, chunks a string, and prints each chunk's text and token count.

import { TokenChunker } from 'chonkie';

async function main() {
  // Create a token chunker with default options
  const chunker = await TokenChunker.create();

  // Chunk a string
  const chunks = await chunker.chunk('Woah! Chonkie is such a great ts library!');

  // Print the chunks
  for (const chunk of chunks) {
    console.log(chunk.text);
    console.log(chunk.token_count);
  }
}

main();

More examples can be found in the DOCS or in the examples folder.
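In RAG pipelines, chunk output is often trimmed to fit a context budget. The following is a minimal sketch of that step, assuming chunks expose `text` and `token_count` as in the usage example above. The `ChunkLike` interface and the `takeWithinBudget` helper are hypothetical names introduced here for illustration and are not part of Chonkie.

```typescript
// Minimal chunk shape, assumed from the usage example (not imported from chonkie).
interface ChunkLike {
  text: string;
  token_count: number;
}

// Hypothetical helper: keep chunks in order until adding the next one
// would exceed the token budget.
function takeWithinBudget<T extends ChunkLike>(chunks: T[], maxTokens: number): T[] {
  const selected: T[] = [];
  let used = 0;
  for (const chunk of chunks) {
    if (used + chunk.token_count > maxTokens) break;
    selected.push(chunk);
    used += chunk.token_count;
  }
  return selected;
}
```

Because the helper only depends on the two fields it reads, it should work with any chunker whose chunks carry those fields.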

Chunkers

chonkie-ts is currently a work in progress and does not have feature parity with the original chonkie library yet. Here's an overview of the chunkers that are currently implemented:

| Name | Description |
| --- | --- |
| TokenChunker | Splits text into fixed-size token chunks. |
| SentenceChunker | Splits text into chunks based on sentences. |
| RecursiveChunker | Splits text hierarchically using customizable rules to create semantically meaningful chunks. |
| CodeChunker | Splits code into structurally meaningful chunks. |
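To make "fixed-size token chunks" concrete, here is a minimal, self-contained sketch of the idea. It is not Chonkie's implementation: whitespace-separated words stand in for real tokenizer output, and `SimpleChunk`, `chunkByTokens`, `chunkSize`, and `overlap` are hypothetical names chosen for this illustration.

```typescript
// Illustrative only: whitespace "tokens" stand in for a real tokenizer.
interface SimpleChunk {
  text: string;
  tokenCount: number;
}

function chunkByTokens(text: string, chunkSize: number, overlap = 0): SimpleChunk[] {
  if (chunkSize <= 0) throw new Error("chunkSize must be positive");
  if (overlap < 0 || overlap >= chunkSize) {
    throw new Error("overlap must be in [0, chunkSize)");
  }

  const tokens = text.split(/\s+/).filter((t) => t.length > 0);
  const chunks: SimpleChunk[] = [];
  const step = chunkSize - overlap; // how far the window advances each time

  for (let start = 0; start < tokens.length; start += step) {
    const slice = tokens.slice(start, start + chunkSize);
    chunks.push({ text: slice.join(" "), tokenCount: slice.length });
    // Stop once a window has reached the end of the input.
    if (start + chunkSize >= tokens.length) break;
  }
  return chunks;
}
```

With this sketch, every chunk except possibly the last holds exactly `chunkSize` tokens, and a non-zero `overlap` repeats the tail of one chunk at the head of the next.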

Contributing

Want to help grow Chonkie? Check out CONTRIBUTING.md to get started! Whether you're fixing bugs, adding features, improving docs, or simply leaving a ⭐️ on the repo, every contribution helps make Chonkie a better CHONK for everyone.

Remember: No contribution is too small for this tiny hippo!

Acknowledgements

Chonkie would like to CHONK its way through a special thanks to all the users and contributors who have helped make this library what it is today! Your feedback, issue reports, and improvements have helped make Chonkie the CHONKIEST it can be.

And of course, special thanks to Moto Moto for endorsing Chonkie with his famous quote:

"I like them big, I like them chonkie in TypeScript" ~ Moto Moto... definitely did not say this

Citation

If you use Chonkie in your research, please cite it as follows:

@software{chonkie2025,
  author = {Bhavnick Minhas and Shreyash Nigam},
  title = {Chonkie: A no-nonsense fast, lightweight, and efficient text chunking library},
  year = {2025},
  publisher = {GitHub},
  howpublished = {\url{https://github.com/chonkie-inc/chonkie}},
}

Changelog

All notable changes to this project will be documented in this file.

[0.3.0] - 22-05-2025

BREAKING CHANGES

Removed from Main Package Export

  • CodeChunker - Now only available via selective import:
    import { CodeChunker } from "chonkie/chunker/code";
  • CodeChunk - Now only available via selective import:
    import { CodeChunk } from "chonkie/types";
  • TreeSitterNode - Now only available via selective import:
    import { TreeSitterNode } from "chonkie/types";
  • ChromaHandshake - Now only available via selective import:
    import { ChromaHandshake } from "chonkie/friends";

Why This Change?

The main package export (import { ... } from "chonkie") was causing bundler resolution errors for users because it loaded ALL chunkers and integrations, pulling in their optional dependencies:

  • web-tree-sitter dependency from CodeChunker
  • chromadb dependency from ChromaHandshake

This caused build failures even when users only wanted RecursiveChunker or TokenChunker.

Migration Guide

Before (v0.2.x):

import { CodeChunker, RecursiveChunker, ChromaHandshake } from "chonkie";

After (v0.3.x):

import { RecursiveChunker } from "chonkie"; // Still works for common chunkers
import { CodeChunker } from "chonkie/chunker/code"; // Selective import required
import { ChromaHandshake } from "chonkie/friends"; // Selective import required
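If CodeChunker is only needed on some code paths, one option is to load the selective entry point lazily with a dynamic `import()`. The sketch below is an assumption about how that could look, not something the changelog prescribes. The `lazy` helper and the stand-in loader are hypothetical. Only the `"chonkie/chunker/code"` path comes from the migration guide, and it appears in a comment so the block runs without chonkie installed.

```typescript
// Hypothetical helper: wrap a loader so the module is fetched once, on first use.
function lazy<T>(loader: () => Promise<T>): () => Promise<T> {
  let cached: Promise<T> | undefined;
  return () => {
    if (cached === undefined) {
      cached = loader();
    }
    return cached;
  };
}

// Intended usage with the selective import path from the migration guide:
//
//   const loadCodeChunker = lazy(() => import("chonkie/chunker/code"));
//   const { CodeChunker } = await loadCodeChunker();

// Stand-in loader that only demonstrates the caching behaviour.
let loads = 0;
const loadFake = lazy(async () => {
  loads += 1;
  return { name: "fake-module" };
});
```

Note that this sketch also caches a rejected promise, so a failed load is not retried. Resetting `cached` on failure would change that.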

Added

  • New package exports for selective imports:

    // Individual chunker imports
    import { TokenChunker } from "chonkie/chunker/token";
    import { SentenceChunker } from "chonkie/chunker/sentence";
    import { RecursiveChunker } from "chonkie/chunker/recursive";
    import { CodeChunker } from "chonkie/chunker/code"; // Includes web-tree-sitter
    
    // Friends and utilities
    import { ChromaHandshake } from "chonkie/friends";
    
    // Types
    import { CodeChunk, TreeSitterNode } from "chonkie/types";

Improved

  • Significantly smaller bundles for users not using CodeChunker or ChromaHandshake
  • Better tree-shaking with explicit dependency loading
  • Bundler compatibility - No more resolution errors for optional dependencies
  • All examples updated to demonstrate selective import patterns

Still Available in Main Export

These remain available from the main package export for convenience:

// Common chunkers (no optional dependencies)
import { TokenChunker, SentenceChunker, RecursiveChunker } from "chonkie";

// Utilities
import { Tokenizer, Visualizer, Hubbie } from "chonkie";

// Basic types
import { Chunk, SentenceChunk, RecursiveChunk } from "chonkie";

[0.2.6] - Previous release

  • Enhanced export patterns for better tree-shaking
  • Explicit named exports instead of wildcard re-exports