Important: This documentation covers Yarn 1 (Classic).
For Yarn 2+ docs and migration guide, see yarnpkg.com.

Package detail

@zanreal/search

zanreal-labs296MIT1.0.0TypeScript support: included

A powerful TypeScript fuzzy search library with intelligent scoring, exact match prioritization, and automatic field detection for any object structure

search, fuzzy-search, typescript, full-text-search, autocomplete, filter, query, search, intelligent-search, levenshtein, text-matching

readme

Universal Search Engine

A powerful TypeScript fuzzy search library with intelligent scoring, exact match prioritization, and automatic field detection for any object structure.

Features

  • Universal Search: Works with any data structure without configuration
  • Intelligent Scoring: Prioritizes exact matches, then fuzzy matches with smart weighting
  • Automatic Field Detection: Automatically finds searchable string fields in your objects
  • Nested Object Support: Search through nested properties using dot notation
  • Customizable Weights: Define field importance with custom weights
  • Multiple Search Types: Exact start matches, exact contains, and fuzzy matching
  • TypeScript First: Full TypeScript support with comprehensive type definitions

Installation

npm install @zanreal/search

Or with other package managers:

# Yarn
yarn add @zanreal/search

# pnpm
pnpm add @zanreal/search

# Bun
bun add @zanreal/search

Quick Start

import { search, searchItems, quickSearch } from '@zanreal/search';

// Simple search - returns just the matching items
const data = [
  { name: 'John Doe', email: 'john@example.com' },
  { name: 'Jane Smith', email: 'jane@example.com' }
];

const results = quickSearch(data, 'john');
// Returns: [{ name: 'John Doe', email: 'john@example.com' }]

// Detailed search - returns items with scores and match details
const detailedResults = search(data, 'john');
// Returns: [{ item: {...}, score: 15.2, matches: [...] }]

💡 Want to see more examples? Check out our comprehensive examples collection with production-ready patterns for e-commerce, document search, user directories, and more! Run npm run examples for an interactive explorer.

API Reference

Core Functions

search<T>(data: T[], query: string, options?: SearchOptions): SearchResult<T>[]

The main search function that returns detailed results with scores and match information.

const results = search(users, 'john', {
  fields: ['name', 'email'],
  fieldWeights: { name: 5, email: 1 },
  fuzzyThreshold: 0.7,
  limit: 10
});

searchItems<T>(data: T[], query: string, options?: SearchOptions): T[]

Simplified search that returns just the matching items.

const items = searchItems(users, 'john', {
  fields: ['name', 'email']
});

quickSearch<T>(data: T[], query: string, fields?: string[]): T[]

Quick search with sensible defaults for most use cases.

const results = quickSearch(users, 'john', ['name', 'email']);

Factory Functions

createSearcher<T>(config: SearchOptions)

Create a reusable search function with predefined configuration.

const searchUsers = createSearcher<User>({
  fieldWeights: { name: 5, email: 2, bio: 1 },
  fuzzyThreshold: 0.8
});

const results = searchUsers(users, 'john');

createDocumentSearcher<T>()

Create a document searcher with common defaults.

const searchDocs = createDocumentSearcher<Document>();
const results = searchDocs(documents, 'typescript');

Search Options

interface SearchOptions {
  /** Fields to search in (auto-detected if not provided) */
  fields?: string[];
  /** Custom field weights (higher = more important) */
  fieldWeights?: Record<string, number>;
  /** Minimum similarity threshold for fuzzy matching (0-1) */
  fuzzyThreshold?: number;
  /** Minimum query length for fuzzy matching */
  minFuzzyLength?: number;
  /** Maximum number of results to return */
  limit?: number;
  /** Case sensitive search */
  caseSensitive?: boolean;
}

Search Results

interface SearchResult<T> {
  item: T;           // The matching item
  score: number;     // Relevance score
  matches: SearchMatch[]; // Detailed match information
}

interface SearchMatch {
  field: string;     // Which field matched
  value: string;     // The field value
  score: number;     // Match score for this field
  type: "exact-start" | "exact-contain" | "fuzzy";
  position?: number; // Position of match in string
}

Advanced Usage

const data = [
  {
    user: { name: 'John', profile: { title: 'Developer' } },
    company: { name: 'Tech Corp' }
  }
];

// Automatically searches nested fields: user.name, user.profile.title, company.name
const results = search(data, 'developer');

Custom Field Weights

const results = search(articles, 'typescript', {
  fieldWeights: {
    title: 10,        // Highest priority
    summary: 5,       // Medium priority
    content: 1        // Lower priority
  }
});

Field-Specific Configuration

const searchConfig = {
  fields: ['title', 'author.name', 'tags'],
  fieldWeights: { title: 8, 'author.name': 3, tags: 2 },
  fuzzyThreshold: 0.8,
  limit: 20
};

const searcher = createSearcher<Article>(searchConfig);
const results = searcher(articles, query);

Match Types & Scoring

  1. Exact Start Match (Highest Score): Query matches from the beginning of the field
  2. Exact Contains Match (High Score): Query found anywhere in the field
  3. Fuzzy Match (Lower Score): Similar strings based on Levenshtein distance

The library automatically adjusts scores based on:

  • Field importance (via weights)
  • Match position (earlier = better)
  • Field length (shorter fields get bonus)
  • String similarity (for fuzzy matches)

Default Options

export const DEFAULT_SEARCH_OPTIONS = {
  fieldWeights: {},
  fuzzyThreshold: 0.7,
  minFuzzyLength: 3,
  limit: 100,
  caseSensitive: false,
};

Development

This project uses Bun for fast TypeScript execution and development.

# Install dependencies
bun install

# Run TypeScript directly
bun run src/index.ts

Examples & Use Cases

The /examples directory contains comprehensive, production-ready examples for different domains and use cases:

🚀 Quick Start with Examples

Prerequisites: Make sure you've built the library first:

npm run build

Run any example instantly:

# Interactive explorer (recommended for beginners)
npm run examples

# Individual examples
npm run examples:basic          # Basic usage patterns
npm run examples:ecommerce      # E-commerce search
npm run examples:documents      # Document/CMS search
npm run examples:users          # People directory search
npm run examples:performance    # Performance benchmarks

# Validate all examples work correctly
npm run examples:validate

📖 Example Categories

Example Use Case Key Features Best For
Basic Usage Simple search operations Quick search, custom searcher, field weighting Getting started, simple applications
E-commerce Search Product catalogs Custom scoring, price awareness, category filtering Online stores, marketplaces
Document Search CMS/Blog search Nested objects, content analysis, metadata search Content sites, documentation
User Directory Employee/people search Fuzzy matching, skill matching, department search HR systems, social networks
Performance Demo Large datasets (1000+ items) Benchmarking, auto-complete, analytics High-scale applications

🛠️ Common Patterns

Quick Search Pattern

import { quickSearch } from '@zanreal/search';

// Fastest way to search - returns just the items
const results = quickSearch(data, query);

Detailed Search Pattern

import { search } from '@zanreal/search';

// Full control with scoring and configuration
const results = search(data, query, {
  fieldWeights: { title: 10, description: 5 },
  fuzzyThreshold: 0.7
});

Reusable Searcher Pattern

import { createSearcher } from '@zanreal/search';

// Create once, use many times for better performance
const searcher = createSearcher({
  fieldWeights: { name: 10, email: 5 },
  fuzzyThreshold: 0.7
});

const results = searcher(data, query);

⚖️ Field Weight Guidelines

Field Type Weight Range Use Case Example
Primary Identifiers 15-20 Names, titles, SKUs name: 20
Important Content 10-15 Descriptions, summaries description: 12
Secondary Content 5-10 Categories, tags category: 8
Searchable Text 3-5 Body content, reviews content: 4
Auxiliary Data 1-3 IDs, timestamps id: 1

⚡ Performance Tips

  1. Limit Results: Use the limit option for large datasets
  2. Pre-filter: Filter data before searching when possible
  3. Field Selection: Specify fields array to search only relevant fields
  4. Fuzzy Threshold: Adjust fuzzyThreshold based on your use case (0.7 is usually good)
  5. Caching: Cache searcher instances for repeated use with same configuration

🎛️ Configuration Presets

// Strict Matching (Exact matches preferred)
{ fuzzyThreshold: 0.9, minFuzzyLength: 5 }

// Loose Matching (Typo-tolerant)
{ fuzzyThreshold: 0.5, minFuzzyLength: 2 }

// Title-First Search (Prioritize titles/names)
{ fieldWeights: { title: 20, content: 1 } }

// Balanced Search (Equal importance)
{ fieldWeights: { title: 8, description: 5, content: 3 } }

For complete examples with sample data and detailed explanations, see the /examples directory.

💡 Usage Ideas & Real-World Applications

The Universal Search library can be applied to a wide variety of use cases. Here are some popular applications and implementation patterns:

🏪 E-commerce & Marketplace

Product Search with Business Logic

const productSearcher = createSearcher({
  fieldWeights: {
    name: 15,        // Product names most important
    brand: 12,       // Brand recognition
    category: 8,     // Category filtering
    tags: 10,        // Searchable attributes
    description: 3   // Supporting content
  },
  fuzzyThreshold: 0.7
});

// E-commerce with filtering
function ecommerceSearch(products, query, filters = {}) {
  let filteredData = products;

  if (filters.category) {
    filteredData = filteredData.filter(p => p.category === filters.category);
  }
  if (filters.maxPrice) {
    filteredData = filteredData.filter(p => p.price <= filters.maxPrice);
  }
  if (filters.inStock) {
    filteredData = filteredData.filter(p => p.inStock);
  }

  return productSearcher(filteredData, query);
}

Use Cases:

  • Product catalogs with thousands of items
  • Auto-complete search suggestions
  • Category and price filtering
  • Brand and feature-based discovery

👥 HR & People Management

const peopleSearcher = createSearcher({
  fieldWeights: {
    'profile.firstName': 10,
    'profile.lastName': 10,
    'employment.title': 15,
    'employment.department': 8,
    skills: 12,              // Converted from array to string
    bio: 5,
    'profile.email': 3
  },
  fuzzyThreshold: 0.6,       // Higher tolerance for name typos
  limit: 20
});

Use Cases:

  • Employee directories with skills matching
  • Department and team organization
  • Project collaboration discovery
  • Expertise location within organizations

📄 Content Management Systems

Document & Article Search

const documentSearcher = createSearcher({
  fieldWeights: {
    title: 20,               // Titles are crucial
    'metadata.summary': 12,  // Executive summaries
    'content.excerpt': 8,    // Article previews
    'metadata.tags': 10,     // Topic classification
    'author.name': 5,        // Author attribution
    'content.body': 3        // Full content (lower weight)
  }
});

// Content with quality scoring
function contentSearch(documents, query) {
  const results = documentSearcher(documents, query);

  // Enhance with content quality metrics
  return results.map(result => ({
    ...result,
    qualityScore: result.score * (result.item.readTime > 5 ? 1.2 : 1.0),
    freshness: calculateFreshness(result.item.publishDate)
  }));
}

Use Cases:

  • Blog and news site search
  • Documentation and knowledge bases
  • Academic paper repositories
  • Content recommendation systems

🏢 Business Applications

Customer Relationship Management

const customerSearcher = createSearcher({
  fieldWeights: {
    'company.name': 15,
    'contact.name': 12,
    'contact.email': 8,
    'details.industry': 10,
    'notes.content': 5
  }
});

Inventory Management

const inventorySearcher = createSearcher({
  fieldWeights: {
    sku: 20,                 // Exact SKU matching critical
    name: 15,
    category: 10,
    supplier: 8,
    location: 12
  },
  fuzzyThreshold: 0.8        // Stricter for inventory
});

🎯 Specialized Use Cases

Real-time Auto-complete

function autoComplete(dataset, query, limit = 5) {
  if (query.length < 2) return [];

  return search(dataset, query, {
    fieldWeights: { name: 15, brand: 12, category: 5 },
    fuzzyThreshold: 0.8,     // Stricter for autocomplete
    limit
  });
}

Multi-language Content

const multiLangSearcher = createSearcher({
  fieldWeights: {
    'title.en': 10,
    'title.es': 10,
    'title.fr': 10,
    'content.en': 5,
    'content.es': 5,
    'content.fr': 5
  }
});
const locationSearcher = createSearcher({
  fieldWeights: {
    name: 15,
    'address.city': 12,
    'address.state': 10,
    'address.country': 8,
    category: 6
  }
});

⚡ Performance Patterns

Large Dataset Optimization

// Pre-filter before search for better performance
const optimizedSearch = (data, query, category) => {
  const filtered = category 
    ? data.filter(item => item.category === category)
    : data;

  return search(filtered, query, { limit: 50 });
};

Caching for Repeated Searches

const searchCache = new Map();

function cachedSearch(data, query, options) {
  const cacheKey = `${query}-${JSON.stringify(options)}`;

  if (searchCache.has(cacheKey)) {
    return searchCache.get(cacheKey);
  }

  const results = search(data, query, options);
  searchCache.set(cacheKey, results);
  return results;
}

🔧 Advanced Patterns

Search with Analytics

function analyticsSearch(data, query, options = {}) {
  const startTime = Date.now();
  const results = search(data, query, options);
  const endTime = Date.now();

  const analytics = {
    query,
    resultCount: results.length,
    executionTime: endTime - startTime,
    avgScore: results.reduce((sum, r) => sum + r.score, 0) / results.length,
    topCategories: [...new Set(results.slice(0, 10).map(r => r.item.category))],
    timestamp: new Date().toISOString()
  };

  // Log analytics for monitoring
  console.log('Search Analytics:', analytics);

  return { results, analytics };
}

Progressive Search Enhancement

function progressiveSearch(data, query) {
  // Start with strict matching
  let results = search(data, query, { fuzzyThreshold: 0.9, limit: 10 });

  // If insufficient results, try fuzzy matching
  if (results.length < 5) {
    results = search(data, query, { fuzzyThreshold: 0.6, limit: 20 });
  }

  // If still insufficient, broaden field search
  if (results.length < 3) {
    results = search(data, query, { 
      fuzzyThreshold: 0.4, 
      limit: 30,
      fields: undefined // Search all fields
    });
  }

  return results;
}

🎨 Industry-Specific Examples

  • Real Estate: Property search with location, price, features
  • Healthcare: Patient records, medication databases
  • Education: Course catalogs, student directories
  • Legal: Case law, document discovery
  • Media: Asset management, content libraries
  • Finance: Transaction search, customer portfolios
  • Tourism: Hotel/restaurant discovery, activity search
  • Food & Beverage: Recipe databases, ingredient matching

Each of these patterns can be customized with appropriate field weights, fuzzy thresholds, and filtering logic to match your specific domain requirements.

TypeScript Support

The library is built with TypeScript and provides full type safety:

interface User {
  id: number;
  name: string;
  email: string;
}

// Full type inference and safety
const users: User[] = [...];
const results: SearchResult<User>[] = search(users, 'query');
const items: User[] = searchItems(users, 'query');

🔬 Benchmarks & Performance

The Universal Search library comes with a comprehensive benchmark suite to measure performance across different scenarios and data sizes. Our benchmarks help ensure the library performs well in real-world applications.

Quick Performance Overview

Data Size Average Time Operations/sec Use Case
100 items < 1ms > 1,000 Small apps, auto-complete
1K items < 5ms > 200 Medium catalogs
5K items < 25ms > 40 Large datasets
10K items < 50ms > 20 Enterprise scale

Running Benchmarks

Prerequisites: Make sure you've built the library first:

npm run build

Run all benchmarks:

# Complete benchmark suite with detailed analysis
npm run benchmark

# Quick performance check (ideal for CI/CD)
npm run benchmark:quick

# Individual benchmark types
npm run benchmark:main         # Core performance tests
npm run benchmark:comparative  # Compare different search functions
npm run benchmark:memory      # Memory usage and leak detection

Advanced benchmarking with garbage collection:

npm run benchmark:gc

Benchmark Types

🚀 Performance Benchmarks

Tests search speed across realistic scenarios:

  • Small Dataset (100 items): User directory search
  • Medium Dataset (1,000 items): E-commerce product search
  • Large Dataset (5,000 items): Document/content search
  • XL Dataset (10,000 items): Enterprise-scale stress test

Metrics measured:

  • Average response time (milliseconds)
  • Operations per second (higher is better)
  • Result accuracy and relevance
  • Memory usage during operations

⚖️ Comparative Benchmarks

Compares performance between different search functions:

// Functions tested in comparison
search()                    // Full results with detailed scoring
searchItems()              // Items only, no score details  
quickSearch()              // Simplified search with defaults
createSearcher()           // Pre-configured reusable searcher
createDocumentSearcher()   // Document-optimized searcher

Helps you choose the right function for your use case based on performance vs features trade-offs.

🧠 Memory Benchmarks

Analyzes memory efficiency and scalability:

  • Memory Usage: Tests consumption across data sizes (100 to 10,000 items)
  • Scalability Analysis: How performance scales with data size
  • Memory Leak Detection: Sustained search operations to detect leaks

Scalability ratings:

  • Excellent (> 0.8): Near-linear performance scaling
  • Good (0.6-0.8): Performance degrades slowly
  • Fair (0.4-0.6): Noticeable impact with scale
  • Poor (< 0.4): Significant degradation

🔍 Full Suite Analysis

The complete benchmark suite provides:

  • System Information: Environment and runtime details
  • Performance Overview: Comprehensive timing analysis
  • Memory Efficiency: Detailed memory usage patterns
  • Recommendations: Performance optimization suggestions

Real-World Performance Data

Our benchmarks use realistic data generators that simulate actual usage patterns:

// 1,000 users with names, emails, roles, skills
const results = search(users, 'john developer', {
  fieldWeights: { name: 10, role: 8, skills: 6 }
});
// Typical: ~2-4ms, 250-500 ops/sec
// 5,000 products with names, brands, categories, descriptions
const results = search(products, 'wireless headphones', {
  fieldWeights: { name: 15, brand: 12, category: 8 }
});
// Typical: ~8-15ms, 60-125 ops/sec
// 10,000 articles with titles, content, tags, metadata
const results = search(documents, 'typescript tutorial', {
  fieldWeights: { title: 20, tags: 10, content: 3 }
});
// Typical: ~20-40ms, 25-50 ops/sec

Performance Optimization Tips

Based on benchmark results, here are proven optimization strategies:

🎯 Field Selection

// ✅ Good: Specify relevant fields only
search(data, query, { fields: ['name', 'title', 'category'] })

// ❌ Avoid: Searching all fields unnecessarily  
search(data, query) // Auto-detects ALL string fields

⚡ Smart Limiting

// ✅ Good: Limit results for better performance
search(data, query, { limit: 20 })

// ❌ Avoid: Unlimited results on large datasets
search(largeData, query) // Returns up to 100 by default

🔄 Reusable Searchers

// ✅ Good: Create once, use many times
const searcher = createSearcher({ fieldWeights: {...} });
const results1 = searcher(data, 'query1');
const results2 = searcher(data, 'query2'); // Reuses config

// ❌ Avoid: Recreating configuration each time
search(data, 'query1', options);
search(data, 'query2', options); // Recreates config

🎚️ Threshold Tuning

// ✅ Good: Tune fuzzy threshold for your use case
search(data, query, { 
  fuzzyThreshold: 0.8  // Stricter for exact matching
  fuzzyThreshold: 0.6  // More forgiving for typos
})

Memory Efficiency Guidelines

Our memory benchmarks show these patterns:

Data Size Memory Usage Memory/Item Efficiency
100 items ~2-4 MB ~20-40 KB Excellent
1K items ~8-15 MB ~8-15 KB Good
5K items ~25-40 MB ~5-8 KB Good
10K items ~45-70 MB ~4.5-7 KB Fair

Memory leak detection results: < 5% memory growth over 100 iterations ✅

CI/CD Integration

Integrate benchmarks into your CI/CD pipeline:

# Example GitHub Actions step
- name: Run Performance Benchmarks
  run: |
    npm run build
    npm run benchmark:quick > benchmark-results.txt
    # Add custom logic to compare with baseline

Benchmark Data Generators

The benchmark suite includes realistic data generators:

  • generateUsers(count): User directory with profiles, skills, departments
  • generateProducts(count): E-commerce with brands, categories, pricing
  • generateDocuments(count): Articles with content, metadata, tags

For detailed benchmark documentation and advanced usage, see /benchmarks/README.md.

Performance Monitoring

Track performance in your application:

function monitoredSearch(data, query, options = {}) {
  const start = performance.now();
  const results = search(data, query, options);
  const duration = performance.now() - start;

  console.log(`Search completed in ${duration.toFixed(2)}ms`);
  console.log(`Found ${results.length} results`);

  return results;
}