tongwen-core

tongwentang · 97 · MIT · 4.1.1 · TypeScript support: included

A fast converter between Traditional Chinese and Simplified Chinese

tongwen, new tongwentang, simplified chinese, traditional chinese, converter

readme

TongWen Core converter

A fast converter between Traditional Chinese and Simplified Chinese and a helper DOM tree walker.

Installation

Install with npm:

npm install tongwen-core

Install with yarn:

yarn add tongwen-core

Examples and Usage

Note: Example scripts are all written in TypeScript.

An example of how to use the converter:

import { createConverterMap, createConverterObj, LangType, SrcPack } from 'tongwen-core';

const dics: SrcPack = { s2t: [{ 台湾: '台灣' }], t2s: [{ 台灣: '台湾' }] };
const mConv = createConverterMap(dics);
const oConv = createConverterObj(dics);
const result = [mConv.phrase(LangType.s2t, '台湾'), oConv.phrase(LangType.s2t, '台湾')];
console.log(result); // [ '台灣', '台灣' ]

The difference between createConverterMap and createConverterObj is that the former uses an ES Map and
the latter uses a plain Object as its internal data structure. Choose based on your environment,
but the Map version is highly recommended, since it can be up to 2.x times faster.
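
As a rough illustration of the two kinds of internal data structure mentioned above, the sketch below builds the same lookup table once as a Map and once as a plain object. It is only a hedged example: `source`, `mapTable`, `objTable`, `lookupMap` and `lookupObj` are hypothetical names, and nothing here is taken from tongwen-core's actual implementation.

```typescript
// Hypothetical illustration only; this is not tongwen-core's internal code.
type Dict = Record<string, string>;

const source: Dict = { 台湾: '台灣', 软件: '軟體' };

// Map-backed table.
const mapTable = new Map<string, string>();
// Object-backed table, created without a prototype so that keys such as
// "constructor" cannot collide with inherited properties.
const objTable: Dict = Object.create(null);

for (const key of Object.keys(source)) {
  const value = source[key];
  if (value !== undefined) {
    mapTable.set(key, value);
    objTable[key] = value;
  }
}

// Both lookups fall back to the input when the key is unknown.
const lookupMap = (key: string): string => mapTable.get(key) ?? key;
const lookupObj = (key: string): string => {
  const hit = objTable[key];
  return hit === undefined ? key : hit;
};

console.log(lookupMap('台湾'), lookupObj('台湾')); // 台灣 台灣
console.log(lookupMap('constructor'), lookupObj('constructor')); // constructor constructor
```

Both tables give the same answers; any speed difference depends on the JavaScript engine and the dictionary size, so it is worth measuring in the target environment.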

Note: You must provide dictionaries when creating a converter; there are no default dictionaries.

Here is an example of using the converter and walker in a web page.

import { createConverterMap, LangType, SrcPack, walkNode } from 'tongwen-core';

const dics: SrcPack = { s2t: [{ 台湾: '台灣' }], t2s: [{ 台灣: '台湾' }] };
const mConv = createConverterMap(dics);
const parseds = walkNode(document);

parseds; // parsed result as an array

The walker can be customized by passing custom function(s):

import { walkNode } from 'tongwen-core';

// customize by passing custom function(s)
const parseds = walkNode(document, { isRejectNode: node => false });

parseds; // parsed result as an array

Dictionaries

The dictionaries included in this project are only for testing. You can use them, but this is not recommended, since they were built for the v1.5 New TongWenTang Core algorithm. We plan to release an independent repository in the future.

API and Types

For converter:

// The source dictionaries collection
type SrcPack = {
  s2t: Record<string, string>[];
  t2s: Record<string, string>[];
};
const dics: SrcPack = { s2t: [{ 台湾: '台灣' }], t2s: [{ 台灣: '台湾' }] };

// Converter type
type Converter = {
  set: (src: SrcPack) => undefined;
  char: (type: LangType, text: string) => string;
  phrase: (type: LangType, text: string) => string;
};
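
To make the shape of the Converter type more concrete, here is a minimal sketch of an object that satisfies it. Everything in it is an assumption made for illustration: `createSketchConverter`, `buildTable` and `buildTables` are hypothetical names, the enum values of the local `LangType` stand-in are guesses, and the greedy longest-match strategy is a common textbook approach, not necessarily the algorithm tongwen-core uses.

```typescript
// Local stand-ins for the types listed above; a real project would import
// them from 'tongwen-core'. The enum values here are assumptions.
enum LangType { s2t = 's2t', t2s = 't2s' }
type Dict = Record<string, string>;
type SrcPack = { s2t: Dict[]; t2s: Dict[] };
type Converter = {
  set: (src: SrcPack) => undefined;
  char: (type: LangType, text: string) => string;
  phrase: (type: LangType, text: string) => string;
};
type Table = { map: Map<string, string>; longest: number };

// Merge a list of dictionaries into one lookup table; later entries win.
const buildTable = (dics: Dict[]): Table => {
  const map = new Map<string, string>();
  let longest = 0;
  for (const dic of dics) {
    for (const key of Object.keys(dic)) {
      const value = dic[key];
      if (value === undefined) continue;
      map.set(key, value);
      longest = Math.max(longest, key.length);
    }
  }
  return { map, longest };
};

const buildTables = (src: SrcPack): Record<LangType, Table> => ({
  [LangType.s2t]: buildTable(src.s2t),
  [LangType.t2s]: buildTable(src.t2s),
});

// Hypothetical factory, not part of tongwen-core.
const createSketchConverter = (src: SrcPack): Converter => {
  let tables = buildTables(src);
  return {
    // Swap in a new set of dictionaries.
    set: next => {
      tables = buildTables(next);
      return undefined;
    },
    // Replace each character independently.
    char: (type, text) =>
      Array.from(text)
        .map(ch => tables[type].map.get(ch) ?? ch)
        .join(''),
    // Greedy longest match: at each position try the longest key first.
    phrase: (type, text) => {
      const { map, longest } = tables[type];
      let out = '';
      let i = 0;
      while (i < text.length) {
        let piece = text.charAt(i);
        let step = 1;
        for (let len = Math.min(longest, text.length - i); len > 0; len--) {
          const hit = map.get(text.slice(i, i + len));
          if (hit !== undefined) {
            piece = hit;
            step = len;
            break;
          }
        }
        out += piece;
        i += step;
      }
      return out;
    },
  };
};

const conv = createSketchConverter({
  s2t: [{ 湾: '灣' }, { 台湾: '台灣', 软件: '軟體' }],
  t2s: [{ 台灣: '台湾' }],
});
console.log(conv.phrase(LangType.s2t, '台湾的软件')); // 台灣的軟體
console.log(conv.char(LangType.s2t, '港湾')); // 港灣
```

In this sketch, set replaces the dictionaries of an existing converter, so calling it with empty arrays makes later conversions return the input unchanged.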

For walker:

// ParsedResult
interface ParsedTextNode {
  type: 'TEXT';
  node: Node;
  text: string;
}

interface ParsedElementNode {
  type: 'ELEMENT';
  node: Element;
  attr: string;
  text: string;
}

type ParsedResult = ParsedTextNode | ParsedElementNode;

// WalkNode
type WalkNode = (node: Node, anf?: Partial<AcceptNodeFn>) => ParsedResult[];

interface AcceptNodeFn {
  hasTargetContent: (text: string | null) => boolean;
  isRejectNode: (node: Node) => boolean;
  isEditableElement: (elm: Element) => boolean;
  hasTargetAttributes: (elm: Element) => boolean;
  parseTextNode: (node: Node) => ParsedTextNode;
  parseElementNode: (elm: Element) => ParsedElementNode[];
}
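
Since the walker only returns parsed nodes, the caller has to write converted text back. The sketch below shows one possible way to do that, under two assumptions read off the type shapes above: that `attr` is the name of the attribute the text came from, and that a text node is updated through `nodeValue`. `applyConversion`, `TextLike` and `ElementLike` are hypothetical names, and the structural stand-ins replace the DOM `Node` and `Element` types so the example also runs outside a browser.

```typescript
// Minimal structural stand-ins for DOM Node/Element (assumptions for this sketch).
interface TextLike {
  nodeValue: string | null;
}
interface ElementLike {
  setAttribute(name: string, value: string): void;
}

type ParsedText = { type: 'TEXT'; node: TextLike; text: string };
type ParsedElement = { type: 'ELEMENT'; node: ElementLike; attr: string; text: string };
type Parsed = ParsedText | ParsedElement;

// Convert every parsed entry and write the result back to its node.
// Returns how many entries actually changed.
const applyConversion = (parseds: Parsed[], convert: (text: string) => string): number => {
  let changed = 0;
  for (const parsed of parseds) {
    const next = convert(parsed.text);
    if (next === parsed.text) continue; // nothing to update
    if (parsed.type === 'TEXT') {
      parsed.node.nodeValue = next;
    } else {
      parsed.node.setAttribute(parsed.attr, next);
    }
    changed += 1;
  }
  return changed;
};

// Usage with fake nodes; in a browser these would come from walkNode(document).
const fakeText: TextLike = { nodeValue: '台湾' };
const fakeAttrs: Record<string, string> = { title: '台湾' };
const fakeElement: ElementLike = {
  setAttribute: (name, value) => {
    fakeAttrs[name] = value;
  },
};
const toTraditional = (text: string): string => text.split('台湾').join('台灣');

const changedCount = applyConversion(
  [
    { type: 'TEXT', node: fakeText, text: '台湾' },
    { type: 'ELEMENT', node: fakeElement, attr: 'title', text: '台湾' },
    { type: 'TEXT', node: { nodeValue: 'hello' }, text: 'hello' },
  ],
  toTraditional,
);
console.log(changedCount, fakeText.nodeValue, fakeAttrs.title); // 2 台灣 台灣
```

In real use, the `convert` callback would typically wrap a converter call such as `text => mConv.phrase(LangType.s2t, text)`.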

For more detail, please check the source code.

Recommended for development

  • Editor: Visual Studio Code
    • For best TypeScript support
    • Packages: prettier - code formatter, TypeScript Toolbox
  • Environment
    • node
    • yarn
  • npm scripts:
    • test: test for any TypeScript error

Story

TongWenCore and TongWenParser are derived from the core converter of the New Tongwentang extension (version 1.5), a browser extension developed by softcup that converts characters between Traditional Chinese and Simplified Chinese.

TongWenCore and TongWenParser were extracted from the extension into an independent repository and completely rewritten in TypeScript to make them more robust.

TongWenCore converts faster than the New Tongwentang Core (about 3.x times faster in certain test cases). The conversion algorithm has been redesigned; the idea originally came from cookwu and was implemented in TypeScript by t7yang.

License

MIT

changelog

Changelog

All notable changes to this project will be documented in this file. See standard-version for commit guidelines.

4.1.1 (2021-07-12)

[4.1.0] - 2021-04-28

Fixed

  • Ensure all contenteditable nodes are skipped or rejected. (#7)

Changed

  • Introduce IsTargetTextNode to replace HasTargetContent in AcceptNodeFn.

[4.0.1] - 2021-04-23

Fixed

  • Add attr property to ParsedElementNode.

[4.0.0] - 2021-04-23 (Deprecated)

Added

  • Introduce a new walkNode node parser, which resolved many issues in tongwentang/tongwentang-extension.

Fixed

  • Export all utility functions and constants used with walkNode, which can help developers customize walkNode behavior. Resolved #6.

Deprecated

  • The previous parser walker is now deprecated and will be removed in the next major version (v5.0.0), along with its functions and constants.

[3.2.5] - 2020-12-24

Security

  • Update deps for security alerts

Changed

  • Export LangType as enum instead of const enum due to isolatedModules: true.

[3.2.4] - 2020-09-06

Security

  • Update deps for security alerts

[3.2.3]

Security

  • Update deps for security alerts

[3.2.2]

Security

  • Update dependencies.

[3.2.1]

Fixed

  • Update homepage and repo's url link.

Security

  • Update dependencies for security alerts.

[3.2.0]

Fixed

  • Readme typo, createConveter_ => createConverter_
  • extractAttrText now filters by hasChinese to make sure no parsed node has empty text.

Changed

  • Rename hasTargetAttr(s) to isTargetAttr(s) for clearer semantics.

Added

  • Add a reject guard to acceptNode and walker; unneeded nodes are now rejected by node.nodeName, even for the tree walker root.

[3.1.0]

Changed

  • Move node parsing into an independent function.
  • Remove the unneeded document node from the node parsing function.

Added

  • export all helper functions in walker.

[3.0.2]

Fixed

  • Export ParsedNode type.

[3.0.1]

Added

  • Add converter and converter creator types.

[3.0.0]

The converter in version 3 is completely rewritten: the class pattern is replaced by the module pattern, for the Parser as well. The Parser is replaced by the walker, which no longer updates node values and only returns the parsed nodes. For more detail, please check the latest README.

[2.x.x]

  • Please check here.