Important: This documentation covers Yarn 1 (Classic).
For Yarn 2+ docs and migration guide, see yarnpkg.com.

Package detail

greek-text-utils

mkarajohn24MIT1.0.1TypeScript support: included

A collection of greek text utils, including conversion to ISO 843/ELOT 743, Greeklish, Phonetic Latin, and Transliterated Latin

greek, text, utils, iso843, elot743, greeklish, phonetic, transliterated

readme

Greek Text Utils

NPM

A TypeScript library for Greek text processing.

  • Convert between Greek and Greeklish
  • Convert Greek to phonetic or transliterated Latin
  • Convert Greek to ISO 843 / ELOT 743 (Type 1 and Type 2)
  • Sanitize polytonic/monotonic diacritics
  • Remove Greek stopwords (ancient and modern)

Installation

npm install greek-text-utils

or with pnpm:

pnpm add greek-text-utils

Usage

ESM

import greekUtils from 'greek-text-utils';

CommonJS

const greekUtils = require('greek-text-utils');

Methods

toISO843Type1(text)

// Context-insensitive, letter-for-letter (except AI/EI/OI/OU)

greekUtils.toISO843Type1('αυτός'); // 'autos'
greekUtils.toISO843Type1('μπάλα'); // 'mpala'
greekUtils.toISO843Type1('κομπιούτερ'); // 'kompiouter'

When to use Type 1

  • Library/catalog systems
  • Databases where reversibility is preferred
  • Academic/bibliographic references

fromISO843Type1(text)

Reconstruct Greek text from Type 1 Latin. This follows the standard's re‑transliteration principles and applies final sigma (ς) at word ends.

// Basic reverse examples
greekUtils.fromISO843Type1('Athina'); // 'Αθηνα' (note: no accent)
greekUtils.fromISO843Type1('autos'); // 'αυτος'
greekUtils.fromISO843Type1('mpampas'); // 'μπαμπας'
greekUtils.fromISO843Type1('theos'); // 'θεος' (final ς)

// Round-trip stability at the Latin level (accents are not restored):
const t1 = greekUtils.toISO843Type1('Αθήνα'); // 'Athina'
const back = greekUtils.fromISO843Type1(t1); // 'Αθηνα'
greekUtils.toISO843Type1(back) === t1; // true

Notes:

  • Type 1 is reversible at the Latin level. Diacritics are not restored when converting back to Greek.
  • Some Greek vowels collapse in Type 1 (e.g., η/ι/υ → i). Reverse chooses a consistent Greek base letter.
  • There is no reverse for Type 2 (transcription) because it is pronunciation-based and not strictly reversible.

toISO843Type2(text)

// Context-sensitive rules for αυ/ευ/ηυ and μπ
// Matches the official government converter

greekUtils.toISO843Type2('αυτός'); // 'aftos'
greekUtils.toISO843Type2('αυγό'); // 'avgo'
greekUtils.toISO843Type2('μπάλα'); // 'bala' (μπ at beginning)
greekUtils.toISO843Type2('λάμπα'); // 'lampa' (μπ in middle)
greekUtils.toISO843Type2('ΘΕΟΣ'); // 'THEOS' (all caps)
greekUtils.toISO843Type2('Θεσσαλονίκη'); // 'Thessaloniki'

// Backward-compatible alias (Type 2)
greekUtils.toISO843('ΕΥΡΩΠΗ'); // 'EVROPI'

When to use Type 2

  • Government documents (passports/IDs)
  • Toponyms on signs and maps
  • Any case where Greek pronunciation guidance is desired

sanitizeDiacritics(text, ignoreCharacters?)

Convert Greek text with polytonic/monotonic diacritics to their simple equivalents.

greekUtils.sanitizeDiacritics('Ἀθήνα Ἑλλάς άέήίόύώ ϊΐ ϋΰ');
// 'Αθηνα Ελλας αεηιουω ϊΐ ϋΰ'

toGreek(text, ignoreCharacters?)

Convert a Latin (greeklish) representation to modern Greek.

greekUtils.toGreek('kalhmera, pws eiste?'); // 'καλημερα, πωσ ειστε?'
greekUtils.toGreek('kalhmera, pws eiste?', '?p'); // 'καλημερα, pωσ ειστε?'

Note: current mapping does not apply final sigma automatically.

toGreeklish(text, ignoreCharacters?)

Convert modern Greek to Greeklish.

greekUtils.toGreeklish('καλημερα, πως ειστε;'); // 'kalhmera, pws eiste;'
greekUtils.toGreeklish('Το κάλλος είναι η καλύτερη συστατική επιστολή');
// 'To kάllos eίnai h kalύterh systatikή epistolή'

greekUtils.toGreeklish('καλημερα, πως ειστε;', ';λ'); // 'kaλhmera, pws eiste;'

toPhoneticLatin(text, ignoreCharacters?)

Convert to phonetic Latin (sound mapping). Accents may be preserved to indicate stress.

greekUtils.toPhoneticLatin('αυτός'); // 'aftós'
greekUtils.toPhoneticLatin('ευαγγέλιο'); // 'evaggélio'
greekUtils.toPhoneticLatin('Εύηχο: αυτό που ακούγεται ωραία.');
// 'Évikho: aftó pou akoúyete oréa.'

greekUtils.toPhoneticLatin('Εύηχο: αυτό που ακούγεται ωραία.', 'χ');
// 'Éviχo: aftó pou akoúyete oréa.'

toTransliteratedLatin(text, ignoreCharacters?)

Convert to transliterated Latin (letter mapping) using macrons/diacritics when appropriate.

greekUtils.toTransliteratedLatin('ψυχή'); // 'psukhḗ'
greekUtils.toTransliteratedLatin('Εύηχο: αυτό που ακούγεται ωραία.');
// 'Eúēkho: autó pou akoúgetai ōraía.'

greekUtils.toTransliteratedLatin('Εύηχο: αυτό που ακούγεται ωραία.', 'αύ');
// 'Eύēkho: αutó pou αkoύgetαi ōrαíα.'

removeStopWords(text, shouldRemoveMultipleWhiteSpaces = false)

Remove Greek stopwords (ancient and modern). Optionally collapse multiple spaces.

greekUtils.removeStopWords('Αυτή είναι μια απλή πρόταση ...', false);
// 'μια απλή πρόταση,  δείχνει  αφαίρεση όλων  stopwords  αρχαίας  νέας Ελληνικής γλώσσας  επιστρέφει  καθαρό κείμενο.'

greekUtils.removeStopWords('Αυτή είναι μια απλή πρόταση ...', true);
// 'μια απλή πρόταση, δείχνει αφαίρεση όλων stopwords αρχαίας νέας Ελληνικής γλώσσας επιστρέφει καθαρό κείμενο.'

API Summary

  • toISO843Type1(text: string): string
  • fromISO843Type1(text: string): string
  • toISO843Type2(text: string): string
  • toISO843(text: string): string - (alias of Type 2)
  • toGreek(text: string, ignoreCharacters?: string): string
  • toGreeklish(text: string, ignoreCharacters?: string): string
  • toPhoneticLatin(text: string, ignoreCharacters?: string): string
  • toTransliteratedLatin(text: string, ignoreCharacters?: string): string
  • sanitizeDiacritics(text: string, ignoreCharacters?: string): string
  • removeStopWords(text: string, shouldRemoveMultipleWhiteSpaces?: boolean): string

Notes on ISO 843 Compliance

  • This library implements both systems defined in the official standard.
  • Type 2 matches the Greek government converter: αυ/ευ/ηυ → av/af, ev/ef, iv/if depending on following letter; μπ → b at word edges, mp in middle; all-caps digraphs converted (TH/CH/PS/NCH/NX).
  • Type 1 follows the standard’s transliteration tables; AU/EU/OU are the only fixed vowel digraphs.

Spec references for ISO → Greek (Type 1 reverse)

  • Re‑transliteration definition: ISO 843 / ELOT 743 Appendix A, A.2.1.3 (Επαναμεταγραμματισμός) — reverse of transliteration rules.
  • Final sigma rule (σ ↔ ς): Table 1 notes (e.g., note about how Latin 's' maps back to Greek σ/ς depending on word position).
  • Fixed vowel digraphs in Transliteration (Type 1): Table 1 Note 1 (AU/EU/OU), enabling unambiguous reverse mapping.
  • Full text of the standard: ELOT 743 / ISO 843 PDF.

Acknowledgments

This library is based on and inspired by the original "greek-utils" project. See vbarzokas/greek-utils.

License

MIT