Unicode-aware string truncation: given a maximum byte size, truncates the string to that size or just below it, without splitting a character.
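A minimal sketch of how such a truncation could work, assuming the byte size is measured in UTF-8. The names `utf8Length` and `truncateToBytes` are hypothetical and not taken from the original. This version keeps surrogate pairs intact but makes no attempt to keep combining marks attached to their base character.

```typescript
// Hypothetical helper: number of UTF-8 bytes needed to encode one code point.
// A lone surrogate is counted as 3 bytes (the size of U+FFFD, its usual replacement).
function utf8Length(codePoint: number): number {
  if (codePoint < 0x80) return 1;
  if (codePoint < 0x800) return 2;
  if (codePoint < 0x10000) return 3;
  return 4;
}

// Hypothetical sketch: cut `input` so that its UTF-8 encoding is at most
// `maxBytes` long, never cutting between the two halves of a surrogate pair.
function truncateToBytes(input: string, maxBytes: number): string {
  let bytes = 0;
  let i = 0;
  while (i < input.length) {
    const first = input.charCodeAt(i);
    let units = 1; // UTF-16 code units consumed by this character
    let size = utf8Length(first);
    // A high surrogate followed by a low surrogate is one astral code point (4 bytes).
    if (first >= 0xd800 && first <= 0xdbff && i + 1 < input.length) {
      const second = input.charCodeAt(i + 1);
      if (second >= 0xdc00 && second <= 0xdfff) {
        units = 2;
        size = 4;
      }
    }
    if (bytes + size > maxBytes) break; // the next character would not fit
    bytes += size;
    i += units;
  }
  return input.slice(0, i);
}
```

With a budget of 4 bytes, the input "a" + U+1F4A9 + "b" is cut down to "a", because the astral character needs 4 bytes on its own and only 3 remain. This is the "just below that size" case.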
Tells whether a given character is part of an astral character, specifically a high or low surrogate.
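One way such a check could be written, assuming the "character" is a single UTF-16 code unit passed as a string. The function names are hypothetical. The ranges are the standard UTF-16 surrogate ranges: U+D800 to U+DBFF for high surrogates and U+DC00 to U+DFFF for low surrogates.

```typescript
// Hypothetical sketch: true if the first code unit of `ch` is a high (leading) surrogate.
function isHighSurrogate(ch: string): boolean {
  const code = ch.charCodeAt(0); // NaN for an empty string, so both comparisons fail
  return code >= 0xd800 && code <= 0xdbff;
}

// Hypothetical sketch: true if the first code unit of `ch` is a low (trailing) surrogate.
function isLowSurrogate(ch: string): boolean {
  const code = ch.charCodeAt(0);
  return code >= 0xdc00 && code <= 0xdfff;
}

// True if `ch` is either half of the surrogate pair that encodes an astral character.
function isSurrogate(ch: string): boolean {
  return isHighSurrogate(ch) || isLowSurrogate(ch);
}
```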
A regular expression that matches all the surrogate pairs and combining-marked characters in a string.
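A rough sketch of what such an expression might look like. It assumes that "combining marks" means only the Combining Diacritical Marks block (U+0300 to U+036F), which is a simplification; the original expression may well cover more ranges. The names `multiUnitSymbol` and `findMultiUnitSymbols` are hypothetical.

```typescript
// Hypothetical pattern with two alternatives:
//   1. a base (a surrogate pair or any non-surrogate code unit) followed by
//      one or more combining marks from U+0300..U+036F
//   2. a surrogate pair on its own
const multiUnitSymbol =
  /(?:[\uD800-\uDBFF][\uDC00-\uDFFF]|[^\uD800-\uDFFF])[\u0300-\u036F]+|[\uD800-\uDBFF][\uDC00-\uDFFF]/g;

// Hypothetical helper: every match in `input`, or an empty array if there is none.
function findMultiUnitSymbols(input: string): string[] {
  return input.match(multiUnitSymbol) || [];
}
```

Plain ASCII text produces no matches, since none of its characters is a surrogate pair or carries a combining mark.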