Remove special characters from a string except emojis [closed]

Your regex removes emojis because it only allows ASCII letters and numbers:

/[^a-zA-Z0-9\s]/g

Emojis are Unicode characters, so they don’t match that range and get removed.
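A quick way to see the problem is to run the original pattern on a string that contains an emoji. This is only an illustrative sketch; the sample input and variable names are made up for the example.

```javascript
// Sample input: ASCII text, punctuation, and one emoji (U+1F60E).
const input = 'Ship it!\u{1F60E}';

// The ASCII-only whitelist removes everything that is not a-z, A-Z, 0-9 or whitespace.
// The emoji is outside those ranges, so it is removed along with the "!".
const asciiOnly = input.replace(/[^a-zA-Z0-9\s]/g, '');

console.log(asciiOnly); // → Ship it
```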

If you're using modern JavaScript (ES2018+), you can use Unicode property escapes to keep letters, numbers, spaces, and emojis:

stringWithEmoji.replace( /[^\p{L}\p{N}\s\p{Emoji}\u200D]/gu, '' );

\p{L} → any letter (all languages)

\p{N} → any number

\s → whitespace

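\p{Emoji} → emoji characters (this property also covers the digits 0-9, # and *, so # and * are not stripped)

\u200D → zero-width joiner (required for joined emoji sequences such as 👨‍👩‍👧; a single emoji like 😎 does not need it)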

u flag → enables Unicode mode

g flag → replace globally
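Putting the pieces above together, a small helper might look like the sketch below. The function names `stripSpecial` and `stripSpecialStrict` are hypothetical. The second pattern is an assumption on my part rather than part of the original answer: it swaps `\p{Emoji}` for `\p{Extended_Pictographic}` and also allows the variation selector U+FE0F. It is stricter, so check it against the emoji you expect, flags and skin-tone modifiers in particular.

```javascript
// Pattern from the answer: keep letters, numbers, whitespace, emoji and the zero-width joiner.
const keepPattern = /[^\p{L}\p{N}\s\p{Emoji}\u200D]/gu;

function stripSpecial(text) {
  return text.replace(keepPattern, '');
}

// Assumed alternative: \p{Extended_Pictographic} does not include "#", "*" or the digits.
const strictPattern = /[^\p{L}\p{N}\s\p{Extended_Pictographic}\u200D\uFE0F]/gu;

function stripSpecialStrict(text) {
  return text.replace(strictPattern, '');
}

// Accented letters and the emoji survive; "," and "!" are removed.
console.log(stripSpecial('H\u00E9llo, w\u00F6rld!\u{1F60E}'));

// Caveat: "#" and "*" have the Emoji property, so the first pattern keeps them.
console.log(stripSpecial('#tag *bold*'));       // → #tag *bold*
console.log(stripSpecialStrict('#tag *bold*')); // → tag bold
```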

Unicode property escapes require ES2018+ support.
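If you are not sure whether the target environment supports property escapes, one option is to probe for them at runtime. The sketch below is an assumption about how you might wire that up; `supportsUnicodePropertyEscapes` and `buildStripPattern` are hypothetical names. The pattern is built with the `RegExp` constructor because a regex literal containing `\p{…}` with the `u` flag would be a syntax error at parse time in an engine that lacks support.

```javascript
// Returns true when the engine accepts \p{...} in Unicode mode.
function supportsUnicodePropertyEscapes() {
  try {
    new RegExp('\\p{L}', 'u');
    return true;
  } catch (err) {
    // Older engines throw a SyntaxError for the unknown escape.
    return false;
  }
}

// Pick the modern pattern when possible, otherwise an ASCII + emoji-range fallback.
function buildStripPattern() {
  if (supportsUnicodePropertyEscapes()) {
    return new RegExp('[^\\p{L}\\p{N}\\s\\p{Emoji}\\u200D]', 'gu');
  }
  return /[^a-zA-Z0-9\s\u203C-\u3299\uD83C-\uDBFF\uDC00-\uDFFF]/g;
}

const cleaned = 'a!\u{1F60E}'.replace(buildStripPattern(), '');
console.log(cleaned);
```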

If you can’t use Unicode property escapes (\p{…}) because the environment doesn’t support ES2018+, you need to manually allow common emoji Unicode ranges.

stringWithEmoji.replace( /[^a-zA-Z0-9\s\u203C-\u3299\uD83C-\uDBFF\uDC00-\uDFFF]/g, '' );

Most emojis sit above U+FFFF, so JavaScript strings store them as surrogate pairs, two UTF-16 code units each:

\uD83C-\uDBFF → high (first) surrogates

\uDC00-\uDFFF → low (second) surrogates

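This fallback is not perfect. The \u203C-\u3299 range also keeps many non-emoji characters (arrows, math operators, CJK punctuation, kana), a-zA-Z0-9 drops accented and non-Latin letters, and because the pattern works on individual code units, a character between U+10000 and U+1EFFF loses its high surrogate and leaves a stray low surrogate behind.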
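To make the surrogate ranges concrete, the sketch below splits one emoji into its two UTF-16 code units and then runs the fallback pattern on a mixed string. The sample strings are made up for illustration.

```javascript
// U+1F60E is stored as two UTF-16 code units: a high and a low surrogate.
const sunglasses = '\u{1F60E}';
const units = [sunglasses.charCodeAt(0), sunglasses.charCodeAt(1)]
  .map(function (unit) { return unit.toString(16); });

console.log(sunglasses.length); // → 2
console.log(units);             // → [ 'd83d', 'de0e' ]

// The fallback pattern from the answer, applied to "café → ok!" plus the emoji.
const fallback = /[^a-zA-Z0-9\s\u203C-\u3299\uD83C-\uDBFF\uDC00-\uDFFF]/g;
const sample = 'caf\u00E9 \u2192 ok!\u{1F60E}';
const result = sample.replace(fallback, '');

// The emoji survives, but so does the arrow (U+2192), while "é" is stripped.
console.log(result === 'caf \u2192 ok\u{1F60E}'); // → true
```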

Emoji are complex:

Some use multiple code points

Some use variation selectors

Some use zero-width joiners
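The three cases above are easy to inspect by listing the code points of a few emoji. This is a small sketch; `codePoints` is a hypothetical helper. The last lines show one practical consequence: the variation selector U+FE0F does not have the Emoji property, so a pattern built on `\p{Emoji}` strips it unless you allow it explicitly.

```javascript
// List the code points of a string as "U+XXXX" labels.
function codePoints(text) {
  return Array.from(text, function (ch) {
    return 'U+' + ch.codePointAt(0).toString(16).toUpperCase();
  });
}

// Family emoji: three emoji joined by two zero-width joiners (U+200D).
const family = '\u{1F468}\u200D\u{1F469}\u200D\u{1F467}';
console.log(codePoints(family)); // five code points
console.log(family.length);      // → 8 (UTF-16 code units)

// Red heart: base character U+2764 followed by the variation selector U+FE0F.
const heart = '\u2764\uFE0F';
console.log(codePoints(heart)); // → [ 'U+2764', 'U+FE0F' ]

// The \p{Emoji} pattern keeps U+2764 but removes U+FE0F.
const strippedHeart = heart.replace(/[^\p{L}\p{N}\s\p{Emoji}\u200D]/gu, '');
console.log(codePoints(strippedHeart)); // → [ 'U+2764' ]
```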

If you need fully correct emoji handling, the proper solution is to use:

Unicode property escapes (\p{Emoji}) in modern JS

A library like emoji-regex

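const emojiRegex = require('emoji-regex');
const emoji = emojiRegex();
// The emoji pattern matches whole sequences, so it cannot go inside a character class.
// Match either an emoji or an allowed character, then keep only the matches.
const keep = new RegExp(`${emoji.source}|[\\w\\s]`, emoji.flags);
const cleaned = (stringWithEmoji.match(keep) || []).join('');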