I am trying to build a CSS-only system for use in my Anki card deck. Its purpose is to color-highlight the diacritics in languages that make heavy use of tones and transcribe them with accents, for example Vietnamese.
Cobbling together multiple approaches, I arrived at this solution:
// Maps each combining mark (decimal code point) to a single-character placeholder.
const combiningMarks = {
  771: 126, // tilde -> ~
  769: 180, // acute accent -> ´
  768: 96,  // grave accent -> `
  803: 196, // dot below -> Ä (arbitrary placeholder letter)
  777: 214, // hook above -> Ö (arbitrary placeholder letter)
  795: 220, // horn -> Ü (arbitrary placeholder letter)
  770: 223, // circumflex -> ß (arbitrary placeholder letter)
  774: 222, // breve -> Þ (arbitrary placeholder letter)
};
// const target = 'mè bố bẳ người bệ bế bể bễ bề ';
const target = "a á ạ à ả ã ă ắ ặ ằ ẳ ẵ â ấ ậ ầ ẩ ẫ e é ẹ è ẻ ẽ ê ế ệ ề ể ễ i í ị ì ỉ ĩ o ó ọ ò ỏ õ ô ố ộ ồ ổ ỗ ơ ớ ợ ờ ở ỡ u ú ụ ù ủ ũ ư ứ ự ừ ử ữ y ý ỵ ỳ ỷ ỹ đ"
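// Decompose to NFD so that every diacritic becomes a separate combining code point.
const decomposedString = target.normalize("NFD");
const codepoints = [...decomposedString].map(c => c.codePointAt(0));
// Swap each known combining mark for its placeholder; leave everything else untouched.
const charsWithFullMarks = codepoints.map(c => combiningMarks[c] || c);
let finalString = String.fromCodePoint(...charsWithFullMarks);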
// Turn each placeholder back into its combining mark, wrapped in a styled span.
finalString = finalString.replaceAll("Ü", '<span class="diacritical-mark-generic">\u031B</span>'); // horn
finalString = finalString.replaceAll("ß", '<span class="diacritical-mark-generic">\u0302</span>'); // circumflex
finalString = finalString.replaceAll("Þ", '<span class="diacritical-mark-generic">\u0306</span>'); // breve
finalString = finalString.replaceAll("´", '<span class="diacritical-mark-green">\u0301</span>'); // acute
finalString = finalString.replaceAll("`", '<span class="diacritical-mark-red">\u0300</span>'); // grave
finalString = finalString.replaceAll("~", '<span class="diacritical-mark-pink">\u0303</span>'); // tilde
finalString = finalString.replaceAll("Ä", '<span class="diacritical-mark-blue">\u0323</span>'); // dot below
finalString = finalString.replaceAll("Ö", '<span class="diacritical-mark-orange">\u0309</span>'); // hook above
document.getElementById("exp").innerHTML = finalString;
:root,
html,
body {
font-family: sans-serif;
/* This needs to be different from the diacritic font for the coloring to work */
}
/* All mark spans share the font with the most accurate positioning */
span.diacritical-mark-generic,
span.diacritical-mark-red,
span.diacritical-mark-green,
span.diacritical-mark-blue,
span.diacritical-mark-pink,
span.diacritical-mark-orange {
  font-family: "DejaVu Sans";
}
span.diacritical-mark-generic {
  color: initial;
}
span.diacritical-mark-red {
  color: #ff3131;
}
span.diacritical-mark-green {
  color: #31ff61;
}
span.diacritical-mark-blue {
  color: #4081e9;
}
span.diacritical-mark-pink {
  color: #ff31cf;
}
span.diacritical-mark-orange {
  color: #ff9f04;
}
<span id="exp"></span><br>
<p>
a á ạ à ả ã ă ắ ặ ằ ẳ ẵ â ấ ậ ầ ẩ ẫ e é ẹ è ẻ ẽ ê ế ệ ề ể ễ i í ị ì ỉ ĩ o ó ọ ò ỏ õ ô ố ộ ồ ổ ỗ ơ ớ ợ ờ ở ỡ u ú ụ ù ủ ũ ư ứ ự ừ ử ữ y ý ỵ ỳ ỷ ỹ đ
</p>
This code splits the diacritics off the letters of the input and places them as placeholder tokens after each vowel. The placeholders are then converted back into combining diacritics, each wrapped in a span so that it can be styled.
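For comparison, the same split-and-wrap step can be sketched without the placeholder round trip, by wrapping each combining mark directly after NFD decomposition. This is only a minimal sketch: the `MARK_CLASSES` table and the `wrapCombiningMarks` name are made up for illustration, the class names are the ones from the stylesheet above, and the input is assumed to be trusted text (no HTML escaping is done). It produces the same kind of markup, so it does not by itself change the font or alignment behavior.

```javascript
// Hypothetical lookup: combining-mark code point -> CSS class from the stylesheet above.
const MARK_CLASSES = {
  0x0300: "diacritical-mark-red",     // grave
  0x0301: "diacritical-mark-green",   // acute
  0x0303: "diacritical-mark-pink",    // tilde
  0x0309: "diacritical-mark-orange",  // hook above
  0x0323: "diacritical-mark-blue",    // dot below
  0x0302: "diacritical-mark-generic", // circumflex
  0x0306: "diacritical-mark-generic", // breve
  0x031B: "diacritical-mark-generic", // horn
};

// Decompose the text, then wrap every known combining mark in its span.
// Characters without an entry (base letters, spaces, đ) pass through unchanged.
function wrapCombiningMarks(text) {
  return [...text.normalize("NFD")]
    .map((ch) => {
      const cls = MARK_CLASSES[ch.codePointAt(0)];
      return cls ? `<span class="${cls}">${ch}</span>` : ch;
    })
    .join("");
}
```

Assuming the same page as above, the result could be assigned with `document.getElementById("exp").innerHTML = wrapCombiningMarks(target)`.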
This seems like quite a hacky solution to me, and it comes with caveats: the span needs a different font applied in order to take the styling at all, and the alignment of the diacritics is hit or miss (probably because putting the accent in a separate span with a different font disconnects it from its letter). Two special cases in particular are unacceptable: stacked accents such as circumflex plus tilde overlap each other, and the dot of the i is not replaced by the accent that should take its place.
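To make those two problem cases concrete, here is a small sketch that lists the code points NFD produces for a given string. The helper name `describeDecomposition` is made up for illustration. It shows only what the decomposition yields, not how a font positions the marks.

```javascript
// List the code points of the NFD-decomposed text in U+XXXX notation.
function describeDecomposition(text) {
  return [...text.normalize("NFD")].map(
    (ch) => "U+" + ch.codePointAt(0).toString(16).toUpperCase().padStart(4, "0")
  );
}

// Stacked case: base letter, then circumflex, then tilde.
console.log(describeDecomposition("\u1EAB")); // ẫ -> ["U+0061", "U+0302", "U+0303"]
// Dotted i: the base letter is the ordinary i (U+0069), followed by the tilde.
console.log(describeDecomposition("\u0129")); // ĩ -> ["U+0069", "U+0303"]
```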
I don't really want to write a whole font rendering system with vectors and manual positioning of the diacritics. So I'm looking for a smart and easy way to overcome these shortcomings and color the diacritics, including multiple stacked ones, in specified colors.