Channel: Encoding – The Wiert Corner – irregular stream of stuff

Unicode subscripts and superscripts: Latin, Greek, Cyrillic, and IPA tables; Source: Small caps: Unicode – Wikipedia


I originally searched for the tables below to see if I could get the visualisations of TeX and LaTeX right for infinite loop in “LaTeX: A Document Preparation System” by Leslie Lamport, printed in 1994.

That didn’t work, and neither did plain HTML superscript and subscript tags. The only thing that worked was using CSS styles (I chose to embed them, as separate CSS files come at a huge premium on top of the WordPress plan), which also preserves the actual meaning for screen readers:

  • TᴇX / LᴬTᴇX – Unicode super and subscript (there is no Unicode subscript E, so the small capital ᴇ stands in for it), which is copied in plain text as TᴇX / LᴬTᴇX, including the Unicode characters, having this HTML:
    TᴇX / LᴬTᴇX
  • TEX / LATEX – HTML super and subscript, which is copied in plain text as TEX / LATEX, having this HTML:
    T<sub>E</sub>X / L<sup>A</sup>T<sub>E</sub>X
  • TeX / LaTeX – embedded CSS (copied from the Wikipedia entries TeX and LaTeX), which is copied in plain text as TeX / LaTeX, having this HTML:
    <span class="texhtml" style="font-family: 'CMU Serif', cmr10, LMRoman10-Regular, 'Latin Modern Math', 'Nimbus Roman No9 L', 'Times New Roman', Times, serif;">
    T
    <span style="text-transform: uppercase; vertical-align: -0.25em; margin-left: -0.1667em; margin-right: -0.125em; line-height: 1ex;">
    e
    </span>
    X
    </span>
     / 
    <span class="texhtml" style="font-family: 'CMU Serif', cmr10, LMRoman10-Regular, 'Latin Modern Math', 'Nimbus Roman No9 L', 'Times New Roman', Times, serif;">
    L
    <span style="text-transform: uppercase; font-size: 0.75em; vertical-align: +0.25em; margin-left: -0.1667em; margin-right: -0.125em; line-height: 0.66ex;">
    a
    </span>
    T
    <span style="text-transform: uppercase; vertical-align: -0.25em; margin-left: -0.1667em; margin-right: -0.125em; line-height: 1ex;">
    e
    </span>
    X
    </span>
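
The "copied in plain text as" results above can be approximated by dropping every tag and keeping only the text nodes. The following is a minimal Python sketch of that idea, not a model of what a browser really does on copy: `TextOnly` and `plain_text` are hypothetical names, and the CSS variant is shortened to the TeX part with only two of the style properties.

```python
from html.parser import HTMLParser


class TextOnly(HTMLParser):
    """Collect only the text nodes, dropping every tag and attribute."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def plain_text(html):
    """Return the concatenated text content of an HTML fragment."""
    parser = TextOnly()
    parser.feed(html)
    parser.close()  # flush any text still buffered after the last tag
    return "".join(parser.chunks)


# Variant 2: plain HTML subscript and superscript tags.
sub_sup = "T<sub>E</sub>X / L<sup>A</sup>T<sub>E</sub>X"

# Variant 3: CSS styling on nested spans (shortened for illustration).
css = (
    '<span class="texhtml">T<span style="text-transform: uppercase; '
    'vertical-align: -0.25em;">e</span>X</span>'
)

print(plain_text(sub_sup))  # TEX / LATEX
print(plain_text(css))      # TeX
```

Both variants keep ordinary ASCII letters in the text layer, so the raising and lowering lives only in the markup and styling.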
    

The biggest problem with using the Unicode subscripts and superscripts is that accessibility tools cannot reveal or have a hard time revealing their actual meaning.
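
To see what software actually receives for the Unicode variant, here is a small Python sketch using the standard `unicodedata` module. It lists the non-ASCII code points in the string and applies NFKC compatibility normalization; `describe_non_ascii` is a hypothetical helper name, and this does not model how any particular screen reader behaves.

```python
import unicodedata

# "TᴇX / LᴬTᴇX" written with escapes so the code points are explicit.
text = "T\u1D07X / L\u1D2CT\u1D07X"


def describe_non_ascii(s):
    """Return (code point, Unicode name) for every non-ASCII character."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch))
        for ch in s
        if ord(ch) > 0x7F
    ]


for code, name in describe_non_ascii(text):
    print(code, name)

# NFKC folds the superscript A back to a plain "A", but the small
# capital E has no compatibility decomposition and is left untouched.
normalized = unicodedata.normalize("NFKC", text)
print(normalized)
```

The small capital E is a separate letter as far as Unicode is concerned, so even normalization cannot recover the intended "TEX".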

I found them via the Stack Overflow entry [Wayback/Archive] javascript – How to find the unicode of the subscript alphabet? – Stack Overflow (thanks [Wayback/Archive] Mahmoud Elgohary, [Wayback/Archive] Kevin Hakanson, [Wayback/Archive] user2987828 and [Wayback/Archive] Bimo):

Q

I’ve found some letters but i need to find others such as “c”, “m”, “p”, is this even possible?

A

Take a look at the wikipedia article Unicode subscripts and superscripts. It looks like these are spread out across different ranges, and not all characters are available.
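
The "spread out across different ranges" observation can be checked with a short Python sketch. It scans every code point and keeps those whose compatibility decomposition is tagged `<sub>`; `find_compat_variants` is a hypothetical helper name, and the result depends on the Unicode version bundled with the Python interpreter in use.

```python
import sys
import unicodedata
from string import ascii_lowercase


def find_compat_variants(tag):
    """Group code points by base character for one decomposition tag.

    A code point qualifies when its compatibility decomposition is exactly
    `tag` followed by a single base code point, e.g. "<sub> 0061".
    """
    found = {}
    for cp in range(sys.maxunicode + 1):
        parts = unicodedata.decomposition(chr(cp)).split()
        if len(parts) == 2 and parts[0] == tag:
            base = chr(int(parts[1], 16))
            found.setdefault(base, []).append(chr(cp))
    return found


subscripts = find_compat_variants("<sub>")

# The three letters from the question.
for letter in "cmp":
    hits = subscripts.get(letter, [])
    codes = ", ".join(f"U+{ord(ch):04X}" for ch in hits) or "not encoded"
    print(f"subscript {letter}: {codes}")

# Lowercase Latin letters without any subscript form.
missing = [c for c in ascii_lowercase if c not in subscripts]
print("no subscript form:", "".join(missing))
```

With the Unicode data in recent Python versions, this reports subscript m and p as encoded and subscript c as not encoded. Passing `"<super>"` instead of `"<sub>"` gives the corresponding superscript map.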

C

You can add small capitals that looks like subscripts: Aᴀʙᴄᴅᴇғɢʜɪᴊᴋʟᴍɴɪᴘǫʀsᴛᴜᴠᴡxʏᴢ And there are also some other small letters that look like subscripts (except b, o and q): ₐ𝒸𝒹ₑ𝒻𝓰ₕᵢⱼₖₗₘₙₚᵣₛₜᵤᵥ𝓌ₓᵧ𝓏

Tables

Besides the useful article at [Wayback/Archive] Superscript letters in Unicode | Rupert Shepherd there are these useful tables:

Small caps: Unicode – Wikipedia

Although small caps are allographs of their full size equivalents (and so not usually “semantically important”), the Unicode standard does define a number of “small capital” characters in the IPA extensions, Phonetic Extensions and Latin Extended-D ranges (0250–02AF, 1D00–1D7F, A720–A7FF). These characters are meant for use in phonetic representations. For example, ʀ represents a uvular trill in IPA, and ɢ a voiced uvular plosive. They should not normally be used in other contexts;[b] rather, the basic character set should be used with suitable formatting controls as described in the preceding sections.

A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
inline ʙ ɢ ʜ ɪ ʟ ɴ ʀ * ʏ
superscript * 𐞄 * * 𐞒 𐞖 * 𐞪 𐞲

* Superscript versions of small caps ,[41] ,[42] [41] and have been provisionally assigned for inclusion in a future version of the Unicode Standard.[43]

Unicode subscripts and superscripts: Latin, Greek, Cyrillic, and IPA tables – Wikipedia (only quoting the Latin and Greek bits; maybe in the future I will include other bits as well if I need those too):

Consolidated, the Unicode standard contains superscript and subscript versions of a subset of Latin, Greek and Cyrillic letters. Here they are arranged in alphabetical order for comparison (or for copy and paste convenience). Since these characters appear in different Unicode ranges, they may not appear to be the same size or position due to font substitution in the browser. Shaded cells mark small capitals that are not very distinct from minuscules, and Greek letters that are indistinguishable from Latin, and so would not be expected to be supported by Unicode.

Little punctuation is encoded. Parentheses are shown above in the basic block above, and the exclamation mark ⟨⟩ is shown in the IPA table below. A question mark may be created with a superscript gelded question mark and a combining dot: ⟨ˀ̣⟩, although some fonts do not render it properly.

 

Latin superscript and subscript letters
A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
Superscript capital ᴿ
Superscript small cap 𐞄 𐞒 𐞖 𐞪 𐞲
Superscript minuscule ʰ ʲ ˡ 𐞥 ʳ ˢ ʷ ˣ ʸ
Overscript small cap ◌ᷛ ◌ᷞ ◌ᷟ ◌ᷡ ◌ᷢ
Overscript minuscule ◌ͣ ◌ᷨ ◌ͨ ◌ͩ ◌ͤ ◌ᷫ ◌ᷚ ◌ͪ ◌ͥ ◌ᷜ ◌ᷝ ◌ͫ ◌ᷠ ◌ͦ ◌ᷮ ◌ͬ ◌ᷤ ◌ͭ ◌ͧ ◌ͮ ◌ᷱ ◌ͯ ◌ᷦ
Subscript minuscule
Underscript minuscule ◌᷊ ◌ᪿ

Additional superscript capitals are ᴭ ᴯ ᴲ ᴻ. Some of these are small caps in the source documents in the Unicode proposals.
Superscript capital s has been proposed for a future version of the Unicode Standard.[8][9]
Superscript versions of small capital A and E have been proposed for a future version of the Unicode Standard.[10][11][9]

Greek superscript and subscript letters
Α Β Γ Δ Ε Ζ Η Θ Ι Κ Λ Μ Ν Ξ Ο Π Ρ Σ Τ Υ Φ Χ Ψ Ω
Superscript minuscule [A] ᶿ [A]
Overscript minuscule ◌ᷧ ◌ᷩ
Subscript minuscule ͺ[12]
Underscript minuscule ◌ͅ ◌̫[13]
  1. In some fonts, Latin alpha ᵅ and upsilon ᶹ can be used as superscript Greek alpha and upsilon. ᵋ and ᶥ are also officially Latin letters, but display the same as Greek.

Superscript versions of Greek psi and omega have been proposed for a future version of the Unicode Standard.[10][9]


--jeroen

