Character Codes and (Special) Tag Characters in HTML5

Updated: 2016-03-26 14:02:57
HTML5 and CSS3 All-in-One For Dummies

The ISO Latin-1 (ISO 8859-1) character set is supported by default in all modern web browsers. (Search for “ISO Latin-1 character set” to find a complete table of values.) Thus, the character entities for that set, such as &eacute; for é or &copy; for ©, may be used directly in HTML markup without going through any special contortions.
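As a small illustration of the point above, the following sketch uses a few named entities from the Latin-1 range. The sample sentences and the publisher name are invented for this example; the entity names themselves (&eacute;, &pound;, &ntilde;, &copy;) are standard HTML named character references.

```html
<!-- Named entities from the Latin-1 range; the sample text is illustrative only -->
<p>Caf&eacute; au lait costs &pound;2 today.</p>
<p>Ma&ntilde;ana is another day.</p>
<p>&copy; 2016 Example Publishing</p>
```

A browser should display these lines as "Café au lait costs £2 today.", "Mañana is another day.", and "© 2016 Example Publishing".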

However, to use characters beyond that set directly, you should include special markup that tells the browser to interpret the page as Unicode text. (Unicode is an international standard, kept in step with ISO/IEC 10646, that embraces enough codes to handle most human alphabets, plus plenty of symbols and non-alphabetic characters, too.) This special markup takes this form:

<meta charset="UTF-8">
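To show where that line typically goes, here is a minimal page skeleton. The title and body text are placeholders and do not come from the original article. Placing the element at the top of the head reflects the HTML rule that the encoding declaration must fall within the first 1024 bytes of the document.

```html
<!DOCTYPE html>
<html lang="en">
  <head>
    <!-- Declare the encoding early, before any text that depends on it -->
    <meta charset="UTF-8">
    <title>Placeholder title</title>
  </head>
  <body>
    <!-- The entity works in any encoding; the literal character needs a UTF-8 file -->
    <p>Placeholder text: na&iuml;ve, or simply naïve.</p>
  </body>
</html>
```

Note that the declaration only describes the file; the file itself must actually be saved as UTF-8 in your editor for literal characters such as ï to display correctly.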

Because the charset value reads UTF-8, you can use any Unicode character in the page. (UTF-8 stands for UCS Transformation Format 8-bit, an encoding format that can represent every Unicode character. Search for “Unicode UTF-8 character table” to skim over a code space with room for more than one million character codes.)
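One way to reference a Unicode value without typing the character itself is a numeric character reference, written in decimal (&#...;) or hexadecimal (&#x...;). The sketch below is illustrative only; the three characters were picked arbitrarily for this example.

```html
<!-- Decimal and hexadecimal references to the same characters -->
<p>Euro sign: &#8364; or &#x20AC;</p>
<p>Greek capital omega: &#937; or &#x3A9;</p>
<p>Snowman: &#9731; or &#x2603;</p>
```

Each pair should render as the same character twice (€, Ω, and ☃), provided the reader's system has a font that contains it.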

Today’s browsers support UTF-8 universally, and they can decode UTF-16 as well. Both encodings cover the same Unicode characters, so UTF-16 adds nothing that UTF-8 lacks, and UTF-8 is the recommended encoding for web pages. Non-Roman scripts such as Arabic, katakana (a Japanese syllabary), and Hangul (the Korean alphabet) can all be written in UTF-8. When a browser struggles to render them correctly and completely, the usual causes are a missing font or a page whose declared encoding doesn't match the way the file was saved.
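As a rough sketch of non-Roman text in a UTF-8 page, the fragment below assumes the file is saved as UTF-8, declares <meta charset="UTF-8">, and that suitable fonts are installed. The sample words are chosen for illustration: an Arabic greeting, the word "katakana" written in katakana, and the word "Hangul" written in Hangul.

```html
<!-- Assumes the page is saved as UTF-8 and declares <meta charset="UTF-8"> -->
<p lang="ar" dir="rtl">مرحبا</p>
<p lang="ja">カタカナ</p>
<p lang="ko">한글</p>
```

The lang attribute identifies the language of each paragraph, and dir="rtl" sets right-to-left direction for the Arabic text.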

HTML-savvy software assumes that certain characters, such as the left and right angle brackets (less-than and greater-than signs in math notation), are part of the markup and are meant to be hidden, not displayed on your finished web pages. If you actually want to display these characters on your pages, you must make your wishes clear to the browser.

These entities enable display of characters that are normally part of hidden HTML markup:

  • left angle bracket (<): &lt;

  • right angle bracket (>): &gt;

  • ampersand (&): &amp;

If you need these symbols to appear, include their entities in your markup like this:

<p>The paragraph element identifies some text as a paragraph:</p>
<p>&lt;p&gt;This is a paragraph&lt;/p&gt;</p>
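For a slightly broader sketch that uses all three entities, consider the fragment below. The sample text and the search.html address are hypothetical and serve only to show the escaping.

```html
<!-- Escaping all three reserved characters in ordinary text -->
<p>Tom &amp; Jerry</p>
<p>5 &lt; 7 and 9 &gt; 3</p>
<!-- Ampersands inside attribute values are escaped the same way -->
<a href="search.html?q=html&amp;lang=en">Search</a>
```

The browser should display "Tom & Jerry" and "5 < 7 and 9 > 3", and the link should point to search.html?q=html&lang=en.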

This figure shows how these entities appear inside a browser window.

image0.jpg

About This Article

This article is from the book: 

About the book author:

Ed Tittel is a 28-year veteran of the computer industry. A seasoned author and consultant, Ed has more than 140 books to his credit.

Chris Minnick is an accomplished author, teacher, and programmer. Minnick authored or co-authored over 20 books, including titles in the For Dummies series. He has developed video courses for top online training platforms and he teaches programming and machine learning to professional developers at some of the largest global companies.