
How Text Becomes Binary: Character Encoding From ASCII to UTF-8
A few years ago I was building an API that accepted user input in multiple languages. Everything worked fine in English. Then a user submitted a form in Japanese and the database stored garbage characters -- the classic mojibake problem. I had assumed UTF-8 everywhere, but one middleware component was silently converting to Latin-1, which cannot represent Japanese characters. Fixing it took ten minutes. Finding it took two days.

Understanding how text becomes binary -- and how that binary becomes text again -- is fundamental to avoiding an entire category of bugs that are notoriously difficult to debug.

ASCII: Where It Started

ASCII (American Standard Code for Information Interchange) was published in 1963. It maps 128 characters to 7-bit binary numbers:

Character   Decimal   Binary
A           65        1000001
B           66        1000010
Z           90        1011010
a           97        1100001
0           48        0110000
space       32        0100000
newline     10        0001010

Some useful patterns to notice: uppercase letters start at 65, lowercase at 97. The difference is exactly 32, which is a single bit (the bit with value 32), so converting between uppercase and lowercase is just flipping that one bit.
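That bit-flip trick is easy to verify in Python, where `ord` and `chr` convert between a character and its code point (a minimal sketch; `flip_case` is just an illustrative helper, not a standard function):

```python
# Uppercase and lowercase ASCII letters differ only in the bit with value 32.
for upper, lower in [("A", "a"), ("B", "b"), ("Z", "z")]:
    assert ord(lower) - ord(upper) == 32

def flip_case(ch: str) -> str:
    """Toggle the case of an ASCII letter by XOR-ing the 32 bit."""
    return chr(ord(ch) ^ 0b100000)

print(flip_case("A"))              # a
print(flip_case("z"))              # Z
print(format(ord("A"), "07b"))     # 1000001, matching the table above
```

This is why old C code could lowercase a letter with `c | 0x20` and uppercase it with `c & ~0x20` -- no lookup table needed.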
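The mojibake bug from the opening anecdote is just as easy to reproduce: encode Japanese text as UTF-8, then decode the bytes as Latin-1 -- the same silent mis-decode the middleware was doing (a small reproduction; the sample string is arbitrary):

```python
text = "日本語"                      # "Japanese language" in Japanese
utf8_bytes = text.encode("utf-8")   # 9 bytes: 3 per character

# Latin-1 assigns a character to every byte value 0x00-0xFF, so this
# never raises an error -- it just silently produces garbage.
garbled = utf8_bytes.decode("latin-1")
print(garbled)                      # mojibake, not Japanese

# The damage is reversible here only because Latin-1 round-trips bytes:
assert garbled.encode("latin-1").decode("utf-8") == text
```

The dangerous part is that nothing fails loudly: every byte is "valid" Latin-1, so the corruption only becomes visible when a human reads the output.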




