View UTF-8 byte sequences and convert between UTF-8, UTF-16, and other encodings.
Type or paste text.
See the UTF-8 byte representation.
Copy byte values or decoded text.
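The steps above can be reproduced in a few lines of code. The following is a minimal Python sketch, not the tool's actual implementation; the helper names `utf8_hex` and `from_utf8_hex` are hypothetical and chosen only for illustration.

```python
def utf8_hex(text: str) -> str:
    """Return the UTF-8 bytes of `text` as space-separated uppercase hex."""
    return " ".join(f"{b:02X}" for b in text.encode("utf-8"))


def from_utf8_hex(hex_bytes: str) -> str:
    """Decode space-separated hex byte values back into text."""
    return bytes.fromhex(hex_bytes).decode("utf-8")


# "h\u00e9llo" is "hello" with U+00E9 (e with acute accent) as the second letter.
encoded = utf8_hex("h\u00e9llo")
print(encoded)                 # 68 C3 A9 6C 6C 6F
print(from_utf8_hex(encoded))  # round-trips to the original text
```

The accented letter is the only character that expands to two bytes (`C3 A9`); every other letter keeps its one-byte ASCII value.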
Use UTF-8 Encode/Decode when debugging character encoding issues in web applications, analyzing byte sequences in file headers, or converting between text encodings for data migration. It is especially useful for diagnosing mojibake (garbled text), which appears when bytes written in one encoding are decoded as another. Backend developers also use it when working with multi-language databases and internationalized content.
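To make the mojibake case concrete, here is a small Python sketch that reproduces a typical mismatch and then reverses it. It assumes the wrong decoder was Latin-1 (ISO-8859-1), which is only one of several possibilities in practice; the variable names are illustrative.

```python
# "caf\u00e9" is "cafe" with U+00E9 (e with acute accent) as the last letter.
original = "caf\u00e9"

# Simulate the bug: text is stored as UTF-8 but read back as Latin-1.
garbled = original.encode("utf-8").decode("latin-1")
print(garbled)  # the final letter shows up as two characters: U+00C3, U+00A9

# Reverse the mistake: recover the raw bytes, then decode them correctly.
repaired = garbled.encode("latin-1").decode("utf-8")
print(repaired == original)  # True
```

This repair only works when no bytes were lost along the way. If the garbled text was already saved with replacement characters (such as `?` or U+FFFD), the original bytes cannot be recovered this way.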
UTF-8 is the dominant character encoding on the web, capable of representing every Unicode character using 1 to 4 bytes. It is backward-compatible with ASCII: every ASCII character is encoded as the same single byte in UTF-8, so any valid ASCII text is also valid UTF-8. Over 98% of all web pages use UTF-8, making it the de facto standard for text encoding in modern applications.
UTF-8 uses a variable number of bytes per character: ASCII characters (such as A-Z and 0-9) use 1 byte, most accented Latin characters use 2 bytes, most CJK characters (Chinese, Japanese, Korean) use 3 bytes, and emoji, rare scripts, and less common CJK characters use 4 bytes. Understanding this helps debug discrepancies between string length (counted in characters) and byte length in code.
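The byte counts above can be checked directly. The sketch below uses Python, where `len()` on a string counts Unicode code points and `len()` on the encoded bytes counts bytes; the sample characters are arbitrary picks from each range.

```python
samples = {
    "ASCII letter A": "A",                 # U+0041
    "Latin e with acute": "\u00e9",        # U+00E9
    "CJK ideograph": "\u4e2d",             # U+4E2D
    "Emoji (grinning face)": "\U0001F600", # U+1F600
}

# Number of UTF-8 bytes needed for each sample character.
byte_counts = {name: len(ch.encode("utf-8")) for name, ch in samples.items()}
for name, count in byte_counts.items():
    print(f"{name}: {count} byte(s)")

# String length and byte length diverge as soon as non-ASCII text appears.
text = "h\u00e9llo \U0001F600"
print(len(text))                  # 7 code points
print(len(text.encode("utf-8")))  # 11 bytes
```

A database column or protocol field limited to a byte count would therefore accept fewer characters of this text than its character count suggests.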
UTF-8 is the web standard, uses 1 to 4 bytes per character, and is ASCII-compatible. UTF-16 uses 2 or 4 bytes per character and is common in Windows and Java internals. UTF-8 is more space-efficient for Latin-based text, while UTF-16 can be more compact for CJK-heavy content, because most CJK characters take 2 bytes in UTF-16 but 3 bytes in UTF-8. Choose UTF-8 for web and file interchange.
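The size trade-off can be measured with a short Python sketch. The helper name `encoded_sizes` is hypothetical, and the sketch uses the `utf-16-le` codec so that the 2-byte byte order mark that Python's plain `utf-16` codec prepends is not counted.

```python
def encoded_sizes(text: str) -> dict:
    """Return the encoded size of `text` in bytes for UTF-8 and UTF-16."""
    return {
        "utf-8": len(text.encode("utf-8")),
        # "utf-16-le" omits the byte order mark, so only the text is counted.
        "utf-16": len(text.encode("utf-16-le")),
    }


print(encoded_sizes("Hello"))         # {'utf-8': 5, 'utf-16': 10}
print(encoded_sizes("\u4e2d\u6587"))  # {'utf-8': 6, 'utf-16': 4}
print(encoded_sizes("\U0001F600"))    # {'utf-8': 4, 'utf-16': 4}
```

Real documents often mix scripts with ASCII markup, whitespace, and digits, so measuring a representative sample is more reliable than assuming one encoding is always smaller.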