UTF-8 to Hex Converter

Frequently Asked Questions

What Is UTF-8 Encoding?

UTF-8 is a variable-length character encoding for Unicode. It uses 1 to 4 bytes per code point: ASCII characters take a single byte, so ASCII text stays compact, while every other Unicode character can still be represented.
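As a quick illustration of the variable length, the following Python sketch measures how many bytes a few sample characters occupy once encoded. The sample characters are arbitrary choices for illustration, not something the tool itself uses.

```python
# Sample characters, one from each UTF-8 length class (illustrative choices).
samples = ["A", "\u00e9", "\u4e2d", "\U0001F600"]

# Map each character to the number of bytes in its UTF-8 encoding.
lengths = {ch: len(ch.encode("utf-8")) for ch in samples}

for ch, n in lengths.items():
    print(f"U+{ord(ch):04X}: {n} byte(s)")
```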

How Does This Tool Convert Text to UTF-8?

This tool uses the browser's built-in TextEncoder to encode input text into UTF-8. Each character is converted based on its Unicode code point into one or more bytes, then displayed as hexadecimal escape sequences (e.g., \xE4\xB8\xAD for '中').
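The same conversion can be sketched outside the browser. The Python snippet below is only an approximation of what the tool does: it uses Python's built-in encoder rather than TextEncoder, and the function name `to_hex_escapes` is hypothetical.

```python
def to_hex_escapes(text: str) -> str:
    # Encode the text as UTF-8, then render every byte as \xNN (uppercase hex).
    return "".join(f"\\x{b:02X}" for b in text.encode("utf-8"))


print(to_hex_escapes("\u4e2d"))  # \xE4\xB8\xAD
```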

How Does This Tool Decode UTF-8 to Readable Text?

The tool strips the \x prefix from the input, parses the remaining hex values into bytes, and uses the browser’s TextDecoder to convert the bytes back into readable text, following UTF-8 decoding rules.
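The decoding steps can be sketched in Python as well. This is a minimal illustration, not the tool's actual code: the function name `from_hex_escapes` is hypothetical, and the sketch decodes strictly (raising an error on invalid UTF-8), which may differ from how the tool's TextDecoder is configured.

```python
def from_hex_escapes(escaped: str) -> str:
    # Drop every "\x" prefix, leaving a plain run of hex digits.
    hex_digits = escaped.replace("\\x", "")
    # bytes.fromhex raises ValueError on non-hex characters or an odd digit count.
    raw = bytes.fromhex(hex_digits)
    # Strict decoding raises UnicodeDecodeError if the bytes are not valid UTF-8.
    return raw.decode("utf-8")


print(from_hex_escapes("\\xE4\\xB8\\xAD"))  # 中
```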

Why Is UTF-8 the Most Commonly Used Encoding?

UTF-8 is widely adopted because it is backward-compatible with ASCII, efficient for English text, and capable of encoding all Unicode characters. It is the default encoding for web pages and many modern applications, ensuring cross-platform text consistency.

How Does UTF-8 Encoding Work?

UTF-8 works by encoding Unicode code points into a sequence of bytes:

  • Code points from U+0000 to U+007F are encoded in one byte (same as ASCII).
  • Code points from U+0080 to U+07FF are encoded in two bytes.
  • Code points from U+0800 to U+FFFF are encoded in three bytes (the surrogate range U+D800 to U+DFFF is excluded, since surrogates are not valid in UTF-8).
  • Code points from U+10000 to U+10FFFF are encoded in four bytes.

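In a multi-byte sequence, the lead byte begins with the bits 110, 1110, or 11110 to signal a two-, three-, or four-byte sequence, and every continuation byte begins with 10. Because a decoder can always tell a lead byte from a continuation byte, it can find the next character boundary after corrupted or truncated data, which makes UTF-8 self-synchronizing and error-resilient.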
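To make the ranges and bit patterns concrete, here is a hand-written encoder for a single code point in Python. It is a teaching sketch under the rules listed above, not a replacement for a built-in encoder; the function name `encode_code_point` is hypothetical.

```python
def encode_code_point(cp: int) -> bytes:
    # Reject values outside Unicode and the surrogate range, which UTF-8 forbids.
    if cp < 0 or cp > 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
        raise ValueError(f"not an encodable code point: {cp:#x}")
    if cp <= 0x7F:
        # One byte: 0xxxxxxx
        return bytes([cp])
    if cp <= 0x7FF:
        # Two bytes: 110xxxxx 10xxxxxx
        return bytes([0xC0 | (cp >> 6), 0x80 | (cp & 0x3F)])
    if cp <= 0xFFFF:
        # Three bytes: 1110xxxx 10xxxxxx 10xxxxxx
        return bytes([
            0xE0 | (cp >> 12),
            0x80 | ((cp >> 6) & 0x3F),
            0x80 | (cp & 0x3F),
        ])
    # Four bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    return bytes([
        0xF0 | (cp >> 18),
        0x80 | ((cp >> 12) & 0x3F),
        0x80 | ((cp >> 6) & 0x3F),
        0x80 | (cp & 0x3F),
    ])


print(encode_code_point(0x4E2D).hex())  # e4b8ad
```

For any valid code point, the result should match what Python's own `str.encode("utf-8")` produces for the same character.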

How Do You Encode and Decode UTF-8 in Different Programming Languages?

Here are examples of how to encode strings into UTF-8 bytes and decode UTF-8 bytes back into strings using different programming languages:

Go

UTF-8 encoding and decoding in Go.


package main

import "fmt"

func main() {
    text := "Hello, World!"
    // Encode string to UTF-8 bytes
    utf8Bytes := []byte(text)
    fmt.Printf("UTF-8 bytes: %x\n", utf8Bytes)

    // Decode UTF-8 bytes back to string
    decodedText := string(utf8Bytes)
    fmt.Printf("Decoded text: %s\n", decodedText)
}
      
Java

UTF-8 conversion example in Java.


import java.nio.charset.StandardCharsets;

public class Utf8Example {
    public static void main(String[] args) {
        String text = "Hello, World!";
        // Encode string to UTF-8 bytes
        byte[] utf8Bytes = text.getBytes(StandardCharsets.UTF_8);
        System.out.println("UTF-8 bytes: " + java.util.Arrays.toString(utf8Bytes));

        // Decode UTF-8 bytes back to string
        String decodedText = new String(utf8Bytes, StandardCharsets.UTF_8);
        System.out.println("Decoded text: " + decodedText);
    }
}
      
Python

Python code for UTF-8 encoding and decoding.


text = "Hello, World!"
# Encode string to UTF-8 bytes
utf8_bytes = text.encode("utf-8")
print(f"UTF-8 bytes: {utf8_bytes}")

# Decode UTF-8 bytes back to string
decoded_text = utf8_bytes.decode("utf-8")
print(f"Decoded text: {decoded_text}")
      
PHP

How to handle UTF-8 in PHP.


<?php
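$text = "Hello, World!";
// PHP strings are raw byte sequences, so "encoding" means converting from a
// known source encoding. ISO-8859-1 is assumed here, which is what the
// deprecated utf8_encode() (deprecated as of PHP 8.2) also assumed.
// Requires the mbstring extension.
$utf8Bytes = mb_convert_encoding($text, "UTF-8", "ISO-8859-1");
echo "UTF-8 bytes: " . bin2hex($utf8Bytes) . PHP_EOL;

// Convert the UTF-8 bytes back to the original ISO-8859-1 string
// (replaces the deprecated utf8_decode())
$decodedText = mb_convert_encoding($utf8Bytes, "ISO-8859-1", "UTF-8");
echo "Decoded text: " . $decodedText . PHP_EOL;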
?>
      
JavaScript

Using JavaScript to convert to and from UTF-8.


const text = "Hello, World!";
// Encode string to UTF-8 bytes
const encoder = new TextEncoder();
const utf8Bytes = encoder.encode(text);
console.log("UTF-8 bytes:", Array.from(utf8Bytes));

// Decode UTF-8 bytes back to string
const decoder = new TextDecoder("utf-8");
const decodedText = decoder.decode(utf8Bytes);
console.log("Decoded text:", decodedText);
      
TypeScript

UTF-8 conversion in TypeScript with examples.


const text: string = "Hello, World!";
// Encode string to UTF-8 bytes
const encoder: TextEncoder = new TextEncoder();
const utf8Bytes: Uint8Array = encoder.encode(text);
console.log("UTF-8 bytes:", Array.from(utf8Bytes));

// Decode UTF-8 bytes back to string
const decoder: TextDecoder = new TextDecoder("utf-8");
const decodedText: string = decoder.decode(utf8Bytes);
console.log("Decoded text:", decodedText);