What is URL Encoding?
URL Encoding (Percent-Encoding) is a mechanism for encoding information in a Uniform Resource Identifier (URI) by replacing characters that are unsafe or not allowed with a '%' followed by two hexadecimal digits for each byte of the character.
Quick Facts
| Full Name | Percent-Encoding / URL Encoding |
|---|---|
| Created | 1994 (RFC 1738, updated in RFC 3986 in 2005) |
| Specification | RFC 3986 |
How It Works
URL encoding ensures that URLs contain only valid ASCII characters. Characters that have special meaning in URLs (like &, =, ?, /) must be encoded when they appear as data rather than as delimiters, and characters that are not allowed at all (like spaces and non-ASCII characters) must always be encoded. For example, a space becomes %20, and an ampersand becomes %26. This encoding is essential for passing data in query strings, form submissions, and API requests. Modern web frameworks and HTTP libraries typically handle URL encoding automatically, but understanding it is crucial for debugging and API development.
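The mechanism described above can be sketched by hand in a few lines of Python. This is an illustrative sketch, not a replacement for a library encoder: the names `UNRESERVED` and `percent_encode` are made up for this example, and the function conservatively encodes every byte outside the unreserved set.

```python
# Characters that RFC 3986 calls "unreserved"; they never need encoding.
UNRESERVED = set(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789-_.~"
)


def percent_encode(text: str) -> str:
    """Percent-encode every UTF-8 byte of `text` that is not unreserved."""
    out = []
    for byte in text.encode("utf-8"):
        ch = chr(byte)
        if ch in UNRESERVED:
            out.append(ch)               # keep letters, digits, - _ . ~
        else:
            out.append(f"%{byte:02X}")   # '%' plus two uppercase hex digits
    return "".join(out)


print(percent_encode("a b&c"))  # a%20b%26c
```

In practice a standard library function such as Python's `urllib.parse.quote` does the same job and is the safer choice.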
Key Characteristics
- Replaces unsafe characters with %XX hexadecimal format
- Space can be encoded as %20 or + (in form data)
- Preserves alphanumeric characters (A-Z, a-z, 0-9)
- Safe characters include - _ . ~
- Non-ASCII characters are converted to UTF-8 bytes, and each byte is encoded as its own %XX sequence
- Case-insensitive for hex digits (%2f equals %2F)
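A few of these characteristics can be checked with Python's standard `urllib.parse` module. This is a small sketch; the variable names are arbitrary, and `safe=""` is passed because `quote` leaves `/` unencoded by default.

```python
from urllib.parse import quote, unquote

# Unreserved characters pass through unchanged.
unreserved = "AZaz09-_.~"
encoded_unreserved = quote(unreserved, safe="")

# A reserved character becomes a %XX sequence (quote emits uppercase hex).
encoded_slash = quote("/", safe="")

# Hex digits are case-insensitive when decoding.
lower = unquote("%2f")
upper = unquote("%2F")

print(encoded_unreserved)  # AZaz09-_.~
print(encoded_slash)       # %2F
print(lower == upper)      # True
```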
Common Use Cases
- Query string parameters in URLs
- Form data submission (application/x-www-form-urlencoded)
- API request parameters
- Encoding special characters in file paths
- Creating safe URLs for sharing
Example
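A minimal sketch of encoding a value and placing it in a query string, using Python's standard library. The host `example.com` and the parameter name `q` are placeholders chosen for illustration.

```python
from urllib.parse import quote, unquote

raw = "Hello World & Friends?"

# Encode the value so that the space, '&' and '?' cannot be mistaken for URL syntax.
encoded = quote(raw, safe="")
url = "https://example.com/search?q=" + encoded

# Decoding restores the original text.
decoded = unquote(encoded)

print(encoded)  # Hello%20World%20%26%20Friends%3F
print(url)      # https://example.com/search?q=Hello%20World%20%26%20Friends%3F
print(decoded)  # Hello World & Friends?
```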
Frequently Asked Questions
Why is URL encoding necessary?
URL encoding is necessary because URLs can only contain a limited set of ASCII characters. Special characters like spaces, ampersands, and non-ASCII characters must be encoded to ensure proper transmission and interpretation across different systems.
What is the difference between %20 and + for encoding spaces?
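Both represent spaces, but %20 is the standard percent-encoding while + is specific to application/x-www-form-urlencoded content (HTML form submissions and the query strings they produce). Elsewhere in a URL, such as the path, + is a literal plus sign, so %20 is the safer choice when in doubt. A literal plus sign inside form data must itself be encoded as %2B.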
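In Python the two conventions map onto separate standard library functions, which makes the difference easy to see. A short sketch, with arbitrary variable names:

```python
from urllib.parse import quote, quote_plus, unquote, unquote_plus

text = "rock & roll"

percent_style = quote(text, safe="")  # plain percent-encoding
form_style = quote_plus(text)         # form encoding: space becomes '+'

# Decoding must use the matching convention: only unquote_plus treats '+' as a space.
plain_decoded = unquote("a+b")
form_decoded = unquote_plus("a+b")

# A literal plus sign in form data is encoded as %2B.
literal_plus = quote_plus("1+1")

print(percent_style)  # rock%20%26%20roll
print(form_style)     # rock+%26+roll
print(plain_decoded)  # a+b
print(form_decoded)   # a b
print(literal_plus)   # 1%2B1
```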
Which characters do not need URL encoding?
Unreserved characters that don't need encoding include uppercase and lowercase letters (A-Z, a-z), digits (0-9), and four special characters: hyphen (-), underscore (_), period (.), and tilde (~).
How are non-ASCII characters like Chinese or emoji encoded in URLs?
Non-ASCII characters are first converted to their UTF-8 byte sequences, then each byte is percent-encoded. For example, a Chinese character might become three %XX sequences, and an emoji might become four %XX sequences.
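The two steps, UTF-8 conversion followed by percent-encoding of each byte, can be observed directly in Python. The sample characters below were picked for illustration only.

```python
from urllib.parse import quote, unquote

han = "中"     # U+4E2D, three bytes in UTF-8
emoji = "😀"   # U+1F600, four bytes in UTF-8

han_bytes = han.encode("utf-8")
emoji_bytes = emoji.encode("utf-8")

han_encoded = quote(han)      # one %XX sequence per byte
emoji_encoded = quote(emoji)

print(len(han_bytes), han_encoded)      # 3 %E4%B8%AD
print(len(emoji_bytes), emoji_encoded)  # 4 %F0%9F%98%80
print(unquote(han_encoded))             # 中
```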
Should I encode the entire URL or just certain parts?
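Encode only the individual components (each path segment and each query parameter name and value), not the structural characters like :, /, ?, &, and = that separate them. Encoding those delimiters would break the URL structure and make it unparseable. Conversely, a literal & or = that appears inside a value must be encoded so it is not mistaken for a delimiter.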
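One way to follow this rule is to encode each component separately and then join the pieces with unencoded delimiters. The sketch below uses Python's standard library; the host, path, and parameter names are invented for the example.

```python
from urllib.parse import parse_qs, quote, urlencode, urlsplit

base = "https://example.com"
segment = "reports/2024 Q1"                   # a single path segment containing '/' and a space
params = {"q": "cats & dogs", "page": "2"}

# Encode the segment on its own so its '/' is not read as a path separator.
path = "/files/" + quote(segment, safe="")

# urlencode encodes each name and value, then joins them with '=' and '&'.
query = urlencode(params)

url = f"{base}{path}?{query}"
print(url)
# https://example.com/files/reports%2F2024%20Q1?q=cats+%26+dogs&page=2

# Parsing splits on the delimiters first and decodes each value afterwards.
parsed = parse_qs(urlsplit(url).query)
print(parsed)  # {'q': ['cats & dogs'], 'page': ['2']}
```

Note that `urlencode` uses the form convention by default, so spaces in query values appear as + rather than %20.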