What is Regex?
Regex (short for regular expression) is a sequence of characters that defines a search pattern. Regex engines use such patterns to search, match, extract, and replace text within strings.
Quick Facts
| Full Name | Regular Expression |
|---|---|
| Created | 1951 by Stephen Cole Kleene (first applied to text search by Ken Thompson in 1968) |
How Regex Works
Regular expressions combine literal characters, which match themselves, with metacharacters, which have special meaning. Common metacharacters are . (any single character, by default excluding a newline), * (zero or more of the preceding element), + (one or more), ? (zero or one), and [] (a character class, matching any one of the listed characters). Anchors such as ^ (start of the string or line) and $ (end) match positions rather than characters. Parentheses () group part of a pattern and capture the text it matches, and | provides alternation (OR logic). Most programming languages support regex through built-in functions or libraries, but syntax and features differ between flavors such as PCRE, JavaScript, and Python, so a pattern written for one may need adjusting for another.
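The metacharacters described above can be tried out in a few lines. The following is a minimal sketch using Python's standard re module; the sample strings and variable names are made up for illustration, and other regex flavors may behave slightly differently.

```python
import re

# "." matches any single character (except a newline by default).
dot = re.findall(r"c.t", "cat cot cut ct")

# Quantifiers: "*" zero or more, "+" one or more, "?" zero or one.
star = re.findall(r"ab*", "a ab abbb")
plus = re.findall(r"ab+", "a ab abbb")
optional = re.findall(r"colou?r", "color colour")

# Character classes "[]" with anchors "^" (start) and "$" (end):
# the whole string must be one capitalized word.
anchored = re.search(r"^[A-Z][a-z]+$", "Hello") is not None
not_anchored = re.search(r"^[A-Z][a-z]+$", "Hello world") is not None

# A group "()" captures text; "|" provides alternation.
m = re.search(r"(cat|dog)s?", "I like dogs")
captured = m.group(1)  # text matched inside the parentheses
whole = m.group(0)     # text matched by the entire pattern

print(dot, star, plus, optional, anchored, not_anchored, captured, whole)
```

Here group(1) returns only the captured alternative, while group(0) returns the full match including the optional trailing "s".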
Key Characteristics
- Pattern-based text matching and manipulation
- Support for quantifiers (*, +, ?, {n,m})
- Character classes and ranges ([a-z], [0-9], \d, \w)
- Anchors for position matching (^, $, \b)
- Grouping and capturing with parentheses
- Lookahead and lookbehind assertions
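As a rough illustration of the characteristics listed above, here is a short sketch in Python's re module. The sample text and variable names are invented for the example; lookbehind support in particular varies between regex flavors.

```python
import re

text = "price: $42, qty: 7, total: $294"

# {n,m} quantifier with \d and \b: runs of exactly 2 to 3 digits.
two_or_three = re.findall(r"\b\d{2,3}\b", text)

# Grouping and capturing: each match yields a (label, number) tuple.
pairs = re.findall(r"(\w+): \$?(\d+)", text)

# Lookahead (?=:) : words followed by ":", without consuming the colon.
labels = re.findall(r"\w+(?=:)", text)

# Lookbehind (?<=\$) : digits preceded by "$", without consuming the "$".
dollars = re.findall(r"(?<=\$)\d+", text)

# Negative lookbehind: digits NOT preceded by "$" or by another digit.
plain = re.findall(r"(?<![$\d])\d+", text)

print(two_or_three, pairs, labels, dollars, plain)
```

Lookahead and lookbehind only assert what surrounds a match, so the "$" and ":" characters never appear in the results.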
Common Use Cases
- Form validation (email, phone, password patterns)
- Search and replace operations in text editors
- Data extraction and web scraping
- Log file parsing and analysis
- Input sanitization and security filtering
Example
Email validation (simplified; covers common addresses, not every valid one):
^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$
Phone number (US):
^\(?\d{3}\)?[-.]?\d{3}[-.]?\d{4}$
URL extraction:
https?://[\w.-]+(?:/[\w.-]*)*
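The three patterns above can be exercised as follows. This is a minimal sketch in Python; the function names is_email, is_us_phone, and extract_urls and the sample inputs are illustrative assumptions, and the patterns are copied unchanged.

```python
import re

EMAIL = re.compile(r"^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$")
US_PHONE = re.compile(r"^\(?\d{3}\)?[-.]?\d{3}[-.]?\d{4}$")
URL = re.compile(r"https?://[\w.-]+(?:/[\w.-]*)*")

def is_email(s):
    """True if s matches the simplified email pattern."""
    return EMAIL.match(s) is not None

def is_us_phone(s):
    """True if s matches the US phone pattern (no spaces allowed)."""
    return US_PHONE.match(s) is not None

def extract_urls(text):
    """Return every substring of text that matches the URL pattern."""
    return URL.findall(text)

print(is_email("user@example.com"), is_email("user@example"))
print(is_us_phone("(555)123-4567"), is_us_phone("(555) 123-4567"))
print(extract_urls("Docs: https://example.com/docs/index.html, mirror: http://test.org/a"))
```

Two limitations are worth keeping in mind: the phone pattern rejects a space after the area code, and because "." is part of the URL character class, a sentence-ending period directly after a URL would be included in the extracted match.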