What is Regex?
Regex (Regular Expression) is a sequence of characters that defines a search pattern, primarily used for pattern matching within strings. Regular expressions provide a powerful and flexible way to search, match, and manipulate text.
Quick Facts
| Full Name | Regular Expression |
|---|---|
| Created | 1951 by Stephen Cole Kleene (first implemented for text search by Ken Thompson in 1968) |
How It Works
Regular expressions combine literal characters with metacharacters to define patterns. Metacharacters include . (any character, excluding newlines by default), * (zero or more), + (one or more), ? (zero or one), and [] (character class). Anchors such as ^ (start) and $ (end) match positions rather than characters. Parentheses () group and capture matched text, and alternation | provides OR logic.
Most programming languages support regex through built-in functions or libraries, but syntax and features vary between implementations. JavaScript uses /pattern/flags literals and gained lookbehind assertions in ES2018. Python's re module supports a verbose mode that allows whitespace and comments inside a pattern. Go's regexp package uses RE2 syntax and avoids backtracking, which guarantees matching time linear in the input size but rules out backreferences and lookaround. PCRE (Perl Compatible Regular Expressions) is among the most feature-rich engines, but like other backtracking engines it can be vulnerable to ReDoS (regular expression denial of service) when a pattern backtracks catastrophically.
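As a rough illustration of how these pieces combine, the sketch below uses Python's re module to build one pattern from anchors, a character class, a quantifier, an escaped dot, two capturing groups, and alternation. The file-name format and the names IMAGE_NAME, IMAGE_NAME_VERBOSE, and split_image_name are assumptions made up for this example. The second pattern is the same expression written in verbose mode with comments.

```python
import re

# ^ and $ anchor the pattern to the whole string; [a-z0-9_]+ is a character
# class with a "one or more" quantifier; \. is a literal dot; (jpg|png) is a
# capturing group that uses alternation.
IMAGE_NAME = re.compile(r"^([a-z0-9_]+)\.(jpg|png)$")

# The same pattern in verbose mode, where whitespace is ignored and
# comments are allowed inside the pattern.
IMAGE_NAME_VERBOSE = re.compile(r"""
    ^               # start of string
    ([a-z0-9_]+)    # group 1: lowercase letters, digits, or underscores
    \.              # a literal dot
    (jpg|png)       # group 2: either extension
    $               # end of string
""", re.VERBOSE)

def split_image_name(filename):
    """Return (stem, extension) if filename fits the pattern, else None."""
    match = IMAGE_NAME.search(filename)
    return match.groups() if match else None

print(split_image_name("cat_01.png"))  # ('cat_01', 'png')
print(split_image_name("cat_01.gif"))  # None
print(split_image_name("my cat.png"))  # None
```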
Key Characteristics
- Pattern-based text matching and manipulation
- Support for quantifiers (*, +, ?, {n,m})
- Character classes and ranges ([a-z], [0-9], \d, \w)
- Anchors for position matching (^, $, \b)
- Grouping and capturing with parentheses
- Lookahead and lookbehind assertions
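The characteristics listed above can be sketched in a few lines of Python. The sample text is invented for illustration, and the variable names are arbitrary. Lookaround assertions check the surrounding context without including it in the match.

```python
import re

text = "price $45, qty 30 units, code A7"

# Lookbehind: digits preceded by a dollar sign; the sign is not captured.
after_dollar = re.findall(r"(?<=\$)\d+", text)

# Lookahead: digits followed by " units"; the suffix is not captured.
before_units = re.findall(r"\d+(?= units)", text)

# Word boundaries: standalone numbers only, so the 7 in "A7" is skipped.
whole_numbers = re.findall(r"\b\d+\b", text)

# Range quantifier {n,m} with a character class: words of 3 to 5 lowercase letters.
short_words = re.findall(r"\b[a-z]{3,5}\b", text)

print(after_dollar)   # ['45']
print(before_units)   # ['30']
print(whole_numbers)  # ['45', '30']
print(short_words)    # ['price', 'qty', 'units', 'code']
```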
Common Use Cases
- Form validation (email, phone, password patterns)
- Search and replace operations in text editors
- Data extraction and web scraping
- Log file parsing and analysis
- Input sanitization and security filtering
Example
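The following is an illustrative sketch in Python, not code from the original page. It touches two of the use cases above: log parsing with named groups, and search and replace. The log format, the LOG_LINE pattern, and the parse_log_line helper are hypothetical and chosen only for the example.

```python
import re

# Hypothetical log format, assumed for illustration:
#   "<date> <time> <LEVEL> <message>"
LOG_LINE = re.compile(
    r"^(?P<date>\d{4}-\d{2}-\d{2}) "
    r"(?P<time>\d{2}:\d{2}:\d{2}) "
    r"(?P<level>INFO|WARN|ERROR) "
    r"(?P<message>.*)$"
)

def parse_log_line(line):
    """Return a dict of named fields, or None if the line does not fit."""
    match = LOG_LINE.match(line)
    return match.groupdict() if match else None

record = parse_log_line("2024-01-15 08:30:00 ERROR disk quota exceeded")
print(record["level"], "-", record["message"])  # ERROR - disk quota exceeded

# Search and replace: mask anything shaped like NNN-NNNN.
masked = re.sub(r"\b\d{3}-\d{4}\b", "XXX-XXXX", "call 555-0199 or 555-0142")
print(masked)  # call XXX-XXXX or XXX-XXXX
```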
Frequently Asked Questions
What do the symbols *, +, and ? mean in regex?
These are quantifiers: * matches zero or more occurrences of the preceding element, + matches one or more occurrences, and ? matches zero or one occurrence. For example, 'ab*c' matches 'ac', 'abc', 'abbc', while 'ab+c' requires at least one 'b', and 'ab?c' matches only 'ac' or 'abc'.
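A quick way to check these claims is to run each pattern against the same candidate strings. This is a minimal sketch in Python. The helper name matching is an assumption for the example, and re.fullmatch is used so that each pattern must account for the whole string.

```python
import re

candidates = ["ac", "abc", "abbc"]

def matching(pattern):
    """Return the candidates that the pattern matches in full."""
    return [s for s in candidates if re.fullmatch(pattern, s)]

print(matching("ab*c"))  # ['ac', 'abc', 'abbc']
print(matching("ab+c"))  # ['abc', 'abbc']
print(matching("ab?c"))  # ['ac', 'abc']
```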
How do I match special characters like dots or brackets in regex?
Escape special characters with a backslash (\). To match a literal dot, use \. instead of . (which matches any character). Similarly, use \[, \], \(, \), \*, \+, \?, \{, \}, \^, \$, \|, and \\ for their literal meanings.
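The difference is easy to see side by side. The sketch below, in Python with invented sample strings, compares an unescaped dot with an escaped one. It then uses re.escape, which escapes a whole literal string for you when the text to match comes from data rather than from a hand-written pattern.

```python
import re

text = "file.txt filetxt file-txt"

# Unescaped dot matches any character, so "file-txt" also matches.
loose = re.findall(r"file.txt", text)

# Escaped dot matches only a literal ".".
strict = re.findall(r"file\.txt", text)

print(loose)   # ['file.txt', 'file-txt']
print(strict)  # ['file.txt']

# re.escape builds a pattern that matches the string literally.
literal = "1+1=2 (approx.)"
haystack = "claim: 1+1=2 (approx.) holds"

unescaped = re.search(literal, haystack)           # "+" and "()" act as metacharacters
escaped = re.search(re.escape(literal), haystack)  # every character taken literally

print(unescaped)        # None
print(escaped.group())  # 1+1=2 (approx.)
```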
What is the difference between greedy and lazy matching?
Greedy matching (default) matches as much as possible. Lazy matching (add ? after quantifier) matches as little as possible. For '<div>text</div>' with pattern '<.*>', greedy matches the entire string, while '<.*?>' matches just '<div>'. Lazy matching is useful when you want the shortest match.
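The example from the answer above can be reproduced directly. This is a small Python sketch, and the variable names are arbitrary.

```python
import re

html = "<div>text</div>"

# Greedy: .* grabs as much as possible, then backs off to the last ">".
greedy = re.search(r"<.*>", html).group()

# Lazy: .*? grabs as little as possible, stopping at the first ">".
lazy = re.search(r"<.*?>", html).group()

# With the lazy form, findall returns each tag separately.
tags = re.findall(r"<.*?>", html)

print(greedy)  # <div>text</div>
print(lazy)    # <div>
print(tags)    # ['<div>', '</div>']
```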
How can I test and debug my regular expressions?
Use online tools such as regex101.com, regexr.com, or debuggex.com, which provide real-time matching, explanations, and testing against sample text. Most code editors also have regex search features. Start with simple patterns and build complexity gradually, testing at each step.
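Testing at each step can also be done in code. The sketch below is one possible approach in Python. The check_pattern helper and the 24-hour time patterns are hypothetical, chosen only to show a loose first draft being tightened against the same lists of expected matches and non-matches.

```python
import re

def check_pattern(pattern, should_match, should_not_match):
    """Return a list of failure descriptions; an empty list means all cases pass."""
    compiled = re.compile(pattern)
    failures = []
    for s in should_match:
        if not compiled.fullmatch(s):
            failures.append(f"expected match: {s!r}")
    for s in should_not_match:
        if compiled.fullmatch(s):
            failures.append(f"unexpected match: {s!r}")
    return failures

good = ["09:30", "23:59"]
bad = ["9:30", "25:00"]

# Step 1: a loose first draft accepts any two digits on each side.
draft = check_pattern(r"\d{2}:\d{2}", good, bad)
print(draft)  # ["unexpected match: '25:00'"]

# Step 2: restrict hours to 00-23 and minutes to 00-59.
refined = check_pattern(r"([01]\d|2[0-3]):[0-5]\d", good, bad)
print(refined)  # []
```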