Regular Expressions Complete Guide [2026] - From Beginner to Expert

Regular expressions (regex) are a text pattern matching tool supported by virtually all modern programming languages. Whether the task is data validation, search and replace, or log analysis, regular expressions are an essential skill for developers. This guide takes you from the basic syntax through advanced features, common patterns, and best practices.

Table of Contents

Key Takeaways
What is a Regular Expression?
Basic Regex Syntax
Advanced Features
Common Regex Patterns
Code Examples
- JavaScript
- Python
- Java
- Go
Regex Best Practices
FAQ
Summary

Key Takeaways

Pattern Matching: Regular expressions are a language for describing string patterns, used for searching, matching, and manipulating text.
Universal Support: Almost all programming languages support regular expressions with similar syntax.
Powerful Capabilities: Complex text patterns can be described with concise expressions.
Performance Considerations: Complex regex can cause performance issues; design carefully.
Readability: Regex can be hard to read; consider adding comments or splitting complex patterns.
Testing is Critical: Always thoroughly test regex before using in production.

Want to quickly test your regular expressions? Try our free online tool with real-time matching and support for multiple programming language syntaxes.

Test Your Regex Now - Free Online Regex Tester

What is a Regular Expression?

A Regular Expression (Regex) is a formal language for describing string patterns. The concept comes from mathematical theory of the 1950s, where it was introduced by mathematician Stephen Cole Kleene. Today, regular expressions are a standard tool for text processing, widely used for:

Data Validation: Validating user input (email, phone, password, etc.)
Text Search: Finding specific patterns in large amounts of text
Text Replacement: Batch modifying text that matches patterns
Data Extraction: Extracting structured information from text
Log Analysis: Parsing and analyzing log files

The core idea of regular expressions is to use special characters and rules to describe a class of strings, rather than a specific string.

Basic Regex Syntax

Character Matching

The most basic regular expressions are literal characters that match themselves:

Pattern	Description	Example
`abc`	Matches literal string "abc"	"abc" ✓, "abcd" ✓ (matches the "abc" inside it)
`.`	Matches any single character (except newline)	"a.c" matches "abc", "a1c"
`\d`	Matches any digit [0-9]	"\d\d" matches "42"
`\D`	Matches any non-digit	"\D" matches "a"
`\w`	Matches word character [a-zA-Z0-9_]	"\w+" matches "hello_123"
`\W`	Matches non-word character	"\W" matches "@"
`\s`	Matches whitespace (space, tab, etc.)	"a\sb" matches "a b"
`\S`	Matches non-whitespace	"\S+" matches "hello"
`\\`	Matches backslash itself	"\\" matches "\"
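
The rows above can be checked with a minimal Python sketch. It assumes the standard `re` module, and the sample strings are made up for illustration; other engines may differ in detail.

```python
import re

# Each pair is (pattern, sample text); the samples are illustrative only.
samples = [
    (r"a.c", "a1c"),         # . matches any single character except newline
    (r"\d\d", "42"),         # two digits
    (r"\w+", "hello_123"),   # word characters include the underscore
    (r"a\sb", "a b"),        # \s matches the space
    (r"\\", "\\"),           # an escaped backslash matches one backslash
]

# fullmatch requires the pattern to cover the whole sample.
results = [bool(re.fullmatch(pattern, text)) for pattern, text in samples]
print(results)

# A literal pattern also matches inside a longer string when searching.
found = re.search(r"abc", "abcd")
print(found.group())
```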

Quantifiers

Quantifiers specify how many times the preceding element can occur:

Quantifier	Description	Example
`*`	Matches 0 or more times	`a*` matches "", "a", "aaa"
`+`	Matches 1 or more times	`a+` matches "a", "aaa", not ""
`?`	Matches 0 or 1 time	`a?` matches "", "a"
`{n}`	Matches exactly n times	`a{3}` matches "aaa"
`{n,}`	Matches at least n times	`a{2,}` matches "aa", "aaa", "aaaa"
`{n,m}`	Matches n to m times	`a{2,4}` matches "aa", "aaa", "aaaa"
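
To see the quantifier limits in action, here is a small Python sketch. The helper name `matches_whole` is hypothetical, introduced only for this illustration.

```python
import re

def matches_whole(pattern: str, text: str) -> bool:
    """Return True when the pattern matches the entire text."""
    return re.fullmatch(pattern, text) is not None

star_empty = matches_whole(r"a*", "")          # zero occurrences are allowed
plus_empty = matches_whole(r"a+", "")          # at least one "a" is required
exactly_three = matches_whole(r"a{3}", "aaa")
over_max = matches_whole(r"a{2,4}", "aaaaa")   # five exceeds the maximum of four

print(star_empty, plus_empty, exactly_three, over_max)
```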

Anchors

Anchors match positions rather than characters:

Anchor	Description	Example
`^`	Matches start of string	`^hello` matches strings starting with "hello"
`$`	Matches end of string	`world$` matches strings ending with "world"
`\b`	Matches word boundary	`\bcat\b` matches "cat" but not "category"
`\B`	Matches non-word boundary	`\Bcat` matches "cat" in "concat" but not in "category"
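
Because anchors match positions, it can help to look at where matches land. The following is a rough Python sketch with illustrative sample text:

```python
import re

text = "cat category concat"

# \b requires a word boundary on both sides, so only the standalone word matches.
whole_words = [m.start() for m in re.finditer(r"\bcat\b", text)]
print(whole_words)

# \B requires that the position is NOT a word boundary.
inside_end = re.search(r"\Bcat", "concat")      # "cat" preceded by a word character
inside_start = re.search(r"cat\B", "category")  # "cat" followed by a word character
print(inside_end.span(), inside_start.span())

# ^ and $ anchor to the start and end of the string.
starts = bool(re.search(r"^hello", "hello world"))
ends = bool(re.search(r"world$", "hello world"))
print(starts, ends)
```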

Groups and Capturing

Groups allow you to treat multiple characters as a single unit:

Syntax	Description	Example
`(abc)`	Capturing group, matches and remembers "abc"	`(ab)+` matches "abab"
`(?:abc)`	Non-capturing group, matches but doesn't remember	`(?:ab)+` matches "abab"
`\1, \2`	Backreference to nth capturing group	`(a)(b)\1\2` matches "abab"
`(?<name>abc)`	Named capturing group	`(?<year>\d{4})`
`(a|b)`	Alternation, matches a or b	`(cat|dog)` matches "cat" or "dog"
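
A short Python sketch of these group types follows. Note that Python spells named groups `(?P<name>...)`; the sample strings are assumptions chosen for illustration.

```python
import re

# Backreference: \1 must repeat whatever group 1 captured (a doubled word here).
doubled = re.search(r"\b(\w+) \1\b", "this is is a test")
print(doubled.group(1))

# Named groups are read back by name instead of by number.
date = re.fullmatch(r"(?P<year>\d{4})-(?P<month>\d{2})", "2026-01")
print(date.group("year"), date.group("month"))

# A non-capturing group still groups for the quantifier, but is not stored.
mixed = re.fullmatch(r"(?:ab)+(c)", "ababc")
print(mixed.groups())

# Alternation inside a group.
pets = re.findall(r"(?:cat|dog)", "cat, dog, bird")
print(pets)
```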

Character Classes

Character classes define a set of characters that can match:

Syntax	Description	Example
`[abc]`	Matches any one of a, b, or c	`[aeiou]` matches vowels
`[^abc]`	Matches any character except a, b, c	`[^0-9]` matches non-digits
`[a-z]`	Matches any character from a to z	`[A-Za-z]` matches any letter
`[0-9]`	Matches any digit from 0 to 9	Equivalent to `\d` when `\d` is ASCII-only
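
Character classes are easy to try out directly. This Python sketch uses made-up inputs; the last line relies on Python 3 behaviour, where `\d` on text patterns also accepts non-ASCII decimal digits unless `re.ASCII` is set.

```python
import re

vowels = re.findall(r"[aeiou]", "regular")
print(vowels)

# A negated class keeps everything that is NOT a digit, so sub() strips it out.
digits_only = re.sub(r"[^0-9]", "", "+1 (555) 010-9999")
print(digits_only)

letters = re.fullmatch(r"[A-Za-z]+", "Regex") is not None
print(letters)

# U+0663 is the Arabic-Indic digit three: \d accepts it, [0-9] does not.
arabic_three = "\u0663"
unicode_digit = bool(re.fullmatch(r"\d", arabic_three))
ascii_digit = bool(re.fullmatch(r"[0-9]", arabic_three))
print(unicode_digit, ascii_digit)
```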

Advanced Features

Lookaround Assertions

Lookaround assertions match positions without consuming characters:

Syntax	Name	Description
`(?=pattern)`	Positive Lookahead	Matches position followed by pattern
`(?!pattern)`	Negative Lookahead	Matches position not followed by pattern
`(?<=pattern)`	Positive Lookbehind	Matches position preceded by pattern
`(?<!pattern)`	Negative Lookbehind	Matches position not preceded by pattern

Examples:

code

# Positive lookahead: Match digits followed by "USD"
\d+(?=USD)
Input: "100USD" → Matches "100"

# Negative lookahead: Match "foo" not followed by "bar"
foo(?!bar)
Input: "foobaz" → Matches "foo"
Input: "foobar" → No match

# Positive lookbehind: Match digits preceded by "$"
(?<=\$)\d+
Input: "$100" → Matches "100"

# Negative lookbehind: Match "happy" not preceded by "un"
(?<!un)happy
Input: "happy" → Matches
Input: "unhappy" → No match
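
The four examples above can be reproduced with a minimal Python sketch. Python's `re` module requires lookbehind patterns to have a fixed width, which both lookbehinds here satisfy.

```python
import re

# Positive lookahead: "USD" is required but not included in the match.
amount = re.search(r"\d+(?=USD)", "100USD").group()

# Negative lookahead: "foo" is rejected when "bar" follows.
foo_baz = re.search(r"foo(?!bar)", "foobaz")
foo_bar = re.search(r"foo(?!bar)", "foobar")

# Positive lookbehind: "$" is required but not included in the match.
dollars = re.search(r"(?<=\$)\d+", "$100").group()

# Negative lookbehind: "happy" is rejected when "un" precedes it.
happy = re.search(r"(?<!un)happy", "happy")
unhappy = re.search(r"(?<!un)happy", "unhappy")

print(amount, dollars)
print(foo_baz is not None, foo_bar is not None)
print(happy is not None, unhappy is not None)
```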

Greedy vs Non-Greedy Matching

By default, quantifiers are greedy and match as many characters as possible. Adding ? after a quantifier makes it non-greedy (lazy):

Greedy	Non-Greedy	Description
`*`	`*?`	Match 0 or more, as few as possible
`+`	`+?`	Match 1 or more, as few as possible
`?`	`??`	Match 0 or 1, as few as possible
`{n,m}`	`{n,m}?`	Match n to m, as few as possible

Example:

code

Input: "<div>hello</div><div>world</div>"

Greedy: <div>.*</div>
Result: "<div>hello</div><div>world</div>" (matches entire string)

Non-greedy: <div>.*?</div>
Result: "<div>hello</div>" (matches first div only)
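
The same comparison as a Python sketch, using `re.findall` so that every match is listed rather than only the first:

```python
import re

html = "<div>hello</div><div>world</div>"

# Greedy: .* runs to the last </div>, so there is a single long match.
greedy = re.findall(r"<div>.*</div>", html)

# Lazy: .*? stops at the nearest </div>, so each div is matched separately.
lazy = re.findall(r"<div>.*?</div>", html)

print(greedy)
print(lazy)
```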

Flags/Modifiers

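Flags modify how the regex engine matches. Availability varies by engine: for example, `g` is a JavaScript flag (Python uses functions such as `re.findall` instead), and `x` is available in Python and Java but not in JavaScript's built-in `RegExp`.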

Flag	Description
`i`	Case-insensitive matching
`g`	Global matching (find all matches)
`m`	Multiline mode (^ and $ match line start/end)
`s`	Single-line mode (. matches newlines too)
`u`	Unicode mode
`x`	Extended mode (ignore whitespace, allow comments)
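
As a rough illustration, here is how several of these flags look in Python, where they are passed as `re` constants rather than single letters. The log text is invented for the example.

```python
import re

text = "Error: disk full\nerror: retry"

# i: case-insensitive matching.
case_insensitive = re.findall(r"error", text, flags=re.IGNORECASE)

# m: ^ matches at the start of every line, not just the start of the string.
line_starts = re.findall(r"^\w+", text, flags=re.MULTILINE)

# s: . also matches the newline between the two lines.
dot_crosses_newline = re.search(r"full.error", text, flags=re.DOTALL) is not None

# x: whitespace in the pattern is ignored and # starts a comment.
log_line = re.compile(r"""
    (\w+)   # level
    :\s*    # separator
    (.+)    # message (stops at the newline)
""", re.VERBOSE)
parsed = log_line.match(text).groups()

print(case_insensitive, line_starts, dot_crosses_newline, parsed)
```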

Common Regex Patterns

Email Validation

regex

^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$

Breakdown:

^ - Start of string
[a-zA-Z0-9._%+-]+ - Username part with letters, digits, and special chars
@ - At symbol
[a-zA-Z0-9.-]+ - Domain part
\. - Dot
[a-zA-Z]{2,} - TLD, at least 2 letters
$ - End of string

Test Cases:

✓ user@example.com
✓ john.doe+tag@company.co.uk
✗ invalid@
✗ @nodomain.com
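
The test cases above can be run with a short Python sketch. It also shows one limitation of this pattern: because the domain class includes the dot, consecutive dots are accepted, so treat the regex as a first filter rather than full address validation.

```python
import re

# Same pattern as above; fullmatch makes the ^ and $ anchors unnecessary.
EMAIL = re.compile(r"[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}")

cases = ["user@example.com", "john.doe+tag@company.co.uk", "invalid@", "@nodomain.com"]
outcome = {case: EMAIL.fullmatch(case) is not None for case in cases}
print(outcome)

# Limitation: a doubled dot in the domain still passes.
double_dot = EMAIL.fullmatch("user@example..com") is not None
print(double_dot)
```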

Phone Number Validation

US Phone Number:

regex

^(\+1)?[-.\s]?\(?\d{3}\)?[-.\s]?\d{3}[-.\s]?\d{4}$

International Phone (E.164 format):

regex

^\+?[1-9]\d{1,14}$

Breakdown:

^\+? - Optional plus sign at start
[1-9] - First digit cannot be zero
\d{1,14} - 1 to 14 more digits
$ - End of string
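
A minimal Python sketch that exercises both phone patterns follows. The phone numbers are fictional samples, and the function names are hypothetical.

```python
import re

US_PHONE = re.compile(r"(\+1)?[-.\s]?\(?\d{3}\)?[-.\s]?\d{3}[-.\s]?\d{4}")
E164 = re.compile(r"\+?[1-9]\d{1,14}")

def is_us_phone(text: str) -> bool:
    return US_PHONE.fullmatch(text) is not None

def is_e164(text: str) -> bool:
    return E164.fullmatch(text) is not None

us_results = [is_us_phone(s) for s in ["(555) 123-4567", "+1-555-123-4567", "555-1234"]]
print(us_results)

e164_results = [
    is_e164("+14155552671"),       # typical number
    is_e164("014155552671"),       # leading zero is rejected
    is_e164("+1234567890123456"),  # 16 digits is one too many
    is_e164("+1"),                 # at least two digits are required
]
print(e164_results)
```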

URL Validation

regex

^(https?:\/\/)?([\da-z\.-]+)\.([a-z\.]{2,6})([\/\w \.-]*)*\/?$

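Note that the pattern above ends with a nested quantifier, `([\/\w \.-]*)*`, the kind of construct that can backtrack catastrophically on input that almost matches.

A more permissive alternative that requires a scheme and also accepts `ftp`: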

regex

^(https?|ftp):\/\/[^\s/$.?#].[^\s]*$

Breakdown:

^(https?|ftp):\/\/ - Protocol part
[^\s/$.?#] - First character of domain
[^\s]* - Rest of URL
$ - End of string
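
Here is a rough Python check of the scheme-based pattern, with illustrative URLs. Python does not need the slashes escaped, so they are written plainly.

```python
import re

URL = re.compile(r"(https?|ftp)://[^\s/$.?#].[^\s]*")

def looks_like_url(text: str) -> bool:
    """Shape check only; it does not verify that the host exists."""
    return URL.fullmatch(text) is not None

url_results = {
    "https://example.com/path?q=1": looks_like_url("https://example.com/path?q=1"),
    "ftp://files.example.org": looks_like_url("ftp://files.example.org"),
    "http://": looks_like_url("http://"),
    "https://exa mple.com": looks_like_url("https://exa mple.com"),
    "mailto:user@example.com": looks_like_url("mailto:user@example.com"),
}
print(url_results)
```

When the parts of the URL are needed (host, path, query), a dedicated parser such as Python's `urllib.parse.urlsplit` is usually a better fit than a regex.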

IP Address Validation

IPv4 Address:

regex

^((25[0-5]|2[0-4]\d|[01]?\d\d?)\.){3}(25[0-5]|2[0-4]\d|[01]?\d\d?)$

Breakdown:

25[0-5] - Matches 250-255
2[0-4]\d - Matches 200-249
[01]?\d\d? - Matches 0-199
\. - Dot separator
{3} - First three octets
Last octet without trailing dot
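
The IPv4 pattern can be exercised with a small Python sketch. The last case shows that leading zeros are accepted by this pattern, which may or may not be what an application wants.

```python
import re

OCTET = r"(25[0-5]|2[0-4]\d|[01]?\d\d?)"
IPV4 = re.compile(rf"({OCTET}\.){{3}}{OCTET}")

def is_ipv4(text: str) -> bool:
    return IPV4.fullmatch(text) is not None

ipv4_results = {
    "192.168.1.1": is_ipv4("192.168.1.1"),
    "255.255.255.255": is_ipv4("255.255.255.255"),
    "256.1.1.1": is_ipv4("256.1.1.1"),        # 256 is out of range
    "1.2.3": is_ipv4("1.2.3"),                # only three octets
    "01.02.03.004": is_ipv4("01.02.03.004"),  # leading zeros pass
}
print(ipv4_results)
```

In Python code, the standard `ipaddress` module is an alternative that parses addresses without a hand-written pattern.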

IPv6 Address (full eight-group form only; compressed forms such as `::1` are not matched):

regex

^([0-9a-fA-F]{1,4}:){7}[0-9a-fA-F]{1,4}$

Password Strength Validation

At least 8 characters with uppercase, lowercase, and digit (only letters and digits are allowed):

regex

^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)[a-zA-Z\d]{8,}$

Stronger Password (with special characters):

regex

^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)(?=.*[@$!%*?&])[A-Za-z\d@$!%*?&]{8,}$

Breakdown:

(?=.*[a-z]) - At least one lowercase letter
(?=.*[A-Z]) - At least one uppercase letter
(?=.*\d) - At least one digit
(?=.*[@$!%*?&]) - At least one special character
{8,} - At least 8 characters
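
A minimal Python sketch of the stronger pattern, with invented sample passwords. Each lookahead is checked from the start of the string, and the final character class limits which characters may appear at all.

```python
import re

STRONG = re.compile(
    r"(?=.*[a-z])(?=.*[A-Z])(?=.*\d)(?=.*[@$!%*?&])[A-Za-z\d@$!%*?&]{8,}"
)

def is_strong(password: str) -> bool:
    return STRONG.fullmatch(password) is not None

password_results = {
    "Passw0rd!": is_strong("Passw0rd!"),    # meets every rule
    "password1!": is_strong("password1!"),  # no uppercase letter
    "Passw0rd#": is_strong("Passw0rd#"),    # "#" is not in the allowed special set
    "Pw0rd!": is_strong("Pw0rd!"),          # shorter than 8 characters
}
print(password_results)
```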

Credit Card Validation

Visa:

regex

^4[0-9]{12}(?:[0-9]{3})?$

Mastercard (51-55 prefixes only; the newer 2221-2720 range is not covered):

regex

^5[1-5][0-9]{14}$

General Credit Card (Luhn algorithm validation needed separately):

regex

^(?:4[0-9]{12}(?:[0-9]{3})?|5[1-5][0-9]{14}|3[47][0-9]{13}|6(?:011|5[0-9]{2})[0-9]{12})$
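
Since the pattern only checks prefix and length, the Luhn checksum has to be computed separately. The following Python sketch combines the two; the function names are hypothetical and the card numbers are widely published test numbers, not real accounts.

```python
import re

CARD = re.compile(
    r"(?:4[0-9]{12}(?:[0-9]{3})?|5[1-5][0-9]{14}|3[47][0-9]{13}"
    r"|6(?:011|5[0-9]{2})[0-9]{12})"
)

def luhn_ok(number: str) -> bool:
    """Luhn checksum: double every second digit from the right, sum, check mod 10."""
    total = 0
    for index, char in enumerate(reversed(number)):
        digit = int(char)
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9  # same as adding the two digits of the product
        total += digit
    return total % 10 == 0

def looks_like_card(number: str) -> bool:
    """Shape check first, then the checksum (digits only, no spaces)."""
    return CARD.fullmatch(number) is not None and luhn_ok(number)

visa_ok = looks_like_card("4111111111111111")
visa_bad_checksum = looks_like_card("4111111111111112")
mastercard_ok = looks_like_card("5500000000000004")
print(visa_ok, visa_bad_checksum, mastercard_ok)
```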

Code Examples

JavaScript

javascript

// Basic matching
const emailRegex = /^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$/;
const email = "user@example.com";
console.log(emailRegex.test(email)); // true

// Using match to extract matches
const text = "Contact: 123-456-7890 or 098-765-4321";
const phoneRegex = /\d{3}-\d{3}-\d{4}/g;
const phones = text.match(phoneRegex);
console.log(phones); // ["123-456-7890", "098-765-4321"]

// Using capturing groups
const urlRegex = /^(https?):\/\/([^\/]+)(\/.*)?$/;
const url = "https://example.com/path/to/page";
const match = url.match(urlRegex);
if (match) {
  console.log("Protocol:", match[1]); // "https"
  console.log("Domain:", match[2]);   // "example.com"
  console.log("Path:", match[3]);     // "/path/to/page"
}

// Using named capturing groups
const dateRegex = /(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})/;
const dateMatch = "2026-01-12".match(dateRegex);
console.log(dateMatch.groups.year);  // "2026"
console.log(dateMatch.groups.month); // "01"
console.log(dateMatch.groups.day);   // "12"

// Replacement operations
const masked = "1234567890".replace(/(\d{3})\d{4}(\d{3})/, "$1****$2");
console.log(masked); // "123****890"

// Using exec for iterative matching
const regex = /\d+/g;
const str = "Price: $100, Quantity: 50";
let result;
while ((result = regex.exec(str)) !== null) {
  console.log(`Found ${result[0]} at position ${result.index}`);
}
// Output:
// Found 100 at position 8
// Found 50 at position 23

Python

python

import re

# Basic matching
email_pattern = r'^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$'
email = "user@example.com"
if re.match(email_pattern, email):
    print("Email is valid")

# Find all matches
text = "Contact: 123-456-7890 or 098-765-4321"
phones = re.findall(r'\d{3}-\d{3}-\d{4}', text)
print(phones)  # ['123-456-7890', '098-765-4321']

# Using capturing groups
url_pattern = r'^(https?):\/\/([^\/]+)(\/.*)?$'
url = "https://example.com/path/to/page"
match = re.match(url_pattern, url)
if match:
    print(f"Protocol: {match.group(1)}")  # https
    print(f"Domain: {match.group(2)}")    # example.com
    print(f"Path: {match.group(3)}")      # /path/to/page

# Using named capturing groups
date_pattern = r'(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})'
date_match = re.match(date_pattern, "2026-01-12")
if date_match:
    print(date_match.group('year'))   # 2026
    print(date_match.group('month'))  # 01
    print(date_match.group('day'))    # 12

# Replacement operations
phone = "1234567890"
masked = re.sub(r'(\d{3})\d{4}(\d{3})', r'\1****\2', phone)
print(masked)  # 123****890

# Compile regex for better performance
pattern = re.compile(r'\d+')
numbers = pattern.findall("Price: $100, Quantity: 50")
print(numbers)  # ['100', '50']

# Using finditer to get match objects
for match in re.finditer(r'\d+', "Price: $100, Quantity: 50"):
    print(f"Found {match.group()} at position {match.start()}-{match.end()}")

Java

java

import java.util.regex.*;
import java.util.ArrayList;
import java.util.List;

public class RegexExample {
    public static void main(String[] args) {
        // Basic matching
        String emailPattern = "^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\\.[a-zA-Z]{2,}$";
        String email = "user@example.com";
        boolean isValid = email.matches(emailPattern);
        System.out.println("Email valid: " + isValid);

        // Find all matches
        String text = "Contact: 123-456-7890 or 098-765-4321";
        Pattern phonePattern = Pattern.compile("\\d{3}-\\d{3}-\\d{4}");
        Matcher matcher = phonePattern.matcher(text);
        List<String> phones = new ArrayList<>();
        while (matcher.find()) {
            phones.add(matcher.group());
        }
        System.out.println(phones); // [123-456-7890, 098-765-4321]

        // Using capturing groups
        String urlPattern = "^(https?)://([^/]+)(/.*)?$";
        String url = "https://example.com/path/to/page";
        Pattern pattern = Pattern.compile(urlPattern);
        Matcher urlMatcher = pattern.matcher(url);
        if (urlMatcher.matches()) {
            System.out.println("Protocol: " + urlMatcher.group(1)); // https
            System.out.println("Domain: " + urlMatcher.group(2));   // example.com
            System.out.println("Path: " + urlMatcher.group(3));     // /path/to/page
        }

        // Using named capturing groups (Java 7+)
        String datePattern = "(?<year>\\d{4})-(?<month>\\d{2})-(?<day>\\d{2})";
        Pattern dateRegex = Pattern.compile(datePattern);
        Matcher dateMatcher = dateRegex.matcher("2026-01-12");
        if (dateMatcher.matches()) {
            System.out.println("Year: " + dateMatcher.group("year"));
            System.out.println("Month: " + dateMatcher.group("month"));
            System.out.println("Day: " + dateMatcher.group("day"));
        }

        // Replacement operations
        String phone = "1234567890";
        String masked = phone.replaceAll("(\\d{3})\\d{4}(\\d{3})", "$1****$2");
        System.out.println(masked); // 123****890
    }
}

Go

go

package main

import (
    "fmt"
    "regexp"
)

func main() {
    // Basic matching
    emailPattern := `^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$`
    emailRegex := regexp.MustCompile(emailPattern)
    email := "user@example.com"
    fmt.Println("Email valid:", emailRegex.MatchString(email))

    // Find all matches
    text := "Contact: 123-456-7890 or 098-765-4321"
    phoneRegex := regexp.MustCompile(`\d{3}-\d{3}-\d{4}`)
    phones := phoneRegex.FindAllString(text, -1)
    fmt.Println(phones) // [123-456-7890 098-765-4321]

    // Using capturing groups
    urlPattern := `^(https?)://([^/]+)(/.*)?$`
    urlRegex := regexp.MustCompile(urlPattern)
    url := "https://example.com/path/to/page"
    matches := urlRegex.FindStringSubmatch(url)
    if len(matches) > 0 {
        fmt.Println("Protocol:", matches[1]) // https
        fmt.Println("Domain:", matches[2])   // example.com
        fmt.Println("Path:", matches[3])     // /path/to/page
    }

    // Using named capturing groups
    datePattern := `(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})`
    dateRegex := regexp.MustCompile(datePattern)
    dateMatch := dateRegex.FindStringSubmatch("2026-01-12")
    names := dateRegex.SubexpNames()
    for i, name := range names {
        if name != "" && i < len(dateMatch) {
            fmt.Printf("%s: %s\n", name, dateMatch[i])
        }
    }

    // Replacement operations
    phone := "1234567890"
    replaceRegex := regexp.MustCompile(`(\d{3})\d{4}(\d{3})`)
    masked := replaceRegex.ReplaceAllString(phone, "$1****$2")
    fmt.Println(masked) // 123****890

    // Using ReplaceAllStringFunc for complex replacements
    text2 := "Price $100"
    numRegex := regexp.MustCompile(`\d+`)
    result := numRegex.ReplaceAllStringFunc(text2, func(s string) string {
        return "[" + s + "]"
    })
    fmt.Println(result) // Price $[100]
}

Regex Best Practices

1. Keep It Simple

Complex regex is hard to maintain and debug. If possible, split complex patterns into multiple simple ones:

javascript

// Not recommended: One complex regex
const complexRegex = /^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)(?=.*[@$!%*?&])[A-Za-z\d@$!%*?&]{8,}$/;

// Recommended: Multiple simple checks
function validatePassword(password) {
  if (password.length < 8) return false;
  if (!/[a-z]/.test(password)) return false;
  if (!/[A-Z]/.test(password)) return false;
  if (!/\d/.test(password)) return false;
  if (!/[@$!%*?&]/.test(password)) return false;
  return true;
}

2. Use Non-Capturing Groups

If you don't need to capture the match, use non-capturing groups (?:...). This keeps group numbering clean and can save the engine a little work:

javascript

// Capturing group (saves match result)
/(cat|dog) food/

// Non-capturing group (groups without saving the match)
/(?:cat|dog) food/

3. Avoid Catastrophic Backtracking

Certain regex patterns can cause exponential backtracking and severe performance issues:

javascript

// Dangerous: Can cause catastrophic backtracking
/(a+)+$/

// Safe: a more precise pattern that accepts the same strings
// (atomic groups are another option in engines that support them; JavaScript does not)
/a+$/
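
The effect is easy to observe on a small input. This Python sketch (Python's `re` is also a backtracking engine) times both patterns on a string that cannot match; the input length of 18 is an arbitrary choice that keeps the slow case short, and the timings are single rough measurements.

```python
import re
import time

def time_search(pattern: str, text: str) -> float:
    """Return the seconds one re.search call takes (rough, single run)."""
    start = time.perf_counter()
    re.search(pattern, text)
    return time.perf_counter() - start

# A run of "a" followed by a character that forces the match to fail.
text = "a" * 18 + "b"

nested = time_search(r"(a+)+$", text)  # many ways to split the run of a's
simple = time_search(r"a+$", text)     # only one way to match the run

print(f"nested: {nested:.4f}s, simple: {simple:.6f}s")

# Both patterns accept the same strings; only the cost of failing differs.
agree = all(
    bool(re.search(r"(a+)+$", s)) == bool(re.search(r"a+$", s))
    for s in ["aaa", "baaa", "aab", ""]
)
print(agree)
```

Each extra "a" roughly doubles the work for the nested pattern, so avoid raising the length much when experimenting.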

4. Pre-compile Regular Expressions

When a pattern is used repeatedly, for example in a loop, compile it once and reuse the compiled object:

python

import re

# Less direct: each call looks the pattern up in re's internal cache
for line in lines:
    if re.match(r'\d+', line):
        process(line)

# Recommended: Pre-compile
pattern = re.compile(r'\d+')
for line in lines:
    if pattern.match(line):
        process(line)

5. Use Anchors

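When you know where the match must occur, add an anchor. The engine can then reject non-matching input sooner. An anchor also changes what matches, so use it only when the position really is fixed:

javascript

// Unanchored: matches "hello" anywhere in the string
/hello/

// Anchored: matches only when the string starts with "hello"
/^hello/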

6. Test Edge Cases

Always test various edge cases before production use:

Empty strings
Very long strings
Special characters
Unicode characters
Newline characters
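
A few of these edge cases are sketched below in Python. The behaviours shown (`$` matching before a trailing newline, `\d` accepting non-ASCII digits) are specific to Python's `re` module; other engines may behave differently.

```python
import re

digits = re.compile(r"^\d+$")

# Newline: $ also matches just before a trailing newline, so match() accepts it.
trailing_newline_match = digits.match("123\n") is not None
trailing_newline_full = digits.fullmatch("123\n") is not None

# Empty string: + requires at least one digit.
empty = digits.match("") is not None

# Unicode: Arabic-Indic digits one, two, three.
arabic_digits = "\u0661\u0662\u0663"
unicode_default = re.fullmatch(r"\d+", arabic_digits) is not None
unicode_ascii = re.fullmatch(r"\d+", arabic_digits, flags=re.ASCII) is not None

print(trailing_newline_match, trailing_newline_full, empty)
print(unicode_default, unicode_ascii)
```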

FAQ

What's the difference between regex and wildcards?

Wildcards (like * and ?) are simplified pattern matching, mainly used for filename matching. Regular expressions are more powerful, supporting complex pattern descriptions, capturing groups, assertions, and other advanced features.

Feature	Wildcards	Regular Expressions
`*`	Matches any characters	Matches preceding char 0+ times
`?`	Matches single character	Matches preceding char 0 or 1 time
Complexity	Simple	Powerful but complex
Use Case	Filename matching	Text processing, validation
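
The different meaning of `?` can be seen side by side in a short Python sketch, using the standard `fnmatch` module for wildcards. The file names are made up.

```python
import fnmatch
import re

names = ["a.txt", "ab.txt", "abc.txt", "notes.md"]

# Wildcard: ? stands for exactly one character.
glob_hits = fnmatch.filter(names, "a?.txt")

# Regex: ? makes the preceding "b" optional, and the dot must be escaped.
regex_hits = [name for name in names if re.fullmatch(r"ab?\.txt", name)]

print(glob_hits)
print(regex_hits)
```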

How do I debug complex regular expressions?

Use online tools: Like our Regex Tester to see matches in real-time
Build incrementally: Start with simple patterns and add complexity gradually
Add comments: Use extended mode (x flag) to add comments
Use visualization tools: Convert regex to visual diagrams

How can I optimize regex performance?

Use anchors to limit search scope
Avoid unnecessary capturing groups
Use non-greedy matching when appropriate
Pre-compile regular expressions
Avoid nested quantifiers (like (a+)+)
Use more specific character classes

What are the differences between regex in different programming languages?

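Most programming languages use similar, Perl-derived regex syntax, but there are differences. Go's `regexp` package (RE2 syntax) also omits lookahead and backreferences in exchange for matching time that is linear in the input size: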

Feature	JavaScript	Python	Java	Go
Lookbehind	✓ (ES2018+)	✓	✓	✗
Named Groups	`(?<name>)`	`(?P<name>)`	`(?<name>)`	`(?P<name>)`
Unicode	Needs u flag	Default	Default	Default
Atomic Groups	✗	✓ (3.11+)	✓	✗

How can I test regex without writing code?

You can use online tools like our free Regex Tester to:

Test regular expressions in real-time
View matches and capturing groups
Get code examples in multiple programming languages
Save and share your regular expressions

Summary

Regular expressions are a core skill every developer should master. While the learning curve may be steep, once mastered, they will greatly improve your text processing efficiency.

Quick Summary:

Start with basic syntax: character matching, quantifiers, anchors
Master groups and capturing groups
Learn advanced features like lookaround assertions
Memorize common patterns (email, phone, URL, etc.)
Follow performance optimization best practices
Practice and test frequently
