Regex Tester
Test regular expressions with real-time match highlighting. 100% client-side — your data stays in your browser.
Regular Expressions: A Practical Introduction
Regular expressions (regex) are patterns that match text. They're built into virtually every programming language, text editor, and command-line tool. A regex like /\d{3}-\d{4}/ matches phone number patterns like "555-1234". Learning regex is one of the highest-leverage skills in programming — a single regex can replace dozens of lines of string-parsing code.
The basics are straightforward. . matches any character except a newline (by default). \d matches any digit. \w matches any word character (letter, digit, underscore). * means "zero or more." + means "one or more." ? means "zero or one." {3} means "exactly 3." [abc] matches any one of a, b, or c. ^ matches the start of the input and $ matches the end; in multiline mode they match at the start and end of each line.
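As a rough illustration of these primitives, here is a minimal sketch using Python's built-in re module. The sample text and variable names are made up for the example, and Python writes patterns as raw strings rather than between slashes.

```python
import re

# Hypothetical sample text, chosen only to exercise each primitive.
text = "Order 66 shipped to bay_7 on 2024-01-05"

digits = re.findall(r"\d+", text)            # + : one or more digits in a row
four_digits = re.findall(r"\d{4}", text)     # {4} : exactly four digits
words = re.findall(r"\w+", text)             # \w : letters, digits, underscore
vowels = re.findall(r"[aeiou]", "regex")     # [..] : any one listed character
spellings = re.findall(r"colou?r", "color colour")  # ? : zero or one "u"
starts_with_order = re.search(r"^Order", text) is not None  # ^ : start of input
ends_with_05 = re.search(r"05$", text) is not None          # $ : end of input

print(digits)       # ['66', '7', '2024', '01', '05']
print(four_digits)  # ['2024']
print(spellings)    # ['color', 'colour']
```

Note that \w treats bay_7 as a single word because the underscore counts as a word character.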
Where regex gets complex is combining these primitives. /^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$/ is a basic email validator. It looks intimidating, but break it down piece by piece and each part is simple: "one or more letters, digits, or the characters . _ % + -, then @, then one or more letters, digits, dots, or hyphens, then a dot, then two or more letters."
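The piece-by-piece reading can be written directly into the pattern. The sketch below uses Python's re.VERBOSE flag, which ignores whitespace in the pattern and allows comments; the helper name looks_like_email is hypothetical.

```python
import re

# The same email pattern as above, split into commented pieces.
EMAIL = re.compile(
    r"""
    ^                    # start of the input
    [a-zA-Z0-9._%+-]+    # local part: letters, digits, and . _ % + -
    @                    # a literal at sign
    [a-zA-Z0-9.-]+       # domain labels: letters, digits, dots, hyphens
    \.                   # the dot before the final label
    [a-zA-Z]{2,}         # final label: two or more letters
    $                    # end of the input
    """,
    re.VERBOSE,
)


def looks_like_email(candidate: str) -> bool:
    """Return True if the candidate has the basic shape of an email address."""
    return EMAIL.match(candidate) is not None


print(looks_like_email("jane.doe+news@mail.example.com"))  # True
print(looks_like_email("user@localhost"))                  # False: no dot in domain
```

This only checks shape. It says nothing about whether the address exists or can receive mail.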
Common Regex Patterns
Email validation: /^[\w.+-]+@[\w.-]+\.[a-zA-Z]{2,}$/ accepts most everyday addresses, including ones with subdomains such as user@mail.example.com. Note that truly comprehensive email validation via regex is practically impossible due to the RFC 5322 spec's complexity.
URL matching: /https?:\/\/[\w.-]+(?:\.[\w]{2,})(?:\/[^\s]*)?/ — matches http and https URLs with optional paths.
IP address: /\b(?:\d{1,3}\.){3}\d{1,3}\b/ — matches IPv4 addresses (doesn't validate range, so 999.999.999.999 would match).
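Date (YYYY-MM-DD): /\d{4}-(?:0[1-9]|1[0-2])-(?:0[1-9]|[12]\d|3[01])/ matches the ISO 8601 calendar-date format and restricts the month to 01-12 and the day to 01-31. It does not check the number of days in each month, so 2023-02-31 still matches.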
Phone number (US): /(?:\+1[-.\s]?)?\(?\d{3}\)?[-.\s]?\d{3}[-.\s]?\d{4}/ — matches various US phone formats like (555) 123-4567, 555-123-4567, +1 555 123 4567.
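Several of the patterns above check shape rather than meaning, so they are often paired with a follow-up check in code. The following is a minimal sketch of that idea in Python for the IPv4 and date patterns; the function names find_ipv4 and is_real_date are hypothetical, and the range and calendar checks are additions for illustration, not part of the patterns themselves.

```python
import re
from datetime import date

IPV4 = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")
ISO_DATE = re.compile(r"\d{4}-(?:0[1-9]|1[0-2])-(?:0[1-9]|[12]\d|3[01])")


def find_ipv4(text: str) -> list[str]:
    """Return regex matches whose four octets are all at most 255."""
    return [
        candidate
        for candidate in IPV4.findall(text)
        if all(int(octet) <= 255 for octet in candidate.split("."))
    ]


def is_real_date(candidate: str) -> bool:
    """Check the shape with the regex, then the calendar with datetime.date."""
    if not ISO_DATE.fullmatch(candidate):
        return False
    year, month, day = map(int, candidate.split("-"))
    try:
        date(year, month, day)  # raises ValueError for dates like Feb 31
    except ValueError:
        return False
    return True


print(find_ipv4("hosts: 192.168.0.1, 999.999.999.999, 10.0.0.255"))
# ['192.168.0.1', '10.0.0.255']
print(is_real_date("2024-02-29"), is_real_date("2023-02-31"))  # True False
```

Splitting the work this way keeps the regex readable and leaves numeric and calendar rules to code that expresses them more naturally.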
Regex Performance and Gotchas
Catastrophic backtracking is the biggest performance trap in regex. In backtracking engines (such as those in JavaScript, Python, Java, and PCRE), patterns like /(a+)+b/ can take exponential time on inputs like "aaaaaaaaaaaaaaac" because the engine tries every possible way to divide the a's between the inner and outer groups before failing. Avoid nested quantifiers on overlapping patterns; here, /a+b/ matches the same strings without the nesting.
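One way to defuse a nested quantifier is to rewrite it in a flat form that matches the same strings. The sketch below, using Python's re module (a backtracking engine), compares the two forms on short inputs only; the helper name same_result is hypothetical, and the nested pattern is deliberately never run on a long failing input.

```python
import re

NESTED = re.compile(r"(a+)+b")  # nested quantifiers: exponential on failing input
FLAT = re.compile(r"a+b")       # flat rewrite that matches the same strings


def same_result(text: str) -> bool:
    """Return True if both patterns find the same match span (or both find none)."""
    nested = NESTED.search(text)
    flat = FLAT.search(text)
    if nested is None or flat is None:
        return nested is None and flat is None
    return nested.span() == flat.span()


# Short inputs only: the nested pattern is cheap here.
for sample in ["ab", "aaab", "aaac", "b", "xaab", ""]:
    assert same_result(sample)

# The flat pattern can be run on a long failing input without exponential blowup.
print(FLAT.search("a" * 1000 + "c"))  # None
```

Running NESTED against a long failing input would grow roughly exponentially with each added "a", which is why the comparison above sticks to short strings.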
Greedy vs. lazy matching. By default, .* is greedy: it matches as much as possible. .*? is lazy: it matches as little as possible. This matters when parsing HTML: /<div>.*<\/div>/ matches from the first <div> to the last </div> on the line. Use the lazy form, /<div>.*?<\/div>/, to stop at the closest closing tag instead.
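The difference is easy to see side by side. This is a small sketch in Python's re module with a made-up input string; Python patterns are plain strings, so the forward slash needs no escaping.

```python
import re

# Hypothetical one-line input with two sibling <div> elements.
html = "<div>first</div><div>second</div>"

greedy = re.findall(r"<div>.*</div>", html)   # .* grabs as much as possible
lazy = re.findall(r"<div>.*?</div>", html)    # .*? stops at the first </div>

print(greedy)  # ['<div>first</div><div>second</div>']
print(lazy)    # ['<div>first</div>', '<div>second</div>']
```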
Don't parse HTML with regex. This is a classic mistake. HTML allows tags to nest to arbitrary depth, and regular expressions cannot track nesting, so regex fundamentally can't parse it correctly. Use a proper HTML parser (DOMParser in browsers, cheerio in Node.js) for anything beyond trivial text extraction.
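To illustrate the parser-based approach, here is a minimal sketch using Python's standard-library html.parser module, swapped in for DOMParser or cheerio so that it runs without a browser or Node.js. The class DivTextCollector and the function top_level_div_text are hypothetical names for this example.

```python
from html.parser import HTMLParser


class DivTextCollector(HTMLParser):
    """Collect the text of each top-level <div>, tracking nesting depth."""

    def __init__(self) -> None:
        super().__init__()
        self.depth = 0                 # how many <div> elements are open
        self.current: list[str] = []   # text pieces of the div being read
        self.blocks: list[str] = []    # finished top-level div texts

    def handle_starttag(self, tag, attrs):
        if tag == "div":
            self.depth += 1

    def handle_endtag(self, tag):
        if tag == "div" and self.depth > 0:
            self.depth -= 1
            if self.depth == 0:  # a top-level div just closed
                self.blocks.append("".join(self.current).strip())
                self.current = []

    def handle_data(self, data):
        if self.depth > 0:
            self.current.append(data)


def top_level_div_text(html: str) -> list[str]:
    parser = DivTextCollector()
    parser.feed(html)
    parser.close()
    return parser.blocks


nested = "<div>outer <div>inner</div> tail</div><div>second</div>"
print(top_level_div_text(nested))  # ['outer inner tail', 'second']
```

On the same input, the lazy pattern /<div>.*?<\/div>/ would stop at the inner closing tag and return "<div>outer <div>inner</div>", cutting the outer element short. The depth counter is what the regex has no way to express.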