
Stop Writing Regex by Hand: Generate Patterns From Examples
Regular expressions are a write-only language. The joke persists because it's true. A regex that took 20 minutes to write takes 40 minutes to understand six months later. The cognitive overhead of regex syntax (backtracking behavior, greedy vs. lazy quantifiers, lookaheads, character class shorthand) creates a gap between "I know what pattern I want" and "I can express it correctly."

The problem with manual regex

Consider matching a US phone number. The "simple" pattern is \d{3}-\d{3}-\d{4}. But real phone numbers come in formats like (555) 123-4567, 555.123.4567, 555 123 4567, +1-555-123-4567, and 5551234567. A regex that handles all of these formats:

^(\+1[-.\s]?)?\(?\d{3}\)?[-.\s]?\d{3}[-.\s]?\d{4}$

This took me several iterations to get right: the optional separators, the optional country code, the optional parentheses around the area code. And I haven't accounted for extensions, international formats, or the fact that some area codes start with 0 or 1 (which are invalid in the North American N
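
One way to check a hand-written pattern like the one above is to run it against the example formats directly. The following is a minimal sketch in Python, not code from the article: the pattern is copied verbatim from the text, while the names PHONE_PATTERN, SHOULD_MATCH, KNOWN_GAPS, and is_phone are hypothetical and chosen only for illustration.

```python
import re

# Pattern copied verbatim from the article text.
PHONE_PATTERN = re.compile(r"^(\+1[-.\s]?)?\(?\d{3}\)?[-.\s]?\d{3}[-.\s]?\d{4}$")

# Formats the article lists as ones the pattern should handle.
SHOULD_MATCH = [
    "555-123-4567",
    "(555) 123-4567",
    "555.123.4567",
    "555 123 4567",
    "+1-555-123-4567",
    "5551234567",
]

# Inputs illustrating the gaps the article mentions (illustrative values,
# not taken from the article). Value = whether the pattern accepts the input.
KNOWN_GAPS = {
    "555-123-4567 x89": False,  # extension: rejected outright
    "055-123-4567": True,       # area code starting with 0: accepted anyway
    "155-123-4567": True,       # area code starting with 1: accepted anyway
}


def is_phone(text: str) -> bool:
    """Return True if the whole string matches the article's pattern."""
    return PHONE_PATTERN.match(text) is not None


if __name__ == "__main__":
    for sample in SHOULD_MATCH:
        print(f"{sample!r}: {is_phone(sample)}")
    for sample, expected in KNOWN_GAPS.items():
        print(f"{sample!r}: {is_phone(sample)} (expected {expected})")
```

Keeping the examples in a list next to the pattern means every later tweak to the regex can be re-checked against the same inputs, which is the same "examples first" idea the title points at.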
Continue reading on Dev.to Webdev




