readable-regex

A fluent, chainable Python API for building regular expressions that read like English.

$ pip install readable-regex

Items — what you match

Singular	Plural (1+)	Regex
`digit`	`digits`	`\d` / `\d+`
`word`	`words`	`\w` / `\w+`
`letter`	`letters`	`[a-zA-Z]` / `[a-zA-Z]+`
`whitespace`	`whitespaces`	`\s` / `\s+`
`any_char`	`any_chars`	`.` / `.+`
`then('text')`		escaped literal
`any_of('a', 'b')`		`[ab]`

Modifiers — how you constrain

Modifier	Effect (examples applied to `digit`)
`exactly(3)`	`\d{3}`
`between(1, 3)`	`\d{1,3}`
`optional`	`\d?`
`zero_or_more`	`\d*`
`starts_with` / `ends_with`	`^` / `$`
`ignore_case`	case-insensitive
`exclude.digits`	`\D+`
`capture(builder)`	`(...)`

Raw regex vs readable-regex

Raw regex

import re
re.compile(r'\w+@\w+\.\w+')
re.compile(r'\d{3}-\d{3}-\d{4}')
re.compile(r'\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}')
re.compile(r'(\w+)=(.+)')

readable-regex

regex.words.then('@').words.then('.').words
regex.digit.exactly(3).then('-').digit.exactly(3).then('-').digit.exactly(4)
regex.digit.between(1,3).then('.').digit.between(1,3).then('.')…
regex.capture(regex.words).then('=').capture(regex.any_chars)

Email pattern

Plural items keep common patterns concise: words already means "one or more word characters".

Python

from readable_regex import regex

email = regex.words.then('@').words.then('.').words

email.test("user@example.com")   # True
email.test("bad@@address")      # False

Pattern

\w+@\w+\.\w+
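
The generated pattern is intentionally loose. A quick check with the standard re module shows what it accepts. This sketch assumes test() behaves like re.search, which the examples above suggest but do not confirm; looks_like_email is a hypothetical helper, not part of readable-regex.

```python
import re

# Pattern copied from the section above.
EMAIL = re.compile(r'\w+@\w+\.\w+')

def looks_like_email(text: str) -> bool:
    """Return True if the pattern is found anywhere in text (search semantics)."""
    return EMAIL.search(text) is not None

print(looks_like_email("user@example.com"))  # True
print(looks_like_email("bad@@address"))      # False

# \w does not cover '.' or '-', so only the tail of a dotted local part matches.
partial = EMAIL.search("first.last@example.com")
print(partial.group(0))  # last@example.com
```

For real address validation a stricter pattern or a dedicated parser is usually a better fit.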

Phone number

Use digit (singular) with exactly(n) for fixed-width fields.

Python

phone = (
    regex
    .digit.exactly(3).then('-')
    .digit.exactly(3).then('-')
    .digit.exactly(4)
)

phone.test("123-456-7890")  # True
phone.test("12-34-5678")    # False

Pattern

\d{3}\-\d{3}\-\d{4}
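
The generated pattern escapes the hyphen (`\-`) while the hand-written version uses a bare `-`. Outside a character class the two forms match the same strings. The standard library's re.escape produces the same escaping; whether readable-regex calls it internally is an assumption.

```python
import re

generated = re.compile(r'\d{3}\-\d{3}\-\d{4}')    # pattern shown above
handwritten = re.compile(r'\d{3}-\d{3}-\d{4}')    # raw-regex equivalent

samples = ["123-456-7890", "12-34-5678", "call 555-010-9999 now"]

# Compare both patterns on every sample using search semantics.
results = [(bool(generated.search(s)), bool(handwritten.search(s))) for s in samples]
print(results)  # [(True, True), (False, False), (True, True)]

# re.escape adds a backslash in front of '-'.
escaped_hyphen = re.escape('-')
print(escaped_hyphen)  # \-
```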

IP address

Use between(min, max) for variable-width segments.

Python

ip = (
    regex
    .digit.between(1, 3).then('.')
    .digit.between(1, 3).then('.')
    .digit.between(1, 3).then('.')
    .digit.between(1, 3)
)

ip.test("192.168.1.1")   # True
ip.test("not.an.ip")     # False

Pattern

\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}
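
Note that `\d{1,3}` only limits the width of each segment, so a string such as 999.999.999.999 also matches. A minimal sketch of a stricter check using only the standard library follows; is_ipv4 is a hypothetical helper, not part of readable-regex.

```python
import re

# Shape-only pattern copied from the section above.
IP_SHAPE = re.compile(r'\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}')

def is_ipv4(text: str) -> bool:
    """Check the dotted shape with the regex, then range-check each octet."""
    if IP_SHAPE.fullmatch(text) is None:
        return False
    return all(0 <= int(part) <= 255 for part in text.split('.'))

print(is_ipv4("192.168.1.1"))      # True
print(is_ipv4("999.999.999.999"))  # False
print(is_ipv4("not.an.ip"))        # False
```

The standard ipaddress module is another option when full validation is needed.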

Extract all numbers

Terminal methods such as find_all() run the pattern against a string and return the result directly.

Python

text = "Order #42 has 3 items totaling $129"
regex.digits.find_all(text)

Result

['42', '3', '129']
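
For comparison, here is the same extraction with the standard re module, followed by a conversion to integers. It assumes find_all() mirrors re.findall, which the result above suggests but the source does not state.

```python
import re

text = "Order #42 has 3 items totaling $129"

# \d+ is the pattern the table lists for `digits`.
matches = re.findall(r'\d+', text)
numbers = [int(m) for m in matches]

print(matches)       # ['42', '3', '129']
print(sum(numbers))  # 174
```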

Capturing groups

Wrap a sub-builder in capture() to create a capturing group.

Python

kv = regex.capture(regex.words).then('=').capture(regex.any_chars)

m = kv.search("color=blue")
m.group(1)  # 'color'
m.group(2)  # 'blue'

Pattern

(\w+)=(.+)
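
Because `.+` is greedy, the second group takes everything after the first `=`, including any later `=` signs. A small sketch with the standard re module illustrates this; parse_pair is a hypothetical helper, and the sketch assumes search() returns a standard match object as the example above implies.

```python
import re

# Pattern copied from the section above.
KV = re.compile(r'(\w+)=(.+)')

def parse_pair(line: str):
    """Return (key, value) for the first key=value found, or None."""
    m = KV.search(line)
    return (m.group(1), m.group(2)) if m else None

print(parse_pair("color=blue"))    # ('color', 'blue')
print(parse_pair("path=a=b"))      # ('path', 'a=b')
print(parse_pair("no separator"))  # None
```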

Case-insensitive & flags

Flags such as ignore_case are properties (no parentheses) and chain at the end of a pattern.

Python

greeting = regex.starts_with('hello').ignore_case

greeting.test("HELLO world")  # True
greeting.test("Hello World")  # True
greeting.test("hey there")    # False
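
A rough standard-library equivalent is sketched below. It assumes starts_with('hello') emits `^hello` and that ignore_case maps to re.IGNORECASE; neither is shown explicitly in the source.

```python
import re

# Assumed equivalent of regex.starts_with('hello').ignore_case
GREETING = re.compile(r'^hello', re.IGNORECASE)

def greets(text: str) -> bool:
    """True if text begins with 'hello' in any letter case."""
    return GREETING.search(text) is not None

print(greets("HELLO world"))     # True
print(greets("Hello World"))     # True
print(greets("hey there"))       # False
print(greets("oh, hello there")) # False, because ^ anchors to the start
```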

Search & replace

Use replace(text, repl) as a terminal method.

Python

text = "My SSN is 123-45-6789 and PIN is 9876"
regex.digits.replace(text, "***")

Result

'My SSN is ***-***-*** and PIN is ***'
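
The same masking can be done with re.sub from the standard library. The variant below also preserves the length of each run by passing a function as the replacement. Whether readable-regex's replace() accepts a callable is not stated in the source, so this sketch uses re directly.

```python
import re

text = "My SSN is 123-45-6789 and PIN is 9876"

# Fixed replacement: every run of digits becomes three asterisks.
fixed = re.sub(r'\d+', '***', text)
print(fixed)  # My SSN is ***-***-*** and PIN is ***

# Length-preserving replacement: one asterisk per digit.
same_length = re.sub(r'\d+', lambda m: '*' * len(m.group()), text)
print(same_length)  # My SSN is ***-**-**** and PIN is ****
```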

Splitting text

Split on a comma followed by optional whitespace.

Python

regex.then(',').whitespace.zero_or_more.split("apple, banana,cherry, date")

Result

['apple', 'banana', 'cherry', 'date']
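
The chain above should correspond to the pattern `,\s*` (an inference from the tables, not a pattern shown in the source). A sketch with re.split shows the same result and one edge case worth knowing: a trailing separator yields an empty final element.

```python
import re

# Assumed equivalent of regex.then(',').whitespace.zero_or_more
SEPARATOR = re.compile(r',\s*')

items = SEPARATOR.split("apple, banana,cherry, date")
print(items)  # ['apple', 'banana', 'cherry', 'date']

# A trailing comma produces an empty string at the end of the list.
trailing = SEPARATOR.split("a, b,")
print(trailing)  # ['a', 'b', '']
```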

Negated classes with exclude

Use exclude as a property to access negated character classes; excluding(chars) removes specific characters from a class.

Python

# Match runs of non-digit characters
regex.exclude.digits.find_all("a1b2c3")

# Words without underscores
regex.words.excluding('_')

Results

exclude.digits: ['a', 'b', 'c']
words.excluding('_'): pattern [^\W_]+
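
Both negated forms can be checked against the standard re module. The patterns are the ones listed in this document; the sample string snake_case_42 is made up for illustration.

```python
import re

# \D+ : runs of non-digit characters
non_digits = re.findall(r'\D+', "a1b2c3")
print(non_digits)  # ['a', 'b', 'c']

# [^\W_]+ : word characters minus the underscore, i.e. letters and digits
no_underscore = re.findall(r'[^\W_]+', "snake_case_42")
print(no_underscore)  # ['snake', 'case', '42']
```

The double negative in `[^\W_]` reads as "not a non-word character and not an underscore".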

Immutable builder — safe reuse

Every method returns a new builder. Branch freely from any saved pattern.

Python

base = regex.starts_with('LOG-')

errors = base.then('ERROR').any_chars
warns  = base.then('WARN').any_chars

errors.test("LOG-ERROR disk full")   # True
warns.test("LOG-WARN low memory")   # True
errors.test("LOG-INFO started")    # False

# base is unchanged:
base.pattern  # '^LOG\\-'
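
To illustrate why branching is safe, here is a minimal sketch of an immutable fluent builder. MiniBuilder is hypothetical and is not the library's implementation; it only assumes that each step appends to a pattern string and returns a fresh object, and that literals are escaped with re.escape.

```python
import re
from dataclasses import dataclass

@dataclass(frozen=True)
class MiniBuilder:
    """Hypothetical stand-in for the real builder: wraps one pattern string."""
    pattern: str = ""

    def starts_with(self, text: str = "") -> "MiniBuilder":
        # Return a new object instead of mutating self.
        return MiniBuilder(self.pattern + "^" + re.escape(text))

    def then(self, text: str) -> "MiniBuilder":
        return MiniBuilder(self.pattern + re.escape(text))

    @property
    def any_chars(self) -> "MiniBuilder":
        return MiniBuilder(self.pattern + ".+")

    def test(self, text: str) -> bool:
        return re.search(self.pattern, text) is not None

base = MiniBuilder().starts_with("LOG-")
errors = base.then("ERROR").any_chars
warns = base.then("WARN").any_chars

print(errors.test("LOG-ERROR disk full"))  # True
print(errors.test("LOG-INFO started"))     # False
print(repr(base.pattern))                  # '^LOG\\-'
```

Because the dataclass is frozen, an accidental assignment to base.pattern raises an error rather than silently changing every pattern derived from it.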

Debug with .pattern

Inspect the raw regex at any point in the chain.

Python

p = regex.starts_with().words.then('@').words.ends_with()

p.pattern    # '^\\w+@\\w+$'
p.compile()  # re.Pattern object (cached)