readable-regex

A fluent, chainable Python API for building regular expressions that read like English.

$ pip install readable-regex

Items — what you match

Singular            Plural (1+)     Regex
digit               digits          \d / \d+
word                words           \w / \w+
letter              letters         [a-zA-Z] / [a-zA-Z]+
whitespace          whitespaces     \s / \s+
any_char            any_chars       . / .+
then('text')                        escaped literal
any_of('a', 'b')                    [ab]
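
To make the Regex column concrete, here is a minimal sketch using only the standard `re` module. It does not call readable-regex; it runs the raw patterns from the table to show how a singular item differs from its plural.

```python
import re

text = "room 42, floor 7"

# Singular item: one character per match.
single_digits = re.findall(r"\d", text)
# Plural item: the + quantifier groups consecutive characters into runs.
digit_runs = re.findall(r"\d+", text)
# any_of('a', 'b') corresponds to the character class [ab].
ab_chars = re.findall(r"[ab]", "cabbage")

print(single_digits)  # ['4', '2', '7']
print(digit_runs)     # ['42', '7']
print(ab_chars)       # ['a', 'b', 'b', 'a']
```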

Modifiers — how you constrain

Modifier                  Effect
exactly(3)                \d{3}
between(1, 3)             \d{1,3}
optional                  \d?
zero_or_more              \d*
starts_with / ends_with   ^ / $
ignore_case               case-insensitive
exclude.digits            \D+
capture(builder)          (...)
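
The Effect column can be checked against the standard `re` module. The sketch below is only an illustration of the raw quantifiers; the helper `full` is a hypothetical name introduced here, not part of readable-regex.

```python
import re

def full(pattern: str, text: str) -> bool:
    """Return True when the entire text matches the pattern."""
    return re.fullmatch(pattern, text) is not None

exactly_three = (full(r"\d{3}", "123"), full(r"\d{3}", "1234"))     # exactly(3)
one_to_three = (full(r"\d{1,3}", "12"), full(r"\d{1,3}", "1234"))   # between(1, 3)
optional_digit = (full(r"x\d?", "x"), full(r"x\d?", "x5"))          # optional
any_digits = (full(r"x\d*", "x"), full(r"x\d*", "x555"))            # zero_or_more
non_digits = re.findall(r"\D+", "a1bc2")                            # exclude.digits

print(exactly_three, one_to_three, optional_digit, any_digits, non_digits)
```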

Raw regex vs readable-regex

Raw regex
import re
re.compile(r'\w+@\w+\.\w+')
re.compile(r'\d{3}-\d{3}-\d{4}')
re.compile(r'\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}')
re.compile(r'(\w+)=(.+)')
readable-regex
regex.words.then('@').words.then('.').words
regex.digit.exactly(3).then('-').digit.exactly(3).then('-').digit.exactly(4)
regex.digit.between(1,3).then('.').digit.between(1,3).then('.')…
regex.capture(regex.words).then('=').capture(regex.any_chars)

Email pattern

Plurals keep common patterns concise: words already means "one or more word characters".

Python
from readable_regex import regex

email = regex.words.then('@').words.then('.').words

email.test("user@example.com")   # True
email.test("bad@@address")      # False
Pattern
\w+@\w+\.\w+
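
This page does not say whether test() searches within the string or requires the whole string to match. The sketch below uses the standard `re` module on the generated pattern to show the difference between the two, and one limit of `\w` worth knowing about.

```python
import re

EMAIL = re.compile(r"\w+@\w+\.\w+")
sentence = "contact: user@example.com today"

# search() finds the pattern anywhere in the string.
found_inside = EMAIL.search(sentence) is not None
# fullmatch() requires the entire string to match.
whole_string = EMAIL.fullmatch(sentence) is not None
# \w excludes '.', so a dotted local part is not matched in full.
dotted_local = EMAIL.fullmatch("first.last@example.com") is not None

print(found_inside, whole_string, dotted_local)  # True False False
```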

Phone number

Use digit (singular) with exactly(n) for fixed-width fields.

Python
phone = (
    regex
    .digit.exactly(3).then('-')
    .digit.exactly(3).then('-')
    .digit.exactly(4)
)

phone.test("123-456-7890")  # True
phone.test("12-34-5678")    # False
Pattern
\d{3}\-\d{3}\-\d{4}
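
The generated pattern contains `\-` where a hand-written one would use a plain `-`. One plausible explanation, assumed here rather than confirmed by this page, is that literals pass through something like `re.escape`. The sketch shows what `re.escape` does on Python 3.7+ and that both spellings match the same strings.

```python
import re

# re.escape keeps escaping '-' and '.', but leaves '@' alone on Python 3.7+.
escaped_dash = re.escape("-")
escaped_dot = re.escape(".")
escaped_at = re.escape("@")

hand_written = re.compile(r"\d{3}-\d{3}-\d{4}")
generated = re.compile(r"\d{3}\-\d{3}\-\d{4}")

samples = ["123-456-7890", "12-34-5678", "1234567890"]
same = all(
    bool(hand_written.fullmatch(s)) == bool(generated.fullmatch(s))
    for s in samples
)

print(escaped_dash, escaped_dot, escaped_at, same)
```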

IP address

Use between(min, max) for variable-width segments.

Python
ip = (
    regex
    .digit.between(1, 3).then('.')
    .digit.between(1, 3).then('.')
    .digit.between(1, 3).then('.')
    .digit.between(1, 3)
)

ip.test("192.168.1.1")   # True
ip.test("not.an.ip")     # False
Pattern
\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}
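
Note that `\d{1,3}` checks only the shape of each segment, so a string such as 999.1.1.1 also fits the pattern. A minimal sketch of adding a range check on top, using the standard `re` module (the function name `is_valid_ipv4` is hypothetical):

```python
import re

IP_SHAPE = re.compile(r"\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}")

def is_valid_ipv4(text: str) -> bool:
    """Check the dotted shape first, then that each segment is 0-255."""
    if IP_SHAPE.fullmatch(text) is None:
        return False
    return all(0 <= int(part) <= 255 for part in text.split("."))

print(is_valid_ipv4("192.168.1.1"))  # True
print(is_valid_ipv4("999.1.1.1"))    # False
print(is_valid_ipv4("not.an.ip"))    # False
```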

Extract all numbers

Terminal methods like find_all() execute the pattern directly.

Python
text = "Order #42 has 3 items totaling $129"
regex.digits.find_all(text)
Result
['42', '3', '129']
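
For comparison, the same extraction with the standard `re` module. The matches come back as strings, so a conversion step is needed before doing arithmetic on them.

```python
import re

text = "Order #42 has 3 items totaling $129"

numbers = re.findall(r"\d+", text)   # list of digit runs, as strings
as_ints = [int(n) for n in numbers]  # convert before doing arithmetic
total = sum(as_ints)

print(numbers)  # ['42', '3', '129']
print(total)    # 174
```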

Capturing groups

Wrap a sub-builder in capture() to create a capturing group.

Python
kv = regex.capture(regex.words).then('=').capture(regex.any_chars)

m = kv.search("color=blue")
m.group(1)  # 'color'
m.group(2)  # 'blue'
Pattern
(\w+)=(.+)
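
The generated pattern behaves the same way under the standard `re` module. The sketch below also shows that `.+` is greedy, so everything after the first `=` lands in the second group.

```python
import re

KV = re.compile(r"(\w+)=(.+)")

key, value = KV.search("color=blue").groups()
# The second group is greedy: it keeps any later '=' characters.
nested_key, nested_value = KV.search("a=b=c").groups()

print(key, value)                # color blue
print(nested_key, nested_value)  # a b=c
```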

Case-insensitive & flags

Flags are properties that chain naturally at the end.

Python
greeting = regex.starts_with('hello').ignore_case

greeting.test("HELLO world")  # True
greeting.test("Hello World")  # True
greeting.test("hey there")    # False
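
A rough standard-library equivalent, assuming ignore_case corresponds to the `re.IGNORECASE` flag and starts_with('hello') to the pattern `^hello`:

```python
import re

GREETING = re.compile(r"^hello", re.IGNORECASE)

upper = GREETING.search("HELLO world") is not None
other = GREETING.search("hey there") is not None
# The ^ anchor still applies: 'hello' later in the string does not count.
later = GREETING.search("say hello") is not None

print(upper, other, later)  # True False False
```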

Search & replace

Use replace(text, repl) as a terminal method.

Python
text = "My SSN is 123-45-6789 and PIN is 9876"
regex.digits.replace(text, "***")
Result
'My SSN is ***-***-*** and PIN is ***'
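
The same replacement with the standard `re.sub`, plus a variant that masks one character at a time so the original lengths stay visible. The variant is an illustration only, not a readable-regex feature described on this page.

```python
import re

text = "My SSN is 123-45-6789 and PIN is 9876"

# Replace each run of digits with a fixed mask.
run_masked = re.sub(r"\d+", "***", text)
# Replace each individual digit, preserving the length of every run.
digit_masked = re.sub(r"\d", "*", text)

print(run_masked)    # My SSN is ***-***-*** and PIN is ***
print(digit_masked)  # My SSN is ***-**-**** and PIN is ****
```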

Splitting text

Split on a comma followed by optional whitespace.

Python
regex.then(',').whitespace.zero_or_more.split("apple, banana,cherry, date")
Result
['apple', 'banana', 'cherry', 'date']
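
The chain above should correspond to the pattern `,\s*` (an inference from the tables, not stated on this page). With the standard library, the difference from a plain string split looks like this:

```python
import re

text = "apple, banana,cherry, date"

# Splitting on ',' alone leaves the leading spaces in place.
plain = text.split(",")
# Splitting on a comma plus optional whitespace removes them.
clean = re.split(r",\s*", text)

print(plain)  # ['apple', ' banana', 'cherry', ' date']
print(clean)  # ['apple', 'banana', 'cherry', 'date']
```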

Negated classes with exclude

Use exclude as a property to access negated character classes.

Python
# Match runs of non-digit characters
regex.exclude.digits.find_all("a1b2c3")

# Words without underscores
regex.words.excluding('_')
Results
['a', 'b', 'c']
Pattern: [^\W_]+
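
The pattern `[^\W_]+` reads as "not a non-word character and not an underscore", which leaves letters and digits. A quick check with the standard `re` module:

```python
import re

name = "snake_case_name"

with_underscores = re.findall(r"\w+", name)         # \w includes '_'
without_underscores = re.findall(r"[^\W_]+", name)  # '_' now acts as a separator

print(with_underscores)     # ['snake_case_name']
print(without_underscores)  # ['snake', 'case', 'name']
```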

Immutable builder — safe reuse

Every method returns a new builder. Branch freely from any saved pattern.

Python
base = regex.starts_with('LOG-')

errors = base.then('ERROR').any_chars
warns  = base.then('WARN').any_chars

errors.test("LOG-ERROR disk full")   # True
warns.test("LOG-WARN low memory")   # True
errors.test("LOG-INFO started")    # False

# base is unchanged:
base.pattern  # '^LOG\\-'
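
One way such an immutable builder could be structured is sketched below. `TinyBuilder` is a hypothetical toy class written for this illustration; it is not readable-regex's actual implementation, and using `re.escape` for literals is an assumption.

```python
import re
from dataclasses import dataclass

@dataclass(frozen=True)
class TinyBuilder:
    """Hypothetical toy builder: every method returns a new instance."""
    pattern: str = ""

    def starts_with(self, text: str) -> "TinyBuilder":
        return TinyBuilder(self.pattern + "^" + re.escape(text))

    def then(self, text: str) -> "TinyBuilder":
        return TinyBuilder(self.pattern + re.escape(text))

    @property
    def any_chars(self) -> "TinyBuilder":
        return TinyBuilder(self.pattern + ".+")

    def test(self, text: str) -> bool:
        return re.search(self.pattern, text) is not None

base = TinyBuilder().starts_with("LOG-")
errors = base.then("ERROR").any_chars
warns = base.then("WARN").any_chars

print(base.pattern)    # ^LOG\-
print(errors.pattern)  # ^LOG\-ERROR.+
```

Because the dataclass is frozen and no method mutates `self`, `base` cannot be changed by building `errors` or `warns` from it.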

Debug with .pattern

Inspect the raw regex at any point in the chain.

Python
p = regex.starts_with().words.then('@').words.ends_with()

p.pattern    # '^\\w+@\\w+$'
p.compile()  # re.Pattern object (cached)
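
How the library caches compiled patterns is not described here. As a minimal sketch of the general idea, the hypothetical helper `compile_cached` below memoizes `re.compile` with `functools.lru_cache`, so repeated requests return the same object. It also shows that the anchored pattern above rejects addresses containing a dot.

```python
import re
from functools import lru_cache

@lru_cache(maxsize=None)
def compile_cached(pattern: str, flags: int = 0) -> "re.Pattern[str]":
    """Hypothetical helper: compile once, then reuse the same object."""
    return re.compile(pattern, flags)

first = compile_cached(r"^\w+@\w+$")
second = compile_cached(r"^\w+@\w+$")
same_object = first is second

# With both anchors and no '.', only 'name@host' style strings match.
simple = first.search("user@host") is not None
dotted = first.search("user@example.com") is not None

print(same_object, simple, dotted)  # True True False
```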