A fluent, chainable Python API for building regular expressions that read like English.

| Singular | Plural (1+) | Regex |
|---|---|---|
| digit | digits | \d / \d+ |
| word | words | \w / \w+ |
| letter | letters | [a-zA-Z] / [a-zA-Z]+ |
| whitespace | whitespaces | \s / \s+ |
| any_char | any_chars | . / .+ |
| then('text') | | escaped literal |
| any_of('a', 'b') | | [ab] |

| Modifier | Effect |
|---|---|
| exactly(3) | \d{3} |
| between(1, 3) | \d{1,3} |
| optional | \d? |
| zero_or_more | \d* |
| starts_with / ends_with | ^ / $ |
| ignore_case | case-insensitive |
| exclude.digits | \D+ |
| capture(builder) | (...) |

# Raw patterns with the standard re module
import re
re.compile(r'\w+@\w+\.\w+')
re.compile(r'\d{3}-\d{3}-\d{4}')
re.compile(r'\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}')
re.compile(r'(\w+)=(.+)')

# The same four patterns written with the builder
regex.words.then('@').words.then('.').words
regex.digit.exactly(3).then('-').digit.exactly(3).then('-').digit.exactly(4)
regex.digit.between(1,3).then('.').digit.between(1,3).then('.')…
regex.capture(regex.words).then('=').capture(regex.any_chars)
Plurals keep common patterns short: words already means "one or more word characters", so no explicit quantifier is needed.
from readable_regex import regex
email = regex.words.then('@').words.then('.').words
email.test("user@example.com") # True
email.test("bad@@address") # False
Use digit (singular) with exactly(n) for fixed-width fields.
phone = (
regex
.digit.exactly(3).then('-')
.digit.exactly(3).then('-')
.digit.exactly(4)
)
phone.test("123-456-7890") # True
phone.test("12-34-5678") # False
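
The fixed-width behavior can be checked against the standard library. The sketch below uses the raw pattern from the comparison above instead of the builder. Whether test() looks for a match anywhere in the string or requires the whole string to match is not stated here, so both behaviors are shown; the helper names are illustrative only.

```python
import re

# Raw equivalent of digit.exactly(3) - digit.exactly(3) - digit.exactly(4)
PHONE = re.compile(r"\d{3}-\d{3}-\d{4}")


def found_anywhere(text: str) -> bool:
    """True if the pattern occurs somewhere in text (re.search semantics)."""
    return PHONE.search(text) is not None


def matches_whole(text: str) -> bool:
    """True only if the entire string is a phone number (re.fullmatch semantics)."""
    return PHONE.fullmatch(text) is not None


print(found_anywhere("123-456-7890"))       # True
print(found_anywhere("12-34-5678"))         # False
print(found_anywhere("call 123-456-7890"))  # True
print(matches_whole("call 123-456-7890"))   # False
```

If whole-string validation is needed, anchoring the chain with starts_with() and ends_with() is the documented way to get the ^ and $ anchors.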
Use between(min, max) for variable-width segments.
ip = (
regex
.digit.between(1, 3).then('.')
.digit.between(1, 3).then('.')
.digit.between(1, 3).then('.')
.digit.between(1, 3)
)
ip.test("192.168.1.1") # True
ip.test("not.an.ip") # False
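
Note that between(1, 3) constrains how many digits appear, not their value, so a string such as 999.999.999.999 has the right shape. A minimal sketch of adding a range check, using the raw stdlib pattern and a hypothetical helper named is_ipv4 (not part of the library):

```python
import re

# Raw equivalent of four digit.between(1, 3) groups joined by literal dots
IP_SHAPE = re.compile(r"\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}")


def is_ipv4(text: str) -> bool:
    """Check the dotted shape with the regex, then the 0-255 range in Python."""
    if IP_SHAPE.fullmatch(text) is None:
        return False
    return all(int(part) <= 255 for part in text.split("."))


print(IP_SHAPE.fullmatch("999.999.999.999") is not None)  # True: shape only
print(is_ipv4("999.999.999.999"))  # False
print(is_ipv4("192.168.1.1"))      # True
print(is_ipv4("not.an.ip"))        # False
```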
Terminal methods like find_all() execute the pattern directly.
text = "Order #42 has 3 items totaling $129"
regex.digits.find_all(text)
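
For comparison, here is the same search with the standard library, assuming digits compiles to \d+ as listed in the table and that find_all() behaves like re.findall:

```python
import re

text = "Order #42 has 3 items totaling $129"

# \d+ grabs each maximal run of digits, left to right
numbers = re.findall(r"\d+", text)
print(numbers)  # ['42', '3', '129']
```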
Wrap a sub-builder in capture() to create a capturing group.
kv = regex.capture(regex.words).then('=').capture(regex.any_chars)
m = kv.search("color=blue")
m.group(1) # 'color'
m.group(2) # 'blue'
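
The raw pattern behind this chain is (\w+)=(.+), as shown in the comparison earlier. The stdlib sketch below also shows one consequence of using any_chars for the value: .+ is greedy, so everything after the first = lands in the second group.

```python
import re

# Raw equivalent of capture(words).then('=').capture(any_chars)
kv_raw = re.compile(r"(\w+)=(.+)")

m = kv_raw.search("color=blue")
print(m.group(1), m.group(2))  # color blue

# The value group keeps any later '=' characters
m2 = kv_raw.search("a=b=c")
print(m2.groups())  # ('a', 'b=c')
```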
Flags are properties that chain naturally at the end.
greeting = regex.starts_with('hello').ignore_case
greeting.test("HELLO world") # True
greeting.test("Hello World") # True
greeting.test("hey there") # False
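
In stdlib terms this presumably corresponds to compiling ^hello with re.IGNORECASE; the flag the library actually passes is an assumption here. A short sketch, including one extra input that shows the ^ anchor at work:

```python
import re

# Assumed raw equivalent of starts_with('hello').ignore_case
GREETING = re.compile(r"^hello", re.IGNORECASE)

samples = ["HELLO world", "Hello World", "hey there", "say hello"]
results = {s: GREETING.search(s) is not None for s in samples}
print(results)
# {'HELLO world': True, 'Hello World': True, 'hey there': False, 'say hello': False}
```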
Use replace(text, repl) as a terminal method.
text = "My SSN is 123-45-6789 and PIN is 9876"
regex.digits.replace(text, "***")
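
The stdlib counterpart is re.sub. Assuming replace() substitutes every match, as re.sub does by default, each run of digits is masked separately and the hyphens survive:

```python
import re

text = "My SSN is 123-45-6789 and PIN is 9876"

# Each maximal run of digits becomes "***"; separators are left untouched
masked = re.sub(r"\d+", "***", text)
print(masked)  # My SSN is ***-***-*** and PIN is ***
```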
Split on a comma followed by optional whitespace.
regex.then(',').whitespace.zero_or_more.split("apple, banana,cherry, date")
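
Per the tables, then(',') followed by whitespace.zero_or_more should yield ,\s* as the separator. A stdlib sketch of the same split, assuming split() behaves like re.split:

```python
import re

# A comma followed by any amount of whitespace (including none)
parts = re.split(r",\s*", "apple, banana,cherry, date")
print(parts)  # ['apple', 'banana', 'cherry', 'date']
```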
Use exclude as a property to access negated character classes, and excluding(chars) to drop specific characters from a class.
# Match runs of non-digit characters
regex.exclude.digits.find_all("a1b2c3")
# Words without underscores
regex.words.excluding('_')
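
In raw regex terms, exclude.digits is \D+ according to the modifier table. The pattern that excluding('_') emits is not shown here; [^\W_]+ is one common way to write "word characters except underscore" and is used below purely as an assumption.

```python
import re

# Runs of non-digit characters (exclude.digits, per the table)
non_digits = re.findall(r"\D+", "a1b2c3")
print(non_digits)  # ['a', 'b', 'c']

# Word characters minus the underscore: "not (non-word or underscore)"
no_underscore = re.findall(r"[^\W_]+", "snake_case_name")
print(no_underscore)  # ['snake', 'case', 'name']
```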
Every method returns a new builder. Branch freely from any saved pattern.
base = regex.starts_with('LOG-')
errors = base.then('ERROR').any_chars
warns = base.then('WARN').any_chars
errors.test("LOG-ERROR disk full") # True
warns.test("LOG-WARN low memory") # True
errors.test("LOG-INFO started") # False
# base is unchanged:
base.pattern # '^LOG\\-'
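
To illustrate why branching is safe, here is a minimal sketch of an immutable builder. MiniBuilder is hypothetical and is not the readable_regex implementation; it only shows the general technique of returning a new object from every step so that saved patterns never change.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class MiniBuilder:
    """Hypothetical immutable builder: every step returns a new instance."""

    pattern: str = ""

    def starts_with(self, text: str = "") -> "MiniBuilder":
        # Anchor at the start, then append the escaped literal
        return MiniBuilder("^" + self.pattern + re.escape(text))

    def then(self, text: str) -> "MiniBuilder":
        return MiniBuilder(self.pattern + re.escape(text))

    @property
    def any_chars(self) -> "MiniBuilder":
        return MiniBuilder(self.pattern + ".+")

    def test(self, text: str) -> bool:
        return re.search(self.pattern, text) is not None


base = MiniBuilder().starts_with("LOG-")
errors = base.then("ERROR").any_chars
warns = base.then("WARN").any_chars

print(errors.test("LOG-ERROR disk full"))  # True
print(warns.test("LOG-WARN low memory"))   # True
print(errors.test("LOG-INFO started"))     # False
print(base.pattern == "^" + re.escape("LOG-"))  # True: base was not modified
```

Because the dataclass is frozen, accidental in-place mutation raises an error instead of silently changing every pattern derived from the same base.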
Inspect the raw regex at any point in the chain.
p = regex.starts_with().words.then('@').words.ends_with()
p.pattern # '^\\w+@\\w+$'
p.compile() # re.Pattern object (cached)
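
How the library caches compiled patterns is not described here. One simple way to get the same effect with the standard library is functools.lru_cache; compile_cached below is a hypothetical helper, not a library function.

```python
import re
from functools import lru_cache


@lru_cache(maxsize=None)
def compile_cached(pattern: str, flags: int = 0) -> "re.Pattern[str]":
    """Compile once per (pattern, flags) pair and reuse the result."""
    return re.compile(pattern, flags)


p = compile_cached(r"^\w+@\w+$")
print(p is compile_cached(r"^\w+@\w+$"))         # True: same object reused
print(p.search("user@example") is not None)      # True
print(p.search("user@example.com") is not None)  # False: '.' blocks the $ anchor
```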