
Fluent DSL for readable, composable regular expressions with type-safe builders, named captures, describe() output, reverse-engineering of raw regex, ReDoS analysis, zero match-time overhead.
A fluent Kotlin DSL that makes regular expressions readable.
Raw regular expressions are write-only. A week after authoring one, even the writer struggles to remember what it does:
// Raw regex — what does this match?
val emailRegex = Regex("[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\\.[a-zA-Z]{2,}")

With kexpresso the same constraint reads like English:
// kexpresso — self-documenting and composable
val emailPattern = kexpresso {
email()
}

Or, for a richer pattern that you build up incrementally:
val strictEmail = kexpresso {
startOfText()
email()
endOfText()
}
strictEmail.matches("barista@coffee.shop") // true
strictEmail.matches("not an email") // false

Two equivalent entry points, kexpresso { } (top-level function) and Kexpresso.pattern { } (object-oriented style), produce the same KexpressoPattern. See Object-oriented entry point below.
Benefits at a glance:
Regex at construction time
(measured: 0% match-time overhead vs raw Regex; see benchmarks).

Is kexpresso right for your case? We're honest about it: it's great for complex, maintained patterns and a poor fit for trivial ones. Read "When to use kexpresso, and when not to" before adopting. Where we're headed: the Roadmap.
Clone and run the guided-tour sample — no extra setup, no credentials:
git clone https://github.com/elzinko/kexpresso && cd kexpresso && ./gradlew :samples:run

The console output walks you through every headline feature: building patterns, domain helpers,
typed captures, describe(), reverse-engineering a raw regex with Kexpresso.from(), and ReDoS
analysis.
Once io.github.elzinko:kexpresso lands in your own project (see Install below), the same
capabilities are one dependency away.
Kexpresso is published to Maven Central — no repository configuration and no token required. Just add the dependency.
The groupId is now io.github.elzinko (it was com.github.elzinko on JitPack/GitHub Packages). Maven Central requires the io.github.* namespace.
// build.gradle.kts
dependencies {
implementation("io.github.elzinko:kexpresso:0.9.0")
}

mavenCentral() is in your repositories by default in most projects; add it if needed:
repositories {
mavenCentral()
}

<dependency>
<groupId>io.github.elzinko</groupId>
<artifactId>kexpresso</artifactId>
<version>0.9.0</version>
</dependency>

Maven Central is the recommended source. Two alternatives remain available:
- JitPack: com.github.elzinko:kexpresso:<tag>. Serves jvm, js, wasmJs, linuxX64, mingwX64 only (no Apple/iOS, because JitPack builds on Linux). Add maven { url = uri("https://jitpack.io") } to your repositories.
- GitHub Packages: io.github.elzinko:kexpresso:0.9.0. Requires a GitHub token (a GitHub limitation, even for public packages); see Where the artifacts are hosted.

Kexpresso is a Kotlin Multiplatform library. The full DSL is written in
commonMain, so the builder, describe(), analyze(), captures, and the reverse
(regex → DSL) API are available on every supported target:
| Target | Status |
|---|---|
| JVM | ✅ published |
| JS (IR, Node.js) | ✅ published |
| Wasm (wasmJs, Node.js) | ✅ published |
| Native: linuxX64, mingwX64 | ✅ published |
| Native: macosX64, macosArm64 | ✅ published |
| Native: iosArm64, iosX64, iosSimulatorArm64 | ✅ published |

Built per host; published from macOS. Kotlin/Native targets only cross-compile from a capable host, so the build registers them conditionally: the Linux CI gate builds linuxX64 + mingwX64 (fast), while macOS, the most capable host, builds every target. The release therefore runs on a macos-latest runner and publishes the complete, consistent multiplatform metadata (JVM, JS, Wasm, Linux, Windows, macOS, iOS) from that single host, so a consumer resolving the root module sees every variant. A dedicated Apple & Native workflow exercises the Apple/iOS targets on every PR.

Building the Apple targets locally requires a full Xcode install; a Command-Line-Tools-only macOS box still builds jvm/js/wasmJs and skips the Apple/Native targets with a warning.
For a Gradle Multiplatform consumer, the dependency resolves automatically per target via Gradle module metadata:
kotlin {
sourceSets {
commonMain.dependencies {
implementation("io.github.elzinko:kexpresso:0.9.0")
}
}
}

A plain-Maven (JVM-only) consumer must use the target-suffixed coordinate instead:
<dependency>
<groupId>io.github.elzinko</groupId>
<artifactId>kexpresso-jvm</artifactId>
<version>0.9.0</version>
</dependency>

Breaking change (since the multiplatform release): artifact coordinates now carry a target suffix. Gradle resolves io.github.elzinko:kexpresso:0.9.0 to the right target automatically through Gradle metadata, but tools that ignore Gradle metadata (e.g. plain Maven) must reference kexpresso-jvm directly.
Not every target is available from every repository — choose your source accordingly:
| Repository | Targets served | Auth |
|---|---|---|
| Maven Central (the Install section) | all targets, incl. macosX64/macosArm64/iosArm64/iosX64/iosSimulatorArm64 | none |
| GitHub Packages | all targets, incl. macosX64/macosArm64/iosArm64/iosX64/iosSimulatorArm64 | a GitHub token |
| JitPack | jvm, js, wasmJs, linuxX64, mingwX64 | none |

Maven Central is the recommended source for every target (including Apple/iOS) with no authentication: the release runs on macOS and publishes the complete, signed multiplatform metadata. GitHub Packages serves the same full set but requires a token. JitPack builds on demand on Linux, so it can never produce the Apple/iOS artifacts.
To consume from GitHub Packages instead (a personal-access token with read:packages is
required even for public packages — a GitHub limitation):
// settings.gradle.kts
dependencyResolutionManagement {
repositories {
maven("https://maven.pkg.github.com/elzinko/kexpresso") {
credentials {
username = providers.gradleProperty("gpr.user").orNull ?: System.getenv("GITHUB_ACTOR")
password = providers.gradleProperty("gpr.key").orNull ?: System.getenv("GITHUB_TOKEN")
}
}
}
}

The DSL builds the same regex string on every platform, but each non-JVM target uses its
own regex engine rather than the JVM's java.util.regex (PCRE-like) engine, so the supported
feature set narrows the further you get from the JVM. The portable common API — primitives,
quantifiers, character classes, alternation, simple/named groups, named & numeric
backreferences, lookahead, \b, literal escaping, describe(), toKexpressoCode(), captures,
and analyze() — works everywhere. Some JVM-flavoured constructs remain JVM-only at
runtime: they build fine but throw when compiled to a Regex on the smaller engines.
- JS: startOfText() / endOfText() (the \A / \z anchors) are not valid ECMAScript; use startOfLine() / endOfLine() (^ / $) for portable code. Atomic groups (?>…), possessive quantifiers (a++, a*+), and some lookbehind forms are also JVM-only and only ever appear via raw(...) or Kexpresso.from(...).
- Wasm (wasmJs): runs on the same ECMAScript engine via the host; same caveats as JS.
- Native (kotlin.text.Regex): ships a capable pure-Kotlin engine that is actually a superset of ECMAScript here: it accepts the \A / \z / \Z / \G anchors, named groups, named/numeric backreferences, lookahead, lookbehind, and atomic groups. Even so, treat the JVM as the reference engine; exotic PCRE-only constructs reachable via raw(...) may still differ.

The whole commonTest portable suite (31 tests) passes identically on JVM, JS, Wasm, and the
built native targets. JVM-only constructs are exercised in the JVM-only jvmTest suite.
Literal escaping is portable: literal("a.b") renders as a\.b (a per-character
escaper) rather than the JVM-only \Qa.b\E; matching behaviour is identical everywhere.
toPattern() (conversion to java.util.regex.Pattern) is a JVM-only extension and is
not available on JS, Wasm, or Native.
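The per-character escaping described above can be sketched with the standard library alone. This is a minimal illustration, not kexpresso's actual implementation: the name escapeLiteral and the exact metacharacter set are assumptions made for the example.

```kotlin
// Hypothetical helper, not part of kexpresso: prefixes every regex
// metacharacter with a backslash instead of wrapping the text in \Q...\E.
val METACHARACTERS = "\\^\$.|?*+()[]{}"

fun escapeLiteral(text: String): String = buildString {
    for (c in text) {
        if (c in METACHARACTERS) append('\\')
        append(c)
    }
}

fun main() {
    val escaped = escapeLiteral("a.b")
    println(escaped)                        // a\.b
    println(Regex(escaped).matches("a.b"))  // true
    println(Regex(escaped).matches("axb"))  // false
}
```

Because the output contains only backslash-escaped syntax characters, the same string should be acceptable to engines that do not understand \Q...\E.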
val drinkName = kexpresso {
uppercaseLetter()
oneOrMore { letter() }
}
drinkName.matches("Espresso") // true
drinkName.matches("espresso") // false (no capital first letter)
drinkName.matches("Espresso42") // false (digit at the end)

val wordPattern = kexpresso { word() }
val order = "Espresso Latte Cappuccino"
val drinks = wordPattern.findAll(order).map { it.value }.toList()
// ["Espresso", "Latte", "Cappuccino"]

val emailValidator = kexpresso {
startOfText()
email()
endOfText()
}
emailValidator.matches("barista@coffee.shop") // true
emailValidator.matches("barista@coffee.shop extra") // false
emailValidator.matches("not-an-email") // false

val sentencePattern = kexpresso { sentence() }
sentencePattern.matches("Espresso is perfect!") // true
sentencePattern.matches("espresso is lowercase.") // false
sentencePattern.matches("No punctuation at the end") // false

| Method | Regex produced | Notes |
|---|---|---|
| `literal(text)` | escaped text (e.g. `a\.b`) | Escapes each regex metacharacter |
| `char(c)` | escaped char | Escapes metacharacters |
| `digit()` | `\d` | Decimal digit 0–9 |
| `nonDigit()` | `\D` | Any non-digit |
| `whitespace()` | `\s` | Space, tab, newline, … |
| `nonWhitespace()` | `\S` | Any non-whitespace |
| `wordChar()` | `\w` | Letter, digit, or `_` |
| `nonWordChar()` | `\W` | Not a word character |
| `anyChar()` | `.` | Any character except newline |
| `letter()` | `[a-zA-Z]` | ASCII letters only |
| `uppercaseLetter()` | `[A-Z]` | ASCII uppercase letters |
| `lowercaseLetter()` | `[a-z]` | ASCII lowercase letters |
| `alphanumeric()` | `[a-zA-Z0-9]` | ASCII letter or digit |
| `tab()` | `\t` | Horizontal tab |
| `newline()` | `\n` | Newline |
| `carriageReturn()` | `\r` | Carriage return |
| `nonWordBoundary()` | `\B` | Non-word boundary position |
| `endPunctuation()` | `[.!?]` | Sentence-ending punctuation |

| Method | Regex produced | Notes |
|---|---|---|
| `anyOf(chars)` | `[chars]` | One character from the given set; metacharacters escaped |
| `noneOf(chars)` | `[^chars]` | One character NOT in the given set |
| `inRange(from, to)` | `[from-to]` | One character in the inclusive range |

| Method | Regex produced | Notes |
|---|---|---|
| `startOfLine()` | `^` | Use with RegexOption.MULTILINE for per-line anchoring |
| `endOfLine()` | `$` | Use with RegexOption.MULTILINE for per-line anchoring |
| `startOfText()` | `\A` | Anchors to the very beginning of the input |
| `endOfText()` | `\z` | Anchors to the very end of the input |
| `wordBoundary()` | `\b` | Transition between word and non-word character |

All quantifiers accept an optional greedy: Boolean parameter (default true).
Pass greedy = false to make the quantifier lazy (matches as few characters as possible).
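The difference between greedy and lazy only shows up when the match is allowed to stop early, so find() illustrates it better than a whole-string matches(). The sketch below uses plain kotlin.text.Regex rather than the DSL; the helper name firstTag is made up for the example.

```kotlin
// Plain kotlin.text.Regex, no kexpresso dependency: "+" is greedy, "+?" is lazy.
fun firstTag(input: String, greedy: Boolean): String? {
    val quantifier = if (greedy) "+" else "+?"
    return Regex("<.$quantifier>").find(input)?.value
}

fun main() {
    println(firstTag("<b>bold</b>", greedy = true))   // <b>bold</b>
    println(firstTag("<b>bold</b>", greedy = false))  // <b>
}
```

The greedy form runs to the last ">" in the input, while the lazy form stops at the first one.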
| Method | Regex produced | Notes |
|---|---|---|
| `optional { }` | `(?:...)?` | Zero or one occurrence |
| `zeroOrMore { }` | `(?:...)*` | Zero or more occurrences |
| `oneOrMore { }` | `(?:...)+` | One or more occurrences |
| `exactly(n) { }` | `(?:...){n}` | Exactly n occurrences |
| `atLeast(n) { }` | `(?:...){n,}` | At least n occurrences |
| `between(min, max) { }` | `(?:...){min,max}` | Between min and max occurrences (inclusive) |

Lazy example:
val lazyDigits = kexpresso {
startOfText()
oneOrMore(greedy = false) { digit() }
endOfText()
}
lazyDigits.matches("42") // true

| Method | Regex produced | Notes |
|---|---|---|
| `group { }` | `(?:...)` | Non-capturing group |
| `capture { }` | `(...)` | Numbered capturing group |
| `capture("name") { }` | `(?<name>...)` | Named capturing group |
| `oneOf({ }, { }, …)` | `(?:a\|b\|…)` | Alternation: matches any one of the given patterns |

Named capture example:
val orderPattern = kexpresso { literal(": "); capture("drink") { word() } }
val result = orderPattern.find("Order: Cappuccino please")
result?.groups?.get("drink")?.value // "Cappuccino"

Alternation example:
val drinkMenu = kexpresso {
oneOf(
{ literal("Espresso") },
{ literal("Latte") },
{ literal("Cappuccino") },
)
}
drinkMenu.matches("Latte") // true
drinkMenu.matches("Americano") // false

Lookarounds assert a condition at the current position without consuming any characters. They are zero-width: the matched text is not included in the result.

| Method | Regex produced | Notes |
|---|---|---|
| `followedBy { }` | `(?=...)` | Positive lookahead: position must be followed by the pattern |
| `notFollowedBy { }` | `(?!...)` | Negative lookahead: position must NOT be followed by the pattern |
| `precededBy { }` | `(?<=...)` | Positive lookbehind: position must be preceded by the pattern |
| `notPrecededBy { }` | `(?<!...)` | Negative lookbehind: position must NOT be preceded by the pattern |

Example (extract the numeric part of a measurement):
// Match digits only when immediately followed by "ml"
val mlAmount = kexpresso {
oneOrMore { digit() }
followedBy { literal("ml") }
}
mlAmount.find("250ml")?.value // "250" (lookahead consumed nothing: "ml" stays in input)
mlAmount.find("250g") // null (not followed by "ml")

Note: The JVM regex engine requires lookbehind patterns to be bounded in length. precededBy { oneOrMore { digit() } } (unbounded +) will throw a PatternSyntaxException when the regex is compiled, that is, when the pattern is constructed at runtime, not at Kotlin compile time. Use a bounded form instead: precededBy { between(1, 10) { digit() } }.
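For reference, the bounded form recommended in the note corresponds to a raw regex like the one below. This is a sketch with plain kotlin.text.Regex on the JVM; the value name unitAfterDigits is made up, and the regex string is what precededBy { between(1, 10) { digit() } } is assumed to express, not output copied from the library.

```kotlin
// Plain kotlin.text.Regex on the JVM: a lookbehind with a bounded {1,10} quantifier.
// Matches "ml" only when 1 to 10 digits come immediately before it.
val unitAfterDigits = Regex("(?<=\\d{1,10})ml")

fun main() {
    println(unitAfterDigits.find("250ml")?.value)  // ml
    println(unitAfterDigits.find("ml"))            // null
}
```

The lookbehind is zero-width, so only "ml" is part of the match; the digits are checked but not consumed.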
| Method | Regex produced | Notes |
|---|---|---|
| `raw(pattern)` | pattern verbatim | No escaping; use only for raw regex fragments the DSL cannot yet express |
| `include(pattern)` | `(?:pattern.source)` | Embed a compiled KexpressoPattern as a non-capturing group |
| `backreference(n)` | `\n` | Numeric back-reference to the nth capturing group (n ≥ 1) |
| `backreference(name)` | `\k<name>` | Named back-reference; name must start with a letter and contain only letters or digits |

raw example (inject a verbatim date fragment):
val datePattern = kexpresso { raw("\\d{4}-\\d{2}-\\d{2}") }
datePattern.matches("2026-06-03") // true

include example (compose a reusable octet pattern into an IP address):
val octet = kexpresso { between(1, 3) { digit() } }
val ip = kexpresso {
include(octet)
exactly(3) { char('.'); include(octet) }
}
ip.matches("192.168.1.1") // true

backreference example (detect repeated words):
val repeated = kexpresso {
capture { oneOrMore { wordChar() } }
whitespace()
backreference(1)
}
repeated.containsMatchIn("latte latte") // true
repeated.containsMatchIn("latte mocha") // false

These extension functions on KexpressoBuilder compose common real-world patterns from
the primitives above.
| Method | Pattern | Matches |
|---|---|---|
| `word()` | `[a-zA-Z0-9]+` | One or more alphanumeric characters (e.g. Espresso, Cappuccino42) |
| `handle()` | `[a-zA-Z0-9_-]+` | Like word() but also allows `_` and `-`, for usernames and slugs (e.g. cold-brew_2024) |
| `email()` | see source | A broadly valid email address (e.g. barista@coffee.shop) |
| `url()` | see source | An HTTP or HTTPS URL (e.g. https://coffee.shop/menu) |

email() and url() are intentionally permissive. Pair with startOfText() / endOfText() for strict whole-string validation.
| Method | What it matches |
|---|---|
| `sentence()` | A capital-letter-led sequence of words ending with `.`, `!`, or `?` |
| `paragraph()` | One or more sentences separated by single spaces |

val paragraphPattern = kexpresso { paragraph() }
paragraphPattern.matches("Latte is smooth. Espresso is bold!") // true
paragraphPattern.matches("latte is lowercase.") // false

Note: sentence() builds the first word as uppercaseLetter() + word(), so the first word must be at least two characters long (one uppercase letter followed by at least one alphanumeric character).
These helpers in Domains.kt let you match common real-world formats in one call.
Pair with startOfText()/endOfText() for whole-string validation.
| Helper | Matches | Caveats |
|---|---|---|
| `ipv4()` | IPv4 address, e.g. 192.168.1.1 | Decimal only; no CIDR notation |
| `uuid()` | RFC 4122 UUID versions 1–5, e.g. 550e8400-e29b-41d4-a716-446655440000 | Nil UUID and versions 6+ rejected |
| `slug()` | URL/CMS slug, e.g. cold-brew | Lowercase only; no underscores |
| `hexColor()` | CSS hex color #RGB, #RGBA, #RRGGBB, #RRGGBBAA, e.g. #1a2b3c | 5- and 7-digit forms are invalid CSS and do not match |
| `semanticVersion()` | SemVer 2.0.0 string, e.g. 1.0.0-rc.1+build.42 | No leading v; partial forms like 1.0 rejected |
| `isoDate()` | ISO-8601 date YYYY-MM-DD, e.g. 2024-01-15 | Does NOT validate day-of-month (Feb 30 passes) |
| `isoTime()` | ISO-8601 time `HH:MM[:SS][Z\|±HH:MM]`, e.g. 14:30:00Z | Leap seconds and fractional seconds not supported |
| `integerNumber()` | Signed/unsigned integer without leading zeros, e.g. -7, 42 | No upper bound on digit count |
| `decimalNumber()` | Decimal with optional fractional part, e.g. 3.14, -0.5 | Bare .5 and scientific notation not supported |
| `hashtag()` | Social-media hashtag #word, e.g. #Espresso | First char after # must be a letter, not a digit |
| `mention()` | @mention (Twitter/X), 1–50 chars, e.g. @barista | Other platforms may allow longer names |
| `e164Phone()` | E.164 phone number, e.g. +14155552671 | Compact form only: no separators, no country-code validation |
| `ipv6()` | IPv6 address, full or `::`-compressed, e.g. 2001:db8::1, ::1 | Embedded IPv4 (::ffff:192.168.1.1) and zone IDs (%eth0) not supported |
| `macAddress()` | IEEE 802 MAC address, colon- or hyphen-separated, e.g. 01:23:45:67:89:AB | Cisco dot notation not supported; mixed separators rejected |
| `base64()` | Standard Base64 string with optional =/== padding, e.g. S2V4cHJlc3Nv | Also matches empty string; URL-safe Base64 (-/_) not matched |
| `jwt()` | JSON Web Token: three base64url segments separated by dots | Structural only: signature not verified, payload not decoded |

Example (validate an IPv4 address):
val ipValidator = kexpresso {
startOfText()
ipv4()
endOfText()
}
ipValidator.matches("192.168.1.1") // true
ipValidator.matches("256.0.0.1") // false (octet out of range)

Example (extract all hashtags from a post):
val hashtagPattern = kexpresso { hashtag() }
val post = "Loving my #Espresso and #ColdBrew today! #Coffee"
val tags = hashtagPattern.findAll(post).map { it.value }.toList()
// ["#Espresso", "#ColdBrew", "#Coffee"]

kexpresso { } returns a KexpressoPattern: an immutable, thread-safe wrapper
around a compiled Regex.
val p = kexpresso { oneOrMore { letter() } }
p.matches("Espresso") // true (entire string must match)
p.containsMatchIn("Order: Espresso please") // true (match anywhere in the string)

val wordPattern = kexpresso { oneOrMore { letter() } }
// First match only
val first = wordPattern.find("Espresso Latte")
first?.value // "Espresso"
// Skip ahead with startIndex
val second = wordPattern.find("Espresso Latte", startIndex = 9)
second?.value // "Latte"
// All non-overlapping matches (returns a lazy Sequence)
val drinks = wordPattern.findAll("Espresso Latte Cappuccino").map { it.value }.toList()
// ["Espresso", "Latte", "Cappuccino"]

KexpressoPattern exposes convenience methods that delegate to the underlying Regex:

replaceFirst (replace the first match):
val drink = kexpresso { oneOrMore { letter() } }
drink.replaceFirst("espresso latte", "ESPRESSO") // "ESPRESSO latte"

replaceAll with a fixed string (replace every match):
val drink = kexpresso { oneOrMore { letter() } }
drink.replaceAll("espresso latte", "brew") // "brew brew"

replaceAll with a transform (compute the replacement per match):
val drink = kexpresso { oneOrMore { letter() } }
drink.replaceAll("espresso latte") { it.value.uppercase() } // "ESPRESSO LATTE"

split (split around matches):
val sep = kexpresso { literal(", ") }
sep.split("Espresso, Latte, Cappuccino") // ["Espresso", "Latte", "Cappuccino"]
sep.split("Espresso, Latte, Cappuccino", limit = 2) // ["Espresso", "Latte, Cappuccino"]

matchEntire (full-string match with group access):
val drinkOrder = kexpresso {
capture("drink") { oneOrMore { letter() } }
whitespace()
capture("size") { oneOrMore { letter() } }
}
val result = drinkOrder.matchEntire("Latte Large")
result?.groups?.get("drink")?.value // "Latte"
result?.groups?.get("size")?.value // "Large"

Reading captured groups from a MatchResult is normally verbose and stringly-typed:
result.groups["year"]?.value?.toInt(). The Captures API wraps any MatchResult and
provides type-safe accessors:
val datePattern = kexpresso {
capture("year") { exactly(4) { digit() } }
literal("-")
capture("month") { exactly(2) { digit() } }
literal("-")
capture("day") { exactly(2) { digit() } }
}
val caps = datePattern.find("2026-06-03")?.captures
caps?.int("year") // 2026
caps?.int("month") // 6
caps?.int("day") // 3
caps?.string("day") // "03"

Use ...OrThrow variants when the group is guaranteed to be present; they give clear
error messages instead of silent nulls:
val pricePattern = kexpresso {
literal("\$")
capture("dollars") { oneOrMore { digit() } }
}
val caps = pricePattern.find("\$42")?.captures ?: error("no match")
caps.intOrThrow("dollars") // 42
caps.intOrThrow("missing") // throws NoSuchElementException: "Named group 'missing'…"
caps.intOrThrow("dollars") // throws NumberFormatException if value isn't an Int

By index: index 0 is the whole match, 1 is the first capturing group, etc.:
val pricePattern = kexpresso {
literal("\$")
capture { oneOrMore { digit() } }
}
val caps = pricePattern.find("\$42")?.captures
caps?.string(0) // "\$42" (whole match)
caps?.int(1) // 42 (first capture group)

Supported types: string, int, long, double, boolean (strict: "true"/"false" only).
All nullable variants return null on absent/unparseable values; ...OrThrow variants throw
NoSuchElementException, NumberFormatException, or IllegalArgumentException with a message
that names the group and the offending value.
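To make the nullable versus ...OrThrow contract concrete, here is a minimal sketch of how such accessors could be layered over a standard MatchResult. The extension functions below are hypothetical and index-based only; kexpresso's real Captures API may be implemented differently.

```kotlin
// Hypothetical extensions over kotlin.text.MatchResult, not kexpresso's Captures API.
// Nullable variant: null when the group is absent or not parseable as an Int.
fun MatchResult.intOrNull(index: Int): Int? =
    groupValues.getOrNull(index)?.toIntOrNull()

// Throwing variant: the message names the group and the offending value.
fun MatchResult.intOrThrow(index: Int): Int {
    val raw = groupValues.getOrNull(index)
        ?: throw NoSuchElementException("No group at index $index")
    return raw.toIntOrNull()
        ?: throw NumberFormatException("Group $index is not an Int: '$raw'")
}

fun main() {
    val match = Regex("(\\d{4})-(\\d{2})").find("2026-06") ?: error("no match")
    println(match.intOrNull(1))  // 2026
    println(match.intOrNull(2))  // 6
    println(match.intOrNull(9))  // null
}
```

The nullable form suits optional groups, while the throwing form fails fast with a message that points at the group in question.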
val p = kexpresso { digit(); letter() }
p.source // "\\d[a-zA-Z]" (raw regex string)
p.options // emptySet(), a Set<RegexOption>

Every pattern can explain itself in plain English. describe() walks the internal AST
(the same representation that renders the regex) and returns a deterministic, comma-joined
phrase — handy for code review, logging, or learning what a pattern does:
val p = kexpresso { startOfText(); oneOrMore { digit() }; endOfText() }
p.source // "\\A(?:\\d)+\\z"
p.describe() // "start of text, one or more of (a digit), end of text"

Domain helpers (e.g. email()) are emitted as raw fragments, so they describe as
raw regex `…` rather than a fully decomposed phrase.
examples() walks the internal AST and produces strings that satisfy matches():
val drinkCode = kexpresso {
uppercaseLetter()
oneOrMore { lowercaseLetter() }
}
drinkCode.examples(3) // e.g. ["Ac", "Bs", "Ct"] — each passes drinkCode.matches(it)
// Deterministic: same seed → same list
drinkCode.examples(5, seed = 42)
// Exact repetition
val pinPattern = kexpresso { exactly(4) { digit() } }
pinPattern.examples(3) // e.g. ["5279", "1836", "4021"]
// Alternation
val drinkMenu = kexpresso { oneOf({ literal("Espresso") }, { literal("Latte") }) }
drinkMenu.examples(5) // ["Espresso", "Latte"]

Honesty contract (when examples are guaranteed to match):
examples() guarantees that every returned string satisfies matches() when the
pattern's AST contains only supported nodes: Sequence, Literal, Token primitives
(digit, letter, whitespace, …), Quantifier, Group, and Alternation.
Best-effort cases (no match guarantee):
- Raw fragments, including domain helpers (email(), ipv4(), isoDate(), …) and Kexpresso.from(rawRegex), which all use raw nodes internally.
- Lookarounds (followedBy, precededBy, …): zero-width; skipped during generation.

In best-effort mode examples() still returns without throwing; the results simply may
not satisfy matches().
val p = kexpresso { literal("Cappuccino") }
val kotlinRegex: Regex = p.toRegex() // all targets
val javaPattern: java.util.regex.Pattern = p.toPattern() // JVM only
toRegex() is available on every target. toPattern() is a JVM-only extension (java.util.regex.Pattern does not exist on Kotlin/JS).
Pass any number of RegexOption values to the kexpresso { } call or to
Kexpresso.pattern { }:
val caseInsensitive = kexpresso(RegexOption.IGNORE_CASE) {
literal("espresso")
}
caseInsensitive.matches("ESPRESSO") // true
caseInsensitive.matches("Espresso") // true
val multiline = kexpresso(RegexOption.MULTILINE) {
startOfLine()
literal("Espresso")
endOfLine()
}
multiline.containsMatchIn("Espresso\nCappuccino") // true

If you prefer an object-oriented style, use Kexpresso.pattern { }; it is identical
to the top-level kexpresso { } function:
val p = Kexpresso.pattern(RegexOption.IGNORE_CASE) { literal("Ristretto") }
p.matches("ristretto") // true

Certain regex patterns can cause catastrophic backtracking: an attacker who controls
input can make the regex engine take exponential time. The classic shape is nested
unbounded quantifiers such as (?:a+)+.
Kexpresso provides a best-effort static analyzer to catch this shape at development time:
// DSL produces (?:(?:[a-zA-Z])+)+ — nested unbounded quantifiers
val risky = kexpresso { oneOrMore { oneOrMore { letter() } } }
val report = risky.analyze()
if (report.isPotentiallyVulnerable) {
println("Findings:")
report.findings.forEach { println(" [${it.severity}] ${it.message}") }
}
// Findings:
// [WARNING] Nested unbounded quantifier at index 0: (?:(?:[a-zA-Z])+)+ …
// Convenience shorthand
if (risky.isPotentiallyVulnerable) { /* warn or reject */ }

This is a best-effort heuristic, not a guarantee. It detects the canonical "evil
regex" shape — a group with an outer unbounded quantifier (*, +, or {n,}) whose
body also contains an inner unbounded quantifier. It does NOT detect all ReDoS patterns
(e.g. alternation-based catastrophic backtracking), and a clean result does not prove
the pattern is safe. Use it as an early-warning signal alongside proper input constraints
and performance testing.
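To show what "an outer unbounded quantifier whose body also contains an inner unbounded quantifier" means mechanically, here is a rough sketch that scans a raw regex string for that shape. It is a hypothetical illustration written for this README, not kexpresso's analyzer (which is described as working on the pattern's AST), and it shares the same blind spots: it only recognizes this one shape.

```kotlin
// Hypothetical sketch, not kexpresso's analyzer: flags a group with an unbounded
// quantifier (*, +, {n,}) whose body also contains an unbounded quantifier.
fun isUnboundedQuantifierAt(regex: String, i: Int): Boolean {
    if (i >= regex.length) return false
    return when (regex[i]) {
        '*', '+' -> true
        '{' -> {
            val close = regex.indexOf('}', i)
            close > i && Regex("\\d+,").matches(regex.substring(i + 1, close))
        }
        else -> false
    }
}

fun hasNestedUnboundedQuantifier(regex: String): Boolean {
    // One flag per open group: has its body seen an unbounded quantifier yet?
    val openGroups = ArrayDeque<Boolean>()
    var inCharClass = false
    var i = 0
    while (i < regex.length) {
        val c = regex[i]
        when {
            c == '\\' -> i++ // skip the escaped character
            inCharClass -> { if (c == ']') inCharClass = false }
            c == '[' -> inCharClass = true
            c == '(' -> openGroups.addLast(false)
            c == ')' && openGroups.isNotEmpty() -> {
                val bodyUnbounded = openGroups.removeLast()
                if (bodyUnbounded && isUnboundedQuantifierAt(regex, i + 1)) return true
                // An inner unbounded quantifier also belongs to the enclosing group's body.
                if (bodyUnbounded && openGroups.isNotEmpty()) {
                    openGroups[openGroups.lastIndex] = true
                }
            }
            isUnboundedQuantifierAt(regex, i) -> {
                if (openGroups.isNotEmpty()) openGroups[openGroups.lastIndex] = true
            }
        }
        i++
    }
    return false
}

fun main() {
    println(hasNestedUnboundedQuantifier("(?:(?:[a-zA-Z])+)+"))     // true
    println(hasNestedUnboundedQuantifier("(?:\\d{4})-(?:\\d{2})+")) // false
}
```

A string scan like this is cruder than an AST walk, but it makes the limitation easy to see: a regex without nested groups passes even when it can still backtrack badly.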
Inherited a cryptic regex? Kexpresso.from(...) reads it back: it compiles the regex and
lets you explain it (describe()) or rewrite it as kexpresso DSL (toKexpressoCode()).
val pattern = Kexpresso.from("\\d{4}-\\d{2}-\\d{2}")
pattern.describe()
// "exactly 4 of (a digit), the literal "-", exactly 2 of (a digit), the literal "-", exactly 2 of (a digit)"
println(pattern.toKexpressoCode())
// kexpresso {
// exactly(4) { digit() }
// literal("-")
// exactly(2) { digit() }
// literal("-")
// exactly(2) { digit() }
// }

toKexpressoCode() works on any KexpressoPattern, whether you built it with the DSL or
parsed it with from — so you can round-trip between the two representations.
Matching is always exact; parsing is best-effort. Kexpresso.from(r) compiles r
verbatim, so Kexpresso.from(r).matches(x) is always identical to Regex(r).matches(x).
The structural parse that powers describe() and toKexpressoCode() models the common
constructs (literals, predefined classes, anchors, quantifiers, groups, lookarounds,
alternation, back-references) and honestly degrades anything it doesn't model to raw("…")
(e.g. possessive quantifiers, atomic groups (?>…), inline-flag groups (?i)). The generated
code stays compilable and never changes match behaviour. An invalid regex throws
PatternSyntaxException, exactly as Regex(...) would.
# Compile, run all tests, Detekt static analysis, and the Kover coverage check
./gradlew build
# Run tests only
./gradlew test
# Run Detekt only
./gradlew detekt

See CONTRIBUTING.md for the full contributor guide, including how to add a new DSL primitive, and docs/ARCHITECTURE.md for a map of how the codebase fits together.
Found a vulnerability? Please report it privately — see SECURITY.md. The project also runs CodeQL static analysis and an OpenSSF Scorecard supply-chain check on every push, and uses Dependabot to keep dependencies and GitHub Actions current.
For catastrophic-backtracking (ReDoS) risk in your own patterns, kexpresso ships a best-effort analyzer — see Safety: ReDoS analysis.
This project follows the Contributor Covenant. By participating, you are expected to uphold it.
kexpresso is free and open source. If it saves you time, you can say thanks:
☕ https://buymeacoffee.com/elzinko — every coffee fuels another release.
MIT — Copyright (c) 2026 Thomas Couderc.
A fluent Kotlin DSL that makes regular expressions readable.
Raw regular expressions are write-only. A week after authoring one, even the writer struggles to remember what it does:
// Raw regex — what does this match?
val emailRegex = Regex("[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\\.[a-zA-Z]{2,}")With kexpresso the same constraint reads like English:
// kexpresso — self-documenting and composable
val emailPattern = kexpresso {
email()
}Or, for a richer pattern that you build up incrementally:
val strictEmail = kexpresso {
startOfText()
email()
endOfText()
}
strictEmail.matches("barista@coffee.shop") // true
strictEmail.matches("not an email") // falseTwo equivalent entry points —
kexpresso { }(top-level function) andKexpresso.pattern { }(object-oriented style) produce the sameKexpressoPattern. See Object-oriented entry point below.
Benefits at a glance:
Regex at construction time
(measured: 0 % match-time overhead vs raw Regex — see benchmarks).Is kexpresso right for your case? We're honest about it: it's great for complex, maintained patterns and a poor fit for trivial ones. Read When to use kexpresso — and when not to before adopting. Where we're headed: the Roadmap.
Clone and run the guided-tour sample — no extra setup, no credentials:
git clone https://github.com/elzinko/kexpresso && cd kexpresso && ./gradlew :samples:runThe console output walks you through every headline feature: building patterns, domain helpers,
typed captures, describe(), reverse-engineering a raw regex with Kexpresso.from(), and ReDoS
analysis.
Once io.github.elzinko:kexpresso lands in your own project (see Install below), the same
capabilities are one dependency away.
Kexpresso is published to Maven Central — no repository configuration and no token required. Just add the dependency.
groupId is now
io.github.elzinko(it wascom.github.elzinkoon JitPack/GitHub Packages). Maven Central requires theio.github.*namespace.
// build.gradle.kts
dependencies {
implementation("io.github.elzinko:kexpresso:0.9.0")
}mavenCentral() is in your repositories by default in most projects; add it if needed:
repositories {
mavenCentral()
}<dependency>
<groupId>io.github.elzinko</groupId>
<artifactId>kexpresso</artifactId>
<version>0.9.0</version>
</dependency>Maven Central is the recommended source. Two alternatives remain available:
com.github.elzinko:kexpresso:<tag>.
Serves jvm, js, wasmJs, linuxX64, mingwX64 only (no Apple/iOS — JitPack builds on Linux).
Add maven { url = uri("https://jitpack.io") } to your repositories.io.github.elzinko:kexpresso:0.9.0.
Requires a GitHub token (a GitHub limitation, even for public packages) — see
Where the artifacts are hosted.Kexpresso is a Kotlin Multiplatform library. The full DSL is written in
commonMain, so the builder, describe(), analyze(), captures, and the reverse
(regex → DSL) API are available on every supported target:
| Target | Status |
|---|---|
| JVM | ✅ published |
| JS (IR, Node.js) | ✅ published |
Wasm (wasmJs, Node.js) |
✅ published |
Native — linuxX64, mingwX64
|
✅ published |
Native — macosX64, macosArm64
|
✅ published |
Native — iosArm64, iosX64, iosSimulatorArm64
|
✅ published |
Built per host; published from macOS. Kotlin/Native targets only cross-compile from a capable host, so the build registers them conditionally: the Linux CI gate builds
linuxX64+mingwX64(fast), while macOS — the most capable host — builds every target. The release therefore runs on amacos-latestrunner and publishes the complete, consistent multiplatform metadata (JVM, JS, Wasm, Linux, Windows, macOS, iOS) from that single host, so a consumer resolving the root module sees every variant. A dedicatedApple & Nativeworkflow exercises the Apple/iOS targets on every PR.(Building the Apple targets locally requires a full Xcode install — a Command-Line-Tools-only macOS box still builds jvm/js/wasmJs and skips the Apple/Native targets with a warning.)
For a Gradle Multiplatform consumer, the dependency resolves automatically per target via Gradle module metadata:
kotlin {
sourceSets {
commonMain.dependencies {
implementation("io.github.elzinko:kexpresso:0.9.0")
}
}
}A plain-Maven (JVM-only) consumer must use the target-suffixed coordinate instead:
<dependency>
<groupId>io.github.elzinko</groupId>
<artifactId>kexpresso-jvm</artifactId>
<version>0.9.0</version>
</dependency>Breaking change (since the multiplatform release): artifact coordinates now carry a target suffix. Gradle resolves
io.github.elzinko:kexpresso:0.9.0to the right target automatically through Gradle metadata, but tools that ignore Gradle metadata (e.g. plain Maven) must referencekexpresso-jvmdirectly.
Not every target is available from every repository — choose your source accordingly:
| Repository | Targets served | Auth |
|---|---|---|
| Maven Central (the Install section) | all targets, incl. macosX64/macosArm64/iosArm64/iosX64/iosSimulatorArm64 | none |
| GitHub Packages | all targets, incl. macosX64/macosArm64/iosArm64/iosX64/iosSimulatorArm64 | a GitHub token |
| JitPack | jvm, js, wasmJs, linuxX64, mingwX64 | none |
Maven Central is the recommended source for every target (including Apple/iOS) with no authentication — the release runs on macOS and publishes the complete, signed multiplatform metadata. GitHub Packages serves the same full set but requires a token. JitPack builds on demand on Linux, so it can never produce the Apple/iOS artifacts.
To consume from GitHub Packages instead (a personal-access token with read:packages is
required even for public packages — a GitHub limitation):
// settings.gradle.kts
dependencyResolutionManagement {
    repositories {
        maven("https://maven.pkg.github.com/elzinko/kexpresso") {
            credentials {
                username = providers.gradleProperty("gpr.user").orNull ?: System.getenv("GITHUB_ACTOR")
                password = providers.gradleProperty("gpr.key").orNull ?: System.getenv("GITHUB_TOKEN")
            }
        }
    }
}

The DSL builds the same regex string on every platform, but each non-JVM target uses its
own regex engine rather than the JVM's java.util.regex (PCRE-like) engine, so the supported
feature set narrows the further you get from the JVM. The portable common API — primitives,
quantifiers, character classes, alternation, simple/named groups, named & numeric
backreferences, lookahead, \b, literal escaping, describe(), toKexpressoCode(), captures,
and analyze() — works everywhere. Some JVM-flavoured constructs remain JVM-only at
runtime: they build fine but throw when compiled to a Regex on the smaller engines.
- JS: startOfText() / endOfText() (the \A / \z anchors) are not valid ECMAScript; use startOfLine() / endOfLine() (^ / $) for portable code. Atomic groups (?>…), possessive quantifiers (a++, a*+), and some lookbehind forms are also JVM-only and only ever appear via raw(...) or Kexpresso.from(...).
- Wasm (wasmJs): runs on the same ECMAScript engine via the host; same caveats as JS.
- Native (kotlin.text.Regex): ships a capable pure-Kotlin engine that is actually a superset of ECMAScript here: it accepts the \A / \z / \Z / \G anchors, named groups, named/numeric backreferences, lookahead, lookbehind, and atomic groups. Even so, treat the JVM as the reference engine; exotic PCRE-only constructs reachable via raw(...) may still differ.

The whole commonTest portable suite (31 tests) passes identically on JVM, JS, Wasm, and the
built native targets. JVM-only constructs are exercised in the JVM-only jvmTest suite.
Literal escaping is portable: literal("a.b") renders as a\.b (a per-character
escaper) rather than the JVM-only \Qa.b\E; matching behaviour is identical everywhere.
toPattern() (conversion to java.util.regex.Pattern) is a JVM-only extension and is
not available on JS, Wasm, or Native.
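To make the per-character escaping idea concrete, here is a minimal sketch that uses only the Kotlin standard library. The `escapeLiteral` helper and the `REGEX_METACHARACTERS` set are hypothetical names chosen for illustration; they are not kexpresso's actual implementation, which may escape a different set of characters.

```kotlin
// Hypothetical helper for illustration only; not kexpresso's real escaper.
// Characters that carry special meaning outside a character class.
val REGEX_METACHARACTERS = "\\^\$.|?*+()[]{}"

// Prefix every metacharacter with a backslash instead of wrapping the text in \Q...\E.
fun escapeLiteral(text: String): String = buildString {
    for (c in text) {
        if (c in REGEX_METACHARACTERS) append('\\')
        append(c)
    }
}

val escaped = escapeLiteral("a.b")                  // a\.b
val matchesDot = Regex(escaped).matches("a.b")      // true
val matchesOther = Regex(escaped).matches("axb")    // false: the dot is literal
```

Because the output contains only backslash escapes, the same string is accepted by engines that do not understand the `\Q…\E` quoting syntax.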
val drinkName = kexpresso {
uppercaseLetter()
oneOrMore { letter() }
}
drinkName.matches("Espresso") // true
drinkName.matches("espresso") // false (no capital first letter)
drinkName.matches("Espresso42") // false (digit at the end)

val wordPattern = kexpresso { word() }
val order = "Espresso Latte Cappuccino"
val drinks = wordPattern.findAll(order).map { it.value }.toList()
// ["Espresso", "Latte", "Cappuccino"]

val emailValidator = kexpresso {
    startOfText()
    email()
    endOfText()
}
emailValidator.matches("barista@coffee.shop") // true
emailValidator.matches("barista@coffee.shop extra") // false
emailValidator.matches("not-an-email") // false

val sentencePattern = kexpresso { sentence() }
sentencePattern.matches("Espresso is perfect!") // true
sentencePattern.matches("espresso is lowercase.") // false
sentencePattern.matches("No punctuation at the end") // false

| Method | Regex produced | Notes |
|---|---|---|
| `literal(text)` | escaped text (e.g. `a\.b`) | Escapes each regex metacharacter |
| `char(c)` | escaped char | Escapes metacharacters |
| `digit()` | `\d` | Decimal digit 0–9 |
| `nonDigit()` | `\D` | Any non-digit |
| `whitespace()` | `\s` | Space, tab, newline, … |
| `nonWhitespace()` | `\S` | Any non-whitespace |
| `wordChar()` | `\w` | Letter, digit, or `_` |
| `nonWordChar()` | `\W` | Not a word character |
| `anyChar()` | `.` | Any character except newline |
| `letter()` | `[a-zA-Z]` | ASCII letters only |
| `uppercaseLetter()` | `[A-Z]` | ASCII uppercase letters |
| `lowercaseLetter()` | `[a-z]` | ASCII lowercase letters |
| `alphanumeric()` | `[a-zA-Z0-9]` | ASCII letter or digit |
| `tab()` | `\t` | Horizontal tab |
| `newline()` | `\n` | Newline |
| `carriageReturn()` | `\r` | Carriage return |
| `nonWordBoundary()` | `\B` | Non-word boundary position |
| `endPunctuation()` | `[.!?]` | Sentence-ending punctuation |

| Method | Regex produced | Notes |
|---|---|---|
| `anyOf(chars)` | `[chars]` | One character from the given set; metacharacters escaped |
| `noneOf(chars)` | `[^chars]` | One character NOT in the given set |
| `inRange(from, to)` | `[from-to]` | One character in the inclusive range |

| Method | Regex produced | Notes |
|---|---|---|
| `startOfLine()` | `^` | Use with RegexOption.MULTILINE for per-line anchoring |
| `endOfLine()` | `$` | Use with RegexOption.MULTILINE for per-line anchoring |
| `startOfText()` | `\A` | Anchors to the very beginning of the input |
| `endOfText()` | `\z` | Anchors to the very end of the input |
| `wordBoundary()` | `\b` | Transition between word and non-word character |

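The difference between the line anchors and the text anchors is easiest to see on multi-line input. The sketch below uses plain `kotlin.text.Regex` with the regex strings listed in the table, not the DSL itself, and the results in the comments assume the JVM engine (the section on platform support notes that `\A` / `\z` are not valid ECMAScript).

```kotlin
val twoLines = "Espresso\nLatte"

// ^ and $ anchor at line boundaries only when MULTILINE is set.
val perLine = Regex("^Latte\$", RegexOption.MULTILINE).containsMatchIn(twoLines)  // true
val withoutMultiline = Regex("^Latte\$").containsMatchIn(twoLines)                // false

// \A and \z always refer to the whole input, regardless of options.
val wholeTextOnTwoLines = Regex("\\ALatte\\z").containsMatchIn(twoLines)          // false
val wholeTextExact = Regex("\\ALatte\\z").containsMatchIn("Latte")                // true
```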
All quantifiers accept an optional greedy: Boolean parameter (default true).
Pass greedy = false to make the quantifier lazy (matches as few characters as possible).
| Method | Regex produced | Notes |
|---|---|---|
| `optional { }` | `(?:...)?` | Zero or one occurrence |
| `zeroOrMore { }` | `(?:...)*` | Zero or more occurrences |
| `oneOrMore { }` | `(?:...)+` | One or more occurrences |
| `exactly(n) { }` | `(?:...){n}` | Exactly n occurrences |
| `atLeast(n) { }` | `(?:...){n,}` | At least n occurrences |
| `between(min, max) { }` | `(?:...){min,max}` | Between min and max occurrences (inclusive) |

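The effect of `greedy = false` only shows up when a match could end in more than one place. The following sketch uses plain `kotlin.text.Regex`; it assumes the lazy form is rendered by appending `?` to the quantifier (for example `(?:...)+?`), which is the standard regex notation but is not confirmed by the table above.

```kotlin
val tags = "<a><b>"

// Greedy: the quantifier takes as much as it can, then backs off to the last '>'.
val greedyMatch = Regex("<(?:.)+>").find(tags)?.value   // "<a><b>"

// Lazy: the quantifier takes as little as it can and stops at the first '>'.
val lazyMatch = Regex("<(?:.)+?>").find(tags)?.value    // "<a>"
```

When the pattern is anchored at both ends, as in the lazy example that follows, greedy and lazy quantifiers accept exactly the same strings; laziness changes which substring is returned, not whether a full-string match exists.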
Lazy example:
val lazyDigits = kexpresso {
startOfText()
oneOrMore(greedy = false) { digit() }
endOfText()
}
lazyDigits.matches("42") // true

| Method | Regex produced | Notes |
|---|---|---|
| `group { }` | `(?:...)` | Non-capturing group |
| `capture { }` | `(...)` | Numbered capturing group |
| `capture("name") { }` | `(?<name>...)` | Named capturing group |
| `oneOf({ }, { }, …)` | `(?:a\|b\|…)` | Alternation: matches any one of the given patterns |

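Wrapping the alternatives in a non-capturing group matters as soon as the alternation sits next to other elements, because `|` has the lowest precedence in a regex. A small sketch with plain `kotlin.text.Regex` (JVM engine assumed for the `\A` / `\z` anchors) shows what would go wrong without the group:

```kotlin
// Without a group, the anchors bind to one alternative each:
// "\AEspresso" OR "Latte\z".
val ungrouped = Regex("\\AEspresso|Latte\\z")

// With the group, both anchors apply to the whole alternation.
val grouped = Regex("\\A(?:Espresso|Latte)\\z")

val ungroupedAcceptsPrefix = ungrouped.containsMatchIn("Espresso, please")  // true
val groupedAcceptsPrefix = grouped.containsMatchIn("Espresso, please")      // false
val groupedAcceptsExact = grouped.containsMatchIn("Latte")                  // true
```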
Named capture example:
val orderPattern = kexpresso { literal(": "); capture("drink") { word() } }
val result = orderPattern.find("Order: Cappuccino please")
result?.groups?.get("drink")?.value // "Cappuccino"

Alternation example:
val drinkMenu = kexpresso {
oneOf(
{ literal("Espresso") },
{ literal("Latte") },
{ literal("Cappuccino") },
)
}
drinkMenu.matches("Latte") // true
drinkMenu.matches("Americano") // false

Lookarounds assert a condition at the current position without consuming any characters. They are zero-width: the matched text is not included in the result.
| Method | Regex produced | Notes |
|---|---|---|
| `followedBy { }` | `(?=...)` | Positive lookahead: position must be followed by the pattern |
| `notFollowedBy { }` | `(?!...)` | Negative lookahead: position must NOT be followed by the pattern |
| `precededBy { }` | `(?<=...)` | Positive lookbehind: position must be preceded by the pattern |
| `notPrecededBy { }` | `(?<!...)` | Negative lookbehind: position must NOT be preceded by the pattern |

Example — extract the numeric part of a measurement:
// Match digits only when immediately followed by "ml"
val mlAmount = kexpresso {
oneOrMore { digit() }
followedBy { literal("ml") }
}
mlAmount.find("250ml")?.value // "250" (lookahead consumed nothing: "ml" stays in input)
mlAmount.find("250g") // null (not followed by "ml")

Note: The JVM regex engine requires lookbehind patterns to be bounded in length. precededBy { oneOrMore { digit() } } (unbounded +) will throw a PatternSyntaxException when the regex is compiled. Use a bounded form instead: precededBy { between(1, 10) { digit() } }.

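As a rough illustration of the bounded form, the sketch below writes the regex by hand with `kotlin.text.Regex` on the JVM. The string `(?<=(?:\d){1,10})ml` is what the tables above suggest the bounded `precededBy` call would render, but that exact output is an assumption.

```kotlin
// Bounded lookbehind: "ml" must be preceded by 1 to 10 digits.
val unitAfterDigits = Regex("(?<=(?:\\d){1,10})ml")

val unitMatch = unitAfterDigits.find("250ml")
val matchedText = unitMatch?.value        // "ml": the digits are asserted, not consumed
val matchedRange = unitMatch?.range       // 3..4
val noDigitsBefore = unitAfterDigits.find("ml")  // null
```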
| Method | Regex produced | Notes |
|---|---|---|
| `raw(pattern)` | pattern verbatim | No escaping; use only for raw regex fragments the DSL cannot yet express |
| `include(pattern)` | `(?:pattern.source)` | Embed a compiled `KexpressoPattern` as a non-capturing group |
| `backreference(n)` | `\n` | Numeric back-reference to the nth capturing group (n ≥ 1) |
| `backreference(name)` | `\k<name>` | Named back-reference; name must start with a letter and contain only letters or digits |

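The naming rule for named back-references can be sketched with the standard library alone. `isValidGroupName` is a hypothetical helper written for this example (kexpresso's own validation may differ), and the `\k<word>` regex is written by hand rather than produced by the DSL.

```kotlin
// Hypothetical check mirroring the rule in the table:
// a letter first, then only letters or digits.
fun isValidGroupName(name: String): Boolean =
    Regex("[a-zA-Z][a-zA-Z0-9]*").matches(name)

// Hand-written equivalent of a named capture followed by a named back-reference.
val repeatedWord = Regex("(?<word>\\w+)\\s\\k<word>")

val sameWordTwice = repeatedWord.containsMatchIn("latte latte")   // true
val differentWords = repeatedWord.containsMatchIn("latte mocha")  // false
```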
raw example — inject a verbatim date fragment:
val datePattern = kexpresso { raw("\\d{4}-\\d{2}-\\d{2}") }
datePattern.matches("2026-06-03") // true

include example: compose a reusable octet pattern into an IP address:
val octet = kexpresso { between(1, 3) { digit() } }
val ip = kexpresso {
include(octet)
exactly(3) { char('.'); include(octet) }
}
ip.matches("192.168.1.1") // true

backreference example: detect repeated words:
val repeated = kexpresso {
capture { oneOrMore { wordChar() } }
whitespace()
backreference(1)
}
repeated.containsMatchIn("latte latte") // true
repeated.containsMatchIn("latte mocha") // false

These extension functions on KexpressoBuilder compose common real-world patterns from
the primitives above.
| Method | Pattern | Matches |
|---|---|---|
| `word()` | `[a-zA-Z0-9]+` | One or more alphanumeric characters (e.g. Espresso, Cappuccino42) |
| `handle()` | `[a-zA-Z0-9_-]+` | Like `word()` but also allows `_` and `-`: usernames and slugs (e.g. cold-brew_2024) |
| `email()` | see source | A broadly valid email address (e.g. barista@coffee.shop) |
| `url()` | see source | An HTTP or HTTPS URL (e.g. https://coffee.shop/menu) |

email() and url() are intentionally permissive. Pair with startOfText() / endOfText() for strict whole-string validation.

| Method | What it matches |
|---|---|
| `sentence()` | A capital-letter-led sequence of words ending with `.`, `!`, or `?` |
| `paragraph()` | One or more sentences separated by single spaces |

val paragraphPattern = kexpresso { paragraph() }
paragraphPattern.matches("Latte is smooth. Espresso is bold!") // true
paragraphPattern.matches("latte is lowercase.") // false

Note: sentence() builds the first word as uppercaseLetter() + word(), so the first word must be at least two characters long (one uppercase letter followed by at least one alphanumeric character).

These helpers in Domains.kt let you match common real-world formats in one call.
Pair with startOfText()/endOfText() for whole-string validation.
| Helper | Matches | Caveats |
|---|---|---|
| `ipv4()` | IPv4 address, e.g. `192.168.1.1` | Decimal only; no CIDR notation |
| `uuid()` | RFC 4122 UUID versions 1–5, e.g. `550e8400-e29b-41d4-a716-446655440000` | Nil UUID and versions 6+ rejected |
| `slug()` | URL/CMS slug, e.g. `cold-brew` | Lowercase only; no underscores |
| `hexColor()` | CSS hex color `#RGB`, `#RGBA`, `#RRGGBB`, `#RRGGBBAA`, e.g. `#1a2b3c` | 5- and 7-digit forms are invalid CSS and do not match |
| `semanticVersion()` | SemVer 2.0.0 string, e.g. `1.0.0-rc.1+build.42` | No leading `v`; partial forms like `1.0` rejected |
| `isoDate()` | ISO-8601 date `YYYY-MM-DD`, e.g. `2024-01-15` | Does NOT validate day-of-month (Feb 30 passes) |
| `isoTime()` | ISO-8601 time `HH:MM[:SS][Z\|±HH:MM]`, e.g. `14:30:00Z` | Leap seconds and fractional seconds not supported |
| `integerNumber()` | Signed/unsigned integer without leading zeros, e.g. `-7`, `42` | No upper bound on digit count |
| `decimalNumber()` | Decimal with optional fractional part, e.g. `3.14`, `-0.5` | Bare `.5` and scientific notation not supported |
| `hashtag()` | Social-media hashtag `#word`, e.g. `#Espresso` | First char after `#` must be a letter, not a digit |
| `mention()` | @mention (Twitter/X), 1–50 chars, e.g. `@barista` | Other platforms may allow longer names |
| `e164Phone()` | E.164 phone number, e.g. `+14155552671` | Compact form only: no separators; no country-code validation |
| `ipv6()` | IPv6 address, full or `::`-compressed, e.g. `2001:db8::1`, `::1` | Embedded IPv4 (`::ffff:192.168.1.1`) and zone IDs (`%eth0`) not supported |
| `macAddress()` | IEEE 802 MAC address, colon- or hyphen-separated, e.g. `01:23:45:67:89:AB` | Cisco dot notation not supported; mixed separators rejected |
| `base64()` | Standard Base64 string with optional `=`/`==` padding, e.g. `S2V4cHJlc3Nv` | Also matches empty string; URL-safe Base64 (`-`/`_`) not matched |
| `jwt()` | JSON Web Token: three base64url segments separated by dots | Structural only: signature not verified, payload not decoded |

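Several caveats in the table are about shape versus meaning: a pattern can confirm that a string looks like a date without confirming that the date exists. One possible way to close that gap on the JVM is to follow the regex check with a semantic parse. In the sketch below, `isoDateShape` is a hand-written stand-in for whatever `isoDate()` really emits, and `isRealIsoDate` is a hypothetical helper; only `java.time.LocalDate` is a real API.

```kotlin
import java.time.LocalDate
import java.time.format.DateTimeParseException

// Stand-in for the shape check; the regex that isoDate() actually produces may differ.
val isoDateShape = Regex("\\d{4}-\\d{2}-\\d{2}")

// Hypothetical two-step validation: shape first, calendar validity second.
fun isRealIsoDate(text: String): Boolean {
    if (!isoDateShape.matches(text)) return false
    return try {
        LocalDate.parse(text)   // strict ISO parsing rejects impossible dates
        true
    } catch (e: DateTimeParseException) {
        false
    }
}

val validDate = isRealIsoDate("2024-01-15")     // true
val impossibleDate = isRealIsoDate("2024-02-30") // false: passes the shape check, fails the calendar
val wrongShape = isRealIsoDate("2024-1-15")      // false: rejected by the regex
```

The same split applies to other helpers in the table, for example checking the structure of a token with `jwt()` and verifying its signature separately.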
Example — validate an IPv4 address:
val ipValidator = kexpresso {
startOfText()
ipv4()
endOfText()
}
ipValidator.matches("192.168.1.1") // true
ipValidator.matches("256.0.0.1") // false (octet out of range)

Example: extract all hashtags from a post:
val hashtagPattern = kexpresso { hashtag() }
val post = "Loving my #Espresso and #ColdBrew today! #Coffee"
val tags = hashtagPattern.findAll(post).map { it.value }.toList()
// ["#Espresso", "#ColdBrew", "#Coffee"]

kexpresso { } returns a KexpressoPattern: an immutable, thread-safe wrapper
around a compiled Regex.
val p = kexpresso { oneOrMore { letter() } }
p.matches("Espresso") // true — entire string must match
p.containsMatchIn("Order: Espresso please") // true, match anywhere in the string

val wordPattern = kexpresso { oneOrMore { letter() } }
// First match only
val first = wordPattern.find("Espresso Latte")
first?.value // "Espresso"
// Skip ahead with startIndex
val second = wordPattern.find("Espresso Latte", startIndex = 9)
second?.value // "Latte"
// All non-overlapping matches (returns a lazy Sequence)
val drinks = wordPattern.findAll("Espresso Latte Cappuccino").map { it.value }.toList()
// ["Espresso", "Latte", "Cappuccino"]

KexpressoPattern exposes convenience methods that delegate to the underlying Regex:
replaceFirst — replace the first match:
val drink = kexpresso { oneOrMore { letter() } }
drink.replaceFirst("espresso latte", "ESPRESSO") // "ESPRESSO latte"

replaceAll with a fixed string: replace every match:
val drink = kexpresso { oneOrMore { letter() } }
drink.replaceAll("espresso latte", "brew") // "brew brew"

replaceAll with a transform: compute the replacement per match:
val drink = kexpresso { oneOrMore { letter() } }
drink.replaceAll("espresso latte") { it.value.uppercase() } // "ESPRESSO LATTE"

split: split around matches:
val sep = kexpresso { literal(", ") }
sep.split("Espresso, Latte, Cappuccino") // ["Espresso", "Latte", "Cappuccino"]
sep.split("Espresso, Latte, Cappuccino", limit = 2) // ["Espresso", "Latte, Cappuccino"]

matchEntire: full-string match with group access:
val drinkOrder = kexpresso {
capture("drink") { oneOrMore { letter() } }
whitespace()
capture("size") { oneOrMore { letter() } }
}
val result = drinkOrder.matchEntire("Latte Large")
result?.groups?.get("drink")?.value // "Latte"
result?.groups?.get("size")?.value // "Large"

Reading captured groups from a MatchResult is normally verbose and stringly-typed:
result.groups["year"]?.value?.toInt(). The Captures API wraps any MatchResult and
provides type-safe accessors:
val datePattern = kexpresso {
capture("year") { exactly(4) { digit() } }
literal("-")
capture("month") { exactly(2) { digit() } }
literal("-")
capture("day") { exactly(2) { digit() } }
}
val caps = datePattern.find("2026-06-03")?.captures
caps?.int("year") // 2026
caps?.int("month") // 6
caps?.int("day") // 3
caps?.string("day") // "03"

Use ...OrThrow variants when the group is guaranteed to be present; they give clear
error messages instead of silent nulls:
val pricePattern = kexpresso {
literal("\$")
capture("dollars") { oneOrMore { digit() } }
}
val caps = pricePattern.find("\$42")?.captures ?: error("no match")
caps.intOrThrow("dollars") // 42
caps.intOrThrow("missing") // throws NoSuchElementException: "Named group 'missing'…"
caps.intOrThrow("dollars") // throws NumberFormatException if value isn't an Int

By index: index 0 is the whole match, 1 is the first capturing group, etc.:
val pricePattern = kexpresso {
literal("\$")
capture { oneOrMore { digit() } }
}
val caps = pricePattern.find("\$42")?.captures
caps?.string(0) // "\$42" — whole match
caps?.int(1) // 42 (first capture group)

Supported types: string, int, long, double, boolean (strict: "true"/"false" only).
All nullable variants return null on absent/unparseable values; ...OrThrow variants throw
NoSuchElementException, NumberFormatException, or IllegalArgumentException with a message
that names the group and the offending value.
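For readers curious how nullable typed accessors of this kind can be layered on a plain `MatchResult`, here is a minimal sketch using only the Kotlin standard library. The extension names `intOrNull` and `strictBooleanOrNull` are hypothetical and chosen so they do not collide with the Captures API; this is not how kexpresso implements it. Looking up groups by name via `groups[name]` assumes a reasonably recent Kotlin version.

```kotlin
// Hypothetical extensions for illustration; not part of kexpresso.
// Both return null when the group did not participate or the text cannot be parsed.
fun MatchResult.intOrNull(name: String): Int? =
    groups[name]?.value?.toIntOrNull()

fun MatchResult.strictBooleanOrNull(name: String): Boolean? =
    groups[name]?.value?.toBooleanStrictOrNull()   // accepts exactly "true" or "false"

val dateMatch = Regex("(?<year>\\d{4})-(?<month>\\d{2})-(?<day>\\d{2})").find("2026-06-03")
val yearValue = dateMatch?.intOrNull("year")     // 2026
val monthValue = dateMatch?.intOrNull("month")   // 6

// Strict parsing is case-sensitive, so "TRUE" is rejected rather than coerced.
val upperCaseFlag = Regex("(?<flag>\\w+)").find("TRUE")?.strictBooleanOrNull("flag")  // null
val lowerCaseFlag = Regex("(?<flag>\\w+)").find("true")?.strictBooleanOrNull("flag")  // true
```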
val p = kexpresso { digit(); letter() }
p.source // "\\d[a-zA-Z]" — raw regex string
p.options // emptySet(), a Set<RegexOption>

Every pattern can explain itself in plain English. describe() walks the internal AST
(the same representation that renders the regex) and returns a deterministic, comma-joined
phrase — handy for code review, logging, or learning what a pattern does:
val p = kexpresso { startOfText(); oneOrMore { digit() }; endOfText() }
p.source // "\\A(?:\\d)+\\z"
p.describe() // "start of text, one or more of (a digit), end of text"

Domain helpers (e.g. email()) are emitted as raw fragments, so they describe as
raw regex `…` rather than a fully decomposed phrase.
examples() walks the internal AST and produces strings that satisfy matches():
val drinkCode = kexpresso {
uppercaseLetter()
oneOrMore { lowercaseLetter() }
}
drinkCode.examples(3) // e.g. ["Ac", "Bs", "Ct"] — each passes drinkCode.matches(it)
// Deterministic: same seed → same list
drinkCode.examples(5, seed = 42)
// Exact repetition
val pinPattern = kexpresso { exactly(4) { digit() } }
pinPattern.examples(3) // e.g. ["5279", "1836", "4021"]
// Alternation
val drinkMenu = kexpresso { oneOf({ literal("Espresso") }, { literal("Latte") }) }
drinkMenu.examples(5) // ["Espresso", "Latte"]

Honesty contract (when examples are guaranteed to match):
examples() guarantees that every returned string satisfies matches() when the
pattern's AST contains only supported nodes: Sequence, Literal, Token primitives
(digit, letter, whitespace, …), Quantifier, Group, and Alternation.
Best-effort cases (no match guarantee):
- Raw fragments: including domain helpers (email(), ipv4(), isoDate(), …) and Kexpresso.from(rawRegex), which all use raw nodes internally.
- Lookarounds (followedBy, precededBy, …): zero-width; skipped during generation.

In best-effort mode examples() still returns without throwing; the results simply may
not satisfy matches().
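When a pattern falls into the best-effort category, one simple safeguard is to re-check each generated candidate before using it. The sketch below shows the idea with a plain `Regex`; `verifiedExamples` is a hypothetical helper, and the candidate list is hard-coded here rather than produced by `examples()`.

```kotlin
// Hypothetical post-filter: keep only candidates the regex really accepts.
fun verifiedExamples(candidates: List<String>, regex: Regex): List<String> =
    candidates.filter { regex.matches(it) }

val candidates = listOf("Ab", "a1", "Latte")
val kept = verifiedExamples(candidates, Regex("[A-Z](?:[a-z])+"))  // ["Ab", "Latte"]
```

With kexpresso, the `Regex` argument could come from `toRegex()`, which is described as available on every target.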
val p = kexpresso { literal("Cappuccino") }
val kotlinRegex: Regex = p.toRegex() // all targets
val javaPattern: java.util.regex.Pattern = p.toPattern() // JVM only
toRegex() is available on every target. toPattern() is a JVM-only extension (java.util.regex.Pattern does not exist on Kotlin/JS).
Pass any number of RegexOption values to the kexpresso { } call or to
Kexpresso.pattern { }:
val caseInsensitive = kexpresso(RegexOption.IGNORE_CASE) {
literal("espresso")
}
caseInsensitive.matches("ESPRESSO") // true
caseInsensitive.matches("Espresso") // true
val multiline = kexpresso(RegexOption.MULTILINE) {
startOfLine()
literal("Espresso")
endOfLine()
}
multiline.containsMatchIn("Espresso\nCappuccino") // true

If you prefer an object-oriented style, use Kexpresso.pattern { }; it is identical
to the top-level kexpresso { } function:
val p = Kexpresso.pattern(RegexOption.IGNORE_CASE) { literal("Ristretto") }
p.matches("ristretto") // true

Certain regex patterns can cause catastrophic backtracking: an attacker who controls
input can make the regex engine take exponential time. The classic shape is nested
unbounded quantifiers such as (?:a+)+.
Kexpresso provides a best-effort static analyzer to catch this shape at development time:
// DSL produces (?:(?:[a-zA-Z])+)+ — nested unbounded quantifiers
val risky = kexpresso { oneOrMore { oneOrMore { letter() } } }
val report = risky.analyze()
if (report.isPotentiallyVulnerable) {
println("Findings:")
report.findings.forEach { println(" [${it.severity}] ${it.message}") }
}
// Findings:
// [WARNING] Nested unbounded quantifier at index 0: (?:(?:[a-zA-Z])+)+ …
// Convenience shorthand
if (risky.isPotentiallyVulnerable) { /* warn or reject */ }

This is a best-effort heuristic, not a guarantee. It detects the canonical "evil
regex" shape — a group with an outer unbounded quantifier (*, +, or {n,}) whose
body also contains an inner unbounded quantifier. It does NOT detect all ReDoS patterns
(e.g. alternation-based catastrophic backtracking), and a clean result does not prove
the pattern is safe. Use it as an early-warning signal alongside proper input constraints
and performance testing.
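To give a feel for what a check of this shape involves, here is a small, self-contained sketch that scans a regex string for a group with an unbounded quantifier whose body also contains one. It is a hypothetical illustration written against the description above, not kexpresso's analyzer (which works on its own internal representation), and it shares the same blind spots: it says nothing about alternation-based backtracking.

```kotlin
// Hypothetical heuristic for illustration; this is NOT kexpresso's analyzer.

/** True when an unbounded quantifier (*, + or {n,}) starts at [index]. */
fun isUnboundedQuantifierAt(regex: String, index: Int): Boolean {
    if (index >= regex.length) return false
    return when (regex[index]) {
        '*', '+' -> true
        '{' -> {
            val close = regex.indexOf('}', index)
            close > index && Regex("\\d+,").matches(regex.substring(index + 1, close))
        }
        else -> false
    }
}

/** Flags a group repeated without an upper bound whose body also repeats without one. */
fun hasNestedUnboundedQuantifier(regex: String): Boolean {
    // One entry per open group: has its body seen an unbounded quantifier yet?
    val openGroups = mutableListOf<Boolean>()
    var inCharClass = false
    var i = 0
    while (i < regex.length) {
        val c = regex[i]
        when {
            c == '\\' -> i++                                   // skip the escaped character
            inCharClass -> if (c == ']') inCharClass = false   // quantifier chars are literal here
            c == '[' -> inCharClass = true
            c == '(' -> openGroups.add(false)
            c == ')' -> {
                val bodyUnbounded =
                    if (openGroups.isEmpty()) false else openGroups.removeAt(openGroups.lastIndex)
                if (bodyUnbounded && isUnboundedQuantifierAt(regex, i + 1)) return true
                // An inner unbounded quantifier also counts for the enclosing group.
                if (bodyUnbounded && openGroups.isNotEmpty()) openGroups[openGroups.lastIndex] = true
            }
            isUnboundedQuantifierAt(regex, i) ->
                if (openGroups.isNotEmpty()) openGroups[openGroups.lastIndex] = true
        }
        i++
    }
    return false
}

val nestedIsFlagged = hasNestedUnboundedQuantifier("(?:(?:[a-zA-Z])+)+")  // true
val singleIsClean = hasNestedUnboundedQuantifier("(?:\\d)+")              // false
```

A string scan like this is easy to fool, which is one reason a clean result should be read as "nothing obvious found" rather than "safe".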
Inherited a cryptic regex? Kexpresso.from(...) reads it back: it compiles the regex and
lets you explain it (describe()) or rewrite it as kexpresso DSL (toKexpressoCode()).
val pattern = Kexpresso.from("\\d{4}-\\d{2}-\\d{2}")
pattern.describe()
// "exactly 4 of (a digit), the literal "-", exactly 2 of (a digit), the literal "-", exactly 2 of (a digit)"
println(pattern.toKexpressoCode())
// kexpresso {
// exactly(4) { digit() }
// literal("-")
// exactly(2) { digit() }
// literal("-")
// exactly(2) { digit() }
// }

toKexpressoCode() works on any KexpressoPattern, whether you built it with the DSL or
parsed it with from, so you can round-trip between the two representations.
Matching is always exact; parsing is best-effort. Kexpresso.from(r) compiles r
verbatim, so Kexpresso.from(r).matches(x) is always identical to Regex(r).matches(x).
The structural parse that powers describe() and toKexpressoCode() models the common
constructs (literals, predefined classes, anchors, quantifiers, groups, lookarounds,
alternation, back-references) and honestly degrades anything it doesn't model to raw("…")
(e.g. possessive quantifiers, atomic groups (?>…), inline-flag groups (?i)). The generated
code stays compilable and never changes match behaviour. An invalid regex throws
PatternSyntaxException, exactly as Regex(...) would.
# Compile, run all tests, Detekt static analysis, and the Kover coverage check
./gradlew build
# Run tests only
./gradlew test
# Run Detekt only
./gradlew detekt

See CONTRIBUTING.md for the full contributor guide (including how to add a new DSL primitive) and docs/ARCHITECTURE.md for a map of how the codebase fits together.
Found a vulnerability? Please report it privately — see SECURITY.md. The project also runs CodeQL static analysis and an OpenSSF Scorecard supply-chain check on every push, and uses Dependabot to keep dependencies and GitHub Actions current.
For catastrophic-backtracking (ReDoS) risk in your own patterns, kexpresso ships a best-effort analyzer — see Safety: ReDoS analysis.
This project follows the Contributor Covenant. By participating, you are expected to uphold it.
kexpresso is free and open source. If it saves you time, you can say thanks:
☕ https://buymeacoffee.com/elzinko — every coffee fuels another release.
MIT — Copyright (c) 2026 Thomas Couderc.