
Kotlin Multiplatform syntax highlighting for Compose.
Purpose-built lexical tokenizers, role-based theming, and drop-in Compose text helpers. 39 built-in languages, no JS runtime, no regex grammars, no platform code.
• Each token span carries a SyntaxRole value (keyword, keyword.control, variable.parameter, ...), plus the non-null LanguageId that produced it. Themes match roles with progressive parent fallback.
• SyntaxStyle is just color + weight + style. Host TextStyle owns font family, size, line height, base color, and backgrounds.
• Pass a language label such as "kotlin" or "kt"; the engine resolves built-in aliases and extension aliases. The engine is a pure function of (code, languageLabel). Easy to test, safe to share.
• Use rememberSyntaxAnnotatedString + BasicText for read-only views, or buildSyntaxStyledSpans + applySyntaxStyledSpans for BasicTextField editors. Engine and theme scoping is the host's choice.

39 built-in languages, all driven by shared scanners, scanner options, and explicit vocabulary inputs. Pass a language label such as a built-in id or any true alias for the same public language identity (js, ts, env, bash, zsh, kts, pgsql, sqlite3, ...).
• Bash • C • C# • C++ • CSS • CSV • Dart • Diff / patch • Dockerfile • Dotenv
• Go • GraphQL • HTML • INI • Java • JavaScript • JSON • JSX • Kotlin • Makefile
• Markdown • PHP • PostgreSQL • PowerShell • Properties • Protobuf • Python • Ruby • Rust • Shell
• SQL • SQLite • Swift • TOML • TSX • TypeScript • XML • YAML • Zsh
Built-in language constants and the built-in set live on LanguageId.
Embedded languages are wired up internally for HTML script/style blocks, Markdown fenced code, and the script/style regions of JSX and TSX. See docs/embedded-languages.md for the full routing table and what's deliberately out of scope.
SyntaxMP targets JVM, Android, iOS arm64, iOS simulator arm64, and web through Kotlin/Wasm.
Add the coordinate to your Gradle version catalog:

    [versions]
    syntaxmpVersion = "0.2.0"

    [libraries]
    syntaxmp = { module = "com.gallatinapps.syntaxmp:syntaxmp", version.ref = "syntaxmpVersion" }

Then depend on it from commonMain:

    kotlin {
        sourceSets {
            commonMain.dependencies {
                implementation(libs.syntaxmp)
            }
        }
    }

The syntaxmp coordinate is the Compose highlighter and pulls in the pure tokenizer layer transitively. Token-only consumers that build their own renderer can depend on com.gallatinapps.syntaxmp:syntaxmp-tokenizer directly; see docs/architecture.md.
SyntaxMP has two Compose paths, depending on whether the text is displayed or editable.
Display highlighted text. rememberSyntaxAnnotatedString builds an AnnotatedString you can drop into BasicText. The Composable handles its own remember chain, so theme changes restyle without retokenizing:

    @Composable
    fun CodeSnippet(
        code: String,
        languageLabel: String?,
        engine: SyntaxTokenizer,
        theme: SyntaxTheme,
    ) {
        BasicText(
            text = rememberSyntaxAnnotatedString(
                code = code,
                languageLabel = languageLabel,
                engine = engine,
                theme = theme,
            ),
            style = TextStyle(fontFamily = FontFamily.Monospace, fontSize = 14.sp),
        )
    }

Editable text. buildSyntaxStyledSpans plus the TextFieldBuffer.applySyntaxStyledSpans extension drops into a BasicTextField outputTransformation:

    @Composable
    fun CodeField(
        state: TextFieldState,
        languageLabel: String?,
        engine: SyntaxTokenizer,
        theme: SyntaxTheme,
    ) {
        BasicTextField(
            state = state,
            outputTransformation = {
                val code = asCharSequence().toString()
                val tokens = engine.tokenize(code = code, languageLabel = languageLabel)
                val spans = buildSyntaxStyledSpans(code = code, spans = tokens, theme = theme)
                applySyntaxStyledSpans(spans)
            },
            textStyle = TextStyle(fontFamily = FontFamily.Monospace),
        )
    }

Construct the engine once for the scope that owns your syntax configuration, and pass it through your app's existing wiring (a host-defined staticCompositionLocalOf, DI, or a one-surface remember). SyntaxMP deliberately doesn't ship that wiring.
See docs/building-an-editor.md for engine sharing, line splitting, caching, and large-document guidance.
SyntaxTheme.DefaultLight and SyntaxTheme.DefaultDark are starter themes, but most apps should define a theme that fits their own editor surface. SyntaxMP themes only syntax roles: color, optional weight, and optional style. Font family, size, line height, and backgrounds stay in your app's TextStyle and layout.

    val roleStyles = SyntaxRoleStyles(
        SyntaxRole.Keyword to SyntaxStyle(color = Color(0xFF3B73D9)),
        SyntaxRole.Operator to SyntaxStyle(color = Color(0xFF4B5563)),
        SyntaxRole.Punctuation to SyntaxStyle(color = Color(0xFF6B7280)),
        SyntaxRole.Function to SyntaxStyle(color = Color(0xFF6F42C1)),
        SyntaxRole.Type to SyntaxStyle(color = Color(0xFFB45309)),
        SyntaxRole.Property to SyntaxStyle(color = Color(0xFF0F766E)),
        SyntaxRole.String to SyntaxStyle(color = Color(0xFF2E7D5B)),
        SyntaxRole.Number to SyntaxStyle(color = Color(0xFFAD3DA4)),
        SyntaxRole.Tag to SyntaxStyle(color = Color(0xFF22863A)),
        SyntaxRole.Attribute to SyntaxStyle(color = Color(0xFF6F42C1)),
        SyntaxRole.Comment to SyntaxStyle(
            color = Color(0xFF7A7F87),
            fontStyle = FontStyle.Italic,
        ),
    )
    val theme = SyntaxTheme(roleStyles = roleStyles)

    BasicText(
        text = rememberSyntaxAnnotatedString(
            code = code,
            languageLabel = "kotlin",
            engine = engine,
            theme = theme,
        ),
        style = TextStyle(fontFamily = FontFamily.Monospace),
    )

See docs/theming.md for the full role tree, resolution policy, all four copy/override helpers, and worked per-language overrides.
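As a rough illustration of what "progressive parent fallback" means for dotted role names, here is a minimal sketch in plain Kotlin. It is a hypothetical model, not SyntaxMP's implementation: roles are plain strings, styles are any value type, and the function name resolveWithParentFallback is made up for this example. docs/theming.md remains the reference for the actual resolution policy.

```kotlin
// Hypothetical model of dotted-role fallback; NOT SyntaxMP's actual code.
// Roles are plain strings such as "keyword.control"; S is any style type.
fun <S : Any> resolveWithParentFallback(role: String, styles: Map<String, S>): S? {
    var candidate = role
    while (true) {
        // An exact match on the current candidate wins.
        styles[candidate]?.let { return it }
        // Otherwise drop the last dotted segment and retry with the parent role.
        val lastDot = candidate.lastIndexOf('.')
        if (lastDot < 0) return null // no parent left: the role stays unstyled
        candidate = candidate.substring(0, lastDot)
    }
}
```

Under this model, a theme that only styles keyword would still color a keyword.control token, while a role with no styled ancestor falls through to the host's base text color.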
By default the engine enables all 39 built-ins. Shrink the surface (smaller construction cost, fewer code paths reachable) by passing a Set<LanguageId>:

    val enabledLanguages = setOf(
        LanguageId.Kotlin,
        LanguageId.Json,
        LanguageId.Markdown,
        LanguageId.Shell,
    )
    val engine = SyntaxTokenizer(builtInLanguages = enabledLanguages)

Labels resolving to a disabled language return emptyList(). The engine never throws "unknown language."
Implement LanguageTokenizer, wrap it in a LanguageExtension, register the extension on the engine, and your tokenizer runs alongside the built-ins:

    val myql = LanguageId.fromString("myql")
    val engine = SyntaxTokenizer(
        extensions = listOf(
            LanguageExtension(
                languageId = myql,
                aliases = setOf("mql"),
                tokenizer = myqlTokenizer,
            ),
        ),
    )

Extensions resolve before built-ins, so you can override a built-in too. See docs/language-extension.md for full working examples.
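The resolution order described above (extensions first, then enabled built-ins, otherwise nothing) can be pictured with a small stand-alone model. This is a sketch under stated assumptions: LabelResolver and its three maps are hypothetical names invented for illustration, language ids are plain strings, and none of it is SyntaxMP's API.

```kotlin
// Hypothetical model of label resolution; names and types are illustrative only.
class LabelResolver(
    private val extensionAliases: Map<String, String>, // label -> extension language id
    private val builtInAliases: Map<String, String>,   // label -> built-in language id
    private val enabledBuiltIns: Set<String>,          // built-in ids the engine enables
) {
    /** Returns the resolved language id, or null when nothing should be tokenized. */
    fun resolve(label: String?): String? {
        if (label.isNullOrBlank()) return null
        // Extensions are consulted first, so they can shadow a built-in label.
        extensionAliases[label]?.let { return it }
        // Built-ins resolve only when they are part of the enabled set.
        return builtInAliases[label]?.takeIf { it in enabledBuiltIns }
    }
}
```

A null result here corresponds to the engine returning emptyList(): an unknown label, a disabled built-in, and a blank label all end in the same quiet outcome rather than an exception.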
• … SyntaxRole is, the root and refinement constants, and how custom roles work.
• … LanguageId constants, and embedded-language routing.
• … LanguageExtension, with a worked tokenizer and a testing recipe.
• … BasicTextField. Engine sharing, line splitting, caching, and large-document guidance.

How does it know what language my code is?
It doesn't. You tell it. Pass a language label such as "kotlin" or "kt", or get back no spans. Auto-detection is a separate problem with different correctness and performance tradeoffs, and is out of scope here.
What if the language isn't recognized?
The engine returns emptyList(). rememberSyntaxAnnotatedString falls back to a plain unstyled AnnotatedString when the label is null or blank, without ever invoking the engine. The engine itself never throws for unknown or unregistered languages.
My language isn't supported. What should I do?
You can add project-local support with LanguageExtension, including aliases and overrides for built-ins. If you want a language added as a built-in, search existing issues first. If there is no issue, open one with the language name, why it belongs in the built-in set, common labels or extensions, and a few representative snippets that should highlight well. PRs are welcome, especially when they start from a working extension, but built-in additions are not guaranteed: the core set stays focused on broadly useful languages so the library remains maintainable.
Can I tokenize large files?
Yes, within limits, and the limit is usually Compose text rendering rather than SyntaxMP. Tokenization itself is fast: a single-pass, full-document lexical scan with no regex or grammar runtime. For read-only display (rememberSyntaxAnnotatedString + BasicText) that scales to large files. Editing is the real constraint: BasicTextField doesn't virtualize text layout, so every keystroke re-measures the whole document. That base layout cost is unavoidable, but the styling cost is not: applying styled spans only to the visible range and tokenizing off the main thread keep large editable documents usable. See docs/building-an-editor.md for windowing, off-thread tokenization, caching, and large-document guidance.
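One way to picture "applying styled spans only to the visible range" is a clipping step between tokenization and styling. The sketch below is plain Kotlin with a hypothetical StyledRange type and clipToVisible helper, both invented for this example; SyntaxMP's own span classes and the windowing approach in docs/building-an-editor.md may differ.

```kotlin
// Hypothetical span type for illustration; SyntaxMP's own span classes may differ.
data class StyledRange(val start: Int, val endExclusive: Int, val role: String)

/**
 * Keeps only the parts of [spans] that overlap the half-open range
 * [visibleStart, visibleEndExclusive), trimming spans that straddle an edge.
 */
fun clipToVisible(
    spans: List<StyledRange>,
    visibleStart: Int,
    visibleEndExclusive: Int,
): List<StyledRange> = spans.mapNotNull { span ->
    val start = maxOf(span.start, visibleStart)
    val end = minOf(span.endExclusive, visibleEndExclusive)
    // Spans entirely outside the window collapse to an empty range and are dropped.
    if (start < end) span.copy(start = start, endExclusive = end) else null
}
```

The visible character range would come from the host's scroll state and text layout. The point of the sketch is only that styling work then scales with what is on screen rather than with document length.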
Token color looks wrong. Is that a bug?
Maybe. Scanners are heuristic. Before filing it, check (a) what span.role.value and span.languageId.value the wrong span actually got, and (b) whether the issue is the scanner choosing the wrong role, or the theme styling that role in an unexpected way. The fixes differ.
Why is SyntaxStyle so restrictive?
The narrow shape is deliberate. Color + weight + style covers the visual decisions that should live with the syntax theme; font family, size, line height, and surfaces belong with your app's design system. If you need full SpanStyle control for a token, resolve styles yourself from raw SyntaxTokenSpans and skip the theme system.
See CHANGELOG.md.
Copyright 2026 Gallatin Applications LLC.
SyntaxMP is licensed under the Apache License, Version 2.0. See LICENSE.
The demo app bundles JetBrains Mono font files under the SIL Open Font License, Version 1.1. See syntaxmp-demo/THIRD_PARTY_NOTICES.md.