
Expose Compose UIs over a REST-like HTTP API so that AI agents can inspect them (semantic tree, screenshots), interact with them (clicks, gestures, text), and record GIFs, all headlessly and without changes to production code.
Make AI Agents see and control your Compose UI.
Compose Driver enables AI agents and automated tools to interact with any Jetpack Compose UI
(Android and Desktop) through a simple HTTP API. It wraps your Composable in a test harness, exposing
a server that translates HTTP requests into `ComposeUiTest` actions.
Try it out: Clone this repository and open the sample/ directory with your favorite AI coding tool.
Add the plugin to your `settings.gradle.kts`.

    plugins {
        id("io.github.jdemeulenaere.compose.driver") version "0.5.0"
    }

    composeDriver {
        // Enable the platforms you need
        android()
        desktop()
    }

The plugin will automatically create two new subprojects (`:compose-driver-android` and
`:compose-driver-desktop`) in your build, configured to run your Composables within the driver
environment. `:compose-driver-android` will depend on all your Android and Multiplatform subprojects
that use Compose, and `:compose-driver-desktop` will depend on all JVM and Multiplatform subprojects
that use Compose.
You can customize the generated driver projects using the `composeDriver` block.

    composeDriver {
        android {
            name = "compose-driver-android"
            robolectric {
                sdk = 36
                qualifiers = "w410dp-h920dp-xhdpi" // see https://robolectric.org/device-configuration/
            }
            // Manually add dependencies (e.g. Compose BOM)
            dependencies {
                add("implementation", platform("androidx.compose:compose-bom:2025.01.00"))
            }
            // Resolve dependency ambiguity for flavored projects (e.g. "nowinandroid")
            missingDimensionStrategy("contentType", "demo")
        }
        desktop {
            name = "compose-driver-desktop"
            width = 1024
            height = 768
            density = 1.0f
        }
    }

To start the driver, use the generated `run` task. You must specify the Composable you want to drive
using the `compose.driver.composable` system property.

Desktop

    ./gradlew :compose-driver-desktop:run -Dcompose.driver.composable=com.example.app.MainKt.MainScreen

Android

    ./gradlew :compose-driver-android:run -Dcompose.driver.composable=com.example.app.MainKt.MainScreen

> [!NOTE]
> Android runs via Robolectric and starting the server might take a few seconds. For this reason, I recommend using the Desktop driver when working on multiplatform code.

Once running, the server listens at http://localhost:8080.
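
Since startup can take a few seconds, a script may want to wait for the server before sending commands. The following is a minimal sketch of such a wait loop, not part of the project itself. It polls the `/status` endpoint listed in the endpoint tables; the function names, the timeout values, and the injectable `fetch`/`sleep` parameters are illustrative assumptions, as is the exact shape of the `"ok"` body.

```python
import time
import urllib.request

BASE_URL = "http://localhost:8080"  # address given in this README


def http_get_text(url):
    """Fetch a URL and return the response body as text."""
    with urllib.request.urlopen(url, timeout=5) as response:
        return response.read().decode("utf-8")


def wait_until_ready(base_url=BASE_URL, timeout_s=60.0, interval_s=0.5,
                     fetch=http_get_text, sleep=time.sleep):
    """Poll /status until it answers "ok"; return False once timeout_s has elapsed."""
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            # Assumption: the body is the bare text ok, possibly wrapped in quotes.
            if fetch(f"{base_url}/status").strip().strip('"') == "ok":
                return True
        except OSError:
            pass  # connection refused or timed out: the server is not up yet
        if time.monotonic() >= deadline:
            return False
        sleep(interval_s)
```

Calling `wait_until_ready()` right after launching the Gradle `run` task in the background gives the agent a simple gate before its first real request.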
Example: Automated Login Flow

    # 1. Wait for login screen
    curl "http://localhost:8080/waitForNode?nodeTag=login_screen"
    # 2. Input credentials
    curl "http://localhost:8080/textInput?nodeTag=username&text=admin"
    curl "http://localhost:8080/textInput?nodeTag=password&text=secret"
    # 3. Click on the "Login" button and record the transition (requires ffmpeg)
    curl "http://localhost:8080/click?nodeText=login&nodeTextIgnoreCase=true&nodeTextSubstring=true&gifDurationMs=2000" > login_flow.gif

> [!IMPORTANT]
> Most endpoints accept node selection parameters:
>
> - `nodeTag`: Targets a specific `Modifier.testTag()`.
> - `nodeText`: Matches a node by its text content.
> - `nodeTextSubstring`: If `true`, `nodeText` matches as a substring (default: `false`, exact match).
> - `nodeTextIgnoreCase`: If `true`, `nodeText` matching is case-insensitive (default: `false`).
>
> If both `nodeTag` and `nodeText` are provided, the node must match both. If neither is provided, the action applies to the root node.
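
Text selectors often contain spaces or characters such as `&`, which need percent-encoding in a query string. Here is a small sketch of a URL builder for the selection parameters above. The helper name `build_url` and its keyword arguments are hypothetical, and it assumes the server accepts lowercase `true` for the boolean flags, as in the curl example.

```python
from urllib.parse import quote, urlencode

BASE_URL = "http://localhost:8080"  # address given in this README


def build_url(endpoint, *, node_tag=None, node_text=None,
              substring=False, ignore_case=False, base_url=BASE_URL, **extra):
    """Build a driver URL; selectors that are left out are simply not sent."""
    params = {}
    if node_tag is not None:
        params["nodeTag"] = node_tag
    if node_text is not None:
        params["nodeText"] = node_text
        # Send the matching flags only when they differ from the documented defaults.
        if substring:
            params["nodeTextSubstring"] = "true"
        if ignore_case:
            params["nodeTextIgnoreCase"] = "true"
    params.update(extra)  # endpoint-specific parameters, e.g. text="admin"
    query = urlencode(params, quote_via=quote)  # quote() encodes spaces as %20
    return f"{base_url}/{endpoint}" + (f"?{query}" if query else "")
```

For instance, `build_url("click", node_text="Log in", ignore_case=True)` yields a URL whose `nodeText` value is encoded as `Log%20in`, while `build_url("printTree")` has no query string and therefore targets the root node.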

| Method | Endpoint | Params | Description |
|---|---|---|---|
| GET | `/status` | | Check if server is ready. Returns "ok". |
| GET | `/printTree` | | Returns the semantic node tree as text. |
| GET | `/screenshot` | | Returns a PNG screenshot of the target node or root. |
| GET | `/waitForIdle` | | Waits for the UI to be idle (no pending changes). |
| GET | `/waitForNode` | `timeout` | Waits for a matching node to exist. Default timeout: 5s. |

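
An agent typically combines these endpoints: wait for a node, let the UI settle, then capture both the semantic tree and a screenshot. The sketch below shows one way to do that. The function name `snapshot`, the output file names, and the injectable `fetch` parameter are assumptions for illustration; how the server reports a `/waitForNode` timeout is not specified here, so the sketch simply relies on `urlopen` raising for non-success responses.

```python
import urllib.request
from pathlib import Path
from urllib.parse import urlencode

BASE_URL = "http://localhost:8080"  # address given in this README
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # first eight bytes of every PNG file


def http_get(url):
    """Fetch a URL and return the raw response body."""
    with urllib.request.urlopen(url, timeout=30) as response:
        return response.read()


def snapshot(node_tag, out_dir, fetch=http_get, base_url=BASE_URL):
    """Wait for node_tag, then save the semantic tree and a screenshot of that node."""
    query = urlencode({"nodeTag": node_tag})
    fetch(f"{base_url}/waitForNode?{query}")  # returns once the node exists
    fetch(f"{base_url}/waitForIdle")          # let pending UI changes settle
    tree = fetch(f"{base_url}/printTree").decode("utf-8")
    png = fetch(f"{base_url}/screenshot?{query}")
    if not png.startswith(PNG_SIGNATURE):
        raise ValueError("screenshot response is not a PNG")
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    (out / "tree.txt").write_text(tree, encoding="utf-8")
    (out / "screenshot.png").write_bytes(png)
    return tree
```

Checking the PNG signature catches the case where the server answered with an error message instead of image data before it is written to disk.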

| Method | Endpoint | Params | Description |
|---|---|---|---|
| GET | `/click` | | Click a node. |
| GET | `/doubleClick` | | Double-click a node. |
| GET | `/longClick` | | Long-press a node. |
| GET | `/textInput` | `text` (req) | Enter text into a field. |
| GET | `/textReplacement` | `text` (req) | Replace existing text in a field. |
| GET | `/textClearance` | | Clear text from a field. |
| GET | `/navigateBack` | | Trigger the system "Back" action. |
| GET | `/scrollTo` | | Scrolls to the node. |

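
These actions are usually chained. As an illustration, the sketch below scrolls to a field, overwrites its contents with `/textReplacement`, then scrolls to and clicks a button, stopping at the first step that does not answer `"ok"`. The function name, the tags, and the assumption that every one of these endpoints returns the bare text `ok` on success are mine, not guarantees from the project.

```python
import urllib.request
from urllib.parse import quote, urlencode

BASE_URL = "http://localhost:8080"  # address given in this README


def http_get_text(url):
    """Fetch a URL and return the response body as text."""
    with urllib.request.urlopen(url, timeout=30) as response:
        return response.read().decode("utf-8")


def set_field_and_submit(field_tag, value, submit_tag,
                         fetch=http_get_text, base_url=BASE_URL):
    """Scroll to a field, replace its text, then scroll to and click a button."""
    steps = [
        ("scrollTo", {"nodeTag": field_tag}),
        ("textReplacement", {"nodeTag": field_tag, "text": value}),
        ("scrollTo", {"nodeTag": submit_tag}),
        ("click", {"nodeTag": submit_tag}),
    ]
    for endpoint, params in steps:
        url = f"{base_url}/{endpoint}?" + urlencode(params, quote_via=quote)
        body = fetch(url).strip()
        if body != "ok":  # assumption: success is reported as the text ok
            raise RuntimeError(f"{endpoint} returned {body!r}")
```

Using `/textReplacement` rather than `/textInput` keeps the helper idempotent: running it twice leaves the field with the same value instead of appending to it.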

| Method | Endpoint | Params | Description |
|---|---|---|---|
| GET | `/swipe` | `direction` (req) | Swipe UP, DOWN, LEFT, or RIGHT. |
| GET | `/pointerInput/down` | `x`, `y` (req), `pointerId` | Send pointer down event at (x,y). |
| GET | `/pointerInput/moveBy` | `x`, `y` (req), `pointerId` | Move pointer by delta (x,y). |
| GET | `/pointerInput/moveTo` | `x`, `y` (req), `pointerId` | Move pointer to absolute (x,y). |
| GET | `/pointerInput/up` | `pointerId` | Send pointer up event. |

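
When `/swipe` is too coarse, a custom drag can be composed from the pointer endpoints: one `down`, several `moveBy` calls, and one `up`. The sketch below only builds the request URLs, so the sequence can be inspected before it is sent. The function name `drag_urls` and the choice of integer coordinates are assumptions; the coordinate unit and origin are not specified in this section.

```python
from urllib.parse import urlencode

BASE_URL = "http://localhost:8080"  # address given in this README


def drag_urls(start_x, start_y, dx, dy, steps=4, pointer_id=None, base_url=BASE_URL):
    """Return the URLs for a drag: one down, `steps` relative moves, one up."""
    if steps < 1:
        raise ValueError("steps must be at least 1")
    extra = {} if pointer_id is None else {"pointerId": pointer_id}

    def url(action, **params):
        query = urlencode({**params, **extra})
        return f"{base_url}/pointerInput/{action}" + (f"?{query}" if query else "")

    urls = [url("down", x=start_x, y=start_y)]
    for i in range(steps):
        # Integer split of the total delta; the per-step moves always sum to (dx, dy).
        step_x = dx * (i + 1) // steps - dx * i // steps
        step_y = dy * (i + 1) // steps - dy * i // steps
        urls.append(url("moveBy", x=step_x, y=step_y))
    urls.append(url("up"))
    return urls
```

Sending the returned URLs in order with any HTTP client performs the gesture; passing `pointer_id` tags every event with the same pointer, which matters once more than one pointer is down.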

| Method | Endpoint | Params | Description |
|---|---|---|---|
| GET | `/reset` | `composable` | Reset the UI state. Optionally switch to a different Composable class. |

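
One use for `/reset` is to start every scripted scenario from a clean UI. The sketch below resets before each scenario and records whether it passed. The runner, its callback signature, and the assumption that `composable` takes the same fully qualified name as the `compose.driver.composable` system property are illustrative only.

```python
import urllib.request
from urllib.parse import urlencode

BASE_URL = "http://localhost:8080"  # address given in this README


def http_get_text(url):
    """Fetch a URL and return the response body as text."""
    with urllib.request.urlopen(url, timeout=30) as response:
        return response.read().decode("utf-8")


def run_scenarios(scenarios, composable=None, fetch=http_get_text, base_url=BASE_URL):
    """Run each scenario after a /reset so that state does not leak between them."""
    reset_url = f"{base_url}/reset"
    if composable is not None:
        reset_url += "?" + urlencode({"composable": composable})
    results = {}
    for name, scenario in scenarios.items():
        fetch(reset_url)
        try:
            scenario(fetch, base_url)
            results[name] = "passed"
        except Exception as error:  # record the failure, continue with the next one
            results[name] = f"failed: {error}"
    return results
```

Each scenario is a plain function taking `(fetch, base_url)`, so the same scenarios can run against a real server or against a stub in tests.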
All endpoints (like `/click`, `/swipe`, etc.) also accept an optional `gifDurationMs`
parameter. If provided, the server will record a GIF of the interaction for the specified
duration (max 5,000 ms) and return it instead of the standard "ok" response.

> [!NOTE]
> This feature requires `ffmpeg` to be installed on the host machine and available in the system PATH.
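
Because the response type changes from `"ok"` to GIF data, a client may want to guard both sides of the call. The sketch below adds `gifDurationMs` to an existing request URL, capped at the documented maximum, and checks that a response body really starts with a GIF header. The helper names are hypothetical, and how the server reacts to a value above 5,000 ms is not stated here, which is why the cap is applied on the client. `ffmpeg_available` is only meaningful when the script runs on the same machine as the driver.

```python
import shutil
from urllib.parse import parse_qsl, quote, urlencode, urlsplit, urlunsplit

MAX_GIF_MS = 5000  # upper bound stated in this README


def with_gif(url, duration_ms):
    """Return url with gifDurationMs set, capped at MAX_GIF_MS."""
    if duration_ms <= 0:
        raise ValueError("duration_ms must be positive")
    parts = urlsplit(url)
    # Drop any existing gifDurationMs so the parameter appears exactly once.
    params = [(key, value)
              for key, value in parse_qsl(parts.query, keep_blank_values=True)
              if key != "gifDurationMs"]
    params.append(("gifDurationMs", str(min(int(duration_ms), MAX_GIF_MS))))
    return urlunsplit(parts._replace(query=urlencode(params, quote_via=quote)))


def looks_like_gif(data):
    """True if data starts with one of the two standard GIF headers."""
    return data[:6] in (b"GIF87a", b"GIF89a")


def ffmpeg_available():
    """True if an ffmpeg executable is found on this machine's PATH."""
    return shutil.which("ffmpeg") is not None
```

A caller can fetch `with_gif(url, 2000)`, test the body with `looks_like_gif`, and treat anything else as an error message rather than writing it to a `.gif` file.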