
On-device and remote LLM inference via native llama.cpp bindings, with embeddings, context-aware text generation (streaming and non-streaming), a lightweight HTTP client/server, and GGUF model support.
Run LLMs locally on Android, iOS, and Desktop — using a single Kotlin API.
Offline-first · Privacy-preserving · Kotlin Multiplatform
Llamatik is a Kotlin Multiplatform library that lets you run:
llama.cpp
whisper.cpp
...fully on-device, with optional remote inference — all behind a unified Kotlin API.
No Python.
No mandatory servers.
Your models, your data, your device.
Designed for privacy-first, offline-capable, and cross-platform AI applications.
Want to see Llamatik in action before integrating it? The Llamatik App showcases the library end to end.
The architecture looks like this:
Your App
│
▼
LlamaBridge (shared Kotlin API)
│
├─ llamatik-core → Native llama.cpp (on-device)
├─ llamatik-client → Remote HTTP inference
└─ llamatik-backend → llama.cpp-compatible server
Switching between local and remote inference requires no API changes — only configuration.
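As a sketch of what configuration-driven switching looks like in practice: the `Inference` interface, `InferenceConfig` class, and `buildInference` function below are invented for this illustration and are not Llamatik's actual configuration API — the point is only that the call site never changes.

```kotlin
// Hypothetical sketch of the local-vs-remote pattern. Only the backend
// *selection* is configuration; calling code stays identical.
interface Inference {
    fun generate(prompt: String): String
}

class LocalInference : Inference {
    // Stand-in for llamatik-core (native llama.cpp, on-device)
    override fun generate(prompt: String) = "local: $prompt"
}

class RemoteInference(private val baseUrl: String) : Inference {
    // Stand-in for llamatik-client (remote HTTP inference)
    override fun generate(prompt: String) = "remote($baseUrl): $prompt"
}

data class InferenceConfig(val remoteUrl: String? = null)

fun buildInference(config: InferenceConfig): Inference =
    if (config.remoteUrl == null) LocalInference()
    else RemoteInference(config.remoteUrl)

fun main() {
    // Same call site, different backend — chosen purely by configuration.
    val local = buildInference(InferenceConfig())
    val remote = buildInference(InferenceConfig(remoteUrl = "http://10.0.2.2:8080"))
    println(local.generate("hi"))
    println(remote.generate("hi"))
}
```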
Llamatik is published on Maven Central and follows semantic versioning.
dependencyResolutionManagement {
repositories {
google()
mavenCentral()
}
}
commonMain.dependencies {
implementation("com.llamatik:library:0.16.0")
}

// Resolve model path (place the GGUF file in assets / bundle)
val modelPath = LlamaBridge.getModelPath("phi-2.Q4_0.gguf")
// Load model
LlamaBridge.initGenerateModel(modelPath)
// Generate text
val output = LlamaBridge.generate(
"Explain Kotlin Multiplatform in one sentence."
)

The public Kotlin API is defined in LlamaBridge (an expect object with platform-specific actual implementations).
@Suppress("EXPECT_ACTUAL_CLASSIFIERS_ARE_IN_BETA_WARNING")
expect object LlamaBridge {
// Utilities
@Composable
fun getModelPath(modelFileName: String): String // copy asset/bundle model to app files dir and return absolute path
fun shutdown() // free native resources
// Embeddings
fun initModel(modelPath: String): Boolean // load embeddings model
fun embed(input: String): FloatArray // return embedding vector
// Text generation (non-streaming)
fun initGenerateModel(modelPath: String): Boolean // load generation model
fun generate(prompt: String): String
fun generateWithContext(
systemPrompt: String,
contextBlock: String,
userPrompt: String
): String
// Text generation (streaming)
fun generateStream(prompt: String, callback: GenStream)
fun generateStreamWithContext(
systemPrompt: String,
contextBlock: String,
userPrompt: String,
callback: GenStream
)
// Text generation with JSON schema (non-streaming)
fun generateJson(prompt: String, jsonSchema: String? = null): String
fun generateJsonWithContext(
systemPrompt: String,
contextBlock: String,
userPrompt: String,
jsonSchema: String? = null
): String
// Convenience streaming overload (lambdas)
fun generateStreamWithContext(
system: String,
context: String,
user: String,
onDelta: (String) -> Unit,
onDone: () -> Unit,
onError: (String) -> Unit
)
// Text generation with JSON schema (streaming)
fun generateJsonStream(prompt: String, jsonSchema: String? = null, callback: GenStream)
fun generateJsonStreamWithContext(
systemPrompt: String,
contextBlock: String,
userPrompt: String,
jsonSchema: String? = null,
callback: GenStream
)
fun nativeCancelGenerate() // cancel generation
}
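As a sketch of how the streaming callbacks are typically consumed: `emitTokens` below is a fake stand-in for the native generator, invented purely for illustration, and the `GenStream` interface is repeated here so the example is self-contained (it mirrors the one defined below).

```kotlin
// GenStream mirrors the callback interface Llamatik defines.
interface GenStream {
    fun onDelta(text: String)
    fun onComplete()
    fun onError(message: String)
}

// Fake producer, standing in for the native generator: emits tokens
// one by one, then signals completion (or an error).
fun emitTokens(tokens: List<String>, callback: GenStream) {
    try {
        tokens.forEach { callback.onDelta(it) }
        callback.onComplete()
    } catch (e: Exception) {
        callback.onError(e.message ?: "unknown error")
    }
}

fun main() {
    val out = StringBuilder()
    emitTokens(listOf("Kotlin ", "Multiplatform ", "rocks."), object : GenStream {
        override fun onDelta(text: String) { out.append(text) }      // accumulate partial text
        override fun onComplete() { println("done: $out") }          // full response available
        override fun onError(message: String) { println("error: $message") }
    })
}
```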
interface GenStream {
fun onDelta(text: String)
fun onComplete()
fun onError(message: String)
}

WhisperBridge exposes a small, platform-friendly wrapper around whisper.cpp for on-device speech-to-text.
The workflow is: resolve the model path, load the model once, transcribe WAV files as needed, then release native resources.
object WhisperBridge {
/** Returns a platform-specific absolute path for the model filename. */
fun getModelPath(modelFileName: String): String
/** Loads the model at [modelPath]. Returns true if loaded. */
fun initModel(modelPath: String): Boolean
/**
* Transcribes a WAV file and returns text.
* Tip: record WAV as 16 kHz, mono, 16-bit PCM for best compatibility.
*/
fun transcribeWav(wavPath: String, language: String? = null): String
/** Frees native resources. */
fun release()
}

import com.llamatik.library.platform.WhisperBridge

val modelPath = WhisperBridge.getModelPath("ggml-tiny-q8_0.bin")
// 1) Init once (e.g. app start)
WhisperBridge.initModel(modelPath)
// 2) Record to a WAV file (16kHz mono PCM16) using your own recorder
val wavPath: String = "/path/to/recording.wav"
// 3) Transcribe
val text = WhisperBridge.transcribeWav(wavPath, language = null).trim()
println(text)
// 4) Optional: release on app shutdown
WhisperBridge.release()

Note: WhisperBridge expects a WAV file path. Llamatik’s app uses AudioRecorder + AudioPaths.tempWavPath() to generate the WAV before calling transcribeWav(...).
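If you are producing the WAV yourself, the 16 kHz / mono / 16-bit PCM requirement boils down to writing a standard RIFF/WAVE header. The sketch below is generic WAV plumbing for illustration — it is not part of Llamatik's API:

```kotlin
import java.nio.ByteBuffer
import java.nio.ByteOrder

// Builds a minimal RIFF/WAVE file (header + PCM payload) for
// 16 kHz, mono, 16-bit PCM audio — the format whisper.cpp expects.
fun pcm16ToWav(samples: ShortArray, sampleRate: Int = 16_000): ByteArray {
    val dataSize = samples.size * 2                         // 16-bit = 2 bytes/sample
    val byteRate = sampleRate * 2                           // mono * 2 bytes/sample
    val buf = ByteBuffer.allocate(44 + dataSize).order(ByteOrder.LITTLE_ENDIAN)
    buf.put("RIFF".toByteArray())
    buf.putInt(36 + dataSize)                               // RIFF chunk size
    buf.put("WAVE".toByteArray())
    buf.put("fmt ".toByteArray())
    buf.putInt(16)                                          // fmt chunk size
    buf.putShort(1.toShort())                               // audio format: PCM
    buf.putShort(1.toShort())                               // channels: mono
    buf.putInt(sampleRate)
    buf.putInt(byteRate)
    buf.putShort(2.toShort())                               // block align
    buf.putShort(16.toShort())                              // bits per sample
    buf.put("data".toByteArray())
    buf.putInt(dataSize)
    samples.forEach { buf.putShort(it) }
    return buf.array()
}

fun main() {
    val wav = pcm16ToWav(ShortArray(16_000))                // 1 second of silence
    println(wav.size)                                       // 44-byte header + 32000 PCM bytes
}
```

Write the resulting bytes to a file and pass that path to transcribeWav(...).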
See the Backend README.md for more information.
Llamatik is already used in production apps on Google Play and App Store.
Want to showcase your app here? Open a PR and add it to the list 🚀
Llamatik is 100% open-source and actively developed.
All contributions are welcome!
This project is licensed under the MIT License.
See LICENSE for details.
Built with ❤️ for the Kotlin community.