Low-Latency AI Engine For Mobiles & Wearables
by cactus-compute
4.4k
Fast, lightweight inference framework for energy-efficient on-device AI: a numerical computation-graph API, an OpenAI-compatible inference engine, INT8 optimizations, and models and tooling for compact, low-power deployments.
#llm, #framework, #ai
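Since the engine advertises OpenAI compatibility, a client should be able to send a standard chat-completions request body to it. A minimal sketch of such a payload is below; the endpoint URL and model identifier are assumptions for illustration, not documented values from the project.

```python
import json

# Hypothetical local endpoint for an OpenAI-compatible server
# (the host, port, and path are assumptions, not project defaults).
BASE_URL = "http://localhost:8080/v1/chat/completions"

# Standard OpenAI-style chat-completions request body; the model id
# "local-int8-model" is a placeholder, not a real cactus model name.
payload = {
    "model": "local-int8-model",
    "messages": [
        {"role": "system", "content": "You are a concise assistant."},
        {"role": "user", "content": "Summarize INT8 quantization in one sentence."},
    ],
    "max_tokens": 64,
    "temperature": 0.2,
}

# Serialize to JSON, as an HTTP client would POST it to BASE_URL.
body = json.dumps(payload)
print(len(body) > 0)
```

Any HTTP client (or the official OpenAI SDK pointed at a custom base URL) could then POST this body to the local server.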