
Implements Unicode confusable detection per UTS #39, extends String with toSkeleton and isConfusable, embeds confusables and ignorable code point data with build-time table generation.
Kotlin Multiplatform (KMP) library that implements Unicode confusable detection based on Unicode Technical Standard #39 - Unicode Security Mechanisms.
It extends String with:
- toSkeleton(): returns the UTS #39 confusable skeleton (specifically, internalSkeleton).
- isConfusable(other): returns whether two strings have the same skeleton.

[!WARNING] A skeleton is intended only for internal use when testing confusability; it is not suitable for display and should not be treated as a general “normalization” of identifiers.

"paypal".isConfusable("p\u0430yp\u0430l") // => true (Cyrillic 'а')
"ѕсоре".toSkeleton() // => "scope"

repositories {
    mavenCentral()
}
kotlin {
    sourceSets {
        val commonMain by getting {
            dependencies {
                implementation("com.doist.x:confusables:1.0.0")
            }
        }
    }
}

This library embeds data from:
- confusables.txt (Unicode 17.0.0)
- Default_Ignorable_Code_Point (Unicode 17.0.0)

Kotlin tables are generated into build/ at build time from the pinned resources/unicode-data/ inputs.
All Unicode data is subject to Unicode’s Terms of Use.
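
To illustrate how the two data sets above feed into a skeleton, here is a minimal JVM-only sketch. It is not this library's implementation. It assumes the `source ; target ; type # comment` field layout used by confusables.txt and the four internalSkeleton steps from UTS #39 (NFD, drop default-ignorable code points, map each code point to its prototype, NFD again). The names `parseConfusables`, `sketchSkeleton`, `sampleConfusables`, and `sampleIgnorable` are hypothetical, and the tables are tiny hand-picked excerpts rather than the full Unicode data.

```kotlin
import java.text.Normalizer

// Hypothetical excerpt in the confusables.txt field layout (not the full file).
val sampleConfusables = """
    # source ; target ; type # comment
    0430 ; 0061 ; MA # CYRILLIC SMALL LETTER A -> LATIN SMALL LETTER A
    0455 ; 0073 ; MA # CYRILLIC SMALL LETTER DZE -> LATIN SMALL LETTER S
    0441 ; 0063 ; MA # CYRILLIC SMALL LETTER ES -> LATIN SMALL LETTER C
    043E ; 006F ; MA # CYRILLIC SMALL LETTER O -> LATIN SMALL LETTER O
    0440 ; 0070 ; MA # CYRILLIC SMALL LETTER ER -> LATIN SMALL LETTER P
    0435 ; 0065 ; MA # CYRILLIC SMALL LETTER IE -> LATIN SMALL LETTER E
""".trimIndent()

// Hand-picked subset of Default_Ignorable_Code_Point (assumption: sample only).
val sampleIgnorable = setOf(0x00AD, 0x200B, 0x200C, 0x200D, 0xFEFF)

// Parse "source ; target ; type" lines into a code point -> prototype string map.
fun parseConfusables(text: String): Map<Int, String> =
    text.lineSequence()
        .map { it.substringBefore('#').trim() } // strip comments and blank lines
        .filter { it.isNotEmpty() }
        .associate { line ->
            val fields = line.split(';').map { it.trim() }
            val source = fields[0].toInt(16)
            // The target may be a sequence of several hex code points.
            val target = buildString {
                fields[1].split(' ').filter { it.isNotEmpty() }
                    .forEach { appendCodePoint(it.toInt(16)) }
            }
            source to target
        }

// Sketch of the skeleton steps: NFD, remove ignorables, map to prototypes, NFD.
fun sketchSkeleton(input: String, prototypes: Map<Int, String>, ignorable: Set<Int>): String {
    val nfd = Normalizer.normalize(input, Normalizer.Form.NFD)
    val mapped = StringBuilder()
    var i = 0
    while (i < nfd.length) {
        val cp = nfd.codePointAt(i)
        i += Character.charCount(cp)
        if (cp in ignorable) continue
        val prototype = prototypes[cp]
        if (prototype != null) mapped.append(prototype) else mapped.appendCodePoint(cp)
    }
    return Normalizer.normalize(mapped, Normalizer.Form.NFD)
}

fun main() {
    val prototypes = parseConfusables(sampleConfusables)
    // Two strings are treated as confusable when their skeletons are equal.
    val latin = sketchSkeleton("paypal", prototypes, sampleIgnorable)
    val mixed = sketchSkeleton("p\u0430yp\u0430l", prototypes, sampleIgnorable)
    println(latin == mixed)
}
```

Because the sample tables cover only a handful of code points, results from this sketch will differ from the library's toSkeleton() for most other inputs.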
To update the pinned Unicode data files, run:
./gradlew updateUnicodeData -PunicodeVersion=17.0.0

Released under the MIT License.