Not Kotdog: Using Computer Vision to Detect Hot Dogs in Kotlin
Adam McNeilly
Posted on June 20, 2017
Kotlin is a programming language from JetBrains that makes it very easy to implement a RESTful API. Now that it’s officially supported for Android development, I wanted to build an app using Kotlin and Clarifai. Luckily, the creators of HBO’s Silicon Valley gave me all the inspiration I needed in last week’s episode where Jian-Yang builds an app that uses machine learning to identify if something is “HotDog or “Not HotDog. I mean, if Clarifai knows anything, it knows hot dogs (see: food model or NSFW model)!
We will be assuming some basic knowledge of Android to get off the ground, but if you are unfamiliar you can learn about building your first app here. If you have a version of Android Studio prior to 3.0, you can refer to these guides on starting a Kotlin project. Otherwise, read on to learn how you can build your own Not HotDog KotDog app using Clarifai and Kotlin!
Defining Our Models
Let’s start by discussing our model objects. If you take a look at the Clarifai Predict Documentation you’ll see the body of the Curl Request looks like this:
{
"inputs": [
{
"data": {
"image": {
"base64": "'"$(base64 /home/user/image.jpeg)"'"
}
}
}
]
}
What we have here is an image object, within a data object, within an input object, that is part of an array. So ultimately we will need four classes here. I will call each of them ClarifaiImage
, ClarifaiData
, ClarifaiInput
, ClarifaiPredictRequest
, respectfully. Here is how they will be defined in Kotlin:
data class ClarifaiImage(val base64: String? = "")
data class ClarifaiData(val image: ClarifaiImage? = null)
data class ClarifaiInput(val data: ClarifaiData? = null)
data class ClarifaiPredictRequest(val inputs: List<ClarifaiInput>? = ArrayList())
Yes, it is true, each of these classes only needs a single line! Kotlin provides us with data classes which provide default implementations for common methods such as toString()
, equals()
, and copy()
. Kotlin classes already provide us with getter and setter methods. Another benefit of Kotlin as opposed to some other languages is the use of default parameters in the constructor. If we look at ClarifaiImage
, for example - the constructor takes in an argument for the base64 value, but if it is not passed in it will be assigned to an empty string. You can learn more about those here.
In addition to all of those, we need to make an AuthToken
class that will come back from our authorization call, discussed next:
data class AuthToken(
@Json(name = "access_token") val accessToken: String? = "",
@Json(name = "expires_in") val expiresIn: Int? = 0
)
Notice that in this class, we use the @Json(name = "")
annotation to specify what the JSON key is for a field. If you don’t specify this, Retrofit will just use the variable name. In this case, though, the JSON convention conflicts with Kotlin variable name convention, so we’re using the annotation to override that.
If you would like to see all of the model classes for this project, including the ones used for a ClarifaiPredictResponse
, you can view them here.
Retrofit & Authorization
Now that we’ve defined the necessary models used to make our calls, the next thing we need to implement in our app is an authorization call. We will do so using Retrofit, an HTTP Client for Android that was built by Square. This is an industry standard library used for making network requests. Let’s start by adding the necessary dependencies into our build.gradle file:
compile 'com.squareup.retrofit2:retrofit:2.1.0'
compile 'com.squareup.retrofit2:converter-moshi:2.1.0'
compile 'com.squareup.okhttp3:logging-interceptor:3.3.1'
compile 'com.jakewharton.timber:timber:4.5.1'
While only the first two are required, I’ve included a logging interceptor for the HTTP calls for debug purposes, as well as a common logging library by Jake Wharton called Timber.
To implement Retrofit, we start by creating an interface that defines any calls we want to make. So far, we’ll need one for authorize()
that takes in a RequestBody
object, and will return an AuthToken
result. Here is what the interface code looks like:
interface ClarifaiAPI {
@POST("/v2/token")
fun authorize(@Body requestBody: RequestBody): Call<AuthToken>
}
The annotation is what tells retrofit that this is a POST request, and provides any extension onto the base URL. Where is the base URL coming from? I’m glad you asked! Let’s build our ClarifaiManager
class!
class ClarifaiManager(context: Context, apiId: String, apiSecret: String) {
private val clarifaiApi: ClarifaiAPI
init {
val authInterceptor = AuthorizationInterceptor(apiId, apiSecret, context)
val loggingInterceptor = HttpLoggingInterceptor().setLevel(HttpLoggingInterceptor.Level.BODY)
val client = OkHttpClient.Builder().addInterceptor(authInterceptor).addInterceptor(loggingInterceptor).build()
val retrofit = Retrofit.Builder()
.baseUrl("https://api.clarifai.com/")
.addConverterFactory(MoshiConverterFactory.create())
.client(client)
.build()
clarifaiApi = retrofit.create(ClarifaiAPI::class.java)
}
fun authorize(requestBody: RequestBody): Call<AuthToken> {
return clarifaiApi.authorize(requestBody)
}
}
The ClarifaiManager.kt
class maintains a reference to our ClarifaiApi
interface. This class defines the OkHttp client we want to use, and any initializations. Here we define our logging intercepter, an authorization interceptor (explained next), and our client which has a base url of “https://api.clarifai.com/ and uses Moshi to convert the JSON response to our Kotlin objects.
The AuthorizationInterceptor.kt
file is an interceptor class that will intercept all outgoing Retrofit calls, and preform any necessary actions. In this case, we know that we need to include an Authorization header on every call, so defining this in an interceptor is easier than applying it to every call in the ClarifaiApi.kt
interface. Here is the code for the interceptor:
class AuthorizationInterceptor(val apiId: String, val apiSecret: String, val context: Context) : Interceptor {
override fun intercept(chain: Interceptor.Chain?): Response {
// Get request path.
val uri = chain?.request()?.url()?.uri()
val path = uri?.path
val authValue: String
if (path == "/v2/token") {
authValue = Credentials.basic(apiId, apiSecret)
} else {
val prefs = context.getSharedPreferences(App.PREFS_NAME, Context.MODE_PRIVATE)
val authString = prefs.getString(App.AUTH_TOKEN_KEY, "")
val authResponse = Moshi.Builder().build().adapter(AuthToken::class.java).fromJson(authString)
authValue = "Bearer ${authResponse?.accessToken}"
}
val request = chain?.request()?.newBuilder()?.addHeader("Authorization", authValue)?.build()
return chain?.proceed(request)!!
}
}
The class accepts two strings, which are your API ID and API Secret (found under your application, as well as a context which is used for shared preferences. Our interceptor does one of two things:
If we are trying to hit the token endpoint, we use basic authorization credentials.
If we are trying to access any other endpoint, we use the authorization token that’s been stored in shared preferences. We read back the
AuthToken.kt
object as a string and use Moshi to convert it back to an object. We’ll discuss how to save that next.
Now that we have our Retrofit service defined, it’s time to implement it. We’ll do this in our MainActivity.kt
file inside the onCreate()
method. Here is a snippet of our activity file that is relevant up to this point:
class MainActivity : AppCompatActivity() {
val manager: ClarifaiManager by lazy { ClarifaiManager(this, getString(R.string.api_id), getString(R.string.api_secret)) }
override fun onCreate(savedInstanceState: Bundle?) {
super.onCreate(savedInstanceState)
setContentView(R.layout.activity_main)
authorizeUser()
}
private fun authorizeUser() {
val call = manager?.authorize(RequestBody.create(MEDIA_TYPE_JSON, GRANT_TYPE_CREDENTIALS))
call?.enqueue(object : Callback<AuthToken> {
override fun onFailure(call: Call<AuthToken>?, t: Throwable?) {
Timber.e(t)
}
override fun onResponse(call: Call<AuthToken>?, response: Response<AuthToken>?) {
Timber.v("Success! Token ${response?.body()?.accessToken}")
val authString = Moshi.Builder().build().adapter(AuthToken::class.java).toJson(response?.body())
val prefs = getSharedPreferences(App.PREFS_NAME, Context.MODE_PRIVATE)
val editor = prefs.edit()
editor.putString(App.AUTH_TOKEN_KEY, authString)
editor.apply()
}
})
}
companion object {
private val MEDIA_TYPE_JSON = MediaType.parse("application/json; charset=utf8")
private val GRANT_TYPE_CREDENTIALS = "\"grant_type\":\"client_credentials\""
}
}
Inside the onCreate()
method we create our ClarifaiManager
instance using our API credentials, and then using the authorizeMember()
method we get the call and implement a Callback
using an anonymous class that will handle the success or failure response. If it is a failure, we simply log the error. If we are successful, we convert the AuthToken
response to a string using Moshi and store it in SharedPreferences
, so it can be read by the interceptor we’ve already created.
Break Point
This would be a good point to pause from the tutorial and test that your application works. Before you run it, here are some additional steps that didn’t get covered:
Include the internet permission in your AndroidManifest.xml file by adding
<uses-permission android:name="android.permission.INTERNET" />
outside of the<application>
tag.Add an App.kt file which defines your application and has some constants and the Timber setup. You can copy the source here.
-
Following the tutorial, you should now be able to run your app. When the activity starts, you should see something similar to this in your logcat:
05-25 13:54:34.619 24830-24830/com.clarifai.notkotdog V/MainActivity$authorizeU: Success! Token jU85Sdyz2moNlGOK6Pl4MVHEu2ZJJj
If you experienced any errors, please double check the source code from GitHub and let us know in the comments so we can update the tutorial accordingly.
Additional Code
Before diving into implementing the prediction calls, there’s some additional code you will want to add to your sample app, only if you are following along. If you are just reading through, skip to the predict call section.
This is code that’s not in scope of what this post was designer for. If you would like additional clarification on any of it, please ask in the comments!
Add a photo icon drawable to be used for the FAB.
Update your
AndroidManifest.xml
file to include theFileProvider
and additional permissions. You must also add provider paths as an XML file.Update your
activity_main.xml
andcontent_main.xml
files from here.The full
MainActivity.kt
code can be found here, but we will discuss some of it still.
Predict Call
Once you’ve verified that you can run the app and successfully get an authorization token, we can begin implementing the predict call.
First things first, let’s make the corresponding changes to ClarifaiAPI.kt
and ClarifaiManager.kt
. These changes should not come as a surprise, they’re implemented the same way the authorize()
call was:
interface ClarifaiAPI {
...
@POST("/v2/models/{model_id}/outputs")
fun predict(@Path("model_id") modelId: String, @Body requestBody: ClarifaiPredictRequest): Call<ClarifaiPredictResponse>
}
class ClarifaiManager(context: Context, apiId: String, apiSecret: String) {
...
fun predict(modelId: String, request: ClarifaiPredictRequest): Call<ClarifaiPredictResponse> {
return clarifaiApi.predict(modelId, request)
}
}
Next, we can implement the predict call inside our activity. Here is the logic it should follow for Not KotDog:
Show the image we are predicting and a loading state.
Encode the image bytes as a base64 string and build our
ClarifaiPredictRequest
.Make the call with Retrofit, determine if the picture is a hot dog, and update the view accordingly.
To determine if we have a hot dog, we will use the concepts returned in the response along with Kotlin’s Collection.Any
method to see if any of the concepts match the name we’re looking for.
private fun predict(modelId: String, imageBytes: ByteArray?) {
// If bytes are null just return
if (imageBytes == null) {
return
}
// Clear out previous and show loading
resultView?.visibility = View.GONE
progressBar?.visibility = View.VISIBLE
imageView?.setImageBitmap(BitmapFactory.decodeByteArray(imageBytes, 0, imageBytes.size))
// Build out the request
val image = ClarifaiImage(
Base64.encodeToString(imageBytes, 0)
)
val data = ClarifaiData(image = image)
val input = ClarifaiInput(data)
val request = ClarifaiPredictRequest(arrayListOf(input))
val call = manager?.predict(modelId, request)
call?.enqueue(object : Callback<ClarifaiPredictResponse> {
override fun onResponse(call: Call<ClarifaiPredictResponse>?, response: Response<ClarifaiPredictResponse>?) {
Timber.v("Success!")
Timber.v("${response?.body()}")
val matchedConcept = response?.body()?.outputs?.first()?.data?.concepts?.any { it.name == HOTDOG_KEY } ?: false
val resultTextResource = if (matchedConcept) R.string.hotdog_success else R.string.hotdog_failure
val resultColorResource = if (matchedConcept) R.color.green else R.color.red
resultView?.text = getString(resultTextResource)
resultView?.setBackgroundColor(ContextCompat.getColor(this@MainActivity, resultColorResource))
resultView?.visibility = View.VISIBLE
progressBar?.visibility = View.GONE
}
override fun onFailure(call: Call<ClarifaiPredictResponse>?, t: Throwable?) {
Timber.e(t)
resultView?.text = getString(R.string.hotdog_error)
resultView?.setBackgroundColor(ContextCompat.getColor(this@MainActivity, R.color.red))
resultView?.visibility = View.VISIBLE
progressBar?.visibility = View.GONE
}
})
}
To modify this to fit your needs, you’ll just need to pass in the appropriate model id (which can be found here)[https://developer.clarifai.com/models], and change the onResponse()
logic to look for things other than a hot dog.
After implementing your predict call, as well as the other necessary code changes mentioned above, you should have something like this:
Posted on June 20, 2017
Join Our Newsletter. No Spam, Only the good stuff.
Sign up to receive the latest update from our blog.