Using regular expressions to parse strings in Kotlin.

theplebdev

Tristan Elliott

Posted on August 3, 2023


Table of contents

  1. What we are doing
  2. Regular expressions
  3. Mapping to the data class

The code

Introduction

  • I have embarked on my next app, a Twitch client. This series will collect all my notes and the problems I faced while creating it.

What we are doing

  • When you connect to Twitch's IRC servers, every message sent by a user arrives as a string of text that looks like this:

"@badge-info=subscriber/77;badges=subscriber/36,sub-gifter/50;client-nonce=d7a543c7dc514886b439d55826eeeb5b;color=;display-name=namers_namers;emotes=;first-msg=0;flags=;id=fd594314-969b-4f5e-a83f-5e2f74261e6c;mod=0;returning-chatter=0;room-id=19070311;subscriber=1;tmi-sent-ts=1690747946900;turbo=0;user-id=144252234;user-type= :marc_malabanan!marc_malabanan@marc_malabanan.tmi.twitch.tv PRIVMSG #a_seagull :LOL"

  • We want to be able to parse that string and turn it into a data class object like so:
data class TwitchUserData(
    val badgeInfo: String?,
    val badges: String?,
    val clientNonce: String?,
    val color: String?,
    val displayName: String?,
    val emotes: String?,
    val firstMsg: String?,
    val flags: String?,
    val id: String?,
    val mod: String?,
    val returningChatter: String?,
    val roomId: String?,
    val subscriber: Boolean,
    val tmiSentTs: Long?,
    val turbo: Boolean,
    val userId: String?,
    val userType: String?
)
  • The data class will make recreating the Twitch UI much easier

  • We can do this with the help of regular expressions

Regular expressions

  • If you look at the text above, you will notice that it follows a repeating pattern: characters, an equals sign, more characters, then a semicolon (key=value;). So we have to build a regular expression to represent that pattern. Long story short, the code looks like this:
val pattern = "([^;@]+)=([^;]+)".toRegex()
val matchResults = pattern.findAll(input)
  • The main things I want to point out are:

  • 1) [^;@]+ matches any character except ; and @. The + means one or more of those characters, so this part captures the key.

  • 2) = is a literal character match. We are saying we want an equals sign between the key and the value.

  • 3) [^;]+ works like the first part: it matches the characters that come after the equals sign, up to but not including the next ;.

  • 4) The parentheses () form what is called a capture group. They let us easily pull out the characters before and after the equals sign, which will be important when we map over to our data class.
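  • To make the four points above concrete, here is a minimal sketch that runs the same pattern over a shortened, made-up tag string (sample and tagPattern are names chosen for this example, not part of the app). It also shows one consequence of the + in the value group: a tag with an empty value, such as color=, does not match at all.

```kotlin
// Same pattern as above: group 1 is the key, group 2 is the value.
val tagPattern = "([^;@]+)=([^;]+)".toRegex()

// Shortened, made-up sample in the same shape as a Twitch tag string.
val sample = "@badges=subscriber/36;color=;display-name=namers_namers;mod=0"

// groupValues[0] is the whole match; [1] and [2] are the two capture groups.
val pairs: List<Pair<String, String>> = tagPattern.findAll(sample)
    .map { it.groupValues[1] to it.groupValues[2] }
    .toList()

// pairs == [(badges, subscriber/36), (display-name, namers_namers), (mod, 0)]
// "color=" is skipped because [^;]+ needs at least one character after the "=".
```

  • Skipped tags simply end up missing from the results, which fits the nullable String? fields in the data class.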

  • To make the results easier to work with, we can take the Sequence<MatchResult> given to us by pattern.findAll(input) and convert it into a map like so:

fun parseStringBaby(input: String): Map<String, String> {
    val pattern = "([^;@]+)=([^;]+)".toRegex()
    val matchResults = pattern.findAll(input)

    val parsedData = mutableMapOf<String, String>()


    for (matchResult in matchResults) {
        val (key, value) = matchResult.destructured
        parsedData[key] = value
    }

    return parsedData
}

  • Notice the val (key, value) = matchResult.destructured line. This is how we access the capture groups that we set up with the parentheses earlier.

  • Now we can easily take this Map and map it to our TwitchUserData class

Mapping to the data class

  • Map the parsed data to our data class like so:
fun mapToTwitchUserData(parsedData: Map<String, String>): TwitchUserData {
    return TwitchUserData(
        badgeInfo = parsedData["badge-info"],
        badges = parsedData["badges"],
        clientNonce = parsedData["client-nonce"],
        color = parsedData["color"],
        displayName = parsedData["display-name"],
        emotes = parsedData["emotes"],
        firstMsg = parsedData["first-msg"],
        flags = parsedData["flags"],
        id = parsedData["id"],
        mod = parsedData["mod"],
        returningChatter = parsedData["returning-chatter"],
        roomId = parsedData["room-id"],
        // "1" becomes true; "0", a missing tag, or a non-numeric value becomes false.
        subscriber = parsedData["subscriber"]?.toIntOrNull() == 1,
        // The tag name in the message is tmi-sent-ts, not tmi-sent.
        tmiSentTs = parsedData["tmi-sent-ts"]?.toLongOrNull(),
        turbo = parsedData["turbo"]?.toIntOrNull() == 1,
        userId = parsedData["user-id"],
        userType = parsedData["user-type"],
    )
}

  • Is it the most elegant-looking code? No, but it works.
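  • If the repeated conversions start to bother you, one possible way to tidy them up is a pair of small extension functions. This is only a sketch: flag and long are hypothetical names made up for this example, and they do the same toIntOrNull() / toLongOrNull() conversions as the mapping code.

```kotlin
// Hypothetical helpers; the names are invented for this sketch.
// True only when the tag is present and parses to the integer 1.
fun Map<String, String>.flag(key: String): Boolean =
    this[key]?.toIntOrNull() == 1

// Missing or non-numeric values become null instead of throwing.
fun Map<String, String>.long(key: String): Long? =
    this[key]?.toLongOrNull()

// Made-up sample map in the shape produced by the parsing step.
val sampleTags = mapOf(
    "subscriber" to "1",
    "turbo" to "0",
    "tmi-sent-ts" to "1690747946900",
)

// sampleTags.flag("subscriber") is true, sampleTags.flag("turbo") is false,
// and sampleTags.long("tmi-sent-ts") is 1690747946900.
```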

  • Running the code together would look like this:

val parsedData = parseStringBaby(inputString)
val mappedData = mapToTwitchUserData(parsedData)
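  • One thing to watch for when the pattern runs over the whole IRC line: [^;]+ only stops at a semicolon, so the value of the last tag (user-type in the example message) runs on through the sender and the chat message. A possible workaround, sketched below, is to cut the tag section off before parsing. It assumes the tag section starts with @ and ends at the first space, which is how the example message is laid out. splitTagsFromLine and parseTagSection are hypothetical helper names, and the sample line is made up.

```kotlin
// Hypothetical helper: returns the tag section (without the leading "@")
// and the rest of the IRC line. Assumes tags, when present, end at the first space.
fun splitTagsFromLine(line: String): Pair<String, String> {
    if (!line.startsWith("@")) return "" to line
    val tagSection = line.substringBefore(' ').removePrefix("@")
    val rest = line.substringAfter(' ', missingDelimiterValue = "")
    return tagSection to rest
}

// Same pattern as in the post, applied only to the tag section.
fun parseTagSection(tagSection: String): Map<String, String> =
    "([^;@]+)=([^;]+)".toRegex()
        .findAll(tagSection)
        .associate { it.groupValues[1] to it.groupValues[2] }

// Made-up line in the same shape as the example message.
val exampleLine =
    "@mod=0;turbo=0;user-type= :someone!someone@someone.tmi.twitch.tv PRIVMSG #channel :LOL"

val exampleSplit = splitTagsFromLine(exampleLine)
val exampleParsed = parseTagSection(exampleSplit.first)

// exampleParsed == {mod=0, turbo=0}; the empty user-type tag is simply absent.
// exampleSplit.second == ":someone!someone@someone.tmi.twitch.tv PRIVMSG #channel :LOL"
```

  • The resulting map can be passed to mapToTwitchUserData unchanged, and the second half of the pair still holds the sender and the message text for later use.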


Conclusion

  • Thank you for taking the time out of your day to read this blog post of mine. If you have any questions or concerns please comment below or reach out to me on Twitter.