Hanukkah of Data 2022 - Puzzle 1

ajkerrigan

AJ Kerrigan

Posted on December 29, 2022

Hanukkah of Data 2022 - Puzzle 1

Hanukkah of Data is a series of data-themed puzzles, where you solve puzzles to move your way through a holiday-themed story using a fictional dataset. See the introductory post for a bit more detail, but the pitch in my head is "Advent of Code meets SQL Murder Mystery". This post walks through my approach to the first puzzle.

Pre-work

Before actually solving any puzzles, there's a mini warm-up puzzle that lets you access the fictional data set. With that done, it's helpful to explore the data first just to get a feel for the shape/size/relationships. I'm not going to document that here, but in my case the gist was "open with VisiData and poke around a bit."

Warning: From here on, this post unavoidably contains spoilers. If you'd rather do the puzzles on your own first, please close this browser tab and run away :).

Thinking

The first daily puzzle involves finding a customer whose phone number matches their last name as typed on a phone's keypad:

“They came in awhile ago and showed me their business card, and that’s what it said. Skilled Private Investigator. And their phone number was their last name spelled out. I didn’t know what that meant, but apparently before there were smartphones, people had to remember phone numbers or write them down. If you wanted a phone number that was easy-to-remember, you could get a number that spelled something using the letters printed on the phone buttons: like 2 has “ABC”, and 3 “DEF”, etc. And I guess this person had done that, so if you dialed the numbers corresponding to the letters in their name, it would call their phone number!

So...

  • Define a phone letter --> number translation
  • Find cases where the phone number shows up in the customer name

Doing (VisiData)

Since this was the first puzzle and I already had the data set open in VisiData, I stayed there. I wrote a little plugin a while back that lets me drop into a ptipython REPL that preserves VisiData context and keeps a VisiData-specific history. I find it super helpful for quick code explorations that I intend to throw away - if something is useful, I can adapt bits from the saved history into .visidatarc snippets or plugins later. So my actual work started there...

  • From a REPL, define a translation table and save it to the global vd object so it's available in future vd expressions:
phone_pad = {
    "abc": 2,
    "def": 3,
    "ghi": 4,
    "jkl": 5,
    "mno": 6,
    "pqrs": 7,
    "tuv": 8,
    "wxyz": 9,
}

vd.phone_lookup = str.maketrans(
    {letter: str(number) for letters, number in phone_pad.items() for letter in letters}
)
Enter fullscreen mode Exit fullscreen mode
  • Add a new column (= or addcol-expr) based on this expression:
name.lower().replace(' ', '').translate(vd.phone_lookup)
Enter fullscreen mode Exit fullscreen mode
  • Rename the new column (^ or rename-col) to a friendly name like name_translated

  • Find rows where the phone number appears in the translated name, using z| or select-expr with this expression:

phone.replace('-', '') in name_translated
Enter fullscreen mode Exit fullscreen mode
  • Open a new sheet with only the selected row, using " or dup-selected

Refining / Reusing

While I find this "one step at a time" iterative approach very friendly for ad-hoc exploration, sometimes "just this once..." operations keep coming back. So how would I clean this up just enough so that it was more easily reusable?

  • First, turn those custom REPL bits into a custom VisiData command, and include it in my .visidatarc - either directly or as part of an isolated plugin file.
  • Then, save my manual steps using VisiData's command log and adapt/replay it in the future.

There's sample code for that in this gist. With the Python bits included or imported into .visidatarc and a noahs-customers.csv file in the current directory, vd -p puzzle1.vdj finds the solution.

Solutions from Elsewhere

I love Dian Fey's approach to this puzzle using Postgres!

Dolmen's solution here leaned into the SQLite approach I snuck up on and rejected in favor of VisiData. It's good to see it up in GitHub for the sake of comparison - pure SQLite is a feature!

💖 💪 🙅 🚩
ajkerrigan
AJ Kerrigan

Posted on December 29, 2022

Join Our Newsletter. No Spam, Only the good stuff.

Sign up to receive the latest update from our blog.

Related

Hanukkah of Data 2022 - Puzzle 1
hanukkahofdata Hanukkah of Data 2022 - Puzzle 1

December 29, 2022