paulapivat

Paul Apivat

Posted on October 27, 2020

Dictionaries

Dictionaries are good for storing structured data. They hold key/value pairs, so you can look up a value by its key. The author provides some ways to initialize a dictionary, with comments about what is more or less pythonic (I'll take the author's word for it, but I'm open to other perspectives).

Some of the things you can do with dictionaries are look up values by key, assign new key/value pairs, check whether a key exists, and retrieve values without risking an error.


empty_dict = {}                   # most pythonic
empty_dict2 = dict()              # less pythonic
grades = {"Joel": 80, "Grus": 99} # dictionary literal

type(grades)  # type check, dict

# use bracket to look up values
grades["Grus"]  # 99
grades["Joel"]  # 80

# KeyError for looking up non-existent keys
try:
    kate_grades = grades["Kate"]
except KeyError:
    print("That key doesn't exist")

# use in operator to check existence of key
joel_has_grade = "Joel" in grades
joel_has_grade # True

kate_has_grade = "Kate" in grades
kate_has_grade # False

# use 'get' method to get values in dictionaries
grades.get("Joel") # 80
grades.get("Grus") # 99
grades.get("Kate") # default: None

# assign new key/value pair using brackets
grades["Tim"] = 93

grades # {'Joel': 80, 'Grus': 99, 'Tim': 93}

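The get method also accepts a second argument, which is returned when the key is missing. That can stand in for the try/except shown above. Here is a minimal sketch that reuses the grades dictionary; the fallback value 0 and the names kate_grade, kate_added and num_students are my own choices for illustration:

```python
grades = {"Joel": 80, "Grus": 99, "Tim": 93}

# get() with an explicit default returns that default for a missing key
kate_grade = grades.get("Kate", 0)    # 0 instead of None

# get() only reads; it does not add the missing key
kate_added = "Kate" in grades         # False

# bracket assignment on an existing key overwrites the old value
grades["Tim"] = 95

# len() counts the key/value pairs
num_students = len(grades)            # 3
```

Whether 0 is a sensible default depends on the data; None may be the more honest answer for a student who has no grade yet.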

Dictionaries are good for representing structured data that can be queried. The key takeaway here is that to iterate over a dictionary's keys, values or both, we use the methods keys(), values() or items(). Looping over the dictionary itself only gives you its keys.


tweet = {
    "user": "paulapivat",
    "text": "Reading Data Science from Scratch",
    "retweet_count": 100,
    "hashtags": ["#66daysofdata", "datascience", "machinelearning", "python", "R"]
    }

# query specific values
tweet["retweet_count"] # 100

# query values within a list
tweet["hashtags"] # ['#66daysofdata', 'datascience', 'machinelearning', 'python', 'R']
tweet["hashtags"][2] # 'machinelearning'

# retrieve ALL keys
tweet_keys = tweet.keys()
tweet_keys              # dict_keys(['user', 'text', 'retweet_count', 'hashtags'])
type(tweet_keys)        # different data type: dict != dict_keys

# retrieve ALL values
tweet_values = tweet.values() 
tweet_values  # dict_values(['paulapivat', 'Reading Data Science from Scratch', 100, ['#66daysofdata', 'datascience', 'machinelearning', 'python', 'R']])

type(tweet_values)      # different data type: dict != dict_values

# create iterable for Key-Value pairs (in tuple)
tweet_items = tweet.items()

# iterate through tweet_items
for key, value in tweet_items:
    print("This is the key:", key)
    print("This is the value:", value)

# iterating over the dictionary itself yields keys only,
# so unpacking each key into two names fails
# ValueError: too many values to unpack (expected 2)
try:
    for key, value in tweet:
        print(key)
except ValueError as err:
    print(err)

# 'enumerate' does not help either: it yields (index, key) pairs, no values
for index, key in enumerate(tweet):
    print(index)  # prints 0 1 2 3 - index values
    print(key)    # prints user text retweet_count hashtags - keys, not values
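
Looping over a dictionary directly is allowed; it just yields the keys, so each value has to be looked up with brackets. Below is a small sketch using a trimmed-down tweet (the names pairs, same_pairs, keys_view and keys_now are hypothetical). It also shows that a view object reflects changes made to the dictionary after the view was created:

```python
tweet = {"user": "paulapivat", "retweet_count": 100}

# iterating over the dict itself yields keys; look up each value by key
pairs = []
for key in tweet:
    pairs.append((key, tweet[key]))
# pairs == [('user', 'paulapivat'), ('retweet_count', 100)]

# items() produces the same pairs directly
same_pairs = list(tweet.items())

# views are live: a key added later shows up in the existing view
keys_view = tweet.keys()
tweet["text"] = "Reading Data Science from Scratch"
keys_now = list(keys_view)   # ['user', 'retweet_count', 'text']
```

In practice items() is the more readable choice when both the key and the value are needed.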

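Just like with lists and tuples, you can use the in operator to test membership. On a dictionary, in checks the keys; to check the values, test against values() instead. The one caveat is that in does not search inside a value that is itself a list, so you first have to pull that list out with bracket notation.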


# search keys
"user" in tweet # True
"bball" in tweet # False
"hashtags" in tweet  # True

# search values
"paulapivat" in tweet_values # True
'python' in tweet_values # False ('python' is nested in the 'hashtags' list)

# finding values inside a list requires brackets to help
'python' in tweet['hashtags']  # True

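If you don't know in advance which key holds the list, one option is to scan every value and also look inside the ones that are lists or tuples. This is only a sketch: contains_nested is a hypothetical helper I wrote for illustration, not a built-in, and it looks just one level deep.

```python
def contains_nested(d, target):
    """Return True if target is a value of d, or an item of a list/tuple value."""
    for value in d.values():
        if value == target:
            return True
        if isinstance(value, (list, tuple)) and target in value:
            return True
    return False


tweet = {
    "user": "paulapivat",
    "retweet_count": 100,
    "hashtags": ["#66daysofdata", "datascience", "python"],
}

found_python = contains_nested(tweet, "python")      # True (inside 'hashtags')
found_user = contains_nested(tweet, "paulapivat")    # True (top-level value)
found_bball = contains_nested(tweet, "bball")        # False
```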

What is or is not hashable?

Dictionary keys must be hashable.

Strings are hashable. So we can use strings as dictionary keys, but we cannot use lists because they are not hashable.


paul = "paul"
type(paul)        # check type, str

hash(paul)        # e.g. -3897810863245179227 ; strings are hashable (value differs between Python sessions)
paul.__hash__()   # same value as hash(paul) ; another way to find the hash

jake = ['jake']   # this is a list
type(jake)        # check type, list

# lists are not hashable - cannot be used as dictionary keys
try:
    hash(jake)
except TypeError:
    print('lists are not hashable')

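When a list-like key is needed, a common workaround is to convert the list to a tuple, since a tuple of hashable items is itself hashable. A brief sketch with made-up data (seating, jake_key, row and nested_ok are names I invented for this example):

```python
jake = ["jake", "smith"]          # a list: not hashable

# convert the list to a tuple to get a hashable equivalent
jake_key = tuple(jake)

seating = {jake_key: "row 3"}     # tuples work as dictionary keys
row = seating[("jake", "smith")]  # 'row 3'

# a tuple that contains a list is still unhashable
try:
    hash(("jake", ["smith"]))
    nested_ok = True
except TypeError:
    nested_ok = False             # ends up False
```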

For more content on data science, machine learning, R, Python, SQL and more, find me on Twitter.
