Part Two: Data Types
Simon Chalder
Posted on November 19, 2022
Welcome to part two. In this series I hope to introduce the basics of coding in Python for absolute beginners in an easy to follow and hopefully fun way. In this article we will look at some of the types of data we can use in Python, but first let's take a look at the solution for the challenge set in part one.
"In the spirit of science, there really is no such thing as a ‘failed experiment.’ Any test that yields valid data is a valid test." – Adam Savage
Solution to part one's challenge
The way we can put 2 variables together and print them on the same line is through something called 'concatenation'. This fancy sounding term simply means to stick 2 things together, and we do it with the '+' sign. Like many tasks in coding, there are several different ways of doing the same thing, so here are a few ways of accomplishing this task:
Option 1
genus = "Pinus"
species = "sylvestris"
solution = genus + species
print(solution)
# Output
Pinussylvestris
Option 2
genus = "Pinus"
species = "sylvestris"
print(genus + species)
# Output
Pinussylvestris
Option 3 including ways to add a space
genus = "Pinus"  # We could add a space here - "Pinus "
species = "sylvestris"  # We could add it here - " sylvestris"
solution = genus + " " + species  # Or here
print(genus + " " + species)  # Or here
# Output
Pinus sylvestris
Concatenation is a useful tool for formatting output or just for combining more than one value. Consider the following example:
sci_name = "Ursus arctos horribilis"
location = "Northern America"
colour = "Brown"
age = 20
sentence = "Grizzly bears, also known as " + sci_name + " are found in " + location + ". They are usually " + colour + " in colour, and live to around " + str(age) + " years old"  # str() turns the integer age into text so it can be joined to the other strings
print(sentence)
#Output
Grizzly bears, also known as Ursus arctos horribilis are found in Northern America. They are usually Brown in colour, and live to around 20 years old
Remember to pay attention to your quotation marks. I would highly recommend playing around in IDLE with concatenation. For more information see this link.
Some other things we can put in variables
We have now declared our first variables (a fancy term for creating a variable and giving it a value). Not all that impressive, so what else can we do with variables? As we have already learned, variables can be used to hold many different types of data. Data types are an important concept in coding. Multiplying 2 variables which contain numbers would work, but doing the same with a number variable and another which contained only text might produce unexpected results or fail altogether.
There are many data types which Python can work with but for simplicity the basic, commonly used types are:
Integers - Integers or 'int' for short are whole numbers, positive or negative e.g. 12, 458, -2537 etc.
Strings - Strings or 'str' for short represent text and are always surrounded by either 'single' or "double" quotation marks.
Floats - Floats are decimal numbers e.g. 23.6, -945.00023.
Boolean - Booleans represent one of 2 values either 'True' or 'False'.
So we could create the following kinds of variables:
my_name = "Simon"  # String
count = 100  # Integer
measurement = 685.48  # Float
is_raining = True  # Boolean
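If we are ever unsure which data type a variable holds, Python's built-in type() function will tell us. The sketch below reuses the variable names from the example above; type() itself has not been covered in this series yet, so treat it as a handy extra rather than something to memorise.

```python
my_name = "Simon"
count = 100
measurement = 685.48
is_raining = True

# type() reports the data type of whatever we pass to it
print(type(my_name))      # <class 'str'>
print(type(count))        # <class 'int'>
print(type(measurement))  # <class 'float'>
print(type(is_raining))   # <class 'bool'>
```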
Why do we care about data types?
The projects we choose to make will inevitably contain different types of data. Everything from the names of plants, animals, places (strings), to survey numbers and population counts (integers), to map coordinates, correlation coefficients or fuel prices (floats). In order to get the expected results from our projects we must ensure we are using the correct data types. Understanding data types and how they interact with each other is essential when working on projects with several different types of data.
Numerical Data
"Arithmetic is being able to count up to twenty without taking off your shoes." - Mickey Mouse
Let's say we have declared (created) a variable num1 and assigned it a value of 10 (num1 = 10). We can now perform arithmetic operations (+, -, *, /) using our variable. Firstly we can perform simple calculations using a single variable. In IDLE type num1 + 2 and press Enter. We see the result of 12 displayed. If we type num1 again we can see that the value of the variable has not changed, rather we have simply performed a calculation using its value.
Other mathematical operations we can perform include '-' for subtraction (num1 - 3), '*' for multiplication (num1 * 2), and '/' for division (num1 / 2).
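The calculations described above can be collected into one short script. This is a minimal sketch that uses print() so it also works outside of IDLE; note that division with '/' gives a float even when the answer is a whole number.

```python
num1 = 10

print(num1 + 2)  # 12
print(num1 - 3)  # 7
print(num1 * 2)  # 20
print(num1 / 2)  # 5.0 (division with '/' always gives a float)
print(num1)      # 10 (the variable itself has not changed)
```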
We can also perform calculations using multiple variables:
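As a minimal sketch, assuming a second variable named num2 with a value of 5 (both the name and the value are chosen here purely for illustration):

```python
num1 = 10
num2 = 5  # assumed value, used only for this illustration

# Each calculation uses the values held in both variables
print(num1 + num2)  # 15
print(num1 - num2)  # 5
print(num1 * num2)  # 50
```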
This is all well and good but performing calculations this way gives us no way of storing the result. We may wish to perform other calculations on the result or we may wish to display it in a way other than printing to the terminal or IDLE. The solution is to store the result of our calculation in another variable. To do this we use the syntax result = num1 + num2.
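Once the result is stored we can reuse it as often as we like. A small sketch, again assuming num2 holds 5 for illustration; the variable name doubled is also made up for this example:

```python
num1 = 10
num2 = 5  # assumed value, used only for this illustration

result = num1 + num2   # store the answer instead of just displaying it
print(result)          # 15

doubled = result * 2   # the stored result can feed another calculation
print(doubled)         # 30
```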
A quick note on variable names.
In the above example we do not need to call our variable 'result' - we could call it anything we wish such as 'chicken' or 'sdjhdkjh'. However, when naming variables it is good practice to give them a name which reflects what they contain or are used for. It can be tempting to name variables things like 'x' or 'y' but when we begin to write longer applications it can quickly become confusing as to what these variables are for. When naming things in Python there are certain conventions we should follow. Some of these conventions make our code easier to read and some forbid the use of reserved words (keywords) which Python keeps for other purposes.
For example we may need to create a variable to store the total number of birds recorded during a point survey. If we simply name the variable 'result' or 'total' this could become confusing later - which result or total is this referring to? So we would choose something suitable such as bird_survey_total. We could also have named it birdsurveytotal but this is harder to read. In Python one of the naming conventions when naming variables is to use snake case (it is Python after all) which means to use an underscore '_' where a space would normally sit. Therefore, bird_survey_total would be easier to read and would follow the correct naming convention.
If you recall the previous discussion about data types we saw boolean values can be True or False. Note the capitalised first letter - 'True' and 'False' are reserved names in Python and we cannot use them to name things. You can try this in IDLE. Declare a variable named 'True' and try to give it a value - Python will give you an error to remind you not to do this.
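If you are curious whether a particular word is reserved, Python's standard library includes a keyword module that can check for us. Importing modules has not been covered in this series yet, so the sketch below is only a preview; the variable names are made up for this example.

```python
import keyword

# iskeyword() answers True if the word is reserved by Python
true_is_reserved = keyword.iskeyword("True")
lowercase_true_is_reserved = keyword.iskeyword("true")
our_name_is_reserved = keyword.iskeyword("bird_survey_total")

print(true_is_reserved)            # True
print(lowercase_true_is_reserved)  # False (capitalisation matters)
print(our_name_is_reserved)        # False (safe to use as a variable name)
```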
There is more to learn with this topic and it is something we all get better at with practice. If you would like to know more follow this link and try to see which of the following variable names would follow Python naming conventions:
number_of_foxes
numberOfFoxes
_numberoffoxes
Number_Of_Foxes
number_of_foxes!
NUMBER_OF_FOXES
What about Floats?
Floats, just like integers can be used with mathematical operators and can be combined with integers in calculations. Be aware however that the output from such calculations will be a float, even if the result is a whole number.
num1 = 5
num2 = 3.6
num3 = num1 * num2
print(num3)
#Output
18.0 # This result is a float despite being a whole number
...and Booleans?
Boolean values represent a state - something is True or False, On or Off, 0 or 1. Booleans will come into their own in the future when we discuss flow control where we can check the state of a boolean and then perform actions depending on that state. Below is a simplified example.
is_raining = True
if is_raining:
    print("Take an umbrella")  # runs when is_raining is True
else:
    print("Leave the umbrella at home")  # runs when is_raining is False
Strings
Just like numerical data, working with strings of text is an important part of coding. To recap, a string is anything inside 'single' or "double" quotations.
'Lorem ipsum' is a string
"Lorem ipsum" is a string
"12345" is a string
'34.68' is a string
Lorem ipsum is NOT a string - no quotation marks
12345 is an integer and NOT a string
We can use single and double quotations together for grammatical purposes but we must ensure that we pay attention to the kind of quotations used. For example the following would be correct:
"It's raining outside"
However this would produce an error:
'It's raining outside'
Can you see why? Python takes everything in between the same kind of quotes as a string. In the second example Python read 'It' as a string because it is surrounded by single quotes and everything after that as a separate thing. Enclosing the text in a different quotation style allows Python to treat the whole text as a string. Another example would be using double quotes inside singles such as:
'tell her I said "hello" please'
# This would work.
However, if we instead used:
'tell her I said 'hello' please'
Python would read 'tell her I said ' as one string and ' please' as another, with the word hello left stranded between them outside of any quotes, and would stop with a syntax error.
Mixed Data Types
Care must be taken to ensure the correct data type is used. For example, multiplying 6 and 3 will produce a result of 18. What about multiplying 6 (an integer) and "3" (a string)? Try it in IDLE for yourself, was it what you expected? In this case Python has literally taken the string "3" and repeated it 6 times, giving "333333". Perhaps this is what we wanted but we must ensure we are aware of the type of data we are using.
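The difference is easy to see side by side. The sketch below also uses the built-in int() function, which has not been introduced in this series yet; it converts a string of digits into an integer so that the multiplication behaves as arithmetic. The variable names are made up for this example.

```python
repeated = 6 * "3"      # integer * string repeats the string
print(repeated)         # 333333

product = 6 * int("3")  # int() turns the text "3" into the number 3 first
print(product)          # 18
```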
Slicing
"No matter how thin you slice it, it's still baloney" - Al Smith
We already know that printing the variable shows us the contents of that variable but what if we only want to know part of the contents? Perhaps we need to know the 3rd character in a string as an identifier or maybe the last 2 characters are important. To do this we use a technique known as slicing. Despite sounding like something a hacker would do in some dodgy action movie, slicing simply means breaking down a string into individual characters and then slicing out the bits we want.
Why would we care about slicing?
Suppose we have an application which tracks livestock numbers. Each animal is identified by a letter "C" for cow, "S" for sheep, or "P" for pig followed by a unique ID number. A sheep in the database may have the ID "S-2468" for example. By using slicing we can read the first character from the ID string value and determine what kind of animal it is. We can also poll all records and sort them by their string's first character to determine how many of each animal are represented. Slicing can be a powerful and time saving tool when used properly.
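To make the livestock idea concrete, here is a small sketch using the example ID "S-2468". It relies on the square bracket syntax explained in the Indexing section that follows, and the variable names are made up for this illustration.

```python
animal_id = "S-2468"

animal_code = animal_id[0]  # the first character identifies the animal type
id_number = animal_id[2:]   # everything after the "S-" prefix

print(animal_code)  # S
print(id_number)    # 2468
```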
Indexing
Each character in a string is represented by a number (the index). Computers count in a slightly different way to most of us in that instead of starting with 1 they use 0. Therefore when we are doing our slicing we need to remember that the first character in a string is represented by the index 0 not 1. For example, if we had the string "Apple", Python would read the characters and give them the following index numbers:
Character: A p p l e
Index:     0 1 2 3 4
In order to retrieve the correct character we simply tell Python the index number of the character we want. To do this we use the name of the variable containing the string followed by square brackets and inside these brackets we place the index number of the required character. It is important here to note that slicing takes a copy of the required character and the string being sliced remains unaffected. Also of note is that spaces are included as characters in strings so slicing a character which is a space will return " ".
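A minimal sketch of this, using a variable named word that holds the string "Apple":

```python
word = "Apple"
letter = word[1]  # take a copy of the character at index 1

print(letter)  # p
print(word)    # Apple (the original string is unaffected)
```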
In the above example we have a word variable containing the string "Apple". We slice character number 1 from the word variable and store it in a new variable named letter. This gives us the letter 'p' - remember, we begin the count from 0 and so character 1 is actually the 2nd letter in the string. If we had wanted the first character we would have used word[0].
In the same way that we count the 1st character as 0 we can also count backwards. Say we had a string named phrase which contained a long sentence of 80 characters. If we wanted to know the last character in the string we could use phrase[79] (remember the string has 80 characters but numbers start at 0 so the last number will always be 1 less than the total). However, we may not know the total number of characters, and a simpler way to do this is to use phrase[-1]. Using minus numbers allows us to count backwards in the same manner as going forwards. So, [-2] would be the 2nd to last character and so on.
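Here is a short sketch of counting backwards. The phrase used is a shorter stand-in chosen for illustration, and the built-in len() function (not yet covered in this series) is included only to show that both approaches reach the same character.

```python
phrase = "The quick brown fox"  # a shorter stand-in phrase for illustration

print(phrase[-1])                # x (the last character)
print(phrase[-2])                # o (the 2nd to last character)
print(phrase[len(phrase) - 1])   # x (same character, found by counting forwards)
```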
So now we know how to extract single characters from strings. What about if we want a range of characters? The process is similar to getting a single character but we give Python a start and end position for our slice. The syntax here is variable_name[start:end]. It is important to note that the end character number is NOT included. If we use our "Apple" string example, we want to get the 1st character through to the 3rd character.
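A minimal sketch of that slice; the variable name first_three is made up for this example:

```python
word = "Apple"
first_three = word[0:3]  # start at index 0, stop just before index 3

print(first_three)  # App
```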
Although 'A' is 0, 'p' is 1 and 'p' is 2 we must slice up to 3 in order for character 2 to be included.
Slicing to the end of a string can be done even if we do not know the string's length by simply leaving the end position empty. For example, to include everything from the 2nd character onwards we would use word[1:]. In the same way, leaving the start position empty begins the slice at the start of the string, so to include everything from the start of the string to the 4th character we would use word[:4].
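Using the "Apple" string again, a quick sketch of both open-ended slices; the variable names are made up for this example:

```python
word = "Apple"

from_second = word[1:]  # index 1 through to the end of the string
up_to_fourth = word[:4] # start of the string up to, but not including, index 4

print(from_second)   # pple
print(up_to_fourth)  # Appl
```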
For more information on slicing follow this link.
Time for a challenge! Here are some tasks to go through to get some practice using mathematical operators as well as some string slicing.
Task 1:
Create 3 variables - num1 equal to 12, num2 equal to 28 and num3 equal to -3. Use these variables to calculate the following:
- num1 + num2
- num1 * num2
- num2 * num3
- num1 - num3
- Bonus - try to solve the following using Python. What data type is the answer?
Task 2:
Create 3 variables - animal equal to "Crayfish", plant equal to "Honeysuckle", and habitat equal to "woodland".
In a new variable store the following. Print the new variable to verify it is correct.
- The 1st character from animal
- The last character from habitat
- The characters 'fish' from animal
- The characters 'Honey' from plant
- The last 4 characters from habitat
Conclusion
We have now covered variable creation and some of the types of data we can store in our variables. We have also gone over how to do basic mathematics with numbers and how to slice and manipulate text. In the next article we will look at data structures, that is, a way that we can store more than one value in a variable and even turn our variables into miniature databases. I will continue to leave small challenges to try and get you thinking and practising these topics. However, I will be intentionally asking you to solve problems for which the solution involves things we have not fully covered. The reason for this is that as you go on to make your own projects you will get stuck. It happens to everyone, even the professionals, and a common joke is that developers are nothing more than expert Googlers!
When you do get stuck it is important to try and get yourself unstuck and learning where to find help and information is key. I will likely devote an entire article to this in the future but a good starting point is the official Python documentation or to search for the exact issue you are having such as "python string will not print". This will usually get you on the right track and usually someone has had the exact same issue as you. More on this in the future.
Thanks for reading. Constructive criticism is always appreciated. I look forward to seeing you in part 3.
Simon