INTRODUCTION TO PYTHON FOR DATA ENGINEERING
fatumakaliku
Posted on August 31, 2022
Python is a high-level, general-purpose, interpreted programming language. High-level means it is easy to learn and doesn't require you to understand the details of a computer in order to use it. General-purpose means it can be used in various domains such as web development, automation, ML, and AI. Interpreted means an interpreter executes the source code directly, with no separate compile step before you run a program.
Python is used in data science because it is rich in the mathematical tools required to analyze data.
Python programs have the extension .py and are run from the command line with python file_name.py.
Python Hello World
Let's create our first Python program, a Hello World program. First create a folder, say mycode, where you'll save your code files. Then launch VS Code and open the mycode folder you created.
Next, create a new Python file named app.py, enter the following code, and save the file.
print('Hello World!')
print() is a built-in function that displays a message on your screen. It does not return the message; here it simply prints Hello World! to the screen.
Comments.
Comments start with the # character.
When writing code, you sometimes want to document it and note why a piece of code works the way it does. You can do so using comments.
Typically you use comments to explain formulas, algorithms, and complex logic. When executing a Python program, the interpreter ignores the comments and runs only the code.
Python provides three kinds of comments: block comments, inline comments, and documentation strings.
- Python block comment. These comments explain the code that follows them and are indented to the same level as that code.
# Increase price of cat by 1000
price = price + 1000
- Python inline comments. These are comments placed on the same line as a statement.
cat = cat + 1000 # increase the cat price by 1000
- Documentation string. A documentation string, or docstring, is a string literal that you put as the first statement in a code block, for example a function. Technically, docstrings are not comments: the Python interpreter does not ignore them but stores them in the object's __doc__ attribute.
def sort():
    """ sort the list using sort algorithm """
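To see that a docstring is kept by the interpreter, unlike a # comment, here is a minimal sketch that reads one back. The function name `sort_items` and its body are made up for illustration.

```python
def sort_items(items):
    """Return a new list with the items in ascending order."""
    # sorted() is a built-in that returns a new sorted list
    return sorted(items)

# The docstring is stored on the function object as __doc__
doc = sort_items.__doc__
print(doc)
print(sort_items([3, 1, 2]))  # [1, 2, 3]
```

Calling help(sort_items) in an interactive session shows the same text.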
Variables
Variables are labels that you can assign values to. Their sole purpose is to label and store data in memory. This data can then be used throughout your program.
favorite_animal = "cat"
print(favorite_animal)
The variable favorite_animal can hold different values at different times, so its value can change throughout the program.
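As a small sketch of a value changing over time, the same name can be assigned again later in the program. The values "dog" and 42 are arbitrary examples.

```python
favorite_animal = "cat"
print(favorite_animal)  # cat

# Assigning again replaces the old value
favorite_animal = "dog"
print(favorite_animal)  # dog

# A name can also be rebound to a value of a different type
favorite_animal = 42
print(type(favorite_animal).__name__)  # int
```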
Arithmetic Operations
print(15 + 5) # 20 (addition)
print(11 - 9) # 2 (subtraction)
print(4 * 4) # 16 (multiplication)
print(4 / 2) # 2.0 (division)
print(2 ** 8) # 256 (exponent)
print(7 % 2) # 1 (remainder of the division)
print(11 // 2) # 5 (floor division)
Comparison and Logical Operators
Python comparison operators are used to compare two values:
==, !=, >, <, >=, <=
Python Logical Operators
- Logical operators are used to combine conditional statements:
and, or, not
Python Arithmetic Operators
- Arithmetic operators are used with numeric values to perform common mathematical operations: +, -, *, /, %, **, //
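Here is a short sketch of comparison and logical operators working together. The variable names and values (`age`, `has_ticket`) are hypothetical and only there for illustration.

```python
age = 25           # hypothetical value
has_ticket = True  # hypothetical value

is_adult = age >= 18                 # comparison gives a boolean
can_enter = is_adult and has_ticket  # both sides must be True
is_minor = not is_adult              # not flips the value
matches = age == 30 or age == 25     # at least one side must be True

print(is_adult, can_enter, is_minor, matches)  # True True False True
```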
Data Types.
1. Strings
Strings in Python are surrounded by either single quotation marks (') or double quotation marks (").
You can display a string literal with the print() function.
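A minimal sketch of both quoting styles follows. The variable names and the sample text are assumptions for illustration.

```python
single = 'Hello'
double = "Hello"
print(single == double)  # True: the quote style does not change the value

# Double quotes make it easy to include an apostrophe in the text
message = "It's a string"
print(message)
print(len(message))  # 13
```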
2. Numbers.
There are three numeric types in Python:
integer
float
complex
x = 1 # int
y = 2.8 # float
z = 1j # complex
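To confirm which numeric type a value has, you can ask Python with the built-in type() function. This is a small sketch; the variable `total` is added here for illustration.

```python
x = 1    # int
y = 2.8  # float
z = 1j   # complex

print(type(x).__name__)  # int
print(type(y).__name__)  # float
print(type(z).__name__)  # complex

# Mixing an int and a float in arithmetic gives a float
total = x + y
print(type(total).__name__)  # float
```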
3. Booleans.
- Booleans represent one of two values: True or False
- When you use a condition in an if statement, Python evaluates it to True or False
# booleans
a = 1000
b = 200
if b > a:
    print("b is greater than a")
else:
    print("b is not greater than a")
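The result of a comparison is an ordinary boolean value, so it can also be stored in a variable before it is used in an if statement. A minimal sketch, where the name `result` is chosen for illustration:

```python
a = 1000
b = 200

# The comparison itself produces a bool
result = b > a
print(result)                 # False
print(type(result).__name__)  # bool

if result:
    print("b is greater than a")
else:
    print("b is not greater than a")
```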
4. Lists
Lists are used to store multiple items in a single variable.
Lists are created using square brackets []. They are one of four built-in data types in Python used to store collections of data; the other three are tuple, set, and dictionary, each with different qualities and usage.
# can store any data type
multiple_types = [False, 5.7, "Hello"]
# items can be accessed and modified by index
favourite_animals = ["cats", "dogs", "rabbits"]
print(favourite_animals[1]) # dogs
favourite_animals[0] = "parrots"
print(favourite_animals[0]) # parrots
# subsets (slices)
print(favourite_animals[1:3]) # ['dogs', 'rabbits']
print(favourite_animals[2:]) # ['rabbits']
print(favourite_animals[0:2]) # ['parrots', 'dogs']
# append
favourite_animals.append("bunnies")
# insert at index
favourite_animals.insert(1, "horses")
# remove
favourite_animals.remove("bunnies")
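To make the effect of append, insert, and remove visible, this sketch prints the list after each step. It starts from its own copy of the list so that it can be run in isolation.

```python
favourite_animals = ["parrots", "dogs", "rabbits"]

# append adds an item at the end
favourite_animals.append("bunnies")
print(favourite_animals)  # ['parrots', 'dogs', 'rabbits', 'bunnies']

# insert places an item at the given index and shifts the rest right
favourite_animals.insert(1, "horses")
print(favourite_animals)  # ['parrots', 'horses', 'dogs', 'rabbits', 'bunnies']

# remove deletes the first item equal to the given value
favourite_animals.remove("bunnies")
print(favourite_animals)  # ['parrots', 'horses', 'dogs', 'rabbits']
```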
5. Dictionaries
Dictionaries store key-value pairs and are surrounded by {}. A dictionary is a collection that is ordered (as of Python 3.7), changeable, and does not allow duplicate keys.
thisdict = {
    "brand": "Ford",
    "model": "Mustang",
    "year": 1964
}
# access, modify, delete
print(thisdict["brand"]) # Ford
print(thisdict["model"]) # Mustang
thisdict["year"] = 2020 # modify an existing value
del thisdict["year"] # delete the key and its value
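A few more common dictionary operations are sketched below. The name `car`, the extra key "color", and the fallback text "unknown" are assumptions made for this example.

```python
car = {"brand": "Ford", "model": "Mustang", "year": 1964}

car["year"] = 2020    # modify the value of an existing key
car["color"] = "red"  # assigning to a new key adds it (hypothetical key)

print("color" in car)               # True: membership tests look at keys
print(car.get("owner", "unknown"))  # unknown: get() avoids a KeyError

# items() yields each key together with its value
for key, value in car.items():
    print(key, value)

keys = list(car)  # keys come back in insertion order
print(keys)       # ['brand', 'model', 'year', 'color']
```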
Loops
With the while loop we can execute a set of statements as long as a condition is true.
i = 1
while i < 6:
    print(i)
    i += 1
A for loop is used for iterating over a sequence (that is, a list, a tuple, a dictionary, a set, or a string).
fruits = ["apple", "banana", "cherry"]
for x in fruits:
    print(x)
Looping through a string:
for x in "banana":
    print(x)
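Two built-ins that often appear alongside for loops are range() and enumerate(). They are not covered in the text above, so treat this as an optional sketch; the names `total` and `labels` are chosen for illustration.

```python
# range(1, 6) yields 1, 2, 3, 4, 5 (the stop value is excluded)
total = 0
for n in range(1, 6):
    total += n
print(total)  # 15

# enumerate() yields each item together with its index
fruits = ["apple", "banana", "cherry"]
labels = []
for index, fruit in enumerate(fruits):
    labels.append(f"{index}: {fruit}")
print(labels)  # ['0: apple', '1: banana', '2: cherry']
```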
File I/O
The simplest way to produce output is the print() function, to which you can pass zero or more expressions separated by commas.
print("Python is really a great language,", "isn't it?")
This produces the following result on your screen:
Python is really a great language, isn't it?
In Python 3, the built-in input() function reads a line of text from standard input, which by default comes from the keyboard. (Python 2 used raw_input() for this and treated print as a statement.)
user_input = input("Enter your input: ")
print("Received input is :", user_input)
If you type Hello Python! at the prompt, the session looks like this:
Enter your input: Hello Python!
Received input is : Hello Python!
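In Python 3, the built-in input() function always returns text, so a value that should be a number has to be converted before doing arithmetic with it. The sketch below uses a fixed string in place of real keyboard input so that it runs without a prompt; the helper `parse_quantity` is hypothetical.

```python
def parse_quantity(text):
    """Convert text, as returned by input(), into an int (hypothetical helper)."""
    # strip() drops surrounding whitespace before the conversion
    return int(text.strip())

# In an interactive session this would be: raw = input("Enter a quantity: ")
raw = " 42 "  # stand-in for what the user typed
quantity = parse_quantity(raw)

print("Received input is :", quantity)
print(quantity + 1)  # 43
```

If the text is not a valid whole number, int() raises a ValueError, which a real program would need to handle.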