Sachin
Posted on November 14, 2024
You have probably used functions from the os module in Python many times in your projects: to create a file, walk a directory tree, get information about the current directory, perform path operations, and more.
In this article, we'll discuss functions that are as useful as any other in the os module but are rarely used.
os.path.commonpath()
When working with multiple files that share a common directory structure, you might want to find the longest shared path. os.path.commonpath() does just that. This can be helpful when organizing files or dealing with different paths across environments.
Here’s an example:
import os
paths = ['/user/data/project1/file1.txt', '/user/data/project2/file2.txt']
common_path = os.path.commonpath(paths)
print("Common Path:", common_path)
This code will give us the common path shared by these two paths.
Common Path: /user/data
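It is worth contrasting this with the older os.path.commonprefix(), which compares strings character by character rather than path component by component. The sketch below uses hypothetical paths and the posixpath module (the POSIX flavour of os.path) only so that the result is the same on every operating system.

```python
import posixpath  # POSIX flavour of os.path, so results do not depend on the OS

# Hypothetical paths whose second component starts with the same letters
paths = ['/user/data/project1/file1.txt', '/user/database/file2.txt']

# commonprefix() works on characters and can stop in the middle of a name
char_prefix = posixpath.commonprefix(paths)
print(char_prefix)       # /user/data

# commonpath() works on whole path components
component_path = posixpath.commonpath(paths)
print(component_path)    # /user
```

Here commonprefix() returns /user/data, which looks like a real directory but does not contain the second file, while commonpath() returns the directory both files actually share.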
As you can see, os.path.commonpath() takes a sequence of path names, which is impractical to write out by hand when there are many of them.
In that case, it is better to walk the directory tree, collect the file paths, and then look for their common path.
import os

def get_file_paths(directory, file_extension=None):
    # Collect all file paths in the directory (and subdirectories, if any)
    file_paths = []
    for root, dirs, files in os.walk(directory):
        for file in files:
            if file_extension is None or file.endswith(file_extension):
                file_paths.append(os.path.join(root, file))
    return file_paths

# Specify the root directory to start from
directory_path = 'D:/SACHIN/Pycharm/Flask-Tutorial'
# If you want to filter by file extension
file_paths = get_file_paths(directory_path, file_extension='.html')
# Find the common path among all files
if file_paths:
    common_path = os.path.commonpath(file_paths)
    print("Common Path:", common_path)
else:
    print("No files found in the specified directory.")
In this example, get_file_paths() walks a directory from top to bottom and appends every file path it finds to the file_paths list. It optionally takes a file extension if we only want specific files.
Now we can easily find the common path of the files under any directory.
Common Path: D:\SACHIN\Pycharm\Flask-Tutorial\templates
os.scandir()
If you're using os.listdir() to get the contents of a directory, consider using os.scandir() instead. It returns an iterator of DirEntry objects, which expose each entry's name and path, whether it is a file, a directory, or a symbolic link, and its stat() information. When you need that extra information, this is usually faster than calling os.listdir() and then querying every entry separately.
Here’s an example:
import os

with os.scandir('D:/SACHIN/Pycharm/osfunctions') as entries:
    for entry in entries:
        print(f"{entry.name} : \n"
              f">>>> Is File: {entry.is_file()} \n"
              f">>>> Is Directory: {entry.is_dir()}")
In this example, we passed a directory to os.scandir(), iterated over its entries, and printed the information for each one.
.idea :
>>>> Is File: False
>>>> Is Directory: True
main.py :
>>>> Is File: True
>>>> Is Directory: False
sample.py :
>>>> Is File: True
>>>> Is Directory: False
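DirEntry objects also give access to stat() results. As a minimal sketch, the hypothetical helper list_file_sizes() below collects the size of every regular file in a directory; the temporary directory and file names exist only to make the example self-contained.

```python
import os
import tempfile

def list_file_sizes(directory):
    """Return {file name: size in bytes} for the regular files in directory."""
    sizes = {}
    with os.scandir(directory) as entries:
        for entry in entries:
            # follow_symlinks=False reports on the entry itself, not a link target
            if entry.is_file(follow_symlinks=False):
                sizes[entry.name] = entry.stat(follow_symlinks=False).st_size
    return sizes

# Self-contained demo in a throwaway directory
with tempfile.TemporaryDirectory() as tmp:
    with open(os.path.join(tmp, 'main.py'), 'w') as f:
        f.write('print("hi")')          # 11 characters
    os.mkdir(os.path.join(tmp, 'pkg'))  # directories are skipped
    demo_sizes = list_file_sizes(tmp)

print(demo_sizes)  # {'main.py': 11}
```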
os.path.splitext()
Let's say you're working with files and need to check their extensions. The os.path.splitext() function can help: it splits a file path into the root and the extension, which lets you determine the file type.
import os

filename = 'report.csv'
root, ext = os.path.splitext(filename)
print(f"Root: {root} \n"
      f"Extension: {ext}")
Output
Root: report
Extension: .csv
Now let's look at how os.path.splitext() behaves with some less obvious file names.
import os

filenames = ['.report', 'report', 'report.case.txt', 'report.csv.zip']
for idx, name in enumerate(filenames):
    root, ext = os.path.splitext(name)
    print(f"{idx} - {name}\n"
          f"Root: {root} | Extension: {ext}")
Output
0 - .report
Root: .report | Extension:
1 - report
Root: report | Extension:
2 - report.case.txt
Root: report.case | Extension: .txt
3 - report.csv.zip
Root: report.csv | Extension: .zip
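Building on this behaviour, here is a small sketch of an extension check. The helper name has_extension() and the sample file names are made up for illustration; the check is case-insensitive and, like os.path.splitext(), only looks at the last extension.

```python
import os

def has_extension(path, allowed):
    """Return True if path ends with one of the allowed extensions (case-insensitive)."""
    _, ext = os.path.splitext(path)
    return ext.lower() in {item.lower() for item in allowed}

print(has_extension('Report.CSV', ['.csv', '.tsv']))  # True
print(has_extension('report.csv.zip', ['.csv']))      # False: only '.zip' counts
print(has_extension('.report', ['.report']))          # False: a leading dot is not an extension
```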
os.makedirs()
os.mkdir() is the function most of us reach for to create a directory, but it only makes one directory at a time and fails if a parent directory is missing, which makes nested directories a hassle.
os.makedirs() creates all the missing intermediate directories in one go, and the exist_ok=True argument makes sure it doesn't raise an error if the directory already exists.
import os
os.makedirs('project/data/files', exist_ok=True)
print("Nested directories created!")
When we run this program, it creates the specified directory along with any missing parent directories.
Nested directories created!
If we run the above program again, it won't raise an error, thanks to exist_ok=True.
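One caveat is that exist_ok=True only tolerates an existing directory. If an ordinary file already occupies the target path, os.makedirs() still raises FileExistsError. The sketch below demonstrates this in a temporary directory; the path names are hypothetical.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, 'project', 'data')

    # Put an ordinary file where the directory is supposed to go
    os.mkdir(os.path.join(tmp, 'project'))
    with open(target, 'w') as f:
        f.write('not a directory')

    try:
        os.makedirs(target, exist_ok=True)
        outcome = 'created'
    except FileExistsError:
        outcome = 'blocked by a file'

print(outcome)  # blocked by a file
```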
os.replace()
Similar to os.rename(), os.replace() renames a file or moves it to a new location, but it overwrites an existing file at the destination on every platform, whereas os.rename() raises an error on Windows if the destination already exists. This is helpful for tasks where you're updating or backing up files and want to ensure that old files are safely replaced.
import os
os.replace(src='main.py', dst='new_main.py')
print("File replaced successfully!")
In this code, main.py is renamed to new_main.py, just as os.rename() would do, but the operation is all or nothing. When it succeeds, the replacement happens in a single, indivisible step (an atomic operation, as POSIX requires), so either the entire operation succeeds or nothing changes at all. Note that it may fail if the source and destination are on different filesystems.
File replaced successfully!
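A common use of this all-or-nothing behaviour is to write new content to a temporary file and then swap it into place. The following is a minimal sketch of that pattern, not a complete solution; the helper name write_atomically() and the file names are assumptions made for the example.

```python
import os
import tempfile

def write_atomically(path, text):
    """Write text to path so readers see either the old or the new content."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temporary file next to the target so both share a filesystem
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix='.tmp')
    try:
        with os.fdopen(fd, 'w') as tmp_file:
            tmp_file.write(text)
        os.replace(tmp_path, path)  # Swap the finished file into place
    except BaseException:
        # Remove the temporary file if anything went wrong, then re-raise
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise

with tempfile.TemporaryDirectory() as tmp:
    config = os.path.join(tmp, 'settings.txt')
    write_atomically(config, 'version=1')
    write_atomically(config, 'version=2')  # Overwrites the existing file
    with open(config) as f:
        final_text = f.read()
    leftovers = os.listdir(tmp)

print(final_text)  # version=2
print(leftovers)   # ['settings.txt']
```

Keeping the temporary file in the same directory as the target is a deliberate choice here, since a rename across filesystems may fail.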
os.urandom()
For cryptographic purposes, you need a secure source of random data. os.urandom() generates random bytes suitable for things like random IDs, tokens, or passwords. Unlike the random module, whose generator is predictable and not designed for security, it is appropriate for sensitive data.
os.urandom() draws on randomness that the operating system collects from various sources, which makes the bytes unpredictable.
On Windows, it uses BCryptGenRandom() to generate random bytes.
import os
secure_token = os.urandom(16) # 16 bytes of random data
print("Secure Token:", secure_token)
# Making it human-readable
print("Secure Token:", secure_token.hex())
Output
Secure Token: b'\x84\xd6\x1c\x1bKB\x7f\xcd\xf6\xb7\xc4D\x92z\xe3{'
Secure Token: 84d61c1b4b427fcdf6b7c444927ae37b
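Hex is not the only readable encoding. As a sketch, the hypothetical helper make_token() below turns the random bytes into a shorter, URL-safe string with Base64; the standard library's secrets module offers similar ready-made helpers such as secrets.token_urlsafe().

```python
import base64
import os

def make_token(num_bytes=16):
    """Return a URL-safe text token built from num_bytes of OS randomness."""
    raw = os.urandom(num_bytes)
    # URL-safe Base64 uses only letters, digits, '-' and '_'; drop the '=' padding
    return base64.urlsafe_b64encode(raw).rstrip(b'=').decode('ascii')

token = make_token()
print("Token:", token)  # 22 characters for 16 bytes; the value differs on every run
```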
os.path.samefile()
The os.path.samefile() function in Python is used to check if two paths refer to the same file or directory on the filesystem. It's particularly helpful in scenarios where multiple paths might point to the same physical file, such as when dealing with symbolic links, hard links, or different absolute and relative paths to the same location.
import os
is_same = os.path.samefile('/path/to/file1.txt', '/different/path/to/symlink_file1.txt')
print("Are they the same file?", is_same)
os.path.samefile() returns True only if both paths refer to the same file or directory on disk, for example through a hard link or a symlink. It relies on the information reported by os.stat() for each path, so it raises an exception if either path does not exist.
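The paths in the snippet above are placeholders, so here is a self-contained sketch that can actually be run. It creates a file in a temporary directory and compares two different spellings of its path; all names are made up for the example.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    os.mkdir(os.path.join(tmp, 'sub'))
    direct = os.path.join(tmp, 'notes.txt')
    with open(direct, 'w') as f:
        f.write('hello')

    # A longer spelling of the same location: into sub/ and back out again
    roundabout = os.path.join(tmp, 'sub', '..', 'notes.txt')

    strings_equal = direct == roundabout              # Plain string comparison
    same_file = os.path.samefile(direct, roundabout)  # Filesystem comparison

    try:
        os.path.samefile(direct, os.path.join(tmp, 'missing.txt'))
        missing_result = 'no error'
    except FileNotFoundError:
        missing_result = 'FileNotFoundError'

print(strings_equal, same_file, missing_result)  # False True FileNotFoundError
```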
os.path.relpath()
os.path.relpath() computes the relative path from one location to another. This is particularly useful when building file paths dynamically.
Consider the following example:
import os
# Target file path
target_path = "D:/SACHIN/Pycharm/osfunctions/project/engine/log.py"
# Starting point
start_path = "D:/SACHIN/Pycharm/osfunctions/project/interface/character/specific.py"
relative_path = os.path.relpath(target_path, start=start_path)
print(relative_path)
In this example, target_path is the path we want to reach, and start_path is the location from which the relative path to target_path is calculated.
When we run this, we get the following output.
..\..\..\engine\log.py
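os.path.relpath() treats start as a directory, so specific.py itself counts as one level. That is why we go up three levels (specific.py, character, and interface) and then down to engine/log.py.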
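If what you want is the path as seen from the folder that contains the starting file, one option is to pass that folder as start. The sketch below assumes a hypothetical project layout and compares both calls.

```python
import os

# Hypothetical project layout, built with os.path.join so it works on any OS
target_path = os.path.join(os.sep, 'project', 'engine', 'log.py')
start_file = os.path.join(os.sep, 'project', 'interface', 'character', 'specific.py')

# Passing the file itself counts "specific.py" as a directory level
from_file = os.path.relpath(target_path, start=start_file)

# Passing the file's folder gives the path as seen from that folder
from_folder = os.path.relpath(target_path, start=os.path.dirname(start_file))

print(from_file)    # three ".." steps, then engine/log.py
print(from_folder)  # two ".." steps, then engine/log.py
```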
os.fsync()
When we write to a file with file.write(), the data isn't saved to disk instantly. It first sits in Python's internal buffer and then in the operating system's cache, and if something unexpected happens before it reaches the disk, the data is lost.
os.fsync() forces the operating system to write the file's data to disk, ensuring data integrity. It's especially useful in logging or when writing critical data that must not be lost.
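import os

with open('data.txt', 'w') as f:
    f.write("gibberish!")
    f.flush()  # Hand Python's internal buffer over to the operating system
    os.fsync(f.fileno())  # Ensures data is written to disk
f.flush() first passes the data in Python's own buffer to the operating system, and os.fsync(f.fileno()) then makes sure it is written to the disk and not left in the system cache. Without the flush, the data might still be sitting in Python's buffer when os.fsync() runs.
os.fsync() takes a file descriptor, which is why we passed f.fileno(), the integer the operating system assigned to the open file.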
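For the logging use case, this could be wrapped in a small helper. The following is only a sketch: the name append_durably() and the log file are assumptions, and it follows the Python documentation's advice to call f.flush() before os.fsync() on a buffered file object.

```python
import os
import tempfile

def append_durably(path, line):
    """Append one line to path and wait until the OS has written it out."""
    with open(path, 'a', encoding='utf-8') as f:
        f.write(line + '\n')
        f.flush()             # Python's buffer -> operating system
        os.fsync(f.fileno())  # Operating system cache -> storage device

with tempfile.TemporaryDirectory() as tmp:
    log_path = os.path.join(tmp, 'events.log')
    append_durably(log_path, 'job started')
    append_durably(log_path, 'job finished')
    with open(log_path, encoding='utf-8') as f:
        logged_lines = f.read().splitlines()

print(logged_lines)  # ['job started', 'job finished']
```

Syncing after every line trades speed for safety, so it is best kept for records that really must survive a crash.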
os.get_terminal_size()
If you're creating CLI tools, formatting the output to fit the terminal width can make the output cleaner. os.get_terminal_size() gives you the current terminal width and height, making it easy to dynamically format content.
import os
size = os.get_terminal_size()
print(f"Terminal Width: {size.columns}, Terminal Height: {size.lines}")
When we run this code in the terminal, we get the size of the terminal on which we are running this script.
PS > py sample.py
Terminal Width: 158, Terminal Height: 12
Note: You may get an OSError when running the script directly in an IDE whose output console isn't attached to a real terminal.
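When the script may run without a real terminal, one option is shutil.get_terminal_size(), a higher-level wrapper that returns a fallback size instead of raising. The sketch below uses it to build a separator line; the helper name separator() is made up for the example.

```python
import shutil

def separator(char='-'):
    """Return a line of char spanning the terminal width (80 columns if unknown)."""
    columns = shutil.get_terminal_size(fallback=(80, 24)).columns
    return char * columns

line = separator('=')
print(line)
```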
That’s all for now.
Keep Coding✌✌.