Optimizing PostgreSQL Buffer Cache: Automating Analysis with a Bash Script

dm8ry

Dmitry Romanoff

Posted on September 7, 2024

Optimizing PostgreSQL Buffer Cache: Automating Analysis with a Bash Script

Image description

Efficient buffer cache management is a crucial aspect of optimizing PostgreSQL database performance. The buffer cache plays a vital role in holding data pages in memory, thereby reducing disk I/O operations and improving query execution times. Understanding how buffer cache is allocated across different databases can provide valuable insights into optimizing performance and ensuring effective resource utilization.

In this article, we’ll walk through a practical Bash script that automates the process of checking buffer cache allocation for all databases on a PostgreSQL instance. The script simplifies the analysis and helps database administrators identify which tables are consuming the most buffer cache. This can be particularly useful for performance tuning and resource optimization.

Script Overview
The provided Bash script performs the following tasks:

  1. Retrieves Database List: It connects to the PostgreSQL instance and retrieves the list of databases currently in use.
  2. Collects Buffer Cache Information: For each database, the script queries buffer cache information to determine how much cache is allocated to different tables.
  3. Formats and Reports: The script formats the output for readability and displays a report of buffer cache usage.

How the Script Works
Let’s break down the script into its key components:

1. Setup and Initialization:

#!/bin/bash

# Script to check buffer cache allocation for all the databases on some db instance

nPort=$1

current_date_time="$(date +'%Y%m%d_%H%M%S')" 

fileOne="./temp_output/check_buffer_cache_${current_date_time}.tmp"

fileTwo="./temp_output/check_buffer_cache_${current_date_time}_formatted.tmp"

echo " "
echo "Started..."
echo " "
Enter fullscreen mode Exit fullscreen mode

The script begins by setting up the necessary variables, including the port number (nPort) and file paths for storing temporary output files. It also captures the current date and time for timestamping the output files.

2. Iterating Over Databases:

for i in `psql -h localhost -p ${nPort} -U postgres -t -c "select datname from pg_database where datname in (select datname from pg_stat_activity) order by 1"`; do 

echo "Collecting buffercache info about database: $i"
Enter fullscreen mode Exit fullscreen mode

The script queries the list of databases that are currently active and iterates over each database. For each database, it collects buffer cache information.

3. Collecting Buffer Cache Information:

psql -h localhost -p ${nPort} -U $i -d $i -qt <<SQL >> ${fileOne}

\\c ${i};

SELECT '${i}' as db_name, n.nspname, c.relname, count(*) AS buffers
             FROM pg_buffercache b JOIN pg_class c
             ON b.relfilenode = pg_relation_filenode(c.oid) AND
                b.reldatabase IN (0, (SELECT oid FROM pg_database
                                      WHERE datname = current_database()))
             JOIN pg_namespace n ON n.oid = c.relnamespace and n.nspname ='public'
             GROUP BY n.nspname, c.relname
             ORDER BY 4 DESC
             LIMIT 10;

SQL
Enter fullscreen mode Exit fullscreen mode

For each database, the script executes a SQL query to retrieve buffer cache usage. It joins several PostgreSQL system catalog tables to get details about buffer allocation for each table, filtering to include only those in the ‘public’ schema.

4. Formatting and Displaying the Report:

cat ${fileOne} | awk -F"|" ' { if (NF==4) {  printf("%-40s %-20s %-50s %s\n",$1, $2, $3, $4); } } ' | sort -nrk4 > ${fileTwo}

echo " "
echo "Report:"
echo " "

cat ${fileTwo}

echo " "
echo "Ended..."
echo " "
Enter fullscreen mode Exit fullscreen mode

After collecting the data, the script formats it using awk to ensure readability. It sorts the output by the number of buffers in descending order and saves the result to another temporary file. Finally, it displays the formatted report.

Running the Script

To execute this script, you need to pass the port number of the PostgreSQL instance as an argument:

./check_buffer_cache.sh 5432
Enter fullscreen mode Exit fullscreen mode

Replace 5432 with the port number of your PostgreSQL instance. The script will generate and display a report of buffer cache usage for all active databases.

Conclusion

This script is a handy tool for PostgreSQL administrators looking to gain insights into buffer cache allocation. By automating the collection and formatting of buffer cache data, the script helps in identifying performance bottlenecks and optimizing resource usage across multiple databases. For best results, consider running this script during different periods to monitor changes in cache usage and make informed decisions about database tuning.

Here’s the full script for your reference:

#!/bin/bash

# Script to check buffer cache allocation for all the databases on some db instance

nPort=$1

current_date_time="$(date +'%Y%m%d_%H%M%S')" 

fileOne="./temp_output/check_buffer_cache_${current_date_time}.tmp"

fileTwo="./temp_output/check_buffer_cache_${current_date_time}_formatted.tmp"

echo " "
echo "Started..."
echo " "

for i in `psql -h localhost -p ${nPort} -U postgres -t -c "select datname from pg_database where datname in (select datname from pg_stat_activity) order by 1"`; do 

echo "Collecting buffercache info about database: $i"
#echo " "

psql -h localhost -p ${nPort} -U $i -d $i -qt <<SQL >> ${fileOne}

\\c ${i};

SELECT '${i}' as db_name, n.nspname, c.relname, count(*) AS buffers
             FROM pg_buffercache b JOIN pg_class c
             ON b.relfilenode = pg_relation_filenode(c.oid) AND
                b.reldatabase IN (0, (SELECT oid FROM pg_database
                                      WHERE datname = current_database()))
             JOIN pg_namespace n ON n.oid = c.relnamespace and n.nspname ='public'
             GROUP BY n.nspname, c.relname
             ORDER BY 4 DESC
             LIMIT 10;

SQL

done

cat ${fileOne} | awk -F"|" ' { if (NF==4) {  printf("%-40s %-20s %-50s %s\n",$1, $2, $3, $4); } } ' | sort -nrk4 > ${fileTwo}

echo " "
echo "Report:"
echo " "

cat ${fileTwo}

echo " "
echo "Ended..."
echo " "
Enter fullscreen mode Exit fullscreen mode
💖 💪 🙅 🚩
dm8ry
Dmitry Romanoff

Posted on September 7, 2024

Join Our Newsletter. No Spam, Only the good stuff.

Sign up to receive the latest update from our blog.

Related