GBase 8c Distributed Cluster Backup Principles and Impact

Cong Li

Posted on October 10, 2024


GBase 8c is a multi-model distributed database that supports row-based, column-based, and in-memory storage, and can be deployed standalone, as a primary/standby pair, or as a distributed cluster. To keep data safe, the database cluster must be backed up regularly. This article introduces the execution principles of logical and physical backups in a distributed cluster and their impact on the cluster while a backup is running.

1. Logical Backup

The logical backup in the GBase 8c database is mainly implemented through the built-in gs_dump tool.

gs_dump is a tool for exporting database information from GBase 8c. Users can export an entire database or selected objects (schemas, tables, views, etc.); objects in the recycle bin are excluded. The exported database can be the default postgres database or a user-created one. The tool is run as the operating system user gbase.
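
A typical invocation looks like the following sketch. The flag syntax follows the openGauss-style gs_dump interface that GBase 8c inherits; the port, database names, schema/table names, and file paths are illustrative:

```bash
# Run as OS user gbase against the CN; -F p selects plain-text SQL output.
gs_dump -p 5432 postgres -f /home/gbase/backup/postgres_full.sql -F p

# Narrow the dump to one schema and one table with -n / -t.
gs_dump -p 5432 mydb -n my_schema -t my_table \
        -f /home/gbase/backup/my_table.sql -F p
```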

1.1 Execution Principles of gs_dump

The execution principles of gs_dump are as follows:

  1. gs_dump establishes a connection with the CN (Coordinator Node).
  2. gs_dump starts a transaction and sets the isolation level to "Repeatable Read" to ensure consistency in the dumped data.
  3. gs_dump accesses system tables and collects the objects to be backed up based on the input parameters (database name, schema name, table name, etc.).
  4. It executes LOCK TABLE table_name IN ACCESS SHARE MODE for all tables to be exported, applying a read lock on the tables.
  5. It executes the COPY table_name TO stdout command for all tables to be exported.
  6. The CN pushes the COPY command to all DNs (Data Nodes) for execution.
  7. DNs execute the COPY command, perform a full table scan using the snapshot taken at the start of the transaction, and send the data to the CN.
  8. The CN forwards the data received from the DNs to gs_dump.
  9. gs_dump writes the received data to the backup file.
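
Conceptually, steps 2 through 5 amount to a statement sequence like the sketch below, shown here through gsql purely for illustration; the table name public.t1 is hypothetical:

```bash
# Rough equivalent of the statements gs_dump drives over its CN connection.
gsql -p 5432 -d postgres <<'EOF'
BEGIN;
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
LOCK TABLE public.t1 IN ACCESS SHARE MODE;
COPY public.t1 TO STDOUT;  -- the CN pushes this COPY down to every DN
COMMIT;
EOF
```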

1.2 Impact of gs_dump on the Database

1.2.1 Increased CPU Load

While gs_dump runs, it occupies one service thread on the CN and one on each DN. The work amounts to a full table scan of every exported table, so a single thread's impact on the CPU is minimal.

1.2.2 Increased IO Load

The impact on IO includes both network IO and disk IO.

  • Network IO Impact: All table data flows from the DNs to the CN, and from the CN on to the gs_dump process. In a single-threaded scenario the impact on network IO is minimal.
  • Disk IO Impact: On the DN side, all table data is read into the buffer pool, and the impact of a single thread is minimal. On the gs_dump side, all received table data is written to the backup file, and the impact is likewise minimal in a single-threaded scenario.
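
To observe this load while a dump is running, the COPY sessions can be watched on the CN. A minimal sketch, assuming the openGauss-style pg_stat_activity view that GBase 8c inherits:

```bash
# List active COPY sessions driven by the running gs_dump.
gsql -p 5432 -d postgres -c \
  "SELECT pid, usename, state, query
     FROM pg_stat_activity
    WHERE query LIKE 'COPY%';"
```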

1.2.3 Impact on Business Operations

  • Table Locking Impact: Before copying data, gs_dump locks every table to be exported with the LOCK TABLE command, and the locks are released only after all COPY operations finish. With a large volume of data, gs_dump can run for a long time, so tables stay locked for a long time. While a table is locked, operations such as DROP TABLE, TRUNCATE TABLE, ALTER TABLE, and VACUUM FULL on it are blocked. Solutions (choose one; see the sketches after this list):
  • Use the --non-lock-table parameter. When set, gs_dump does not execute LOCK TABLE up front, so it does not block DDL; however, if a table is dropped while gs_dump is running, that table's data will not be exported.
  • Use the --exclude-table-file parameter to skip specific tables. The file supports wildcards, so volatile intermediate tables can be excluded.
  • Group all intermediate tables that trigger DDL blocking into a dedicated schema, and have gs_dump skip that schema.
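
Hedged sketches of the three options; the schema and table names (staging, tmp_*) and file paths are hypothetical:

```bash
# Option 1: skip the up-front LOCK TABLE step (tables dropped mid-dump
# will simply be missing from the output).
gs_dump -p 5432 mydb -f /home/gbase/backup/mydb.sql --non-lock-table

# Option 2: exclude volatile intermediate tables listed in a file
# (one pattern per line; wildcards are supported).
echo 'staging.tmp_*' > /home/gbase/backup/exclude.list
gs_dump -p 5432 mydb -f /home/gbase/backup/mydb.sql \
        --exclude-table-file=/home/gbase/backup/exclude.list

# Option 3: keep intermediate tables in one schema and skip that schema.
gs_dump -p 5432 mydb -f /home/gbase/backup/mydb.sql -N staging
```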

  • Impact on Historical Data Cleanup:

    gs_dump uses the Repeatable Read isolation level and reads every table through the snapshot taken when it started. Dead tuples generated after that point therefore cannot be removed by the vacuum threads until the dump finishes. If gs_dump runs for a long time and the tables are updated frequently in the meantime, the table files may gradually grow.

    Solution: Avoid frequent updates to these tables while a long gs_dump is running (ordinary update volumes usually have no noticeable impact).
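
A long-running dump shows up as a long-lived transaction. A minimal sketch for spotting such snapshot holders, again assuming the openGauss-style pg_stat_activity view:

```bash
# Find sessions whose transactions have been open the longest.
gsql -p 5432 -d postgres -c \
  "SELECT pid, usename, xact_start, now() - xact_start AS duration
     FROM pg_stat_activity
    WHERE xact_start IS NOT NULL
    ORDER BY xact_start;"
```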

2. Physical Backup

2.1 Physical Backup Principles

Physical backups in the GBase 8c database include base backups and log archiving. A base backup involves copying all database files to a backup server. Log archiving involves copying newly generated xlog files to the backup server using the scp command. Log files can only be used with a corresponding base backup.
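
As a rough sketch, archiving of this kind is typically wired up through the archive_command parameter. The gs_guc syntax, data directory, and backup host below follow openGauss conventions and are illustrative:

```bash
# Enable WAL archiving and ship each completed xlog segment via scp.
gs_guc reload -D /gbase/data/dn1 -c "archive_mode = on"
gs_guc reload -D /gbase/data/dn1 \
       -c "archive_command = 'scp %p backup@bk-server:/backup/archive/%f'"
```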

Physical backups in the GBase 8c distributed database are performed node by node, and only the files of the primary nodes are backed up. This includes the GTM primary node, the primary CN, and the primary DN nodes. Each primary node has a data-sending thread that corresponds to a data-receiving thread on the backup server. Physical backup is essentially data copying, where files are copied from the primary node to the backup server. The physical backup process does not interact with the database and does not affect database operations.
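
One way to drive the per-node base backup is the openGauss-style gs_basebackup tool, assuming the GBase 8c installation ships it; host, port, and target directory are illustrative, and the command is repeated for each primary node:

```bash
# Pull a full base backup of one primary DN onto the backup server.
gs_basebackup -D /backup/dn1 -h dn1-primary-host -p 40000
```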

2.2 Impact of Physical Backup on the Database

2.2.1 Increased CPU Load

During a base backup, all files on the primary nodes are copied, and each primary node starts a data-sending thread that runs until every file has been sent; the CPU consumption is comparable to transferring the same files with the scp command. During log archiving, the Arch thread sends xlog files, and since it spends most of its time sleeping, it consumes minimal CPU resources.

2.2.2 Increased IO Load

A base backup transfers a large volume of data, equivalent to copying the database files from one server to another with the scp command, and can consume significant network bandwidth. Base backups should therefore be avoided during peak business hours. Log archiving copies only newly generated files, so its impact on IO load is minimal.
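
If a base backup cannot be kept entirely out of busy hours, the transfer itself can be throttled. A sketch using scp's -l flag, which caps bandwidth in Kbit/s (the path and host are illustrative):

```bash
# Cap the copy at roughly 100 Mbit/s to leave headroom for business traffic.
scp -l 100000 /gbase/data/dn1/base.tar backup@bk-server:/backup/dn1/
```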

2.2.3 Impact on Business Operations

The physical backup process does not interact with the GBase database and does not affect database operations.
