GBase 8c Streaming Replication Parameter Configuration
Cong Li
Posted on July 23, 2024
GBase 8c uses streaming replication for data synchronization between primary and standby nodes in both primary-standby and distributed deployment modes. Streaming replication, introduced in PostgreSQL 9.0, transmits WAL logs from the primary to the standby as soon as they are generated.
Streaming replication consists of two main phases: instance recovery and primary-standby synchronization.
Instance Recovery Phase
When the pg database crashes, upon restart, it replays all WAL logs from the last checkpoint before the crash to restore the database to a consistent state. During the setup of the standby database, the primary database usually remains online, causing the backup to be in an inconsistent state. When the standby starts, the startup process performs instance recovery.Primary-Standby Synchronization Phase
Thewal receiver
process sends the standby's latest LSN to the primary. Thewal sender
then sends the WAL logs from the primary's latest LSN to the standby's latest LSN. Thewal receiver
receives the WAL logs and writes them to disk, while the startup process replays the WAL logs on the standby.
Below are the main parameters involved in GBase 8c streaming replication:
1. Sending Server Parameters
max_wal_senders
-
Description: Specifies the maximum number of concurrent connections for transaction log sending processes. Should not be greater than
max_connections
. - Parameter Type: POSTMASTER
-
Note:
wal_level
must be set toarchive
orhot_standby
to allow standby connections. - Range: Integer, 0-1024
- Suggested Range: 8-100
- Default: 16
wal_keep_segments
-
Description: Number of Xlog segments to keep. Sets the minimum number of transaction log files to retain in the
pg_xlog
directory for streaming replication. - Parameter Type: SIGHUP
- Range: Integer, 2 - INT_MAX
- Default: 16
-
Recommendations:
- May need to be increased for large data imports (e.g., 50G data may require a setting of 1000).
- Should be increased when
synchronous_commit
level is belowLOCAL_FLUSH
to avoid rebuilding failures.
wal_sender_timeout
- Description: Maximum wait time for the transaction log receiver to receive logs.
- Parameter Type: SIGHUP
- Range: Integer, 0 - INT_MAX (ms)
- Default: 6s
logical_sender_timeout
- Description: Maximum wait time for the logical log receiver to receive logs.
- Parameter Type: USERSET
- Range: Integer, 0 - INT_MAX (ms)
- Default: 30s
max_replication_slots
- Description: Sets the number of replication slots on the primary server.
- Parameter Type: POSTMASTER
- Range: Integer, 0-1024
- Default: 8
enable_slot_log
- Description: Enables logical replication slot synchronization between primary and standby.
- Parameter Type: USERSET
- Range: Boolean
- Default: off
max_changes_in_memory
- Description: Maximum memory size for caching a single transaction during logical decoding.
- Parameter Type: POSTMASTER
- Range: Integer, 1-2147483647
- Default: 4096
max_cached_tuplebufs
- Description: Maximum memory size for caching total tuple information during logical decoding.
- Parameter Type: POSTMASTER
- Range: Integer, 1-2147483647
- Default: 8192
logical_decode_options_default
- Description: Specifies the global default values for logical decoding options when not explicitly specified.
- Parameter Type: SIGHUP
- Range: Comma-separated key=value strings
- Default: ""
enable_wal_shipping_compression
- Description: Enables WAL log compression in streaming disaster recovery mode.
- Parameter Type: SIGHUP
- Range: Boolean
- Default: false
repl_auth_mode
- Description: Sets the authentication mode for primary-standby replication and standby rebuilds.
- Parameter Type: SIGHUP
- Range: Enum (off, default, uuid)
- Default: default
replconninfo1
- replconninfo8
- Description: Sets the information for the first to eighth nodes for listening and authentication.
- Parameter Type: SIGHUP
- Range: String
- Default: ""
cross_cluster_replconninfo1
- cross_cluster_replconninfo8
- Description: Sets the information for the first to eighth nodes for cross-cluster listening and authentication.
- Parameter Type: SIGHUP
- Range: String
- Default: ""
available_zone
- Description: Sets the zone information for the local node.
- Parameter Type: POSTMASTER
- Range: String
- Default: ""
max_keep_log_seg
- Description: Flow control parameter. Limits the number of unparsed physical log files during logical replication.
- Parameter Type: USERSET
- Range: Integer, 0 - 2147483647
- Default: 0
2. Primary Server Parameters
synchronous_standby_names
- Parameter Description: List of names of potential synchronous standby servers, separated by commas. This parameter belongs to the SIGHUP type.
- Notes:
- The current synchronous standby server connected is the first name in the list. If the current synchronous standby server loses connection, it will immediately switch to the next higher priority standby server and put the name of this standby server in the list.
- The standby server name can be specified by setting the environment variable
PGAPPNAME
. -
Value Range: String. When the value is
*
, it matches any provided synchronous standby server name. The configuration supports the following formats:ANY num_sync(standby_name [, …]) [, ANY num_sync (standby_name [, …])]
[FIRST] num_sync (standby_name [, …])
standby_name [, …]
Explanation:
-
num_sync
is the number of synchronous standby servers that the transaction needs to wait for a response from.standby_name
is the name of the standby server.FIRST
andANY
specify the strategy for selecting synchronous standby servers from the listed servers. -
ANY N (node1,node2,…)
means that any N host names within the parentheses are selected as the list of synchronous standby server names. For example,ANY 1 (node1,node2)
means that either node1 or node2 is selected as the synchronous standby server name. -
ANY N1 (node1,node2,…), ANY N2 (node3,node4,…)
means grouped potential synchronous standby server names. Within the first group of parentheses, any N1 host names are selected as the first group of synchronous standby server names. Within the second group of parentheses, any N2 host names are selected as the second group of synchronous standby server names. At this point, the groups have an AND relationship. Both groups must meet their respective synchronous standby server numbers for the local transaction to be committed. -
FIRST N (node1,node2,…)
means selecting the first N host names in the order they appear as the list of synchronous standby server names. For example,FIRST 1 (node1,node2)
means selecting node1 as the synchronous standby server name. -
node1,node2,…
andFIRST 1 (node1,node2,…)
have the same meaning.
If you use the gs_guc
tool to set this parameter, you need to set it as follows:
gs_guc reload -Z datanode -N @NODE_NAME@ -D @DN_PATH@ -c "synchronous_standby_names='ANY NODE 1(dn_instanceId1, dn_instanceId2)'";
or:
gs_guc reload -Z datanode -N @NODE_NAME@ -D @DN_PATH@ -c "synchronous_standby_names='ANY 1(AZ1, AZ2)'";
-
Default Value:
*
- Notes:
- Duplicate names should not appear in the standby server name list. The
num_sync
in the configuration should not be greater than the number of the standby server list. - For multi-group synchronous standby server configurations like
ANY N1 (node1,node2,…), ANY N2 (node3,node4,…)
, the groups have an AND relationship. Currently, only multi-ANY grouping is supported. Using*
for fuzzy matching or having duplicate standby servers in the configuration is not allowed.
most_available_sync
-
Parameter Description: When there is a synchronous standby failure, the host transactions will not be blocked by the synchronous standby failure. For example, if there are two synchronous standby servers, one fails and the other is normal, then the host transactions will only wait for the normal standby server and will not be blocked by the failed standby server. Another example is using the quorum protocol with one master and three synchronous standbys, configured as
ANY 2(node1,node2,node3)
. If node1 and node3 fail while node2 is normal, the host transactions will not be blocked. - This parameter belongs to the SIGHUP type.
- Value Range: Boolean
-
on
means the host is not blocked when there is a synchronous standby failure. -
off
means the host is blocked when there is a synchronous standby failure. -
Default Value:
off
keep_sync_window
- Parameter Description: The delay time to enter the maximum available mode.
- When the
most_available_sync
is configured ason
and in a master-slave scenario, if a synchronous standby failure causes the current configuration of the number of synchronous standby servers (see the meaning ofsynchronous_standby_name
for details) to be unmet, if thekeep_sync_window
parameter is configured, the host will continue to maintain the maximum protection mode, blocking the host's transaction commit, and delaying the entry into the maximum available mode for the time set by thekeep_sync_window
. - If the synchronous standby server recovers within the
keep_sync_window
timeout window and meets the current configuration of the number of synchronous standby servers, the transaction will not be blocked, and the system will return to normal status. - If setting
keep_sync_window
, it is recommended to set a minimum of 5 seconds to avoid false positives for network instability by the monitoring system. - This parameter belongs to the SIGHUP type.
- Value Range: Integer, range 0~INT_MAX, in seconds.
-
0
means not setting akeep_sync_window
timeout window, i.e., directly entering the maximum available mode. - Others mean the size of the
keep_sync_window
timeout window. -
Default Value:
0
- Notes:
- Configuring this parameter may affect the RPO. If the host fails within the configured timeout window, data from the beginning of the block to the host failure time window may be lost.
enable_stream_replication
- Parameter Description: Controls whether data and log synchronization is performed between master and standby, master and slave.
- This parameter belongs to the SIGHUP type. Notes:
- This parameter is for performance testing, used to test performance parameters with and without standby servers. When this parameter is closed, switching, fault, and other abnormal scenarios cannot be tested, otherwise, there will be inconsistencies between the master and standby servers.
- This parameter is a controlled parameter and is not recommended to be turned off in normal business scenarios.
- The current version does not support master-slave deployment mode by default.
- Value Range: Boolean
-
on
means enabling master-standby, master-slave synchronization. -
off
means disabling master-standby, master-slave synchronization. -
Default Value:
on
enable_mix_replication
- Parameter Description: Controls the way WAL logs and data are replicated between master and standby, master and slave.
- This parameter belongs to the INTERNAL type parameter, the default value is
off
, and external modifications are not allowed. - Notes:
- This parameter is not allowed to be changed in normal business scenarios, i.e., closing the WAL log and data page mixed replication mode.
- The current version does not support master-slave deployment mode by default.
- Value Range: Boolean
-
on
means enabling WAL log and data page mixed replication mode. -
off
means disabling WAL log and data page mixed replication mode. -
Default Value:
off
vacuum_defer_cleanup_age
-
Parameter Description: Specifies the number of transactions for VACUUM to use, and VACUUM will delay the cleanup of invalid row storage table records. The number of delayed transactions is set by
vacuum_defer_cleanup_age
. VACUUM and VACUUM FULL operations will not immediately clean up newly deleted tuples. - This parameter belongs to the SIGHUP type.
- Value Range: Integer, 0~1000000, 0 means no delay.
-
Default Value:
0
data_replicate_buffer_size
- Parameter Description: The size of the memory queue occupied when passing data pages between the sender and receiver. This parameter affects the replication buffer size between the master and standby servers.
- This parameter belongs to the POSTMASTER type.
- Value Range: Integer, 4096~1072693248, in KB.
-
Default Value:
16MB
(i.e., 16384KB)
walsender_max_send_size
- Parameter Description: Sets the size of the log or data sending buffer on the master server.
- This parameter belongs to the POSTMASTER type.
- Value Range: Integer, 8~INT_MAX, in KB.
-
Default Value:
8MB
(i.e., 8192KB)
enable_data_replicate
- Parameter Description: When importing data into a row storage table, the master and standby servers can choose the data synchronization method.
- This parameter belongs to the USERSET type.
- Value Range: Boolean
-
on
means data synchronization between the master and standby servers uses data page mode when importing data into a row storage table. If thereplication_type
parameter is 1, it is not allowed to set it toon
. If it is set toon
using theguc
tool, it will be forcibly changed tooff
. -
off
means data synchronization between the master and standby servers uses log (Xlog) mode when importing data into a row storage table. -
Default Value:
off
ha_module_debug
- Parameter Description: Used to view the replication status log of specific data blocks during data replication.
- This parameter belongs to the USERSET type.
- Value Range: Boolean
-
on
means logging the status of each data block during data replication. -
off
means not logging the status of each data block during data replication. -
Default Value:
off
enable_incremental_catchup
- Parameter Description: Controls the method of catch-up between the primary and standby databases. The default setup currently does not support the primary-standby deployment mode.
- This parameter is of type
SIGHUP
. - Value Range: Boolean
-
on
: During standby catch-up, the incremental catch-up method is used. This means the standby database scans its local data files to obtain a list of differential data files between the primary and standby databases and performs the catch-up. -
off
: During standby catch-up, the full catch-up method is used. This means the standby database scans all data files on the primary database to obtain a list of differential data files between the primary and standby databases and performs the catch-up. -
Default Value:
on
sync_config_strategy
- Parameter Description: The synchronization strategy for configuration files between the primary and standby databases and between the standby and cascading standby databases.
- This parameter is of type
POSTMASTER
. - Value Range: Enum
-
all_node
: When the primary is configured toall_node
, it allows the primary to actively synchronize configuration files to all standby databases. When the standby is configured toall_node
, it allows the current standby to send synchronization requests to its primary and actively synchronize configuration files to all its cascading standbys. When the cascading standby is configured toall_node
, it allows the current cascading standby to send synchronization requests to the previous standby. -
only_sync_node
: When the primary is configured toonly_sync_node
, it allows the primary to actively synchronize configuration files only to all synchronous standby databases. When the standby is configured toonly_sync_node
, it allows the current standby to send synchronization requests to its primary but does not allow the current standby to actively synchronize configuration files to all its cascading standbys. When the cascading standby is configured toonly_sync_node
, it allows the current cascading standby to send synchronization requests to its standby. -
none_node
: When the primary is configured tonone_node
, it does not allow the primary to actively synchronize configuration files to any standby databases. When the standby is configured tonone_node
, it does not allow the current standby to send synchronization requests to its primary and does not allow the current standby to actively synchronize configuration files to all its cascading standbys. When the cascading standby is configured tonone_node
, it does not allow the current cascading standby to send synchronization requests to its standby. -
Default Value:
all_node
- Notes:
- In a GBase 8s cluster consisting of primary, standby, and cascading standby databases, the primary is the sender relative to the standby, the standby is the receiver relative to the primary, the standby is the sender relative to the cascading standby, and the cascading standby is the receiver relative to the standby.
- The synchronization of configuration files by the sender to the receiver and the synchronization request by the receiver to the sender are two independent events. Both will trigger the synchronization of configuration files. If you do not want the configuration files to be synchronized, you need to configure the
sync_config_strategy
parameter of all nodes in the cluster tonone_node
. If you only want to synchronize configuration files between the primary and synchronous standby, configure the primary'ssync_config_strategy
parameter toonly_sync_node
and all other nodes tonone_node
. If you want all nodes to synchronize configuration files, configure thesync_config_strategy
parameter of all nodes toall_node
. Currently, custom synchronization strategies between arbitrary nodes are not supported. - The specific behavior of configuration parameter synchronization is that the sender sends configuration files to the receiver, directly overwriting the corresponding parameters in the receiver's configuration file. If the configuration file synchronization strategy is set, modifying the configuration parameters on the receiver side will immediately be overwritten by the sender's configuration parameters, making the receiver's modification ineffective.
- Even if the configuration file synchronization strategy is set, some configuration parameters will not be synchronized. They include:
application_name
archive_command
audit_directory
available_zone
cgbase_control_port
cgbase_sctp_port
listen_addresses
log_directory
port
replconninfo1
replconninfo2
replconninfo3
replconninfo4
replconninfo5
replconninfo6
replconninfo7
replconninfo8
ssl
ssl_ca_file
ssl_cert_file
ssl_ciphers
ssl_crl_file
ssl_key_file
ssl_renegotiation_limit
ssl_cert_notify_time
synchronous_standby_names
local_bind_address
perf_directory
query_log_directory
asp_log_directory
streaming_router_port
enable_upsert_to_merge
archive_dest
recovery_min_apply_delay
sync_config_strategy
3. Standby Server Parameters
hot_standby
-
Parameter Description: Sets whether the standby server allows connections and queries during recovery.
This parameter is of type
POSTMASTER
. Notes: - If this parameter is set to
on
,wal_level
must be set tohot_standby
, otherwise, the database will fail to start. - In a dual-machine environment, the
hot_standby
parameter cannot be set tooff
as it will affect other functions of the dual-machine setup. - If the
hot_standby
parameter was previously turned off and thewal_level
parameter was set below thehot_standby
level, before turning thehot_standby
parameter back on, ensure that all logs on the standby server waiting for replay support standby queries by following these steps:- Adjust the
wal_level
parameter on both the primary and standby tohot_standby
or higher, and restart the instance to take effect. - Execute a checkpoint operation on the primary server, and use the
pg_stat_get_wal_senders()
system function to confirm that each standby server'sreceiver_replay_location
has caught up with the primary's currentsender_flush_location
, ensuring thewal_level
adjustment is synchronized and effective on the standby, without needing to replay previous lower-level logs. - Turn on the
hot_standby
parameter (set toon
) on both the primary and standby, and restart the instance to take effect. Value Range: Boolean
- Adjust the
-
on
: Allows connections and queries on the standby server during recovery. -
off
: Disallows connections and queries on the standby server during recovery. -
Default Value:
on
max_standby_archive_delay
- Parameter Description: When hot standby mode is enabled, this parameter sets the maximum time the standby server waits before canceling queries that conflict with the processing of archived WAL logs.
- This parameter is of type
SIGHUP
. -
Notes:
-1
means the standby server will wait indefinitely for conflicting queries to complete. -
Value Range: Integer, range:
-1
toINT_MAX
, in milliseconds. -
Default Value:
3s
(3000ms)
max_standby_streaming_delay
- Parameter Description: When hot standby mode is enabled, this parameter sets the maximum time the standby server waits before canceling queries that conflict with the processing of streaming replication WAL logs.
- This parameter is of type
SIGHUP
. -
Notes:
-1
means the standby server will wait indefinitely for conflicting queries to complete. -
Value Range: Integer, range:
-1
toINT_MAX
, in milliseconds. -
Default Value:
3s
(3000ms)
wal_receiver_status_interval
- Parameter Description: Sets the maximum time interval for the WAL receiver process to notify the primary server of its status.
- This parameter is of type
SIGHUP
. -
Value Range: Integer, range:
0
toINT_MAX
, in milliseconds. -
Default Value:
5s
(5000ms) Notes: - Setting this parameter to
0
disables the standby server's feedback to the primary server about the log reception position, which may lead to primary server transaction commit blocking, switchover operation failures, and other anomalies. It is not recommended to set this parameter to0
in normal business scenarios.
hot_standby_feedback
- Parameter Description: Sets whether the standby server is allowed to feedback the results of queries executed on it to the primary server, which can avoid query conflicts.
- This parameter is of type
SIGHUP
. - Value Range: Boolean
-
on
: Allows the standby server to feedback the minimum transaction ID of executed queries to the primary server. -
off
: Disallows the standby server from feeding back the minimum transaction ID of executed queries to the primary server. -
Default Value:
off
Notes: - When this parameter is
on
, the cleanup of old version data on the primary server will be restricted by the transactions being read on the standby server. The primary server will only clean up changes made by transactions with IDs smaller than those fed back by the standby server. Therefore, enabling this parameter may affect the primary server's performance.
wal_receiver_timeout
- Parameter Description: Sets the maximum wait time for receiving data from the primary server.
- This parameter is of type
SIGHUP
. -
Value Range: Integer,
0
toINT_MAX
, in milliseconds. -
Default Value:
6s
(6000ms)
wal_receiver_connect_timeout
- Parameter Description: Sets the maximum timeout for connecting to the primary server.
- This parameter is of type
SIGHUP
. -
Value Range: Integer,
0
toINT_MAX / 1000
, in seconds. -
Default Value:
2s
wal_receiver_connect_retries
- Parameter Description: Sets the maximum number of attempts to connect to the primary server.
- This parameter is of type
SIGHUP
. -
Value Range: Integer,
1
toINT_MAX
. -
Default Value:
1
wal_receiver_buffer_size
- Parameter Description: The size of the memory buffer for storing Xlog received from the primary and standby servers. The default setup currently does not support the primary-standby-secondary deployment mode.
- This parameter is of type
POSTMASTER
. -
Value Range: Integer,
4096
to1047552
, in KB. -
Default Value:
64MB
(65536KB)
primary_slotname
- Parameter Description: Sets the slot name of the primary server corresponding to the standby server, used for primary-standby verification and WAL log deletion mechanism.
- This parameter is of type
SIGHUP
. - Value Range: String
- Default Value: Empty string
max_logical_replication_workers
- Parameter Description: The maximum number of apply worker threads for the subscription.
- This parameter is of type
POSTMASTER
. -
Value Range: Integer,
0
to262143
. -
Default Value:
4
max_sync_workers_per_subscription
- Parameter Description: The maximum number of tablesync worker threads per subscription.
- This parameter is of type
SIGHUP
. -
Value Range: Integer,
0
to262143
. -
Default Value:
2
Posted on July 23, 2024
Join Our Newsletter. No Spam, Only the good stuff.
Sign up to receive the latest update from our blog.
Related
November 30, 2024