GBase 8a Solution in Real-Time Data Transmission System (1)

congcong

Cong Li

Posted on July 23, 2024


1. Overview

The goal is to build a high-performance computing system with a complementary storage system to provide efficient, large-capacity storage for meteorological radar data. This will support the long-term online storage of model products and facilitate model development.

2. Project Scope

The implementation covers a 6-node GBase 8a MPP cluster. The current hardware configuration is detailed in the Equipment Specifications table in Section 3.

3. Construction Details

We will establish a 6-node GBase 8a MPP cluster environment, including preparation and configuration for installation, pre-installation checks, cluster deployment, post-installation verification, and application connection planning.

Equipment Specifications

| Device Model | Device Description (Data Node) | Quantity |
| --- | --- | --- |
| Server Model | - | 6 |
| Basic Configuration | CPU: 2 x 14 cores, 2.6 GHz or higher; Memory: 256 GB or higher; RAID card: 1 GB cache; Data node disks: 2 x 480 GB SSD + 12 x 1.2 TB SAS disks (10,000 RPM or higher); Network cards: 2 x dual-port 10GbE (with modules) + 2 x dual-port 1GbE; Redundant power supplies and fans; Management network interface | 6 |

4. Technical Plan

4.1 System Architecture

The structure of the 6-node GBase 8a MPP cluster is shown below:

[Figure: architecture of the 6-node GBase 8a MPP cluster]

4.2 System Layout

The GBase 8a MPP nodes are divided into 3 coordinator nodes and 6 data nodes, with the 3 coordinator nodes also functioning as data nodes. The IP allocation for the 6 machines is as follows:

| Hostname | 10Gbps IP | 1Gbps IP | Purpose |
| --- | --- | --- | --- |
| Node1 | TBD | TBD | Data + Coordinator |
| Node2 | TBD | TBD | Data + Coordinator |
| Node3 | TBD | TBD | Data + Coordinator |
| Node4 | TBD | TBD | Data |
| Node5 | TBD | TBD | Data |
| Node6 | TBD | TBD | Data |

5. Environment Configuration

5.1 Hardware Deployment and Network Planning

In actual projects, physical deployment of hardware such as cabinets, power, hosts, disks, network cards, and switches, as well as network communication, must be considered. Before installing the OS and GBase 8a MPP Cluster, consider the physical deployment and network planning as shown below:

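[Figure: physical deployment and network plan with two racks, redundant switches, cluster nodes, and application servers]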

The diagram above shows two racks, each with its own independent power supply, housing the GBase 8a MPP Cluster nodes and the application servers. The nodes and servers communicate through switches. To keep the network running efficiently, the business network in the actual project must also connect to these two redundant switches.

Below are the principles of hardware physical deployment and network planning:

High Availability of Power Supply: The power supply of the two racks is independent and does not affect each other, adhering to the principle of high availability for host power supply.

High Availability of Switches: Each rack is equipped with a switch, totaling 2 switches across both racks. These two switches are redundant; if one switch fails, the other immediately provides service, adhering to the principle of high availability for switches.

High Availability of Node Machines: The 6 machines marked with dashed lines in the diagram are used to deploy the GBase 8a MPP Cluster.

5.2 Server IP Address Planning

It is recommended to configure 3 coordinator nodes and 6 data nodes, with 3 machines serving as both. Because the JDBC driver supports automatic routing, all coordinator nodes can be listed in the hostlist parameter of the JDBC URL configured in the web application middleware. Applications can then keep working even if a node goes offline.

The IP address allocation for each cluster node is currently as follows. The 10G network IP address is used for internal data communication.

| Hostname | Purpose | 10G Network IP | Gigabit Network IP |
| --- | --- | --- | --- |
| Node1 | Data Node + Coordinator Node | TBA | TBA |
| Node2 | Data Node + Coordinator Node | TBA | TBA |
| Node3 | Data Node + Coordinator Node | TBA | TBA |
| Node4 | Data Node | TBA | TBA |
| Node5 | Data Node | TBA | TBA |
| Node6 | Data Node | TBA | TBA |

5.3 RAID Configuration Planning

| RAID Type | Configuration | Capacity After Configuration | Purpose |
| --- | --- | --- | --- |
| RAID1 | 2 x 480GB SSD | Approx. 480GB | Operating system installation |
| RAID50 (RAID5 + RAID0) | 12 x 1.2TB SAS drives (two 6-drive RAID5 sets, striped together in RAID0) | Approx. 12TB per node, 72TB total for 6 data nodes | Data file storage and data log file storage (/data) |

RAID Configuration Reference:

  1. RAID1: 2 SSD drives for OS installation and swap partition.
  2. RAID50: 12 SAS drives, configured as two 6-drive RAID5 sets that are then striped in RAID0, for data storage. Use the largest stripe size the RAID card supports, at least 1MB.
    • Access Policy: Set to RW (read/write)
    • Read Policy: Set to Read Ahead
    • Write Policy: Set to Write Back with BBU (this allows the RAID controller to switch to Write Through mode automatically)
    • IO Policy: OS disk (RAID1) set to Direct IO, cluster installation disk (RAID50) set to Cached IO
    • Other settings can be left as default.
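The capacity figures in the RAID table can be checked with a little arithmetic. The sketch below assumes the usual RAID5 rule that one drive's worth of space per set is used for parity, and it ignores filesystem overhead and the difference between TB and TiB, so real usable space will be somewhat lower. The function names are illustrative only.

```python
def raid5_usable_tb(drives: int, drive_tb: float) -> float:
    """Usable capacity of one RAID5 set: one drive's worth of space holds parity."""
    if drives < 3:
        raise ValueError("RAID5 needs at least 3 drives")
    return (drives - 1) * drive_tb


def raid50_usable_tb(sets: int, drives_per_set: int, drive_tb: float) -> float:
    """RAID50 stripes (RAID0) across several RAID5 sets, so their capacities add up."""
    return sets * raid5_usable_tb(drives_per_set, drive_tb)


# Figures from the plan: 12 x 1.2TB SAS drives per node, split into two 6-drive sets.
per_node_tb = raid50_usable_tb(sets=2, drives_per_set=6, drive_tb=1.2)
cluster_tb = per_node_tb * 6  # 6 data nodes

print(f"per node: {per_node_tb:.1f} TB, cluster: {cluster_tb:.1f} TB")
# prints "per node: 12.0 TB, cluster: 72.0 TB"
```

This matches the "approx. 12TB per node, 72TB total" in the table: each 6-drive RAID5 set contributes 5 x 1.2TB = 6TB.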

5.4 Operating System Configuration Planning

The operating system for each cluster node is SUSE Linux Enterprise Server 12 SP4 (64-bit). The version information is as follows:

  • Description: SUSE Linux Enterprise Server 12 SP4
  • Release: 12.4
  • Install using the "Desktop" or "Software Workstation" option.
  • Root user password: Confirm with the relevant authority before installation. It is recommended to use a combination of letters and numbers without special characters.
  • gbase user password: Defaults to "gbase". Confirm with the relevant authority before installation.
  • Language: Default to English (Chinese support needs to be installed manually later).
  • Keyboard: Choose U.S. English.
  • Hostname definition: As defined by the relevant authority.
  • Time zone: Select Universal Time (GMT).
  • Disk partitioning:
    • sda 480G: [swap: 128GB, boot: 960MB, root: remaining space]
    • sdb 5TB: [5TB allocated to /data]
    • All partitions should be formatted as ext4.
  • Installation method: Choose Desktop or Software Workstation, customize now, and select language support (Chinese support).
  • No need to create new users after the operating system restarts.
  • Set the current date and time.
  • Enable kdump.

5.5 Network Configuration Planning

  1. The gigabit and 10-gigabit network addresses for the cluster are pending allocation by the relevant authorities.
  2. Each cluster node needs both a gigabit and a 10-gigabit network IP address (gigabit for office network connections, 10-gigabit for internal data communication within the cluster). Both networks should use dual-NIC bonding in active-backup mode (mode=1). The two bonded gigabit ports should connect to the primary and backup gigabit switches respectively, and the two bonded 10-gigabit ports to the primary and backup 10-gigabit switches.
  3. NIC bonding must be set to activate automatically on startup.
  4. It is not recommended to configure gigabit and 10-gigabit networks within the same subnet. For example, use private network addresses for the 10-gigabit network and office environment addresses for the gigabit network.
  5. The cluster servers and switches should be deployed in the same room and in the same IP local area network segment (or the same VLAN), so that cluster nodes are not spread across cascaded switches.
  6. It is recommended to place the 6 nodes in two different cabinets (within the same room), with 3 nodes in the left cabinet and 3 nodes in the right cabinet.
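Once the addresses are allocated, the rule that the gigabit and 10-gigabit networks must not share a subnet can be checked before any NIC is configured. The following is a minimal sketch using Python's standard `ipaddress` module. All subnets and addresses in it are hypothetical placeholders, since the real ones are still pending allocation.

```python
import ipaddress


def networks_are_separate(data_cidr: str, office_cidr: str) -> bool:
    """True if the 10-gigabit (data) and gigabit (office) subnets do not overlap."""
    data_net = ipaddress.ip_network(data_cidr)
    office_net = ipaddress.ip_network(office_cidr)
    return not data_net.overlaps(office_net)


def all_in_network(addresses, cidr: str) -> bool:
    """True if every address belongs to the given subnet (same LAN segment)."""
    net = ipaddress.ip_network(cidr)
    return all(ipaddress.ip_address(addr) in net for addr in addresses)


# Hypothetical plan: private range for the 10-gigabit network,
# a different range for the gigabit office network.
DATA_CIDR = "192.168.10.0/24"
OFFICE_CIDR = "10.20.30.0/24"
DATA_IPS = [f"192.168.10.{n}" for n in range(11, 17)]  # Node1..Node6

print(networks_are_separate(DATA_CIDR, OFFICE_CIDR))
print(all_in_network(DATA_IPS, DATA_CIDR))
```

Both checks should report `True` for a valid plan. A `False` from the first check means the two networks overlap, and a `False` from the second means at least one node sits outside the shared cluster segment.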

5.6 Port Planning

Ensure the following ports are open for the cluster to operate correctly:

| Program | Port |
| --- | --- |
| gcluster service | 5258 |
| gnode service | 5050 |
| syncserver service | 5288 |
| Data export range | 6066-6165 |
| Monitoring agent | 9110 |
| Collection center | 9999 |
| Alert service | 9111 |
| Monitoring web | 8080 |
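The port table can be turned into a simple reachability check. The sketch below is an illustration rather than a GBase tool: it attempts a plain TCP connection to each port, which only succeeds when a service is actually listening. A failure therefore does not by itself distinguish a firewall block from a service that is not running (for example, the export range is likely in use only while an export is active). The host name in the usage comment is a placeholder.

```python
import socket

# Ports from the planning table; the export range 6066-6165 is inclusive.
CLUSTER_PORTS = {
    "gcluster service": [5258],
    "gnode service": [5050],
    "syncserver service": [5288],
    "data export range": list(range(6066, 6166)),
    "monitoring agent": [9110],
    "collection center": [9999],
    "alert service": [9111],
    "monitoring web": [8080],
}


def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def closed_ports(host: str, probe=port_is_open):
    """List (service, port) pairs on host that the probe could not reach."""
    return [
        (service, port)
        for service, ports in CLUSTER_PORTS.items()
        for port in ports
        if not probe(host, port)
    ]


# Usage (placeholder host name): closed_ports("node1.example.internal")
```

The `probe` parameter exists so that the check can be exercised without touching the network, for example by passing a stub function.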

5.7 GBase Database Version Planning

  • Version: GBase8a MPP v95XX series
  • Cluster Architecture Planning: The GBase cluster is planned with 6 nodes, and the file server (loader) shares one of these nodes.
  • Plan for 3 coordinator nodes and 6 data nodes (3 of the data nodes also serve as coordinator nodes).

5.8 Application Connection Resource Planning

Business applications should connect through different coordinator nodes to distribute load. Use the automatic routing feature in JDBC to ensure continuous application connection even if nodes go offline.
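As an illustration of this connection plan, the sketch below builds one JDBC URL per application, rotating which coordinator comes first so that applications start on different nodes while each URL still lists the other coordinators for failover. Several details are assumptions to verify against the GBase 8a JDBC driver manual: the `jdbc:gbase://` URL scheme, the exact spelling `hostList`, and the `failoverEnable` switch. The IP addresses and the database name `radar` are placeholders.

```python
from urllib.parse import urlencode


def rotate(hosts, k):
    """Rotate the host list so a different coordinator comes first."""
    k %= len(hosts)
    return hosts[k:] + hosts[:k]


def build_jdbc_url(coordinators, database, port=5258, extra_params=None):
    """Build a JDBC URL: first coordinator as the primary host, the rest in hostList."""
    if not coordinators:
        raise ValueError("at least one coordinator is required")
    primary, *others = coordinators
    params = {}
    if others:
        # Assumed parameter names; confirm them in the driver documentation.
        params["failoverEnable"] = "true"
        params["hostList"] = ",".join(others)
    params.update(extra_params or {})
    url = f"jdbc:gbase://{primary}:{port}/{database}"
    query = urlencode(params, safe=",")
    return f"{url}?{query}" if query else url


# Placeholder coordinator addresses (Node1..Node3 in the plan).
COORDINATORS = ["192.0.2.11", "192.0.2.12", "192.0.2.13"]

for app_index in range(3):
    print(build_jdbc_url(rotate(COORDINATORS, app_index), "radar"))
```

Port 5258 is the gcluster service port from the port plan in Section 5.6. Spreading the primary host across applications is one simple way to distribute load; a load balancer in front of the coordinators would be another.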

Future articles will cover pre-installation checks, detailed steps, and post-installation verification.


By understanding the deployment and configuration of the GBase 8a MPP cluster, administrators and developers can effectively design, optimize, and troubleshoot the database system, enhancing efficiency and performance.
