GBase 8a Solution in Real-Time Data Transmission System (1)
Cong Li
Posted on July 23, 2024
1. Overview
The goal is to build a high-performance computing system with a complementary storage system to provide efficient, large-capacity storage for meteorological radar data. This will support the long-term online storage of model products and facilitate model development.
2. Project Scope
The implementation includes a 6-node GBase 8a MPP cluster. The current hardware configuration is detailed in the table below.
3. Construction Details
We will establish a 6-node GBase 8a MPP cluster environment, including preparation and configuration for installation, pre-installation checks, cluster deployment, post-installation verification, and application connection planning.
Equipment Specifications
Device Model | Device Description (Data Node) | Quantity |
---|---|---|
Server Model | - | 6 |
Basic Configuration | CPU: 2 x 14 Core, 2.6GHz or higher; Memory: 256GB or higher; RAID Card: Cache 1GB; Data Node Disks: 2 x 480GB SSD + 12 x 1.2TB (10,000 RPM or higher) SAS disks; Network Card: 2 x Dual-port 10GbE (with modules) + 2 x Dual-port 1GbE; Redundant Power Supply, Fans; Management Network Interface | 6 |
4. Technical Plan
4.1 System Architecture
The structure of the 6-node GBase 8a MPP cluster is shown below:
4.2 System Layout
The GBase 8a MPP nodes are divided into 3 coordinator nodes and 6 data nodes, with the 3 coordinator nodes also functioning as data nodes. The IP allocation for the 6 machines is as follows:
Hostname | 10Gbps IP | 1Gbps IP | Purpose |
---|---|---|---|
Node1 | TBD | TBD | Data + Coordinator |
Node2 | TBD | TBD | Data + Coordinator |
Node3 | TBD | TBD | Data + Coordinator |
Node4 | TBD | TBD | Data |
Node5 | TBD | TBD | Data |
Node6 | TBD | TBD | Data |
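The role layout in the table can be captured as a small data structure, which makes the planned 3-coordinator / 6-data-node split easy to check before IP addresses are assigned. This is a minimal sketch: the `Node` class and the `validate_layout` helper are illustrative names, not part of any GBase 8a tooling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    hostname: str
    is_data: bool
    is_coordinator: bool


# Roles copied from the table above; IPs are still TBD, so they are omitted.
LAYOUT = [
    Node("Node1", is_data=True, is_coordinator=True),
    Node("Node2", is_data=True, is_coordinator=True),
    Node("Node3", is_data=True, is_coordinator=True),
    Node("Node4", is_data=True, is_coordinator=False),
    Node("Node5", is_data=True, is_coordinator=False),
    Node("Node6", is_data=True, is_coordinator=False),
]


def validate_layout(nodes, coordinators=3, data_nodes=6):
    """Return True if the layout has the planned number of each role."""
    n_coord = sum(n.is_coordinator for n in nodes)
    n_data = sum(n.is_data for n in nodes)
    return n_coord == coordinators and n_data == data_nodes


print(validate_layout(LAYOUT))
```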
5. Environment Configuration
5.1 Hardware Deployment and Network Planning
In actual projects, physical deployment of hardware such as cabinets, power, hosts, disks, network cards, and switches, as well as network communication, must be considered. Before installing the OS and GBase 8a MPP Cluster, consider the physical deployment and network planning as shown below:
In the above diagram, there are two racks (each rack must have independent power supply), which house the GBase 8a MPP Cluster nodes and application servers. The network communication between them is achieved through switches. To ensure efficient network operation, the business network in the actual project also needs to connect to these two redundant switches.
Below are the principles of hardware physical deployment and network planning:
High Availability of Power Supply: The power supply of the two racks is independent and does not affect each other, adhering to the principle of high availability for host power supply.
High Availability of Switches: Each rack is equipped with a switch, totaling 2 switches across both racks. These two switches are redundant; if one switch fails, the other immediately provides service, adhering to the principle of high availability for switches.
High Availability of Node Machines: The 6 machines marked with dashed lines in the diagram are used to deploy the GBase 8a MPP Cluster.
5.2 Server IP Address Planning
It is recommended to configure 3 coordinator nodes and 6 data nodes, with 3 machines serving as both. Because the JDBC driver supports automatic routing, all coordinator nodes can be listed in the hostlist parameter of the JDBC connection URL configured in the web application middleware, so the application remains available even if a coordinator node goes offline.
The IP address allocation for each cluster node is currently as follows. The 10G network IP address is used for internal data communication.
Hostname | Purpose | 10G Network IP | Gigabit Network IP |
---|---|---|---|
Node1 | Data Node + Coordinator Node | TBA | TBA |
Node2 | Data Node + Coordinator Node | TBA | TBA |
Node3 | Data Node + Coordinator Node | TBA | TBA |
Node4 | Data Node | TBA | TBA |
Node5 | Data Node | TBA | TBA |
Node6 | Data Node | TBA | TBA |
5.3 RAID Configuration Planning
RAID Type | Configuration | Capacity After Configuration | Purpose |
---|---|---|---|
RAID1 | 2 x 480GB SSD | Approx. 480GB | Operating System Installation |
RAID50 | 12 x 1.2TB SAS drives (two 6-drive RAID5 sets, striped together in RAID0) | Approx. 12TB per node, 72TB total for 6 data nodes | Data File Storage and Data Log File Storage (/data) |
RAID Configuration Reference:
- RAID1: 2 SSD drives for OS installation and swap partition.
- RAID50: 12 SAS drives, configured as two 6-drive RAID5 sets that are then striped together in RAID0, for data storage. Use the largest stripe size the RAID card supports, at least 1MB.
- Access Policy: Set to RW
- Read Policy: Set to Read Ahead
- Write Policy: Set to Write Back with BBU (allow RAID controller to switch to Write Through mode automatically)
- IO Policy: OS disk (RAID1) set to Direct IO, cluster installation disk (RAID50) set to Cached IO
- Other settings can be left as default.
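The capacity figures in the RAID table follow from the RAID50 layout: each 6-drive RAID5 set gives up one drive's worth of space to parity, and the two sets are striped together. A short sketch of that arithmetic, where the helper name `raid50_usable_gb` is illustrative and the result is raw usable capacity before file system overhead:

```python
def raid50_usable_gb(drives_per_set, sets, drive_gb):
    """Usable capacity of a RAID50 array.

    Each RAID5 set loses one drive's capacity to parity; the sets are
    then striped (RAID0), so their usable capacities simply add up.
    """
    if drives_per_set < 3:
        raise ValueError("a RAID5 set needs at least 3 drives")
    return sets * (drives_per_set - 1) * drive_gb


# Values from the planning table: 12 x 1.2TB drives as two 6-drive sets.
per_node_gb = raid50_usable_gb(drives_per_set=6, sets=2, drive_gb=1200)
cluster_gb = per_node_gb * 6  # 6 data nodes

print(per_node_gb / 1000, "TB per node")
print(cluster_gb / 1000, "TB across 6 data nodes")
```

This reproduces the planned figures of roughly 12TB per node and 72TB for the cluster.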
5.4 Operating System Configuration Planning
The operating system for each cluster node is SUSE Linux Enterprise Server 12 SP4 (64-bit). The version information is as follows:
- Description: SUSE Linux Enterprise Server 12 SP4
- Release: 12.4
- Install using the "Desktop" or "Software Workstation" option.
- Root user password: Confirm with the relevant authority before installation. It is recommended to use a combination of letters and numbers without special characters.
- Gbase user password: Default to "gbase". Confirm with the relevant authority before installation.
- Language: Default to English (Chinese support needs to be installed manually later).
- Keyboard: Choose U.S. English.
- Hostname definition: As defined by the relevant authority.
- Time zone: Select Universal Time (GMT).
- Disk partitioning:
  - sda 480G: [swap: 128GB, boot: 960MB, root: remaining space]
  - sdb 5TB: [5TB allocated to /data]
  - All partitions should be formatted as ext4.
- Installation method: Choose Desktop or Software Workstation, customize now, and select language support (Chinese support).
- No need to create new users after the operating system restarts.
- Modify the current date and time.
- Enable kdump.
5.5 Network Configuration Planning
- The gigabit and 10-gigabit network addresses for the cluster are pending allocation by the relevant authorities.
- Cluster nodes need to be configured with both gigabit and 10-gigabit network IP addresses (gigabit for office network connections and 10-gigabit for internal data communication within the cluster). Both networks should use dual NIC bonding in active-backup mode (mode=1). The two bonded gigabit NICs should connect to the primary and backup gigabit switches respectively, and the two bonded 10-gigabit NICs should connect to the primary and backup 10-gigabit switches respectively.
- NIC bonding must be set to activate automatically on startup.
- It is not recommended to configure gigabit and 10-gigabit networks within the same subnet. For example, use private network addresses for the 10-gigabit network and office environment addresses for the gigabit network.
- The cluster servers and switches should be deployed in the same room and within the same IP local area network segment (or configured within the same VLAN) to avoid deploying cluster nodes across switch cascades.
- It is recommended to place the 6 nodes in two different cabinets (within the same room), with 3 nodes in the left cabinet and 3 nodes in the right cabinet.
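Two of the rules above lend themselves to an automated check: that a bond runs in active-backup mode, and that the gigabit and 10-gigabit addresses do not share a subnet. The sketch below assumes the Linux bonding driver's status file under `/proc/net/bonding/`; the bond name, the sample status text, and the IP addresses are placeholders, since the real addresses are still pending allocation.

```python
import ipaddress
import re


def bonding_mode(proc_text):
    """Extract the mode from the text of /proc/net/bonding/<bond>."""
    match = re.search(r"^Bonding Mode:\s*(.+)$", proc_text, re.MULTILINE)
    return match.group(1).strip() if match else None


def is_active_backup(proc_text):
    """True if the bond status text reports active-backup (mode=1)."""
    mode = bonding_mode(proc_text)
    return mode is not None and "active-backup" in mode


def same_subnet(cidr_a, cidr_b):
    """True if two interface addresses (with prefix) are in overlapping networks."""
    net_a = ipaddress.ip_interface(cidr_a).network
    net_b = ipaddress.ip_interface(cidr_b).network
    return net_a.overlaps(net_b)


# Illustrative excerpt of a bond status file, not captured from this cluster.
SAMPLE = """Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)

Bonding Mode: fault-tolerance (active-backup)
Primary Slave: None
"""

print(is_active_backup(SAMPLE))
# Hypothetical addresses: private range for 10-gigabit, office range for gigabit.
print(same_subnet("10.0.0.11/24", "192.0.2.11/24"))
```

On a real node, the text passed to `is_active_backup` would come from reading the status file for the bond in question, for example `/proc/net/bonding/bond0` if the bond is named `bond0`.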
5.6 Port Planning
Ensure the following ports are open for the cluster to operate correctly:
Program | Port |
---|---|
gcluster service | 5258 |
gnode service | 5050 |
syncserver service | 5288 |
Data export range | 6066-6165 |
Monitoring agent | 9110 |
Collection center | 9999 |
Alert service | 9111 |
Monitoring web | 8080 |
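The port table can be turned into a simple reachability check to run from another node or an application server. This is a sketch only: the function names are illustrative, and a TCP connect succeeds only when a service is already listening, so before the cluster is installed a failure does not by itself prove that a firewall is blocking the port.

```python
import socket

# Ports copied from the table above; the export range 6066-6165 is inclusive.
PORTS = {
    "gcluster service": [5258],
    "gnode service": [5050],
    "syncserver service": [5288],
    "Data export range": list(range(6066, 6166)),
    "Monitoring agent": [9110],
    "Collection center": [9999],
    "Alert service": [9111],
    "Monitoring web": [8080],
}


def is_port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def unreachable_ports(host, ports=PORTS, timeout=1.0):
    """Map each program name to the list of its ports that did not accept a connection."""
    result = {}
    for name, port_list in ports.items():
        closed = [p for p in port_list if not is_port_open(host, p, timeout)]
        if closed:
            result[name] = closed
    return result


print(sum(len(p) for p in PORTS.values()), "ports planned")
```

Calling `unreachable_ports("<node address>")` against each node after deployment returns an empty dictionary when every planned port accepts connections.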
5.7 GBase Database Version Planning
- Version: GBase 8a MPP v95XX series
- Cluster Architecture Planning: The GBase cluster is planned with 6 nodes, and the file server (loader) shares one of these nodes.
- It is recommended to plan 3 coordinator nodes and 6 data nodes (the 3 coordinator nodes also serve as data nodes).
5.8 Application Connection Resource Planning
Business applications should connect through different coordinator nodes to distribute load. Use the automatic routing feature in JDBC to ensure continuous application connection even if nodes go offline.
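One way to apply both ideas is to give each application a connection URL that starts at a different coordinator and lists the remaining coordinators as fallbacks. The sketch below only assembles the URL string. The `jdbc:gbase://` scheme and the `failoverEnable` and `hostList` parameter spellings are assumptions that should be verified against the JDBC driver documentation for the installed version; the IP addresses and the database name `radar` are placeholders. Port 5258 is the gcluster service port from the port plan.

```python
from urllib.parse import urlencode


def build_jdbc_url(coordinators, database, port=5258):
    """Build a URL whose first host is the entry point and the rest are fallbacks.

    Parameter names (failoverEnable, hostList) are assumed, not confirmed.
    """
    if not coordinators:
        raise ValueError("at least one coordinator address is required")
    primary, *others = coordinators
    url = f"jdbc:gbase://{primary}:{port}/{database}"
    if others:
        params = {"failoverEnable": "true", "hostList": ",".join(others)}
        url += "?" + urlencode(params, safe=",")
    return url


def url_for_app(app_index, coordinators, database):
    """Rotate the coordinator list so different applications enter via different nodes."""
    k = app_index % len(coordinators)
    rotated = coordinators[k:] + coordinators[:k]
    return build_jdbc_url(rotated, database)


# Hypothetical coordinator addresses (Node1 to Node3); real IPs are still TBD.
COORDINATORS = ["192.0.2.11", "192.0.2.12", "192.0.2.13"]

for i in range(3):
    print(url_for_app(i, COORDINATORS, "radar"))
```

The resulting strings would go into each application's data source configuration in the middleware.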
Future articles will cover pre-installation checks, detailed steps, and post-installation verification.
By understanding the deployment and configuration of the GBase 8a MPP cluster, administrators and developers can effectively design, optimize, and troubleshoot the database system, enhancing efficiency and performance.