GBase 8a MPP Cluster Data Loading SFTP Server Configuration
Cong Li
Posted on September 20, 2024
Reference Articles
- FTP File Server Configuration Guide
- HTTP File Server Configuration Guide
- HDFS Server Configuration Guide
Today, we will introduce the configuration of an SFTP server. The SFTP service does not require any additional software packages; you only need to enable the sshd service.
1. Check if the sshd Service is Running
# service sshd status
openssh-daemon (pid 2243) is running…
2. Default Directory Access Configuration
By default, sshd does not restrict user directory access. After logging in via SFTP, users can navigate to any directory they have permission to access. In this case, the file path in the URL of the following load statement is an absolute path on the server's file system:
load data infile 'sftp://gbase:gbase@192.168.10.114/opt/data/test.tbl' into table test.t data_format 3;
3. Modify the Default sshd Configuration
When the number of concurrent cluster load tasks and the maximum number of load machines per task are both large, SFTP file loads may fail because sshd starts rejecting new connections. In this case, you can modify the sshd configuration file as follows:
1) Edit the /etc/ssh/sshd_config file:
# vi /etc/ssh/sshd_config
2) Modify the configuration file as follows:
# The MaxStartups value is represented as "start:rate:full". The default value is 10:30:100.
# When the number of unauthenticated connections reaches `start` (10), new connection attempts have a `rate/100` (30%) chance of being rejected by `sshd`.
# When the number of unauthenticated connections reaches `full` (100), all new connection attempts are rejected.
# For maximum concurrent cluster load tasks `N` and maximum load machines per task `M`, the recommended MaxStartups value is M*N+10:30:M*N*2.
# MaxStartups 10:30:100
MaxStartups 20:30:100
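The recommendation in the comments above can be worked through as simple arithmetic. The following is a minimal sketch, not part of GBase or OpenSSH: the function name recommended_max_startups is hypothetical, and it only applies the M*N+10:30:M*N*2 formula quoted above.

```shell
# Hypothetical helper: print a MaxStartups line for N concurrent load tasks
# and M load machines per task, using the formula M*N+10:30:M*N*2.
recommended_max_startups() (
    tasks=$1       # N: maximum concurrent cluster load tasks
    machines=$2    # M: maximum load machines per task
    peak=$(( tasks * machines ))
    echo "MaxStartups $(( peak + 10 )):30:$(( peak * 2 ))"
)
```

For example, `recommended_max_startups 4 8` prints `MaxStartups 42:30:64`. Note that when M*N is below 10 the formula gives a start value larger than the full value, so for small clusters the default setting is probably the safer choice.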
To restrict SFTP users to their own home directories for security reasons, you need to enable the directory lock function in sshd using chroot. OpenSSH versions 4.8p1 and later support chroot. You can check the OpenSSH version of the current system using the following command:
# ssh -V
OpenSSH_5.3p1, OpenSSL 1.0.0-fips 29 Mar 2010
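If the version check needs to be scripted, a comparison against 4.8 could look roughly like the sketch below. The function name openssh_supports_chroot is hypothetical, and the sketch assumes a banner in the same "OpenSSH_MAJOR.MINOR..." form as the output shown above.

```shell
# Hypothetical helper: succeed if the given `ssh -V` banner reports
# OpenSSH 4.8 or later. Exits with status 2 if the banner cannot be parsed.
openssh_supports_chroot() (
    # Extract "MAJOR MINOR" from a banner such as "OpenSSH_5.3p1, OpenSSL ...".
    version=$(printf '%s\n' "$1" |
        sed -n 's/^OpenSSH_\([0-9][0-9]*\)\.\([0-9][0-9]*\).*/\1 \2/p')
    [ -n "$version" ] || exit 2
    major=${version% *}
    minor=${version#* }
    [ "$major" -gt 4 ] || { [ "$major" -eq 4 ] && [ "$minor" -ge 8 ]; }
)
```

Because `ssh -V` writes its banner to standard error, a call would need to redirect it, for example `openssh_supports_chroot "$(ssh -V 2>&1)" && echo "chroot supported"`.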
3) Edit the /etc/ssh/sshd_config file again:
# vi /etc/ssh/sshd_config
4) Modify the configuration file as follows:
# Override default of no subsystems
# Change default subsystem to internal-sftp
# Subsystem sftp /usr/libexec/openssh/sftp-server
Subsystem sftp internal-sftp
# Example of overriding settings on a per-user basis
# The following rule applies only to the user named `sftp`. To match multiple usernames, separate them with commas.
# Similarly, use `Match Group sftp` to match the group named `sftp`, separating multiple group names with commas if needed.
Match User sftp
# X11Forwarding no
# AllowTcpForwarding no
# Force the internal SFTP server, ignoring any command supplied by the client and any `~/.ssh/rc` file.
ForceCommand internal-sftp
# Use `chroot` to lock the user's root directory to `%h`, where `%h` represents the user's home directory.
# Optional parameters include `%u`, which represents the username.
ChrootDirectory %h
Note: Key points for SFTP directory permissions:
- The owner of the directory specified by ChrootDirectory and of all directories up to the system root must be root.
- The directory specified by ChrootDirectory and all directories up to the system root must not be writable by the group or by others, i.e., the permission value cannot be more permissive than 755.
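The two rules above can be checked before restarting sshd. The following is a rough sketch under stated assumptions: the function name check_chroot_path is hypothetical, and it relies on the GNU coreutils form of stat (`stat -c`), which is typical on Linux but not available everywhere.

```shell
# Hypothetical helper: walk from a directory up to /, reporting every
# component that is not owned by root or is writable by group or others.
# Exits with 0 if all components pass, 1 otherwise, 2 if the path is invalid.
check_chroot_path() (
    dir=$(cd "$1" && pwd -P) || exit 2
    status=0
    while :; do
        owner=$(stat -c '%u' "$dir")   # numeric owner id (GNU stat)
        mode=$(stat -c '%a' "$dir")    # octal permission bits, e.g. 755
        if [ "$owner" != "0" ]; then
            echo "not owned by root: $dir"
            status=1
        fi
        # Octal mask 022 covers the group-write and other-write bits.
        if [ $(( 0$mode & 022 )) -ne 0 ]; then
            echo "writable by group or others (mode $mode): $dir"
            status=1
        fi
        if [ "$dir" = "/" ]; then
            break
        fi
        dir=$(dirname "$dir")
    done
    exit "$status"
)
```

A call such as `check_chroot_path /home/sftp` (the path is only an example) prints one line per offending directory and returns a non-zero status if any rule is violated.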
4. Restart the sshd Service After Configuration
# service sshd restart
After configuring directory locks, the file path in the URL of a load statement is resolved relative to the locked (chroot) directory instead of the system root. For the following statement, the absolute path of test.tbl on the server is /home/gbase/opt/data/test.tbl:
load data infile 'sftp://gbase:gbase@192.168.10.114/opt/data/test.tbl' into table test.t data_format 3;
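The mapping between the URL path and the real location on the server can be illustrated with a small sketch. Both function names are hypothetical, and the URL parsing is deliberately naive: it assumes the user name and password contain no "/" characters.

```shell
# Hypothetical helper: extract the path component from an sftp:// URL.
sftp_url_path() (
    rest=${1#sftp://}             # drop the scheme
    printf '/%s\n' "${rest#*/}"   # drop user:password@host, keep the path
)

# Hypothetical helper: join a chroot directory and a URL path to get the
# absolute location on the server. Pass "/" when no chroot is configured.
chroot_real_path() (
    printf '%s/%s\n' "${1%/}" "${2#/}"
)
```

With the directory lock set to /home/gbase, `chroot_real_path /home/gbase "$(sftp_url_path 'sftp://gbase:gbase@192.168.10.114/opt/data/test.tbl')"` prints `/home/gbase/opt/data/test.tbl`. Without a directory lock, passing `/` as the first argument prints `/opt/data/test.tbl`, which matches the absolute-path behavior described in section 2.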
That's all for today's content. Thank you for reading!