Local EGA Inbox

We use the OpenSSH SFTP server (version 7.8p1), on a Linux distribution (currently CentOS7).

Authentication is performed by the operating system, using the classic pluggable authentication mechanism (PAM) and a username resolution module (NSS).

The user’s home directory is created when their credentials are retrieved from CentralEGA. Moreover, we isolate each user in their own home directory (i.e. we chroot the user into it).

We installed hooks inside the OpenSSH SFTP server to detect when a file is (re)uploaded, renamed or removed. In each case, a notification is sent to CentralEGA via a shovel mechanism on the local message broker. For a file upload, the notification also contains extra file information: a SHA256 checksum, the file size and a timestamp of when the file was last modified.

We created the SSH daemon binary /opt/openssh/sbin/ega-sshd and configured the *ega-sshd* service to use PAM.

The ega-sshd service is configured using the -c switch to specify where the configuration file is. The service runs for the moment on port 9000.

Note that when PAM is configured as above, and a user is either not found or fails to authenticate, access to the service is denied. No user other than Central EGA users (not even root) has access to that service. We force sftp connections and disallow ssh connections on that port.


Configuration settings

OpenSSH SFTP server

We configure the OpenSSH daemon to run on port 9000, and only allow sftp connections, for users in the lega group. (We allow neither root nor ssh.)

We disable X11 forwarding, tunneling and other forms of relay. We allow password and public key authentication.

The configuration file is /etc/ega/sshd_config, with the following content:

#LogLevel VERBOSE
Port 9000
Protocol 2
Banner  /etc/ega/banner
HostKey /etc/ega/ssh_host_rsa_key
HostKey /etc/ega/ssh_host_dsa_key
HostKey /etc/ega/ssh_host_ed25519_key
# Authentication
UsePAM yes
AuthenticationMethods "publickey" "keyboard-interactive:pam"
PubkeyAuthentication yes
PasswordAuthentication no
ChallengeResponseAuthentication yes
# Faster connection
UseDNS no
# Limited access
DenyGroups *,!lega
DenyUsers root lega
PermitRootLogin no
X11Forwarding no
AllowTcpForwarding no
PermitTunnel no
AcceptEnv LANG LC_CTYPE LC_NUMERIC LC_TIME LC_COLLATE LC_MONETARY LC_MESSAGES
AcceptEnv LC_PAPER LC_NAME LC_ADDRESS LC_TELEPHONE LC_MEASUREMENT
AcceptEnv LC_IDENTIFICATION LC_ALL LANGUAGE
AcceptEnv XMODIFIERS
Subsystem sftp internal-sftp #-l INFO
AuthorizedKeysCommand /usr/local/bin/ega_ssh_keys
AuthorizedKeysCommandUser root

Message Broker

The message broker connection is configured in the file /etc/ega/mq.conf as follows:

##############################
# Message Broker configuration
##############################

# of the form amqp(s)://user:password@host:port/vhost
connection = ${MQ_CONNECTION}

# Where to send the notifications
exchange = ${MQ_EXCHANGE:-cega}
routing_key = ${MQ_ROUTING_KEY:-files.inbox}

# Sets the message broker's heartbeat (in seconds).
# Default: 0 (ie disabled).
heartbeat = 0

# This causes the TLS layer to
# verify the hostname in the certificate.
# (Only valid when using "amqps")
# Default: no
verify_hostname = no

# This causes the TLS layer to
# verify the server's certificate.
# (Only valid when using "amqps")
# Default: no (as RabbitMQ's certificate is self-signed).
verify_peer = no

# In case verify_peer = yes, this causes the
# sftp server to use the given trusted bundle.
#cacert = /path/to/ca/trusted/bundle

When the docker image is booted, the following environment variables are used to create the above configuration file.

Variable name     Default value   Example/Format
MQ_CONNECTION     *               amqp(s)://username:password@hostname:port/vhost
MQ_EXCHANGE       cega            The exchange name
MQ_ROUTING_KEY    files.inbox     The routing key for the messages

* required
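
As an illustration of how the placeholders in the configuration file relate to the environment variables, here is a minimal Python sketch. It is not the entrypoint script itself: the function name render_template is hypothetical, and we assume that a placeholder without a default (here MQ_CONNECTION) must be set, as the table indicates.

```python
import re

# Matches ${NAME} and ${NAME:-default}, the two forms used in mq.conf.
_PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)(?::-([^}]*))?\}")


def render_template(template, env):
    """Substitute placeholders from env, falling back to inline defaults."""

    def substitute(match):
        name, default = match.group(1), match.group(2)
        value = env.get(name)
        if value:                    # variable is set and non-empty
            return value
        if default is not None:      # ${NAME:-default} form
            return default
        # Assumption: a placeholder without a default is a required variable.
        raise KeyError("required variable %s is not set" % name)

    return _PLACEHOLDER.sub(substitute, template)


template = (
    "connection = ${MQ_CONNECTION}\n"
    "exchange = ${MQ_EXCHANGE:-cega}\n"
    "routing_key = ${MQ_ROUTING_KEY:-files.inbox}\n"
)
# Hypothetical connection string, following the documented format.
rendered = render_template(
    template, {"MQ_CONNECTION": "amqp://user:password@mq:5672/%2F"}
)
print(rendered)
```

With only MQ_CONNECTION set, the exchange and routing key fall back to cega and files.inbox.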

Note

RSA (/etc/ega/ssh_host_rsa_key), DSA (/etc/ega/ssh_host_dsa_key) and ed25519 (/etc/ega/ssh_host_ed25519_key) keys are (re)created at boot time.

Inbox login system

Central EGA contains a database of users with credentials (per LocalEGA instance).

The authentication is either via a password or an SSH key against CentralEGA’s database. User IDs can also be extended to use Elixir IDs, of which we strip the @elixir-europe.org suffix.
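
The suffix stripping can be pictured with the short Python sketch below. The function name normalize_username is hypothetical and the real code lives in the NSS/PAM libraries; this only illustrates the rule stated above.

```python
ELIXIR_SUFFIX = "@elixir-europe.org"


def normalize_username(username):
    """Drop the Elixir suffix, if present; leave other usernames untouched."""
    if username.endswith(ELIXIR_SUFFIX):
        return username[: -len(ELIXIR_SUFFIX)]
    return username


print(normalize_username("abc123@elixir-europe.org"))
print(normalize_username("john"))
```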

The procedure is as follows: the inbox is started without any created user. When a user wants to log into the inbox (actually, only sftp uploads are allowed), the code looks up the username in a local cache, and, if not found, queries the CentralEGA REST endpoint. Upon return, we store the user credentials in the local cache and create the user’s home directory. The user now gets logged in if the password or public key authentication succeeds. Upon subsequent login attempts, only the local cache is queried, until the user’s credentials expire. The cache has a default TTL of one hour, and is wiped clean upon reboot (as a cache should).
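
The lookup procedure above can be sketched in Python as follows. This is an illustration under assumptions, not the actual implementation: the class UserCache, the function lookup_user and the table layout are hypothetical, and fetch_remote stands in for the query to the CentralEGA REST endpoint.

```python
import sqlite3
import time


class UserCache:
    """SQLite-backed cache of user records with a fixed time-to-live."""

    def __init__(self, db_path=":memory:", ttl=3600):
        self.ttl = ttl
        self.db = sqlite3.connect(db_path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS users ("
            "username TEXT PRIMARY KEY, record TEXT, expires REAL)"
        )

    def get(self, username, now=None):
        now = time.time() if now is None else now
        row = self.db.execute(
            "SELECT record, expires FROM users WHERE username = ?", (username,)
        ).fetchone()
        if row is None or row[1] <= now:
            return None              # missing or expired entry
        return row[0]

    def put(self, username, record, now=None):
        now = time.time() if now is None else now
        self.db.execute(
            "INSERT OR REPLACE INTO users VALUES (?, ?, ?)",
            (username, record, now + self.ttl),
        )
        self.db.commit()


def lookup_user(cache, username, fetch_remote, now=None):
    """Return the user record, querying the remote side only on a cache miss."""
    record = cache.get(username, now)
    if record is None:
        record = fetch_remote(username)   # stand-in for the CentralEGA query
        if record is not None:
            cache.put(username, record, now)
    return record


calls = []


def fake_fetch(name):
    """Hypothetical remote lookup that records each call."""
    calls.append(name)
    return '{"username": "%s"}' % name


cache = UserCache(ttl=3600)
lookup_user(cache, "john", fake_fetch, now=0)      # miss: remote is queried
lookup_user(cache, "john", fake_fetch, now=10)     # hit: served from the cache
lookup_user(cache, "john", fake_fetch, now=4000)   # expired: remote is queried again
print(len(calls))
```

Passing the current time explicitly keeps the expiry logic easy to exercise without waiting for the TTL to elapse.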

Configuration

The NSS and PAM modules are configured by the file /etc/ega/auth.conf.

Some configuration parameters can be specified, while others have default values in case they are not specified. Some of the parameters must be specified (mostly those for which we can’t invent a value!).

A sample configuration file can be found in the EGA-auth repository, for example:

##########################################
# Remote database settings (using ReST)
##########################################

# The username will be appended to the endpoints
cega_endpoint_username = http://cega_users/user/
cega_endpoint_uid = http://cega_users/id/
cega_creds = user:password

##########################################
# NSS settings
##########################################

# Per site configuration, to shift the users id range
# Default: 10000
#uid_shift = 1000

# The group to which all users belong.
# For the moment, only that one.
# Required setting. No default.
gid = 997

# Per site configuration, where the home directories are located
# The user's name will be appended.
# Required setting. No default.
homedir_prefix = /ega/inbox

# The user's login shell.
# Default: /bin/bash
#shell = /bin/aspshell-r

# days until change allowed
# Default: 0
shadow_min = 0

# days before change required
# Default: 0
shadow_max = 99999

# days warning for expiration
# Default: -1
shadow_warn = 7

# days before account inactive
# Default: -1
# shadow_inact = 7

# date when account expires
# Default: -1
# shadow_expire = 7

##########################################
# Cache settings
##########################################

# Use the SQLite cache
# Default: yes
#use_cache = no

# Absolute path to the SQLite database.
# Required setting. No default value.
db_path = /run/ega-users.db

# Sets how long a cache entry is valid, in seconds.
# Default: 3600 (ie 1h).
# cache_ttl = 86400
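
To make the distinction between required settings and settings with defaults concrete, here is a small Python sketch of a parser for this key = value format. The real parser is part of the NSS and PAM libraries; the names below are hypothetical, and the defaults and required keys are taken from the comments in the sample file above.

```python
# Defaults as stated in the comments of the sample file.
DEFAULTS = {
    "uid_shift": "10000",
    "shell": "/bin/bash",
    "use_cache": "yes",
    "cache_ttl": "3600",
}
# Settings marked "Required setting" in the sample file.
REQUIRED = ("gid", "homedir_prefix", "db_path")


def parse_auth_conf(text):
    """Parse key = value lines, ignoring blank lines and # comments."""
    settings = dict(DEFAULTS)
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            raise ValueError("malformed line: %r" % line)
        settings[key.strip()] = value.strip()
    missing = [key for key in REQUIRED if key not in settings]
    if missing:
        raise ValueError("missing required settings: " + ", ".join(missing))
    return settings


sample = """
gid = 997
homedir_prefix = /ega/inbox
db_path = /run/ega-users.db
# cache_ttl = 86400
"""
conf = parse_auth_conf(sample)
print(conf["homedir_prefix"], conf["cache_ttl"])
```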

Note

Once properly configured, the system requires no user maintenance: it is automagic. The other advantage is that the EGA user credentials are kept in one central location.

Moreover, it is also possible to add non-EGA users if necessary, by reproducing the same mechanism but outside the temporary cache. Those users will persist upon reboot.

Implementation

The cache is a SQLite database, mounted in a ramfs partition (of initial size 200M). A ramfs partition does not survive a reboot, grows dynamically and does not use the swap partition (as a tmpfs partition would). By default this option is disabled, but it can be enabled in the inbox entrypoint script.

The NSS+PAM source code has its own repository. A makefile is provided to compile and install the necessary shared libraries.

The ega-sshd service is configured to use PAM by creating the file /etc/pam.d/ega-sshd as follows.

#%PAM-1.0
auth       requisite    /lib/security/pam_ega_auth.so
account    requisite    /lib/security/pam_ega_acct.so attrs=0700 bail_on_exists
password   required     pam_deny.so
session    requisite    /lib/security/pam_ega_session.so umask=0007

The authentication code of the library (i.e. the auth type) checks whether the user has a valid SSH public key. If that is not the case, the user is prompted for a password. Central EGA stores password hashes using the Blowfish hashing algorithm. LocalEGA also supports the usual md5, sha256 and sha512 algorithms available on most Linux distributions (they are part of the C library).
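
As an illustration, the hashing scheme of a stored password can be recognised from its prefix, assuming the hashes follow the usual crypt(3) modular format (an assumption on our side; the source only names the algorithms). The function name hash_scheme is hypothetical.

```python
# Prefixes of the crypt(3) modular format for the algorithms named above.
CRYPT_PREFIXES = {
    "$1$": "md5",
    "$5$": "sha256",
    "$6$": "sha512",
    "$2a$": "blowfish",
    "$2b$": "blowfish",
    "$2y$": "blowfish",
}


def hash_scheme(password_hash):
    """Return the algorithm name for a crypt-style hash, or None if unknown."""
    for prefix, name in CRYPT_PREFIXES.items():
        if password_hash.startswith(prefix):
            return name
    return None


# The hash strings below are truncated placeholders, not real hashes.
print(hash_scheme("$2b$12$placeholder"))
print(hash_scheme("$6$salt$placeholder"))
```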

Updating a user password is not allowed (which is why the password type is configured to deny every access).

The session type handles the chrooting and the umask of the running process (here, the internal sftp-server). OpenSSH can also handle that, but it imposes more (arguably valuable) restrictions.

The account type of the PAM module ensures the user’s home directory is created. If the directory already exists, it is a pass-through that always succeeds.
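
The behaviour of the account type can be pictured with the following Python sketch. It only illustrates the "create if missing, otherwise pass through" logic with the 0700 mode from the attrs option; the function name ensure_homedir is hypothetical, and ownership changes, which a real module would also need to handle, are left out.

```python
import os
import stat
import tempfile


def ensure_homedir(path, mode=0o700):
    """Create the directory with the given mode; return True if it was created."""
    if os.path.isdir(path):
        return False           # already exists: pass-through
    os.mkdir(path)
    os.chmod(path, mode)       # explicit chmod, so the umask does not interfere
    return True


# Exercise the function in a throwaway directory instead of /ega/inbox.
with tempfile.TemporaryDirectory() as prefix:
    home = os.path.join(prefix, "john")
    created = ensure_homedir(home)
    again = ensure_homedir(home)
    perms = stat.S_IMODE(os.stat(home).st_mode)

print(created, again, oct(perms))
```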

Notifications

The messages sent by the OpenSSH hooks capture when a file is (re)uploaded, renamed or removed.

The body of the messages is JSON formatted as:

For a file upload:

{
                 'user': <str>,
             'filepath': <str>,
            'operation': "upload",
             'filesize': <num>,
   'file_last_modified': <num>, // a UNIX timestamp
  'encrypted_checksums': [{ 'type': <str>, 'value': <checksum as HEX> },
                          { 'type': <str>, 'value': <checksum as HEX> },
                          ...
                          ]
}

The checksum algorithm type is ‘md5’, or ‘sha256’. ‘sha256’ is preferred.
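
As a sketch of what such an upload message contains, the Python snippet below computes the size, modification time and SHA256 checksum of a file and assembles the body. The actual hooks are implemented inside the OpenSSH SFTP server; the function name upload_notification is hypothetical, and we assume here that filepath is passed through as given.

```python
import hashlib
import json
import os
import tempfile


def upload_notification(user, filepath):
    """Build the body of an upload notification for the given file."""
    sha256 = hashlib.sha256()
    with open(filepath, "rb") as f:
        # Read in chunks so large files are not loaded into memory at once.
        for chunk in iter(lambda: f.read(1 << 20), b""):
            sha256.update(chunk)
    info = os.stat(filepath)
    return {
        "user": user,
        "filepath": filepath,
        "operation": "upload",
        "filesize": info.st_size,
        "file_last_modified": int(info.st_mtime),   # UNIX timestamp
        "encrypted_checksums": [
            {"type": "sha256", "value": sha256.hexdigest()},
        ],
    }


# Build a message for a small throwaway file.
with tempfile.TemporaryDirectory() as inbox:
    path = os.path.join(inbox, "sample.bin")
    with open(path, "wb") as f:
        f.write(b"hello")
    message = upload_notification("john", path)

body = json.dumps(message)
print(body)
```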

For a file removal:

{
                 'user': <str>,
             'filepath': <str>,
            'operation': "remove",
}

For a file renaming:

{
                 'user': <str>,
             'filepath': <str>,
              'oldpath': <str>,
            'operation': "rename",
}
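
The templates above describe the fields; on the wire, strict JSON uses double-quoted keys and no trailing commas. A minimal Python sketch of the removal and renaming bodies follows. The function names are hypothetical, and we assume that filepath holds the new path in a rename message.

```python
import json


def remove_notification(user, filepath):
    """Body of a notification for a removed file."""
    return {"user": user, "filepath": filepath, "operation": "remove"}


def rename_notification(user, oldpath, filepath):
    """Body of a notification for a renamed file (filepath is the new path)."""
    return {
        "user": user,
        "filepath": filepath,
        "oldpath": oldpath,
        "operation": "rename",
    }


removed = json.dumps(remove_notification("john", "/dir/file.bin"))
renamed = json.dumps(rename_notification("john", "/dir/old.bin", "/dir/new.bin"))
print(removed)
print(renamed)
```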

The message headers include:

  • a content type: application/json
  • delivery mode: 2 (for persistence)
  • and a correlation id.

The correlation id is a UUID generated by uuid_generate. Its textual form is 36 characters long (37 bytes including the terminating NUL).
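
The headers can be pictured with the Python sketch below. The function name message_properties and the dictionary keys are illustrative, and uuid.uuid4 merely stands in for the uuid_generate call of the C code.

```python
import uuid


def message_properties():
    """Return the headers attached to each notification."""
    return {
        "content_type": "application/json",
        "delivery_mode": 2,                     # persistent message
        "correlation_id": str(uuid.uuid4()),    # textual form of a UUID
    }


props = message_properties()
print(props["correlation_id"], len(props["correlation_id"]))
```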

Logging

We leverage the logging capabilities of OpenSSH. It can use syslog, but since we (might) run the sftp-server in a chroot environment, the syslog socket /dev/log is not found there.

Here are a few steps to create it. Let’s say we want to get the logs for the user john. Create the directory /ega/inbox/john/dev (since /ega/inbox/john is the home directory for john). Install rsyslog, and configure it by creating the file /etc/rsyslog.d/sftp.conf with:

# create additional sockets for the sftp chrooted users
module(load="imuxsock")
input(type="imuxsock" Socket="/ega/inbox/john/dev/log" CreatePath="on")

# log internal-sftp activity to sftp.log
if $programname == 'internal-sftp' then /var/log/sftp.log
& stop

This will create the /dev/log socket in the environment where the ssh daemon is running (i.e. in john’s home directory).

Finally, you can update the LogLevel setting in the /etc/ega/sshd_config configuration file, or pass it as a command-line argument with -o LogLevel=<level>. INFO and VERBOSE are useful values for LogLevel.

You should now see the logs in /var/log/sftp.log.

Note

Compilation and installation normally use make && make install. You can get more output information by compiling ega-sshd with make debug1, make debug2 or make debug3.

Version 1.0 | Generated January 14, 2023