SSH: Who is really on the other side?

Oct 8, 2020 · 1531 words · 8 minute read devops security SSH

It is interesting to observe how efficient engineers deliver their work in a short period of time. If we look more closely at how they do it, we can see it is not so simple: good engineers rely on rich professional experience and on frameworks with a high level of automation. But fast delivery has another side. Sometimes the product is delivered while good security practice is ignored, such as strict SSH host key checking.

Problem description

In the Pan-Net cloud, all services are built on top of an OpenStack platform as a foundation. OpenStack is open source and API based, which enables a high level of automation of product delivery: a series of pipelines provisions virtual machines (VMs) and delivers and configures the applications that fit our needs.

So what seems to be the problem? In the cloud, virtual resources (compute, storage, network) are delivered via the cloud API using provisioning tools (Heat, Terraform), and that part is clear. But then the operating system needs to be configured, and the application has to be delivered and configured as well. Ansible can help here, but in that case a Linux-based system has to be managed over SSH.

Simple Pipeline

Ansible usually relies on dynamic discovery of all provisioned instances, accesses many virtual machines in parallel, and identifies itself with an SSH private/public key pair. That is not all: by default, the SSH client also tries to authenticate the SSH server. For that purpose, the SSH server generates its own key pair during service initialization and usually stores it inside the /etc/ssh/ folder. During connection establishment it delivers the public key to the client (visible to the client as a fingerprint), which the client needs to trust to have a valid connection.
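As a side note, the fingerprint the client shows is just a hash of that public key. A minimal sketch of how the SHA256 form printed by OpenSSH (and by cloud-init later in this post) can be reproduced; the function name is ours, purely illustrative:

import base64
import hashlib

def sha256_fingerprint(pub_key_line):
    # A public key line looks like "<keytype> <base64-blob> [comment]",
    # e.g. "ssh-ed25519 AAAAC3Nza... no comment"
    key_blob = base64.b64decode(pub_key_line.split()[1])
    digest = hashlib.sha256(key_blob).digest()
    # OpenSSH prints SHA256 fingerprints base64-encoded, without '=' padding
    return 'SHA256:' + base64.b64encode(digest).decode().rstrip('=')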

But wait: we have brand-new Linux machines, and the SSH client has no previously established trust (recorded inside the known_hosts file) between client and server, so the client has to accept or reject the fingerprint of the SSH server. That blocks deployment automation. To avoid this behaviour, what an engineer typically does is disable strict SSH host key checking in the Ansible configuration and treat all SSH servers as trusted.

[defaults]
remote_user=ubuntu
roles_path=roles/
host_key_checking = False

Basically, this configuration turns off the SSH server authentication that the SSH protocol supports by having the server present its public SSH key to the client.

Threat modeling

Let’s assume a malicious actor can, as a prerequisite, pull off some standard tricks: a classic L2 attack such as ARP poisoning, exploitation of a vulnerability in the SDN (Software Defined Network), or DNS spoofing, redirecting traffic to him-/herself. Now let’s analyze what can go wrong:

  1. A malicious actor could execute a MITM attack, stand in the channel between SSH client and server, proxy the SSH call, and steal the private keys.

MITM

Actually, this is a wrong assumption, because the SSH private key never leaves the client. The client only has to prove to the server that it is in possession of the private key (authentication), in combination with a key exchange based on Diffie-Hellman.
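To make the proof-of-possession idea concrete, here is a deliberately simplified sketch (not the real SSH handshake) using the third-party cryptography package: the client signs session-specific data with its private key, and the server verifies the signature against the public key it already trusts. The key material itself is never transmitted.

from cryptography.hazmat.primitives.asymmetric import ed25519

# Client side: the private key stays here
client_key = ed25519.Ed25519PrivateKey.generate()

# Server side: only the public key is known (e.g. from authorized_keys)
server_copy_of_public_key = client_key.public_key()

# Simplified stand-in for the session identifier derived during key exchange
session_data = b'session-id|username|ssh-connection|publickey'

signature = client_key.sign(session_data)                  # happens on the client
server_copy_of_public_key.verify(signature, session_data)  # raises if it does not match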

  2. Another threat that has to be addressed is destination forging. After hijacking the traffic, the malicious actor can pretend to be the “real destination” and simulate the authentication handshake with the client, while in reality accepting any SSH connection and fake-authenticating any SSH user.

destination forging

In this case the malicious actor will not be able to steal private keys, but it gives him the ability to obtain all content that was meant to be delivered to the VM. This content could include, but is not limited to, user credentials, secrets, configurations, and other sensitive material. To summarize, it lets the malicious actor collect sensitive content that can later be used to exploit the operating system or for other bad things such as blackmail.

So, what can we do about it?

There are two options available to mitigate the possibility that a threat actor exploits this weakness:

  1. Use cloud-init and its vendor-data mechanism to deliver host private/public keys to the SSH server that are generated outside of the VM. But this potentially opens other challenges, such as the proper way of handling a freshly generated key pair and, in the end, how such a solution would scale.

  2. Another, more convenient approach is to pick up the host public key out of band, for example through the OpenStack API. After some research, we found out that cloud-init, the basic utility that runs after the first boot of the VM, actually prints the public keys to the console, and these are reflected in the logs.

[  229.224272] cloud-init[3098]: ci-info: no authorized ssh keys fingerprints found for user centos.
ec2:
ec2: #############################################################
ec2: -----BEGIN SSH HOST KEY FINGERPRINTS-----
ec2: 256 SHA256:BqIGYpaKldQMkvcD31UQKBCQxY/Yrtl+lOCGaQXuhh4 no comment (ECDSA)
ec2: 256 SHA256:Ofio6ROXj5QpsyXYbBot+qQ/+cLjn/IrfU8OZ5MRjqU no comment (ED25519)
ec2: 2048 SHA256:mctMZawV8jCTnnkxJIK4yjQMsbhQGtMMi2aDAzqdBVo no comment (RSA)
ec2: -----END SSH HOST KEY FINGERPRINTS-----
ec2: #############################################################
-----BEGIN SSH HOST KEY KEYS-----
ecdsa-sha2-nistp256 AAAAE2VjZHNhLXNoYTItbmlzdHAyNTYAAAAIbmlzdHAyNTYAAABBBDjvBSxuPSicMgBV575yHITjh7GwFoR71DbUJ/yRUCJFGs9GLYnPwDy28K71FVkbtrp3sVMQZ1VFsDv90LndFxc=
ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIO2UHj3d62yTNfRh3xbugSRNGc+VQXemP58zrqmj02RO
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDPG5FnmYlAkuoqnYVGbE4gcKpkGd22o59NqVOLtOssknbMC0y2/FOc+nWknegJe27L/rdEvFcYOY+a76H6ZaZn1Qz5ifXSEdrBE7/tWh16V+MUWM9S2/lfRVjDB5qcNwa5OFt60dmgWV0CljFXJHFlFFUZ1kOyM+5TmnTilT65TIRw8hxo7iIV9aPdDn0/bUsbjQAPaoiEf5LJb7fEIn6WpnavWjWVA5+eewBldG7s+6+B5jsdxYZMaG7ZsuvjWxjfX+PYvl1CeMF/AMtx3h2biMxXA7eYSyQCu5r8ibzv31hsCLJ5pzugrg2ccWEp0MdLD8QBaDGAtKbZKhlFIwL5
-----END SSH HOST KEY KEYS-----
[  229.273049] cloud-init[3098]: Cloud-init v. 18.2 finished at Mon, 16 Dec 2019 10:58:11 +0000. Datasource DataSourceNone.  Up 229.26 seconds

Proof of concept

For scenario 2 we have all the elements we need: the public keys and an out-of-band way of collecting them. What is still missing is how to preserve the public keys. After some time the logs will be rotated and the public keys will no longer be available, but we may want to customize the VMs later and re-establish trust with them. For that reason we can introduce a protected GitLab branch where we preserve the public keys.

PoC

known_hosts file format

The standard man sshd page gives us some insight into how the client constructs the known_hosts file. It simply says:

Each line in these files contains the following fields: markers (optional), hostnames, keytype, base64-encoded key, comment. The fields are separated by spaces.

Further on we can see that entries may contain the hostname in two different formats: hashed, to hide the names, or in plain text.

Alternately, hostnames may be stored in a hashed form which hides host names and addresses should the file’s contents be disclosed.
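Putting the two quotes together, a plain-text known_hosts entry can be assembled directly from the address of the VM and a key line taken from the console log. A small illustrative helper (hypothetical, not part of the PoC):

def known_hosts_entry(address, console_key_line):
    # console_key_line is one line from the console log, e.g.
    # "ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIO2UHj3d62yTNfRh3xbugSRNGc+VQXemP58zrqmj02RO"
    keytype, key = console_key_line.split()[:2]
    # Plain-text form: "hostnames keytype base64-encoded-key"
    return f'{address} {keytype} {key}'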

PoC implementation

Now we have enough material to implement a PoC and prove that the concept really works. The PoC is implemented in Python, so we need to include the openstack and gitlab Python modules.

The prerequisite is to understand how to use the OpenStack API, the OpenStack SDK, and the GitLab API in Python code.
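As a rough illustration of those prerequisites (the cloud name, URL, and token below are placeholders), opening the two API sessions could look like this:

import gitlab
import openstack

# OpenStack session, assuming a cloud named "mycloud" is defined in clouds.yaml
conn = openstack.connect(cloud='mycloud')

# GitLab session, assuming a personal access token with API scope
gl = gitlab.Gitlab('https://gitlab.example.com', private_token='xxxx')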

The core of the PoC is to collect the console log from a selected virtual server; in the SDK this is documented as the get_server_console_output(server, length=None) method. The next step is to extract the keys:

@staticmethod
def _extract_pub_key_from_console(console_output):
    # cloud-init prints the host public keys between these two markers
    begin_marker = '-----BEGIN SSH HOST KEY KEYS-----'
    end_marker = '-----END SSH HOST KEY KEYS-----'
    i_begin = console_output.find(begin_marker)
    i_end = console_output.find(end_marker)
    if i_begin == -1 or i_end == -1:
        # Markers are missing, e.g. the console log has already been rotated
        return []
    i_start_of_hostkeys = i_begin + len(begin_marker)
    # One public key per line between the two markers
    keys = console_output[i_start_of_hostkeys:i_end].split('\n')
    return [key.strip() for key in keys if key.strip()]

In the code above there is a case where the method returns nothing: when the logs have already been rotated and are no longer available to our script.
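For context, a sketch of how the console log might be fetched and passed to this method (the class name FingerUtility and the server ID are placeholders, and the exact return type of the SDK call can differ between versions, hence the defensive handling):

import openstack

conn = openstack.connect(cloud='mycloud')  # placeholder cloud name, as above
raw = conn.compute.get_server_console_output('SERVER_ID', length=None)
# Depending on the SDK version this may be a plain string or a dict with an 'output' key
console_text = raw.get('output') if isinstance(raw, dict) else raw
host_keys = FingerUtility._extract_pub_key_from_console(console_text)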

python fingerutility.py  -i xxxxx gitlab --b know_hosts -k xxxx -u  https://gitlab.example.com -p xxx
File known_hosts created inside of branch know_hosts

In the example above the utility collects the keys for the given instance (-i) and stores them in GitLab as the file known_hosts, using an API token (-p).

know_host
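Under the hood, storing the collected keys could look roughly like this with python-gitlab (a sketch, not the actual fingerutility code; the project path, branch name, token, and key values are placeholders, and the token’s role must be allowed to push to the protected branch):

import gitlab

# Placeholders: the VM address and the key lines extracted from its console log
address = 'xx.xx.xx.xx'
host_keys = ['ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIO2UHj3d62yTNfRh3xbugSRNGc+VQXemP58zrqmj02RO']

gl = gitlab.Gitlab('https://gitlab.example.com', private_token='xxxx')
project = gl.projects.get('project/fingerutility')

# One known_hosts line per host key: "hostname keytype base64-key"
content = '\n'.join(f'{address} {key_line}' for key_line in host_keys) + '\n'

project.files.create({
    'file_path': 'known_hosts',
    'branch': 'know_hosts',
    'content': content,
    'commit_message': 'Store host keys for newly provisioned VM',
})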

Now let’s test it:

ssh ubuntu@xx.xx.xx.xx
The authenticity of host 'xx.xx.xx.xx (xx.xx.xx.xx)' can't be established.
ECDSA key fingerprint is SHA256:yKUO+vOTXUKSYP8addtktGP1lmTfigdWJiKR8WOICR4.
Are you sure you want to continue connecting (yes/no)?

That is OK: we still do not trust the new VM yet, so let us pick up the known_hosts file created and stored on GitLab by the fingerutility utility.

curl -L  https://gitlab.example.com/project/fingerutility/-/raw/id/known_hosts -o ~/.ssh/known_hosts && ssh ubuntu@xx.xx.xx.xx
% Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                               Dload  Upload   Total   Spent    Left  Speed
100  1441  100  1441    0     0   2798      0 --:--:-- --:--:-- --:--:--  2798
Welcome to Ubuntu 18.04.5 LTS (GNU/Linux 4.15.0-115-generic x86_64)

* Documentation:  https://help.ubuntu.com
* Management:     https://landscape.canonical.com
* Support:        https://ubuntu.com/advantage

System information as of Sun Oct  4 18:44:18 UTC 2020

System load:  0.0               Processes:           89
Usage of /:   5.3% of 19.21GB   Users logged in:     0
Memory usage: 2%                IP address for ens3: xx.xx.xx.xx
Swap usage:   0%


* Canonical Livepatch is available for installation.
 - Reduce system reboots and improve kernel security. Activate at:
   https://ubuntu.com/livepatch

0 packages can be updated.
0 updates are security updates.

Failed to connect to https://changelogs.ubuntu.com/meta-release-lts. Check your Internet connection or proxy settings


Last login: Sun Oct  4 18:24:33 2020 from xx.xx.xx.xx
To run a command as administrator (user "root"), use "sudo <command>".
See "man sudo_root" for details.

ubuntu@ssh-key-host:~$

So from the output we can see that our PoC is working; it is now up to us to adapt it and integrate it into our automation process.
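For example (a sketch, not taken from the original pipeline), the Ansible configuration from the beginning of the post could then keep strict host key checking enabled and point SSH at the collected file; note that setting ssh_args replaces Ansible’s default SSH options:

[defaults]
remote_user=ubuntu
roles_path=roles/
host_key_checking = True

[ssh_connection]
ssh_args = -o UserKnownHostsFile=./known_hosts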

Dubravko Sever
Production Factory Security Senior Specialist