OpenSSH, PAM and user names

FreeBSD just published a security advisory for, amongst other issues, a piece of code in OpenSSH’s PAM integration which could allow an attacker to use one user’s credentials to impersonate another (CVE 2015-6563, original patch). I would like to clarify two things, one that is already mentioned in the advisory and one that isn’t.

The first is that in order to exploit this, the attacker must not only have valid credentials but also first compromise the unprivileged pre-authentication child process through a bug in OpenSSH itself or in a PAM service module.

The second is that this behavior, which is universally referred to in advisories and the trade press as a bug or flaw, is intentional and required by the PAM spec (such as it is). There are multiple legitimate use cases for this, such as:

  • Letting PAM, rather than the application, prompt for a user name; the spec allows passing NULL instead of a user name to pam_start(3), in which case it is the service module’s responsibility (in pam_sm_authenticate(3)) to prompt for a user name using pam_get_user(3). Note that OpenSSH does not support this.

  • Mapping multiple users with different identities and credentials in the authentication backend to a single “template” user when the application they need to access does not need to distinguish between them, or when this determination is made through other means (e.g. environment variable, which service modules are allowed to set).

  • Mapping Windows user names (which can contain spaces and non-ASCII characters that would trip up most Unix applications) to Unix user names.

That being said, I do not object to the patch, only to its characterization. Regarding the first issue, it is absolutely correct to consider the unprivileged child as possibly hostile; this is, after all, the entire point of privilege separation. Regarding the second issue, there are other (and probably better) ways to achieve the same result—performing the translation in the identity service, i.e. nsswitch, comes to mind—and the percentage of users affected by the change lies somewhere between zero and negligible.

One could argue that instead of silently ignoring the user name set by PAM, OpenSSH should compare it to the original user name and either emit a warning or drop the connection if it does not match, but that is a design choice which is entirely up to the OpenSSH developers.

Update 2015-08-27 NIST rates exploitability as “medium” rather than “low” because an attacker who is able to impersonate the UID used by the unprivileged child can use a debugger or other similar method to modify the username that the child passes back to the parent. In other words, an attacker can leverage elevated privileges into other elevated privileges. I disagree with the rating, but have never had any luck getting NIST to correct even blatantly false information in the past.

VerifyHostKeyDNS

The Internet Society likes my work. I aim to please…

One of the things I did in the process of importing LDNS and Unbound into FreeBSD 10 was to change the default value for VerifyHostKeyDNS from “no” to “yes” in our OpenSSH when compiled with LDNS support (which can be turned off by adding WITHOUT_LDNS=YES to /etc/src.conf before buildworld).

The announcement the ISOC blog post refers to briefly explains my reasons for doing so:

I consider this a lesser evil than “ask” (aka “train the user to type ‘yes’ and hit enter”) and “no” (aka “train the user to type ‘yes’ and hit enter without even the benefit of a second opinion”).

There were objections to this (which I’m too lazy to dig up and quote) along the lines of:

  • Shouldn’t OpenSSH tell you that it found and used an SSHFP record?
  • Shouldn’t known_hosts entries take precedence over SSHFP records?
  • Shouldn’t OpenSSH store the key in known_hosts after verifying it against an SSHFP record?

The answer to all of the above is “yes, but…”

Here is how host key verification should work, ideally:

  1. Obtain host key from server
  2. Gather cached host keys from various sources (known_hosts, SSHFP, LDAP…)
  3. If we found one or more cached keys:
    1. Check for and warn about inconsistencies between these sources
    2. Check for and warn about inconsistencies between the cached key and what the server sent us
    3. If we got a match from a trusted source, continue connecting
    4. Inform the user of any matches from untrusted sources
  4. Display the key’s fingerprint
  5. Ask the user whether to:
    1. Store the server’s key for future reference and continue connecting
    2. Continue connecting without storing the key
    3. Disconnect

The only configuration required here is a list of trusted and untrusted sources, the difference being that a match or mismatch from a trusted source is normative while a match or mismatch from an untrusted source is merely informative.

Unfortunately, in OpenSSH, SSHFP support seems to have been grafted onto the existing logic rather than integrated into it. Here’s how it actually works:

  1. Obtain host key from server
  2. If VerifyHostKeyDNS is “yes” or “ask”, look for SSHFP records in DNS
  3. If an SSHFP record was found:
    1. If it matches the server’s key:
      1. If it has a valid DNSSEC signature and VerifyHostKeyDNS is “yes”, continue connecting
      2. Otherwise, set a flag to indicate that a matching SSHFP record was found
    2. Otherwise, warn about the mismatch
  4. Look for cached keys in the user and system host key files
  5. If we got a match from the host key files, continue connecting
  6. If we did not find anything in the host key files:
    1. If we found a matching SSHFP record, tell the user
    2. Ask the user whether to:
      1. Store the server’s key for future reference and continue connecting
      2. Disconnect
  7. If we found a matching revoked key in the host key files, warn the user and terminate
  8. If we found a different key in the host key files, warn the user and terminate

Part of the problem is that at the point where we tell the user that we found a matching SSHFP record, we no longer know whether it was signed. By switching the default for VerifyHostKeyDNS to “yes”, I’m basically saying that I trust DNSSEC more than I trust the average user’s ability to understand the information they’re given and make an informed decision.

DNS in FreeBSD 10

Yesterday, I wrote about the local caching resolver we now have in FreeBSD 10. I’ve fielded quite a few questions about it (in email and on IRC), and I realized that although this has been discussed and planned for a long time, most people outside the 50 or so developers who attended one or both of the last two Cambridge summits (201208 and 201308) were not aware of it, and may not understand the motivation.

There are two parts to this. The first is that BIND is a support headache with frequent security advisories and a lifecycle that aligns poorly with our release schedule, so we end up having to support FreeBSD releases containing a discontinued version of BIND. The second part is the rapidly increasing adoption of DNSSEC, which requires a caching DNSSEC-aware resolver both for performance reasons (DNSSEC validation is time-consuming) and to avoid having to implement DNSSEC validation in the libc resolver.

We could have solved the DNSSEC issue by configuring BIND as a local caching resolver, but for the reasons mentioned above, we really want to remove BIND from the base system; hence the adoption of a lightweight caching resolver. An additional benefit of importing LDNS (which is a prerequisite for Unbound) is that OpenSSH can now validate SSHFP records.

Note that the dns/unbound port is not going away, and that users who want to run Unbound as a caching resolver for an entire network rather than just a single machine have the option of either moving their configuration into /var/unbound/unbound.conf, or running the base and port versions side-by-side. This should not be a problem as long as the port version doesn’t try to listen on 127.0.0.1 or ::1.

I’d like to add that since my previous post on the subject, and with the help of readers, developers and users, I have identified and corrected several issues with the initial commit

  • /etc/unbound is now a symlink to /var/unbound. My original intention was to have the configuration files in /etc/unbound and the root anchor, unbound-control keys etc. in /var/unbound, but the daemon needs to access both locations at run-time, not just on start-up, so they must all be inside the chroot. Running the daemon un-chrooted is, of course, out of the question.
  • The init script ordering has been amended so the local_unbound service now starts before most (hopefully all) services that need functioning DNS.
  • resolvconf(8) is now blocked from updating /etc/resolv.conf to avoid failing over from the DNSSEC-aware local resolver to a potentially non-DNSSEC-aware remote resolver in the event of a request returning an invalid record.
  • The configure command line and date / time are no longer included in the binary.

Finally, I just flipped the switch so that BIND is now disabled by default and the LDNS utilities are enabled. The BIND_UTILS and LDNS_UTILS build options are mutually exclusive; in hindsight, I should probably have built and installed the new host(1) as ldns-host(1) so both options could have been enabled at the same time. We don’t yet have a dig(1) wrapper for drill(1), so host(1) is the only actual conflict.