Email authentication
Email authentication, or validation, is a collection of techniques aimed at providing verifiable information about the origin of email messages by validating the domain ownership of any message transfer agents (MTAs) that participated in transferring and possibly modifying a message.
The original base of Internet email, Simple Mail Transfer Protocol, has no such feature, so forged sender addresses in emails have been widely used in phishing, email spam, and various types of fraud. To combat this, many competing email authentication proposals have been developed, but only fairly recently have three been widely adopted – SPF, DKIM and DMARC. The results of such validation can be used in automated email filtering, or can assist recipients when selecting an appropriate action.
This article does not cover user authentication of email submission and retrieval.
Rationale
In the early 1980s, when Simple Mail Transfer Protocol was designed, it provided for no real verification of the sending user or system. This was not a problem while email systems were run by trusted corporations and universities, but since the commercialization of the Internet in the early 1990s, spam, phishing, and other crimes increasingly involve email. Email authentication is a necessary first step towards identifying the origin of messages, and thereby making policies and laws more enforceable.
Hinging on domain ownership is a stance that emerged in the early 2000s. It implies coarse-grained authentication, given that domains appear in the right-hand part of email addresses, after the at sign. Fine-grained authentication, at the user level, can be achieved by other means, such as Pretty Good Privacy and S/MIME. At present, digital identity needs to be managed by each individual.
A major rationale for email authentication is the ability to automate email filtering at receiving servers. That way, spoofed messages can be rejected before they arrive in a user's Inbox. While protocols strive to devise ways to reliably block distrusted mail, security indicators can tag unauthenticated messages that still reach the Inbox. A 2018 study shows that security indicators can lower the click-through ratio by more than ten points, from 48.9% to 37.2% of the users who open spoofed messages.
Nature of the problem
SMTP defines message transport, not the message content. Thus, it defines the mail envelope and its parameters, such as the envelope sender, but neither the header nor the body of the message itself. STD 10 defines SMTP, while STD 11 defines the message, formally referred to as the Internet Message Format.
SMTP defines the trace information of a message, which is saved in the header using the following two fields:
- Received: when an SMTP server accepts a message it inserts this trace record at the top of the header.
- Return-Path: when the delivery SMTP server makes the final delivery of a message, it inserts this field at the top of the header.
The path depicted on the left can be reconstructed from the trace header fields that each host adds to the top of the header when it receives the message:
Return-Path:
Received: from D.example.org by E.example.org with SMTP; Tue, 05 Feb 2013 11:45:02 -0500
Received: from C.example.net by D.example.org with SMTP; Tue, 05 Feb 2013 11:45:02 -0500
Received: from B.example.com
by C.example.net with ESMTP id 936ADB8838C
for
Received: from A.example.com by B.example.com with SMTP; Tue, 05 Feb 2013 17:44:47 +0100
Received: from by A.example.com with SMTP; Tue, 05 Feb 2013 17:44:42 +0100
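As a rough illustration of how such a path can be reconstructed, the Python sketch below reads the Received: fields from the bottom up using the standard email module. The sample message, the regular expression, and the function name relay_path are assumptions made for this example (the timestamp of the third hop in particular is invented), and the fields are taken at face value, with no attempt to decide which of them can be trusted.

```python
import re
from email import message_from_string

# Hypothetical message: host names follow the example above,
# the remaining details are made up for illustration.
RAW_MESSAGE = """\
Received: from D.example.org by E.example.org with SMTP; Tue, 05 Feb 2013 11:45:02 -0500
Received: from C.example.net by D.example.org with SMTP; Tue, 05 Feb 2013 11:45:02 -0500
Received: from B.example.com by C.example.net with ESMTP; Tue, 05 Feb 2013 08:44:50 -0800
Received: from A.example.com by B.example.com with SMTP; Tue, 05 Feb 2013 17:44:47 +0100
Subject: hypothetical test message

Body text.
"""

# Simplified pattern: real Received fields may carry comments and extra clauses.
HOP = re.compile(r"from\s+(?P<sender>\S+)\s+by\s+(?P<receiver>\S+)", re.IGNORECASE)


def relay_path(raw_message):
    """Return the host names in the order the message travelled."""
    msg = message_from_string(raw_message)
    hops = []
    # Each host prepends its field, so the oldest hop is the last one listed.
    for value in reversed(msg.get_all("Received", [])):
        match = HOP.search(value)
        if match:
            hops.append((match["sender"], match["receiver"]))
    if not hops:
        return []
    # Start from the first sender, then follow each receiving host.
    path = [hops[0][0]]
    for _sender, receiver in hops:
        path.append(receiver)
    return path


print(" -> ".join(relay_path(RAW_MESSAGE)))
```

A real filter would stop at the first field written outside the recipient's own domain, since everything below that point may have been forged.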
It is important to realize that the first few lines at the top of the header are usually trusted by the recipient. In fact, those lines are written by machines in the recipient's Administrative Management Domain (ADMD), which act upon her or his explicit mandate. By contrast, the lines that prove the involvement of A and B, as well as of the purported author's MUA, could be a counterfeit created by C. The Received: field shown above is an epoch-making piece of the header. The Return-Path: is written by E, the mail delivery agent, based on the message envelope. Additional trace fields, designed for email authentication, can populate the top of the header.
Normally, messages sent out by an author's ADMD go directly to the destination's MX. The sender's ADMD can add authentication tokens only if the message passes through its own servers. The most common cases can be schematized as follows:
Sending from within ADMD's network (MUA 1)
- The ADMD's MSA authenticates the user, either based on its IP address or some other SMTP Authentication means. Depending on the recipient address, the message can follow the normal path or pass through a mailing list or a forwarding service. B can be an outbound SMTP proxy or a smarthost.
- If the local network does not block outbound port 25 connections, the user can deploy some "direct-to-mx" software. Typically, zombies and other malicious hosts behave that way.
- If the MUA is badly configured, it can also use a different relay, such as an outmoded open relay, that often doesn't authenticate the user.
Roaming user (MUA 2)
- Most of the time, it is still possible to use one's own ADMD MSA.
- Outbound connections to port 25 can be intercepted and tunneled to a transparent proxy.
- An MUA can be configured to use an SMTP relay that the local network provider offers as a bonus.
Disconnected user
- An e-card can send mail on behalf of a customer who typed email addresses on the local keyboard; some web forms can be considered to work similarly.
Authentication methods in widespread use
SPF
SPF allows the receiver to check that an email claimed to have come from a specific domain comes from an IP address authorized by that domain's administrators. Usually, a domain administrator will authorize the IP addresses used by their own outbound MTAs, including any proxy or smarthost.
The IP address of the sending MTA is guaranteed to be valid by the Transmission Control Protocol, as it establishes the connection by checking that the remote host is reachable. The receiving mail server receives the HELO SMTP command soon after the connection is set up, and a Mail from: at the beginning of each message. Both of them can contain a domain name. The SPF verifier queries the Domain Name System for a matching SPF record, which, if it exists, will specify the IP addresses authorized by that domain's administrator. The result can be "pass", "fail", or some intermediate result, and systems will generally take this into account in their anti-spam filtering.
DKIM
DKIM checks the message content, deploying digital signatures. Rather than using digital certificates, the keys for signature verification are distributed via the DNS. That way, a message gets associated with a domain name.
A DKIM-compliant domain administrator generates one or more pairs of asymmetric keys, then hands private keys to the signing MTA, and publishes public keys in the DNS. The DNS labels are structured as selector._domainkey.example.com, where selector identifies the key pair, and _domainkey is a fixed keyword, followed by the signing domain's name so that publication occurs under the authority of that domain's ADMD. Just before injecting a message into the SMTP transport system, the signing MTA creates a digital signature that covers selected fields of the header and the body. The signature should cover substantive header fields such as From:, To:, Date:, and Subject:, and is then added to the message header itself, as a trace field. Any number of relays can receive and forward the message, and at every hop the signature can be verified by retrieving the public key from the DNS. As long as intermediate relays don't modify signed parts of a message, its DKIM signatures remain valid.
DMARC
DMARC allows the specification of a policy for authenticated messages. It is built on top of two existing mechanisms, Sender Policy Framework (SPF) and DomainKeys Identified Mail (DKIM).
It allows the administrative owner of a domain to publish a policy in their DNS records to specify which mechanism is employed when sending email from that domain; how to check the From: field presented to end users; how the receiver should deal with failures; and a reporting mechanism for actions performed under those policies.
Other methods
A range of other methods have been proposed, but are now either deprecated or have not yet gained widespread support. These have included Sender ID, Certified Server Validation, DomainKeys and those below:
ADSP
ADSP allowed the specification of a policy for messages signed by the author's domain. A message had to go through DKIM authentication first, then ADSP could demand a punishing treatment if the message was not signed by the author domain, as per the From: header field.
ADSP was demoted to historic in November 2013.
VBR
VBR adds a vouch to an already authenticated identity. This method requires some globally recognized authorities that certify the reputation of domains.
A sender can apply for a reference at a vouching authority. The reference, if accepted, is published on the DNS branch managed by that authority. A vouched sender should add a VBR-Info: header field to the messages it sends. It should also add a DKIM signature, or use some other authentication method, such as SPF. A receiver, after validating the sender's identity, can verify the vouch claimed in VBR-Info: by looking up the reference.
iprev
Applications should avoid using this method as a means of authentication. Nevertheless, it is often carried out and its results, if any, written in the Received: header field besides the TCP information required by the SMTP specification.
The IP reverse, confirmed by looking up the IP address of the name just found, is just an indication that the IP was set up properly in the DNS. The reverse resolution of a range of IP addresses can be delegated to the ADMD that uses them, or can remain managed by the network provider. In the latter case, no useful identity related to the message can be obtained.
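The confirmed reverse lookup described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names are invented for this example, the two resolver functions are passed in as parameters so the logic can be exercised without network access, and every kind of lookup failure is collapsed into a plain "fail", which is coarser than what a production verifier would report.

```python
import ipaddress
import socket


def system_reverse(ip):
    """Reverse (PTR) lookup through the system resolver."""
    return socket.gethostbyaddr(ip)[0]


def system_forward(name):
    """Forward lookup: every address the name resolves to."""
    return [info[4][0] for info in socket.getaddrinfo(name, None)]


def iprev(ip, reverse=system_reverse, forward=system_forward):
    """Return ("pass", name) if the reverse name maps back to ip, else ("fail", name or None)."""
    target = ipaddress.ip_address(ip)
    try:
        name = reverse(ip)
    except OSError:
        return "fail", None          # no usable reverse record
    try:
        addresses = forward(name)
    except OSError:
        return "fail", name          # the name does not resolve at all
    for address in addresses:
        try:
            if ipaddress.ip_address(address) == target:
                return "pass", name  # forward lookup confirms the reverse one
        except ValueError:
            continue                 # skip anything that is not a plain IP address
    return "fail", name


if __name__ == "__main__":
    # Uses the real resolver; the outcome depends on the local DNS setup.
    print(iprev("127.0.0.1"))
```

Even a "pass" says nothing about who sent the message when the reverse zone is run by the network provider, which is why the result is at most a weak hint for filtering.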
DNSWL
Looking up a DNSWL (DNS-based whitelist) may provide an assessment of the sender, possibly including its identification.
Authentication-Results
RFC 8601 defines a trace header field, Authentication-Results:, where a receiver can record the results of email authentication checks that it carried out. Multiple results for multiple methods can be reported in the same field, separated by semicolons and wrapped as appropriate.
For example, the following field is purportedly written by receiver.example.org and reports SPF and DKIM results:
Authentication-Results: receiver.example.org;
spf=pass smtp.mailfrom=example.com;
dkim=pass [email protected]
The first token after the field name, receiver.example.org, is the ID of the authentication server, a token known as an authserv-id. A receiver supporting RFC 8601 is responsible for removing any false header claiming to belong to its domain, so that downstream filters cannot get confused. However, those filters still need to be configured, as they have to know which identities the domain may use.
For a Mail User Agent, it is slightly harder to learn what identities it can trust. Since users can receive email from multiple domains (e.g., if they have multiple email addresses), any of those domains could let Authentication-Results: fields pass through because they looked neutral. That way, a malicious sender can forge an authserv-id that the user would trust if the message arrived from a different domain. A legitimate Authentication-Results: typically appears just above a Received: field by the same domain from which the message was relayed. Additional Received: fields may appear between that and the top of the header, as the message got transferred internally between servers belonging to that same, trusted ADMD.
The Internet Assigned Numbers Authority maintains a registry of Email Authentication Parameters. Not all parameters need to be registered, though. For example, there can be local "policy" values designed for a site's internal use only, which correspond to local configuration and need no registration.
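To make the role of the authserv-id concrete, the following Python sketch splits an Authentication-Results: value into its parts and discards the results unless the authserv-id is one the reader has chosen to trust. It is a deliberately naive illustration: the function names, the sample value (including the header.d property) and the set of trusted identifiers are assumptions made for this example, and the parser ignores comments, quoted strings and the other syntax details that RFC 8601 allows.

```python
def parse_authentication_results(value):
    """Split a field value into (authserv_id, list of result dictionaries)."""
    parts = [part.strip() for part in value.split(";") if part.strip()]
    if not parts:
        raise ValueError("empty Authentication-Results value")
    # The first token of the first part is the authserv-id.
    authserv_id = parts[0].split()[0]
    results = []
    for part in parts[1:]:
        tokens = part.split()
        # The first token looks like "method=result"; the rest are properties.
        method, _, result = tokens[0].partition("=")
        props = dict(token.split("=", 1) for token in tokens[1:] if "=" in token)
        results.append({"method": method, "result": result, "props": props})
    return authserv_id, results


def trusted_results(value, trusted_ids):
    """Return the results only if they were written by a trusted authserv-id."""
    authserv_id, results = parse_authentication_results(value)
    if authserv_id.lower() in trusted_ids:
        return results
    return []


# Hypothetical folded field value, written without the field name.
SAMPLE = (
    "receiver.example.org;\n"
    " spf=pass smtp.mailfrom=example.com;\n"
    " dkim=pass header.d=example.com"
)

for entry in trusted_results(SAMPLE, {"receiver.example.org"}):
    print(entry["method"], entry["result"], entry["props"])
```

Matching on the authserv-id alone is not sufficient in practice; as noted above, the position of the field relative to the trusted Received: fields also matters.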