NIST SPECIAL PUBLICATION 1800-16B
Securing Web Transactions
TLS Server Certificate Management
Security Risks and Recommended Best Practices
William C. Barker
The MITRE Corporation
This publication is available free of charge from: http://doi.org/10.6028/NIST.SP.1800-16
The first draft of this publication is available free of charge from: https://www.nccoe.nist.gov/projects/building-blocks/tls-server-certificate-management
Certain commercial entities, equipment, products, or materials may be identified by name or company logo or other insignia in order to acknowledge their participation in this collaboration or to describe an experimental procedure or concept adequately. Such identification is not intended to imply special status or relationship with NIST or recommendation or endorsement by NIST or NCCoE; neither is it intended to imply that the entities, equipment, products, or materials are necessarily the best available for the purpose.
National Institute of Standards and Technology Special Publication 1800-16B, Natl. Inst. Stand. Technol. Spec. Publ. 1800-16B, 108 pages, (June 2020), CODEN: NSPUE2
As a private-public partnership, we are always seeking feedback on our practice guides. We are particularly interested in seeing how businesses apply NCCoE reference designs in the real world. If you have implemented the reference design, or have questions about applying it in your environment, please email us at email@example.com.
All comments are subject to release under the Freedom of Information Act.
NATIONAL CYBERSECURITY CENTER OF EXCELLENCE
The National Cybersecurity Center of Excellence (NCCoE), a part of the National Institute of Standards and Technology (NIST), is a collaborative hub where industry organizations, government agencies, and academic institutions work together to address businesses’ most pressing cybersecurity issues. This public-private partnership enables the creation of practical cybersecurity solutions for specific industries, as well as for broad, cross-sector technology challenges. Through consortia under Cooperative Research and Development Agreements (CRADAs), including technology partners—from Fortune 50 market leaders to smaller companies specializing in information technology (IT) security—the NCCoE applies standards and best practices to develop modular, easily adaptable example cybersecurity solutions using commercially available technology. The NCCoE documents these example solutions in the NIST Special Publication 1800 series, which maps capabilities to the NIST Cybersecurity Framework and details the steps needed for another entity to recreate the example solution. The NCCoE was established in 2012 by NIST in partnership with the State of Maryland and Montgomery County, Maryland.
NIST CYBERSECURITY PRACTICE GUIDES
NIST Cybersecurity Practice Guides (Special Publication 1800 series) target specific cybersecurity challenges in the public and private sectors. They are practical, user-friendly guides that facilitate the adoption of standards-based approaches to cybersecurity. They show members of the information security community how to implement example solutions that help them align more easily with relevant standards and best practices, and provide users with the materials lists, configuration files, and other information they need to implement a similar approach.
The documents in this series describe example implementations of cybersecurity practices that businesses and other organizations may voluntarily adopt. These documents do not describe regulations or mandatory practices, nor do they carry statutory authority.
Transport Layer Security (TLS) [B5] server certificates [B3] are critical to the security of both internet-facing and private web services. A large- or medium-scale enterprise may have thousands or even tens of thousands of such certificates, each identifying a specific server in its environment. Despite the critical importance of these certificates, many organizations lack a formal TLS certificate management program and do not have the ability to centrally monitor and manage their certificates. Instead, certificate management tends to be spread across each of the different groups responsible for the various servers and systems in an organization. Central security teams struggle to make sure that certificates are being properly managed by each of these disparate groups. Without a central certificate management service, the organization is at risk, because deployed certificates can be monitored and maintained only if current inventories are kept. Organizations that do not properly manage their certificates face significant risks to their core operations, including:
- application outages caused by expired TLS [B5] server certificates
- hidden intrusion, exfiltration, disclosure of sensitive data, or other attacks resulting from encrypted threats or server impersonation
- application outages or attacks resulting from delayed replacement of certificates and private keys in response to either certificate authority compromise or discovery of vulnerabilities in cryptographic algorithms or libraries
Despite the mission-critical nature of TLS server certificates, many organizations have not defined the clear policies, processes, roles, and responsibilities needed for effective certificate management. Moreover, many organizations do not leverage available automation tools to support effective management of the ever-growing number of certificates. The consequence is continuing susceptibility to security incidents.
This NIST Cybersecurity Practice Guide shows large and medium enterprises how to employ a formal TLS certificate management program to address certificate-based risks and challenges. It describes the TLS certificate management challenges faced by organizations; provides recommended best practices for large-scale TLS server certificate management; describes an automated proof-of-concept implementation that demonstrates how to prevent, detect, and recover from certificate-related incidents; and provides a mapping of the demonstrated capabilities to the recommended best practices and to NIST security guidelines and frameworks.
This NIST Cybersecurity Practice Guide consists of the following volumes:
- Volume A: Executive Summary
- Volume B: Security Risks and Recommended Best Practices (you are here)
- Volume C: Approach, Architecture, and Security Characteristics
- Volume D: How-To Guides – instructions for building the example solution
Authentication; certificate; cryptography; identity; key; key management; PKI; private key; public key; public key infrastructure; server; signature; TLS; Transport Layer Security
We are grateful to the following individuals for their generous contributions of expertise and time.
Thales Trusted Cyber Technologies (Thales TCT)
The MITRE Corporation
The terms “shall” and “shall not” indicate requirements to be followed strictly in order to conform to the publication and from which no deviation is permitted.
The terms “should” and “should not” indicate that among several possibilities one is recommended as particularly suitable, without mentioning or excluding others, or that a certain course of action is preferred but not necessarily required, or that (in the negative form) a certain possibility or course of action is discouraged but not prohibited.
The terms “may” and “need not” indicate a course of action permissible within the limits of the publication.
The terms “can” and “cannot” indicate a possibility and capability, whether material, physical or causal.
CALL FOR PATENT CLAIMS
This public review includes a call for information on essential patent claims (claims whose use would be required for compliance with the guidance or requirements in this Information Technology Laboratory (ITL) draft publication). Such guidance and/or requirements may be directly stated in this ITL Publication or by reference to another publication. This call also includes disclosure, where known, of the existence of pending U.S. or foreign patent applications relating to this ITL draft publication and of any relevant unexpired U.S. or foreign patents.
ITL may require from the patent holder, or a party authorized to make assurances on its behalf, in written or electronic form, either:
a) assurance in the form of a general disclaimer to the effect that such party does not hold and does not currently intend holding any essential patent claim(s); or
b) assurance that a license to such essential patent claim(s) will be made available to applicants desiring to utilize the license for the purpose of complying with the guidance or requirements in this ITL draft publication either:
i) under reasonable terms and conditions that are demonstrably free of any unfair discrimination; or
ii) without compensation and under reasonable terms and conditions that are demonstrably free of any unfair discrimination.
Such assurance shall indicate that the patent holder (or third party authorized to make assurances on its behalf) will include in any documents transferring ownership of patents subject to the assurance, provisions sufficient to ensure that the commitments in the assurance are binding on the transferee, and that the transferee will similarly include appropriate provisions in the event of future transfers with the goal of binding each successor-in-interest.
The assurance shall also indicate that it is intended to be binding on successors-in-interest regardless of whether such provisions are included in the relevant transfer documents.
Such statements should be addressed to: firstname.lastname@example.org
Organizations risk losing revenue, customers, and reputation, and exposing internal or customer data to attackers if they do not properly manage Transport Layer Security (TLS) server certificates. TLS is the most widely used security protocol to secure web transactions and other communications on the internet and internal networks. TLS server certificates are central to the security and operation of internet-facing and internal web services. Improper TLS server certificate management results in significant outages to web applications and services—such as government services, online banking, flight operations, and mission-critical services within an organization—and increased risk of security breaches. Organizations should ensure that TLS server certificates are properly managed to avoid these issues.
The broad distribution of TLS server certificates across multiple groups and technologies within an enterprise requires that organizations establish formal management programs that include clear policies and responsibilities, a central Certificate Service, automation, and education. Successful implementation of a certificate management program relies on executive sponsorship, clear objectives, an action plan, and regular progress reviews.
The objective of this volume is to describe risks and challenges related to TLS server certificates and address those challenges by providing recommended best practices for large-scale TLS server certificate management. This document recommends that organizations establish a formal TLS certificate management program, and it enumerates elements that should be considered for inclusion in such a program. It is important to note that the best practices recommended in this guide are just that—recommendations.
The scope of this document is confined to recommendations regarding TLS server certificate management. TLS client certificate management is out of scope. This document is not intended to provide an extensive explanation of what TLS certificates and keys are or how they are used. Also, certificate management policies need to be considered within the context of an organization’s overall enterprise security policies.
It is also beyond the scope of this document to discuss the broader aspects of organizational policies and procedures [B1] with which TLS server certificate management should be consistent. For example, general recommendations regarding security policy, vulnerability management, incident response, disaster recovery, security testing, etc. that are not specifically related to certificate management are out of scope. Discussion of general security protections for certificate management system components is also beyond the scope of this document. This document assumes the security of these components is protected by recommended security best practices, e.g., patching, strong authentication, and access control that the organization has in place as part of its overall security policy.
An organization’s business operations may be internally or externally supported. For those organizations that have third parties supporting key business operations, those third parties may use TLS certificates. If a function is outsourced, the organization should ensure that its requirements are met by the third party performing the function. The TLS certificate management recommendations provided in this document can be applied to these third parties as well as to the organization itself.
In accordance with their security policies, some organizations may choose to perform inspection of internal traffic that has been encrypted using TLS, by intercepting and decrypting TLS traffic at the network edge or by performing passive decryption at locations deeper within the network. The question of whether to perform such inspection is complex, and it involves important tradeoffs between traffic security and traffic visibility that organizations should weigh carefully. It is beyond the scope of this document to advocate for or against TLS traffic inspection. Some organizations have determined that the security risks posed by inspection of internal TLS traffic are not worth the potential benefits of having visibility into the encrypted traffic. Other organizations, however, have determined that it is in their best interests to perform TLS traffic inspection. For those organizations that have a policy of performing TLS traffic inspection, this document provides recommended best practices regarding how to securely manage the TLS private keys required for this purpose.
The security and integrity of TLS rely on secure implementation and configuration of TLS servers and effective TLS server certificate management. Guidance regarding the implementation and configuration of TLS servers is outside the scope of this document. The secure implementation and configuration of TLS servers is addressed in NIST Special Publication (SP) 800-52 [B13]. Organizations should provide clear instruction to groups and individuals deploying TLS servers in their environments to read, understand, and follow the guidance provided in SP 800-52.
Lastly, the recommendations included in this document are generic. Each organization should determine for itself how to best apply these recommendations to its own enterprise. Volumes C and D of this Practice Guide describe a specific implementation used to demonstrate the application of these recommendations.
2 TLS Server Certificate Background
TLS [B5] is the security protocol used to authenticate and protect internet and internal network communications for a wide range of other protocols, including Hypertext Transfer Protocol (HTTP) [B17] for web servers; Lightweight Directory Access Protocol (LDAP) [B18] for directory servers; and Simple Mail Transfer Protocol [B7], Post Office Protocol [B10], and Internet Message Access Protocol [B4] for email.
TLS server certificates serve as machine identities that enable clients to authenticate servers via cryptographic means. For example, when a bank customer connects across the internet to an online banking website, the customer’s browser (i.e., the TLS client) will present an error message if the server does not provide a valid certificate that matches the address the user entered in the browser. Further, TLS server certificates are used extensively inside corporate and government networks to establish trust between machines — servers, applications, devices, micro-services, etc. Most large enterprises have thousands of certificates, each identifying a specific server in their environment. (Note: Web browsers play the role of clients to web servers. As such, they contain functionality to automatically establish TLS connections on behalf of users, evaluate certificates received during the TLS handshake process, and present errors when unexpected certificate issues are encountered.) Figure 2-1 illustrates the pervasive use of certificates within organizations.
Figure 2‑1 TLS Certificates Are Broadly Used for Communications in Organizations
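The client-side check described above (reject the connection unless the server presents a valid certificate that matches the requested address) can be sketched with Python's standard `ssl` module. This is a minimal illustration, not part of the guide: the function names `build_client_context` and `connect_and_get_certificate` are hypothetical, and the sketch assumes the operating system's default set of trusted root CAs is appropriate.

```python
import socket
import ssl


def build_client_context() -> ssl.SSLContext:
    """Return a TLS client context that verifies the server certificate."""
    # create_default_context() loads the system's trusted root CAs and
    # enables both certificate chain validation and host name checking.
    return ssl.create_default_context()


def connect_and_get_certificate(host: str, port: int = 443, timeout: float = 5.0) -> dict:
    """Connect to host:port and return the validated server certificate.

    Raises ssl.SSLCertVerificationError if the certificate is expired,
    not issued by a trusted CA, or does not match `host`.
    """
    context = build_client_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        # server_hostname is the address the certificate must match.
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()
```

An automated client that calls `connect_and_get_certificate("www.organization1.com")` would fail with an exception in the same situations in which a browser shows a certificate error.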
Each TLS server certificate contains the address of the server that it identifies (e.g., www.organization1.com) and a cryptographic key, called a public key, which is unique to the server and used by clients in securely authenticating the server (see Figure 2-2).
Figure 2‑2 Server Address, Public Key, and Issuer Information on Four of the Organization’s TLS Server Certificates
As shown in Figure 2-3, each server holds a private key that corresponds to the public key in the certificate so each server can prove it is the holder of the certificate. While the certificate is shared with any client that connects to the server, it is critical that the private key is kept secure and secret so it cannot be obtained by an attacker and used to impersonate the server. However, common operational practices may increase the risk of private key disclosure. Many private keys used with TLS are stored in plaintext files on TLS servers. Alternatively, private keys can be stored in files encrypted with a password; however, the passwords are generally stored in plaintext configuration files so they are accessible by the TLS server software when it is started. These common practices make it possible for private keys to be viewed and copied by system administrators or malicious actors.
Figure 2‑3 Upon Connecting to the Server, the Client Receives the Server’s TLS Certificate, Which Includes the Server’s Public Key
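The private key storage practices described above can be checked mechanically. The following is a minimal sketch, assuming a POSIX file system and PEM-encoded key files; the function name `key_file_findings` is hypothetical. It flags key files that accounts other than the owner can access and keys stored without password protection. It cannot tell whether a password for an encrypted key is itself sitting in a plaintext configuration file.

```python
import os
import stat
from pathlib import Path


def key_file_findings(path: Path) -> list[str]:
    """Return human-readable findings for one PEM private key file."""
    findings = []
    mode = stat.S_IMODE(os.stat(path).st_mode)
    # Any group or other permission bit means accounts besides the owner
    # can reach the key file.
    if mode & (stat.S_IRWXG | stat.S_IRWXO):
        findings.append(f"permissions {oct(mode)} allow access beyond the owner")
    text = path.read_text(errors="replace")
    # PEM markers for password-protected keys (PKCS #8 and legacy OpenSSL formats).
    encrypted = "BEGIN ENCRYPTED PRIVATE KEY" in text or "Proc-Type: 4,ENCRYPTED" in text
    if "PRIVATE KEY" in text and not encrypted:
        findings.append("private key is stored unencrypted")
    return findings
```

An empty result means only that these two checks passed; it is not evidence that the key has never been copied.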
In addition to users with browsers connecting to servers that have TLS server certificates, automated processes also connect as clients to TLS servers and must trust TLS server certificates. Examples of automated processes acting as TLS clients include a web server making requests to an application server, one cloud container connecting to another, or an Internet of Things (IoT) device connecting to a cloud service. (See Figure 2-4.)
Figure 2‑4 Browsers and Various Automated Processes (Web Servers, Containers, and IoT Devices) Connect as Clients to TLS Servers
2.2 Certificate Request and Installation Process
The following steps, shown in Figure 2‑8 and detailed below, are typically followed by a system administrator to get a TLS certificate for a server that he or she manages.
Figure 2‑8 Certificate Issuance Process
1. The system administrator for the TLS server uses utilities on the server to generate a cryptographic key pair (a public key and a private key).
2. The system administrator enters the address of the server (e.g., www.organization1.com). The utilities create a request for a certificate, called a certificate signing request (CSR), which contains the address of the server and the public key. The system administrator retrieves a copy of the CSR (which is contained in a file) from the server.
3. The system administrator submits the CSR to the registration authority (RA), who acts as a reviewer and approver of the certificate request.
4. The RA/approver reviews the CSR, performs necessary checks to confirm the validity of the request and the authority of the requester, and then sends an approval to the CA.
5. The CA issues the certificate.
6. The CA notifies the system administrator that the certificate is ready, either by emailing a copy of the certificate or providing a link from which it can be downloaded. The system administrator retrieves the server certificate.
7. The system administrator retrieves the CA certificate chain from the CA.
8. The system administrator installs the server certificate on the server.
9. The system administrator installs the CA certificate chain on the server.
   The CA certificate chain is used by TLS clients to validate the signature on the server certificate. When a client connects to a TLS server, the server returns its certificate and the CA certificate chain, which can contain one or more CA certificates. The client starts with one of its locally trusted root CA certificates and successively validates the signatures on certificates in the CA certificate chain until it reaches the server certificate.
10. The system administrator must note the expiration date in the certificate to ensure that a new certificate is requested and installed before the existing certificate expires.
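The final step above, tracking the expiration date so that a replacement is requested in time, lends itself to a simple automated check. The sketch below is illustrative only: the function names are hypothetical, and the 30-day lead time is an assumed value, not a recommendation from this guide. `ssl.cert_time_to_seconds` is the standard-library helper for the `notAfter` text format that Python's `ssl` module reports.

```python
import ssl
from datetime import datetime, timedelta, timezone


def parse_not_after(value: str) -> datetime:
    """Convert a notAfter string such as 'Jun 11 00:00:00 2020 GMT' to UTC."""
    return datetime.fromtimestamp(ssl.cert_time_to_seconds(value), tz=timezone.utc)


def days_until_expiry(not_after: datetime, now: datetime) -> int:
    """Days remaining before expiry, rounded down (negative once expired)."""
    return (not_after - now).days


def needs_renewal(
    not_after: datetime,
    now: datetime,
    lead_time: timedelta = timedelta(days=30),  # assumed renewal window
) -> bool:
    """True once the certificate is inside the renewal window or expired."""
    return now >= not_after - lead_time
```

Running such a check on a schedule, rather than relying on one administrator's note, addresses the case in which the person who recorded the date is no longer responsible for the server.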
3 TLS Server Certificate Risks
When TLS server certificates are not properly managed, organizations risk negative impacts to their revenue, customers, and reputation. There are four primary types of negative incidents that result from certificate mismanagement: outages to important business applications, caused by expired certificates; security breaches resulting from server impersonation; outages or security breaches resulting from a lack of crypto-agility; and increased vulnerability to attack via encrypted threats. (Note: While TLS server certificates enable confidentiality for legitimate communications, they can also allow attackers to hide their malicious activities within encrypted TLS connections. When a TLS server certificate is installed and enabled on a server, all users who connect (including attackers) can establish an encrypted connection to the server.)
3.1 Outages Caused by Expired Certificates
TLS server certificates contain an expiration date to ensure that the cryptographic keys are changed regularly; this reduces the impact of a security breach caused by a compromised private key. If a server certificate is not changed before its expiration date, then clients should generate an error message and stop the connection process to the server. This causes the application supported by the server with the expired certificate to become unavailable.
Application outages can also be caused by the mismanagement of CA certificate chains that results in expired intermediate CA certificates. The TLS server is responsible for providing the client with the intermediate CA certificates (CA certificate chain) necessary for the client to link the server’s end-entity certificate with the root CA certificate trusted by the client. The absence or expiration of an intermediate certificate means the client will not trust the server, even though the server may have a perfectly trustworthy end-entity certificate. Intermediate CA certificates are typically renewed every few years, and it is possible for a TLS server to fail to use the most current version. As a result, although the server certificate has been updated, the installed intermediate CA certificate may expire, resulting in an outage due to expiration. Such outages are often difficult to diagnose because the focus of investigation is typically on the server certificate, which is still valid and not the cause of the outage.
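Because an expired or missing intermediate CA certificate is easy to overlook, it can help to examine every certificate in the chain rather than only the server certificate. The following sketch works on a deliberately simplified model: the `CertInfo` record and the function `first_chain_problem` are hypothetical names, certificates are linked by comparing issuer and subject names, and signatures are not verified, which a real TLS client does.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class CertInfo:
    subject: str
    issuer: str
    not_after: datetime


def first_chain_problem(
    chain: list[CertInfo], trusted_roots: set[str], now: datetime
) -> Optional[str]:
    """Return a description of the first problem found, or None.

    `chain` is ordered as a server sends it: the server certificate first,
    then each intermediate CA certificate. Signatures are NOT checked.
    """
    if not chain:
        return "no certificates presented"
    for position, cert in enumerate(chain):
        role = "server certificate" if position == 0 else "intermediate CA certificate"
        if now >= cert.not_after:
            return f"{role} '{cert.subject}' expired on {cert.not_after:%Y-%m-%d}"
        # Each certificate must be followed by the certificate of its issuer.
        if position + 1 < len(chain) and cert.issuer != chain[position + 1].subject:
            return f"{role} '{cert.subject}' is not followed by its issuer '{cert.issuer}'"
    # The last certificate presented must chain to a locally trusted root.
    if chain[-1].issuer not in trusted_roots:
        return f"issuer '{chain[-1].issuer}' is not a trusted root"
    return None
```

Naming the certificate at fault in the result is the point of the sketch: it directs the investigation to the intermediate certificate when the server certificate itself is still valid.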
Nearly every enterprise has experienced an application outage due to an expired certificate, including outages to major applications such as online banking, stock trading, health records access, and flight operations. As organizations use more TLS server certificates to secure their applications, the likelihood of outages increases, because there are more certificates to track and more certificates per business application that can impact operations.
Various scenarios result in a certificate expiring while still in use, causing an outage, including these:
- The system administrator forgets about the certificate.
- The system administrator ignores notifications that the certificate will soon expire.
- The system administrator does not properly install or update the CA certificate chain.
- The system administrator is reassigned, and nobody else receives expiry notifications.
- The system administrator enrolls for a new certificate but does not install it on the server(s) in time or installs it incorrectly.
- The application relies on multiple load-balanced servers, and the certificate is not updated on all of them.
- The certificate is installed on a backup system, but the certificate has expired before the backup system is brought online.
Troubleshooting an incident where an application is unavailable due to an expired certificate can be complex and often requires hours to discover the source of the problem. If the server on which an expired certificate is deployed is being accessed by people using browsers, then each of those people will receive an error message, making it clear that the cause of the issue is an expired certificate. If, on the other hand, the clients connecting to the server with the expired certificate are automated systems (e.g., the clients are web servers and the server with the expired certificate is an application server) then the web servers acting as clients will stop operations when they encounter the expired certificate. They may log an error message, but that message may not be immediately discovered in the log file, increasing the amount of time required to identify the root cause of the outage and fix it. If certificates that are deployed on backup systems are not updated when they expire, an outage can occur if operations are shifted to the backup systems.
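Two of the scenarios above, a certificate updated on only some load-balanced servers and a certificate left to expire on a backup system, can be detected from an inventory of where each certificate is installed. The sketch below assumes such an inventory exists; the `Deployment` record and its fields are hypothetical. Treating differing serial numbers within one application as a finding is a heuristic for review, since some organizations intentionally issue a separate certificate per server.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Deployment:
    application: str
    host: str
    serial: str           # serial number of the installed certificate
    not_after: datetime
    active: bool = True   # False for standby or backup systems


def deployment_findings(deployments: list[Deployment], now: datetime) -> list[str]:
    """List expired installations and applications with mixed certificates."""
    findings = []
    serials_by_app = defaultdict(set)
    for d in deployments:
        serials_by_app[d.application].add(d.serial)
        if now >= d.not_after:
            kind = "active" if d.active else "backup"
            findings.append(f"{d.application}: expired certificate on {kind} host {d.host}")
    for app, serials in sorted(serials_by_app.items()):
        if len(serials) > 1:
            findings.append(f"{app}: {len(serials)} different certificates deployed across hosts")
    return findings
```

Including backup hosts in the same inventory lets an expired certificate surface before operations are shifted to them.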
3.2 Server Impersonation
An attacker may be able to impersonate a legitimate TLS server (e.g., a banking website) if the attacker is able to get a fraudulent certificate containing the address of the server and the attacker’s own public key by tricking a trusted CA into issuing the certificate to the attacker or by compromising the CA and issuing the certificate. A client connecting to the attacker’s server will accept the certificate because the certificate contains the address to which the client intended to connect and because the certificate has been issued by a trusted CA. Because the certificate contains the attacker’s public key (and the attacker also holds the private key corresponding to this public key), the attacker can decrypt the communications from the client (including passwords intended for login to the legitimate server). Alternatively, if the attacker can access a copy of the legitimate server’s private key, then the attacker can eavesdrop or impersonate that server by using the legitimate server’s certificate. To successfully perform these attacks, the attacker must redirect traffic destined for the legitimate server to a system that the attacker is operating (e.g., using Border Gateway Protocol [BGP] hijacking or DNS compromise). (Note: BGP [B16] is used to communicate optimal routes between internet service providers on the internet. It is possible for an attacker to hijack traffic by falsely advertising that the fastest route to one or more internet protocol [IP] addresses is via systems that the attacker is operating, thereby causing traffic to be rerouted through the attacker’s systems. The DNS provides translation between human‑readable addresses [e.g., www.company123.com] and IP addresses. If an attacker can compromise an organization’s DNS account, then the attacker can change the IP address to which traffic intended for that organization will be sent.)
Most private keys used on TLS servers are stored in files. The private keys are directly managed and handled by system administrators, who can make copies of the private keys. In addition, many TLS servers are clustered (for load balancing); in many cases, the same TLS server certificate and the private key will be copied to each server in the cluster. The manual handling and copying of private keys significantly increase the possibility of a key compromise and the confidentiality and data integrity consequences of key compromise (including but not limited to server impersonation).
3.3 Lack of Crypto-Agility
There are several types of incidents that have required organizations to replace [B2] large numbers of TLS certificates and private keys, including the following:
- CA compromise: If a CA is breached by an attacker, then the attacker can cause that CA to issue fraudulent certificates. After the CA breach is discovered and forensics are performed, it may be concluded that certificates issued by the CA cannot be trusted and that new certificates must be installed on all servers with certificates from the compromised CA.
- Vulnerable algorithm: Cryptographic algorithms are constantly evaluated for vulnerabilities, by parties with both positive and negative intent. When an algorithm is found to be vulnerable (e.g., Secure Hash Algorithm 1 (SHA-1) [B6] for signature generation), TLS server certificates that are dependent on the algorithm must be replaced. Ongoing advancements in quantum computing require that organizations establish the ability to rapidly replace all existing certificates and keys and be prepared for implementation of post-quantum algorithms.
- Cryptographic library bug: Because cryptographic operations are quite complex, a few groups have specialized in developing cryptographic libraries that are used by TLS servers and other systems. If a bug is found with the key-generation functions of a cryptographic library, then all keys generated since the bug was introduced must be replaced. (Note: In 2008, a key-generation bug in the cryptographic libraries in Debian Linux was discovered. That bug was introduced in 2006. In 2017, a key-generation bug was discovered in the Infineon cryptographic libraries used in smart cards and trusted platform module chips.)
Most enterprises are not prepared to respond to the large-scale cryptographic failure that results from these types of incidents. Many organizations do not have comprehensive inventories of their TLS server certificates. In addition, they cannot contact the certificate owners, because they do not have up-to-date information about the certificate owners responsible for each certificate. Finally, many organizations rely on manual processes to manage certificates and do not have processes for tracking the progress in replacing large numbers of certificates — leaving the organizations to guess how many systems have been updated. All these factors can result in organizations requiring several weeks or months to replace all affected certificates, during which time business applications can be unavailable or vulnerable to security breaches.
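The gaps described above (no comprehensive inventory, no current owner information, and no way to track replacement progress) suggest what a minimal inventory needs to record. The following sketch is an assumption-laden illustration: the `InventoryEntry` fields, the function names, and the algorithm strings are hypothetical and not taken from this guide. It selects the certificates affected by an incident and reports how far replacement has progressed.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class InventoryEntry:
    common_name: str
    issuer: str
    signature_algorithm: str
    owner: str              # contact for the certificate ("" if unknown)
    replaced: bool = False


def affected(
    inventory: list[InventoryEntry],
    *,
    issuer: Optional[str] = None,
    signature_algorithm: Optional[str] = None,
) -> list[InventoryEntry]:
    """Select entries issued by a compromised CA and/or using a weak algorithm."""
    return [
        e for e in inventory
        if (issuer is not None and e.issuer == issuer)
        or (signature_algorithm is not None and e.signature_algorithm == signature_algorithm)
    ]


def replacement_progress(entries: list[InventoryEntry]) -> tuple[int, int, list[str], list[str]]:
    """Return (replaced, total, owners still to act, names lacking an owner)."""
    pending = [e for e in entries if not e.replaced]
    owners = sorted({e.owner for e in pending if e.owner})
    unowned = [e.common_name for e in pending if not e.owner]
    return len(entries) - len(pending), len(entries), owners, unowned
```

The fourth element of the result lists certificates with no recorded owner, which is the situation the paragraph above identifies as a cause of slow response.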
3.4 Encrypted Threats
Many organizations are working to encrypt all communications by using TLS server certificates to prevent interception of plaintext credentials and eavesdropping on communications. While TLS server certificates enable confidentiality for legitimate communications, they can also allow attackers to hide their malicious activities within encrypted TLS connections. When a TLS server certificate is installed and enabled on a server, all users who connect (including attackers) can establish an encrypted connection to the server. An attacker who establishes an encrypted connection can then begin to probe the server for vulnerabilities within that encrypted connection.
The following steps, shown in Figure 3-1 and detailed below, describe how an attacker can leverage encrypted connections in his or her attacks.
Figure 3‑1 How an Attacker Leverages Encrypted Connections to Hide Attacks
1. The attacker begins by connecting to a server and establishing an encrypted TLS session. Within that encrypted session, the attacker can probe for vulnerabilities that exist on the server and its software.
2. If the attacker discovers a vulnerability and sufficiently elevates his or her privileges, then the attacker can load malware, generally called a “web shell,” onto the server.
3. With this web shell loaded, the attacker can send commands over TLS connections (i.e., encrypted connections facilitated by the server’s certificate). The attacker can then work to pivot to other systems by probing for vulnerabilities in servers accessible from the compromised system. The increased use of encryption enables an attacker who has compromised one system to pivot and attack other systems via encrypted connections, without being detected.
4. Once the attacker has successfully reached data that he or she desires, the attacker is able to use the web shell to exfiltrate data. Because the attacker is establishing TLS connections by using the server’s certificate to connect to the web shell, all the exfiltrated data is encrypted while in transit.
As stated in Section 1.2, in accordance with their security policies, some organizations may choose to perform inspection of internal traffic that has been encrypted using TLS. The question of whether to perform such inspection is complex, and it involves important tradeoffs between traffic security and traffic visibility that each organization should weigh for itself.
Some organizations are concerned about the risk posed by attackers who leverage encrypted connections to hide their attacks, as illustrated in Figure 3-1 above. If these attackers gain access to trusted internal systems via malware or some other exploit, they may be able to move about the network without being detected by hiding their traffic within TLS connections. Organizations that are concerned about these risks want the option of decrypting internal TLS traffic so it can be inspected. Such inspection may be used not only for intrusion and malware detection, but also for troubleshooting, fraud detection, forensics, and performance monitoring. These organizations have concluded that the visibility into their internal traffic that can be provided by TLS inspection is worth the tradeoff of the weaker encryption and other risks that come with such inspection. For these organizations, TLS inspection may be considered standard practice and may represent a critical component of their threat detection and service assurance strategies. Some of these organizations have complex networks that are several tiers deep, so it would not be realistic to expect them to be able to manage the movement of keys required to perform such inspection securely using purely manual processes. For those organizations that have a policy to perform inspection of TLS traffic, this document provides recommendations regarding how to securely move the TLS private keys needed for this inspection.
On the other hand, inspection creates a single location where traffic may be decrypted, creating an attractive target for attackers. It also may have compliance implications if sensitive data is being decrypted. An organization that performs decryption on border devices or that performs passive internal decryption runs the risk of such devices being taken over by an attacker who would then have access to private keys and traffic. In addition, passive decryption requires the use of static key exchange, which results in weaker encryption than can be achieved when using ephemeral key exchange methods. If an attacker captures a server’s private key and the session keys were established using static key exchange, the attacker will also be able to decrypt traffic that had been captured in the past. If, instead, the session keys were established using an ephemeral key exchange method, the connection has forward secrecy, meaning that capturing the server’s private key will not allow the attacker to decrypt past traffic. For some organizations, the reduced security of performing inspection or using static keys is unacceptable. These organizations have determined that the security risks posed by inspection of internal TLS traffic are not worth the potential benefits of having visibility into the encrypted traffic. These organizations should have a policy against performing TLS inspection. As an alternative to inspection, they may choose to perform traffic analysis to try to detect illegitimate internal TLS traffic. None of the discussion or recommendations in this document are intended to mandate or encourage an organization to begin performing TLS inspection of its traffic if that organization has determined that the risks of TLS inspection are not worth the benefits.
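The difference between static and ephemeral key exchange can often be read from the name of the negotiated cipher suite. The following is a minimal sketch, assuming IANA-style suite names; the function name `provides_forward_secrecy` is hypothetical, and the TLS 1.3 branch assumes a certificate-authenticated handshake rather than a PSK-only one.

```python
def provides_forward_secrecy(cipher_suite: str) -> bool:
    """Guess from an IANA cipher suite name whether the key exchange is ephemeral."""
    name = cipher_suite.upper()
    if "_WITH_" not in name:
        # TLS 1.3 names (e.g., TLS_AES_128_GCM_SHA256) carry no key exchange;
        # TLS 1.3 removed static RSA and static DH. Assumption: not PSK-only.
        return name.startswith("TLS_")
    key_exchange_tokens = name.split("_WITH_")[0].split("_")
    # DHE and ECDHE are ephemeral; RSA, DH, and ECDH key exchanges are static.
    return "DHE" in key_exchange_tokens or "ECDHE" in key_exchange_tokens


print(provides_forward_secrecy("TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256"))
print(provides_forward_secrecy("TLS_RSA_WITH_AES_128_GCM_SHA256"))
```

A suite such as TLS_RSA_WITH_AES_128_GCM_SHA256 is the kind that passive decryption depends on, and this check reports it as lacking forward secrecy.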
An organization that has a policy to perform inspection of TLS traffic so it can monitor and detect malicious activity has several methods it can use to gain visibility into encrypted communications. Some examples are listed below and are illustrated in Figure 3-2:
placing a threat detection system that acts as a reverse proxy in front of servers
installing end point software on each server to monitor communications
passively decrypting communications
Figure 3‑2 Methods for Gaining Visibility into Encrypted Communications
The use of threat detection proxies is ideal at the perimeters of organizations for monitoring inbound internet communications for attacks. The threat detection proxy is connected in-line, requiring all inbound traffic to pass through it before moving on to the next device. The threat detection proxy terminates the TLS connection. It decrypts and examines incoming traffic. If the traffic is determined to be malicious, the proxy drops it. Because the threat detection proxy is terminating all TLS connections, it must have a certificate for each server to which clients are attempting to connect. After the threat detection proxy decrypts and examines the traffic, it can establish a TLS session with the appropriate server behind it and send the traffic to that server in an encrypted TLS session.
While a threat detection proxy is ideal for use at the perimeter of an organization, many organizations also want to inspect their internal TLS traffic. Many enterprise applications include multiple tiers of servers and services (e.g., load balancers, web servers, application servers, databases, identity services) that communicate with each other internally via encrypted TLS sessions, making it impractical to place threat detection proxies between all systems on internal networks.
End point software can be installed on each server to monitor communications, alleviating the need to install proxies, but may impose additional processing requirements on servers that are already under a high load. In addition, because of the diversity of TLS server systems, it may be difficult to find an end point solution that operates on all platforms and provides comprehensive and consistent visibility and monitoring of all communications.
Passive, out-of-band decryption and threat analysis are performed by using devices that decrypt TLS‑encrypted communications but that do not terminate TLS connections. The TLS connection is established between the client and the server. The passive decryption device listens to the TLS traffic without affecting it and decrypts it. Threat analysis is performed either by the passive decryption device or via other systems to which decrypted traffic is forwarded. Security-focused passive decryption devices can detect malicious traffic that has been sent on TLS connections, but these devices do not react in real time to block this traffic. Passive decryption does not require a change in network architecture or loading additional software on TLS servers. However, passive decryption poses a TLS server certificate management challenge, because private keys must be copied to decryption devices from each TLS server whose communications will be monitored. The transfer of private keys must be done securely to avoid a key compromise and rapidly to avoid blind spots in monitoring for attacks. Automation can significantly aid in securely transferring private keys from TLS servers to the decryption device and keeping keys up-to-date when certificates are replaced.
4 Organizational Challenges
Despite the mission-critical nature of TLS server certificates, many organizations do not have clear policies, processes, and roles and responsibilities defined to ensure effective certificate management. Moreover, many organizations do not leverage available technology and automation to effectively manage the large and growing number of TLS server certificates. As a result, many organizations continue to experience significant incidents related to TLS server certificates.
As illustrated by Figure 4-1, the management of TLS server certificates is challenging due to the broad distribution of certificates across enterprise environments and groups, the complex processes needed to manage certificates, the multiple roles involved in certificate management and issuance, and the speed at which new TLS servers are being deployed. TLS server certificates are typically issued by a Certificate Services team (often called the public key infrastructure team). However, the certificates are commonly installed and managed by the certificate owners — the groups and the system administrators responsible for individual web servers, application servers, network appliances, and other devices for which certificates are used.
Figure 4‑1 TLS Certificates Are Distributed Broadly Across Enterprise Environments and Groups
4.1 Certificate Owners
The term “certificate owner” is used to denote a group responsible for systems where certificates are deployed. Typically, there are several roles within a certificate owner group, including executives who have ultimate accountability for ensuring that certificate-related responsibilities are addressed, system administrators who are responsible for managing individual systems and the certificates on them, and application owners who can review and approve certificate requests from system administrators to ensure that only authorized certificates are issued. The certificate owners typically are not knowledgeable about the risks associated with certificates or the best practices for effectively managing certificates.
With the advent of virtualization, the development and operations (DevOps) teams provision systems and software through programmatic means. This introduces a new type of certificate owner and new TLS server certificate challenges for organizations. As organizations push for more rapid and efficient deployment of business applications, many DevOps teams deploy certificates without coordination with the Certificate Services team. This can result in certificates for mission-critical applications not being tracked. This can be particularly problematic if bugs in DevOps programs/scripts cause certificates to be improperly deployed or updated. In addition, as DevOps teams adopt newer frameworks and tools, it is important to continue to monitor certificates and applications deployed and maintained by older DevOps frameworks and tools.
4.2 Certificate Services Team
The Certificate Services team is typically the group that has been given responsibility for managing relationships with public CAs and for the internal CAs. The Certificate Services team typically comprises one to three people. Though the team members have good knowledge and expertise about TLS server certificates, they do not have the resources or access required to directly manage certificates on the extensive number of systems where certificates are deployed. However, the Certificate Services team is often blamed when TLS certificate incidents, such as outages, occur.
5 Recommended Best Practices
To effectively address the risks and organizational challenges related to TLS server certificates and to ensure that they are a security asset instead of a liability, organizations should establish a formal TLS certificate management program with executive leadership, guidance, and support. The formal TLS certificate management program should include clearly defined policies, processes, and roles and responsibilities for the certificate owners and the Certificate Services team, as well as a central Certificate Service. The program should be driven by the Certificate Services team but should include active participation by the certificate owners — whether the certificate owners are responsible for traditional servers, appliances, virtual machines, cloud-based applications, DevOps, or other systems acting as TLS servers.
5.1 Establishing TLS Server Certificate Policies
As previously mentioned, most certificate owners are typically not knowledgeable about the best practices for effectively managing TLS server certificates. Because certificate owners are responsible for the systems where certificates are deployed, it is imperative that they be provided with clear requirements and that those requirements be enforced as policies. This section provides recommended TLS server certificate policies. It also includes recommended responsibilities for the certificate owners and the Certificate Services team to successfully meet those requirements and policies.
These recommendations are intended to serve as guidance for organizations that do not already have their own TLS server certificate management policies and responsibilities defined, or that are looking to improve existing policies and procedures. They are not intended to override any organization’s existing policies. Organizations should feel free to copy, delete, augment, or modify these recommended policies and responsibilities as needed to suit their own requirements. Appendix B contains a table that maps the recommended best practices for TLS server certificate management proposed in this document to the NIST Framework for Improving Critical Infrastructure Cybersecurity (Cybersecurity Framework). [B11] Appendix C contains a table that explains how specific controls defined within NIST SP 800-53 [B12] should be applied to these TLS server certificate management recommended best practices.
The recommended requirements in the remaining subsections use the word “should” throughout. Based on their own security policies, organizations may choose to make these recommendations mandatory, e.g., by changing “should” to “must.”
To address TLS server certificate risks, organizations should establish and maintain clear visibility across all TLS server certificates in their environment so they can perform the following actions:
detect potential vulnerabilities (e.g., the use of weak algorithms, such as SHA-1)
identify certificates that are nearing expiration and replace them
respond to large-scale cryptographic incidents, such as a CA compromise, vulnerable algorithms, and cryptographic library bugs
ensure compliance with regulatory guidelines and established organizational policy
This visibility is achieved by maintaining an inventory of all TLS server certificates. A single central inventory is recommended, as it minimizes the possibility of overlooking critical TLS server certificates.
An up-to-date inventory of all deployed certificates (end-entity certificates and CA certificate chain certificates) should be maintained, including certificates on backup systems that may not necessarily be online. For each certificate, the inventory should include the following components:
Subject Distinguished Name (DN)
Subject Alternative Names (SANs)
issue date (i.e., notBefore date)
expiration date (i.e., notAfter date)
issuing Certificate Authority (CA)
key algorithm (e.g., Rivest, Shamir, & Adleman [RSA]; Elliptic Curve Digital Signature Algorithm [ECDSA])
validity period (i.e., from the notBefore date/time to the notAfter date/time)
installed location(s) of certificate (e.g., IP or DNS address and file path)
certificate owner (i.e., the group responsible for the certificate)
group responsible for the DevOps technology used to deploy the certificate (if the certificate was deployed via DevOps technology)
contacts (i.e., the group of individuals that should be notified of issues)
approver(s) (i.e., the parties responsible for reviewing issuance and renewal requests)
type of system (e.g., web, email, directory server, appliance, virtual machine, container)
business application (i.e., the application using the certificate)
applicable regulations (e.g., Payment Card Industry Data Security Standard [PCI-DSS], Health Insurance Portability and Accountability Act [HIPAA])
extended key-usage flags
Certificate Services team: provide a central system for certificate owners to establish and maintain their inventories
Certificate owners: establish and maintain an inventory of all certificates and keys on their systems
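One way to picture an inventory entry is as a typed record whose fields mirror the components listed above. The following is a minimal sketch, not a prescribed schema; the class name, field names, and sample values are all hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone
from typing import List, Optional


@dataclass
class CertificateRecord:
    """One inventory entry; field names are illustrative, not prescribed."""
    subject_dn: str
    sans: List[str]
    not_before: datetime
    not_after: datetime
    issuing_ca: str
    key_algorithm: str               # e.g., "RSA" or "ECDSA"
    installed_locations: List[str]   # address and file path of each installation
    owner_group: str                 # a functional group, not an individual
    contacts: List[str]
    approvers: List[str]
    system_type: str
    business_application: str
    devops_group: Optional[str] = None
    applicable_regulations: List[str] = field(default_factory=list)
    extended_key_usages: List[str] = field(default_factory=list)

    @property
    def validity_period(self) -> timedelta:
        # Derived from notBefore/notAfter rather than stored separately.
        return self.not_after - self.not_before


record = CertificateRecord(
    subject_dn="CN=www.company123.com",
    sans=["www.company123.com"],
    not_before=datetime(2023, 1, 1, tzinfo=timezone.utc),
    not_after=datetime(2024, 1, 1, tzinfo=timezone.utc),
    issuing_ca="Example Internal CA",
    key_algorithm="RSA",
    installed_locations=["www.company123.com:/etc/ssl/www.pem"],
    owner_group="web-platform-admins",
    contacts=["web-platform-admins"],
    approvers=["web-app-owners"],
    system_type="web",
    business_application="storefront",
)
print(record.validity_period.days)
```

Deriving the validity period from the two dates keeps the record from holding values that can drift out of agreement.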
To rapidly respond to issues with TLS server certificates, it is necessary to know who is responsible for each certificate. This information should be kept up-to-date as people are reassigned or terminated. Because reassignments can happen frequently, and because there may be a lag in updating ownership information, it is recommended that ownership be assigned to functional groups (e.g., an Active Directory [AD] group) that contain multiple individuals, instead of assigning ownership to individuals. In cases where DevOps technologies are used to deploy TLS server certificates, the group responsible for the DevOps deployment technology should be tracked, in addition to the certificate owner, so they can both be contacted when incidents arise.
Contact information for certificate owners should be assigned to functional groups (e.g., AD groups), and the content of a group should be updated within <30> business days of a role reassignment or termination of an individual member of that group. (Note: Here and elsewhere in this practice guide, when specific time frames, such as “<30> business days” are recommended, these values are often placed within brackets (“<>”) to indicate they are provided only as suggestions. Each organization should determine the time frames to be instituted within its own enterprise, based on its needs. If it is possible for organizations to require compliance within shorter time frames, then that would be preferable.)
If the certificate was deployed via DevOps technology, contact information should be provided for the group that is responsible for this technology, and the content of this group should be updated within <30> business days of a role reassignment or termination of an individual member of that group.
Certificate Services team: provide a system to track ownership as part of the inventory
Certificate owners: keep ownership information up-to-date (i.e., keep the membership of the certificate owner group current)
DevOps team: where DevOps technology is used to deploy the certificate, keep the membership of the DevOps deployment technology group up-to-date
5.1.3 Approved CAs
CAs are trusted issuers of certificates. If organizations do not control the CAs that are used to issue certificates in their environments, then they will face several potential risks:
Increased costs: If multiple groups are individually purchasing certificates from CAs, then the cost per certificate can be significantly higher because organizations are not taking advantage of volume discounts
Trust issues: Each CA used to issue TLS certificates to servers in an organization must be trusted by the clients connecting to those servers via a root certificate. If a large number of CAs (internal and external) is used, then the organization is required to take on the extra burden of maintaining multiple trusted CA certificates on clients to avoid cases in which the necessary CA is not trusted, which can result in outages
Security risk: A certificate owner may decide to set up his or her own CA on a system that does not have the necessary security controls and to configure the system to trust that CA. This increases the possibility of an attacker impersonating a server if the attacker compromises that CA and issues fraudulent certificates
Unexpected CA incidents: If one of the untracked CAs used in the organization’s environment encounters an issue, such as a CA compromise or suddenly being untrusted by browser vendors, then the organization may have to scramble to avoid security or operational issues for core applications
To ensure they can rapidly respond to a CA compromise or another incident when using public CAs, organizations should maintain contractual relationships with more than one public CA. By doing this, organizations will not have to scramble to negotiate a contract (which may take days or weeks) while attempting to respond to an urgent situation. Organizations that rely on internal CAs should also maintain at least one backup internal CA so they can efficiently respond to an internal CA compromise or incident.
Certificates should be issued only by the following CAs:
Contractual relationships with at least two public CAs that conform to the CA/Browser Forum Baseline Requirements should be maintained at all times
Internal CAs (if any) should be securely operated. Backup internal CAs should be maintained to support a rapid response to incidents, such as CA compromise
Certificate Services team: manage business relationships with approved external CAs, and operate or outsource the operation of approved internal CAs
Certificate owners: ensure that only certificates from approved CAs are used
5.1.4 Validity Periods
The validity period for a certificate defines the time that it is valid, from the first date/time (notBefore) to the last date/time (notAfter) that it can be used. It is important to note that the validity period of a certificate is different from the cryptoperiod of the public key contained in the certificate and the corresponding private key. It is possible to renew a certificate with the same public and private keys (i.e., not rekeying during the renewal process). However, this is only recommended when the private key is contained within a hardware security module (HSM) validated to Federal Information Processing Standards (FIPS) Publication 140-2 Level 2 or above.
One of the greatest risks of private-key compromise is from administrators who have direct access to plaintext private keys (including the ability to make a copy) and who are then reassigned or terminated. Although certificates would ideally be changed (rekeyed) each time an administrator with access to private keys is reassigned, this is often not practical. Therefore, ensuring certificates and their corresponding private keys are changed regularly is important, as shorter validity periods reduce the amount of time that a compromised private key can be used for malicious purposes. However, validity periods that are too short may increase the risk of outages. Organizations should determine the ideal validity period that balances security and operational risks for their organization. In general, due to the regular reassignment of administrative staff, it is recommended that validity periods be one year or less. The automated management of certificates can enable a more frequent renewal of certificates.
The maximum validity period (i.e., from the notBefore date to the notAfter date) for certificates should be <one year or less>
Certificate Services team: ensure CAs are available to certificate owners to issue certificates with approved validity periods
Certificate owners: ensure certificates are renewed and replaced before their expiration
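A validity-period policy of this kind can be checked mechanically. The following is a minimal sketch, assuming that "<one year or less>" is read as 365 days; the constant and function names are hypothetical, and each organization would substitute its own limit.

```python
from datetime import datetime, timedelta, timezone

MAX_VALIDITY = timedelta(days=365)  # assumed reading of "<one year or less>"


def exceeds_max_validity(not_before: datetime, not_after: datetime,
                         max_validity: timedelta = MAX_VALIDITY) -> bool:
    """Return True if the notBefore-to-notAfter span is longer than policy allows."""
    return (not_after - not_before) > max_validity


issued = datetime(2023, 3, 1, tzinfo=timezone.utc)
print(exceeds_max_validity(issued, issued + timedelta(days=90)))
print(exceeds_max_validity(issued, issued + timedelta(days=730)))
```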
5.1.5 Key Length
Each certificate contains a public key that is mathematically matched to a private key (which should be kept secret). To prevent an attacker from guessing the value of the private key, it is necessary to randomly pick the value of the private key from a large set of possible values. For example, it is more difficult for someone to guess a number selected between zero and 1,000,000 than a number selected between zero and 100. The key length effectively defines the size of the range of numbers from which private and public key values are selected. For a given algorithm, a longer key length is more secure against guessing attacks. However, longer key lengths require more processing power and time, as well as more storage. Consequently, a balance must be struck between security risk and resource requirements. NIST monitors the industry to continually assess the potential cryptanalytic capabilities of possible attackers and their ability to guess the values of private keys. Based on this information, it sets recommended minimum key lengths. It is recommended that organizations require the use of keys with key lengths equal to or greater than the NIST recommendations.
All certificates should use key lengths that comply with NIST SP 800-131A, which are currently equal to or greater than the following key lengths:
Certificate Services team: provide dashboards, reports, and alerts that enable the rapid detection of unauthorized key lengths, and provide automation technologies that enable rapid remediation
Certificate owners: use only TLS certificate public and private keys whose key lengths meet or exceed the organization’s key-length policy, monitor their inventory, and replace certificates that do not comply with the policy
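The kind of report that detects unauthorized key lengths can be sketched as a filter over inventory entries. This is an illustration only: the minimum values in `MIN_KEY_BITS` are assumptions that should be confirmed against the current NIST SP 800-131A, and the function name and sample inventory are hypothetical.

```python
# Illustrative minimums in bits; assumed values, confirm against NIST SP 800-131A.
MIN_KEY_BITS = {"RSA": 2048, "ECDSA": 224}


def key_length_compliant(algorithm: str, key_bits: int) -> bool:
    """Return True if key_bits meets the configured minimum for the algorithm."""
    minimum = MIN_KEY_BITS.get(algorithm.upper())
    if minimum is None:
        # Unknown algorithms are treated as noncompliant so they get reviewed.
        return False
    return key_bits >= minimum


# Hypothetical inventory rows: (certificate name, key algorithm, key length in bits).
inventory = [("www", "RSA", 2048), ("legacy", "RSA", 1024), ("api", "ECDSA", 256)]
noncompliant = [name for name, alg, bits in inventory
                if not key_length_compliant(alg, bits)]
print(noncompliant)
```

Treating unrecognized algorithms as noncompliant is a deliberate choice here, so that anything outside the policy table is surfaced for review rather than silently accepted.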
5.1.6 Signing Algorithms
Certificates are digitally signed by CAs so their authenticity can be verified. Signatures are generated by using digital signature algorithms (e.g., RSA, ECDSA) [B14] and hash algorithms (e.g., Secure Hash Algorithm 256 [SHA-256]). If certificates are signed by using a signing algorithm with an insufficient key length or by using vulnerable hash algorithms (e.g., SHA-1), then attackers can forge certificates and impersonate TLS servers. Consequently, organizations should ensure that all certificates are signed by using cryptographic algorithms that conform to approved standards.
All certificates should be signed with an approved signature algorithm and key length and with an approved hash algorithm (e.g., SHA-256), as defined in NIST SP 800-131A and FIPS Publication 180-4
Certificate Services team: ensure the availability of CAs that use approved signing algorithms, and provide reporting and alerting tools to enable the rapid identification of noncompliant certificates
Certificate Owners: use only certificates signed with an approved signature algorithm and key length and with an approved hash algorithm, and identify and replace certificates signed with unapproved algorithms or key lengths
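Identifying certificates signed with unapproved hash algorithms can be sketched as a check on the signature algorithm name recorded in the inventory. The following assumes names in the style of "sha256WithRSAEncryption" or "ecdsa-with-SHA384"; the approved list is an illustrative subset of the SHA-2 family, and the function name is hypothetical.

```python
APPROVED_HASHES = ("sha256", "sha384", "sha512")  # illustrative SHA-2 subset
WEAK_HASHES = ("md5", "sha1")


def signature_hash_approved(signature_algorithm: str) -> bool:
    """Return True if the algorithm name contains an approved hash and no weak one."""
    name = signature_algorithm.lower().replace("-", "").replace("_", "")
    if any(weak in name for weak in WEAK_HASHES):
        return False
    # Names that do not reveal their hash fall through to False for manual review.
    return any(ok in name for ok in APPROVED_HASHES)


for alg in ("sha256WithRSAEncryption", "sha1WithRSAEncryption", "ecdsa-with-SHA384"):
    print(alg, signature_hash_approved(alg))
```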
5.1.7 Subject DN and SAN Contents
The combination of the Subject DN and SAN is used to identify the TLS server to which the certificate is issued. The Subject DN is in the form of an X.500 DN, which can include information such as the country, state, city/locality, organization, organizational unit (e.g., department), and a common name (CN). The CN, when present, and the SAN field contain the fully-qualified domain name or IP address of the TLS server. For publicly trusted certificates, the contents of the Subject DN are governed by the public CA that issues them. The CA/Browser Forum requires the SAN field to be present; the CN is now deprecated, and the other fields in the DN are optional, though in practice they are still present. For internal certificates, the contents of the Subject DN fields, such as the organizational unit, can help identify the group responsible for certificates.
Public CAs will often perform checks to validate that an organization controls a registered domain (e.g., company123.com), and will then allow the organization to request certificates with Subject DNs and SANs containing names subordinate to that domain (e.g., www.company123.com, www.server1.company123.com). Because the CA may not separately vet each subordinate name, it is critical that organizations implement approval processes that ensure the Subject DNs and SANs in all certificate requests are thoroughly reviewed and vetted before they are sent to the CA.
Names used in Subject DNs should conform to the following requirements:
The Organization (O) attribute in the Subject DN should be one of the following values:
<e.g., Company, Inc.>
The Organizational Unit attribute in the Subject DN should conform to the following categorization: <specify whether department, location, or another categorization should be used>
The Locality (City), State (Province), and Country codes should be set to the following location: <City, State, Country of organization identified in O = headquarters offices>
The CNs and SANs should not include wildcards (e.g., *.company123.com).
The fully-qualified domain names or IP addresses in all Subject DNs and SANs should be reviewed and approved by an individual who is knowledgeable about the application or system for which the certificate is being requested and who can confirm that the requester is authorized to make the request.
Certificate Services team: provide technology solutions to automatically detect and prevent Subject DN and SAN policy violations
Certificate owners: ensure the Subject DNs and SANs in all certificates comply with policy
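Automatic detection of Subject DN and SAN policy violations can be sketched as a pre-submission check on a certificate request. The following covers only two of the requirements above (the Organization value and the wildcard prohibition); the allowed Organization value is the placeholder from the policy template, and the function name is hypothetical.

```python
ALLOWED_ORGANIZATIONS = {"Company, Inc."}  # placeholder from the policy template


def subject_policy_violations(organization, common_name, sans):
    """Return a list of human-readable policy violations for a certificate request."""
    violations = []
    if organization not in ALLOWED_ORGANIZATIONS:
        violations.append(f"unapproved O value: {organization}")
    # Check the CN (if present) and every SAN once, preserving request order.
    names = ([common_name] if common_name else []) + list(sans)
    for name in dict.fromkeys(names):
        if "*" in name:
            violations.append(f"wildcard name: {name}")
    return violations


violations = subject_policy_violations(
    "Company, Inc.",
    "www.company123.com",
    ["www.company123.com", "*.company123.com"],
)
print(violations)
```

A check like this does not replace the human review of requested names; it only catches violations that can be recognized from the request alone.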
The broadening use of and reliance on TLS server certificates to secure important applications is rendering manual certificate management impractical. Risks such as certificate-related outages are often the result of errors made while manually managing certificates. Organizations are unable to manually replace large numbers of certificates in response to large-scale cryptographic incidents, such as CA compromises, in a timely manner. Consequently, organizations should work to automate certificate management on as many systems and applications as possible to decrease security and operational risks. Historically, many organizations have found it difficult to induce certificate owners to move from manual to automated methods, even though the move to automation can significantly reduce their work and risk. New automation tools (e.g., DevOps) and protocols have increased the methods and options by which automated certificate management can be successfully performed. Consequently, organizations should define clear guidelines and policies for automation and for when continued manual management is justified due to operational or organizational constraints.
Automation should be used wherever possible for the enrollment, installation, monitoring, and replacement of certificates, or justification should be provided for continuing to use manual methods that may cause operational security risks.
Certificate Services team: provide a central system that supports certificate owners in automating the management of their certificates
Certificate owners: automate the management of their certificates
5.1.10 Private Key Security
Each TLS server certificate has a corresponding private key that must be kept secret to prevent compromise. Often, the private keys used with TLS server certificates are stored in plaintext files, which may be accessible by administrators if not properly secured. Even when the files where private keys are stored are encrypted with passwords, the passwords are often stored in plaintext configuration files so that TLS servers can gain access to the private keys when they are started. It is possible to protect TLS private keys in HSMs; however, due to the large number of TLS servers where private keys would be required, many organizations have not used HSMs to protect private keys. Organizations should assess the criticality and risk of each TLS server and determine the appropriate level of protection required for private keys. Further, organizations should ensure that only authorized personnel have access to private keys and that the authorized personnel are trained in the processes necessary to keep the private keys secure.
Access to TLS server private keys stored in plaintext files should be limited to authorized personnel. For mission-critical systems, TLS private keys should be stored in an HSM.
Individuals granted access to private keys should complete training on procedures and practices for keeping private keys secure.
Certificate Services team: provide training on the proper procedures for keeping private keys secure, and provide automation to simplify the management of TLS private keys stored in HSMs
Certificate owners: ensure only authorized personnel are granted access to private keys, regularly review who is granted access to private keys, and ensure the authorized personnel receive training on the proper procedures for keeping private keys secure
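For private keys kept in plaintext files, one basic review step is confirming that the file is not readable by anyone other than its owning account. The following is a minimal sketch, assuming a POSIX file system; the function name is hypothetical, and the check does not cover access control lists or privileged accounts.

```python
import os
import stat


def key_file_restricted(path: str) -> bool:
    """Return True if only the file owner has any permission bits set (POSIX)."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    # Any group or other permission bit means more than the owner can reach the key.
    return (mode & (stat.S_IRWXG | stat.S_IRWXO)) == 0
```

A file with mode 0600 passes this check, while one with mode 0644 does not.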
5.1.11 Rekey/Rotation upon Reassignment/Terminations
Most private keys associated with TLS server certificates are stored in plaintext files. System administrators who manually manage TLS server certificates and associated private keys on their systems can make copies of the private-key files. Consequently, if a system administrator is reassigned or terminated, then the private key and certificate should be replaced (renewed) with a new key pair and certificate, and the previous certificate should be revoked, to prevent any malicious activities with the original private key and certificate. If automation is used for the management of certificates and private keys and if direct access by system administrators is limited (via limited-access controls and audit logging on any access), then certificate owners can avoid replacing certificates when a system administrator is reassigned or terminated.
Private keys and the associated certificates that have the capability of being directly accessed by an administrator should be replaced within <30> days of reassignment or <5> days of termination of that administrator.
Certificate Services team: provide automated certificate and key management services that remove the need for administrators to manually access private keys, alleviating the need to replace certificates and private keys when a system administrator is reassigned or terminated
Certificate owners: ensure manually managed certificates and private keys are replaced when a system administrator with access is reassigned or terminated
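The replacement windows recommended above can be expressed as a simple date calculation. The following is a minimal sketch, assuming the placeholder values of <30> and <5> days; the function and constant names are hypothetical and are not part of this guide.

```python
from datetime import date, timedelta

# Placeholder policy values taken from the <30> and <5> day recommendations;
# each organization would substitute its own values.
REPLACEMENT_WINDOW_DAYS = {"reassignment": 30, "termination": 5}


def replacement_deadline(event: str, event_date: date) -> date:
    """Return the latest date by which directly accessible keys and
    certificates should be replaced after a personnel event."""
    if event not in REPLACEMENT_WINDOW_DAYS:
        raise ValueError(f"unknown personnel event: {event!r}")
    return event_date + timedelta(days=REPLACEMENT_WINDOW_DAYS[event])


def must_replace(admin_had_direct_access: bool) -> bool:
    """Replacement is only needed when the administrator could directly
    access the private key (i.e., it was not shielded by automation)."""
    return admin_had_direct_access
```

For example, `replacement_deadline("termination", date(2020, 6, 1))` yields June 6, 2020, while keys that are managed only through automation with limited direct access fall outside the rule.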
5.1.12 Proactive Certificate Renewal
When a certificate is nearing expiration, it should be replaced. The replacement of certificates involves multiple steps, including reviewing and approving requests and testing the newly installed certificate(s) to ensure the application they secure is operating properly after replacement. If an unexpected issue is encountered with the new certificate and the associated private key, the previous certificate and private key can be restored and used if the certificate has not yet expired. If certificate owners are not proactive and instead wait until the last minute before requesting, obtaining, and installing a new certificate, this procrastination can cause unplanned, urgent work by multiple teams (including the Certificate Services team) and risk unplanned downtime for the application. Certificate owners should plan, initiate, and complete the certificate renewal, installation, and testing process several weeks ahead of certificate expiration to ensure unexpected issues and circumstances can be addressed and to avoid unnecessary “fire drills” for supporting teams (e.g., the Certificate Services team).
Certificates should be renewed, installed, and tested at least <30> days prior to expiration of the currently installed certificate.
If the validity period (total lifetime) of a certificate is shorter than <60> days (e.g., 20-day certificates used in short-lived/automated applications), then the certificate should be renewed before <80 percent> of the total validity period has elapsed.
Certificate Services team: provide automated services for monitoring certificate expiration dates, send reports to certificate owners showing certificates expiring in the next <60–90> days, send alerts and escalations to certificate owners for certificates expiring in <30> days or fewer, and send alerts to executives for certificates expiring in <30> days or fewer
Certificate owners: track upcoming expiration dates for their certificates, schedule replacement (in change windows where necessary), and ensure completion of certificate renewal, installation (of the new certificate), and verification of proper operation prior to the minimum renewal windows
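The two renewal rules above (a fixed number of days before expiration, or a percentage of the lifetime for short-lived certificates) can be combined into one calculation. The sketch below assumes the placeholder values <30> days, <60> days, and <80 percent>; the names are illustrative only.

```python
from datetime import datetime, timedelta

# Placeholder values from the recommendations above; adjust to local policy.
RENEW_DAYS_BEFORE_EXPIRY = 30     # <30> days
SHORT_LIVED_THRESHOLD_DAYS = 60   # <60> days
SHORT_LIVED_RENEW_PERCENT = 80    # <80 percent>


def renewal_deadline(not_before: datetime, not_after: datetime) -> datetime:
    """Return the latest time by which the certificate should be renewed,
    installed, and tested."""
    lifetime = not_after - not_before
    if lifetime < timedelta(days=SHORT_LIVED_THRESHOLD_DAYS):
        # Short-lived certificate: renew before 80 percent of its
        # validity period has elapsed.
        return not_before + lifetime * SHORT_LIVED_RENEW_PERCENT / 100
    # Otherwise: renew a fixed number of days before expiration.
    return not_after - timedelta(days=RENEW_DAYS_BEFORE_EXPIRY)
```

Under these assumptions, a 20-day certificate issued on January 1 should be renewed by January 17, and a one-year certificate expiring on January 1, 2021 should be renewed by December 2, 2020.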
There are several incidents that can require organizations to rapidly replace large numbers of certificates and private keys, including CA compromise or distrust, vulnerable algorithms, or bugs in cryptographic libraries. There have been multiple examples of these incidents in recent years, including the CA compromise of DigiNotar, the distrust of Symantec certificates by browser vendors, the deprecation of SHA-1 for signature generation, and cryptographic library bugs in Debian and Infineon. In 2006, NIST first recommended that organizations stop using SHA-1 for signatures. However, many organizations were still struggling to eradicate the use of certificates signed with SHA-1 in 2017, when their use was forcibly stopped by browser vendors.
An unexpected cryptographic incident can require an organization to rapidly respond to ensure that its operations and services to customers are not interrupted for an extended period. In addition, the industry is preparing for a transition [B2] to quantum-resistant algorithms, which will require organizations to replace large numbers of certificates and private keys.
System owners should maintain the ability to replace all certificates on their systems within <2> days to respond to security incidents such as CA compromise, vulnerable algorithms, or cryptographic library bugs.
System owners should maintain the ability to track the replacement of certificates so it is clear which systems are updated and which are not.
Select and establish contracts with backup CAs for public and internal certificates to enable rapid transition in response to a CA compromise.
Certificate Services team: document effective processes for replacing large numbers of certificates and private keys; train all certificate owners on certificate replacement processes; provide services, such as automation, that enable the rapid replacement of large numbers of certificates and private keys; actively track the occurrence of cryptographic incidents that require replacement of certificates and private keys, and communicate clearly to certificate owners when such an event occurs; and ensure contracts with backup CAs for both public certificates and internal certificates (if applicable) are in place
Certificate owners: proactively support crypto-agility by maintaining an inventory of all certificates for which they are responsible and corresponding ownership information, making sure that certificate replacement processes are as efficient as possible and that personnel are trained; and appropriately prioritize replacement of certificates and private keys when cryptographic incidents occur
If the private key associated with a TLS server certificate is compromised, then the certificate can be revoked by the CA so that potential relying parties are alerted and do not trust the certificate. Certificate owners should understand their responsibility in revoking certificates and should proactively revoke certificates when an incident occurs. Inadvertent or malicious revocation of a certificate can cause downtime for the application that it secures; therefore, organizations should ensure they have processes to prevent unauthorized revocation.
TLS server certificates should be revoked if the associated private key has been or is suspected of being compromised.
Revocation of a TLS server certificate outside the renewal/replacement process can be initiated only by a certificate owner or identified security personnel and should be approved by the Certificate Services team or a designated security approver.
Certificate Services team: provide the infrastructure and services to ensure that certificates can be rapidly and securely revoked when necessary and that certificates cannot be revoked without proper approval
Certificate owners: request revocation of old certificates that have been replaced but that are still valid, and request revocation of certificates when a private key is compromised or suspected to be compromised
5.1.15 Continuous Monitoring
Because of the broad use of TLS server certificates in all critical communications, operational or security failures related to TLS server certificates can significantly impact the business operations of organizations. TLS certificates should be continuously monitored to prevent outages and security vulnerabilities. The certificates should be monitored for impending expiration; for situations in which they are not operating, are not configured properly, or are vulnerable; and for situations in which they are not consistent with policy.
The expiration dates of certificates should be continuously monitored. Notifications should be automatically sent to certificate contacts <90, 60, and 30> days prior to expiration. If a certificate is not successfully renewed and replaced <30> days prior to expiration, then escalation notifications should be sent to the certificate owner management and incident response teams.
The operation and configuration of certificates should be periodically checked to identify any issues or vulnerabilities.
Certificates should be periodically checked to ensure they are consistent with policy.
Certificate Services team: provide systems and services for continuously monitoring TLS server certificates, and support certificate owners in implementing TLS server certificate continuous monitoring and in keeping it operational
Certificate owners: ensure continuous monitoring processes are in place and operational for all their TLS server certificates
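The expiration notification and escalation rule described above can be sketched as a function that a monitoring job might evaluate once per day for each certificate. The thresholds are the placeholder values <90, 60, and 30> days; the action names are hypothetical.

```python
from typing import List

# Placeholder thresholds from the recommendation above.
NOTIFY_DAYS = (90, 60, 30)  # days before expiration when contacts are notified
ESCALATE_DAYS = 30          # unrenewed certificates inside this window escalate


def actions_for(days_to_expiry: int, renewed: bool) -> List[str]:
    """Return the notification actions due today for one certificate.

    Assumes the check runs once per day, so each threshold is hit exactly once.
    """
    if renewed:
        return []
    actions = []
    if days_to_expiry in NOTIFY_DAYS:
        actions.append("notify_contacts")
    if days_to_expiry <= ESCALATE_DAYS:
        # Escalate to the owner's management and incident response teams.
        actions.append("escalate")
    return actions
```

In this sketch, escalation repeats daily once the window is reached; an organization that prefers a single escalation would track whether one has already been sent.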
5.1.16 Logging TLS Server Certificate Management Operations
TLS server certificates serve as trusted credentials that authenticate servers for mission-critical applications. Just as logging data access is required for forensics and other purposes, logging all certificate and private-key management operations is critical. Organizations should ensure they have a complete chain of custody for private keys and certificates that includes a log of all operations, including key-pair generation, certificate requests, request approval, certificate and key installation, the copying of certificates and keys (e.g., for load-balanced applications), certificate and key replacement, and certificate revocation. Logs should be collected and stored in a central location so the complete chain of events for certificates and private keys can be reviewed when necessary.
A complete automated log should be maintained of all TLS certificate and private-key management operations (from creation to installation to revocation) that includes a description of the operation performed, any relevant metadata about the event (e.g., the location of files), the identity of the person/application performing the operation, and the date/time it was performed.
Certificate Services team: provide a system for collecting all logged events, and provide tools that automatically log certificate and private-key management operations
Certificate owners: ensure all tools used for certificate and private-key management operations log events in a central log
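A log record that carries the elements listed above (operation, metadata, identity, and date/time) might look like the following. This is an illustrative sketch only; the field names and the JSON encoding are assumptions, not a format defined by this guide.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class CertLogEvent:
    operation: str        # e.g., "key_pair_generation", "certificate_revocation"
    actor: str            # person or application performing the operation
    certificate_id: str   # hypothetical identifier, e.g., a serial number
    metadata: dict = field(default_factory=dict)  # e.g., file locations
    timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )


def to_log_line(event: CertLogEvent) -> str:
    """Serialize an event as one JSON line for shipment to a central log."""
    return json.dumps(asdict(event), sort_keys=True)
```

Emitting one structured line per operation makes it straightforward for a central collector to reconstruct the chain of custody for a given certificate or key.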
5.1.17 TLS Traffic Monitoring
While providing authentication and confidentiality for legitimate communications and operations, TLS can also be used by attackers to hide their operations, such as scanning for vulnerabilities, leveraging vulnerabilities for privilege escalation, denial-of-service operations, and data exfiltration. Depending on organizational policy, in addition to monitoring the content of TLS communications for external-facing systems, organizations may monitor TLS communications between internal systems to retain the ability to detect attackers who are attempting to pivot between internal systems (to gain access to critical data) or are exfiltrating compromised data. For external-facing systems, monitoring is generally supported by decrypting traffic on systems located at organizational boundaries (such as load balancers). For internal traffic, monitoring may be accomplished in a variety of ways, including via proxy, endpoint software, or passive decryption. As discussed in Section 3.4, each organization should decide for itself whether the security risks posed by monitoring internal TLS traffic are worth the potential benefits of having visibility into the encrypted traffic. If the organization determines it is in its best interests to perform TLS traffic monitoring through passive decryption, then the recommended related requirements and responsibilities are as follows.
Where TLS monitoring via passive decryption is supported, TLS server private keys should be securely and automatically transferred to authorized TLS decryption devices and updated when TLS certificates are replaced.
Certificate Services team: provide a secure method for transporting TLS private keys between TLS servers and passive decryption devices when passive decryption is used for TLS traffic monitoring
Certificate owners: ensure all communications protected by TLS are monitored for unauthorized operations and data exfiltration
If the organization determines it is in its best interests to perform TLS traffic monitoring through means other than passive decryption, the following recommended responsibility applies.
Certificate owners: ensure all communications protected by TLS are monitored for unauthorized operations and data exfiltration
5.1.19 Certificate Transparency
Certificate Transparency (CT) provides a publicly searchable log of issued certificates. CT is primarily focused on certificates issued by public CAs. Some browsers require that certificates issued by public CAs be published to a publicly available CT log; otherwise, the browser will display a warning to the user. The availability of CT logs enables organizations to confirm that unauthorized certificates have not been issued for their domains.
CT logs should be regularly monitored to ensure unauthorized certificates have not been issued for any domains owned by the organization.
Certificate Services team: establish an automated process for monitoring CT logs
5.1.20 CA Trust by Relying Parties
Clients that connect to TLS servers verify the validity of those servers’ certificates by using CA certificates or root certificates that they store locally in their systems. Many operating systems and applications (e.g., browsers) are preloaded with certificates from public CAs that have met the requirements of standards organizations, such as the CA/Browser Forum. Some applications, such as browsers, may include more than 100 trusted CA certificates. To reduce their exposure to CA compromise incidents, organizations should minimize the CAs that their clients trust to only those they are likely to need to trust. For example, if certain systems acting as TLS clients are used only for internal operations, then they should trust only the certificate(s) from the internal CA(s). Furthermore, if certain TLS clients communicate with TLS servers from select partners, then certificates from only the CAs expected to be used by those partners should be trusted. Organizations should maintain an inventory of CA certificates trusted on all their systems, ensure only needed CAs are trusted, and maintain the ability to rapidly remove or replace CA certificates that should no longer be trusted.
CA certificates trusted by TLS clients should be limited to only those required to validate TLS certificates of the servers with which the client communicates. All unneeded CA certificates should be removed. The following CAs should never be trusted:
Certificate Services team: provide the technology and services for discovering and creating inventories of existing CA certificates and for managing (e.g., adding, removing) CA certificates
Certificate owners: limit CA trust to the minimum needed for each system and ensure all other CAs are removed
5.2 Establish a Certificate Service
Manually managing TLS server certificates is infeasible due to the large number of certificates in most enterprises. It is also not feasible for each certificate owner to create their own certificate management system. The most efficient and effective approach is for the Certificate Services team to provide a central Certificate Service that includes technology-based solutions that provide automation and that support certificate owners in effectively managing their certificates. This service should include the technology/services for CAs, certificate discovery, inventory management, reporting, monitoring, enrollment, installation, renewal, revocation, and other certificate management operations.
The central Certificate Service should also provide self-service access for certificate owners so they are able to configure and operate the services for their areas without requiring significant interaction with the Certificate Services team. Furthermore, the central Certificate Service should be able to integrate with other enterprise systems, including identity and access management systems, ticketing systems, configuration management databases, email, workflow, and logging and auditing.
Approved CAs should be designated and made available to certificate owners for requesting public and internal certificates. If, as is common, different CAs will be used for issuing public and internal certificates, then instructions should be provided to certificate owners to help them select the correct CA based on the purpose of the server where the certificate will be used. Establish backup CAs for both public and internal certificates, including completing contracts with backup public CAs so an immediate cutover is possible in case of a CA compromise, for business reasons, or because of some other motivation.
An up-to-date inventory of deployed TLS server certificates is the foundation of an effective certificate management program. The functionality required by an inventory system generally makes it infeasible for certificate owners to operate and manage their own inventory systems. It is imperative that the Certificate Services team provides a central system that certificate owners can use to maintain an inventory of their certificates. Without a central, up-to-date inventory, the Certificate Services team has no way of proactively monitoring for certificate-related security and operational risks or supporting certificate owners in minimizing such risks.
The central inventory system should provide the following characteristics and functions:
Automatic parsing: certificates contain multiple fields of information (e.g., subject, issuer, expiration date) that should be monitored. The inventory system should provide automatic parsing of the contents of certificates that are loaded into it so searches can be performed on individual fields
Additional metadata: It should be possible to associate additional information/metadata with each certificate (e.g., identifiers of the owners and approvers; installed locations; application identifiers; cost center numbers)
Organization: With hundreds or thousands of certificates spread across many certificate owners and geographic locations, the inventory system should support organizing certificates into distinct groups/folders
Access controls: To prevent unauthorized actions, it should be possible to define and enforce access controls that are assigned to groups or individuals
Support certificate management: As the foundation of a certificate management program, the inventory system should integrate with and support all other certificate management functions (e.g., discovery, enrollment portal, approvals, automation)
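Several of the inventory characteristics above (parsed fields, additional metadata, grouping, and field-level search) can be illustrated with a small in-memory model. This is a minimal sketch under the assumption that certificate fields have already been parsed on import; the class and field names are hypothetical, and access controls and integrations are omitted.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Fields assumed to have been parsed out of each certificate on import.
PARSED_FIELDS = ("subject", "issuer", "not_after")


@dataclass
class InventoryEntry:
    subject: str
    issuer: str
    not_after: str  # expiration date in ISO 8601 form (YYYY-MM-DD)
    group: str      # group/folder used to organize certificates
    metadata: Dict[str, str] = field(default_factory=dict)  # owner, cost center, ...


class Inventory:
    def __init__(self) -> None:
        self._entries: List[InventoryEntry] = []

    def add(self, entry: InventoryEntry) -> None:
        self._entries.append(entry)

    def search(self, **criteria: str) -> List[InventoryEntry]:
        """Return entries whose parsed fields, group, or metadata match
        every criterion exactly."""
        def value(entry: InventoryEntry, key: str):
            if key in PARSED_FIELDS or key == "group":
                return getattr(entry, key)
            return entry.metadata.get(key)

        return [e for e in self._entries
                if all(value(e, k) == v for k, v in criteria.items())]

    def expiring_before(self, iso_date: str) -> List[InventoryEntry]:
        # ISO 8601 dates in the same format compare correctly as strings.
        return [e for e in self._entries if e.not_after < iso_date]
```

A query such as `inventory.search(group="web", owner="bob")` then combines an organizational group with owner metadata, which is the kind of per-field search that automatic parsing is meant to enable.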
5.2.3 Discovery and Import
Manually establishing and maintaining an up-to-date and comprehensive inventory is difficult, if not impossible. Because of the complexity of most enterprise environments — which contain firewalls, different security/operations restrictions, etc. — it is often not sufficient to have a single method of automatically populating and maintaining an inventory. The central Certificate Service should provide multiple options for automated discovery and the import of certificates, including those listed below:
CA import: automated import of certificates from CAs. This is often the fastest way to initially populate the certificate inventory. However, it will only provide an inventory of certificates from known CAs
Network discovery: automated scanning of one or more configurable sets of IP addresses, IP address ranges, and ports for TLS server certificates. This helps provide a comprehensive view of all certificates and their locations. Organizations typically find certificates from unapproved CAs and self-signed certificates (which should likely be replaced with certificates from approved CAs). The network discovery service should support operation across multiple network zones separated by firewalls
Configuration discovery: Network discovery can find certificates and determine their network location(s); however, it does not allow for collection of configuration information, such as the type of key store (e.g., Privacy Enhanced Mail, Public Key Cryptography Standards [PKCS] #12 [B9], HSM), the storage location on the server, and other information that can be helpful in detecting issues and in setting up automated management for the certificate. The inventory system should provide a means of discovering certificate configuration information via an authenticated connection or agent
Bulk import: In addition to network discovery and CA import, it is beneficial to have the option for administrators to import certificate data. This helps in cases where network discovery and CA import are not possible and in cases where there is additional information/metadata (e.g., contacts, approvers, cost centers) that can be associated with each certificate to help in tracking and management.
Figure 5-1 depicts options for automated discovery and import of certificates.
Figure 5‑1 Various Options for Automated Discovery and the Import of Certificates
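The network discovery option can be illustrated with a short sketch that expands configured address ranges and ports into scan targets and retrieves whatever certificate each endpoint presents. It uses only the Python standard library; the function names, timeout, and example ranges are assumptions. Certificate validation is deliberately disabled because the goal of discovery is to inventory all certificates, including self-signed ones and those from unapproved CAs.

```python
import ipaddress
import socket
import ssl
from typing import Iterable, Iterator, Optional, Tuple


def expand_targets(networks: Iterable[str],
                   ports: Iterable[int]) -> Iterator[Tuple[str, int]]:
    """Yield every (host, port) pair for the configured ranges and ports."""
    port_list = list(ports)
    for network in networks:
        for host in ipaddress.ip_network(network).hosts():
            for port in port_list:
                yield str(host), port


def fetch_certificate(host: str, port: int,
                      timeout: float = 3.0) -> Optional[str]:
    """Return the server certificate in PEM form, or None if unreachable."""
    context = ssl.create_default_context()
    context.check_hostname = False       # discovery only: accept any name
    context.verify_mode = ssl.CERT_NONE  # discovery only: accept any issuer
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            with context.wrap_socket(sock, server_hostname=host) as tls:
                der = tls.getpeercert(binary_form=True)
    except OSError:  # includes timeouts, refused connections, and TLS errors
        return None
    return ssl.DER_cert_to_PEM_cert(der) if der else None


def discover(networks: Iterable[str], ports: Iterable[int]):
    """Yield (host, port, pem) for every endpoint that presented a certificate."""
    for host, port in expand_targets(networks, ports):
        pem = fetch_certificate(host, port)
        if pem is not None:
            yield host, port, pem
```

A production scanner would typically run from multiple network zones, scan in parallel, and feed results into the central inventory; this sketch is sequential and shows only the core retrieval step.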
5.2.4 Management Interfaces
Certificate owners and the Certificate Services team should be provided with user interfaces to view and manage certificates. The interfaces should be simple enough to support certificate owners who have small numbers of certificates and perform management operations infrequently. The interfaces should also offer more‑sophisticated functionality to support the needs of certificate owners with large numbers of certificates and the needs of the Certificate Services team.
The interfaces should provide the following characteristics and functions:
Inventory view: Certificate owners should be able to view their certificates (to which they have been granted access). The Certificate Services team should be able to view the entire inventory.
Searching and filtering: Certificate owners with large numbers of certificates, and the Certificate Services team, should be able to perform search and filter operations so they can quickly find specific certificates.
Enrollment and renewal: The portal should provide a simple method to request new certificates and to renew existing certificates. Having a single interface for enrollment and renewal across all CAs reduces the retraining needed when moving CAs, resulting in better crypto-agility.
Approvals: If an external system is not used for reviewing certificate requests, then the portal should provide a method for an approver to perform RA functions to review the relevant details of certificate requests and to approve/reject the requests with comments.
5.2.5 Automated Enrollment and Installation
Manually requesting, installing, and managing large numbers of certificates is error-prone and resource‑intensive; increases security risk; and does not allow for a rapid response to large-scale incidents, such as CA compromises. In cloud environments, the ability to quickly spin up new instances to support increased loads is critical. Because most enterprises have a range of systems from different vendors with diverse management methods, the central Certificate Service should offer multiple options for automation, including those listed below:
Programmatic automation: The central Certificate Service should provide a set of application programming interfaces (APIs) (e.g., Representational State Transfer) that enable enrollment, revocation, reporting, etc. The central Certificate Service should support easy integration with and access from DevOps frameworks and other programming tools.
Standard protocol support: The central Certificate Service should support standard protocols for requesting certificates, including the Simple Certificate Enrollment Protocol (SCEP) [B15], Automated Certificate Management Environment (ACME), and Enrollment over Secure Transport (EST).
Proprietary automation: Some systems may not support programmatic or standards-based enrollment and installation but may provide other methods (e.g., APIs, command-line utilities) that can be used to automate certificate enrollment and installation. This may be performed with an agent or via a remote authenticated connection.
Secure key transport: Within organizations that, by policy, permit TLS traffic monitoring and enable detection of encrypted threats by using passive decryption devices, the central Certificate Service should provide the ability to securely transport TLS private keys from TLS servers to the decryption devices that enable inspection of encrypted communications.
Automation should support integration with HSMs when HSMs are used for protection of private keys.
Certificate requests should be reviewed and vetted to ensure unauthorized certificates are not issued or used for malicious purposes. Large enterprises generally have hundreds of different departments, business applications, projects, and systems administrators, making it infeasible for a central group to have the relevant knowledge needed to vet requests. The central Certificate Service should provide the ability to assign individuals (e.g., application owners) to review certificate requests for their respective areas. Once approvers are assigned, the central Certificate Service should automatically route certificate requests to assigned reviewers for approval and enable them to review any relevant data needed to properly vet requests.
5.2.7 Reporting and Analytics
To address TLS server certificate-related risks, certificate owners and the Certificate Services team should have visibility across their inventory and be able to quickly identify TLS server certificate issues or vulnerabilities. The most efficient method of addressing risks is proactive notifications sent by the central Certificate Service, based on configured rules. However, reports and dashboards can help in planning (e.g., an unexpectedly large number of certificate expirations coming in the next few weeks) and identifying anomalies that would otherwise not be caught by the automated rules. The central Certificate Service should support the following reporting and analysis tools:
Custom reporting: Users should be able to create customized reports, including the data to be presented, the filtering criteria for the results, the scheduling of execution, and the selection of report recipients.
Dashboards: To help in identifying anomalies or unexpected issues, dashboards should proactively highlight risks, such as certificates with weak keys, vulnerable algorithms, impending expirations, operational errors, and other issues.
Interfaces to monitoring systems: Many organizations rely upon automated security incident and event monitoring systems that collect, analyze, and correlate information that is subsequently displayed or used to notify humans of events and the actions required. Certificate-related anomalies and issues should be delivered to such systems.
5.2.8 Passive Decryption Support
If passive decryption devices are used to monitor TLS-encrypted communications for attacks, then those devices must have copies of the private keys from all monitored TLS servers so the devices are able to decrypt TLS traffic to those servers. Manually transporting private keys from TLS servers to passive decryption devices creates risk of a compromise. Consequently, when passive decryption is used, the central Certificate Service should provide an automated and secure method for transporting private keys from TLS servers to passive decryption devices and for keeping the private keys up-to-date when new keys (and certificates) are deployed.
5.2.9 Continuous Monitoring
To prevent operational or security incidents, the certificates should be continuously monitored across the enterprise. Continuous monitoring should include the following types of monitoring:
Expiration monitoring: To prevent outages due to expired certificates, the expiration dates for all certificates should be monitored. It should be possible to configure the time periods when notifications will be sent to certificate contacts prior to expiration (e.g., 90 days, 60 days, 30 days). If timely action is not taken, then it should be possible to escalate and send notifications to managers or a central incident response team.
Operation/configuration monitoring: Once a known good state is established (e.g., the location and configuration of certificates), the central Certificate Service should monitor and detect situations in which certificates are not operating, are not configured properly, or are vulnerable.
Policy compliance: The central Certificate Service should detect and send alerts when deployed certificates are not consistent with policy.
Because certificate expirations are a regular occurrence, especially for certificate owners with large numbers of certificates, it is important to not inundate certificate owners with notifications, as they will likely start to ignore them. An effective strategy is to combine the use of reports, change tickets, and alerts. Sending regular (e.g., monthly) reports containing a list of certificates expiring within a certain number of days (e.g., 120 days) helps certificate owners plan for expirations. Automatically creating change tickets in the organization’s central ticketing system can ensure certificate renewals and replacements are handled in the same way that other change operations are performed. Sending alerts within 30 days of expiration and escalating to management and incident response teams ensures certificates not replaced in a timely fashion are identified before they expire. Figure 5‑2 provides an example schedule for reports, tickets, and alerts.
Figure 5‑2 Example Timeline of Processes and Notifications Triggered by Impending Certificate Expiration
Management of TLS server certificates in an enterprise environment is complex, time-consuming, error-prone, and security-sensitive. Most certificate owners are not knowledgeable about TLS server certificates, the processes for effectively managing certificates, or their own certificate-related responsibilities. Consequently, the Certificate Services team should provide readily accessible educational materials, preferably online and available on demand. The TLS server certificate educational materials should include the following items:
basic introduction to certificates and keys (e.g., when certificates are used, obtaining certificates, protecting keys, certificate changes, revocation)
risks of improper TLS server certificate management
explanation of TLS server certificate policies and certificate owner responsibilities
step-by-step instructions for managing TLS server certificates, including any of the following steps offered via the central Certificate Service:
creating an inventory
reviewing the inventory and identifying risks/vulnerabilities (e.g., generating reports)
manually requesting and installing TLS server certificates on each relevant operating system/application (e.g., Apache)
DevOps/API-based request and installation
agentless automated installation
agent-based automated installation
There are many educational resources available on the internet that can alleviate the need to create new materials. An internal TLS server certificate education website can include links to helpful web pages and websites.
5.2.11 Help Desk
In addition to educational materials, certificate owners should have a central support service that they can contact with questions and that can assist in troubleshooting issues. Many certificate owners may be new to TLS server certificate management or responsible for only a small number of certificates (e.g., one to five certificates) and will likely need assistance in successfully performing necessary operations. Any certificate owner calling the help desk should be required to have completed the educational programs that apply to their use cases so that help-desk personnel do not need to explain basic concepts that can be learned prior to the request for help.
TLS server certificates are typically installed or renewed during scheduled maintenance windows, which are often scheduled on weekends and/or in the middle of the night. Issues related to TLS server certificates can often arise during these scheduled maintenance operations; therefore, help-desk personnel should be made available during all times when certificate issues may arise (e.g., 24 hours a day, seven days a week). Help-desk personnel should be knowledgeable about and experienced in TLS server certificate management. It is possible to have general help-desk personnel answer and address Level One certificate calls and escalate to more-experienced personnel as needed for Level Two and Level Three calls.
5.3 Terms of Service
It is helpful to define the terms of service for the central Certificate Service to avoid confusion by certificate owners about the services they will receive and their responsibilities. The terms of service should include those listed below:
description of the services provided (e.g., network discovery, monitoring enrollment, automation)
responsibilities of the certificate owners and the Certificate Services team (e.g., the Certificate Services team will help with network discovery, but a certificate owner is responsible for working with the network team to allow the discovery on their systems)
expected service levels (stated in service level agreements), with response times
Due to the fundamental role that TLS server certificates play in securing data and systems, periodic reviews of TLS server certificate management practices are essential. Auditors should confirm that TLS server certificate policy requirements are addressed. For example, all certificate owners should be able to demonstrate they have a certificate inventory and to describe the steps they have taken to ensure all certificates are included in the inventory. The Certificate Services team should demonstrate it is providing the services needed for certificate owners to comply with policy.
TLS server certificate risks can lie latent for long periods and then unexpectedly have a significant impact on an organization’s operations, through either operational outages or security issues. Consequently, regular audits of certificate management practices by compliance auditors are critical to preventing unanticipated issues.
6 Implementing a Successful Program
The broad distribution of TLS server certificates across distinct groups, networks, and systems can present unique challenges in implementing an effective certificate management program across an enterprise environment. The following resources are helpful for successful implementation:
Executive owner: It is essential to have an executive owner for the certificate management program. This executive owner should be prepared to educate the executives of each group of certificate owners on TLS server certificate risks and the executives’ responsibilities.
Prioritization of risks: Each organization has different challenges and priorities related to TLS server certificates. Although the best practices detailed in this practice guide are intended to help address all the risks related to TLS server certificates, it is helpful to prioritize those risks based on historical certificate issues and business needs. This prioritization can help in communications with certificate owners and with setting objectives and prioritizing tasks.
Objectives: Establishing clear and achievable objectives provides targets, helps focus efforts, and improves the likelihood of successful implementation. For example, if an organization finds it does not have an inventory and recognizes there are two groups that may be difficult to inventory in the near term, then one objective may be to create an inventory of all other groups’ TLS server certificates in the next 12 months.
Action plan: An action plan with specific tasks, responsibilities, and milestones, geared to achieve the objectives, should be created, communicated, and reviewed by all stakeholders (e.g., certificate owners, Certificate Services team, executive owner). The action plan should be prioritized to address the most important objectives first. For example, an action plan might include the following objectives:
30 days from the start of the project:
complete certificate imports from CA1, CA2, and CA3
require certificate enrollment through the central Certificate Service portal and prevent enrollment directly to CAs
90 days from the start of the project:
complete network discovery across all North American and European data centers
complete the assignment of certificate owners for all certificates in inventory
180 days from the start of the project:
automate certificate enrollment and installation on all load balancers
automate certificate enrollment and installation for all e-commerce web servers
complete network discovery across all Asia-Pacific data centers
Regular executive reviews: The objectives and action plan should be reviewed with the executive owner at commencement of the project, and regular reviews should be scheduled (e.g., every 90 days) to track progress. During these reviews, the executive owner should note areas where additional action by certificate owners is needed so the executive owner can proactively communicate with peer executives to ensure action is taken.
Periodic audits: Due to the critical role that TLS server certificates play in the security and operations of organizations, and the risks resulting from improper management, regular audits should confirm the Certificate Services team and certificate owners are fulfilling their responsibilities in TLS server certificate management.
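The example action plan and the 90-day review cadence above can be turned into calendar dates once a project start date is known. The sketch below shows one way to do that. The `ACTION_PLAN` structure, the function names, and the start date of July 1, 2020 are assumptions made for illustration; the task text and the 30-, 90-, and 180-day offsets are taken from the example plan.

```python
from datetime import date, timedelta

# Tasks from the example action plan, keyed by days from the start of the project.
ACTION_PLAN = {
    30: [
        "complete certificate imports from CA1, CA2, and CA3",
        "require enrollment through the central Certificate Service portal",
    ],
    90: [
        "complete network discovery across North American and European data centers",
        "complete the assignment of certificate owners for all certificates in inventory",
    ],
    180: [
        "automate enrollment and installation on all load balancers",
        "automate enrollment and installation for all e-commerce web servers",
        "complete network discovery across all Asia-Pacific data centers",
    ],
}


def milestone_dates(start, plan=ACTION_PLAN):
    """Map each day offset in the plan to a calendar due date."""
    return {start + timedelta(days=offset): tasks
            for offset, tasks in sorted(plan.items())}


def review_dates(start, count, interval_days=90):
    """Executive reviews: one at commencement, then one every interval_days."""
    return [start + timedelta(days=interval_days * i) for i in range(count)]


project_start = date(2020, 7, 1)  # hypothetical start date
schedule = milestone_dates(project_start)
reviews = review_dates(project_start, count=3)

for due, tasks in schedule.items():
    print(due.isoformat())
    for task in tasks:
        print(f"  - {task}")
print("Executive reviews:", ", ".join(d.isoformat() for d in reviews))
```

With these assumed inputs, the second and third reviews fall on the same dates as the 90-day and 180-day milestones, so each review can assess a completed set of tasks.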
Security testing should be defined as part of the organization’s policies. Before going live with any recommendations in this document, authorization from the security team should be provided, as specified by security policy.