#2 What is HTTPS and why is it important?

HyperText Transfer Protocol over Secure Socket Layer (HTTPS), also known as HTTP over TLS, HTTP over SSL, and HTTP Secure, is a protocol for secure communication across a computer network and is widely used on the Internet (Wikipedia).

But why is it so important?

Let’s start by stating that with both HTTP and HTTPS we are using the HyperText Transfer Protocol. In the R world this usually means that we are accessing web interfaces (RStudio Server, a Shiny dashboard, an HTML page generated with RMarkdown, etc.) or using web APIs (httr on the client side, Plumber on the server side). For simplicity, we can imagine that we are talking about a Shiny application.

HTTPS protects all communication with the server: the passwords entered, the data uploaded and the operations carried out, and also, in the opposite direction, the data displayed and the server’s responses. With plain HTTP, by contrast, all of this communication is exposed.

To clarify, we outline the following concepts:

user authentication
the confidentiality of the information exchanged
server authentication

User authentication is usually obtained by requesting a username and password. It serves to recognize the user, granting access to authorized users and denying it to everyone else. However, authentication alone does not protect the traffic: once an authorized user is logged in, all communication between the client (the user’s browser) and the server still crosses the network and is exposed to prying eyes. Data, and the passwords themselves, can therefore be intercepted and stolen with some ease.

The confidentiality of the information exchanged, by contrast, is usually obtained by encrypting the traffic between client and server. In this way even intercepted information is unreadable and cannot be altered without the tampering being detected.

Server authentication is the mechanism by which the client (the browser) confirms that the server responding to its connection request is authentic. This is necessary to prevent attacks in which a third-party server takes the place of the legitimate one and, presenting itself exactly like the original, gets confidential information such as usernames and passwords delivered to it. This kind of attack is not that difficult to engineer: the malicious server can sit in the middle of the communication, replacing the end point while still relaying our traffic to the legitimate server (to learn more, see the man-in-the-middle attack).
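To make the risk to passwords concrete, here is a minimal sketch in Python. It assumes the application uses HTTP Basic authentication (an assumption for illustration; a login form sent over plain HTTP exposes credentials in much the same way). The function names and the credentials are hypothetical. The point is that the header is only encoded, not encrypted, so anyone who can read the traffic can recover the password.

```python
import base64


def basic_auth_header(username: str, password: str) -> str:
    """Build the Authorization header value that HTTP Basic authentication sends."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return f"Basic {token}"


def recover_credentials(header_value: str) -> tuple[str, str]:
    """What an eavesdropper on a plain HTTP connection can do with that header."""
    scheme, _, token = header_value.partition(" ")
    if scheme != "Basic":
        raise ValueError("not a Basic authentication header")
    # Base64 is an encoding, not encryption: decoding needs no key.
    username, _, password = base64.b64decode(token).decode("utf-8").partition(":")
    return username, password


# Hypothetical credentials, for illustration only.
header = basic_auth_header("analyst", "s3cret")
print(header)
print(recover_credentials(header))
```

Over HTTPS the same header is still sent, but it travels inside the encrypted channel, so it is not readable on the network.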

When data is sensitive, there are also legal reasons to protect it. The GDPR, the EU data protection regulation, requires the “… ability to ensure the confidentiality, integrity, availability and resilience of processing systems and services on a permanent basis” (translated from the Italian version; see GDPR compliance (IT)). Professional relationships, in turn, are often regulated by non-disclosure agreements (NDAs), and data protection is often part of these agreements.

How does it work?

HTTPS provides secure HTTP communication because the traffic is encrypted from source to destination (end-to-end) using SSL (Secure Sockets Layer) or its successor TLS (Transport Layer Security). Furthermore, thanks to the certificate that the server presents during the SSL handshake, our client (the browser) is able to verify the authenticity of the server it is connected to, that is, that the server really is who it claims to be.
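As a rough illustration of what the client side does, the following Python sketch uses the standard `ssl` module. The function names are our own, and the connection helper needs network access, so it is defined but not called here. It is a sketch of the idea, not a description of how any particular browser is implemented.

```python
import socket
import ssl


def make_client_context() -> ssl.SSLContext:
    """Client-side TLS settings comparable to what a browser enforces."""
    # create_default_context() loads the system's trusted CA certificates and
    # turns on both certificate verification and hostname checking.
    return ssl.create_default_context()


def describe_connection(hostname: str, port: int = 443) -> dict:
    """Open a TLS connection and report what was negotiated (needs network access)."""
    context = make_client_context()
    with socket.create_connection((hostname, port), timeout=10) as raw_sock:
        # server_hostname is the name the server certificate is checked against.
        with context.wrap_socket(raw_sock, server_hostname=hostname) as tls_sock:
            return {
                "protocol": tls_sock.version(),   # negotiated protocol version
                "cipher": tls_sock.cipher()[0],   # name of the negotiated cipher suite
                "certificate": tls_sock.getpeercert(),
            }


context = make_client_context()
print(context.verify_mode == ssl.CERT_REQUIRED)  # certificates must be valid
print(context.check_hostname)                    # and must match the host name
```

If the certificate cannot be verified, `wrap_socket` raises an error instead of returning a connection, which is the programmatic counterpart of the browser warning.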

In technical terms, using HTTPS technology brings several security advantages in terms of:

Encryption: confidential information is protected from interception because encryption makes it unreadable. All information exchanged with the server remains private.
Integrity: the data cannot be manipulated or altered in transit by a malicious party without the change being detected.
Authenticity: the server we are connecting to is the right one, so the passwords and data we send reach the intended recipient.

Important note: an SSL certificate signed by a certification authority (CA) is required to properly authenticate the server; otherwise the browser will warn that the website’s certificate cannot be verified. The browser checks that the certificate presented by the server was issued for the domain name you are connecting to, for example rinproduction.com. If the check succeeds, you are connected to the right server.
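The domain name check can be sketched as follows. This is a deliberately simplified illustration in Python: the function names and the example certificate are hypothetical, only the dictionary layout follows what `ssl.SSLSocket.getpeercert()` returns, and real TLS libraries apply stricter rules (and also verify the CA signature, which is not shown here).

```python
def name_matches(pattern: str, hostname: str) -> bool:
    """Simplified match of one certificate DNS name against the requested host.

    Only a wildcard that is the whole left-most label is supported
    ("*.example.com"); real TLS libraries apply stricter rules.
    """
    pattern_labels = pattern.lower().split(".")
    host_labels = hostname.lower().rstrip(".").split(".")
    if len(pattern_labels) != len(host_labels):
        return False
    if pattern_labels[0] == "*":
        # The wildcard stands for exactly one label, never for several.
        return pattern_labels[1:] == host_labels[1:]
    return pattern_labels == host_labels


def certificate_covers(cert: dict, hostname: str) -> bool:
    """Check the DNS entries of subjectAltName, in getpeercert() dict format."""
    dns_names = [value for kind, value in cert.get("subjectAltName", ()) if kind == "DNS"]
    return any(name_matches(name, hostname) for name in dns_names)


# Hypothetical certificate contents, shaped like the dict returned by getpeercert().
example_cert = {
    "subjectAltName": (("DNS", "rinproduction.com"), ("DNS", "*.rinproduction.com")),
}
print(certificate_covers(example_cert, "rinproduction.com"))          # True
print(certificate_covers(example_cert, "www.rinproduction.com"))      # True
print(certificate_covers(example_cert, "rinproduction.example.net"))  # False
```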

HTTPS is the future of the web

Today, modern browsers alert the user when the site they are visiting is not fully protected by HTTPS. It goes without saying that if the service we offer is to be considered reliable, it must support this kind of protection to be well accepted by browsers. HTTPS is thus becoming the minimum standard. In fact, Google announced some time ago (6 August 2014) that sites served over HTTPS are considered more reliable, so the protocol has in effect also become a ranking factor for the search engine.

If you just use HTTPS without a valid certificate, your user will get this alarming message when logging into your application.

Invalid certificate

If, on the other hand, everything is correct, you will see the “green lock”: the connection is secure and the server is the one registered in your name.

Valid certificate

Open Source Software Issues

Unfortunately, HTTPS is precisely one of the features missing from the Open Source versions of the most widely used software in the R environment. Shiny Server and RStudio Server support HTTPS connections only in their paid Professional versions. If the available budget does not allow you to purchase these licenses, you need to use other tools, such as ShinyProxy, which provides an Open Source Shiny server with support for authentication and HTTPS, or to put another web server between the client and Shiny that acts as the HTTPS server in front of the HTTP connection offered by the Open Source software. Note that there is no conflict between Open Source and HTTPS: plenty of Open Source software has this feature. It is the tools offered to the R world that provide it only for a fee.

Putting an HTTPS server in front of an HTTP connection is possible with Open Source technologies (if you are interested, this could be the subject of one of the next articles).

Alternatives to HTTPS

Apart from using this secure protocol, or a private network that is possibly isolated from the outside, the only alternative is to use HTTP over a secured network. These solutions replace the server authentication layer and SSL encryption with one of the following:

SSH encrypted tunnel (Secure SHell)
VPN (Virtual Private Network)
Other encrypted tunneling systems for TCP/IP or HTTP

Technical differences with HTTP

Beyond the functional differences just described, there are other purely technical differences.

The standard port used by HTTP is 80, while HTTPS uses 443. Note: the port in question is a TCP port (HTTP and HTTPS are built on top of TCP/IP), and it identifies the channel on which a server application is listening. Consequently, when you connect to http://rinproduction.com you are contacting the server rinproduction.com on port 80, while with https://rinproduction.com you are connecting to the same server on port 443. This means that if you want to steer users who ask for HTTP towards the secure protocol, you have to keep the server listening on port 80 and issue a redirect. For further information, you will find plenty of documentation online.
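The default ports and the redirect can be sketched with Python’s standard library. This is only an illustration under a few assumptions: the helper names are our own, the `Host` header is assumed to be a plain host name (not an IPv6 literal), and in production the redirect would normally be configured in the web server placed in front of the application rather than written by hand.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def effective_port(url: str) -> int:
    """Return the TCP port a URL points to, falling back to the scheme default."""
    parts = urlsplit(url)
    return parts.port if parts.port is not None else DEFAULT_PORTS[parts.scheme]


def https_location(host_header: str, path: str) -> str:
    """Build the HTTPS URL to redirect to, dropping any explicit HTTP port."""
    hostname = host_header.split(":", 1)[0]  # assumes a plain host name
    return f"https://{hostname}{path}"


class RedirectToHttps(BaseHTTPRequestHandler):
    """Answer every GET or HEAD request with a permanent redirect to HTTPS."""

    def do_GET(self):
        host = self.headers.get("Host", "localhost")
        self.send_response(301)
        self.send_header("Location", https_location(host, self.path))
        self.send_header("Content-Length", "0")
        self.end_headers()

    do_HEAD = do_GET

    def log_message(self, format, *args):
        pass  # keep the sketch quiet


def serve_redirects(port: int = 80) -> None:
    """Listen on the HTTP port (binding to 80 usually needs elevated privileges)."""
    HTTPServer(("", port), RedirectToHttps).serve_forever()


print(effective_port("http://rinproduction.com"))   # 80
print(effective_port("https://rinproduction.com"))  # 443
```

Calling `serve_redirects()` would keep a process listening on port 80 whose only job is to send browsers to the same path on port 443.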

HTTPS is fast. Although it requires a more complex process than HTTP, it gives access to mechanisms that often make it faster, such as HTTP/2, which browsers support only over encrypted connections. Find an example here: http://www.httpvshttps.com/

Conclusion

Users expect that when they access an online service, what they do remains confidential, so it is important that the technology meets this expectation. HTTPS does this while remaining compatible with everything that works over HTTP.

Credits

By Sean MacEntee - https://flic.kr/p/qi1eYu, CC BY 2.0, https://commons.wikimedia.org/w/index.php?curid=41015981
Image CC BY-SA 3.0, Link
Image MPL 2, Link