Protect your chain of custody with content hashing and timestamping
Tim Thorne : Tue, Nov 22, '22
Digital forensics has been practised for over 40 years. Although often seen as a reactive activity, it is now proving to be a vital part of the cybersecurity stack and is increasingly effective when deployed proactively.
Digital forensic techniques and methodologies developed over those 40 years can now add significant value to the overall cybersecurity and incident response processes for anyone engaged in DFIR.
Binalyze have been disrupting and innovating in this DFIR space for the last 5 years by delivering faster containment, remediation and investigative solutions.
In this blog we will take a look at how Binalyze AIR uses SHA-256 hashing in conjunction with RFC 3161 digital timestamp certificates to protect data content, provide guarantees as to exactly when that protected content originally existed, and provide assurance that it has not been changed since.
Let's go back to the start of Digital Forensics
In the 1950s Hans Peter Luhn, a scientist at IBM, developed a formula known as the ‘modulus 10’ or ‘Mod-10’ algorithm. This allowed a ‘check digit’ to be generated for number sequences such as those used on ID cards, bank cards or, more recently, mobile phone IMEI numbers. As an ex-police officer from London I can tell you that this method was also used to verify police warrant card numbers.
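As a rough illustration of how a Mod-10 check digit is derived, here is a minimal Python sketch. The function names are hypothetical and chosen only for this example; the sample number is the one used in the Luhn algorithm reference listed at the end of this post.

```python
def luhn_check_digit(payload: str) -> int:
    """Return the Mod-10 check digit for a string of decimal digits."""
    total = 0
    # Walk right to left; the rightmost payload digit is doubled first.
    for index, char in enumerate(reversed(payload)):
        digit = int(char)
        if index % 2 == 0:
            digit *= 2
            if digit > 9:
                digit -= 9  # same as adding the two digits of the product
        total += digit
    return (10 - total % 10) % 10


def luhn_is_valid(number: str) -> bool:
    """Check a full number whose last digit is its Luhn check digit."""
    return luhn_check_digit(number[:-1]) == int(number[-1])


print(luhn_check_digit("7992739871"))   # 3
print(luhn_is_valid("79927398713"))     # True
```

A check digit like this catches typing mistakes, but anyone can recompute it, which is why it offers no protection against deliberate tampering.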
Mod-10 was never intended to provide a cryptographically secure hash function. It was not until 1990, when none other than Ronald Rivest (one of the inventors of the RSA algorithm) published the MD4 Message Digest Algorithm, that a 128-bit message digest could be generated via a truly cryptographic hash function. This is a one-way function: it is not feasible to work back from a hash value to the data that originally generated it.
This one-way property is why hashing underpins password storage: the stored hash does not reveal the password itself, although weak passwords can still be guessed, which is why salting and deliberately slow hash functions are used in practice. It is also why cryptocurrency and blockchain technologies are so reliant on hashing.
Pressure to evolve
However, thanks to the extraordinary and ever-increasing processing power of modern computers, it is now possible with older algorithms to generate two different files that have the same hash value. This is often referred to as a ‘hash collision’.
MD4 suffered such a collision in 1995, but by then MD5 was available, as was SHA-1, which produces a 160-bit hash output. Even so, in 2017 an attack called ‘SHAttered’ demonstrated a practical SHA-1 collision, showing that SHA-1 was no longer strong enough for all of the technologies that relied on it.
Right here, right now
That brings us right up to date with SHA-256, which is currently considered a secure hashing algorithm: there have yet to be any documented collisions against its 256-bit output.
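To make the 256-bit output concrete, the short Python sketch below hashes two inputs that differ by a single byte using the standard library's `hashlib`. The sample byte strings are invented for illustration and have nothing to do with AIR's own data.

```python
import hashlib

original = b"collected evidence"
altered = b"collected evidence."   # one extra byte

digest_a = hashlib.sha256(original).hexdigest()
digest_b = hashlib.sha256(altered).hexdigest()

print(len(digest_a) * 4)      # 64 hex characters x 4 bits each = 256
print(digest_a == digest_b)   # False: any change produces a different digest
```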
At Binalyze we use SHA-256 to hash all of the files collected by Binalyze AIR, and then we take this to the next level: we hash our .ppc collection report and send that value to the DigiCert Trusted Timestamp Server to generate a certificate. This proves not only that the report and all of the data associated with it exist exactly as they did on acquisition, but also that they did so at the date and time notarized by a Trusted Timestamp Authority (TSA) certificate.
So, thanks to RFC 3161, you can prove not only that the data content is 100% intact, but also that the date and time of the collection are guaranteed.
Trust in RFC 3161
Requests for Comments (RFC) is the document series that has been adopted as the official record of internet specifications, communications protocols, procedures, and events. Originally used in 1969 to record unofficial notes on the ARPANET project, the series is now the main vehicle through which internet standards are published.
A published RFC will have been through a review and revision process, overseen by groups such as the Internet Engineering Task Force (IETF), a large, open, international community of network designers, operators, vendors, and researchers concerned with the evolution of internet architecture and the smooth operation of the internet. A list of RFC 3161-compliant TSAs can be found here. When choosing a TSA, users may want to consider whether its implementation of RFC 3161 has been qualified under frameworks such as eIDAS (the EU regulation on electronic identification and trust services).
RFC 3161, titled ‘Internet X.509 Public Key Infrastructure Time-Stamp Protocol (TSP)’, defines how trusted timestamping leverages public-key cryptography and standardises the request and response formats exchanged with a TSA.
One way to use a TSA allows a requestor to take the hash they’ve generated for the total of their collected data set, send that hash to the TSA and receive in return a timestamp response (TSR) containing a signed Time-Stamp Token (TST). This token can be saved and, at any later time, be used to verify both the content of the collection and the date and time that the collection took place.
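The request side of that exchange is small enough to sketch by hand. The Python below builds a minimal DER-encoded RFC 3161 `TimeStampReq` carrying only a SHA-256 digest, and shows how it could be posted to a TSA over HTTP with the `application/timestamp-query` content type that RFC 3161 defines. This is an illustration under stated assumptions, not AIR's implementation: the function names are hypothetical, the optional nonce and policy fields are left out for brevity, and the TSA URL must be supplied by the caller.

```python
import hashlib
import urllib.request

# DER for AlgorithmIdentifier { id-sha256 (2.16.840.1.101.3.4.2.1), NULL }
_SHA256_ALG_ID = bytes.fromhex("300d06096086480165030402010500")


def build_timestamp_request(data: bytes) -> bytes:
    """Build a minimal DER-encoded RFC 3161 TimeStampReq for `data`.

    Only the SHA-256 digest goes into the request; the data itself
    never leaves the machine.
    """
    digest = hashlib.sha256(data).digest()          # 32 bytes
    octet_string = b"\x04\x20" + digest             # OCTET STRING, length 32
    imprint_body = _SHA256_ALG_ID + octet_string    # 15 + 34 = 49 bytes
    message_imprint = b"\x30" + bytes([len(imprint_body)]) + imprint_body
    version = b"\x02\x01\x01"                       # INTEGER 1
    cert_req = b"\x01\x01\xff"                      # BOOLEAN TRUE: include TSA cert
    body = version + message_imprint + cert_req
    return b"\x30" + bytes([len(body)]) + body      # short-form length (< 128)


def request_timestamp(data: bytes, tsa_url: str) -> bytes:
    """POST the request to a TSA and return the raw DER response (hypothetical helper)."""
    request = urllib.request.Request(
        tsa_url,
        data=build_timestamp_request(data),
        headers={"Content-Type": "application/timestamp-query"},
    )
    with urllib.request.urlopen(request) as response:
        return response.read()
```

Because only the digest is transmitted, the TSA never sees the collected evidence; it simply signs the digest together with its own trusted time.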
The RFC 3161 capability is not unique and is available from a whole range of independent third parties. This is important, as any in-house time-stamping process could be open to challenge or criticism due to its lack of independence or verified accuracy.
How does this work in Binalyze AIR?
In the AIR platform, when you send a collection task to an endpoint agent, the agent will build the collection on the endpoint in a directory named ‘Cases’. This collection is in a .zip file, with a filename that starts with the date and time of the collection. If you expand the .zip file you’ll note that the collected data has been added while maintaining the directory tree structure. This is good news if you want or need to further investigate the collection in other forensic solutions.
At the root of the collection shown above, you can see the Case.ppc file. This is another .zip container and if you expand this you can inspect the contents. Here, you'll also note the presence of the Hashes.csv which records all of the hashes of the collected files.
So when AIR hashes the .ppc file it is of course hashing all of the hash values recorded in that Hashes.csv file. This means that a change to any of the collected content, or to the .ppc file itself, would result in a mismatched hash.
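The ‘hash of hashes’ idea can be sketched in a few lines of Python. The manifest layout below (one `path,sha256` line per file) is a stand-in invented for this example; the real layout of Hashes.csv and of the .ppc container is not described here, and the function names and file paths are hypothetical.

```python
import hashlib
from typing import Dict


def build_manifest(files: Dict[str, bytes]) -> bytes:
    """Return 'path,sha256' lines for every file, sorted by path."""
    lines = [
        f"{path},{hashlib.sha256(content).hexdigest()}"
        for path, content in sorted(files.items())
    ]
    return "\n".join(lines).encode("utf-8")


def manifest_digest(files: Dict[str, bytes]) -> str:
    """Hash the manifest itself: one digest that covers every file."""
    return hashlib.sha256(build_manifest(files)).hexdigest()


collection = {
    "Logs/system.evtx": b"event log bytes",
    "Registry/SAM": b"registry hive bytes",
}
before = manifest_digest(collection)

collection["Registry/SAM"] = b"registry hive bytes (tampered)"
after = manifest_digest(collection)

print(before == after)   # False: one altered file changes the top-level digest
```

The design point is that a single top-level digest is enough to send to a timestamping service, yet it still commits to every individual file in the collection.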
With Binalyze AIR, RFC3161 timestamping is on by default. This means the hash value of your collection .ppc file is sent to the TSA and their TST response is saved as metadata for that collection in the AIR console automatically. You can download and verify the TST from here anytime you or others need to.
You can also disable the RFC3161 Timestamping functionality at any time via the AIR Settings > Chain of Custody page.
How to verify the .ppc via the RFC 3161 Timestamp Token
To verify the .ppc via RFC 3161, the first thing you need to do is to download the TST from the metadata button in the AIR endpoint details > Task tab (as shown in figure xx).
In the example below I’ve changed the name of the TST to ‘RFC3161 timestamp.tsr’ and saved it to my downloads folder.
I can then open a shell session and change the directory to downloads.
To see the information in the TST, run:
openssl ts -reply -in RFC3161\ timestamp.tsr -token_in -token_out -text
In the output you’ll see the hash of your .ppc and the timestamp.
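If you want to confirm by hand that the hash shown in the token really is the hash of your .ppc, you can compute the file's SHA-256 yourself and compare the two. The Python sketch below assumes you have copied the digest out of the token as a plain hex string; the function names are hypothetical, and the file is read in chunks so that large collections do not need to fit in memory.

```python
import hashlib


def sha256_file(path: str, chunk_size: int = 1024 * 1024) -> str:
    """Stream a file through SHA-256 and return the hex digest."""
    hasher = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def matches_imprint(path: str, imprint_hex: str) -> bool:
    """Compare a file's SHA-256 with a hex digest copied from the token.

    Spaces, colons and letter case in the pasted digest are ignored.
    """
    cleaned = "".join(c for c in imprint_hex if c in "0123456789abcdefABCDEF")
    return sha256_file(path) == cleaned.lower()
```

A call such as `matches_imprint("TASK.ppc", pasted_digest)` returns `True` only when the file on disk is byte-for-byte the one that was timestamped.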
To verify this TST we now need to download the root certificate from DigiCert: https://cacerts.digicert.com/DigiCertAssuredIDRootCA.crt.pem.
We will also need the following TSA certificates from the DigiCert TSA server to build a ‘chain certificate’. In this case I took the content of each .cer file, in the order shown, and concatenated them into one file that I named ‘CHAIN.pem’:
DigiCertTrustedG4RSA4096SHA256TimeStampingCA.cer
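The concatenation step can also be scripted. This Python sketch assumes each .cer file already holds PEM text (a `-----BEGIN CERTIFICATE-----` block) rather than binary DER, and that the caller supplies the filenames in the required order; the function name is hypothetical.

```python
from pathlib import Path
from typing import List


def build_chain(cert_paths: List[str], output_path: str) -> None:
    """Concatenate PEM certificate files, in the given order, into one file."""
    blocks = []
    for cert_path in cert_paths:
        text = Path(cert_path).read_text(encoding="ascii").strip()
        if "-----BEGIN CERTIFICATE-----" not in text:
            raise ValueError(f"{cert_path} does not look like a PEM certificate")
        blocks.append(text)
    # A newline between blocks keeps the END/BEGIN markers on separate lines.
    Path(output_path).write_text("\n".join(blocks) + "\n", encoding="ascii")
```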
With all these files remaining in the same directory I then ran the following command to verify the TST:
openssl ts -verify -CAfile DigiCertAssuredIDRootCA.crt.pem -untrusted CHAIN.pem -data TASK.ppc -in RFC3161\ timestamp.tsr -token_in
The simple ‘ok’ verification message confirms that the TST is valid, indicating that my data is unchanged and that it existed at the date and time shown by the timestamp.
Conclusion - Robust best practice
Thanks to the RFC 3161 and SHA-256 hashing features of AIR, it’s now possible to prove that not only is your data content 100% intact, but that it existed at a particular moment in time. So we can now be sure that we know exactly what was collected and when it was collected. In short, RFC 3161 provides immutable timestamping for an effective chain of custody to maintain forensic integrity.
References
- The Luhn algorithm - https://en.wikipedia.org/wiki/Luhn_algorithm
- Network Working Group RFC 3161 - Internet X.509 Public Key Infrastructure Time-Stamp Protocol (TSP) - https://datatracker.ietf.org/doc/html/rfc3161
- DigiCert - RFC 3161 compliant Time Stamp Authority (TSA) server - https://knowledge.digicert.com/generalinformation/INFO4231.html
- Proving Chain of Custody and Digital Evidence Integrity with Time Stamp - https://www.researchgate.net/publication/279174845_Im_Proving_Chain_of_Custody_and_Digital_Evidence_Integrity_with_Time_Stamp
- The European Union Agency for Cybersecurity - Security guidelines on the appropriate use of qualified electronic time stamps - Guidance for users - https://www.enisa.europa.eu/publications/security-guidelines-on-the-appropriate-use-of-qualified-electronic-time-stamps/@@download/fullReport