User-friendly Hash Library

TForge 0.65 released.

The release features a hash library with fluent coding support. While I was writing the library I was inspired by usability of the Python’s hashlib.

Current release supports:

  • Cryptographic hash algorithms:
    1. MD5
    2. SHA1
    3. SHA256
  • Non-cryptographic hash algorithms:
    1. CRC32
    2. Jenkins-One-At-Time
  • Hash-based MAC algorithm (HMAC)
  • Key derivation algorithms:
    1. PBKDF1
    2. PBKDF2

Let us consider a common problem: calculate MD5 and SHA1 digests of a file. The simplest way to do it is:

program HashFile;

{$APPTYPE CONSOLE}

uses
  SysUtils, Classes, tfTypes, tfHashes;

procedure CalcHash(const FileName: string);
begin
  Writeln('MD5:  ', THash.MD5.UpdateFile(FileName).Digest.ToHex);
  Writeln('SHA1: ', THash.SHA1.UpdateFile(FileName).Digest.ToHex);
end;

begin
  try
    if ParamCount = 1 then begin
      CalcHash(ParamStr(1));
    end
    else
      Writeln('Usage: > HashFile filename');
  except
    on E: Exception do
      Writeln(E.ClassName, ': ', E.Message);
  end;
  Readln;
end.

The above application demonstrates the beauty of fluent coding. The code is compact and clear – you create an instance of a hash algorithm, feed a file data to it, generate resulting digest and convert it to hexadecimal; the instance is freed automatically, no need for explicit call of the Free method.

But the above code is not optimal – it reads a file twice. A better solutions involves some coding:

procedure CalcHash(const FileName: string);
const
  BufSize = 16 * 1024;

var
  MD5, SHA1: THash;
  Stream: TStream;
  Buffer: array[0 .. BufSize - 1] of Byte;
  N: Integer;

begin
  MD5:= THash.MD5;
  SHA1:= THash.SHA1;
  try
    Stream:= TFileStream.Create(FileName,
                                fmOpenRead or fmShareDenyWrite);
    try
      repeat
        N:= Stream.Read(Buffer, BufSize);
        if N <= 0 then Break
        else begin
          MD5.Update(Buffer, N);
          SHA1.Update(Buffer, N);
        end;
      until False;
    finally
      Stream.Free;
    end;
    Writeln('MD5:  ', MD5.Digest.ToHex);
    Writeln('SHA1: ', SHA1.Digest.ToHex);
  finally
    MD5.Burn;
    SHA1.Burn;
  end;
end;

The code also demonstrate the use of Burn method; it is not needed here and could be safely removed with corresponding try/finally block but can be useful in other cases – it destroys all sensitive data in an instance. The use of Burn method is optional – it is called anyway when an instance is freed, but explicit call of the Burn method gives you full control over erasing the sensitive data.

The Free method does not free an instance; it only decrements the instance’s reference count, and since the compiler can create hidden references to the instance the moment when the reference count turns zero and the instance is freed is generally controlled by the compiler.


The use of non-cryptographic hash algorithms has one caveat – since they actually return an integer value the bytes of a digest are reversed. The idiomatic way to get the correct result is to cast the digest to integer type:

  Writeln('CRC32: ',
    IntToHex(LongWord(THash.CRC32.UpdateFile(FileName).Digest), 8));

This issue is fixed in ver. 0.68 – see https://sergworks.wordpress.com/2014/12/27/endianness-issue/


HMAC algorithm generates digest using a cryptographic hash algorithm and a secret key. Here is an example of calculating SHA1-HMAC digest of a file:

procedure SHA1_HMAC_File(const FileName: string;
                         const Key: ByteArray);
begin
  Writeln('SHA1-HMAC: ',
    THMAC.SHA1.ExpandKey(Key).UpdateFile(FileName).Digest.ToHex);
end;

begin
..
  SHA1_HMAC_File(ParamStr(1),
    ByteArray.FromText('My Secret Key'));

Key derivation algorithms generate keys from user passwords by applying hash algorithms. PBKDF1 applies a cryptographic hash algorithm directly, PBKDF2 uses HMAC. Here are usage examples:

procedure DeriveKeys(const Password, Salt: ByteArray);
begin
  Writeln('PBKDF1 Key: ',
    THash.SHA1.DeriveKey(Password, Salt,
                         10000,   // number of rounds
                         16       // key length in bytes
                         ).ToHex);
  Writeln('PBKDF2 Key: ',
    THMAC.SHA1.DeriveKey(Password, Salt,
                         10000,   // number of rounds
                         32       // key length in bytes
                         ).ToHex);
end;

begin
..
  DeriveKeys(ByteArray.FromText('User Password'),
             ByteArray.FromText('Salt'));

Configuration and Installation

The release contains 2 runtime packages TForge and THashes. You should build them, first TForge, next THashes.

For Delphi users:

The packages are in Packages\DXE subfolder (Delphi XE only).

You should make the folder Source\Include available via project’s search path before you can build the packages. To do it open “Project Options” dialog for each package, set “Build Configuration” to “Base” and replace the path to TFL.inc by your path (depends on where you unpacked the downloaded archive):

Include

Now use the project manager and build the packages in “Debug” and “Release” configurations:

build

Optionally, to enable Ctrl-click navigation in the editor, add paths to the source code units to the Browsing path:
Browse

To make the packages available to an application open the application’s “Project Options” dialog, set “Build Configuration” to “Base” and add the next path to search path:

appconfig

For FPC/Lazarus users:

The packages are in Packages\FPC subfolder.

I don’t know much about Lazarus packages. Here are the package configuration options I used:

Laz2

_lazconfig

Advertisements

10 thoughts on “User-friendly Hash Library

  1. Pingback: Новости из мира Delphi 06.10 – 12.10 2014 | Delphi 2010 ru

  2. Hi Serg,
    I have 2 questions:
    1/Is your code thread safe ?
    2/How about speed compared to open source competition ?
    Greetings, Artur

      • I think he was wondering if it was GPL, LGPL, Apache, MIT, BSD, something else?

        A clear license with text in the distro would be really helpful. Apache, MIT, BSD are the most flexible “you can use it for free” licenses, with protection for the author.

        Some projects use multiple licenses, letting people choose the one that best suits their needs (usually GPL and either MIT or BSD).

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s