Why unit tests are not always good

Unit tests are good at detecting most bugs in your code, but not all of them. When you write standard unit tests for a class, you do the following:

  • Create a fresh class instance (e.g. using the SetUp method in the DUnit framework);
  • Run the code under test (usually a single call of a single method) on the instance;
  • Free the instance (e.g. using the TearDown method in the DUnit framework).

And this is how unit tests should be written; if your test detects a bug you immediately know the bug’s origin.

The problem with the above scenario is that it is ideal for hiding hard-to-reproduce bugs such as access violations (AV). To detect such a bug with good probability you need something different: probably multiple calls of a method on the same instance, or calls of different methods in the same test. This approach is quite the opposite of the idea of unit testing.
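
To make the contrast concrete, here is a minimal sketch of both styles in DUnit. TAccumulator is a hypothetical class invented for this example; the first test follows the classic one-call pattern, and the second deliberately calls the method many times on the same instance:

```pascal
program MultiCallTest;

{$APPTYPE CONSOLE}

uses
  TestFramework, TextTestRunner;

type
// hypothetical class under test, not part of any library
  TAccumulator = class
  private
    FTotal: Integer;
  public
    procedure Add(AValue: Integer);
    property Total: Integer read FTotal;
  end;

  TAccumulatorTest = class(TTestCase)
  private
    FAcc: TAccumulator;
  protected
    procedure SetUp; override;
    procedure TearDown; override;
  published
    procedure TestSingleCall;
    procedure TestManyCalls;
  end;

procedure TAccumulator.Add(AValue: Integer);
begin
  Inc(FTotal, AValue);
end;

procedure TAccumulatorTest.SetUp;
begin
  FAcc:= TAccumulator.Create;
end;

procedure TAccumulatorTest.TearDown;
begin
  FAcc.Free;
end;

// classic unit test: one call on a fresh instance
procedure TAccumulatorTest.TestSingleCall;
begin
  FAcc.Add(2);
  CheckEquals(2, FAcc.Total);
end;

// stress-style test: many calls on the same instance
procedure TAccumulatorTest.TestManyCalls;
var
  I: Integer;

begin
  for I:= 1 to 1000 do
    FAcc.Add(1);
  CheckEquals(1000, FAcc.Total);
end;

begin
  RegisterTest(TAccumulatorTest.Suite);
  with TextTestRunner.RunRegisteredTests do
    Free;
end.
```

With a class as trivial as this one both tests pass; the point is only the shape of the second test, which keeps the instance alive across many calls.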

Numerics 0.57 released (Hashtables, Bugfix)

1. The main purpose of the release is to implement hash tables (aka associative arrays) with keys of BigCardinal or BigInteger type; such hash tables are important for cryptographic applications. Hash tables in Delphi are implemented by the generic TDictionary<TKey,TValue> class. The release includes a new unit GNumerics.pas with two helper generic classes, TBigIntegerDictionary<TValue> and TBigCardinalDictionary<TValue>, which specialize the base generic dictionary class. For example, to work with a hash table having BigInteger keys and string values you need something like this:

uses GNumerics;

[..]
var
  HashTable: TBigIntegerDictionary<string>;
  A: BigInteger;

[..]
begin
// create hash table instance
  HashTable:= TBigIntegerDictionary<string>.Create;
  A:= BigInteger('1234567890987654321');
  try
// fill hash table with data
    HashTable.Add(A * A, 'some string value');
    [..]
  finally
// destroy hash table
    HashTable.Free;
  end;
end;

2. Some bugs fixed; in particular a nasty access violation bug in BigInteger.ModPow function was fixed.

3. Minor changes in BigCardinal/BigInteger methods.

Link to the download page

Update

Version 0.58 fixes the conversion bug from (U)Int64 to BigInteger/BigCardinal.

RawByteString type explained

With the introduction of Unicode support Delphi also introduced the magic RawByteString type; the word ‘magic’ here means that you can’t implement a type with RawByteString functionality without hidden compiler support.

A common misunderstanding about the RawByteString type is that instances of RawByteString contain no encoding information and because of this the compiler can’t implement implicit string conversion. That is not true.

First of all, RawByteString is an AnsiString (1-byte character size). If you typecast a Unicode string to the RawByteString type, or a RawByteString string to a Unicode type, the compiler always performs a string conversion; if the compiler has no information about the ANSI codepage of the RawByteString, it uses the system codepage for the conversion.

So the RawByteString magic is for AnsiStrings only.

When you declare a variable of AnsiString type you also declare a codepage; for example

type
  CyrString = type AnsiString(1251);

var
  S: CyrString;

That means the compiler has static codepage information; if you do not declare a codepage, the compiler assumes the system codepage. AnsiString instances also carry runtime codepage information (the codepage field in the string instance header), but the compiler usually ignores it and uses the static codepage information for string conversions.

The magic of RawByteString type is that the compiler has no static codepage information; it does not mean that a RawByteString instance has no runtime codepage information.

If you typecast an AnsiString instance to RawByteString type no string conversion happens.
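
The runtime codepage can be inspected with the StringCodePage function from the System unit. The following is a minimal sketch, reusing the CyrString type declared above; per the description above, both calls should report the same codepage, because assigning to a RawByteString variable copies the instance as is:

```pascal
program CodePageDemo;

{$APPTYPE CONSOLE}

type
  CyrString = type AnsiString(1251);

var
  S: CyrString;
  R: RawByteString;

begin
  S:= 'ABC';
// read the runtime codepage field of the string instance
  Writeln('S: ', StringCodePage(S));
// no string conversion here; R shares the instance of S
  R:= S;
  Writeln('R: ', StringCodePage(R));
  Readln;
end.
```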

The RawByteString type is for ANSI strings’ typecasting only, not for creating instances of the type. To understand how it works let us first consider “an educated abuse” of RawByteString:

type
  string1251 = type AnsiString(1251);
  string1252 = type AnsiString(1252);

var
  S1: string1251;
  S2: string1252;

begin
// just initialize it with some data
  S1:= 'АБВГДЕЙКА';

// no string conversion here;
// the S1 string instance is copied 'as is',
//   with codepage information
  S2:= RawByteString(S1);

Now we have an instance (S2) of string1252 type containing data in ANSI 1251 encoding and runtime codepage 1251. But since the compiler normally uses static codepage information the subsequent use of the instance may produce strange results.

Finally an example of correct RawByteString type usage. The following function counts the number of occurrences of ‘?’ character in an ANSI string:

function CountQuestions(const S: RawByteString): Integer;
const
  Mark = $3F;

var
  I: Integer;

begin
  Result:= 0;
  for I:= 1 to Length(S) do begin
    if Byte(S[I]) = Mark
      then Inc(Result);
  end;
end;

The purpose of using RawByteString type for the function’s argument is to avoid unnecessary string conversion.

Implementing generic interfaces in Delphi

Delphi supports generic interfaces; for example we can declare a generic interface

type
  IChecker<T> = interface
    function Check(const Instance: T): Boolean;
  end;

and use this generic interface as follows:

unit UseDemo;

interface

uses GenChecks;

type
  TDemo<T> = class
  private
    FChecker: IChecker<T>;
  public
    constructor Create(AChecker: IChecker<T>);
    procedure Check(AValue: T);
  end;

implementation

{ TDemo<T> }

procedure TDemo<T>.Check(AValue: T);
begin
  if FChecker.Check(AValue)
    then Writeln('Passed')
    else Writeln('Stopped')
end;

constructor TDemo<T>.Create(AChecker: IChecker<T>);
begin
  FChecker:= AChecker;
end;

end.

To implement the above generic interface IChecker we need a generic class; the straightforward solution is

type
  TChecker<T> = class(TInterfacedObject, IChecker<T>)
    function Check(const Instance: T): Boolean;
  end;

If the IChecker interface can be implemented like that, we need nothing else. The problem with the above implementation is that we are limited to the generic constraints on the type T and can’t use properties of specific types like Integer or string that will finally be substituted for the type T.
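
As an illustration of what purely generic code is limited to, here is a sketch of an unconstrained implementation. The `=` operator cannot be applied to values of an unconstrained type T, so the sketch delegates to the default equality comparer from Generics.Defaults; TEqualsChecker is a name invented for this example:

```pascal
program GenericEx0;

{$APPTYPE CONSOLE}

uses
  Generics.Defaults;

type
  IChecker<T> = interface
    function Check(const Instance: T): Boolean;
  end;

// hypothetical generic implementation, works for any T
  TEqualsChecker<T> = class(TInterfacedObject, IChecker<T>)
  private
    FExpected: T;
  public
    constructor Create(const AExpected: T);
    function Check(const Instance: T): Boolean;
  end;

constructor TEqualsChecker<T>.Create(const AExpected: T);
begin
  FExpected:= AExpected;
end;

function TEqualsChecker<T>.Check(const Instance: T): Boolean;
begin
// 'Instance = FExpected' does not compile for an unconstrained T
  Result:= TEqualityComparer<T>.Default.Equals(Instance, FExpected);
end;

var
  IntChecker: IChecker<Integer>;
  StrChecker: IChecker<string>;

begin
  IntChecker:= TEqualsChecker<Integer>.Create(42);
  StrChecker:= TEqualsChecker<string>.Create('trololo');
  Writeln(IntChecker.Check(42));        // TRUE
  Writeln(StrChecker.Check('ololo'));   // FALSE
  Readln;
end.
```

This works for equality, but anything type-specific, such as comparing string lengths, is out of reach of such a class.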

A more flexible solution is to introduce an abstract base type and derive the specific implementations from it. Here is a full code example:

program GenericEx1;

{$APPTYPE CONSOLE}

uses
  SysUtils,
  GenChecks in 'GenChecks.pas',
  UseDemo in 'UseDemo.pas';

procedure TestInt;
var
  Demo: TDemo<Integer>;

begin
  Demo:= TDemo<Integer>.Create(TIntChecker.Create(42));
  try
    Demo.Check(0);
    Demo.Check(42);
  finally
// TDemo is a plain class, so it must be freed explicitly
    Demo.Free;
  end;
end;

procedure TestStr;
var
  Demo: TDemo<string>;

begin
  Demo:= TDemo<string>.Create(TStrChecker.Create('trololo'));
  try
    Demo.Check('ololo');
    Demo.Check('olololo');
  finally
    Demo.Free;
  end;
end;

begin
  TestInt;
  TestStr;
  ReadLn;
end.

unit GenChecks;

interface

type
  IChecker<T> = interface
    function Check(const Instance: T): Boolean;
  end;

type
  TCustomChecker<T> = class(TInterfacedObject, IChecker<T>)
  protected
    FCheckValue: T;
    function Check(const Instance: T): Boolean; virtual; abstract;
  public
    constructor Create(ACheckValue: T);
  end;

  TIntChecker = class(TCustomChecker<Integer>)
  protected
    function Check(const Instance: Integer): Boolean; override;
  end;

  TStrChecker = class(TCustomChecker<string>)
  protected
    function Check(const Instance: string): Boolean; override;
  end;

implementation

{ TCustomChecker<T> }

constructor TCustomChecker<T>.Create(ACheckValue: T);
begin
  FCheckValue:= ACheckValue;
end;

{ TIntChecker }

function TIntChecker.Check(const Instance: Integer): Boolean;
begin
  Result:= Instance = FCheckValue;
end;

{ TStrChecker }

function TStrChecker.Check(const Instance: string): Boolean;
begin
  Result:= Length(Instance) = Length(FCheckValue);
end;

end.

In the above example each IChecker interface reference points to its own class instance; this is necessary because every instance contains a parameter (FCheckValue) set in the constructor. If an implementation does not require such a parameter, creating a new instance for every interface reference is unnecessary overhead. A better solution is to use a singleton instance.

Here is a full code example for the integer type:

program GenericEx2;

{$APPTYPE CONSOLE}

uses
  SysUtils,
  GenChecks in 'GenChecks.pas',
  UseDemo in 'UseDemo.pas';

procedure TestInt;
var
  Demo: TDemo<Integer>;

begin
  Demo:= TDemo<Integer>.Create(TIntChecker.Ordinal);
  try
    Demo.Check(0);
    Demo.Check(42);
  finally
// TDemo is a plain class, so it must be freed explicitly
    Demo.Free;
  end;
end;

begin
  TestInt;
  ReadLn;
end.

unit GenChecks;

interface

uses Generics.Defaults;

type
  IChecker<T> = interface
    function Check(const Instance: T): Boolean;
  end;

  TCustomChecker<T> = class(TSingletonImplementation, IChecker<T>)
  protected
    function Check(const Instance: T): Boolean; virtual; abstract;
  end;

  TIntChecker = class(TCustomChecker<Integer>)
  private
    class var
      FOrdinal: TCustomChecker<Integer>;
  public
    class function Ordinal: TIntChecker;
  end;

implementation

type
  TOrdinalIntChecker = class(TIntChecker)
  public
    function Check(const Instance: Integer): Boolean; override;
  end;

{ TOrdinalIntChecker }

function TOrdinalIntChecker.Check(const Instance: Integer): Boolean;
begin
  Result:= Instance = 42;
end;

{ TIntChecker }

class function TIntChecker.Ordinal: TIntChecker;
begin
  if FOrdinal = nil then
    FOrdinal := TOrdinalIntChecker.Create;
  Result := TIntChecker(FOrdinal);
end;

end.

A Year in MOOC

About a year ago I joined my first MOOC [Massive Open Online Course]. There were a lot of courses after that; I’ve completed 5 courses and dropped about 20, and I’ve come to my own understanding of what MOOCs are and what they should be.

I believe many MOOC courses now are more about positive learner feedback than about teaching quality. Many courses are watered-down versions of offline university courses (or maybe these universities have such watered-down courses), easy to pass. I dropped most of these courses because they were boring.

Fortunately not all MOOC courses are like that. So far I have discovered 2 real MOOC gems: the first was BerkeleyX CS-191x: Quantum Mechanics and Quantum Computation provided by UC Berkeley and the second was MITx 6.041x: Introduction to Probability – The Science of Uncertainty provided by MIT.

Both courses were about the fundamentals, the things many talk about but few understand, and the professors were really interested in the learners mastering the course, not in positive learner feedback. Both courses required a lot of time and effort to master, but the rewards of mastering them are incomparable to those of the watered-down courses.

Both these courses set a high educational standard, yet both are undergraduate courses. I hope we are at the beginning of an educational evolution, and after some time all courses of the best professors from the best universities will be available online.

Type alignments and layouts in Delphi

Every Delphi type, built-in or user-defined, has 2 properties that affect the memory placement of its instances: alignment and size. The alignment affects the starting address of an instance; for example, the alignment of the LongInt type is equal to 4, which means that the starting address of any LongInt variable is a multiple of 4. The size is equal to the number of bytes occupied by an instance.

For any type, built-in or user-defined, the size is a multiple of the alignment. The rule has only one notable exception, the FPU Extended type (and possibly user-defined types that include it), which has an alignment of 8 and a size of 10.

We can affect the alignment of user-defined types with {$ALIGN ..} compiler directive; there are 2 kinds of user-defined types possibly of interest in context of alignment – static arrays and records.

I will not discuss here the bugs in the alignment implementation of old Delphi versions; I believe they are fixed now and everything works as follows.

Static arrays are just not affected by the alignment directive. The alignment of a static array is equal to the alignment of an array element, the size of a static array is equal to the size of an array element multiplied by the number of elements in the array. Period.
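
A quick way to check the size part of this statement is to declare the same array type under two different alignment settings; in the following sketch (type names are invented for the example) both sizes are simply 3 * SizeOf(Int64):

```pascal
program ArrayAlignDemo;

{$APPTYPE CONSOLE}

{$ALIGN 1}
type
  TArr1 = array[0..2] of Int64;

{$ALIGN 8}
type
  TArr8 = array[0..2] of Int64;

begin
  Writeln('SizeOf(TArr1) : ', SizeOf(TArr1));    // 24
  Writeln('SizeOf(TArr8) : ', SizeOf(TArr8));    // 24
  Readln;
end.
```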

As for records, there is a common misunderstanding that the alignment directive defines the internal alignment of a record’s fields, i.e. the layout of a record type; it actually defines the record type alignment, i.e. the base address of a record instance in memory. But by defining the alignment it also indirectly defines the layout of a record type.

Here is how it works:

program AlignDemo;

{$APPTYPE CONSOLE}

{$ALIGN 1}
type
  TType1 = record
    Field1: Byte;
    Field2: int64;
  end;

{$ALIGN ON}
type
  TType2 = record
    Field1: Byte;
    Field2: int64;
  end;

begin
  Writeln('SizeOf(TType1) : ', SizeOf(TType1));    // 9
  Writeln('SizeOf(TType2) : ', SizeOf(TType2));    // 16
  Readln;
end.

With the TType1 we are telling the compiler that instances of TType1 are not aligned, i.e. can have any base address in memory; so it is impossible to make the TType1.Field2 8-byte aligned in memory by adding pad bytes between the Field1 and Field2.

With the TType2 we are telling the compiler to choose the optimal alignment; the compiler sets the alignment for TType2 equal to 8 because 8 is the greatest alignment among the record fields’ types (it is the int64 type alignment). Since TType2 is 8-byte aligned, the compiler adds 7 padding bytes after TType2.Field1 to make the TType2.Field2 8-byte aligned too, for performance reasons.

The case of more complicated structures is not much different – the compiler sets the record’s type alignment equal to the greatest alignment among the record fields if it is less than the default alignment; otherwise it uses the default alignment. This also unambiguously defines the internal layout because all alignments are powers of 2.
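
The rule can be checked on a slightly larger record by printing the field offsets. The following is a sketch with a record type invented for the example; if the rule holds, the greatest field alignment is 4 (the Integer field), which is less than the default 8, so one would expect the offsets 0, 2, 4 and 8 and a total size of 12, including 3 trailing padding bytes:

```pascal
program LayoutDemo;

{$APPTYPE CONSOLE}

{$ALIGN 8}
type
  TRec = record
    A: Byte;
    B: Word;
    C: Integer;
    D: Byte;
  end;

var
  R: TRec;

begin
// field offsets relative to the record's base address
  Writeln('Offset of A : ', NativeUInt(@R.A) - NativeUInt(@R));
  Writeln('Offset of B : ', NativeUInt(@R.B) - NativeUInt(@R));
  Writeln('Offset of C : ', NativeUInt(@R.C) - NativeUInt(@R));
  Writeln('Offset of D : ', NativeUInt(@R.D) - NativeUInt(@R));
  Writeln('SizeOf(TRec): ', SizeOf(TRec));
  Readln;
end.
```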

The story would be incomplete if I said nothing about the packed specifier. As the name suggests, the packed specifier affects the layout of a record type. But following the same alignment-layout interdependence logic it also affects the record’s alignment, and the final result of applying the packed specifier is exactly the same as that of the {$ALIGN 1} directive. At least that is true for the current Delphi versions; there is an open bug report which suggests that it may change in the future.

Simple benchmarking framework, or how to enumerate published methods in Delphi

The article describes an old trick of enumerating and invoking published methods of a class. The trick is used in the DUnit testing framework, and I came to the same idea while trying to bring some order into my benchmark routines.

So the idea was to declare benchmark routines as published methods of a class

type
  TDemoRunner = class(TRunner)
  published
    procedure Proc1;
    procedure Proc2;
  end;

and get these published methods automatically invoked by the framework.

I don’t want to dig into the details of the solution because it is supposed to be a solution that “just works” without any need to dig into the internals.

{$M+}
unit Runners;

interface

uses
  SysUtils;

const
  MillisPerDay = 24 * 60 * 60 * 1000;

type
  TBenchProc = procedure of object;

  PMethInfo = ^TMethInfo;
  TMethInfo = record
    Code: Pointer;
    Name: string;
  end;

  TMethArray = array of TMethInfo;

type
  TRunnerClass = class of TRunner;

  TRunner = class
  protected
    FElapsedMs: Integer;
    FMethArray: TMethArray;
    FMethIndex: Integer;
    procedure LogProc; virtual;
    procedure Init; virtual;
    procedure Run; virtual;
  public
    constructor Create; virtual;
    class procedure Exec(AClass: TRunnerClass);
  end;

implementation

{ TRunner }

constructor TRunner.Create;
begin
end;

procedure TRunner.Init;
type
  PMethodTable = ^TMethodTable;
  TMethodTable = packed record
    Count: SmallInt;
    Data: record end;
  end;

  PMethEntry = ^TMethEntry;
  TMethEntry = packed record
    Len: Word;
    Code: Pointer;
    Name: ShortString;
  end;

var
  MethTable: PMethodTable;
  I: Integer;
  Entry: PMethEntry;

begin
  FMethArray:= nil;
  MethTable:= PPointer(PByte(Self.ClassType) + vmtMethodTable)^;
  if (MethTable = nil) or (MethTable.Count <= 0) then Exit;
  Entry:= @MethTable.Data;
  SetLength(FMethArray, MethTable.Count);
  for I:= 0 to MethTable.Count - 1 do begin
    FMethArray[I].Code:= Entry.Code;
    FMethArray[I].Name:= string(Entry.Name);
    Inc(PByte(Entry), Entry.Len);
  end;
end;

procedure TRunner.LogProc;
begin
  Writeln(FMethArray[FMethIndex].Name, ' .. time: ', FElapsedMs, ' ms');
end;

procedure TRunner.Run;
var
  Proc: TBenchProc;
  StartTime: TDateTime;
  I: Integer;

begin
  Init;
  for I:= 0 to Length(FMethArray) - 1 do begin
    TMethod(Proc).Code:= FMethArray[I].Code;
    TMethod(Proc).Data:= Self;
    StartTime:= Now;
    Proc();
    FElapsedMs:= Round((Now - StartTime) * MillisPerDay);
    FMethIndex:= I;
    LogProc;
  end;
end;

class procedure TRunner.Exec(AClass: TRunnerClass);
var
  Instance: TRunner;

begin
  Instance:= AClass.Create;
  try
    Instance.Run;
  finally
    Instance.Free;
  end;
end;

end.

Here is the usage example.

Demo benchmarking class:

unit DemoRunners;

interface

uses Runners;

type
  TDemoRunner = class(TRunner)
  published
    procedure Proc1;
    procedure Proc2;
  end;

implementation

{ TDemoRunner }

procedure TDemoRunner.Proc1;
begin
  Writeln('Running Proc1');
end;

procedure TDemoRunner.Proc2;
begin
  Writeln('Running Proc2');
end;

end.

and benchmarking console application:

program BenchDemo;

{$APPTYPE CONSOLE}

uses
  SysUtils,
  Runners in 'Runners.pas',
  DemoRunners in 'DemoRunners.pas';

begin
  try
    TRunner.Exec(TDemoRunner);
  except
    on E: Exception do
      Writeln(E.ClassName, ': ', E.Message);
  end;
  Readln;
end.

The output:
[screenshot of the BenchDemo console output]
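
As a side note, when the name of a published method is known in advance there is no need to walk the method table: the documented TObject.MethodAddress function returns the code pointer directly. The following is a minimal sketch; the TDemo class and the TSimpleProc type are invented for the example:

```pascal
program InvokeByName;

{$APPTYPE CONSOLE}

type
  TSimpleProc = procedure of object;

{$M+}
  TDemo = class
  published
    procedure Hello;
  end;
{$M-}

procedure TDemo.Hello;
begin
  Writeln('Hello from a published method');
end;

var
  Demo: TDemo;
  Proc: TSimpleProc;

begin
  Demo:= TDemo.Create;
  try
// look up the published method by name; nil if not found
    TMethod(Proc).Code:= Demo.MethodAddress('Hello');
    TMethod(Proc).Data:= Demo;
    if TMethod(Proc).Code <> nil
      then Proc();
  finally
    Demo.Free;
  end;
  Readln;
end.
```

The method pointer is assembled the same way as in TRunner.Run above; only the lookup differs.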