Byte Array redux

I am continuing to experiment with the coding patterns used in my numerics dll project, and I am releasing TForge 0.60.

First of all, TForge is a modern cryptographic library for Delphi and Lazarus/FPC (which I am currently working on), implemented as a set of runtime packages. The current release contains only one package (also named TForge); it is the core package of the whole TForge project and is required by the other packages (to be released later).

The purpose of the release is to introduce a new type, ByteArray; the type is an enhanced version of the standard RTL TBytes type. If you want to test ByteArray you need to build the tforge package; the release contains tforge packages for Delphi XE and Lazarus.

The release includes the ByteArrayDemo console application, which demonstrates the functionality of the ByteArray type. For example, you can concatenate byte arrays:

var
  A1, A2: ByteArray;

begin
  A1:= ByteArray(1);
  A2:= TBytes.Create(2, 3, 4);
  Writeln('A1 + A2 = ', (A1 + A2).ToString); // 1, 2, 3, 4

perform bitwise boolean operations (xor is most useful):

var
  A1, A2: ByteArray;

begin
  A1:= ByteArray(1);
  A2:= TBytes.Create(2, 3, 4);
  Writeln('A1 xor A2 = ', (A1 xor A2).ToString); // 3 (= 1 xor 2, min array length used)

use fluent coding style (which appears to be very handy once one gets accustomed to it):

begin
  Writeln(ByteArray.FromText('ABCDEFGHIJ').Insert(3,
    ByteArray.FromText(' 123 ')).Reverse.ToText); // JIHGFED 321 CBA

and so on. It was fun to code ByteArray; I am using it more and more now as a TBytes replacement because of better usability.
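
The "min array length" rule noted in the xor example above can be imitated with plain TBytes. The following is only a rough sketch, not the TForge implementation: XorBytes is a hypothetical helper written for illustration, and it assumes that the result is simply truncated to the shorter operand.

```pascal
program XorSketch;

{$APPTYPE CONSOLE}

uses
  SysUtils;

// Hypothetical helper (not part of TForge): xor two byte arrays,
// truncating the result to the length of the shorter one.
function XorBytes(const A, B: TBytes): TBytes;
var
  I, N: Integer;

begin
  N:= Length(A);
  if Length(B) < N then
    N:= Length(B);
  SetLength(Result, N);
  for I:= 0 to N - 1 do
    Result[I]:= A[I] xor B[I];
end;

var
  R: TBytes;

begin
  R:= XorBytes(TBytes.Create(1), TBytes.Create(2, 3, 4));
  Writeln(Length(R), ' byte(s), first = ', R[0]); // 1 byte(s), first = 3
end.
```

With ByteArray the same result is obtained by the single expression A1 xor A2, which is the usability gain mentioned above.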

Why PDWORD is not a pointer to DWORD

Once upon a time the Delphi team lead decided that Delphi should contain declarations of C++ types like unsigned long for the sake of better C++ compatibility. So he wrote

type
  CppULongInt = LongWord;

No, that is not good, thought the team lead. Delphi’s LongWord is a 32-bit type, while C++’s unsigned long is a platform-dependent type. So let us declare CppULongInt as a distinct type instead of an alias type:

type
  CppULongInt = type LongWord;

That is better. Now how about DWORD type? It is declared in Windows headers as

  typedef unsigned long       DWORD;

So for compatibility’s sake we will declare DWORD as

type
  DWORD = CppULongInt;

Great, thought the Delphi team lead. He called in two juniors (let us call them DevA and DevB) and said:

- DevA, create a branch (BranchA) in the Delphi project, declare type DWORD = CppULongInt; and fix all bugs that may appear afterwards;
- DevB, create a branch (BranchB) in the Delphi project, declare type PDWORD = ^CppULongInt; and fix all bugs that may appear afterwards.

After you are both done we will merge the branches and declare the PDWORD type as it should be:

type
  PDWORD = ^DWORD;

In due time the happy DevB came to the Delphi team lead and said: “I did everything as you said, Sir! Now type PDWORD = ^CppULongInt;, and everything is fine!”

But the other guy, DevA, was not happy. He said: “Sir, we have tons of code like this:

procedure Foo(var Value: DWord);
begin
  Value:= 0;
end;

var
  D: Longword;

begin
  Foo(D);

If I declare

type
  DWORD = CppULongInt;

then it does not compile. What should I do?”

- “Nothing”, said the Delphi team lead. “BranchB will be the main trunk now.”

Disclaimer: the tale was written after reading this SO question
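
DevA’s problem can be reproduced in a few lines. This is a minimal sketch, assuming a Delphi compiler; the names DWordAlias, FooAlias and FooDistinct are made up for illustration, and the rejected call is left commented out so that the program still builds.

```pascal
program DistinctTypeDemo;

{$APPTYPE CONSOLE}

type
  CppULongInt = type LongWord; // distinct type, as in the tale
  DWordAlias = LongWord;       // hypothetical plain alias, for comparison

procedure FooAlias(var Value: DWordAlias);
begin
  Value:= 0;
end;

procedure FooDistinct(var Value: CppULongInt);
begin
  Value:= 0;
end;

var
  D: LongWord;

begin
  D:= 5;
  // An alias is the same type as LongWord, so this call compiles
  FooAlias(D);
  // A var parameter requires identical types; uncommenting the next
  // line makes the compiler reject the call, which is what DevA ran into
  // FooDistinct(D);
  Writeln(D);
end.
```

A pointer type does not suffer from this in the same way, because code usually passes pointers by value or casts them explicitly, which may be why BranchB was so much easier to finish.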

Why unit tests are not always good

Unit tests are good at detecting most bugs in your code, but not all of them. When you write standard unit tests for a class you do the following:

  • Create a fresh class instance (e.g. in the SetUp method of the DUnit framework);
  • Run the code under test (usually a single call of a single method) on the instance;
  • Free the instance (e.g. in the TearDown method of the DUnit framework).

And this is how unit tests should be written; if your test detects a bug you immediately know the bug’s origin.

The problem with the above scenario is that it is ideal for hiding some hard-to-reproduce bugs, such as access violation (AV) bugs. To detect such a bug with good probability you need something different: probably multiple calls of a method on the same instance, or calls of different methods in the same test. This approach is quite the opposite of the idea of unit testing.
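
The difference between the two test shapes can be sketched without a test framework. The class below is purely hypothetical, and its bug is a harmless piece of stale state rather than a real access violation; it only stands in for bugs that surface when an instance is reused.

```pascal
program RepeatedCallDemo;

{$APPTYPE CONSOLE}

type
  // Hypothetical class under test: Sum should return the sum of
  // the values passed in.
  TSummer = class
  private
    FTotal: Integer;
  public
    function Sum(const Values: array of Integer): Integer;
  end;

function TSummer.Sum(const Values: array of Integer): Integer;
var
  I: Integer;

begin
  // Deliberate bug: FTotal is not reset to 0 before summing
  for I:= Low(Values) to High(Values) do
    Inc(FTotal, Values[I]);
  Result:= FTotal;
end;

// Classic unit test shape: fresh instance, one call, free
function SingleCallPasses: Boolean;
var
  S: TSummer;

begin
  S:= TSummer.Create;
  try
    Result:= S.Sum([1, 2, 3]) = 6;
  finally
    S.Free;
  end;
end;

// Different shape: two calls on the same instance
function RepeatedCallsPass: Boolean;
var
  S: TSummer;

begin
  S:= TSummer.Create;
  try
    Result:= (S.Sum([1, 2, 3]) = 6) and (S.Sum([1, 2, 3]) = 6);
  finally
    S.Free;
  end;
end;

begin
  Writeln('Single call:    ', SingleCallPasses);  // TRUE
  Writeln('Repeated calls: ', RepeatedCallsPass); // FALSE
end.
```

The single-call test passes because a new instance starts with FTotal = 0; only the second call on the same instance exposes the problem.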

Numerics 0.57 released (Hashtables, Bugfix)

1. The main purpose of the release is to implement hash tables (aka associative arrays) with keys of BigCardinal or BigInteger type; such hash tables are important for cryptographic applications. Hash tables in Delphi are implemented by the generic TDictionary<TKey,TValue> class. The release includes a new unit, GNumerics.pas, with two generic helper classes, TBigIntegerDictionary<TValue> and TBigCardinalDictionary<TValue>, which specialize the base generic dictionary class. For example, to work with a hash table that has BigInteger keys and string values you need something like this:

uses GNumerics;

[..]
var
  HashTable: TBigIntegerDictionary<string>;
  A: BigInteger;

[..]
begin
  A:= BigInteger('1234567890987654321');
// create hash table instance;
// nothing should stand between Create and try
  HashTable:= TBigIntegerDictionary<string>.Create;
  try
// fill hash table with data
    HashTable.Add(A * A, 'some string value');
    [..]
  finally
// destroy hash table
    HashTable.Free;
  end;
end;

2. Some bugs were fixed; in particular, a nasty access violation bug in the BigInteger.ModPow function.

3. Minor changes in BigCardinal/BigInteger methods.

Link to the download page

Update

Version 0.58 fixes a bug in the conversion from (U)Int64 to BigInteger/BigCardinal.

RawByteString type explained

With the introduction of Unicode support Delphi also introduced the magic RawByteString type; the word ‘magic’ here means that you can’t implement a type with RawByteString functionality without hidden compiler support.

A common misunderstanding about the RawByteString type is that instances of RawByteString contain no encoding information and that because of this the compiler can’t perform implicit string conversion. That is not true.

First of all, RawByteString is an AnsiString (1-byte character size). If you typecast a Unicode string to the RawByteString type, or a RawByteString string to a Unicode type, the compiler will always perform a string conversion; if the compiler has no information about the ANSI codepage of the RawByteString, it uses the system codepage for the conversion.

So the RawByteString magic is for AnsiStrings only.

When you declare a variable of AnsiString type you also declare a codepage; for example

type
  CyrString = type AnsiString(1251);

var
  S: CyrString;

That means the compiler has static codepage information; if you do not declare a codepage the compiler assumes the system codepage. AnsiString instances also have runtime codepage information (a codepage field in the string instance header), but the compiler normally ignores the runtime codepage information and uses the static codepage information for string conversions.

The magic of RawByteString type is that the compiler has no static codepage information; it does not mean that a RawByteString instance has no runtime codepage information.

If you typecast an AnsiString instance to the RawByteString type, no string conversion happens.

The RawByteString type is for typecasting ANSI strings only, not for creating instances of the type. To understand how it works let us first consider “an educated abuse” of RawByteString:

type
  string1251 = type AnsiString(1251);
  string1252 = type AnsiString(1252);

var
  S1: string1251;
  S2: string1252;

begin
// just initialize it with some data
  S1:= 'АБВГДЕЙКА';

// no string conversion here;
// the S1 string instance is copied 'as is',
//   with codepage information
  S2:= RawByteString(S1);

Now we have an instance (S2) of string1252 type containing data in ANSI 1251 encoding and runtime codepage 1251. But since the compiler normally uses static codepage information the subsequent use of the instance may produce strange results.
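
The runtime codepage stored in a string instance can be inspected with the RTL function StringCodePage. The following is a small sketch that contrasts the typecast with an ordinary assignment; the variable S3 is added only for illustration, the source file is assumed to be saved in an encoding that preserves the Cyrillic literal, and the exact values printed are left for the reader to check on their own compiler version.

```pascal
program RawCastDemo;

{$APPTYPE CONSOLE}

type
  string1251 = type AnsiString(1251);
  string1252 = type AnsiString(1252);

var
  S1: string1251;
  S2, S3: string1252;

begin
  S1:= 'АБВ';

// typecast: according to the text above the instance is copied
// 'as is', so S2 is expected to keep the codepage of S1
  S2:= RawByteString(S1);

// ordinary assignment: the compiler sees two different static
// codepages and inserts a conversion (possibly with a warning)
  S3:= S1;

  Writeln('S1 runtime codepage: ', StringCodePage(S1));
  Writeln('S2 runtime codepage: ', StringCodePage(S2));
  Writeln('S3 runtime codepage: ', StringCodePage(S3));
end.
```

If S2 and S3 report different codepages although both variables are declared as string1252, that is the mismatch between static and runtime information described above.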

Finally, an example of correct RawByteString usage. The following function counts the number of occurrences of the ‘?’ character in an ANSI string:

function CountQuestions(const S: RawByteString): Integer;
const
  Mark = $3F;

var
  I: Integer;

begin
  Result:= 0;
  for I:= 1 to Length(S) do begin
    if Byte(S[I]) = Mark
      then Inc(Result);
  end;
end;

The purpose of using the RawByteString type for the function’s argument is to avoid an unnecessary string conversion: an AnsiString with any codepage can be passed in without being converted.

Implementing generic interfaces in Delphi

Delphi supports generic interfaces; for example we can declare a generic interface

type
  IChecker<T> = interface
    function Check(const Instance: T): Boolean;
  end;

and use this generic interface as follows:

unit UseDemo;

interface

uses GenChecks;

type
  TDemo<T> = class
  private
    FChecker: IChecker<T>;
  public
    constructor Create(AChecker: IChecker<T>);
    procedure Check(AValue: T);
  end;

implementation

{ TDemo<T> }

procedure TDemo<T>.Check(AValue: T);
begin
  if FChecker.Check(AValue)
    then Writeln('Passed')
    else Writeln('Stopped')
end;

constructor TDemo<T>.Create(AChecker: IChecker<T>);
begin
  FChecker:= AChecker;
end;

end.

To implement the above generic interface IChecker we need a generic class; the straightforward solution is

type
  TChecker<T> = class(TInterfacedObject, IChecker<T>)
    function Check(const Instance: T): Boolean;
  end;

If the IChecker interface can be implemented like that, we need nothing else. The problem with the above implementation is that we are limited to what the generic constraints on the type T allow, and we can’t use properties of the specific types, like Integer or string, that will finally be substituted for the type T.

A more flexible solution is to introduce an abstract base type and derive the specific implementations from it. Here is a full code example:
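
To make the limitation concrete, here is a sketch of a fully generic implementation. TEqualsChecker is a hypothetical name; the sketch assumes that equality is the only thing we want to check, because that is about all that can be done with an unconstrained T (here via the default comparer from Generics.Defaults).

```pascal
program GenericCheckerSketch;

{$APPTYPE CONSOLE}

uses
  Generics.Defaults;

type
  IChecker<T> = interface
    function Check(const Instance: T): Boolean;
  end;

  // Hypothetical generic implementation: works for any T, but can
  // only compare values for equality
  TEqualsChecker<T> = class(TInterfacedObject, IChecker<T>)
  private
    FCheckValue: T;
    FComparer: IEqualityComparer<T>;
  public
    constructor Create(const ACheckValue: T);
    function Check(const Instance: T): Boolean;
  end;

constructor TEqualsChecker<T>.Create(const ACheckValue: T);
begin
  inherited Create;
  FCheckValue:= ACheckValue;
  FComparer:= TEqualityComparer<T>.Default;
end;

function TEqualsChecker<T>.Check(const Instance: T): Boolean;
begin
  // Length(Instance) or Instance mod 2 would not compile here:
  // nothing is known about T inside the generic class
  Result:= FComparer.Equals(Instance, FCheckValue);
end;

var
  IntChecker: IChecker<Integer>;
  StrChecker: IChecker<string>;

begin
  IntChecker:= TEqualsChecker<Integer>.Create(42);
  StrChecker:= TEqualsChecker<string>.Create('trololo');
  Writeln(IntChecker.Check(42));
  Writeln(StrChecker.Check('ololo'));
end.
```

A check that depends on the concrete type, such as comparing string lengths, has no place in such a class.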

program GenericEx1;

{$APPTYPE CONSOLE}

uses
  SysUtils,
  GenChecks in 'GenChecks.pas',
  UseDemo in 'UseDemo.pas';

procedure TestInt;
var
  Demo: TDemo<Integer>;

begin
  Demo:= TDemo<Integer>.Create(TIntChecker.Create(42));
  try
    Demo.Check(0);   // Stopped
    Demo.Check(42);  // Passed
  finally
    Demo.Free;
  end;
end;

procedure TestStr;
var
  Demo: TDemo<string>;

begin
  Demo:= TDemo<string>.Create(TStrChecker.Create('trololo'));
  try
    Demo.Check('ololo');    // Stopped (length 5 <> 7)
    Demo.Check('olololo');  // Passed (length 7 = 7)
  finally
    Demo.Free;
  end;
end;

begin
  TestInt;
  TestStr;
  ReadLn;
end.

unit GenChecks;

interface

type
  IChecker<T> = interface
    function Check(const Instance: T): Boolean;
  end;

type
  TCustomChecker<T> = class(TInterfacedObject, IChecker<T>)
  protected
    FCheckValue: T;
    function Check(const Instance: T): Boolean; virtual; abstract;
  public
    constructor Create(ACheckValue: T);
  end;

  TIntChecker = class(TCustomChecker<Integer>)
  protected
    function Check(const Instance: Integer): Boolean; override;
  end;

  TStrChecker = class(TCustomChecker<string>)
  protected
    function Check(const Instance: string): Boolean; override;
  end;

implementation

{ TCustomChecker<T> }

constructor TCustomChecker<T>.Create(ACheckValue: T);
begin
  FCheckValue:= ACheckValue;
end;

{ TIntChecker }

function TIntChecker.Check(const Instance: Integer): Boolean;
begin
  Result:= Instance = FCheckValue;
end;

{ TStrChecker }

function TStrChecker.Check(const Instance: string): Boolean;
begin
  Result:= Length(Instance) = Length(FCheckValue);
end;

end.

In the above example each IChecker interface reference refers to its own class instance; this is necessary because every instance contains a parameter (FCheckValue) set in the constructor. If an implementation does not require such a parameter, creating a new instance for every interface reference is unnecessary overhead. A better solution is to use a singleton instance.

Here is a full code example for the integer type:

program GenericEx2;

{$APPTYPE CONSOLE}

uses
  SysUtils,
  GenChecks in 'GenChecks.pas',
  UseDemo in 'UseDemo.pas';

procedure TestInt;
var
  Demo: TDemo<Integer>;

begin
  Demo:= TDemo<Integer>.Create(TIntChecker.Ordinal);
  try
    Demo.Check(0);   // Stopped
    Demo.Check(42);  // Passed
  finally
    Demo.Free;
  end;
end;

begin
  TestInt;
  ReadLn;
end.

unit GenChecks;

interface

uses Generics.Defaults;

type
  IChecker<T> = interface
    function Check(const Instance: T): Boolean;
  end;

  TCustomChecker<T> = class(TSingletonImplementation, IChecker<T>)
  protected
    function Check(const Instance: T): Boolean; virtual; abstract;
  end;

  TIntChecker = class(TCustomChecker<Integer>)
  private
    class var
      FOrdinal: TCustomChecker<Integer>;
  public
    class function Ordinal: TIntChecker;
  end;

implementation

type
  TOrdinalIntChecker = class(TIntChecker)
  public
    function Check(const Instance: Integer): Boolean; override;
  end;

{ TOrdinalIntChecker }

function TOrdinalIntChecker.Check(const Instance: Integer): Boolean;
begin
  Result:= Instance = 42;
end;

{ TIntChecker }

class function TIntChecker.Ordinal: TIntChecker;
begin
  if FOrdinal = nil then
    FOrdinal := TOrdinalIntChecker.Create;
  Result := TIntChecker(FOrdinal);
end;

end.

A Year in MOOC

About a year ago I joined my first MOOC [Massive Open Online Course]. There were a lot of courses after that; I’ve completed 5 courses and dropped about 20, and I’ve come to my own understanding of what MOOCs are and what they should be.

I believe many MOOC courses now are more about positive feedback from the learners than about teaching quality. Many courses are watered-down versions of offline university courses (or maybe these universities have such watered-down courses), easy to pass. I dropped most of these courses because they were boring.

Fortunately not all MOOC courses are like that. So far I have discovered 2 real MOOC gems: the first was BerkeleyX CS-191x: Quantum Mechanics and Quantum Computation provided by UC Berkeley, and the second was MITx 6.041x: Introduction to Probability – The Science of Uncertainty provided by MIT.

Both courses were about fundamentals, the things many talk about but few understand, and the professors were really interested in the learners mastering the course, not in positive feedback from the learners. Both courses required a lot of time and effort to master, but the rewards of mastering them are incomparable to those of the watered-down courses.

Both courses set a high educational standard, yet both are undergraduate courses. I hope we are at the beginning of an educational evolution, and that after some time all courses of the best professors from the best universities will be available online.