On the Pointers and References

6

A Reference is a distinct concept in C/C++ languages. The next code sample (MinGW GCC compiler was used)

#include <iostream>

int main()
  {
    int Value;
    int &ValueRef = Value;

    Value = 2;
    std::cout << "Value: " << Value << "\n";
    std::cout << "ValueRef: " << ValueRef << "\n";

    ValueRef = 3;
    std::cout << "Value: " << Value << "\n";
    std::cout << "ValueRef: " << ValueRef << "\n";
    return 0;
  }

declares ValueRef as a reference to Value variable. The output is
_ref1
Though ValueRef variable is a pointer to Value internally the indirection operator is never used, and the syntax for accessing Value directly or indirectly via ValueRef is the same. ValueRef is also called an alias to Value; the point that ValueRef variable is a pointer internally is just an implementation detail.

Another important thing about C/C++ references is that they are always initialized. The language syntax enforces that you cannot declare a wild reference.

Pascal does not have the same reference concept. The closest concept is a procedure parameter passed by reference:

program ref;

{$APPTYPE CONSOLE}

var
  Value: Integer;

procedure Test(var ValueRef: Integer);
begin
  Writeln('ValueRef: ', ValueRef);
  ValueRef:= 3;
end;

begin
  Value:= 2;
  Writeln('Value: ', Value);
  Test(Value);

  Writeln('Value: ', Value);
  Test(Value);
end.

we can see that

  • ValueRef does not use indirection operator to access a referenced variable;
  • the language syntax enforces that ValueRef is always initialized.

Delphi does not elaborate the reference concept, though there are many built-in types in the language that are ‘transparent pointers’ – objects, interfaces, dynamic arrays, strings. The term reference can be used for example for object variables because the language syntax makes these variables indistinguishable from referenced instances. Instead the term object is usually used, that can mean object reference or object instance, so sometimes you think twice to understand what a particular object does mean.

Bitwise operations on big integers

0

Standard fixed-sized negative integers are stored in two’s complement format; for arbitrary-precision big integers the two’s complement format means infinite size, so internally it is not used. Still bitwise operation on big integers are implemented as if negative big integer values are stored in two’s complement format.

As a result bitwise boolean operations (and, or, xor) applied to big integers produce the same results as with standard fixed-sized integers:

procedure Test1(I, J: Integer);
var
  BigI, BigJ: BigInteger;

begin
  BigI:= I;
  BigJ:= J;
  Assert(BigI and BigJ = I and J);
  Assert(BigI or BigJ = I or J);
  Assert(BigI xor BigJ = I xor J);
end;

There is a difference between standard Delphi integer types and big integers in shift operations. Shift operations on big integers preserve sign. That means any shift applied to negative big integer results in negative big integer (the same is for non-negative values):

procedure Test2(I: BigInteger; N: Cardinal);
begin
  Assert(((I shl N) < 0) = (I < 0));
  Assert(((I shr N) < 0) = (I < 0));
end;

That is a natural consequence of the infinite-sized two’s complement negative values. Shift operations on big integers are arithmetic shifts rather than logical shifts. On the other hand ‘shl’ and ‘shr’ operations on the standard Delphi integer types are implemented as logical shifts and does not preserve sign:

procedure Test3;
var
  I: Integer;

begin
  I:= -1;
  Writeln(I shr 2);   // 1073741823, because 'shr' is logical shift
  I:= 1000000000;
  Writeln(I shl 2);   // -294967296, because of 32-bit overflow
end;

PS: right now TForge does not support bitwise operations on BigInteger type, they will be implemented soon.

A word about ‘out’ parameters in Delphi

5

Delphi compiler has a contract – any interface variable should be either a valid reference or nil. The same also applies to other lifetime managed types – strings and dynamic arrays. An interesting consequence of the contract is how Delphi interprets ‘out’ function parameters.
As the name suggests, ‘out’ means that an input value of a parameter should be ignored inside a function; but Delphi compiler cannot ignore input value of a lifetime managed instance. It decrements the reference count of an instance when an instance is no longer needed if a reference to instance is not nil.
As a workaround Delphi compiler applies the following trick. Suppose we have a procedure

procedure Foo(out II: IInterface);
begin
  II:= TInterfacedObject.Create;
end;

a call of the procedure

  Foo(II);

compiles as:

II:= nil;
Bar(II);

where

procedure Bar(var II: IInterface);
begin
  II:= TInterfacedObject.Create;
end;

Bar procedure decrements the reference count of the input parameter instance; the meaning of ‘out‘ parameter forbids to do so. Since Delphi compiler cannot ignore input value of a lifetime manage instance the problem was solved by assigning nil value to a parameter before calling a procedure.

Big Integer musings

2

When writing a multiple precision integer arithmetic library you should have in mind two aims – usability and performance, and to reach both is not a trivial task. A modern multiple precision integer implementation in Delphi should be as easy to use as that:

var
  A, B, C: TBigInteger;

begin
  A:= 1;
  B:= 2;
  C:= A + B;
  Writeln(C.AsString);
end.

Internally a big integer is some data structure that should be initialized prior it can be used, but there is no explicit initialization in the above code example. Initialization occures when a big integer is assigned first time, and the requirement that a value should be assigned prior it can be used is the same for plain integer type. The only difference here is that a most probable result of using unassigned (= uninitialized) big integer is an access violation error.

Delphi provides operator overloading to implement initialization by assignment, so a big integer should be implemented as an advanced record with overloaded Implicit operator. Next point that should taken into account is an assignment of two big integers:

var
  A, B: TBigInteger;

begin
  A:= 1;
  B:= A;
  Writeln(B.AsString);
end.

The assignment of two big integers cannot be overloaded, and that means that a big integer data should be reference counted; the assignment B:= A just increments a ref counter. A ref counted implementation has its pros and contras from the performance point of view, but there is no alternative here if you are aiming usability too. Reference counting can be implemented either by using a dynamic array to hold big integer data or by using interface, and interface implementation leads to a cleaner code.

Next logical conclusion is that a big integer data should be immutable and write accessed only as a whole. You can’t change just one byte of big integer data because of side effect when ref counter > 1. Fortunately there is no need in byte-level write access to a big integer data. Any write access to a big integer should create a new data structure with ref counter = 1 and decrement ref counter of an old one.

Functional programming style in Delphi

10

Functional programming paradigm gradually finds its way even in Delphi – an imperative language without a garbage collector. Consider the next code sample that changes a button’s caption using extended RTTI (requires Delphi 2010 and above):

procedure TForm1.Button1Click(Sender: TObject);
var
  Ctx: TRttiContext;
  P: TRttiProperty;
  T: TRttiType;

begin
  T:= Ctx.GetType(TButton);
  P:= T.GetProperty('Caption');
  P.SetValue(Button1, 'RTTI');
end;

When analyzing the above code snippet note the following:

  • TRttiContext is a record type; you need not create and destroy a variable of a record type
  • TRttiProperty and TRttiType types are classes; the instances of these classes are created by correspondent TRttiType and TRttiContext methods, but you should not destroy these instances yourself – the underlying extended RTTI implementation takes care about it

As a result the above code snippet can be rewritten in a functional style:

procedure TForm1.Button1Click(Sender: TObject);
begin
  TRttiContext.Create.GetType(TButton).GetProperty('Caption').SetValue(Button1, 'RTTI');
end;

All variables have gone. No memory leaks. Does your brain hurt?

On the type compatibility in Delphi

4

Delphi compiler evolves much faster than Delphi documentation, and some language features remain unnamed. Consider the following code snippet:

type
  MyPChar1 = PChar;
  MyPChar2 = type PChar;
  MyPChar3 = ^Char;

procedure Test(Ch: PChar);
begin
end;

procedure TForm1.Button4Click(Sender: TObject);
var
  Ch1: MyPChar1;
  Ch2: MyPChar2;
  Ch3: MyPChar3;

begin
  Test(Ch1);
//  Test(Ch2);     Error E2008 Incompatible types
//  Test(Ch3);     Error E2010 Incompatible types: 'Unit1.Char' and 'System.Char'
end;

The documentation states that MyPChar1 and PChar are identical types, while MyPChar2 and MyPChar3 are distinct; that is why Test(Ch2); and Test(Ch3); lines does not compile. But notice – the compiler issues different error codes. Does it matter?

Consider the next snippet:

procedure Test1(Ch: MyPChar1);
begin
end;

procedure Test2(Ch: MyPChar2);
begin
end;

procedure Test3(Ch: MyPChar3);
begin
end;

procedure TForm1.Button5Click(Sender: TObject);
begin
  Test1('Foo');
  Test2('Foo');   // Compiles
//  Test3('Foo');    Error E2010 Incompatible types: 'MyPChar3' and 'string'
end;

The documentation states that PChar type is assignment compatible on input with string literal. It appears that there is a difference in type ‘distinctness’. The type PChar type is somewhat more compatible with PChar than ^Char type.

Generic musings

7

Delphi (as Pascal descendant) has always had powerful type system. Consider the following sample code implementing simple insertion sort algorithm (I use Delphi 2009):

unit InsSort;

interface

type
  TItem = Integer;
  TArray = array of TItem;

procedure InsertionSort(var A: TArray);

implementation

procedure InsertionSort(var A: TArray);
var
  I, J: Integer;
  Item: TItem;

begin
  for I:= 1 + Low(A) to High(A) do begin
    Item:= A[I];
    J:= I - 1;
    while (J >= Low(A)) and (A[J] > Item) do begin
      A[J + 1]:= A[J];
      Dec(J);
    end;
    A[J + 1]:= Item;
  end;
end;

The above code sorts integer array; if we need to sort array of strings, all we have to do is to switch TItem type declaration from Integer to string:

type
  TItem = string;

If our task is to sort both integer and string arrays we should have two separate source code units (though they differ just by one word). Clearly that is code duplication that can’t be solved in plain Pascal. But Delphi has generics, let us try generics. Naive

  procedure InsertionSort<TItem>(var A: array of TItem);

does not compile, we need class to use generics; we also need some means to compare generic types, so I have written the following simple generic class:

type
  TArray<TItem> = class
    class function Compare(const L, R: TItem): Integer; static;
    class procedure InsertionSort(var A: array of TItem); static;
  end;

implementation

uses SysUtils, TypInfo;

class function TArray<TItem>.Compare(const L, R: TItem): Integer;
var
  P: PTypeInfo;

begin
  P:= TypeInfo(TItem);
  case P^.Kind of
    tkInteger: begin
      if PInteger(@L)^ < PInteger(@R)^ then Result:= -1
      else if PInteger(@L)^ > PInteger(@R)^ then Result:= 1
      else Result:= 0;
    end;
    tkUString: Result:= CompareStr(PString(@L)^, PString(@R)^);
  end;
end;

class procedure TArray<TItem>.InsertionSort(var A: array of TItem);
var
  I, J: Integer;
  Item: TItem;

begin
  for I:= 1 + Low(A) to High(A) do begin
    Item:= A[I];
    J:= I - 1;
//    while (J >= Low(A)) and (A[J] > Item) do begin
    while (J >= Low(A)) and (Compare(A[J], Item) > 0) do begin
      A[J + 1]:= A[J];
      Dec(J);
    end;
    A[J + 1]:= Item;
  end;
end;

It works but it looks ugly. Think also of crazy overhead of generic comparison, especially for plain types like Integers that are normally compared by just a single assembler instruction. Delphi 2009 Rtl (Generics.Defaults.pas unit) provides more elegant (and more lengthy) solution based on IComparer generic interface, but it boils down to the same inefficient runtime type information usage.

But why we need the ‘Compare’ function at all? By the time the compiler generates code for generic TItem type the compiler should already know actual type to substitute for TItem, and using the actual type it can compile the elegant line

    while .. (A[J] > Item) ..

much more efficiently than the ugly

    while .. (Compare(A[J], Item) > 0) ..

The idea of direct efficient Generic comparison also invokes Haskell’s ‘type class’ concept. The ‘Type class’ is something very different from OOP class. In Haskell every type is an instance of many type classes. For example integer type is an instance of ‘Eq’ type class because integers can be compared for equality, of ‘Ord’ type class because all comparison operation (<, <=, >, >=) are applicable to integers, of ‘Show’ type class because integers can be converted to string, etc. In our case the generic TItem type should be an instance of ‘Ord’ type class.

Delphi interfaces without reference counting

13

Any interface in Delphi inherits from IInterface (which is a nickname for IUnknown). It is nice for reference-counted interfaces, but sometimes we do not need reference counting at all. Consider the following example:

unit IValues;

interface

type
  IValue = interface(IInterface)
    function GetValue: Integer;
  end;

function GetIValue: IValue;

implementation

type
  TValueObject = class(TObject, IValue)
  protected
    FValue: Integer;
    function GetValue: Integer;
  public
    constructor Create(AValue: Integer);
  end;

var
  FValueObject: TValueObject;

function GetIValue: IValue;
begin
  Result:= FValueObject;
end;

constructor TValueObject.Create(AValue: Integer);
begin
  FValue:= AValue;
end;

function TValueObject.GetValue: Integer;
begin
  Result:= FValue;
end;

initialization
  FValueObject:= TValueObject.Create(11);

finalization
  FValueObject.Free;

end.

The above code does not compile because TValueObject does not implement IUnknown. A workaround is to implement stub methods for IUnknown:

type
  TValueObject = class(TObject, IValue)
  protected
    FValue: Integer;
    function QueryInterface(const IID: TGUID; out Obj): HResult; stdcall;
    function _AddRef: Integer; stdcall;
    function _Release: Integer; stdcall;
    function GetValue: Integer;
  public
    constructor Create(AValue: Integer);
  end;
 
function TValueObject.QueryInterface(const IID: TGUID; out Obj): HResult;
begin
  if GetInterface(IID, Obj) then
    Result:= S_OK
  else
    Result:= E_NOINTERFACE;
end;

function TValueObject._AddRef: Integer;
begin
  Result:= -1;
end;

function TValueObject._Release: Integer;
begin
  Result:= -1;
end;

You can find the same code in VCL sources (TCustomVariantType class from Variants.pas unit).

Now, why should I implement IUnknown in my code if I don’t need it at all?

The above implementation of IUnknown looks like a standard implementation for non-reference-counted interfaces in Delphi. Why not to include it into System.pas and apply compiler magic to make the code in the first example compilable and legitimate? The implementation does not use additional object fields (TInterfacedObject implementation requires additional FRefCount field), so it seems like all that is required from the compiler is to insert pointers to the default IUnknown method’s implementations into the interface tables of any class that does not implement or inherit these methods can be implemented directly in TObject. It will save the programmer from writing a useless duplicate code. The compiler may also issue a warning for the safety reasons, if necessary.

Understanding Threads: Atomic Operations and Memory Ordering

3

A common question about threads is:

I need a read/write access to a variable from two or more threads. Should I protect an access to a variable by critical section?

A simple answer is: yes, if you want to be on the safe side always protect the variable by a critical section. A deeper answer requires an understanding of the two concepts: atomic operations and memory ordering.

The “atomic” means that the result of the operation is always valid. Suppose we have two threads: thread A writes to a variable one of the two possible values (0 or 1); thread B reads the variable. If both writes and reads are atomic, then the result of the read operation in thread B can be only 0 or 1 – it is impossible to read a different value from the variable, even if write and read operations are performed at the same time.

That is what an official documentation on Intel processors says about guaranteed atomic operations:

Guaranteed Atomic Operations

The Intel486 processor (and newer processors since) guarantees that the following basic memory operations will always be carried out atomically:

    Reading or writing a byte
    Reading or writing a word aligned on a 16-bit boundary
    Reading or writing a doubleword aligned on a 32-bit boundary

The Pentium processor (and newer processors since) guarantees that the following additional memory operations will always be carried out atomically:

    Reading or writing a quadword aligned on a 64-bit boundary

Does the above mean that if an access to a variable is carried out atomically by processor there is no need to protect an access to a variable by critical section? A general answer is NO. Sometimes atomic operations are not enough and we also should take a memory ordering into account.

Consider the following situation:

thread A executes the following code:

  K:= 0;
  L:= 0;
// [..] - a lot of code skipped
  K:= 5;
  L:= 1;

thread B executes the following code:

if L = 1 then N:= N + K;

Read and write access to K and L variables is atomic. What is wrong with the above code?
The result of N:= N + K; is predictable only if the processor enforces strong memory ordering, i.e. thread B reads the changed K and L values in the same order as thread A writes them, so that if thread B sees that L has changed (if L = 1) it assumes that K has also changed (and should be read as 5).
Modern processors does not enforce strong memory ordering. You can read about memory ordering (and atomic operations) in official Intel manual (currently available here), but a good practice is not to rely on a particular kind of memory ordering in a particular processor. We should not assume that when thread B reads L as 1 it will next read K as 5 – it is also possible that it will read K as 0, so the result of

if L = 1 then N:= N + K;

is unpredictable. (Of course the result is always predictable if the above code is executed in thread A).

A possible solution in the above situation is to protect read and write access to L variable by a critical section – a critical section creates a kind of memory barrier and enforces local strong memory ordering. A critical section also makes an access to a variable atomic, so if you protect multithread-accessed variables by critical sections you are on the safe side.

Thread synchronization in Windows XP and Vista

0

An implementation of the thread locks have changed between Windows XP and Vista. To make a long story short, XP locks were fair; starting from Vista (and also Windows 2003 SP1) thread locks are unfair. A lock fairness means that the threads that tries to hold a lock (for example, to enter a critical section) forms a FIFO queue – the first thread that is entered the queue is the first thread to get hold the lock. The above statement does not mean though that Vista threads get hold of the lock in random order. I have tried to write a simple demo code to understand the difference in thread synchronization between XP and Vista and failed. The difference is more subtle and not easy to demonstrate.
To be exact, the locks were never 100% fair – reason is kernel mode APC (Asynchronous Procedure Calls) that can shuffle FIFO order both in XP and Vista. This consideration is closely related to well-known PulseEvent function issue (PulseEvent function is deprecated now). Still if there are no APC calls to interfere, the FIFO order is maintained.
With Windows XP we can think that the system never releases a lock if there is a thread waiting for it – the system just passes a lock from the thread that released it to the thread already waiting for it.
The situation is different with Vista.
Imagine the following scenario: a thread A entered a critical section; a thread B is entering the same critical section (waiting for the thread A to leave it); a thread C is executing and is about to enter the critical section. With Windows XP the thread C have no chance to sneak before the thread B and get hold of the lock first. Starting from Vista the thread C have a chance to do it.
The reason to change the lock’s internals is to make the system more multiprocessor (or multicore) friendly. Consider the threads A, B and C are executed on separate processors. If the thread C get hold of the lock just released by thread A (before the thread B), it avoids entering a wait state and also avoids the overhead related (context switching). The thread B is already in wait state, now it will have to wait the thread C to release the lock, but since the thread C has avoided the wait state completely the total system performance is increased. We can think that locks are used more effectively now – the situation when some thread holds a critical section but not executes a code protected by the critical section (performing a context switching) is less probable.