Delphi interfaces on binary level

An interface reference in Delphi is a pointer to a pointer to an interface method table (IMT). This follows the COM specification and is a good starting point for understanding what Delphi interfaces are at the binary level. Delphi interfaces can be made 100% compatible with the COM specification, but that is not required; it is also possible to implement “light COM-like” interfaces.

Interface methods in Delphi are implemented as methods of an object, so let us have a quick look at Delphi objects at the binary level. Each Delphi 2009 object instance has two mandatory fields, 4 bytes each. The first field is a pointer to the class VMT; the last field (prefixed by ‘hf’ in System.pas, presumably for “hidden field”) is used by the TMonitor advanced record, and currently I don’t know what this field is actually for. There are no other fields in a TObject instance, so the TObject instance size in Delphi 2009 is 8 bytes. We do not need these fields here, but we must keep in mind that the first 4 bytes and the last 4 bytes of any object instance are “reserved”.
Now let us consider what TInterfacedObject is at the binary level. TInterfacedObject implements IInterface, which is declared in System.pas as

type
  IInterface = interface
    ['{00000000-0000-0000-C000-000000000046}']
    function QueryInterface(const IID: TGUID; out Obj): HResult; stdcall;
    function _AddRef: Integer; stdcall;
    function _Release: Integer; stdcall;
  end;

The TInterfacedObject itself is declared as

  TInterfacedObject = class(TObject, IInterface)
  protected
    FRefCount: Integer;
    function QueryInterface(const IID: TGUID; out Obj): HResult; stdcall;
    function _AddRef: Integer; stdcall;
    function _Release: Integer; stdcall;
  public
    procedure AfterConstruction; override;
    procedure BeforeDestruction; override;
    class function NewInstance: TObject; override;
    property RefCount: Integer read FRefCount;
  end;

The TInterfacedObject instance size (in Delphi 2009) is 16 bytes, so we have two additional fields (4 bytes each). The first additional field is FRefCount; the second is more interesting for us: it is a pointer to the interface method table. If we create an interface reference, for example by calling

var
  II: IInterface;

begin  
  II:= TInterfacedObject.Create;
  ..

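then we have the following picture: II does not point to the start of the object instance; it points to the IMT pointer field inside the instance, and that field in turn points to the IMT.

The IMT consists of 3 entries: pointers to the QueryInterface, _AddRef and _Release implementations.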
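The instance sizes quoted above can be checked at run time with InstanceSize, and the three IMT entries can be listed by dereferencing the interface reference. The following is only a minimal sketch, assuming a 32-bit Delphi 2009 console application; the program name, the TSlots types and the ImtSlot helper are made up for illustration, and the printed values may differ on other compiler versions or on 64-bit targets.

```pascal
program ImtDump;

{$APPTYPE CONSOLE}

uses
  SysUtils;

type
  // The IMT of IInterface viewed as a plain array of three code pointers
  // (illustrative types, not part of the RTL).
  TSlots = array[0..2] of Pointer;
  PSlots = ^TSlots;
  PPSlots = ^PSlots;

// Hypothetical helper: returns IMT entry number Index (0..2) of an
// IInterface reference: 0 = QueryInterface, 1 = _AddRef, 2 = _Release.
function ImtSlot(const Intf: IInterface; Index: Integer): Pointer;
var
  Slots: PSlots;
begin
  // The interface reference points to the IMT field inside the object;
  // the value stored in that field is the address of the IMT itself.
  Slots := PPSlots(Pointer(Intf))^;
  Result := Slots^[Index];
end;

var
  II: IInterface;
  I: Integer;
begin
  Writeln('TObject.InstanceSize           = ', TObject.InstanceSize);
  Writeln('TInterfacedObject.InstanceSize = ', TInterfacedObject.InstanceSize);
  II := TInterfacedObject.Create;
  for I := 0 to 2 do
    Writeln(Format('IMT slot %d -> %p', [I, ImtSlot(II, I)]));
end.
```

With the Delphi 2009 layout described in this article the two sizes should come out as 8 and 16; the slot addresses depend on where the code is loaded.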

Note that an interface reference is just a simple 4-byte pointer, while interface methods are methods of an object and require two pointers: a 4-byte pointer to the method’s code and a 4-byte pointer to the object instance. The code pointer can be found in the IMT, but what about the instance pointer? After a closer look at the picture above we can guess that the compiler “knows” the offset of the IMT field and subtracts it from the interface reference value to obtain a pointer to the object instance. Let us check the guess:

procedure TForm1.Button1Click(Sender: TObject);
var
  II: IInterface;

begin
  II:= TInterfacedObject.Create;
  II._AddRef;
  II._Release;
end;

The above code just calls two interface methods – _AddRef and _Release. The compiler implements these calls as follows:

II._AddRef;
        mov eax,[ebp-$04]
        push eax
        mov eax,[eax]
        call dword ptr [eax+$04]
II._Release;
        mov eax,[ebp-$04]
        push eax
        mov eax,[eax]
        call dword ptr [eax+$08]

[ebp-$04] is the interface reference II. The compiler pushes it onto the stack (as required by the stdcall calling convention), takes the pointer to the IMT from the object field pointed to by the interface reference, adds an IMT offset ($04 for _AddRef, $08 for _Release) and calls the corresponding code. No offset is subtracted; the calls are implemented as if the pointer to the object instance were equal to the interface reference value, but we know they are different.
Let us go further and have a look at the code called by the

        call dword ptr [eax+$04]
        call dword ptr [eax+$08]

instructions:

        add dword ptr [esp+$04],-$08
        jmp TInterfacedObject._AddRef
        add dword ptr [esp+$04],-$08
        jmp TInterfacedObject._Release

Yes! That is where the compiler uses its knowledge of the IMT field offset in the object instance. Instead of calling the TInterfacedObject methods directly, the compiler calls proxy code that converts the interface reference value into a pointer to the object instance and jumps to the object’s method implementation. For optimization reasons the compiler adds -8 instead of subtracting 8 (the offset of the IMT field in a TInterfacedObject instance), but that does not matter for us.
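The adjustment made by the proxy code can also be reproduced by hand. The sketch below is an illustration only, under the same assumptions as the rest of the article (Delphi 2009 or later, where NativeInt has pointer size); the program and variable names are hypothetical. It measures the distance between an interface reference and the object pointer, then subtracts that distance from the interface reference to get the instance pointer back.

```pascal
program IntfOffset;

{$APPTYPE CONSOLE}

var
  Obj: TInterfacedObject;
  II: IInterface;
  Offset: NativeInt;
  Recovered: TObject;
begin
  Obj := TInterfacedObject.Create;
  // II keeps the object alive; its value is the address of the IMT field
  // inside the instance, not the address of the instance itself.
  II := Obj;

  // Offset of the IMT field from the start of the instance.
  Offset := NativeInt(Pointer(II)) - NativeInt(Pointer(Obj));
  Writeln('IMT field offset = ', Offset);

  // Do by hand what "add dword ptr [esp+$04],-$08" does in the proxy code:
  // turn the interface reference value back into the instance pointer.
  Recovered := TObject(NativeInt(Pointer(II)) - Offset);
  Writeln('Same instance: ', Recovered = Obj);
end.
```

For the TInterfacedObject layout described here the offset should match the 8 seen in the proxy code; a class that implements several interfaces has one IMT field per interface, each with its own offset.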

Now all pieces of the puzzle are in place.
On the “client” side we have an interface reference, a plain 4-byte pointer. An interface reference is a pointer to a pointer to the IMT; the IMT is an array of pointers to the methods’ proxy code. When the compiler calls an interface method through an IMT entry, it passes the value of the interface reference as the additional “Self” argument of the method.
On the “server” side we have an object whose methods implement the interface methods. Object methods require a “Self” argument, a pointer to the object instance. But the object’s “Self” is not equal to the value of the interface reference.
Between the “client” and the “server” there is proxy code that converts the “client” interface reference value into the “server” object instance pointer and jumps to the object’s method implementation.

Now we understand how an object’s methods are called through an interface reference, and we can convert an interface reference into a method pointer manually. The following code is a modification of the code from Barry Kelly’s post about anonymous methods in Delphi:

procedure IntRefToMethPtr(const IntRef; var MethPtr; MethNo: Integer);
type
  TVtable = array[0..999] of Pointer;
  PVtable = ^TVtable;
  PPVtable = ^PVtable;
begin
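  // QI=0, AddRef=1, Release=2, etc
  // Code is the IMT entry, i.e. the address of the proxy code
  TMethod(MethPtr).Code := PPVtable(IntRef)^^[MethNo];
  // Data is the unadjusted interface reference; the proxy code adjusts it
  TMethod(MethPtr).Data := Pointer(IntRef);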
end;

Let us test the above procedure:

type
  TIntFunc = function: Integer of object; stdcall;

procedure TForm1.Button2Click(Sender: TObject);
var
  II: IInterface;
  AddRefMeth: TIntFunc;
  Obj: TInterfacedObject;

begin
  Obj:= TInterfacedObject.Create;
  II:= Obj;
  IntRefToMethPtr(II, AddRefMeth, 1);
  ShowMessage(IntToStr(Obj.RefCount));
  AddRefMeth;
  ShowMessage(IntToStr(Obj.RefCount));
  II._Release;
  ShowMessage(IntToStr(Obj.RefCount));
end;

We can see that calling the AddRefMeth method pointer does the same as calling the _AddRef interface method: it increments the reference count.
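The same approach should work for interface methods that take parameters, as long as the method pointer type repeats the signature and calling convention of the interface method. Here is a hedged sketch that calls QueryInterface (IMT entry 0) through a method pointer; the TQueryInterfaceFunc type and the program name are assumptions introduced for this example, and IntRefToMethPtr is repeated from above only to keep the sketch self-contained.

```pascal
program QiViaMethodPointer;

{$APPTYPE CONSOLE}

type
  // Method pointer type mirroring IInterface.QueryInterface (illustrative).
  TQueryInterfaceFunc = function(const IID: TGUID; out Obj): HResult of object; stdcall;

// Same procedure as in the article.
procedure IntRefToMethPtr(const IntRef; var MethPtr; MethNo: Integer);
type
  TVtable = array[0..999] of Pointer;
  PVtable = ^TVtable;
  PPVtable = ^PVtable;
begin
  TMethod(MethPtr).Code := PPVtable(IntRef)^^[MethNo];
  TMethod(MethPtr).Data := Pointer(IntRef);
end;

var
  II, Other: IInterface;
  QI: TQueryInterfaceFunc;
  Res: HResult;
begin
  II := TInterfacedObject.Create;
  // Entry 0 of the IMT is QueryInterface.
  IntRefToMethPtr(II, QI, 0);
  // Ask the object for IInterface again, going through the method pointer
  // instead of the usual II.QueryInterface call.
  Res := QI(IInterface, Other);
  Writeln('HResult = ', Res);
  Writeln('Same reference: ', Pointer(Other) = Pointer(II));
end.
```

If the call goes through the proxy code as described, the result should be S_OK (0) and Other should hold the same interface reference value as II.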

15 thoughts on “Delphi interfaces on binary level”

  1. I presume that you delved into interface depths because you really really needed this? Can you tell us when one would need such a hack?

  2. @Stefan,
    I’m the creator of Emballo. I’m pleased to know that you liked the dll-wrapper. In fact, the work is not completely done yet: Register calling convention is not supported at this time, and I have not tested with all kinds of parameters and result types. At this time I’m working a lot on a mocking framework as part of Emballo, but I can stop it by now and fix the issues you found on the dll wrapper. Could you send me details? You can contact me via my e-mail (magnomp *AT* gmail *DOT* com) or via the Emballo discussion group (http://groups.google.com.br/group/emballo)

  3. In your IntRefToMethPtr code, shouldn’t TMethod(MethPtr).Data := Pointer(IntRef); rather be TMethod(MethPtr).Data := Pointer(IntRef)-8; ?

  4. I have a DLL written in Developer Studio 2006; it returns a pointer to an interface. Can I use this pointer to call interface functions from an application written in Delphi 4, or are interface tables different between different versions of Delphi?

    • I have no Delphi 4, so I can’t check it, but the answer should be yes – interfaces were designed to be version independent.

      Interface table is just an array of plain pointers, it does not depend on Delphi at all.

      The thing that (in principle) may depend on Delphi version is a structure of a Delphi object that implements interface, but it also does not introduce any dependence of the interfaces on the Delphi version because all that matters is implemented in dll.

      For example you may expect that the offset of the object’s IMT field can depend on Delphi version. If you export a function that returns an interface reference from dll, and call the interface methods from application, then the proxy code that converts interface reference into object reference by subtracting IMT offset is implemented in the dll. So it does not matter if you have an object with the same name but different internal structure in the application – all that actually matters is the dll’s object structure.

  5. Your answer helped me a lot. It works under Delphi 4, but this DLL has a lot of errors, so I was concerned that some of these errors were caused by using a pointer to an interface 😉

  7. Can I Obtain Rtti from an Interface Reference?
    Is it possible to implement a function like this:
    function GetRttiFromInterface(AIntf: IInterface; out RttiType: TRttiType): Boolean;

  8. Good tip! i want to use it to implement pure decorator pattern for legacy interface. If we had such a one then it impossible to do that via delegation or something. Especially if interface is created from class that is not from TAggregatedObject descendant. To do so i want to reassign _AddRef, _Release and _QueryInterface methods to my wrapper class. All others should go directly via implements syntax (or even simple pointer that returned from my QueryInterface). The only problem is to find out right offset between legacy interface pointer and wrapper class…

  10. FWIW, http://rvelthuis.de/articles/articles-pointers.html#interfaces. I wrote this ages ago.

    Note that to capture the offset in the stub, you must be aware that the stubs are not always the same. So looking for “add eax,dword x”, where x is the negative offset, may not work all the time. Explained here (the part with LongAdd and ShortAdd stubs): one uses a longint, e.g. $FFFFFFF8, the other uses a shortint, e.g. $C0. The result of both stubs is the same.
