Bitwise operations on big integers

0

Standard fixed-sized negative integers are stored in two’s complement format; for arbitrary-precision big integers the two’s complement format means infinite size, so internally it is not used. Still bitwise operation on big integers are implemented as if negative big integer values are stored in two’s complement format.

As a result bitwise boolean operations (and, or, xor) applied to big integers produce the same results as with standard fixed-sized integers:

procedure Test1(I, J: Integer);
var
  BigI, BigJ: BigInteger;

begin
  BigI:= I;
  BigJ:= J;
  Assert(BigI and BigJ = I and J);
  Assert(BigI or BigJ = I or J);
  Assert(BigI xor BigJ = I xor J);
end;

There is a difference between standard Delphi integer types and big integers in shift operations. Shift operations on big integers preserve sign. That means any shift applied to negative big integer results in negative big integer (the same is for non-negative values):

procedure Test2(I: BigInteger; N: Cardinal);
begin
  Assert(((I shl N) < 0) = (I < 0));
  Assert(((I shr N) < 0) = (I < 0));
end;

That is a natural consequence of the infinite-sized two’s complement negative values. Shift operations on big integers are arithmetic shifts rather than logical shifts. On the other hand ‘shl’ and ‘shr’ operations on the standard Delphi integer types are implemented as logical shifts and does not preserve sign:

procedure Test3;
var
  I: Integer;

begin
  I:= -1;
  Writeln(I shr 2);   // 1073741823, because 'shr' is logical shift
  I:= 1000000000;
  Writeln(I shl 2);   // -294967296, because of 32-bit overflow
end;

PS: right now TForge does not support bitwise operations on BigInteger type, they will be implemented soon.

Benchmarking BigInteger

0

Last TForge commits contain PiBench console application that calculates decimal digits of Pi using Machin’s formula; the execution takes about 30 seconds on my laptop. The code was written to compare the execution time of subsequent TForge revisions, but it can also be used to compare TForge BigCardinal and BigInteger types with .NET 4.0 BigInteger type.

Here is Delphi PiBench code:

program PiBench;

{$APPTYPE CONSOLE}

uses
  SysUtils, Diagnostics, tfNumerics;

var
  StopWatch: TStopWatch;
  PiDigits: BigCardinal;

procedure BenchMark;
var
  Factor, Num, Den: BigCardinal;
  Term: BigCardinal;
  N, M: Cardinal;

begin
  PiDigits:= 0;
  Factor:= BigCardinal.Pow(10, 10000);
  Num:= 16 * Factor;
  Den:= 5;
  N:= 1;
  repeat
    Term:= Num div (Den * (2 * N - 1));
    if Term = 0 then Break;
    if Odd(N)
      then PiDigits:= PiDigits + Term
      else PiDigits:= PiDigits - Term;
    Den:= Den * 25;
    Inc(N);
  until N = 0;
  M:= N;
  Num:= 4 * Factor;
  Den:= 239;
  N:= 1;
  repeat
    Term:= Num div (Den * (2 * N - 1));
    if Term = 0 then Break;
    if Odd(N)
      then PiDigits:= PiDigits - Term
      else PiDigits:= PiDigits + Term;
    Den:= Den * 239 * 239;
    Inc(N);
  until N = 0;
  M:= (M + N) div 2;
// M last digits may be wrong
  PiDigits:= PiDigits div BigCardinal.Pow(10, M);
end;

begin
  ReportMemoryLeaksOnShutdown:= True;
  try
    Writeln('Benchmark test started ...');
    StopWatch:= TStopWatch.StartNew;
    BenchMark;
    StopWatch.Stop;
    Writeln(PiDigits.AsString);
    PiDigits.Free;
    Writeln;
    Writeln('Elapsed ms: ', StopWatch.ElapsedMilliseconds);
  except
    on E: Exception do
      Writeln(E.ClassName, ': ', E.Message);
  end;
  Readln;
end.

And here is equivalent C# application:

using System;
using System.Numerics;
using System.Diagnostics;

namespace PiBench
{
    class Program
    {
        static void Main(string[] args)
        {
            int N, M;
            BigInteger Term;
            Console.WriteLine("Benchmark test started ...");
            Stopwatch Watch = new Stopwatch();
            Watch.Start();
            BigInteger PiDigits = 0;
            BigInteger Factor = BigInteger.Pow(10, 10000);
            BigInteger Num = 16 * Factor;
            BigInteger Den = 5;
            for(N = 1; N != 0; N++){
              Term = Num / (Den * (2 * N - 1));
              if (Term == 0) break;
              if ((N & 1) != 0) PiDigits = PiDigits + Term;
              else PiDigits = PiDigits - Term;
              Den = Den * 25;
            }
            M = N;
            Num = 4 * Factor;
            Den = 239;
            for(N = 1; N != 0; N++){
              Term = Num / (Den * (2 * N - 1));
              if (Term == 0) break;
              if ((N & 1) != 0) PiDigits = PiDigits - Term;
              else PiDigits = PiDigits + Term;
              Den = Den * 239 * 239;
            }
            M = (M + N) / 2;
       // M last digits may be wrong
            PiDigits = PiDigits / BigInteger.Pow(10, M);
            Watch.Stop();
            Console.WriteLine(PiDigits);
            Console.WriteLine();
            Console.WriteLine("Elapsed ms: " + Watch.ElapsedMilliseconds.ToString());
            Console.ReadLine();
        }
    }
}

You can run both and compare; my tests show nearly the same execution time.

TForge

0

My interface-based arbitrary precision integer arithmetic implementation for Delphi and Free Pascal now is a project on BitBucket.
Currently TForge is a single runtime package (tforge.dpk; future releases will also include a language agnostic dll); you can download the project, build the package and start hacking with BigCardinal and BigInteger types by adding tfNumerics unit to uses clause.
The code is not fully tested yet and not optimized at all, so use it carefully.

Here are some implementation details:

1. BigInteger and BigCardinal variables are initialized by assignment:

var A: BigCardinal;
    B: BigInteger;

begin
   A:= 1;
   B:= 2;
   Writeln(A.AsString, ', ', B.AsString);
   ..

2. No need to finalize BigInteger and BigCardinal variables, the compiler does cleanup when a variable goes out of scope; still you can finalize explicitly by Free method. You can also reuse a ‘Freed’ variable:

var A: BigCardinal;

   A:= 1;
   A.Free;
   A:= 10;
   Writeln(A.AsString);

3. Strings can be explicitly casted to BigInteger and BigCardinal types; you can also use TryFromString method to assign a string value to a BigInteger or BigCardinal variable:

var A: BigCardinal;
    B: BigInteger;

begin
  try
   A:= 1;
   B:= BigInteger('1234567890987654321');
   if not A.TryFromString('Wrong string')
     then Writeln('Bad value');
   Writeln(A.AsString, ', ', B.AsString);

4. BigInteger accommodates BigCardinal, so BigCardinal is implicitly casted to BigInteger, and BigInteger is explicitly casted to BigCardinal:

var A: BigCardinal;
    B: BigInteger;

begin
  A:= 1;
  B:= A; // never raises exception
  A:= BigCardinal(B); // no exception here
  B:= -1;
  A:= BigCardinal(B); // exception – negative value
  ..

5. BigInteger and BigCardinal variables can be mixed in Boolean expressions, with BigCardinal casted to BigInteger:

var A: BigCardinal;
    B: BigInteger;

begin
    A:= 1;
    B:= -2;
    if (A > B) then B:= 0;
    ..

6. BigInteger and BigCardinal variables can be mixed in arithmetic expressions, with BigCardinal casted to BigInteger:

var A: BigCardinal;
    B: BigInteger;

begin
    A:= 10;
    B:= -42;
    Writeln( (A * B).AsString);

The project has a public bug tracker, bug reports are welcome!

Big Integer musings

2

When writing a multiple precision integer arithmetic library you should have in mind two aims – usability and performance, and to reach both is not a trivial task. A modern multiple precision integer implementation in Delphi should be as easy to use as that:

var
  A, B, C: TBigInteger;

begin
  A:= 1;
  B:= 2;
  C:= A + B;
  Writeln(C.AsString);
end.

Internally a big integer is some data structure that should be initialized prior it can be used, but there is no explicit initialization in the above code example. Initialization occures when a big integer is assigned first time, and the requirement that a value should be assigned prior it can be used is the same for plain integer type. The only difference here is that a most probable result of using unassigned (= uninitialized) big integer is an access violation error.

Delphi provides operator overloading to implement initialization by assignment, so a big integer should be implemented as an advanced record with overloaded Implicit operator. Next point that should taken into account is an assignment of two big integers:

var
  A, B: TBigInteger;

begin
  A:= 1;
  B:= A;
  Writeln(B.AsString);
end.

The assignment of two big integers cannot be overloaded, and that means that a big integer data should be reference counted; the assignment B:= A just increments a ref counter. A ref counted implementation has its pros and contras from the performance point of view, but there is no alternative here if you are aiming usability too. Reference counting can be implemented either by using a dynamic array to hold big integer data or by using interface, and interface implementation leads to a cleaner code.

Next logical conclusion is that a big integer data should be immutable and write accessed only as a whole. You can’t change just one byte of big integer data because of side effect when ref counter > 1. Fortunately there is no need in byte-level write access to a big integer data. Any write access to a big integer should create a new data structure with ref counter = 1 and decrement ref counter of an old one.

TInteger

2

Since the introduction of operator overloading in Delphi one can easily define custom data types that support arithmetic operations. For example it is quite easy to implement arithmetic of complex numbers like this:

type
  PComplex = ^TComplex;
  TComplex = record
    Re, Im: Extended;

    class operator Implicit(const A: Extended): TComplex;

    class operator Add(const A, B: TComplex): TksComplex;
    class operator Subtract(const A, B: TComplex): TComplex;
    class operator Multiply(const A, B: TComplex): TComplex;
{...}

Note that TComplex is a static type with fixed data size (Re, Im fields) and is normally allocated on stack. The problems arise when you need to implement arithmetic on dynamic data allocated on heap.
If you are about to implement big integer (TInteger) arithmetic you have two alternatives.
First, you can stick to static data types; that means that you should limit maximum number value in the implementation.
The second alternative is to use dynamic type to allocate the numbers, so there is no need to introduce artificial limits on maximum number value.

The first approach is obvious, the second is more interesting.
You can’t use GetMem/FreeMem to allocate/deallocate the numbers on heap because memory deallocation should be performed automatically by the compiler, and you just cannot instruct the compiler to call FreeMem for temporary TInteger value while evaluating arithmetic expression like

A:= (B + C) * D;

Consequently, the dynamic type to hold the number data should be lifetime-managed type – and the dynamic array seems to be the most probable candidate:

type
  TInteger = record
  private
    FData: array of LongWord;
{...}

The above implementation forbids “in-place” operations on TInteger variables – the reason is a side effect of TInteger assignment. Dynamic arrays do not support copy-on-write, so the following assertion will fail:

var
  A, B: TInteger;

begin
  A:= 1;
  B:= A;
  Inc(A);
  Assert(B=1);
end;

After B:= A both A.FData and B.FData reference the same data, so that after Inc(A) we have B = 2.

Now we have 3 alternatives:

  • we can avoid completely the in-place operations on TInteger;
  • we can use undocumented format of dynamic array to implement TInteger “copy-on-write” by hacking dynamic array’s reference count before calling in-place operations;
  • we can use interfaces.

(Well, there are lifetime-managed data types in Delphi that support “copy-on-write” semantic – strings, but they can’t help us since we need some low-level hacking to manipulate the binary data in string format and “copy-on-write” will not work as we need).

Let us try the following:

type
  PIntegerData = ^TIntegerData;
  TIntegerData = record
    Size: LongInt;   // number of allocated longwords in Data
    Data: array[0..0] of LongWord;
  end;

  IInteger = interface
    function GetData: PIntegerData;
    function GetRefCount: Integer;
  end;

  TIntegerObject = class(TInterfacedObject, IInteger)
  private
    FData: PIntegerData;
  public
    constructor Create(Limbs: Integer);
    destructor Destroy; override;
    function GetData: PIntegerData;
    function GetRefCount: Integer;
  end;

  TInteger = record
  private
    FInteger: IInteger;
{..}

The implementation based on interfaces is the most “correct”. It does not use any dirty hacking or undocumented features to get reference count and implement “copy-on-write”. The bad thing is that the above implementation has the worst performance. For example, to allocate a single TInteger value you need to allocate two memory blocks on heap – first for a TIntegerObject and second for TIntegerData referenced by the TIntegerObject.

All in all, a TInteger implementation based on dynamic arrays (with or without refcount hacking) looks most optimal today.

ksTools 0.45 – Multiple precision integers

0

ksTools 0.45 introduce TksInteger type implementing multiple precision integer arithmetics.
I know several implementations of multiple precision integers in Delphi/Pascal, TksInteger is yet another one. The main reasons to write my own implementation are:
1) I wanted to implement the integer division algorithm as described by Knuth (third edition, Volume 2 / Seminumerical algorithms, 4.3.1, page 273);
2) I wanted “optimization-friendly” implementation.
The current implementation containes no optimization, only “pure pascal” code. Instead, ksTools 0.45 containes unit tests for TksInteger methods as a base for future assembly optimizations.
As usual, the latest ksTools version can be downloaded from the project homepage : http://code.google.com/p/kstools