# Hensel’s lifting

0

Number theory is full of gems that turn into effective algorithms demanded in cryptography. Consider the congruence

$x N \equiv 1 \mod R$

where N is odd, and R > N is a power of 2.

Probably the most effective algorithm to solve it is based on the Hensel’s lemma; the corollary that applies for our congruence is: suppose we know the solution $x_{k-1}$ for $R=2^{k-1}$ :

$x_{k-1}N \equiv 1 \mod 2^{k-1}$

then the solution $x_k$ for

$x_k N \equiv 1 \mod 2^k$

is either $x_{k-1}$ or $x_{k-1} + 2^{k-1}$

Example: solve

$x \cdot 21 \equiv 1 \mod 32$

$x_1 \cdot 21 \equiv 1 \mod 2$

which can be nothing else but $x_1 = 1$; next,

$x_2 \cdot 21 \equiv 1 \mod 4$

which can be either $x_2 = 1$ or $x_2 = 1 + 2 = 3$; checking the first alternative gives $1\cdot 21\ and\ 3 = 1$, so $x_2=1$; next,

$x_3 \cdot 21 \equiv 1 \mod 8$; checking $x_3 = x_2 = 1$ gives $1\cdot 21\ and\ 7 = 5$, so $x_3=x_2+4=5$; next,

$x_4 \cdot 21 \equiv 1 \mod 16$; checking $x_4 = x_3 = 5$ gives $5\cdot 21\ and\ 15 = 9$, so $x_4=x_3+8=13$; finally,

$x_5 \cdot 21 \equiv 1 \mod 32$; checking $x_5 = x_4 = 13$ gives $13\cdot 21\ and\ 31 = 17$, so $x_5=x_4+16=29$.

The solution of the congruence is $x = 29$.

# TForge 0.75

4

TForge 0.75 is released.

Starting from ver. 0.75 TForge consists of a single package; that means simplified installation.

Among other features:

• cryptographically secure pseudo-random generator TRandom
• a lot of minor improvements on BigInteger

# On the “out” parameter specifier in Delphi

8

I’ve posted before what the “out” parameter specifier is actually doing in a Delphi code. Now let us consider when the “out” keyword should be used instead of “var”.

There is only one case when the “out” keyword should be used. It is closely related to COM technology support, but it is not really limited to COM. The case is importing a function with a parameter of interface type which violates Delphi contract on reference counting. To illustrate the point consider the following DLL:

library DemoDll;

uses
SysUtils;

type
PMyUnknown = ^TMyUnknown;
TMyUnknown = record
FVTable: PPointer;
FRefCount: Integer;
class function QueryIntf(Inst: Pointer; const IID: TGUID;
out Obj): HRESULT; stdcall; static;
class function AddRef(Inst: PMyUnknown): Integer; stdcall; static;
class function Release(Inst: PMyUnknown): Integer; stdcall; static;
end;

const
VTable: array[0..2] of Pointer = (
@TMyUnknown.QueryIntf,
@TMyUnknown.Release);

function ReleaseInstance(Inst: Pointer): Integer;
type
TVTable = array[0..2] of Pointer;
PVTable = ^TVTable;
PPVTable = ^PVTable;

TRelease = function(Inst: Pointer): Integer; stdcall;

begin
Result:= TRelease(PPVTable(Inst)^^[2])(Inst);
end;

class function TMyUnknown.QueryIntf(Inst: Pointer; const IID: TGUID;
out Obj): HRESULT;
begin
Result:= E_NOINTERFACE;
end;

begin
Inc(Inst.FRefCount);
Result:= Inst.FRefCount;
end;

class function TMyUnknown.Release(Inst: PMyUnknown): Integer;
begin
Dec(Inst.FRefCount);
Result:= Inst.FRefCount;
if Result = 0 then
FreeMem(Inst);
end;

procedure GetIUnknown1(var Inst: PMyUnknown);
var
P: PMyUnknown;

begin
New(P);
P^.FVTable:= @VTable;
P^.FRefCount:= 1;
if Inst <> nil then ReleaseInstance(Inst);
Inst:= P;
end;

procedure GetIUnknown2(var Inst: PMyUnknown);
begin
New(Inst);
Inst^.FVTable:= @VTable;
Inst^.FRefCount:= 1;
end;

exports
GetIUnknown1,
GetIUnknown2;

{$R *.res} begin ReportMemoryLeaksOnShutdown:= True; end.  The DLL exports 2 procedures which create instances implementing IUnknown. The first procedure (GetIUnknown1) follows Delphi rules on reference counting, the second (GetIUnknown2) does not. To use the procedures in an executable we need an import unit: unit DemoImport; interface procedure GetIUnknown1(var I: IUnknown); //procedure GetIUnknown2(var I: IUnknown); // memory leak procedure GetIUnknown2(out I: IUnknown); // ^^^! implementation procedure GetIUnknown1(var I: IUnknown); external 'DemoDLL' name 'GetIUnknown1'; //procedure GetIUnknown2(var I: IUnknown); external 'DemoDLL' name 'GetIUnknown2'; procedure GetIUnknown2(out I: IUnknown); external 'DemoDLL' name 'GetIUnknown2'; end.  Note that the second procedure is imported with out parameter specifier; the result of replacing “out” by “var” is a memory leak which can be shown by the test application: program DemoEXE; {$APPTYPE CONSOLE}

uses
SysUtils,
DemoImport in 'DemoImport.pas';

procedure Test;
var
I: IUnknown;

begin
GetIUnknown1(I);
GetIUnknown2(I);
end;

begin
Test;
end.


# Class Methods vs Instance Methods

9

C# implements BigInteger.Parse as a static class method, and there is obvious reason for it: you can call a static class methods in a variable’s declaration:

  BigInteger N = BigInteger.Parse(stringToParse);


Delphi/Pascal does not support the above syntax, so the equivalent code is

var
N: BigInteger;
begin
N:= BigInteger.Parse(stringToParse);


But now it looks like implementing BigInteger.Parse as a static class method is nothing but additional typing; using an instance method looks better:

var
N: BigInteger;
begin
N.Parse(stringToParse);


So, what is the right “Delphi way” of implementing BigInteger.Parse – as a static class method or as an instance method?

# Integer division by constant

4

If divisor is fixed, the division operation can be replaced by multiplication by a reciprocal; the idea is very simple: x / d = x * (1/d); since d is fixed we can precompute 1/d and use faster multiplication operation. This trick works for integer division, with some additional complications. For example, the next function is identical to division by 5 for all 32-bit dividends:

function Div5(Dividend: LongWord): LongWord;
const
Mult = $CCCCCCCD; PostSh = 2; asm MOV EDX,Mult MUL EDX MOV EAX,EDX SHR EAX,PostSh end;  It turns out that 32-bit multiplication constant does not exists for every divisor; in this case it is possible to use 33-bit constant with hidden senior 1 bit: function Div5_2(Dividend: LongWord): LongWord; const Mult =$9999999A;
PostSh = 3;

asm
MOV     ECX,EAX     // save dividend
MOV     EDX,Mult
MUL     EDX
MOV     EAX,ECX     // restore dividend
RCR     EAX,1       // because addition could produce carry
SHR     EAX,PostSh-1
end;


Naively, the multiplication constant ($9999999A) and postshift (3) can be obtained using the following steps: 1. Write 1/5 in binary form: 1/5 = 0.0011 0011 0011 0011 0011 .. 2. Count leading zeroes after decimal point and add 1; this gives the postshift 3. Write the leading significant bits: 1100 1100 1100 1100 1100 1100 1100 1100 1100 1100 .. 4. Round to 33 bits; rounding is not necessarily based on the value of 34th bit only (later we will use a different approach which is free from rounding problem) 1100 1100 1100 1100 1100 1100 1100 1100 1101 0 5. Remove the leading bit and write as hexadecimal to get the multiplication constant: 99 99 99 9A The second version is more complicated, but good news is that it works for all divisors. Generally, it can be proven that for N-bit division there exists (N+1) bit multiplication constant – for the details see the key article Division by Invariant Integers using Multiplication. The rest of the post is a review of the results obtained in the article cited. I will use 32-bit arithmetic, but the results can be generalized for any bitness. The multiplication constant and postshift can be found using the following procedure: procedure GetParams(Divisor: LongWord); var L, N: LongWord; Tmp: LongWord; M: UInt64; Mult: LongWord; begin Assert(Divisor > 1); Tmp:= Divisor; L:= 0; repeat // count number of significant bits in Divisor Tmp:= Tmp shr 1; Inc(L); until Tmp = 0; N:= 1 shl L; // N = 2^L M:=$100000000;   // 2^32
M:= (M * (N - Divisor)) div Divisor + 1;
Mult:= LongWord(M);
Writeln('Mult  = ' + IntToHex(Mult, 8));
Writeln('Shift = ' + IntToStr(L));
end;


The multiplication constant is not unique. If the constant obtained from the above procedure is odd, it makes sense to increment it by 1 (provided the incremented constant is correct too). The correctness can be checked by

function CheckMult(Divisor, Mult: LongWord): Boolean;
var
L, N: LongWord;
Tmp: LongWord;
P32, MD: UInt64;
Min, Max: UInt64;

begin
P32:= $100000000; // 2^32; MD:= (P32 + Mult) * Divisor; L:= 0; Tmp:= Divisor; repeat // count number of significant bits in Divisor Tmp:= Tmp shr 1; Inc(L); until Tmp = 0; N:= 1 shl L; Min:= P32 * N; Max:= Min + N; Result:= (Min <= MD) and (MD <= Max); end;  The above function gives sufficient condition for correct multiplication constant; it is probably possible that some constant fails the check but still correct, as can be shown by the full search over the dividend range; we will not consider this case. Optimization 1. If the multiplication constant is even, we can shift it right and fit into 32-bit range, thus obtaining shorter division algorithm; this also decrements the postshift. As an example consider division by 641. The GetParams procedure returns Mult = 98F603FF Shift = 10  Since the multiplication constant is odd, we try incremented constant 98F60400. CheckMult shows that it is correct too; the constant ends with 10 zero bits, which is equal to postshift, so we eliminate the postshift: function Div641(Dividend: LongWord): LongWord; const Mult =$663D81;  // = $198F60400 shr 10 asm MOV EDX,Mult MUL EDX MOV EAX,EDX end;  Optimization 2. If the divisor is even, we can preshift both dividend and divisor right, thus decreasing the required precision and obtaining shorter division algorithm again (with the dividend preshift step). For details see the article cited above. # Serial Number System Challenge 6 I stumbled across an interesting link that made me think about a solid serial number system based on strong cryptography. Cryptography discourages systems based on secret algorithms, and relies on open algorithms and secret keys. So let us develop a serial number generation/verification system with the same usability as the one in the linked article but without any secret algorithms. First, our serial numbers should have the form XXXX-XXXX-XXXX-XXXX-XXXX where X – an uppercase english letter A..Z; to prevent user’s confusion let us exclude the letter O which looks like zero, so in the end we have 25 possible letters in 20 positions, that is BigInteger.Pow(25, 20) =$1D6329F1C35CA4BFABB9F561 combinations. Next, we want to work with full bytes; this reduces the possible serial keys to 11 byte-long numbers; also we want to use 2 bytes of serial key as a key checksum; this leaves us with 9 bytes, and we have 9*8 = 72-bit serial keys. That should be strong enough against full keyspace search attack on our system.

Suppose you are a micro-ISV and expecting to sell up to 100 copies of you software; then you need to generate 100 72-bit keys and embed their hashes into the executable (if it will turn out later that you need more copies it is not a problem – just recompile your executable with more keys next time; the same way you can revoke leaked keys – by not including them in the next release).

To derive 72-bit keys I use 128-bit master key and AES encryption algorithm as a pseudorandom function. Note that the 128-bit master key is actually the only secret in the system, everything else is calculated. It is worthwhile to generate the master key, for example, by tossing a coin 128 times.

For hashing I use SHA256 hash function. I also use CRC16 algorithm to calculate the key checksums.

The verification is 2-phase process. First, it converts a serial number in 20-letter format entered by user into a 11-byte serial key, calculates the checksum of the first 9 bytes and compares it with the last 2 bytes (this prevents user from mistyping his serial number). Second, it hashes the 9-byte key and checks that the hash exists in the keyhash table.

And now the challenge. The last TForge release (0.74) includes full source code of console application with the serial number system described above, in the Demos\Challenge subfolder. The key generation code is also included, though it is not used in the application and could be kept secret; the only thing I keep secret is 128-bit master key used.

Build the application with Delphi or Lazarus/Free Pascal.

One of the valid serial numbers is:

AVVH-GJCX-YVWM-EHUE-YMRL

Try to find other valid serial number(s).

# Any CRC algorithm

0

TForge 0.73 is released.

The theme of the release is implementing any CRC algorithm of size up to 64 bits. New TCRC class implements Rocksoft™ Model of CRC algorithms using one-bit-at-a-time CRC calculation; the details will be published on the related project’s wiki page.