Thursday, August 18, 2005

Getting (possibly) 500% speed gain on divisions

I've moved my blog and it's post to my new blog, please go to Getting (possibly) 500% speed gain on divisions on Landman Code.One of my fellow students once made a remark about the old days when a division was much faster if it was written as multiplication.

This means, instead of devising with a 100, you multiply with 0.01 (1/100). Because some programs of mine have a lot of divisions in their core loops, I investigated the difference.

I created the following test program.

program DivVsMult
 
{$APPTYPE CONSOLE} 
 
uses 
SysUtils
Windows
 
var 
Start,Stop, Start2, Stop2, Freq:int64
i : integer
t : real
CpuSpeed : integer
begin 
{ TODO -oUser -cConsole Main : Insert code here } 
{ Cpu Speed fastes cpu = 1 slower => 10 
it's just to determin the number of time to do the loop 
Maxint div CpuSpeed is calculated } 
if ParamCount = 1 then 
CpuSpeed := StrToIntDef(ParamStr(1),1
else 
CpuSpeed := 10
Writeln('Simple Number division:'); 
Writeln('Calculating'); 
QueryPerformanceFrequency(freq); 
QueryPerformanceCounter(Start); 
for i:=0 to MaxInt div CpuSpeed do 
t := i * (1/100); 
QueryPerformanceCounter(Stop); 
Writeln(Format('First Pass Result: %f',[t])); 
{ This is needed because the compiler would optimize, 
and would notice the result of the loop isn't used at all, 
so therefor the result is useless.. so depending on the compiler, it will 
choose what to do with it, this disables that optimization } 
QueryPerformanceCounter(Start2); 
for i:=0 to MaxInt div CpuSpeed do 
t := i * (1/100); 
QueryPerformanceCounter(Stop2); 
Writeln(Format('Second Pass Result: %15.6f',[t])); 
{ This is needed because the compiler would optimize, 
and would notice the result of the loop isn't used at all, 
so therefor the result is useless.. so depending on the compiler, it will 
choose what to do with it, this disables that optimization } 
Writeln('Done, Results:'); 
Writeln(Format('/ 100 Time: %6.4f seconds'+#13#10
'/ 100 Clock: %d ticks'+#13#10
'* 0.01 Time: %6.4f seconds'+#13#10
'* 0.01 Clock: %d ticks',[(Stop-Start) / freq, , (Stop2-Start2)])); 
Writeln
Writeln('Odd Number division:'); 
QueryPerformanceCounter(Start); 
for i:=0 to high(i) div CpuSpeed do 
t := i / 556
QueryPerformanceCounter(Stop); 
Writeln(Format('First Pass Result: %15.6f',[t])); 
{ This is needed because the compiler would optimize, 
and would notice the result of the loop isn't used at all, 
so therefor the result is useless.. so depending on the compiler, it will 
choose what to do with it, this disables that optimization } 
QueryPerformanceCounter(Start2); 
for i:=0 to high(i) div CpuSpeed do 
t := i * (1/556); 
QueryPerformanceCounter(Stop2); 
Writeln(Format('Second Pass Result: %15.6f',[t])); 
Writeln(Format('/ 556 Time: %6.4f seconds'+#13#10
'/ 556 Clock: %d ticks'+#13#10
'* (1/556) Time: %6.4f seconds'+#13#10
'* (1/556) Clock: %d ticks',[(Stop-Start) / freq, (Stop-Start), (Stop2-Start2) / freq, (Stop2-Start2)])); 
Writeln(Format(' (1/556) = %15.14f (approximate)',[1/556])); 
// Readln; 
 
end
 

On an old P3 900Mhz:

Simple Number division:
Calculating
First Pass Result: 2147483,64
Second Pass Result: 2147483,64
Done, Results:
/ 100 Time: 10,3319 seconds
/ 100 Clock: 36983482 ticks
* 0.01 Time: 2,0378 seconds
* 0.01 Clock: 7294251 ticks

Odd Number division:
First Pass Result: 386238,0647482014610000
Second Pass Result: 386238,0647482014610000
Done, Results:
/ 556 Time: 10,0735 seconds
/ 556 Clock: 36058581 ticks
* (1/556) Time: 2,0446 seconds
* (1/556) Clock: 7318775 ticks
(1/556) = 0,0017985611510791 (approximate)

On a new P4 2.3 Ghz:

Simple Number division:
Calculating
First Pass Result: 2147483.64
Second Pass Result: 2147483.64
Done, Results:
/ 100 Time: 4.6227 seconds
/ 100 Clock: 16547055 ticks
* 0.01 Time: 1.0782 seconds
* 0.01 Clock: 3859508 ticks

Odd Number division:
First Pass Result: 386238.064748
Second Pass Result: 386238.064748
Done, Results:
/ 556 Time: 4.5820 seconds
/ 556 Clock: 16401425 ticks
* (1/556) Time: 12.1746 seconds
* (1/556) Clock: 43579366 ticks
(1/556) = 0.00179856115108 (approximate)

The results are variating, on simple numbers like 0.01 the speedup is allways working, but somehow the very complex numbers tend to be slower sometimes.

I use this tip allot when working with percentage.

Automatic Cropping of a Bitmap

I've moved my blog and it's post to my new blog, please go to Automatic Cropping of a Bitmap on Landman Code. I have created a lot of program’s which needed to do some operations on Bitmaps, most of them are pretty simple, but recently I needed some auto cropping due to the fact I could not calculate the width and height without allot of calculation, but the maximum could easily be calculated/estimated, therefor I needed an algorithm for cropping these bitmap automatically.

Everyone who uses Delphi and Bitmaps should have at least heard about scanlines, and in particularly efg’s pages on the topic. In short:

The ScanLine property, new in Delphi 3, allows quick access to individual pixels, but you must know what PixelFormat you're working with before you can access the pixels correctly.

Because I almost always work with 24bit bitmaps, I didn’t adapt the code for other pixel formats, but it should really just be editing the “procedure AutoCropBitmap(InputBitmap, OutputBitmap: TBitmap; iBleeding : Integer; BackColor: TColor);” overload to start the right sub functions for each pixel format.

The current function will always translate it to an 24bit bitmap, which can be a processor and memory heavy job, the advise is therefor directly after creating the bitmap set the PixelFormat to pf24bit.

You might notice the overloads, well thats just one part of me being lazy again, sometimes I haven’t got the time to create an extra variable and assign the property’s etc. The overloads allow the choice of which input and output you’ll like.

unit unBitmapCropping;  
 
interface  
 
uses 
Windows, Graphics, Dialogs, SysUtils, Math, Classes;  
 
const 
PixelCountMax = 32768;  
 
type 
pRGBArray = ^TRGBArray
TRGBArray = array[0..PixelCountMax-1] of TRGBTriple;  
 
procedure AutoCropBitmap(InputBitmap, OutputBitmap: TBitmap; iBleeding : Integer); overload
procedure AutoCropBitmap(InputBitmap, OutputBitmap: TBitmap; iBleeding : Integer; BackColor: TColor); overload
procedure AutoCropBitmap(BitMapToCrop: TBitmap; iBleeding : Integer); overload
procedure AutoCropBitmap(BitMapToCrop: TBitmap; iBleeding : Integer; BackColor: TColor); overload
procedure AutoCropBmp(const sFileName : String; iBleeding : Integer); overload
procedure AutoCropBmp(const sFileName : String; iBleeding : Integer; BackColor: TColor); overload;  
 
implementation  
 
procedure AutoCropBitmap(BitMapToCrop: TBitmap; iBleeding : Integer); 
var bmpTmp : TBitmap
begin 
bmpTmp := TBitmap.Create
try 
AutoCropBitmap(BitMapToCrop,bmpTmp,iBleeding); 
BitMapToCrop.Assign(bmpTmp); 
finally 
bmpTmp.Free
end
end;  
 
procedure AutoCropBitmap(BitMapToCrop: TBitmap; iBleeding : Integer; BackColor: TColor); 
var bmpTmp : TBitmap
begin 
bmpTmp := TBitmap.Create
try 
AutoCropBitmap(BitMapToCrop,bmpTmp,iBleeding, BackColor); 
BitMapToCrop.Assign(bmpTmp); 
finally 
bmpTmp.Free
end
end;  
 
procedure AutoCropBitmap(InputBitmap, OutputBitmap: TBitmap; iBleeding : Integer); 
begin 
AutoCropBitmap(InputBitmap,OutputBitmap, iBleeding, InputBitmap.Canvas.Pixels[0,0]); 
end;  
 
procedure AutoCropBitmap(InputBitmap, OutputBitmap: TBitmap; iBleeding : Integer; BackColor: TColor); 
var Row : pRGBArray
MyTop, MyBottom, MyLeft
i, j, MyRight : Integer
begin 
MyTop := InputBitmap.Height
MyLeft := InputBitmap.Width
MyBottom := 0
MyRight := 0
InputBitmap.PixelFormat := pf24bit
OutputBitmap.PixelFormat := pf24Bit
{ Find Top } 
for j := 0 to InputBitmap.Height-1 do 
begin 
if j > MyTop then 
Break
Row := pRGBArray(InputBitmap.Scanline[j]); 
for i:= InputBitmap.Width - 1 downto 0 do 
if ((Row[i].rgbtRed <> GetRvalue(BackColor)) or 
(Row[i].rgbtGreen <> GetGvalue(BackColor)) or 
(Row[i].rgbtBlue <> GetBvalue(BackColor))) then 
begin 
MyTop := j
Break
end
end
if MyTop = InputBitmap.Height then 
{ Empty Bitmap } 
MyTop := 0;  
 
{ Find Bottom } 
for j := InputBitmap.Height-1 Downto MyTop do 
begin 
if (j + 1) < MyBottom then 
Break
Row := pRGBArray(InputBitmap.Scanline[j]); 
for i:= InputBitmap.Width - 1 downto 0 do 
if ((Row[i].rgbtRed <> GetRvalue(BackColor)) or 
(Row[i].rgbtGreen <> GetGvalue(BackColor)) or 
(Row[i].rgbtBlue <> GetBvalue(BackColor))) then 
begin 
MyBottom := j+1
Break
end
end;  
 
{ Find Left } 
for j := MyTop to MyBottom-1 do 
begin 
Row := pRGBArray(InputBitmap.Scanline[j]); 
for i:= 0 to MyLeft-1 do 
if ((Row[i].rgbtRed <> GetRvalue(BackColor)) or 
(Row[i].rgbtGreen <> GetGvalue(BackColor)) or 
(Row[i].rgbtBlue <> GetBvalue(BackColor))) then 
begin 
MyLeft := i
Break
end
end
if MyLeft = InputBitmap.Width then 
{ Empty Bitmap } 
MyLeft := 0;  
 
{ Find Right } 
for j := MyTop to MyBottom -1 do 
begin 
Row := pRGBArray(InputBitmap.Scanline[j]); 
for i:= InputBitmap.Width-1 downto MyRight do 
if ((Row[i].rgbtRed <> GetRvalue(BackColor)) or 
(Row[i].rgbtGreen <> GetGvalue(BackColor)) or 
(Row[i].rgbtBlue <> GetBvalue(BackColor))) then 
begin 
MyRight := i+1
Break
end
end
if (MyRight = 0) or (MyBottom = 0) then 
{ Empty Bitmap } 
iBleeding := 0
 
OutputBitmap.Width := MyRight - MyLeft + (iBleeding * 2); 
OutputBitmap.Height := MyBottom - MyTop + (iBleeding * 2); 
OutputBitmap.Canvas.Brush.Color := BackColor
OutputBitmap.Canvas.FillRect(Rect(0,0,OutputBitmap.Width,OutputBitmap.Height));  
 
BitBlt
(OutputBitmap.canvas.Handle, -MyLeft + iBleeding
-MyTop + iBleeding,MyLeft + MyRight,MyTop + MyBottom
InputBitmap.Canvas.Handle, 0, 0, SRCCOPY); 
end;  
 
procedure AutoCropBmp(const sFileName : String; iBleeding : Integer); 
var InputBitmap, OutputBitmap : TBitmap
begin 
if not FileExists(sFileName) then 
raise Exception.Create('File doesn''s exists.'); 
InputBitmap := TBitmap.Create
OutputBitmap := TBitmap.Create
try 
InputBitmap.LoadFromFile(sFileName); 
OutputBitmap.PixelFormat := InputBitmap.PixelFormat
AutoCropBitmap(InputBitmap, OutputBitmap,iBleeding); 
OutputBitmap.SaveToFile(sFileName); 
finally 
OutputBitmap.Free
InputBitmap.Free
end
end;  
 
procedure AutoCropBmp(const sFileName : String; iBleeding : Integer; BackColor: TColor); 
var InputBitmap, OutputBitmap : TBitmap
begin 
if not FileExists(sFileName) then 
raise Exception.Create('File doesn''s exists.'); 
InputBitmap := TBitmap.Create
OutputBitmap := TBitmap.Create
try 
InputBitmap.LoadFromFile(sFileName); 
OutputBitmap.PixelFormat := InputBitmap.PixelFormat
AutoCropBitmap(InputBitmap, OutputBitmap,iBleeding, BackColor); 
OutputBitmap.SaveToFile(sFileName); 
finally 
OutputBitmap.Free
InputBitmap.Free
end
end
 
end.  
 
 

I started off with an example found on efg’s site, and started optimizing it’s algorithm:

  • Dismissed the Temp variables and the counter variable.
  • Making the loops downto 0 as many as possible (this will make the loop slightly faster).
  • Making sure no extra round on the loop is used (adding breaks).
  • Decreasing the number of core operations (removing if’s)

With this I created a >400% speed gain.

Writing this article, I got the following ideas for a little more speedup:

  • Storing the GetXvalue results, as to decrease the recalculation in each for loop.
  • Maybe running loops for 0 till the end will make it faster because of better page alignment

But I will try that out an other time.

Tuesday, August 16, 2005

CPU and Form Friendly (Long) Sleep

I've moved my blog and it's post to my new blog, please go to CPU and Form Friendly (Long) Sleep on Landman Code, Because a Sleep(1000) will seriously freeze your form for a second, the you’ll see that the solution will often be using a Application.ProcessMessages loop until the second has passed, but the problem with that loop is, it will create 100% cpu usage.

Let’s say your waiting for a response from a printer your controlling, than the 100% usage might slow down the complete process, not to mention that on a laptop you’ll be seriously eating the battery.

The following source offers a nice solution to this problem.

procedure TForm.LongDelay(DelayMs : Cardinal);
var StopTime : Cardinal;
begin
StopTime := GetTickCount + DelayMs;
while (GetTickCount < StopTime) do
begin
Application.ProcessMessages;
Sleep(1);
end;
end;

It’s pretty straight forward, offcourse when using a basic windows function, you should check out MSDN for the arguments and remarks. There was one thing that was interresting.

A value of zero causes the thread to relinquish the remainder of its time slice to any other thread of equal priority that is ready to run. If there are no other threads of equal priority ready to run, the function returns immediately, and the thread continues execution.

This fixes one problem, you will only use the cpu when it’s idle. But that still makes this a battery eater on laptops.

I hope you liked this first post, this was just a minor subject.. But I got to start somewhere.

Monday, August 15, 2005

Welcome

I've moved my blog and it's post to my new blog, please go to my new blog called Landman Code. Welcome to my Blog,

you’ll probably think.. wow, yet another useless blog. You’re probably right.

But I will try to explain why I still created this blog. As a student, I travel a lot, therefore I bought a laptop (IBM Think Pad T22). I think many programmers will recognize that once you’re programming, you will always miss just that one piece of code you know you have on the other computer. Seeing the fact that a good programmer is a lazy programmer, you will too not like to rewrite that piece of code (also known as a snippet).

I find myself having two solutions to that problem.

  • Logging in to the other computer trough VNC
  • Post it on the Internet

Always running the laptop and the other computer is not a nice option, therefore I choose the Internet. At first I used an simple html page, but I figured, a blog has a few extras which allow the discussion of my frequently used snippets.

The focus of this weblog will be on the following subjects:

  • Optimizing
  • String operations
  • Image Manipulation
  • Windows API
  • Extended Dialogs
  • File In/Output
  • Database
  • Office Automation

The programming language will be Delphi, and all thought I’m Dutch, I will (try to) keep this blog in English.