Incredibly slow MMX? (Developers)
I have now tried to write a transparent PutSprite routine. Surprisingly, it wasn't that hard, but the result is much slower than plain 386 code. When I draw into VRAM it is even roughly 70 times slower!
In my test I measured how many PutSprite calls complete in 50 x 55 ms (about 2.75 s), so a higher count means a faster routine. I toggled the CPU_info_mmx variable and drew into either RAM or VRAM:
noMMX into RAM: 8645
MMX into RAM: 5928
noMMX into VRAM: 10210
MMX into VRAM: 152 (!!!)
Hell, am I doing something fundamentally wrong?
My computer: AMD Athlon Thunderbird 1.33 GHz with a GeForce4 MX card
Measured in FreeDOS and in a DOS session under Windows 98; the results are the same.
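To put the four counts in perspective, the ratios can be worked out directly. The sketch below is in C rather than Pascal only so that it compiles anywhere; `slowdown` is a name made up for illustration, and its inputs are the call counts listed above.

```c
/* How many times slower the second routine is than the first, given
   the number of calls each one completed in the same time window
   (more calls = faster routine). */
double slowdown(unsigned baseline_calls, unsigned other_calls)
{
    return (double)baseline_calls / (double)other_calls;
}
```

With the numbers above, `slowdown(8645, 5928)` comes to about 1.46 for RAM and `slowdown(10210, 152)` to about 67 for VRAM.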
PROCEDURE PutHCSprite(var Dest,Sprite:VirtualWindow;x,y:LongInt;HideColor:Word);assembler;
Asm
PUSH ES
mov edi,Dest
mov ax,ds:[edi+ 0] {selector of destination...}
Mov es,ax {...into ES}
mov esi,Sprite
mov ecx,ds:[esi+26] {Sprite - bytes per line (I use 16bpp modes)}
mov eax,y {Compute the offset of Y-th line}
mov ebx,ds:[edi+26] {Destination - bytes per line into EBX}
mul ebx
Mov edx,ds:[esi+30] {Height of sprite into EDX }
Add eax,x {add X position to offset}
add eax,x {add X again: 2 bytes per pixel in 16bpp mode}
Add eax,ds:[edi+2] {Add the destination base offset to the computed
position offset}
mov edi,eax {and store into EDI}
Mov esi,ds:[esi+2] {Sprite selector is always DS, and no offset is
computed because this is a non-clipping
routine}
cmp cpu_info_mmx,0 {do I have a MMX processor?}
jnz @with_mmx {If yes, jump}
@wo_mmx: {--------------------------------------------------}
@wo_mmx_lines:
push ecx
push edi
@wo_mmx_dots:
Mov ax,[esi]
Cmp ax,HideColor
Je @wo_mmx_skip
Mov es:[edi],ax
@wo_mmx_skip:
Add esi,2
Add edi,2
sub ecx,2
jnz @wo_mmx_dots
pop edi
pop ecx
Add edi,ebx
dec edx
jnz @wo_mmx_lines
JMP @Finished
@with_mmx:{--------------------------------------------------}
cmp ecx,8 {line too short for one MMX group?}
jl @wo_mmx {If yes, jump}
{Now we know we are on an MMX processor and the sprite is at least 8 bytes (4 pixels)
wide}
{Put the HideColor word into all four words of MM5}
mov ax,HideColor
shl eax,16
mov ax,HideColor
movd mm5,eax
movd mm6,eax
psllq mm5,32
paddusw mm5,mm6
{...ready in mm5------------------------}
@mmx_lines:
push ecx
push edi
@mmx_dots:
movq mm1,ds:[esi] {4 pixels from sprite}
movq mm2,mm1 {keep a copy of the sprite pixels}
pcmpeqw mm1,mm5 {mask: FFFFh where sprite = HideColor}
movq mm3,es:[edi] {4 pixels from destination}
pand mm3,mm1 {keep destination only where sprite is transparent}
pandn mm1,mm2 {keep sprite only where it is opaque}
por mm1,mm3 {merge both halves}
movq es:[edi],mm1 {finished 4 pixels into destination}
add esi,8
add edi,8
sub ecx,8
jz @mmx_endline {ECX zero? Then this line is finished}
cmp ecx,8
jge @mmx_dots {Another full group of 4 pixels left?}
{Now handle the remaining pixels that do not fill an MMX register}
@mmx_rest:
Mov ax,[esi]
Cmp ax,HideColor
Je @mmx_skip
Mov es:[edi],ax
@mmx_skip:
add edi,2
add esi,2
sub ecx,2
jnz @mmx_rest
@mmx_endline:
pop edi
pop ecx
Add edi,ebx
dec edx
jnz @mmx_lines
EMMS
@Finished:
POP ES
End;
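A portable reference model can help check what the routine should produce before worrying about its speed. The following is only a sketch in C, not the routine itself: all function names are hypothetical, `blend4` imitates the PCMPEQW mask with plain 64-bit integer operations, and the selection is assumed to be (dest AND mask) OR (sprite AND NOT mask), so transparent pixels leave the destination untouched for any HideColor. Pitches are in bytes, as in the Pascal code.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Replicate a 16-bit colour into all four 16-bit lanes of a 64-bit word. */
uint64_t broadcast16(uint16_t c)
{
    return (uint64_t)c * 0x0001000100010001ULL;
}

/* Per-lane equality mask: 0xFFFF in every lane where a and b match,
   0 elsewhere (the result PCMPEQW would give). */
uint64_t cmpeq16(uint64_t a, uint64_t b)
{
    uint64_t mask = 0;
    for (int lane = 0; lane < 4; ++lane) {
        uint16_t x = (uint16_t)(a >> (16 * lane));
        uint16_t y = (uint16_t)(b >> (16 * lane));
        if (x == y)
            mask |= 0xFFFFULL << (16 * lane);
    }
    return mask;
}

/* Blend four sprite pixels over four destination pixels. */
uint64_t blend4(uint64_t sprite, uint64_t dest, uint16_t hide)
{
    uint64_t mask = cmpeq16(sprite, broadcast16(hide));
    return (dest & mask) | (sprite & ~mask);
}

/* Non-clipping reference blit for 16bpp surfaces. */
void put_sprite_ref(uint8_t *dest, size_t dest_pitch,
                    const uint8_t *sprite, size_t sprite_pitch,
                    size_t height, size_t x, size_t y, uint16_t hide)
{
    /* Start address: y-th line plus 2 bytes per pixel for x. */
    uint8_t *row = dest + y * dest_pitch + 2 * x;

    for (size_t line = 0; line < height; ++line) {
        size_t i = 0;
        /* Groups of four pixels (8 bytes), like the MMX loop. */
        for (; i + 8 <= sprite_pitch; i += 8) {
            uint64_t s, d;
            memcpy(&s, sprite + i, 8);
            memcpy(&d, row + i, 8);
            d = blend4(s, d, hide);
            memcpy(row + i, &d, 8);
        }
        /* Remaining pixels, one at a time. */
        for (; i + 2 <= sprite_pitch; i += 2) {
            uint16_t s;
            memcpy(&s, sprite + i, 2);
            if (s != hide)
                memcpy(row + i, &s, 2);
        }
        sprite += sprite_pitch;
        row += dest_pitch;
    }
}
```

Running the same sprite through `put_sprite_ref` and through the assembly routine into a RAM buffer, then comparing the two buffers byte by byte, is one way to confirm that both paths draw identical pixels.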
---
DOS-u-akbar!
Complete thread:
- Incredibly slow MMX? - Laaca, 02.03.2009, 21:15 (Developers)