DOS ain't dead

Incredibly slow MMX? (Developers)

posted by Laaca, Czech Republic, 02.03.2009, 21:15

Now I tried to write a transparent PutSprite routine with MMX. Surprisingly, it wasn't very hard, but the result is much slower than plain 386 code. When I draw into VRAM it is nearly 70 times slower!

In my test I measured how many PutSprite calls complete in 50 x 55 ms (higher is better). I toggled the CPU_info_mmx variable and drew into either RAM or VRAM:

noMMX into RAM:   8645
MMX into RAM:     5928
noMMX into VRAM: 10210
MMX into VRAM:     152 (!!!)
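
The counting method described above (how many calls fit into a fixed time window) can be sketched in portable C. This is only a model, not the test code behind the numbers: `count_calls`, `bench_fn` and `touch_counter` are hypothetical names, and the standard `clock()` function stands in for the 55 ms timer tick.

```c
#include <time.h>

/* Hypothetical callback type standing in for one PutSprite call. */
typedef void (*bench_fn)(void *ctx);

/* Trivial sample workload: increments the counter that ctx points to. */
static void touch_counter(void *ctx)
{
    (*(unsigned long *)ctx)++;
}

/* Count how many calls of fn complete within window_ms milliseconds.
   clock() reports processor time, which is adequate for a busy loop. */
static unsigned long count_calls(bench_fn fn, void *ctx, unsigned window_ms)
{
    clock_t ticks = (clock_t)((double)window_ms * CLOCKS_PER_SEC / 1000.0);
    clock_t start = clock();
    unsigned long calls = 0;

    while (clock() - start < ticks) {
        fn(ctx);      /* one unit of work, e.g. one sprite blit */
        calls++;
    }
    return calls;
}
```

Calling `count_calls(touch_counter, &n, 2750)` would use the same 50 x 55 ms window as the measurements above. A higher return value means a faster routine.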


Hell, am I doing something fundamentally wrong?

My computer: AMD Athlon (Thunderbird) 1.33 GHz with a GeForce4 MX card
Measured in FreeDOS and in a DOS session under Windows 98; the results are the same.

PROCEDURE PutHCSprite(var Dest,Sprite:VirtualWindow;x,y:LongInt;HideColor:Word);assembler;
 Asm
PUSH ES
  mov  edi,Dest
  mov  ax,ds:[edi+ 0]       {selector of destination...}
  Mov  es,ax                {...into ES}

  mov  esi,Sprite

  mov  ecx,ds:[esi+26]      {Sprite - bytes per line (I use 16bpp modes)}
  mov  eax,y                {Compute the offset of Y-th line}
  mov  ebx,ds:[edi+26]      {Destination - bytes per line into EBX}
  mul  ebx
  Mov  edx,ds:[esi+30]      {Height of sprite into EDX }

  Add  eax,x                {add X position to offset}
  add  eax,x                {and even more because I use 16bpp mode}

  Add  eax,ds:[edi+2]       {Add the Destination base offset to the computed
                             position offset}

  mov  edi,eax              {and store into EDI}

  Mov  esi,ds:[esi+2]       {Sprite selector is always DS and we do not
                             adjust the offset because this is a non-clipping
                             routine}


cmp cpu_info_mmx,0          {do I have an MMX processor?}
jnz @with_mmx               {If yes, jump}


@wo_mmx:  {--------------------------------------------------}
          @wo_mmx_lines:
            push ecx
            push edi
          @wo_mmx_dots:
               Mov ax,[esi]
               Cmp ax,HideColor
                Je @wo_mmx_skip
               Mov es:[edi],ax
          @wo_mmx_skip:
               Add esi,2
               Add edi,2
               sub ecx,2
               jnz @wo_mmx_dots

            pop edi
            pop ecx
            Add edi,ebx
            dec edx
            jnz @wo_mmx_lines
            JMP @Finished

@with_mmx:{--------------------------------------------------}
cmp ecx,8                   {line shorter than 8 bytes?}
jl @wo_mmx                  {If yes, use the plain loop}

{Now we know we are on an MMX processor and the sprite is at least 8 bytes
 (4 pixels) wide}

       {Replicate HideColor into all four words of MM5}
            mov ax,HideColor
            shl eax,16
            mov ax,HideColor
            movd mm5,eax
            movd mm6,eax
            psllq mm5,32
            por mm5,mm6       {combine the high and low halves}
       {...ready in mm5------------------------}

          @mmx_lines:

            push ecx
            push edi
          @mmx_dots:
               movq    mm1,ds:[esi] {4 pixels from sprite}
               movq    mm2,mm1      {keep a copy of the sprite pixels}
               pcmpeqw mm1,mm5      {mask: FFFFh where pixel = HideColor}

               movq    mm3,es:[edi] {4 pixels from destination}
               pand    mm3,mm1      {keep destination where sprite is transparent}
               pandn   mm1,mm2      {keep sprite where it is opaque (NOT mask AND sprite)}
               por     mm1,mm3      {merge both parts}

               movq    es:[edi],mm1 {finished 4 pixels into destination}

               add     esi,8
               add     edi,8
               sub     ecx,8
               jz  @mmx_endline     {ECX zero? So this line is finished}
               cmp     ecx,8
               jge @mmx_dots        {Do I have to process the rest of line?}

       {Now handle the remaining pixels that did not fill a whole MMX register}
          @mmx_rest:
               Mov ax,[esi]
               Cmp ax,HideColor
                Je @mmx_skip
               Mov es:[edi],ax
          @mmx_skip:
               add edi,2
               add esi,2
               sub ecx,2
               jnz @mmx_rest

          @mmx_endline:
               pop edi
               pop ecx
               Add edi,ebx
               dec edx
               jnz @mmx_lines
          EMMS

@Finished:
POP ES
End;
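
The four-pixel select that the MMX loop is meant to perform can be modelled in portable C, which makes it easy to check on any machine. This is only a sketch: `cmpeq16`, `splat16` and `blend4` are hypothetical helper names, and a 64-bit integer stands in for one MMX register holding four 16bpp pixels. Note that the sprite is masked with the inverted mask before the OR. OR-ing the unmasked sprite into the destination would only be correct when HideColor is 0.

```c
#include <stdint.h>

/* Model of PCMPEQW: each 16-bit lane becomes 0xFFFF where a == b, else 0. */
static uint64_t cmpeq16(uint64_t a, uint64_t b)
{
    uint64_t mask = 0;
    for (int lane = 0; lane < 4; lane++) {
        uint64_t la = (a >> (16 * lane)) & 0xFFFFu;
        uint64_t lb = (b >> (16 * lane)) & 0xFFFFu;
        if (la == lb)
            mask |= (uint64_t)0xFFFFu << (16 * lane);
    }
    return mask;
}

/* Replicate one 16-bit value into all four lanes (the role of MM5). */
static uint64_t splat16(uint16_t v)
{
    return (uint64_t)v * 0x0001000100010001ULL;
}

/* Transparent blit of four pixels: keep dest where the sprite pixel
   equals hide, take the sprite pixel everywhere else. */
static uint64_t blend4(uint64_t sprite, uint64_t dest, uint16_t hide)
{
    uint64_t mask = cmpeq16(sprite, splat16(hide));  /* PCMPEQW      */
    return (dest & mask)                             /* PAND         */
         | (sprite & ~mask);                         /* PANDN + POR  */
}
```

For example, with HideColor F81Fh, `blend4(0xF81FABCDF81F1234, 0x4444333322221111, 0xF81F)` gives `0x4444ABCD22221234`: the two transparent lanes keep the destination pixels and the other two take the sprite pixels.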

---
DOS-u-akbar!

 
