Possibly More Optimized Version of the Unsigned 16x16=32 Library Multiplication Function


01/13/2010:  The original Cosmic library source code is here.  (This is protected because it is part of the Cosmic licensed library source code.  If you require the code, please contact Mike Burns at Cosmic.)

The proposed optimized version is below.  The copyright notice below doesn't apply (the code was written by me).  The code incorporates removal of an unnecessary TNZ instruction (mentioned in an e-mail to Cosmic as an enhancement to the code I originally sent).

;       LONG MULTIPLY 16 x 16 -> 32
;       Copyright (c) 2002 by COSMIC Software
;       - 1st operand in X
;       - 2nd operand in Y
;       - result in long accumulator
;
        xdef d_umul
        xref.b c_x, c_y, c_lreg
        .dcall "4,0,d_umul"
;
d_umul:
        push    a              ; Preserve A
        ldw     c_x,x          ; Save X in c_x.  This is argument 1 (H1:L1) in the comments below.
        ldw     c_y,y          ; Save Y in c_y.  This is argument 2 (H2:L2) in the comments below.
        clr     c_lreg         ; Clear the upper two result bytes in case the
        clr     c_lreg+1       ; branches below are taken.
;
;================================================================================
;Calculate and store L1 * L2.
;================================================================================
        ld      a,c_y+1        ; XL = L1, A = L2
        mul     x,a            ; L1 * L2
        ldw     c_lreg+2,x     ; Save result in the lower position.
;
;================================================================================
;Calculate and store L1 * H2. Note that one can show algebraically that the
;sum can't exceed 2^24-1, so there is provably no carry to propagate (which is
;why it is ignored): L1 * H2 + (L1 * L2) / 256 <= 255 * 255 + 254 = 65279,
;which fits in the two middle bytes.
;================================================================================
        ld      a,yh           ; A now contains H2.  TNZ required because a
                               ; register-to-register move doesn't set the Z flag.
        tnz     a
        jreq    next           ; Skip to the next calculation if H2 is zero.
        ldw     x,c_x          ; XL now contains L1
        mul     x,a            ; L1 * H2
        addw    x,c_lreg+1     ; Add it to the middle two bytes of the result.
        ldw     c_lreg+1,x     ; Store it back. Provably no carry possible.
;================================================================================
;Calculate and store H1 * L2. There can be a carry into the most significant
;byte in this case.
;================================================================================
next:   ld      a,c_x          ; A now contains H1.  TNZ not required because
                               ; a memory-to-register move sets the Z flag.
        jreq    fini           ; Skip to the end if H1 is zero.
        mul     y,a            ; H1 * L2
        addw    y,c_lreg+1     ; Add in the middle two bytes. There might be a carry.
        ldw     c_lreg+1,y     ; Store the result back.
        bccm    c_lreg,#0      ; Propagate the carry: BCCM copies the carry flag
                               ; into bit 0 of c_lreg.  I've never actually used
                               ; this instruction, so I hope I'm understanding it
                               ; right.  c_lreg is still zero at this point, so
                               ; the result MSB is now either 0 or 1.  This is a
                               ; very special case of carry propagation.
;================================================================================
;Calculate and store H1 * H2. There can't be a carry out.
;================================================================================
        ldw     x,c_x          ; X now contains operand 1.
        swapw   x              ; XL now contains H1.
        ld      a,c_y          ; A now contains H2.
        mul     x,a            ; H1 * H2
        addw    x,c_lreg       ; Add in the most significant bytes.
        ldw     c_lreg,x       ; Store the result back.
fini:
        pop     a              ; Restore A register
        retf                   ; ... and return
;
        end
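
The arithmetic in the listing above can be cross-checked on a PC before trusting it on the target. What follows is only a sketch, not part of the Cosmic library or of the materials sent to Cosmic: the names umul16_model and umul16_count_mismatches are made up for illustration, and the model assumes that BCCM leaves the result MSB at 0 or 1 as described in the comments. It mirrors the four partial products byte by byte and asserts the two "no carry" claims.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical C model of the byte-wise scheme in the listing above.
 * lreg[0] is the most significant byte, mirroring c_lreg. */
uint32_t umul16_model(uint16_t x, uint16_t y)
{
    const uint8_t h1 = (uint8_t)(x >> 8), l1 = (uint8_t)(x & 0xFFu);
    const uint8_t h2 = (uint8_t)(y >> 8), l2 = (uint8_t)(y & 0xFFu);
    uint8_t lreg[4] = {0, 0, 0, 0};
    uint32_t t;

    /* L1 * L2 goes into the lower two bytes. */
    t = (uint32_t)l1 * l2;
    lreg[2] = (uint8_t)(t >> 8);
    lreg[3] = (uint8_t)(t & 0xFFu);

    if (h2 != 0) {
        /* L1 * H2 is added to the middle two bytes. */
        t = (uint32_t)l1 * h2 + (((uint32_t)lreg[1] << 8) | lreg[2]);
        assert(t <= 0xFFFFu);              /* the "provably no carry" claim */
        lreg[1] = (uint8_t)(t >> 8);
        lreg[2] = (uint8_t)(t & 0xFFu);
    }

    if (h1 != 0) {
        /* H1 * L2 is added to the middle two bytes; a carry is possible. */
        t = (uint32_t)h1 * l2 + (((uint32_t)lreg[1] << 8) | lreg[2]);
        lreg[1] = (uint8_t)((t >> 8) & 0xFFu);
        lreg[2] = (uint8_t)(t & 0xFFu);
        lreg[0] = (uint8_t)(t >> 16);      /* 0 or 1: the effect assumed for BCCM */

        /* H1 * H2 is added to the upper two bytes. */
        t = (uint32_t)h1 * h2 + (((uint32_t)lreg[0] << 8) | lreg[1]);
        assert(t <= 0xFFFFu);              /* no carry out of the result */
        lreg[0] = (uint8_t)(t >> 8);
        lreg[1] = (uint8_t)(t & 0xFFu);
    }

    return ((uint32_t)lreg[0] << 24) | ((uint32_t)lreg[1] << 16) |
           ((uint32_t)lreg[2] << 8) | (uint32_t)lreg[3];
}

/* Count disagreements with the native 32-bit product over a strided sweep
 * of both operands.  A step of 1 is the exhaustive comparison. */
unsigned long umul16_count_mismatches(uint32_t step)
{
    unsigned long bad = 0;
    uint32_t x, y;

    if (step == 0)
        step = 1;
    for (x = 0; x <= 0xFFFFu; x += step)
        for (y = 0; y <= 0xFFFFu; y += step)
            if (umul16_model((uint16_t)x, (uint16_t)y) != x * y)
                bad++;
    return bad;
}
```

Calling umul16_count_mismatches(1) runs the exhaustive comparison over all 2^32 operand pairs; a step of 255 gives a quick sweep that still includes 0x00FF and 0xFFFF. This checks only the arithmetic scheme. It says nothing about the flag behavior of the STM8 instructions themselves, which still has to be confirmed on the target or in the simulator.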

This is the set of materials sent to Cosmic.  I'm waiting on feedback from Cosmic about the proposed optimization.


This web page is maintained by webmaster@dtashley.com.