Optimized CalcAngle (atan2 function)

Discussion in 'Engineering & Reverse Engineering' started by Devon, Sep 13, 2022.

  1. Devon

    Some years ago, there was an attempt to optimize this function by using logarithmic division instead. Unfortunately, it had some glaring issues that prevented it from working correctly, most notably that the routine it was based on only supports 8-bit input values, whereas CalcAngle in Sonic 1/2/3K takes 16-bit input values.
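
    For anyone unfamiliar with the trick, logarithmic division replaces a divide with a subtraction of two table lookups. A minimal Python sketch of the idea follows. The scale of 32 units per doubling and the names LOG2 and approx_ratio are assumptions made for illustration; they are not the values behind the tables posted in this thread.

```python
import math

SCALE = 32  # assumed table resolution: 32 units per doubling

# 8-bit log2 lookup table; entry 0 is a placeholder since log2(0) is undefined
LOG2 = [0] + [min(255, round(math.log2(n) * SCALE)) for n in range(1, 256)]

def approx_ratio(small, large):
    """Approximate small / large with one subtraction of two table lookups."""
    diff = LOG2[small] - LOG2[large]   # log2(small / large), scaled by SCALE
    return 2.0 ** (diff / SCALE)

print(approx_ratio(64, 128))   # powers of two come out exact
print(approx_ratio(3, 7))      # close to 3/7, limited by table rounding
```

    In an atan2 routine the final power of two is never computed: the log difference is used directly as the index into a second table that holds the angle for that ratio.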

    Recently, I decided to take my own crack at it, and came up with this:
    Code (Text):
    ; -------------------------------------------------------------------------
    ; 2-argument arctangent (angle between (0,0) and (x,y))
    ; Based on http://codebase64.org/doku.php?id=base:8bit_atan2_8-bit_angle
    ; -------------------------------------------------------------------------
    ; PARAMETERS:
    ;       d1.w - X value
    ;       d2.w - Y value
    ; RETURNS:
    ;       d0.b - 2-argument arctangent value (angle between (0,0) and (x,y))
    ; CLOBBERS:
    ;       d1-d3/a2 (not saved)
    ; -------------------------------------------------------------------------

    CalcAngle:
            moveq   #0,d0                           ; Default to bottom right quadrant
            tst.w   d1                              ; Is the X value negative?
            beq.s   CalcAngle_XZero                 ; If the X value is zero, branch
            bpl.s   CalcAngle_CheckY                ; If not, branch
            neg.w   d1                              ; If so, get the absolute value
            moveq   #4,d0                           ; Shift to left quadrant

    CalcAngle_CheckY:
            tst.w   d2                              ; Is the Y value negative?
            beq.s   CalcAngle_YZero                 ; If the Y value is zero, branch
            bpl.s   CalcAngle_CheckOctet            ; If not, branch
            neg.w   d2                              ; If so, get the absolute value
            addq.b  #2,d0                           ; Shift to top quadrant

    CalcAngle_CheckOctet:
            cmp.w   d2,d1                           ; Is |X| smaller than |Y|?
            bcc.s   CalcAngle_Divide                ; If not, branch
            exg.l   d1,d2                           ; If so, swap them so that d1 holds the larger value
            addq.b  #1,d0                           ; Use octant that's horizontally closer to the center

    CalcAngle_Divide:
            move.w  d1,-(sp)                        ; Shrink X and Y down into bytes
            moveq   #0,d3
            move.b  (sp)+,d3                        ; d3 = high byte of the larger value
            move.b  WordShiftTable(pc,d3.w),d3      ; Get the number of shifts needed to fit it in a byte
            lsr.w   d3,d1                           ; Apply the same shift to both values
            lsr.w   d3,d2

            lea     Log2Table(pc),a2                ; Perform logarithmic division
            move.b  (a2,d2.w),d2                    ; log2(smaller)
            sub.b   (a2,d1.w),d2                    ; log2(smaller) - log2(larger)
            bne.s   CalcAngle_GetAtan2Val
            move.w  #$FF,d2                         ; Edge case where X and Y values are too close for the division to handle

    CalcAngle_GetAtan2Val:
            lea     Atan2Table(pc),a2               ; Get atan2 value
            move.b  (a2,d2.w),d2
            move.b  OctantAdjust(pc,d0.w),d0        ; Mirror/offset the angle into the right octant
            eor.b   d2,d0
            rts

    ; -------------------------------------------------------------------------

    CalcAngle_YZero:
            tst.b   d0                              ; Was the X value negated?
            beq.s   CalcAngle_End                   ; If not, branch (d0 is already 0, so no need to set it again on branch)
            moveq   #$FFFFFF80,d0                   ; 180 degrees

    CalcAngle_End:
            rts

    CalcAngle_XZero:
            tst.w   d2                              ; Is the Y value negative?
            bmi.s   CalcAngle_XZeroYNeg             ; If so, branch
            moveq   #$40,d0                         ; 90 degrees
            rts

    CalcAngle_XZeroYNeg:
            moveq   #$FFFFFFC0,d0                   ; 270 degrees
            rts

    ; -------------------------------------------------------------------------

    OctantAdjust:
            dc.b    %00000000                       ; +X, +Y, |X|>|Y|
            dc.b    %00111111                       ; +X, +Y, |X|<|Y|
            dc.b    %11111111                       ; +X, -Y, |X|>|Y|
            dc.b    %11000000                       ; +X, -Y, |X|<|Y|
            dc.b    %01111111                       ; -X, +Y, |X|>|Y|
            dc.b    %01000000                       ; -X, +Y, |X|<|Y|
            dc.b    %10000000                       ; -X, -Y, |X|>|Y|
            dc.b    %10111111                       ; -X, -Y, |X|<|Y|

    WordShiftTable:
            dc.b    $00, $01, $02, $02, $03, $03, $03, $03
            dc.b    $04, $04, $04, $04, $04, $04, $04, $04
            dc.b    $05, $05, $05, $05, $05, $05, $05, $05
            dc.b    $05, $05, $05, $05, $05, $05, $05, $05
            dc.b    $06, $06, $06, $06, $06, $06, $06, $06
            dc.b    $06, $06, $06, $06, $06, $06, $06, $06
            dc.b    $06, $06, $06, $06, $06, $06, $06, $06
            dc.b    $06, $06, $06, $06, $06, $06, $06, $06
            dc.b    $07, $07, $07, $07, $07, $07, $07, $07
            dc.b    $07, $07, $07, $07, $07, $07, $07, $07
            dc.b    $07, $07, $07, $07, $07, $07, $07, $07
            dc.b    $07, $07, $07, $07, $07, $07, $07, $07
            dc.b    $07, $07, $07, $07, $07, $07, $07, $07
            dc.b    $07, $07, $07, $07, $07, $07, $07, $07
            dc.b    $07, $07, $07, $07, $07, $07, $07, $07
            dc.b    $07, $07, $07, $07, $07, $07, $07, $07

    Log2Table:
            dc.b    $00, $00, $1F, $32, $3F, $49, $52, $59
            dc.b    $5F, $64, $69, $6E, $72, $75, $79, $7C
            dc.b    $7F, $82, $84, $87, $89, $8C, $8E, $90
            dc.b    $92, $94, $95, $97, $99, $9A, $9C, $9E
            dc.b    $9F, $A0, $A2, $A3, $A4, $A6, $A7, $A8
            dc.b    $A9, $AA, $AC, $AD, $AE, $AF, $B0, $B1
            dc.b    $B2, $B3, $B4, $B5, $B5, $B6, $B7, $B8
            dc.b    $B9, $BA, $BA, $BB, $BC, $BD, $BE, $BE
            dc.b    $BF, $C0, $C0, $C1, $C2, $C2, $C3, $C4
            dc.b    $C4, $C5, $C6, $C6, $C7, $C8, $C8, $C9
            dc.b    $C9, $CA, $CA, $CB, $CC, $CC, $CD, $CD
            dc.b    $CE, $CE, $CF, $CF, $D0, $D0, $D1, $D1
            dc.b    $D2, $D2, $D3, $D3, $D4, $D4, $D5, $D5
            dc.b    $D5, $D6, $D6, $D7, $D7, $D8, $D8, $D8
            dc.b    $D9, $D9, $DA, $DA, $DA, $DB, $DB, $DC
            dc.b    $DC, $DC, $DD, $DD, $DE, $DE, $DE, $DF
            dc.b    $DF, $DF, $E0, $E0, $E0, $E1, $E1, $E1
            dc.b    $E2, $E2, $E2, $E3, $E3, $E3, $E4, $E4
            dc.b    $E4, $E5, $E5, $E5, $E6, $E6, $E6, $E7
            dc.b    $E7, $E7, $E8, $E8, $E8, $E8, $E9, $E9
            dc.b    $E9, $EA, $EA, $EA, $EA, $EB, $EB, $EB
            dc.b    $EC, $EC, $EC, $EC, $ED, $ED, $ED, $ED
            dc.b    $EE, $EE, $EE, $EE, $EF, $EF, $EF, $F0
            dc.b    $F0, $F0, $F0, $F1, $F1, $F1, $F1, $F1
            dc.b    $F2, $F2, $F2, $F2, $F3, $F3, $F3, $F3
            dc.b    $F4, $F4, $F4, $F4, $F5, $F5, $F5, $F5
            dc.b    $F5, $F6, $F6, $F6, $F6, $F7, $F7, $F7
            dc.b    $F7, $F7, $F8, $F8, $F8, $F8, $F8, $F9
            dc.b    $F9, $F9, $F9, $F9, $FA, $FA, $FA, $FA
            dc.b    $FA, $FB, $FB, $FB, $FB, $FB, $FC, $FC
            dc.b    $FC, $FC, $FC, $FD, $FD, $FD, $FD, $FD
            dc.b    $FE, $FE, $FE, $FE, $FE, $FE, $FF, $FF

    Atan2Table:
            dc.b    $00, $00, $00, $00, $00, $00, $00, $00
            dc.b    $00, $00, $00, $00, $00, $00, $00, $00
            dc.b    $00, $00, $00, $00, $00, $00, $00, $00
            dc.b    $00, $00, $00, $00, $00, $00, $00, $00
            dc.b    $00, $00, $00, $00, $00, $00, $00, $00
            dc.b    $00, $00, $00, $00, $00, $00, $00, $00
            dc.b    $00, $00, $00, $00, $00, $00, $01, $01
            dc.b    $01, $01, $01, $01, $01, $01, $01, $01
            dc.b    $01, $01, $01, $01, $01, $01, $01, $01
            dc.b    $01, $01, $01, $01, $01, $01, $01, $01
            dc.b    $01, $01, $01, $01, $01, $01, $01, $01
            dc.b    $01, $01, $01, $01, $01, $01, $01, $01
            dc.b    $01, $01, $01, $01, $01, $01, $01, $01
            dc.b    $01, $02, $02, $02, $02, $02, $02, $02
            dc.b    $02, $02, $02, $02, $02, $02, $02, $02
            dc.b    $02, $02, $02, $02, $02, $02, $02, $02
            dc.b    $03, $03, $03, $03, $03, $03, $03, $03
            dc.b    $03, $03, $03, $03, $03, $03, $03, $03
            dc.b    $04, $04, $04, $04, $04, $04, $04, $04
            dc.b    $04, $04, $04, $05, $05, $05, $05, $05
            dc.b    $05, $05, $05, $05, $05, $06, $06, $06
            dc.b    $06, $06, $06, $06, $06, $07, $07, $07
            dc.b    $07, $07, $07, $08, $08, $08, $08, $08
            dc.b    $08, $09, $09, $09, $09, $09, $09, $0A
            dc.b    $0A, $0A, $0A, $0B, $0B, $0B, $0B, $0B
            dc.b    $0C, $0C, $0C, $0C, $0D, $0D, $0D, $0D
            dc.b    $0E, $0E, $0E, $0F, $0F, $0F, $0F, $10
            dc.b    $10, $10, $11, $11, $11, $12, $12, $12
            dc.b    $13, $13, $13, $14, $14, $14, $15, $15
            dc.b    $16, $16, $16, $17, $17, $17, $18, $18
            dc.b    $19, $19, $1A, $1A, $1A, $1B, $1B, $1C
            dc.b    $1C, $1C, $1D, $1D, $1E, $1E, $1F, $1F

    To get around the input limitation, I take the larger of the 2 values (which is needed anyway to do the atan2 calculation properly), determine how many bit shifts it takes to fit it into a byte, and then apply that number of shifts to both X and Y (hence the introduction of WordShiftTable). I'm also using my own generated log2 and atan2 tables, and the routine takes into account some edge cases at the edges of octants. Yes, it also doesn't save d3 or a2 on the stack. In Sonic 1, I found that's not really necessary. I dunno about Sonic 2 or 3K, but that shouldn't be hard to get around anyway.
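
    The shift count can be pictured as the bit length of the larger value's high byte. The Python sketch below is only an illustration of that step; shift_for and normalise are made-up names, and the function reproduces the WordShiftTable entries rather than replacing the table lookup.

```python
def shift_for(larger):
    """Right shifts needed to fit a 16-bit magnitude (0..$7FFF) into a byte."""
    return (larger >> 8).bit_length()   # same values as WordShiftTable[high byte]

def normalise(x, y):
    """Scale |x| and |y| by the same power of two so both fit in 0..255."""
    ax, ay = abs(x), abs(y)
    shift = shift_for(max(ax, ay))
    return ax >> shift, ay >> shift

print(normalise(0x1234, -0x0100))   # both values shifted right by 5
```

    Because both values are shifted by the same amount, their ratio (and so the angle) is preserved apart from the low bits that get shifted out.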

    If there are any further optimizations that can be made, or any issues that I somehow missed, I'd like to hear about them. Currently, the worst case for this function is about 246 cycles, whereas the best case in the original (excluding when both X and Y are 0) is around 272 cycles.
     
    Last edited: Dec 7, 2023
  2. Devon

    Quick update: I accidentally used the wrong register in one place. That's been fixed now.
     
  3. Hivebrain

    Good stuff. I've been wanting new ways to cut down on CPU usage.

    Doesn't this leave the stack pointer offset by 1 byte?
     
  4. Devon

    Reading or writing a byte through the stack pointer actually moves it by 2, not 1, to keep it aligned to an even address (it doesn't matter which address it's on; it will ALWAYS move by 2 bytes). This can be abused to multiply or divide a word value by $100 slightly faster than with bit shifting (20 cycles vs. 22 cycles). You can save an additional 4 cycles by skipping the register clear, but you have to make sure that it's safe to do so.
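
    To see why the push/pop pair acts as a divide by $100, it helps to remember that the 68000 is big-endian, so the byte at the lowest address of a pushed word is its high byte. A small Python model of `move.w d1,-(sp)` followed by `move.b (sp)+,d3` (the function name is made up for this sketch):

```python
import struct

def high_byte_via_stack(word):
    """Model of pushing a word and popping a byte from the same address."""
    stack = struct.pack(">H", word & 0xFFFF)  # the pushed word, stored big-endian
    return stack[0]                           # byte read back = high byte = word >> 8

print(hex(high_byte_via_stack(0x1234)))       # 0x12
```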
     
    Last edited: Sep 15, 2022
  5. Devon

    Another small update: I did a quick optimization of the edge case where Y = 0, where it picks between 0 and 180 degrees (thanks @Spicy Bread SSR for pointing this out).
     
  6. Devon

    I have created an alternate version that is smaller and faster, but it's only designed for cases where you know that both the X and Y values are between -$FF and $FF, since it skips the bit shifts that make the values compatible with the tables. The tables are the same as in the other version, but are included here for completeness.

    Code (Text):
    ; ------------------------------------------------------------------------------
    ; 2-argument arctangent (angle between (0,0) and (x,y)) (8-bit)
    ; Based on http://codebase64.org/doku.php?id=base:8bit_atan2_8-bit_angle
    ; ------------------------------------------------------------------------------
    ; PARAMETERS:
    ;       d1.w - X value (must be between -$FF and $FF)
    ;       d2.w - Y value (must be between -$FF and $FF)
    ; RETURNS:
    ;       d0.b - 2-argument arctangent value (angle between (0,0) and (x,y))
    ; ------------------------------------------------------------------------------

    CalcAngle8:
            moveq   #0,d0                                           ; Default to bottom right quadrant
            tst.w   d1                                              ; Is the X value negative?
            beq.s   CalcAngle8_XZero                                ; If the X value is zero, branch
            bpl.s   CalcAngle8_CheckY                               ; If not, branch
            neg.w   d1                                              ; If so, get the absolute value
            moveq   #4,d0                                           ; Shift to left quadrant

    CalcAngle8_CheckY:
            tst.w   d2                                              ; Is the Y value negative?
            beq.s   CalcAngle8_YZero                                ; If the Y value is zero, branch
            bpl.s   CalcAngle8_Divide                               ; If not, branch
            neg.w   d2                                              ; If so, get the absolute value
            addq.b  #2,d0                                           ; Shift to top quadrant

    CalcAngle8_Divide:
            lea     Log2Table(pc),a2                                ; Perform logarithmic division
            move.b  (a2,d2.w),d2
            sub.b   (a2,d1.w),d2                                    ; log2(|Y|) - log2(|X|)
            bcs.s   CalcAngle8_GetAtan2Val                          ; If |X| is larger, the difference is already a valid index
            not.b   d2                                              ; Otherwise, flip it (not instead of neg, so that a difference of 0 picks the last table entry, not the first)
            addq.b  #1,d0                                           ; Use octant that's horizontally closer to the center

    CalcAngle8_GetAtan2Val:
            move.b  Atan2Table(pc,d2.w),d2                          ; Get atan2 value
            move.b  OctantAdjust(pc,d0.w),d0
            eor.b   d2,d0

    CalcAngle8_End:
            rts

    ; ------------------------------------------------------------------------------

    CalcAngle8_YZero:
            tst.b   d0                                              ; Was the X value negated?
            beq.s   CalcAngle8_End                                  ; If not, branch (d0 is already 0, so no need to set it again on branch)
            moveq   #$FFFFFF80,d0                                   ; 180 degrees
            rts

    CalcAngle8_XZero:
            tst.w   d2                                              ; Is the Y value negative?
            bmi.s   CalcAngle8_XZeroYNeg                            ; If so, branch
            moveq   #$40,d0                                         ; 90 degrees
            rts

    CalcAngle8_XZeroYNeg:
            moveq   #$FFFFFFC0,d0                                   ; 270 degrees
            rts

    ; ------------------------------------------------------------------------------

    OctantAdjust:
            dc.b    %00000000                                       ; +X, +Y, |X|>|Y|
            dc.b    %00111111                                       ; +X, +Y, |X|<|Y|
            dc.b    %11111111                                       ; +X, -Y, |X|>|Y|
            dc.b    %11000000                                       ; +X, -Y, |X|<|Y|
            dc.b    %01111111                                       ; -X, +Y, |X|>|Y|
            dc.b    %01000000                                       ; -X, +Y, |X|<|Y|
            dc.b    %10000000                                       ; -X, -Y, |X|>|Y|
            dc.b    %10111111                                       ; -X, -Y, |X|<|Y|

    Atan2Table:
            dc.b    $00, $00, $00, $00, $00, $00, $00, $00
            dc.b    $00, $00, $00, $00, $00, $00, $00, $00
            dc.b    $00, $00, $00, $00, $00, $00, $00, $00
            dc.b    $00, $00, $00, $00, $00, $00, $00, $00
            dc.b    $00, $00, $00, $00, $00, $00, $00, $00
            dc.b    $00, $00, $00, $00, $00, $00, $00, $00
            dc.b    $00, $00, $00, $00, $00, $00, $01, $01
            dc.b    $01, $01, $01, $01, $01, $01, $01, $01
            dc.b    $01, $01, $01, $01, $01, $01, $01, $01
            dc.b    $01, $01, $01, $01, $01, $01, $01, $01
            dc.b    $01, $01, $01, $01, $01, $01, $01, $01
            dc.b    $01, $01, $01, $01, $01, $01, $01, $01
            dc.b    $01, $01, $01, $01, $01, $01, $01, $01
            dc.b    $01, $02, $02, $02, $02, $02, $02, $02
            dc.b    $02, $02, $02, $02, $02, $02, $02, $02
            dc.b    $02, $02, $02, $02, $02, $02, $02, $02
            dc.b    $03, $03, $03, $03, $03, $03, $03, $03
            dc.b    $03, $03, $03, $03, $03, $03, $03, $03
            dc.b    $04, $04, $04, $04, $04, $04, $04, $04
            dc.b    $04, $04, $04, $05, $05, $05, $05, $05
            dc.b    $05, $05, $05, $05, $05, $06, $06, $06
            dc.b    $06, $06, $06, $06, $06, $07, $07, $07
            dc.b    $07, $07, $07, $08, $08, $08, $08, $08
            dc.b    $08, $09, $09, $09, $09, $09, $09, $0A
            dc.b    $0A, $0A, $0A, $0B, $0B, $0B, $0B, $0B
            dc.b    $0C, $0C, $0C, $0C, $0D, $0D, $0D, $0D
            dc.b    $0E, $0E, $0E, $0F, $0F, $0F, $0F, $10
            dc.b    $10, $10, $11, $11, $11, $12, $12, $12
            dc.b    $13, $13, $13, $14, $14, $14, $15, $15
            dc.b    $16, $16, $16, $17, $17, $17, $18, $18
            dc.b    $19, $19, $1A, $1A, $1A, $1B, $1B, $1C
            dc.b    $1C, $1C, $1D, $1D, $1E, $1E, $1F, $1F

    Log2Table:
            dc.b    $00, $00, $1F, $32, $3F, $49, $52, $59
            dc.b    $5F, $64, $69, $6E, $72, $75, $79, $7C
            dc.b    $7F, $82, $84, $87, $89, $8C, $8E, $90
            dc.b    $92, $94, $95, $97, $99, $9A, $9C, $9E
            dc.b    $9F, $A0, $A2, $A3, $A4, $A6, $A7, $A8
            dc.b    $A9, $AA, $AC, $AD, $AE, $AF, $B0, $B1
            dc.b    $B2, $B3, $B4, $B5, $B5, $B6, $B7, $B8
            dc.b    $B9, $BA, $BA, $BB, $BC, $BD, $BE, $BE
            dc.b    $BF, $C0, $C0, $C1, $C2, $C2, $C3, $C4
            dc.b    $C4, $C5, $C6, $C6, $C7, $C8, $C8, $C9
            dc.b    $C9, $CA, $CA, $CB, $CC, $CC, $CD, $CD
            dc.b    $CE, $CE, $CF, $CF, $D0, $D0, $D1, $D1
            dc.b    $D2, $D2, $D3, $D3, $D4, $D4, $D5, $D5
            dc.b    $D5, $D6, $D6, $D7, $D7, $D8, $D8, $D8
            dc.b    $D9, $D9, $DA, $DA, $DA, $DB, $DB, $DC
            dc.b    $DC, $DC, $DD, $DD, $DE, $DE, $DE, $DF
            dc.b    $DF, $DF, $E0, $E0, $E0, $E1, $E1, $E1
            dc.b    $E2, $E2, $E2, $E3, $E3, $E3, $E4, $E4
            dc.b    $E4, $E5, $E5, $E5, $E6, $E6, $E6, $E7
            dc.b    $E7, $E7, $E8, $E8, $E8, $E8, $E9, $E9
            dc.b    $E9, $EA, $EA, $EA, $EA, $EB, $EB, $EB
            dc.b    $EC, $EC, $EC, $EC, $ED, $ED, $ED, $ED
            dc.b    $EE, $EE, $EE, $EE, $EF, $EF, $EF, $F0
            dc.b    $F0, $F0, $F0, $F1, $F1, $F1, $F1, $F1
            dc.b    $F2, $F2, $F2, $F2, $F3, $F3, $F3, $F3
            dc.b    $F4, $F4, $F4, $F4, $F5, $F5, $F5, $F5
            dc.b    $F5, $F6, $F6, $F6, $F6, $F7, $F7, $F7
            dc.b    $F7, $F7, $F8, $F8, $F8, $F8, $F8, $F9
            dc.b    $F9, $F9, $F9, $F9, $FA, $FA, $FA, $FA
            dc.b    $FA, $FB, $FB, $FB, $FB, $FB, $FC, $FC
            dc.b    $FC, $FC, $FC, $FD, $FD, $FD, $FD, $FD
            dc.b    $FE, $FE, $FE, $FE, $FE, $FE, $FF, $FF

    ; ------------------------------------------------------------------------------
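
    For checking the octant logic off-target, here is a rough Python model of the 8-bit flow: log lookup, subtraction, table index, then the XOR with OctantAdjust. It is a sketch under assumptions, not a port. The tables are generated at an assumed scale of 32 units per doubling rather than copied from the listing, and a non-negative difference is complemented so that equal logs land on the last table entry. Only the OctantAdjust values are taken from the listing.

```python
import math

SCALE = 32  # assumed: 32 log units per doubling (not the thread's exact tables)

LOG2 = [0] + [min(255, round(math.log2(n) * SCALE)) for n in range(1, 256)]
# Entry i stands for the ratio 2 ** ((i - 256) / SCALE); angles stay within one octant
ATAN = [min(31, int(math.atan(2.0 ** ((i - 256) / SCALE)) * 128 / math.pi))
        for i in range(256)]
OCTANT_ADJUST = [0x00, 0x3F, 0xFF, 0xC0, 0x7F, 0x40, 0x80, 0xBF]

def calc_angle8(x, y):
    """Angle of (x, y) in 256 units per turn; x and y must be in -255..255."""
    if x == 0:
        return 0xC0 if y < 0 else 0x40
    if y == 0:
        return 0x80 if x < 0 else 0x00
    octant = (4 if x < 0 else 0) + (2 if y < 0 else 0)
    diff = LOG2[abs(y)] - LOG2[abs(x)]
    if diff < 0:
        index = diff & 0xFF      # |X| larger: wrapped difference is the index
    else:
        index = ~diff & 0xFF     # |Y| larger or equal: complement, swap octant
        octant += 1
    return ATAN[index] ^ OCTANT_ADJUST[octant]

print(hex(calc_angle8(5, 5)))    # diagonal in the +X, +Y quadrant
```

    The XOR works because each octant's angle is either the base angle plus the table value or a mirror of it, and with table values limited to $00-$1F both cases are a single bit flip pattern.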

    You could get away with using this variant when handling Sonic's collision in the air, since CalcAngle is only used there to detect the general direction Sonic is moving, which doesn't need much precision: it only picks from the 4 cardinal directions. You'd just pass the integer parts of the X and Y speeds (their high bytes) as the parameters instead of the full 16-bit values.
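
    As a rough illustration of that usage, the Python sketch below takes the high bytes of two signed 8.8 fixed-point speeds and snaps the resulting angle to a cardinal direction. Here angle_256 is a stand-in built on math.atan2, not the table routine, and the round-to-nearest-quadrant step and the function names are assumptions for the example, not the game's code.

```python
import math

def angle_256(x, y):
    """Stand-in for CalcAngle8: angle of (x, y) in 256 units per full turn."""
    return round(math.atan2(y, x) * 128 / math.pi) & 0xFF

def cardinal_from_speeds(x_speed, y_speed):
    """x_speed and y_speed are signed 8.8 fixed-point words, given as ints."""
    x_hi = x_speed >> 8            # integer part (high byte), sign preserved
    y_hi = y_speed >> 8
    angle = angle_256(x_hi, y_hi)
    return (angle + 0x20) & 0xC0   # snap to $00, $40, $80 or $C0 (assumed rounding)

print(hex(cardinal_from_speeds(0x0600, 0x0010)))    # moving mostly right
print(hex(cardinal_from_speeds(-0x0080, 0x0400)))   # moving mostly down
```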

    I also updated the other one to use neg instead of not for getting the absolute value of the X and Y parameters.
     
    Last edited: Dec 7, 2023