So I'm writing my own 32X program, and for now I'm just trying to test drawing to the framebuffer from the SH-2 side before actually implementing what I want. I load up a palette of all green (except for color 0) and use the following code to write to the framebuffer, but instead of showing me 40 lines of green I still get a black screen.

Code (Text):
        mov.l   #me_fbbase,r0
        mov.l   #($1C0>>1),r1
        mov.l   #40,r2
-       mov.w   r1,@r0
        dt      r2
        bf/s    -
        add     #2,r0

        mov.l   #me_fbline,r0
        mov.l   #1,r1
        mov.l   #320,r2
-       mov.b   r1,@r0
        dt      r2
        bf/s    -
        add     #1,r0

        mov.l   #me_ffb,r0
MasterMain:
        jsr     @r0
        nop
        bra     MasterMain
        nop

me_fbbase:      dc.l $24000000
me_fbline:      dc.l $240001C0
me_initpal:     dc.l initpal
me_ffb:         dc.l flipframebuf

The code I use to initialize the VDP:

Code (Text):
initscreen:
        sts.l   pr,@-r15
        mov.l   #$4000,r0       ; 32X has priority
        mov.l   #$80,r1
        mov.b   r1,@r0
        add     #1,r0           ; VInt on
        mov.l   #$8,r1
        mov.b   r1,@r0
        mov.l   #$4101,r0       ; 8bpp mode
        mov.l   #1,r1
        mov.b   r1,@r0
        rts
        lds.l   @r15+,pr

What am I doing wrong? Thanks.
When you set the line table, make sure to start at $100. The first $100 words of the frame buffer are the line table itself, so each entry has to point to AFTER the line table. In C, I do

Code (Text):
// rewrite line table
for (I=0; I<224; I++)
    frameBuffer16[I] = I*160 + 0x100; /* word offset of line */

Go to 240 if in 240-line mode. Also, make sure the frame buffer is finished flipping before trying to write to it, otherwise the writes won't be to the proper frame buffer. Here's my frame buffer init.

Code (Text):
// init both framebuffers

// Flip the framebuffer selection bit and wait for it to take effect
MARS_VDP_FBCTL = currentFB ^ 1;
while ((MARS_VDP_FBCTL & MARS_VDP_FS) == currentFB);
currentFB ^= 1;
// rewrite line table
for (I=0; I<224; I++)
    frameBuffer16[I] = I*160 + 0x100; /* word offset of line */
// clear screen
for (I=0x100; I<0x10000; I++)
    frameBuffer16[I] = 0;

// Flip the framebuffer selection bit and wait for it to take effect
MARS_VDP_FBCTL = currentFB ^ 1;
while ((MARS_VDP_FBCTL & MARS_VDP_FS) == currentFB);
currentFB ^= 1;
// rewrite line table
for (I=0; I<224; I++)
    frameBuffer16[I] = I*160 + 0x100; /* word offset of line */
// clear screen
for (I=0x100; I<0x10000; I++)
    frameBuffer16[I] = 0;
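To make the word-versus-byte bookkeeping concrete, here is a small host-side sketch of how a line table entry turns into a byte offset for a pixel in 8bpp mode. It is only an illustration: the names `build_line_table` and `pixel_byte_offset` and the constants are made up for this example, and the buffer is a plain array rather than the real frame buffer at $24000000.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical constants for this sketch (8bpp, 320 pixels per line). */
#define FB_WORDS        0x10000u  /* frame buffer size in 16-bit words */
#define LINE_TABLE_LEN  0x100u    /* the first $100 words are the line table */
#define WORDS_PER_LINE  160u      /* 320 bytes per line = 160 words */

/* Fill the line table: entry y holds the WORD offset of scanline y. */
static void build_line_table(uint16_t *fb16, unsigned lines)
{
    assert(lines <= LINE_TABLE_LEN);
    for (unsigned y = 0; y < lines; y++)
        fb16[y] = (uint16_t)(y * WORDS_PER_LINE + LINE_TABLE_LEN);
}

/* Byte offset, from the start of the frame buffer, of pixel (x, y).
 * The table stores word offsets, so the entry is doubled first. */
static uint32_t pixel_byte_offset(const uint16_t *fb16, unsigned x, unsigned y)
{
    return (uint32_t)fb16[y] * 2u + x;
}
```

With this layout the first pixel of line 0 sits at byte offset $200, not $100, because the table entry $100 is counted in words.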
I don't know SH-2 assembly, and I couldn't find a reference to the instruction set when doing a search for "SH2 instruction set" so I'm going to make an ass of myself. But I've noticed two unusual things about your code and was wondering if you have a second and maybe could explain these things. First unusual thing I notice is the very end of your init screen method. If the rts instruction is returning then why do you have an lds instruction right after that? Isn't it never going to run? The second unusual thing I notice is the line "add #1,r0" right after your second loop in the first section. You're loading #me_ffb into r0 right after that, so it seems useless as well. Was the blank line supposed to imply that you omitted some code? This obviously doesn't answer any questions, so I hope the previous poster did.
From what I've read, there's like a delay before say a branch command runs, so the command after it runs just before branching (Or something like that), again I'm not sure meself so don't take my word as the correct reason why.
Thanks, however this time, half the time nothing shows up, and half the time CRAM is cleared. I have no idea what's going on at this point. I'm using as, and sometimes the r0 for the framebuffer pointers actually load weird values when dereferenced (for example, $00092400, which I assume is wordswapped because Gens's, and therefore Gens/GS's, 32X debugger is stupid). What's going on? Code (Text): CPU SH7600; true model is SH7095; but SH7000 lacks mul.l PHASE $6000000 PADDING OFF SUPMODE ON ; These are exported so the 68000 code can refer to them in the 32X header. SHARED MasterVec, SlaveVec, MasterEntry, SlaveEntry MasterStack equ $603F000 SlaveStack equ $603EF00 MasterVec: dc.l MasterEntry; Power on dc.l MasterStack dc.l MasterEntry; Manual reset dc.l MasterStack dc.l error; Illegal instruction dc.l 0 ; Reserved dc.l error; Invalid slot dc.l $20100400; Reserved dc.l $20100400; Reserved dc.l error; CPU address error dc.l error; DMA address error dc.l error; NMI dc.l error; User break dc.l 0 ; Reserved dc.l 0 dc.l 0 dc.l 0 dc.l 0 dc.l 0 dc.l 0 dc.l 0 dc.l 0 dc.l 0 dc.l 0 dc.l 0 dc.l 0 dc.l 0 dc.l 0 dc.l 0 dc.l 0 dc.l 0 dc.l 0 dc.l error; Traps dc.l error dc.l error dc.l error dc.l error dc.l error dc.l error dc.l error dc.l error dc.l error dc.l error dc.l error dc.l error dc.l error dc.l error dc.l error dc.l error dc.l error dc.l error dc.l error dc.l error dc.l error dc.l error dc.l error dc.l error dc.l error dc.l error dc.l error dc.l error dc.l error dc.l error dc.l error dc.l irq ; IRQ 1 dc.l irq ; IRQ 2/3 dc.l irq ; IRQ 4/5 dc.l irq ; PWM interrupt dc.l irq ; Command interrupt dc.l irq ; H blank dc.l vint dc.l MasterEntry; Soft reset SlaveVec: dc.l SlaveEntry; Power on dc.l SlaveStack dc.l SlaveEntry; Manual reset dc.l SlaveStack dc.l error; Illegal instruction dc.l 0 ; Reserved dc.l error; Invalid slot dc.l $20100400; Reserved dc.l $20100400; Reserved dc.l error; CPU address error dc.l error; DMA address error dc.l error; NMI dc.l error; User break dc.l 0 ; 
Reserved dc.l 0 dc.l 0 dc.l 0 dc.l 0 dc.l 0 dc.l 0 dc.l 0 dc.l 0 dc.l 0 dc.l 0 dc.l 0 dc.l 0 dc.l 0 dc.l 0 dc.l 0 dc.l 0 dc.l 0 dc.l 0 dc.l error; Traps dc.l error dc.l error dc.l error dc.l error dc.l error dc.l error dc.l error dc.l error dc.l error dc.l error dc.l error dc.l error dc.l error dc.l error dc.l error dc.l error dc.l error dc.l error dc.l error dc.l error dc.l error dc.l error dc.l error dc.l error dc.l error dc.l error dc.l error dc.l error dc.l error dc.l error dc.l error dc.l irq ; IRQ 1 dc.l irq ; IRQ 2/3 dc.l irq ; IRQ 4/5 dc.l irq ; PWM interrupt dc.l irq ; Command interrupt dc.l irq ; H blank dc.l vint dc.l SlaveEntry; Soft reset MasterEntry: mov.l #MasterStack,r15 bsr initscreen nop mov.l #me_initpal,r0 bsr loadpalette mov.l @r0,r0 bsr flipframebuf nop mov.l #me_fbline,r0 mov.l #1,r1 mov.l #320,r2 shll2 r2 - mov.b r1,@r0 dt r2 bf/s - add #1,r0 MasterMain: bra MasterMain nop me_initpal: dc.l initpal me_fbline: dc.l $24000100 LTORG initpal: dc.w $0000 REPT 255 dc.w $0B20 ENDM SlaveEntry: bra SlaveEntry nop error: bra error nop vint: rte nop irq: rte nop flipframebuf: sts.l pr,@-r15 mov.l r0,@-r15 mov.l r1,@-r15 mov.l #$410B,r0 mov.b @r0,r1 not r1,r1 mov.b r1,@r0 mov.l #$410A,r0 xor r1,r1 - mov.b @r0,r1 cmp/pl r1 bt - nop mov.l @r15+,r1 mov.l @r15+,r0 rts lds.l @r15+,pr initscreen: sts.l pr,@-r15 mov.l #$4000,r0; 32X has priority mov.l #$80,r1 mov.b r1,@r0 add #1,r0 ; VInt on mov.l #$8,r1 mov.b r1,@r0 mov.l #$4101,r0; 8bpp mode mov.l #1,r1 mov.b r1,@r0 REPT 2 bsr flipframebuf nop mov.l #is_fb,r0; set up line pointers mov.l @r0,r0 xor r1,r1 ; output address mov.l #224,r2 ; 224 lines mov.l #160,r3 ; output = (I * 160) + 0x100 mov.l #$100,r4 xor r5,r5 ; I - mul.l r5,r3 sts.l macl,r1 add r4,r1 mov.w r1,@r0 add #2,r0 dt r2 bf/s - add #1,r5 ENDM rts lds.l @r15+,pr is_fb: dc.l $24000000 loadpalette: sts.l pr,@-r15 mov.l r1,@-r15 mov.l r2,@-r15 mov.l r3,@-r15 mov.l #$4200,r1 mov.l #255,r2 - mov.w @r0+,r3 mov.w r3,@r1 dt r2 bf/s - add #2,r1 mov.l 
@r15+,r3 mov.l @r15+,r2 mov.l @r15+,r1 rts lds.l @r15+,pr DEPHASE

Thanks.

http://www.eidolons-inn.net/tiki-list_file...hp?galleryId=10

As MarkeyJester guessed, the lds.l after rts is there because of the delay slot: all jump instructions except bt and bf run the next instruction before the jump takes effect. Same deal with the add after bf/s. bt/s and bf/s are the variants of bt and bf that have a delay slot.
Okay, one serious problem I just noticed: all hardware needs to be accessed through the cache-through region, i.e., offset by $20000000. It's not $4100, it's $20004100. This matters especially when reading registers, because the second time you read $4100 (or a similar address), the value will come from the cache, not the hardware. Writing the frame buffer without $20000000 is okay since the SH2 cache is write-through, so the data still goes to memory, but it floods the cache, slowing the program because code and other data then have to be reloaded into the cache. Try writing large buffers (like video or audio) through the cache-through region instead of the cached one for better speed.
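As a rough sketch of the address arithmetic described above, the cache-through alias is just the cached address with $20000000 ORed in. The helper name `cache_through` and the `REG16` macro are made up for this example; the macro only makes sense on the SH2 itself and is not exercised on a host machine.

```c
#include <assert.h>
#include <stdint.h>

/* Offset of the cache-through region, as given in the post above. */
#define CACHE_THROUGH_BASE 0x20000000u

/* Hypothetical helper: map a cached address to its cache-through alias. */
static inline uint32_t cache_through(uint32_t addr)
{
    return addr | CACHE_THROUGH_BASE;
}

/* Hypothetical accessor for a 16-bit hardware register. volatile keeps the
 * compiler from caching the value in a register; the cache-through address
 * keeps the CPU cache out of the way. Only meaningful on the target. */
#define REG16(addr) (*(volatile uint16_t *)(uintptr_t)cache_through(addr))
```

So $4100 becomes $20004100 and $410A becomes $2000410A, which is the FBCTL address used elsewhere in this thread.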
After fixing that, the program still doesn't work. I'm blaming as for this because of this register endian nonsense, as manually putting the value in with mov/shll16/shll8 yields the right value. So can I do something like SHARED in asmsh, where it exports the addresses of the shared labels to an include file so I can put them in the 32X header 68000 side? Thanks.
Like everything but the Z80, the SH2 and 32X hardware is all big-endian. Sometimes that throws PC programmers. I never think about it since I'm used to it.

When setting the line table, rather than multiply r5 and r3, just add r3 to r1. Not an error, just more optimal, and easy to do in assembly. Rather than

Code (Text):
        mov.l   #is_fb,r0       ; set up line pointers
        mov.l   @r0,r0

Just do

Code (Text):
        mov.l   is_fb,r0        ; set up line pointers

As long as the variable is close to the code, it's not a problem. I noticed you did a lot of similar moves that aren't necessary. You also commonly save pr and various registers when it isn't necessary.

Your frame buffer flip also looks wrong. Read/write FBCTL as a word, and just XOR the word with 1 to flip which frame you want. Look at my C code... load the current framebuffer (a word), set another register to that XOR 1, store the new framebuffer (as a word) to FBCTL, then wait until the register (read as a word) is no longer the same as the original value. More like this...

Code (Text):
flipframebuf:
        mov.l   MARS_VDP_FBCTL,r2       ; r2 = address of FBCTL
        mov.w   @r2,r1                  ; r1 = current value
        mov     r1,r0
        xor     #1,r0                   ; toggle the frame select bit
        mov.w   r0,@r2                  ; request the flip
-       mov.w   @r2,r0                  ; re-read the register
        cmp/eq  r1,r0                   ; still the original value?
        bt      -                       ; yes, keep waiting
        nop
        rts
        nop

MARS_VDP_FBCTL: dc.l $2000410A
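The add-instead-of-multiply suggestion can be checked on a host machine with a sketch like the one below. Both function names are invented for this example; the only claim is that a running sum starting at $100 and stepping by 160 produces the same entries as I*160 + 0x100.

```c
#include <assert.h>
#include <stdint.h>

/* Line table entries computed with a multiply per line. */
static void line_table_mul(uint16_t *table, unsigned lines)
{
    for (unsigned y = 0; y < lines; y++)
        table[y] = (uint16_t)(y * 160u + 0x100u);
}

/* Same entries computed with a running sum: start at $100, add 160 per line. */
static void line_table_add(uint16_t *table, unsigned lines)
{
    uint16_t offset = 0x100u;
    for (unsigned y = 0; y < lines; y++) {
        table[y] = offset;
        offset = (uint16_t)(offset + 160u);
    }
}
```

In the assembly loop this corresponds to keeping the offset in one register and adding 160 to it each pass, so the mul.l and the sts from macl are no longer needed.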
No problem. Be sure to ask if you run into anything else. There are not many devs out there working on the 32X, so we gotta look out for each other.
Okay, I dunno what I'm doing wrong this time: Code (Text): CPU SH7600; true model is SH7095; but SH7000 lacks mul.l PHASE $6000000 PADDING OFF SUPMODE ON ; These are exported so the 68000 code can refer to them in the 32X header. SHARED MasterVec, SlaveVec, MasterEntry, SlaveEntry MasterStack equ $603F000 SlaveStack equ $603EF00 MasterVec: dc.l MasterEntry; Power on dc.l MasterStack dc.l MasterEntry; Manual reset dc.l MasterStack dc.l error; Illegal instruction dc.l 0; Reserved dc.l error; Invalid slot dc.l $20100400; Reserved dc.l $20100400; Reserved dc.l error; CPU address error dc.l error; DMA address error dc.l error; NMI dc.l error; User break dc.l 0; Reserved dc.l 0 dc.l 0 dc.l 0 dc.l 0 dc.l 0 dc.l 0 dc.l 0 dc.l 0 dc.l 0 dc.l 0 dc.l 0 dc.l 0 dc.l 0 dc.l 0 dc.l 0 dc.l 0 dc.l 0 dc.l 0 dc.l error; Traps dc.l error dc.l error dc.l error dc.l error dc.l error dc.l error dc.l error dc.l error dc.l error dc.l error dc.l error dc.l error dc.l error dc.l error dc.l error dc.l error dc.l error dc.l error dc.l error dc.l error dc.l error dc.l error dc.l error dc.l error dc.l error dc.l error dc.l error dc.l error dc.l error dc.l error dc.l error dc.l irq; IRQ 1 dc.l irq; IRQ 2/3 dc.l irq; IRQ 4/5 dc.l irq; PWM interrupt dc.l irq; Command interrupt dc.l irq; H blank dc.l vint dc.l MasterEntry; Soft reset SlaveVec: dc.l SlaveEntry; Power on dc.l SlaveStack dc.l SlaveEntry; Manual reset dc.l SlaveStack dc.l error; Illegal instruction dc.l 0; Reserved dc.l error; Invalid slot dc.l $20100400; Reserved dc.l $20100400; Reserved dc.l error; CPU address error dc.l error; DMA address error dc.l error; NMI dc.l error; User break dc.l 0; Reserved dc.l 0 dc.l 0 dc.l 0 dc.l 0 dc.l 0 dc.l 0 dc.l 0 dc.l 0 dc.l 0 dc.l 0 dc.l 0 dc.l 0 dc.l 0 dc.l 0 dc.l 0 dc.l 0 dc.l 0 dc.l 0 dc.l error; Traps dc.l error dc.l error dc.l error dc.l error dc.l error dc.l error dc.l error dc.l error dc.l error dc.l error dc.l error dc.l error dc.l error dc.l error dc.l error dc.l error dc.l error dc.l error 
dc.l error dc.l error dc.l error dc.l error dc.l error dc.l error dc.l error dc.l error dc.l error dc.l error dc.l error dc.l error dc.l error dc.l irq; IRQ 1 dc.l irq; IRQ 2/3 dc.l irq; IRQ 4/5 dc.l irq; PWM interrupt dc.l irq; Command interrupt dc.l irq; H blank dc.l vint dc.l SlaveEntry; Soft reset MasterEntry: mov.l #MasterStack,r15 bsr initscreen nop mov.l #me_initpal,r0 bsr loadpalette mov.l @r0,r0 ptc macro x,y,z mov.l #y,r1 mov.l #z,r2 bsr pt3topt mov.l #x,r0 mov.l r0,@(0,r13) mov.l r1,@(4,r13) add #8,r13 endm box macro x1,y1,z1,x2,y2,z2,color mova pbuf,r0 mov.l r0,r13 ptc x1,y1,z1 ptc x2,y1,z1 ptc x1,y1,z1 ptc x1,y2,z1 ptc x1,y2,z1 ptc x2,y2,z1 ptc x2,y1,z1 ptc x2,y2,z1 ptc x1,y1,z2 ptc x2,y1,z2 ptc x1,y1,z2 ptc x1,y2,z2 ptc x1,y2,z2 ptc x2,y2,z2 ptc x2,y1,z2 ptc x2,y2,z2 ptc x1,y1,z1 ptc x1,y1,z2 ptc x1,y2,z1 ptc x1,y2,z2 ptc x2,y1,z1 ptc x2,y1,z2 ptc x2,y2,z1 ptc x2,y2,z2 mova pbuf,r0 mov.l r0,r13 mov.l #12,r14 - mov.l @r13+,r0 mov.l @r13+,r1 mov.l @r13+,r2 mov.l @r13+,r3 bsr drawline mov.l #color,r4 dt r14 bf - nop endm bsr flipframebuf nop box $20,$20,$0, $40,$40,$7F, 1 box $20,$20,$0, $40,$40,$7F, 2 MasterMain: bra MasterMain nop pbuf: dc.l 0,0, 0,0; x1->x2 y1 z1 dc.l 0,0, 0,0; x1 y1->y2 z1 dc.l 0,0, 0,0; x1->x2 y2 z1 dc.l 0,0, 0,0; x2 y1->y2 z1 dc.l 0,0, 0,0; x1->x2 y1 z2 dc.l 0,0, 0,0; x1 y1->y2 z2 dc.l 0,0, 0,0; x1->x2 y2 z2 dc.l 0,0, 0,0; x2 y1->y2 z2 dc.l 0,0, 0,0; x1 y1 z1->z2 dc.l 0,0, 0,0; x2 y1 z1->z2 dc.l 0,0, 0,0; x1 y2 z1->z2 dc.l 0,0, 0,0; x2 y2 z1->z2 me_fbline: dc.l $24000200 me_initpal: dc.l initpal LTORG initpal: dc.w $0000 dc.w $01B4 REPT 254 dc.w $0B20 ENDM SlaveEntry: bra SlaveEntry nop error: bra error nop vint: rte nop irq: rte nop flipframebuf: sts.l pr,@-r15 mov.l r0,@-r15 mov.l r1,@-r15 mov.l ffb_fbctl,r1 mov.w @r1,r0 xor #1,r0 mov.w r0,@r1 - mov.w @r1,r14 cmp/eq r0,r14 bt - mov.l @r15+,r1 mov.l @r15+,r0 rts lds.l @r15+,pr waitfb: sts.l pr,@-r15 mov.l r0,@-r15 mov.l r1,@-r15 mov.l ffb_fbctl,r1 - mov.w @r1,r0 tst #2,r0 bf - 
mov.l @r15+,r1 mov.l @r15+,r0 lds.l @r15+,pr rts nop ffb_fbctl: dc.l $2000410A LTORG initscreen: sts.l pr,@-r15 mov.l is_r0,r0; 32X has priority mov.l #$80,r1 mov.b r1,@r0 mov.l is_r1,r0; 8bpp mode mov.l #81,r1 mov.b r1,@r0 REPT 2 bsr flipframebuf nop mov.l is_fb,r0; set up line pointers mov.l #224,r2; 224 lines mov.l #160,r3; output = (I * 160) + 0x100 mov.l #$100,r1; (160 because 320>>1==160) - mov.w r1,@r0 add r3,r1 dt r2 bf/s - add #2,r0 ENDM lds.l @r15+,pr rts nop is_r0: dc.l $20004000 is_r1: dc.l $20004101 is_fb: dc.l $24000000 LTORG loadpalette: sts.l pr,@-r15 mov.l r1,@-r15 mov.l r2,@-r15 mov.l r3,@-r15 mov.l lp_cram,r1 mov.l #256,r2 - mov.w @r0+,r3 mov.w r3,@r1 dt r2 bf/s - add #2,r1 mov.l @r15+,r3 mov.l @r15+,r2 mov.l @r15+,r1 rts lds.l @r15+,pr lp_cram: dc.l $20004200 LTORG ; drawpoint (r0 r1 r2 --) ; draw color r2 at point (r0,r1) drawpoint: sts.l pr,@-r15 mov.l r3,@-r15 mov.l r4,@-r15 mov.l r5,@-r15 mov.l dp_fbline,r3 mov.l #320,r4 mul.l r4,r1 sts.l macl,r5 add r0,r5 add r5,r3 mov.b r2,@r3 mov.l @r15+,r5 mov.l @r15+,r4 mov.l @r15+,r3 lds.l @r15+,pr rts nop dp_fbline: dc.l $24000200 LTORG ; drawline (r0 r1 r2 r3 r4 -- ) ; draws a line using an optimized Bresenham's algorithm from (r0,r1) to (r2,r3) ; in color r4 drawline: sts.l pr,@-r15 mov.l r6,@-r15; dx mov.l r7,@-r15; dy mov.l r8,@-r15; error mov.l r9,@-r15; y increment for framebuffer (either 320 or -320) mov.l r10,@-r15; framebuffer pointer mov.l r11,@-r15; steep; x increment ; drawing a point? 
cmp/eq r0,r2 bf dl_doline cmp/eq r1,r3 bf dl_doline bra drawpoint mov.l r4,r2 ; STEEP dl_doline: mov.l r2,r6; steep = abs(y1 - y0) > abs(x1 - x0) sub r0,r6 cmp/pz r6 bt + neg r6,r6 + mov.l r3,r7 sub r1,r7 cmp/pz r7 bt + neg r7,r7 + cmp/gt r6,r7 movt r11 bf +; if (steep) { swap(x0, y0); swap(x1, y1); } mov.l r0,r6 mov.l r1,r0 mov.l r6,r1 mov.l r2,r6 mov.l r3,r2 mov.l r6,r3 ; x0 > x1 + cmp/gt r2,r0; if (x0 > x1) { swap(x0,x1); swap(y0,y1); } bf + mov.l r0,r9 mov.l r2,r0 mov.l r9,r2 mov.l r1,r9 mov.l r3,r1 mov.l r9,r3 ; VARIABLE INIT + mov.l r2,r6; dx = x1 - x0 sub r0,r6 mov.l r3,r7; dy = y1 - y0 sub r1,r7 mov.l r6,r8; error = dx / 2 shlr r8 cmp/pz r7; dy = abs(dy) bt + neg r7,r7 ; STEEP INITIALIZE FRAMEBUFFER + tst r11,r11; steep? bt dl_shallow; no, go to shallow mov.l r0,r9; 320 * y0 == (y0 << 8) + (y0 << 6) mov.l r0,r10 shll8 r9 shll8 r10; y0 << 6 == (y0 << 8) >> 2 (and [0,224) is small enough) shlr2 r10 add r10,r9 mov.l dl_fbline,r10 add r9,r10 add r1,r10 mov.l #320,r11; x increment bra dl_lastsetup mov.l #1,r9 ; SHALLOW INITIALIZE FRAMEBUFFER dl_shallow: mov.l r1,r9; 320 * y0 == (y0 << 8) + (y0 << 6) mov.l r1,r10 shll8 r9 shll8 r10; y0 << 6 == (y0 << 8) >> 2 (and [0,224) is small enough) shlr2 r10 add r10,r9 mov.l dl_fbline,r10 add r9,r10 add r0,r10 mov.l #1,r11; x increment mov.l #320,r9 dl_lastsetup: cmp/gt r3,r1; if (y0 > y1) ystep = -ystep bf dl_loop neg r9,r9 dl_loop: mov.b r4,@r10; drawpoint(x0, y0, color) sub r7,r8; error -= dy cmp/pz r8; if (error < 0) { bt + add r9,r10; y0++ add r6,r8; error += dx ; } + add r11,r10; x0++ add #1,r0 cmp/eq r0,r2; if (x0 != x1) goto loop bf dl_loop mov.l @r15+,r11 mov.l @r15+,r10 mov.l @r15+,r9 mov.l @r15+,r8 mov.l @r15+,r7 mov.l @r15+,r6 rts lds.l @r15+,pr dl_fbline: dc.l $24000200 LTORG focaldist: dc.l 600 ; pt3topt (r0 r1 r2 -- r0 r1) ; convert 3D point (r0,r1,r2) to 2D point (r0,r1) ; r2 is destroyed ; thanks to Jorge for floating point to integer migration help (it's based on old C code I wrote that used double) 
pt3topt: sts.l pr,@-r15 mov.l r4,@-r15 mov.l r5,@-r15 mov.l p3_const,r4 add r4,r2 tst r2,r2 bf + mov #1,r2 + mov.l p3_fdist,r4; x = x3 * focaldist mov.l @r4,r4 clrmac mul.l r0,r4 sts.l macl,r0 clrmac ; y = y3 * focaldist mul.l r1,r4 sts.l macl,r1 mov.l p3_div,r5 mov.l r2,@(0,r5); x /= z3 mov.l r0,@(4,r5) mov.l @(4,r5),r0 mov.l r2,@(0,r5); y /= z3 mov.l r1,@(4,r5) mov.l @(4,r5),r1 mov.l @r15+,r5 mov.l @r15+,r4 rts lds.l @r15+,pr nop p3_const: dc.l 600 p3_div: dc.l $FFFFFF00 p3_fdist: dc.l focaldist LTORG DEPHASE What I see: I'm expecting there to be a green cube completely overlapping the orange one. What am I doing wrong now? And yes I know my method might be inefficient; I'm just trying to get it to work =P
You know you defined a number of macros right in the middle of the master SH2 code? You should probably move all your macros to the start of the file, or to another file altogether.

Also, use 0x06040000 for the slave stack pointer, and 0x0603FFF0 for the master stack pointer. Since you don't use the slave, you don't really need (much of) a stack for it, and always put the master stack below the slave since the master SH2 will almost certainly use far more stack space than the slave, even if you eventually use the slave. I normally use 0x06040000 for the slave, and 0x0603FF00 for the master.

As to your problem, I'm not sure how you're seeing anything at all: you flip the frame buffer, do two box macros, then wait forever WITHOUT flipping the frame buffer again! My guess is your wait for the frame buffer to finish flipping is failing, so you immediately return before the frame actually flips. You then have time to draw one box before the frame DOES flip, then the second box is drawn into the other frame buffer, which is never shown.

Remember that the frame you draw into is not the visible frame. You have to flip the frame buffer after drawing to see what you just drew.
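The draw-then-flip rule can be modeled with a tiny host-side mock. Everything here (the `MockVdp` struct, its field names, the 16-byte buffers) is invented for illustration and is not how the real VDP is accessed; it only shows that drawing lands in the hidden buffer until a flip happens.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the two frame buffers and the frame select bit. */
typedef struct {
    uint8_t  fb[2][16];   /* two tiny "frame buffers" */
    unsigned displayed;   /* index of the buffer currently on screen */
} MockVdp;

/* The CPU always draws into the buffer that is NOT being displayed. */
static uint8_t *draw_buffer(MockVdp *v)
{
    return v->fb[v->displayed ^ 1u];
}

/* What the viewer sees. */
static const uint8_t *visible_buffer(const MockVdp *v)
{
    return v->fb[v->displayed];
}

/* Swap the roles of the two buffers (instant here; on hardware it takes
 * effect at the next vertical blank). */
static void flip(MockVdp *v)
{
    v->displayed ^= 1u;
}
```

Drawing both boxes and then flipping once makes them visible together; drawing without a final flip leaves them in the hidden buffer.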
Yes, right now I'm just trying to get my method to work; when it does work I'll change it to a regular function and get rid of the macros and separate the slave code.

Thank you, however when I added a flipframebuf call after the box macros, I got a blank screen. If I kept flipping frame buffers constantly, I didn't get an image change. The only way I actually got the image I wanted was with this:

Code (Text):
MasterMain:
        box     $20,$20,$0, $40,$40,$7F, 1
        box     $20,$20,$0, $40,$40,$7F, 2
        bsr     flipframebuf
        nop
        bra     MasterMain
        nop

However, the image now flashes, which I don't want. I tried both the flipframebuf I had and the one you posted; both have the same effect.
Okay, let's see if we can't spiff up that flip routine.

Code (Text):
flipframebuf:
        mov.l   ffb_fbctl,r1
        mov.w   @r1,r0
        xor     #1,r0
        mov.w   r0,@r1
-       mov.w   @r1,r14
        cmp/eq  r0,r14
        bt      -

Here's your first bug. cmp/eq followed by bt means it will branch if they are the same... which won't be the case since you only JUST changed it! They will be NOT equal until the next vertical blank, when the value you wrote will take effect. Remember, the read value is ALWAYS the current setting, not what you wrote last. So you write a new value, read will return the old value until the next vblank, THEN it returns the new value. So the loop there needs to be bf, not bt. That will make it wait until the page is actually flipped.

Oh, you should also either AND off all but bits 1 and 0, or read the low byte of the reg as a byte. That way the HBLANK/VBLANK status bits don't interfere with the comparison. Come to think of it, I need to do that myself. :v:
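Here is a host-side sketch of the corrected wait, written in C against a mock register so it can run anywhere. The mock (`MockFbctl`, `mock_read`, `mock_write`) and the constant names are invented for this example; in particular, using $8000 for the vertical blank status bit is an assumption made only to show a status bit getting in the way. On real hardware the reads and writes would go to FBCTL at $2000410A.

```c
#include <assert.h>
#include <stdint.h>

#define FS_MASK  0x0001u   /* frame select bit */
#define VBLK_BIT 0x8000u   /* assumed position of the vblank status bit */

/* Hypothetical mock: a written FS value only takes effect at "vblank",
 * which arrives after a fixed number of reads. */
typedef struct {
    uint16_t current_fs;          /* what a read reports right now */
    uint16_t pending_fs;          /* what was last written */
    unsigned reads_until_vblank;  /* countdown to the simulated vblank */
} MockFbctl;

static void mock_write(MockFbctl *r, uint16_t value)
{
    r->pending_fs = value & FS_MASK;
}

static uint16_t mock_read(MockFbctl *r)
{
    uint16_t status = 0;
    if (r->reads_until_vblank > 0)
        r->reads_until_vblank--;
    if (r->reads_until_vblank == 0) {
        r->current_fs = r->pending_fs;  /* the flip takes effect */
        status = VBLK_BIT;              /* status bit set in the same word */
    }
    return (uint16_t)(status | r->current_fs);
}

/* Request a flip, then poll until the register reports the NEW value.
 * Masking with FS_MASK keeps status bits out of the comparison.
 * Returns the number of polls that still saw the old value. */
static unsigned flip_and_wait(MockFbctl *r)
{
    uint16_t want = (uint16_t)((mock_read(r) & FS_MASK) ^ 1u);
    unsigned polls = 0;
    mock_write(r, want);
    while ((mock_read(r) & FS_MASK) != want)
        polls++;
    return polls;
}
```

The loop condition is the C equivalent of cmp/eq followed by bf: keep looping while the register does not yet equal the value that was written. Without the mask, the raw word read at vblank in this mock would be $8001, which never equals 1.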
Ah, I see now. I also see that I was still doing the wrong compare: I was supposed to compare the value I wrote to the current register, not the old value. So yeah, thanks again.