Sonic and Sega Retro Message Board: Sonic 2 Split Disassembly


Sonic 2 Split Disassembly

#1 User is offline FraGag 

Posted 05 October 2008 - 11:06 PM

  • Posts: 659
  • Joined: 09-January 08
  • Gender:Male
  • Location:Québec, Canada
  • Project:an assembler
  • Wiki edits:6
Xenowhirl's Sonic 2 split disassembly is now up on the SVN server, as Scarred Sun announced. This will allow everyone to improve the disassembly, but keep in mind that the built ROM must be identical to Sonic the Hedgehog 2 REV01. The ROM is included in the repository (named s2rev01.bin), along with a batch file (chkbitperfect.bat) that uses it to check that your build still matches. Please do not commit if the built ROM is not identical to Sonic 2 REV01!
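
For anyone curious what a bit-perfect check boils down to, here is a minimal Python sketch. It is not the contents of chkbitperfect.bat, just an illustration of the idea: compare the built ROM byte for byte against s2rev01.bin and report where they first differ. The function names are made up for this example, and the path of the built ROM is left as a parameter because the thread does not name it.

```python
from pathlib import Path


def first_difference(built: bytes, reference: bytes):
    """Return the offset of the first differing byte, or None if identical."""
    if built == reference:
        return None
    for offset, (a, b) in enumerate(zip(built, reference)):
        if a != b:
            return offset
    # No mismatch in the common part: one file is a prefix of the other,
    # so they differ at the point where the shorter one ends.
    return min(len(built), len(reference))


def check_bit_perfect(built_path, reference_path="s2rev01.bin"):
    """Compare a built ROM on disk against the reference ROM (hypothetical helper)."""
    built = Path(built_path).read_bytes()
    reference = Path(reference_path).read_bytes()
    return first_difference(built, reference)
```

A result of None means the build is identical; any other value is the offset to start looking at.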

Before committing, please leave a comment about what you changed in changelog.txt. You can also put the message in the message field when committing your changes.

Another reason I created this topic is to gather opinions when someone wants to change something but doesn't know whether it will be well received. You don't need approval for every change, but if you're unsure, just ask here. Also, if you want to revert someone's changes, please ask here first, unless what you want to revert is vandalism.

So, I'll go first! I've changed build.bat to make AS output the errors to a separate file (s2.log) using the -E switch, instead of outputting them in the console. AS will delete the file if there are no errors, so the presence of this file can be tested to determine whether there were build errors/warnings. Having the errors in a file makes it easy to see all of them, because the console may cut off the start of the output if there are too many. It will also make it easier for beginners to share their errors. Is this a good idea?
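
To illustrate the "test the presence of the file" idea, here is a small Python sketch. It assumes only what is described above: after a build, s2.log exists if and only if AS reported errors or warnings. The helper names are hypothetical and this is not what build.bat does internally.

```python
from pathlib import Path


def build_had_messages(log_path="s2.log"):
    """True if the assembler left a log file behind (errors or warnings)."""
    return Path(log_path).is_file()


def read_messages(log_path="s2.log"):
    """Return the logged lines, or an empty list when there is no log file."""
    path = Path(log_path)
    if not path.is_file():
        return []
    return path.read_text(errors="replace").splitlines()
```

A wrapper script could call build_had_messages() right after the build and print read_messages() only when it returns True.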

Another change I'd like to see is having the code split into several .asm files. If you've worked with the modified version of Hivebrain's Sonic 1 disassembly targeting AS, you may have noticed that error messages can be lost (because the console's output is cut off) when a lot of files are processed after the error. I believe outputting the error messages to a separate file will solve this problem and allow us to split the code without too many worries. What do you think?

#2 User is offline shobiz 

Posted 06 October 2008 - 03:30 AM

  • Posts: 863
  • Joined: 27-March 05
  • Gender:Male
  • Location:Karachi, Pakistan
  • Wiki edits:4,411
As I said yesterday on IRC, this is awesome stuff. One question I had in mind: what naming scheme should we follow for new or renamed labels? Hive's 2005 scheme is pretty inconsistent, as is Xeno's, so it can be a bit of a hassle thinking of a new name. Another thing to note is that when changing labels, it's a good idea to note the old label name in a comment above it (as Xenowhirl has already done in many places), so that anyone searching for the old name can figure out the new one easily. (I forgot to do this myself yesterday; I'll fix that now.)

FraGag, on Oct 6 2008, 10:06 AM, said:

Another change I'd like to see is having the code split into several .asm files. If you've worked with the modified version of Hivebrain's Sonic 1 disassembly targeting AS, you may have noticed that error messages can be lost (because the console's output is cut off) when a lot of files are processed after the error. I believe outputting the error messages to a separate file will solve this problem and allow us to split the code without too many worries. What do you think?

The thing I dislike about splitting it like that is it makes searching much harder - for example if you want to find out where a RAM variable is used, it's far easier to do that when you have one large ASM file rather than multiple split ones. On the other hand, it makes editing much easier, so I'm pretty undecided.
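
For what it's worth, the searching problem with split files can be softened with a small script. Below is a rough Python sketch that looks for whole-word uses of a symbol across several sources; the function names are invented for this example, and the instruction syntax in the sample is only illustrative. The word-boundary check is what keeps a search for BG_X_pos from also matching Camera_BG_X_pos.

```python
import re
from pathlib import Path


def find_symbol(sources, symbol):
    """Return (file, line number, line) for each whole-word use of symbol.

    sources maps a file name to that file's full text.
    """
    # Reject matches that are part of a longer identifier on either side.
    pattern = re.compile(
        r"(?<![A-Za-z0-9_])" + re.escape(symbol) + r"(?![A-Za-z0-9_])"
    )
    hits = []
    for name, text in sources.items():
        for number, line in enumerate(text.splitlines(), start=1):
            if pattern.search(line):
                hits.append((name, number, line.strip()))
    return hits


def load_sources(root="."):
    """Read every .asm file under root (hypothetical layout)."""
    return {
        str(path): path.read_text(errors="replace")
        for path in sorted(Path(root).rglob("*.asm"))
    }
```

With the files loaded through load_sources(), find_symbol(load_sources(), "BG_X_pos") would list every use regardless of how many files the code is split into.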

FraGag, on Oct 6 2008, 10:06 AM, said:

So, I'll go first! I've changed build.bat to make AS output the errors to a separate file (s2.log) using the -E switch, instead of outputting them in the console.

The version of build.bat at the SVN doesn't have that change in it though :S

EDIT: Nvm, I modified it. I also added an if condition so that if the first parameter is -pe (i.e. you run it as build -pe or build.bat -pe), the errors are printed on screen rather than being sent to a file. chkbitperfect.bat has also been modified to support this parameter.

#3 User is offline shobiz 

Posted 06 October 2008 - 11:04 AM

  • Posts: 863
  • Joined: 27-March 05
  • Gender:Male
  • Location:Karachi, Pakistan
  • Wiki edits:4,411
Double posting because I noticed this in qiuu's log message:

Quote

A few propositions on renaming RAM address equates:
Metablock_Table -> ChunkTable (I think chunk is more commonly used nowadays)
BG_X_pos -> Camera_BG_X_pos
BG_Y_pos -> Camera_BG_Y_pos

I'm fine with all three, except with Chunk_Table instead of ChunkTable for consistency with all the other equates. What do you guys think?

#4 User is offline Scarred Sun 

Posted 06 October 2008 - 12:35 PM

  • In Defense of Lost Causes
  • Posts: 3726
  • Joined: 06-February 05
  • Gender:Female
  • Location:SD/LA/SF
  • Project:Staying woke
  • Wiki edits:36,091

shobiz, on Oct 6 2008, 03:30 AM, said:

FraGag, on Oct 6 2008, 10:06 AM, said:

Another change I'd like to see is having the code split into several .asm files. If you've worked with the modified version of Hivebrain's Sonic 1 disassembly targeting AS, you may have noticed that error messages can be lost (because the console's output is cut off) when a lot of files are processed after the error. I believe outputting the error messages to a separate file will solve this problem and allow us to split the code without too many worries. What do you think?

The thing I dislike about splitting it like that is it makes searching much harder - for example if you want to find out where a RAM variable is used, it's far easier to do that when you have one large ASM file rather than multiple split ones. On the other hand, it makes editing much easier, so I'm pretty undecided.



I'm not sure how feasible this would be for building purposes, but would it be possible to just put a switch in split.bat to allow for both options?

#5 User is offline Sik 

Posted 06 October 2008 - 12:38 PM

  • Sik is pronounced as "seek", not as "sick".
  • Posts: 6719
  • Joined: 17-March 06
  • Gender:Male
  • Project:being an asshole =P
  • Wiki edits:11
It should be possible. I know how to deal with both command line parameters and asking for simple input (e.g. one key stroke) in batch files.

#6 User is offline FraGag 

Posted 06 October 2008 - 04:08 PM

  • Posts: 659
  • Joined: 09-January 08
  • Gender:Male
  • Location:Québec, Canada
  • Project:an assembler
  • Wiki edits:6

shobiz, on Oct 6 2008, 04:30 AM, said:

FraGag, on Oct 6 2008, 10:06 AM, said:

So, I'll go first! I've changed build.bat to make AS output the errors to a separate file (s2.log) using the -E switch, instead of outputting them in the console.

The version of build.bat at the SVN doesn't have that change in it though :S

EDIT: Nvm, I modified it.

It was a suggestion :P. That's why I didn't commit it yet. I also added a check to see if the log file exists, I'll (merge and) commit when I'm home.

shobiz, on Oct 6 2008, 12:04 PM, said:

Double posting because I noticed this in qiuu's log message:

Quote

A few propositions on renaming RAM address equates:
Metablock_Table -> ChunkTable (I think chunk is more commonly used nowadays)
BG_X_pos -> Camera_BG_X_pos
BG_Y_pos -> Camera_BG_Y_pos

I'm fine with all three, except with Chunk_Table instead of ChunkTable for consistency with all the other equates. What do you guys think?

That's good for me.

Another thing I've been wondering about is how the objects are labelled. Currently, objects are identified with their object ID. I think it's easier to remember names rather than numbers, so I suggest we rename e.g. Obj01 to Sonic, Obj25 to Ring, Obj26 to Monitor, etc., and all labels starting with these names. Stealth's Sonic & Knuckles disassembly uses a similar approach because some objects are not in an object list, but that doesn't mean we should not adopt this idea.

About splitting the code, I agree that it makes searching more difficult. At the very least, I think that we could put the named constants in a separate file, so when one opens s2.asm, he sees the start of the code almost right away. Another advantage I see is that, when giving a name to a RAM variable, I usually add the constant, then want to do a global search & replace in the whole document, but that would also rename the value in the constant declaration itself (giving something like X = ramaddr( X )), and by having the constants in a separate file, searching & replacing wouldn't replace the value of the constant with the name of the constant. Shall we do this?
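
To make the search & replace problem concrete, here is a tiny Python sketch. The address and the exact declaration text are only illustrative, and the helper name is made up. With the constants in their own file, the replacement is applied to the code text only, so the declaration keeps its numeric value instead of turning into X = ramaddr( X ).

```python
import re


def replace_address(code_text, address, name):
    """Replace a raw address with a symbol name in code text."""
    # Do not match when more hex digits follow (a longer, different address).
    pattern = re.compile(re.escape(address) + r"(?![0-9A-Fa-f])")
    # A function is used as the replacement so the name is inserted literally.
    return pattern.sub(lambda match: name, code_text)


# Illustrative contents of the two files (not copied from the repository).
constants_text = "Chunk_Table = ramaddr( $FFFF0000 )\n"
code_text = "\tlea\t($FFFF0000).l,a1\n"

# Only the code is rewritten; constants_text is left untouched.
new_code_text = replace_address(code_text, "$FFFF0000", "Chunk_Table")
```

Running the same replacement over a single combined file would also hit the declaration line, which is exactly the annoyance described above.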

#7 User is offline shobiz 

Posted 07 October 2008 - 03:26 AM

  • Posts: 863
  • Joined: 27-March 05
  • Gender:Male
  • Location:Karachi, Pakistan
  • Wiki edits:4,411

FraGag, on Oct 7 2008, 03:08 AM, said:

Another thing I've been wondering about is how the objects are labelled. Currently, objects are identified with their object ID. I think it's easier to remember names rather than numbers, so I suggest we rename e.g. Obj01 to Sonic, Obj25 to Ring, Obj26 to Monitor, etc., and all labels starting with these names. Stealth's Sonic & Knuckles disassembly uses a similar approach because some objects are not in an object list, but that doesn't mean we should not adopt this idea.

I agree, though I'd prefer ObjSonic or Obj_Sonic over just Sonic.

FraGag, on Oct 7 2008, 03:08 AM, said:

At the very least, I think that we could put the named constants in a separate file, so when one opens s2.asm, he sees the start of the code almost right away. Another advantage I see is that, when giving a name to a RAM variable, I usually add the constant, then want to do a global search & replace in the whole document, but that would also rename the value in the constant declaration itself (giving something like X = ramaddr( X )), and by having the constants in a separate file, searching & replacing wouldn't replace the value of the constant with the name of the constant. Shall we do this?

I've encountered that a shitload of times, so definitely yes.

#8 User is offline FraGag 

Posted 07 October 2008 - 11:32 PM

  • Posts: 659
  • Joined: 09-January 08
  • Gender:Male
  • Location:Québec, Canada
  • Project:an assembler
  • Wiki edits:6

shobiz, on Oct 7 2008, 04:26 AM, said:

FraGag, on Oct 7 2008, 03:08 AM, said:

I think that we could put the named constants in a separate file, so when one opens s2.asm, he sees the start of the code almost right away. Another advantage I see is that, when giving a name to a RAM variable, I usually add the constant, then want to do a global search & replace in the whole document, but that would also rename the value in the constant declaration itself (giving something like X = ramaddr( X )), and by having the constants in a separate file, searching & replacing wouldn't replace the value of the constant with the name of the constant. Shall we do this?

I've encountered that a shitload of times, so definitely yes.

Done. The constants are now in s2.constants.asm.

shobiz, on Oct 7 2008, 04:26 AM, said:

I agree, though I'd prefer ObjSonic or Obj_Sonic over just Sonic.

Stealth used the latter format in his Sonic & Knuckles disassembly. I think it's a good compromise, but some objects may be hard to describe and end up with very long names. However, I'd like to hear more opinions on this before we start renaming.

Speaking of long names, do we want to keep the 15-character limit on labels? I don't think it makes much sense to keep this limit, since most of the code is indented with one tab. I restricted myself to 23 characters when labelling the object pointers in Obj_Index to avoid making the lines too long, but if nobody objects, we could give longer names instead of sometimes cryptic abbreviations that are not widely used. For example, I think we should put "Horizontal" instead of "Horiz" and "Vertical" instead of "Vert," but we can keep common abbreviations like "Pal," "Obj" and "Ptr."

#9 User is offline Sik 

Posted 07 October 2008 - 11:48 PM

  • Sik is pronounced as "seek", not as "sick".
  • Posts: 6719
  • Joined: 17-March 06
  • Gender:Male
  • Project:being an asshole =P
  • Wiki edits:11

FraGag, on Oct 8 2008, 02:32 AM, said:

shobiz, on Oct 7 2008, 04:26 AM, said:

FraGag, on Oct 7 2008, 03:08 AM, said:

I think that we could put the named constants in a separate file, so when one opens s2.asm, he sees the start of the code almost right away. Another advantage I see is that, when giving a name to a RAM variable, I usually add the constant, then want to do a global search & replace in the whole document, but that would also rename the value in the constant declaration itself (giving something like X = ramaddr( X )), and by having the constants in a separate file, searching & replacing wouldn't replace the value of the constant with the name of the constant. Shall we do this?

I've encountered that a shitload of times, so definitely yes.

Done. The constants are now in s2.constants.asm.

shobiz, on Oct 7 2008, 04:26 AM, said:

I agree, though I'd prefer ObjSonic or Obj_Sonic over just Sonic.

Stealth used the latter format in his Sonic & Knuckles disassembly. I think it's a good compromise, but some objects may be hard to describe and end up with very long names. However, I'd like to hear more opinions on this before we start renaming.

Speaking of long names, do we want to keep the 15-character limit on labels? I don't think it makes much sense to keep this limit, since most of the code is indented with one tab. I restricted myself to 23 characters when labelling the object pointers in Obj_Index to avoid making the lines too long, but if nobody objects, we could give longer names instead of sometimes cryptic abbreviations that are not widely used. For example, I think we should put "Horizontal" instead of "Horiz" and "Vertical" instead of "Vert," but we can keep common abbreviations like "Pal," "Obj" and "Ptr."

I guess the 15-character limit is there to make sure it's compatible with old assemblers (though if that were the case, we should restrict it to 8 characters instead). But none of snasm68k, asm68k and asmx have such a limit, so it's really pointless. I'd go with a 31-character limit. Object names don't need to be so descriptive either; abbreviations can be tolerated.

Also, I use H and V rather than Horiz and Vert, for the win =P

#10 User is online Qjimbo 

Posted 08 October 2008 - 01:29 AM

  • Your friendly neighbourhood lemming.
  • Posts: 4422
  • Joined: 17-February 03
  • Gender:Male
  • Location:Vancouver
  • Wiki edits:69
Will versions of this in non-SVN form be available periodically on the Disassemblies page or will users be required to install an SVN client to get a recent version?
This post has been edited by Qjimbo: 08 October 2008 - 01:30 AM

#11 User is offline Puto 

Posted 08 October 2008 - 01:52 AM

  • Shin'ichi Kudō, detective.
  • Posts: 2012
  • Joined: 31-July 05
  • Gender:Male
  • Location:Portugal, Oeiras
  • Project:Part of Team Megamix, but haven't done any actual work in ages.
  • Wiki edits:51
Personally, I think that problem would be nonexistent if the option to download a tarball of the entire repository was turned on in the WebSVN config...

#12 User is offline shobiz 

Posted 08 October 2008 - 03:51 AM

  • Posts: 863
  • Joined: 27-March 05
  • Gender:Male
  • Location:Karachi, Pakistan
  • Wiki edits:4,411

FraGag, on Oct 8 2008, 10:32 AM, said:

Speaking of long names, do we want to keep the 15-character limit on labels? I don't think it makes much sense to keep this limit, since most of the code is indented with one tab. I restricted myself to 23 characters when labelling the object pointers in Obj_Index to avoid making the lines too long, but if nobody objects, we could give longer names instead of sometimes cryptic abbreviations that are not widely used. For example, I think we should put "Horizontal" instead of "Horiz" and "Vertical" instead of "Vert," but we can keep common abbreviations like "Pal," "Obj" and "Ptr."

I had no idea we even had a limit. About abbreviations, personally I think Horiz and Verti/Vert are not that hard to understand, but I don't really mind either way.

#13 User is offline FraGag 

Posted 08 October 2008 - 07:33 AM

  • Posts: 659
  • Joined: 09-January 08
  • Gender:Male
  • Location:Québec, Canada
  • Project:an assembler
  • Wiki edits:6
My main concern about label names is that if we try to give every label a meaningful name, we'll have to use quite long names in some places.

shobiz, the 15-character limit comes from IDA, which suggests it by default, although it can be turned off. 31 characters is already better, but I don't like the idea of having a "conventional" limit if we can technically use longer labels. Obviously, giving very long names to labels that are used very frequently is not desirable. Maybe we could use more temporary symbols? Nameless temporary symbols are already used, but maybe we could try using named temporary symbols in large blocks of code like objects, where some labels are only referenced within that object but nameless temporary symbols wouldn't work. Also, when replacing labels like "loc_ABC" with temporary symbols, should we really keep the old label? I know some guides reference a few of them, so it may be a bit difficult to determine whether a given label should be kept.

#14 User is offline Sik 

Posted 08 October 2008 - 07:45 AM

  • Sik is pronounced as "seek", not as "sick".
  • Posts: 6719
  • Joined: 17-March 06
  • Gender:Male
  • Project:being an asshole =P
  • Wiki edits:11
The problem with removing character limits is that assemblers may not like it. I said 31 because it seems sensible to me; in C, the standard defines that compilers should at least remember the first 31 characters of an identifier name, and I don't think the situation is so different in assemblers. So I wouldn't go over that limit, or we run the risk of it not working on real assemblers.
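
If a limit like this were adopted, it could be checked mechanically. Here is a rough Python sketch that lists labels longer than a given limit. It assumes labels start in the first column and end with a colon, which may not cover every label style in the disassembly; the function name and the sample label are made up.

```python
import re

# Assumed label shape: identifier at the start of the line, followed by a colon.
LABEL = re.compile(r"^([A-Za-z_.][A-Za-z0-9_.]*):")


def long_labels(text, limit=31):
    """Return (line number, label) for every label longer than limit."""
    result = []
    for number, line in enumerate(text.splitlines(), start=1):
        match = LABEL.match(line)
        if match and len(match.group(1)) > limit:
            result.append((number, match.group(1)))
    return result
```

Passing limit=15 would instead list everything over the IDA default mentioned earlier in the thread.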

#15 User is offline ICEknight 

Posted 08 October 2008 - 09:07 AM

  • Posts: 10633
  • Joined: 11-January 03
  • Gender:Male
  • Location:Spain
  • Wiki edits:18

shobiz, on Oct 7 2008, 04:26 AM, said:

I agree, though I'd prefer ObjSonic or Obj_Sonic over just Sonic.
Or obj01_Sonic, so you don't have to look for its object number somewhere else?

Also, having the object number in there would allow unknown objects to be labelled obj45_unk, obj46_unk, etc., without reusing the same name.
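
As a quick illustration of that scheme, here is a Python sketch that builds such labels. It assumes the object ID is written as two uppercase hex digits, as in the existing ObjXX labels; the function name is made up.

```python
def object_label(object_id, name=None):
    """Build a label such as obj01_Sonic, or obj45_unk when the name is unknown."""
    return "obj{:02X}_{}".format(object_id, name or "unk")
```

Because the ID is part of the label, two unknown objects never end up with the same name.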
This post has been edited by ICEknight: 08 October 2008 - 09:07 AM
