
Writing portable C

Discussion in 'Technical Discussion' started by BenoitRen, Mar 1, 2024.

  1. Glitch

    Glitch

    Tech Member
    My point was that "portable" isn't a binary thing. Sure, you could write C code that will run on anything from a PDP11, through an Android smartphone, up to the latest supercomputers, but do you really need to do that? By being pragmatic about what platforms you're actually targeting you can write code that is portable within that feature set, giving yourself and the compiler more opportunities to optimise for the platforms that 99% of your users will be using.

    You can't compile the linux kernel with a Sun C compiler. Does that mean it's not portable?
     
  2. synchronizer

    synchronizer

    Member
    I agree with Glitch. At some point you need to be realistic and target a class of platforms you actually care about. I think the real problem is a lack of open source, or a lack of maintenance, for codebases. Just because you use a more portable int type doesn't mean you're going to recompile it for everyone. The code still needs to be available.
    And this is why typedefs are useful.
     
  3. BenoitRen

    BenoitRen

    Tech Member
    I must admit that giving the compiler more opportunities to optimise is a good point.

    That being said, I'm still not convinced when it comes to fixed-width integer types. Leaving aside the fact that they are optional, MSVC on Windows has been notorious for being slow to adopt C99. The first version to have stdint.h (which adds the types) is Visual Studio 2010! That means there are several versions of the compiler out there (since C99) that do not have the fixed-width types (at least not out of the box).

    Now, I'm sure some of you will tell me "but those are old!". Sure, but 1) we're on Sega Retro ;), and 2) I'd like my code to be as backwards compatible on Windows as possible. Ideally as far back as Windows 95. Given that Sonic & Knuckles Collection was made for it, I don't think that's a strange idea.
     
  4. MainMemory

    MainMemory

    Kate the Wolf Tech Member
    SonLVL
    If you're planning on targeting those specific compilers, you could add checks for those versions and define your own integer types instead of using stdint.h:
    Code (Text):
    #if defined(_MSC_VER) && _MSC_VER < 1600
    /* MSVC older than Visual Studio 2010 (_MSC_VER 1600) ships no stdint.h */
    typedef unsigned char uint8_t;
    /* ... */
    #else
    #include <stdint.h>
    #endif
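    Hand-rolled typedefs are only as good as the guess about each type's width, so it can help to let the compiler check the guess. The following is a minimal sketch using the old negative-array-size trick, which also works in C89. The names `my_u8`, `my_u16`, `my_u32` and `STATIC_CHECK` are made up for illustration, and the choice of underlying types assumes the usual sizes on 32-bit Windows rather than anything the standard guarantees.

```c
#include <limits.h>

/* Hypothetical fallback typedefs for a compiler without stdint.h.
   The underlying types are an assumption that holds on typical
   32-bit Windows targets; the C standard does not guarantee them. */
typedef unsigned char  my_u8;
typedef unsigned short my_u16;
typedef unsigned int   my_u32;

/* Compile-time check usable in C89: when the condition is false the
   array size becomes -1 and compilation fails at this line. */
#define STATIC_CHECK(name, cond) typedef char name[(cond) ? 1 : -1]

/* These compare storage size in bits, which is what matters when the
   types are used to lay out data. */
STATIC_CHECK(check_u8_is_8_bits,   sizeof(my_u8)  * CHAR_BIT == 8);
STATIC_CHECK(check_u16_is_16_bits, sizeof(my_u16) * CHAR_BIT == 16);
STATIC_CHECK(check_u32_is_32_bits, sizeof(my_u32) * CHAR_BIT == 32);
```

    If a new platform breaks one of the assumptions, the build stops with an error pointing at the failed check instead of silently miscompiling.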
     
  5. Clownacy

    Clownacy

    Tech Member
    I disagree that 'portable' is not binary. Though, perhaps 'portable' is just too vague a word. My preference is to write code that adheres to the C standard, invoking no undefined behaviour nor relying on optional features. That way, my code is universally portable to any platform that provides a standards-compliant C implementation. To me, if it is written in C, but cannot be compiled with a C compiler, then it is not portable C.

    Regardless, whether you disagree with BenoitRen's definition of 'portable' or not, this thread is about writing code that adheres to that standard of portability, not whether code should adhere to it in the first place. This is the typical StackOverflow behaviour of commenting on the question itself rather than answering it.
     
  6. BenoitRen

    BenoitRen

    Tech Member
    I now have a case where I have to copy the upper 16 bits of a signed long into the upper 16 bits of another signed long. Thinking on this, I've come up with the following code:
    Code (Text):
    pNewActwk->xposi = pActwk->xposi / 65536 * 65536;
    It's not ideal, because the lower 16 bits of the destination will be overwritten. For my current case it suffices as it's for a new object, but I can imagine this won't always be the case.
     
  7. nineko

    nineko

    I am the Holy Cat Tech Member
    italy
    Premise: I know nothing about this particular scenario, and my C is rusty, so I might be talking out of my ass, but, can't you just use ANDs and ORs? Like:
    Code (Text):
    result = (long1 & 0xFFFF0000) | (long2 & 0x0000FFFF);
    edit: I originally masked 8 bits out of 16 instead of 16 out of 32, thanks to Glitch for noticing, this is what I get when I post while half asleep.
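    To make the masking idea concrete, here is a small sketch that does the work on `unsigned long`, where `&` and `|` are fully defined. The function name `combine_words` is made up, and, like the rest of the thread, it assumes the interesting data lives in the low 32 bits.

```c
/* Take bits 16..31 from `upper_src` and bits 0..15 from `lower_src`.
   Operating on unsigned long keeps the bitwise operators well defined;
   the UL suffixes force the constants to unsigned long on every platform. */
unsigned long combine_words(unsigned long upper_src, unsigned long lower_src)
{
    return (upper_src & 0xFFFF0000UL) | (lower_src & 0x0000FFFFUL);
}
```

    For example, `combine_words(0x12345678UL, 0xABCDEF01UL)` gives `0x1234EF01UL`. How to get a signed long in and out of this safely is a separate question.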
     
    Last edited: Mar 14, 2024
  8. President Zippy

    President Zippy

    Zombies rule Belgium! Member
    I'm glad you pushed back on Stack Overflow behavior, although you forgot about all the incorrect answers that don't even compile :P

    As far as adhering to C standards, I think it's hard to go wrong with using C99 when there's a C99 compiler for every platform known to man. Some folks will insist on strict ANSI C, but people are too used to having certain nice things:

    1. Declaring their for loop counter in-line
    2. Leaving "//" single-line comments
    3. Not having to declare all their variables at the top of a function
    4. Having a bool type.
    5. Fixed-width integer types

    As far as using fixed-width integer types, this is the most incontrovertibly sound and fundamentally essential advice posted so far. It's also only possible to portably use fixed-width integer types from C99 onwards. Having been forced to use the Solaris Studio compiler in the past for SPARC and x86 Solaris, I'd say you could still claim portability to Solaris if you ask users to install GCC or Clang, which have binary distributions for some pretty old versions of Solaris and even z/OS.

    I think the OP was just asking in a nutshell, "how can I minimize the amount of time it will take to add support for a new platform in the future?" Here's my crack at a straightforward, albeit intensely paranoid answer:

    1. Stick to strict C99 in case you want to compile on a platform that doesn't support the latest version of GCC/Clang (e.g. SGI Irix). Make sure when you set the compiler flags on GCC/Clang that you explicitly set "-std=c99".

    2. On your primary platform, set "-Wall -Wpedantic -Werror -Wextra -Walloc-zero -Wcast-align" to enforce compliance to the standard language spec, catch an OS-specific behavior (malloc(0)), and catch an arch-specific behavior (unaligned accesses).

    3. Use the C standard library as much as possible and use POSIX only when absolutely necessary (e.g. for dealing with endianness). Never use Linux-specific or GNU-specific functions. Porting S&K Collection should not necessitate multithreading, so steer clear of pthreads and hthreads. For the same reason, don't send signals on POSIX platforms. Be sure to write a signal handler and make sure it behaves as similarly as possible to the ways the Windows task manager kills a process.

    4. If you're going to do anything OS-specific, compiler-specific (compiler intrinsics), or architecture specific (inline-ASM), remember to put a feature test macro around it (#ifdef) and to write some #else code that does the same thing more slowly. For example, if you write some code that does __sync_add_and_fetch, write an #else block that does a thread.

    5. Don't convert ASCII characters to numbers and vice versa. I learned this the hard way when my code didn't work on z/OS, which uses EBCDIC. (EDIT) You won't have to deal with EBCDIC, but you don't know for sure whether your code page will be UTF-8 or just plain-old ASCII.

    6. Don't assume the size of a struct (e.g. sizeof(my_struct)) will be the same on all platforms. Some platforms force structs to be word-aligned instead of byte-aligned if the processor in question doesn't support unaligned memory access (e.g. POWER/PPC arch).

    7. Always use fixed-width integer types instead of long and short when you need a 64-bit or 16-bit integer.

    8. Should you find yourself saving binary data to a file, always write it out in big-endian format just as TCP and IP packet metadata is big-endian. Be prepared to define your own inline functions for converting to/from big/little endian as needed.

    9. Remember to use the volatile keyword when working with any kind of buffer that could be shared between the CPU and some other piece of hardware acceleration.

    10. Factor your graphics and sound code as separately from your game logic as humanly possible, because this will necessarily have to be rewritten for other platforms.
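    Point 8 is the easiest one on the list to show in code. Below is a minimal sketch of the kind of helper it describes, written with shifts so that it behaves the same on little-endian and big-endian hosts. The names `put_be32` and `get_be32` are invented for this example, and `unsigned long` is used because it is guaranteed to hold at least 32 bits even without stdint.h.

```c
/* Store the low 32 bits of `value` as four bytes, most significant first.
   `out` must point to at least 4 writable bytes. */
void put_be32(unsigned char *out, unsigned long value)
{
    out[0] = (unsigned char)((value >> 24) & 0xFFUL);
    out[1] = (unsigned char)((value >> 16) & 0xFFUL);
    out[2] = (unsigned char)((value >> 8) & 0xFFUL);
    out[3] = (unsigned char)(value & 0xFFUL);
}

/* Rebuild the value from four big-endian bytes. The shifts work on the
   numeric value, so host byte order never enters into it. */
unsigned long get_be32(const unsigned char *in)
{
    return (((unsigned long)in[0] & 0xFFUL) << 24) |
           (((unsigned long)in[1] & 0xFFUL) << 16) |
           (((unsigned long)in[2] & 0xFFUL) << 8)  |
            ((unsigned long)in[3] & 0xFFUL);
}
```

    Writing the buffer with `fwrite` and reading it back with `fread` then gives the same file on every platform, which is the whole point of fixing the byte order up front.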

    Bonus. Contrary to one of OP's points, you can safely use unions. Just don't try to use them for C++-style inheritance + dynamic polymorphism. Here's an example where they are necessary:

    struct tiny_rpc {
        enum {
            error = -1,
            unknown = 0,
            hello = 1,
            goodbye = 2,
            text = 3
        } type;
        union { // anonymous unions were only standardised in C11; strict C99 needs a named member
            struct hello {...};
            struct goodbye { ... };
            struct text {...};
        };
    };
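
    A compilable variant of the same idea, as a rough sketch: the payload fields are invented here because the original elides them, the union is given a name (`u`) so the code stays within C99, and the enum constants are renamed with an `RPC_` prefix. The helpers `make_text` and `rpc_text_or_null` are likewise hypothetical.

```c
#include <string.h>

enum rpc_type {
    RPC_ERROR = -1,
    RPC_UNKNOWN = 0,
    RPC_HELLO = 1,
    RPC_GOODBYE = 2,
    RPC_TEXT = 3
};

struct tiny_rpc {
    enum rpc_type type;               /* tag: says which member of u is valid */
    union {
        struct { int version; }   hello;    /* hypothetical payload */
        struct { int reason; }    goodbye;  /* hypothetical payload */
        struct { char body[32]; } text;     /* hypothetical payload */
    } u;                              /* named member, so this is valid C99 */
};

/* Build a text message, truncating `s` to fit the fixed-size buffer. */
struct tiny_rpc make_text(const char *s)
{
    struct tiny_rpc msg;
    msg.type = RPC_TEXT;
    strncpy(msg.u.text.body, s, sizeof msg.u.text.body - 1);
    msg.u.text.body[sizeof msg.u.text.body - 1] = '\0';
    return msg;
}

/* Only read the union member that matches the tag. */
const char *rpc_text_or_null(const struct tiny_rpc *msg)
{
    return (msg->type == RPC_TEXT) ? msg->u.text.body : NULL;
}
```

    Checking the tag before touching the union is what keeps this portable: the code never reads a member other than the one last written.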

    I can edit with more stuff I remember from my experience writing code that ran on Linux, AIX, Windows, Solaris, z/OS, HP/UX, and FreeBSD if it pops up in my head, but I think this would reduce your portability problems by about 95%.

    Appendix: Also, if you ever need to spawn a child process, then never fork without exec'ing and don't do anything in between fork() and exec(), as Windows doesn't have fork(). Just use posix_spawn() to achieve the same behavior as Windows process spawning. I put this in "appendix" because porting S&K Collection doesn't involve spawning child processes.
     
    Last edited: Mar 13, 2024
  9. BenoitRen

    BenoitRen

    Tech Member
    Yeah, that's a better way. But I'm not entirely sure if it works as expected for signed integers.
     
  10. Glitch

    Glitch

    Tech Member
    Interesting that you say that and then go on to repeat the exact same argument that we've been making. Nobody is saying "don't write portable code"; we're saying "work out what you mean by 'portable' and then write code that fits that definition".

    To answer the question: what nineko suggests will work (reinterpret it as an unsigned long and mask with 0xFFFF0000). It's technically not "portable", as per Clownacy's definition, because it assumes 2's complement arithmetic but, realistically, it will work on any platform you can get your hands on.

    Edit to add: note that we're assuming long is 32 bits here which, again, is not strictly portable. It's usually either 32 or 64 bits. You can pick an appropriate mask based on the size of long.
     
    Last edited: Mar 14, 2024
  11. President Zippy

    President Zippy

    Zombies rule Belgium! Member
    Slow your roll there, brother man! I wasn't talking about anything someone said in this thread, just Stack Overflow itself.

    I just really hate Stack Overflow because it's a piss poor surrogate for well-thought out documentation, and it's a cesspool of Dunning-Kruger hackers who gatekeep people who just want to make stuff that makes the world a better place. More so, I hate it because I'm fixing code full of snippets of bad Stack Overflow answers copied verbatim right now.

    I saw the posts about doing bit manipulations on signed integers, and I was satisfied with other people's answers, absent context on why bit hacking is necessary.

    FWIW, I read Clownacy's blog post proscribing the use of fixed-width integer types, and I think it's based on the assumption that the code using it is performance-critical all or at least most of the time. Given the "culture" of C programming I think it's a fair assumption and his advice when followed will reduce the amount of bad code in the world. However, there are other important reasons to use C99 in 2024, chiefly that you can write a program with no external dependencies: only what your OS's libc gives you. This alone justifies writing core utilities like ls, cat, awk, etc. in C99. Fixed-width integer types in conjunction with trapping overflow/underflow is a good way to detect bad behavior in test cases. For a QUIC implementation or software implementations of IP, UDP, and TCP, it's a more expressive way to explain what's what and is a more intuitive way to use inline functions that convert big endian to host endianness.

    As far as recommendations not to use C99 because of Microsoft's politically-driven refusal to adopt C99 in MSVC, I think using Clang on Windows for portability's sake is less painful than writing ANSI C.
     
    Last edited: Mar 14, 2024
  12. BenoitRen

    BenoitRen

    Tech Member
    Nobody is recommending to not use C99 in this thread.
     
  13. President Zippy

    President Zippy

    Zombies rule Belgium! Member
    I thought that's the direction you were going because you mentioned trying to support as many compilers as possible and MSVC < 2015 not having support for the full C99 spec.

    When you asked about portability, I wasn't sure if you were trying to just be as compatible as possible with future platforms and platforms that currently receive maintenance, or if you wanted to leave the door open for discontinued platforms like pre-Vista Windows NT or Windows 9x.

    Speaking of discontinued platforms, are you only targeting platforms that guarantee hardware floating point support and use virtual memory, or are you interested in embedded platforms like Nintendo DS/3DS? The amount of work you need to do to remain portable varies depending on your answer.

    Also, are you interested in WebAssembly? That has some idiosyncratic rules of its own as well, in which case you will not be able to read/write save and config files directly, but will need to write some JavaScript code to do it for you and then make C calls to that JS code.

    Now that I think about it, if you targeted your decomp to WebAssembly, that could be a panacea for portability (not that I tried it before). Anything that can run Firefox or Chromium could run your C code.
     
  14. nineko

    nineko

    I am the Holy Cat Tech Member
    italy
    You're right, I was half asleep and I masked 8 bits out of 16 instead of 16 out of 32, thanks for proofreading it for me and for confirming that it would work!

    I'll amend my original post to rectify my mistake.
     
  15. BenoitRen

    BenoitRen

    Tech Member
    I said that in the context of C99's fixed-width types. If my goal was to use C89, I wouldn't have been able to use them at all.
    Being able to port it to handheld consoles like the 3DS and PSP would be great.
    Not at the moment.
     
  16. BenoitRen

    BenoitRen

    Tech Member
    I was missing a way to portably extract the lower 16 bits of a signed long. Thanks to an answer on StackExchange I now know how!

    That means that I can complete the puzzle of how to copy the upper 16 bits of a signed long into the upper 16 bits of another signed long in a portable way. Behold!
    Code (Text):
    pNewActwk->xposi = (pActwk->xposi / 65536 * 65536) + (pNewActwk->xposi % 65536);
    Parentheses added for readability; they're not required.

    That's another macro for my arsenal!
     
  17. BenoitRen

    BenoitRen

    Tech Member
    My way to extract the higher bits of a long in a portable way doesn't seem to work correctly when said long is negative. Or, at least, it doesn't match the way Sonic 3 does it.

    Example: my long is -5750784. The portable way to extract the higher bits is to divide it by 65536, which nets me -87 (binary: 1111 1111 1010 1001). However, if I do it the same way assembly does, which is equivalent to masking with 0xFFFF0000 and shifting 16 bits to the right, I get -88 (binary: 1111 1111 1010 1000).

    Is this a case where a macro won't do, and I need a function to detect if the number is negative to compensate?
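
    The two results can be reproduced side by side. This is only a sketch with made-up function names. It assumes a C99 compiler, where `/` is defined to truncate toward zero, and it emulates the mask-and-shift on `unsigned long` so that the result does not depend on how the host stores negative numbers.

```c
/* Integer part by division. C99 defines this as truncation toward zero;
   C89 left the rounding of negative quotients up to the implementation. */
long int_part_by_division(long x)
{
    return x / 65536;
}

/* Integer part the way the post describes the assembly doing it:
   mask with 0xFFFF0000, shift right by 16, then read the 16-bit result
   as a signed word. Converting a long to unsigned long is defined as
   modular arithmetic, so this is well defined for negative input and
   does not require long to be exactly 32 bits. */
long int_part_by_mask(long x)
{
    unsigned long word = (((unsigned long)x) & 0xFFFF0000UL) >> 16;
    return (word >= 0x8000UL) ? (long)word - 65536L : (long)word;
}
```

    With -5750784 the first function gives -87 and the second gives -88. With -6291456, which has no fractional part, both give -96. In other words, the mask-and-shift rounds toward negative infinity while the division rounds toward zero.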
     
  18. Cooljerk

    Cooljerk

    Professional Electromancer Oldbie
    My general advice to anybody looking into writing portable C code is to read and study the Doom source code. Doom is probably the single best example of how to write portable C code, to the point where they offered a class on it at my university.
     
  19. BenoitRen

    BenoitRen

    Tech Member
    I looked at Doom's source code, which does use signed fixed point integers. It doesn't have methods or macros for getting the integral part, so I had to search a bit for where they're used.

    From what I've seen, negative integers are turned into positive integers before being processed, and turned back into negative integers when the operation is done.

    Both dividing 5750784 by 65536 and shifting it by 16 bits net me 87 instead of the wanted 88. Which means I'll have to compensate for the way the Mega Drive handles parts of a negative integer.

    Note that I can't just subtract 1 from negative integers and call it a day, because if there's no fractional part the results of division and shifting are identical. Here's an example using -6291456 (binary: 1111 1111 1010 0000 0000 0000 0000 0000):

    -6291456 / 65536 = -96
    -6291456 >> 16 = -96
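
    One way to compensate only when there is a fractional part, without touching floating point, is to look at the sign of the remainder. The sketch below is an assumption about what such a helper could look like, not code from the game or from Doom. The names are invented, and values are assumed to fit in 32 bits so the multiplications cannot overflow.

```c
/* Floor-style integer part of a 16.16 fixed-point value.
   The remainder is computed by hand, so the result is the same whether
   the compiler's division truncates (C99) or floors (allowed in C89). */
long fixed_floor_int(long x)
{
    long q = x / 65536;
    long r = x - q * 65536;
    if (r < 0) {
        q -= 1;   /* division rounded toward zero: step down to the floor */
    }
    return q;
}

/* Matching fractional part, always in the range 0..65535. */
long fixed_floor_frac(long x)
{
    return x - fixed_floor_int(x) * 65536;
}

/* Integer part of `src` combined with the fractional part of `dst`. */
long copy_int_part(long src, long dst)
{
    return fixed_floor_int(src) * 65536 + fixed_floor_frac(dst);
}
```

    `fixed_floor_int(-5750784)` gives -88 and `fixed_floor_int(-6291456)` gives -96, so the adjustment only kicks in when a fraction is present. On a two's complement machine `copy_int_part` should agree with the mask-based version, but it uses only arithmetic on signed values.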
     
  20. nineko

    nineko

    I am the Holy Cat Tech Member
    italy
    Ha, this is similar to the rounding error which happens in the Virtual Console version of Super Mario 64. I wonder if someone (e.g. emulator developers) looked into that and already tried to find a viable solution? I mean, this is trivial to fix with some IFs, even I could do that, but perhaps there already is some quality research about this.
     
  21. DigitalDuck

    DigitalDuck

    Arriving four years late. Member
    Lincs, UK
    TurBoa, S1RL
    ...

    floor(x/65536.0)