
C++ discussion thread (w/ Boost)

Discussion in 'Technical Discussion' started by President Zippy, Sep 17, 2022.

  1. President Zippy


    Zombies rule Belgium! Member
    A thread for discussing the best way to do stuff in C++, i.e. best-practices and features that might be unfamiliar to beginners.

    I made this thread because most of my experience was with C++03 in an environment where the STL could not be used, due to a combination of an outdated version of xlC on IBM z/OS (for z/Series mainframes) and the fact that my previous employer wrote most of its base libraries before the C++98 standard. The point here is that modern C++ (>= 11) is full of useful functionality like constexpr, static_assert, smart pointers, move semantics, and of course the STL containers, which can be used to quickly produce useful code that's also type-safe. Better yet, Boost has libraries for graph search algorithms and good ongoing proposals, like a decimal floating-point library.

    To give an example of why I need this thread, and why I think other people might be in the same boat: on my GitHub I wrote an example matrix library to support complex numbers, and in it I needlessly reinvented std::complex.

    No question is stupid, no proposal is ridiculous, and any answers or examples are worth their weight in gold, but good ones are worth their weight in FL-grade diamond.

    EDIT: Here are some thoughts and observations that I think are especially relevant to C programmers who think C++ is just C with classes, as someone who used to think C++ is just C with classes...

    The C learning curve has just one plateau between being a beginner and fully mastering the core language (minus OS-specific libraries such as POSIX): learning how to pass functions around as data via function pointers, and building big projects around that concept.

    On the other hand, the C++ learning curve has 5-10 plateaus between a beginner and Bjarne Stroustrup. Here are just the 5 I've observed and some hallmarks of someone who has reached each plateau:

    1. Dunning-Kruger (unknowingly incompetent) C++ programmers: C-style programming, but with std::string, classes, and occasionally virtual functions (dynamic polymorphism).

    2. Basically competent C++ programmers: Using the STL, using RAII idioms where possible, using smart pointers instead of raw pointers, allocating on the heap only when data is too big for the stack (i.e. no unnecessary uses of "new"), and passing non-primitive data by reference.

    3. People who should be allowed to work on low-latency systems in C++: Using functors and lambdas instead of function pointers, creating template classes to avoid dynamic polymorphism, implementing custom iterators and streams.

    4. People who should be allowed to write books on how to use C++: Using move semantics to pass around data without copying instead of pointers, understanding the general concept of rvalues vs lvalues, implementing custom "new" operators, and occasionally specializing template classes to optimize code.

    5. People who should be allowed to architect low-latency systems in C++: Using full-blown template metaprogramming and constexpr instead of C macros to safely inline code.
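    To make the jump from (1) to (2) concrete, here is a minimal sketch (the type names are invented for the example) contrasting C-style manual ownership with the RAII equivalent:

    ```cpp
    #include <cstddef>
    #include <vector>

    // Plateau 1: "C with classes" -- manual new/delete, a raw owning pointer.
    struct legacy_reader {
        char* buffer;
        explicit legacy_reader(std::size_t n) : buffer(new char[n]) {}
        ~legacy_reader() { delete[] buffer; }  // easy to forget, easy to double-free
    };

    // Plateau 2: RAII -- the container owns the memory; no delete anywhere,
    // and the class is safely copyable/movable for free.
    struct raii_reader {
        std::vector<char> buffer;
        explicit raii_reader(std::size_t n) : buffer(n) {}
    };

    int main() {
        raii_reader r(1024);  // freed automatically when r goes out of scope
        return r.buffer.size() == 1024 ? 0 : 1;
    }
    ```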

    After 5 years of full-time work almost entirely in C++, I'm only somewhere between (3) and (4). To be uncomfortably blunt, I don't think anyone with an account on this forum is above (3). If you take the time to learn the "C++ way of writing C++", you can do things in 1 week that would take 6 weeks in C. Speaking from experience, by the time you reach (3) you can write a simple plaintext HTTP client in a day, a task which takes 2 weeks in C.

    Here are some telltale signs you're not fully utilizing C++:

    1. Your own code contains void-pointer function arguments and member variables, and you use C-style casts on them instead of static_cast or reinterpret_cast where appropriate. NOTE: reinterpret_cast is usually baaaaaad.

    2. You're using raw pointers.

    3. You're using smart pointers to data that cannot be null, rather than using references.

    4. You're using C stdlib and posix/Win32 API calls that have a higher-level abstraction in the C++ STL.

    5. You're allocating small, non-persistent data on the heap, i.e. using "new" to allocate temporary variables.

    6. You're aliasing with "typedef" instead of "using".

    7. You're using C macros to write inlined code instead of constexpr functions.
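    Signs 6 and 7 are quick to illustrate (the alias and function names here are invented for the example):

    ```cpp
    #include <cstdint>
    #include <utility>
    #include <vector>

    // Sign 6: "using" reads left to right and, unlike "typedef", can be templated.
    typedef std::uint32_t block_id_t;   // C style
    using chunk_id_t = std::uint32_t;   // C++11 style
    template <typename T>
    using pair_vec = std::vector<std::pair<T, T>>;  // no bare-typedef equivalent

    // Sign 7: a constexpr function replaces a function-like macro with real,
    // type-checked code that the compiler can still evaluate at compile time.
    #define SQUARE_MACRO(x) ((x) * (x))  // blind text substitution
    constexpr int square(int x) { return x * x; }
    static_assert(square(12) == 144, "evaluated at compile time");

    int main() {
        pair_vec<chunk_id_t> edges{{1u, 2u}};
        return (square(3) == SQUARE_MACRO(3) && edges.size() == 1) ? 0 : 1;
    }
    ```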
     
    Last edited: Sep 28, 2022
  2. MarkeyJester


    Original, No substitute Resident Jester
    I often program tools which look for repeating data to omit while it is actively being generated (for example compression streams, or art/tile generation where repeats can be referenced, etc).

    Because the size of the data won't be known until after processing is finished, the size of the heap is variable, so I compromise between fragmenting the heap and wasting memory by allocating a small space and, each time it fills up, reallocating to twice the size (so the space grows exponentially).

    In order to keep the searching time down, the data is hashed and indexed with an array of index pointers pointing to the heap. But of course, if the heap is reallocated, then the pointers change, and so the index array needs its pointers adjusting.

    The annoying thing about C/C++, in my opinion, is pointer arithmetic. Simply subtracting the old pointer and adding the new pointer won't cut it, as the compiler scales the addresses by the object/element size.

    Over the years I have tried a variety of methods to get around it, some involving dividing the pointers by the size of the object element, some involving scanf into integers to force the arithmetic to behave normally so I can perform the addition/subtraction and then recast back. The latest version I've constructed to get around it as of today is:
    Code (Text):
    Pointer = (Pointer - (Type*) Old) + (Type*) New;

    Where "Pointer" would be the index pointer, "New" is the new realloc'd pointer, and "Old" is the original pointer the index pointers were pointing to. The cast type has to be specified even if the pointers are already said cast type.

    All these methods have worked, but my question is: is there a specific or more efficient way of dealing with adjusting the index pointers? Because this seems like an awfully unnecessary mess to have to deal with, all because the pointer arithmetic is interfering with basic binary mathematics.
     
  3. MainMemory


    Kate the Wolf Tech Member
    I would think the best way to deal with it would be to have plain indexes that you use as offsets from the main pointer, rather than storing a full pointer that you then need to adjust during reallocation. You could also try using an associative container from the STL, which comes in sorted and hashed (unordered) varieties.
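    A minimal sketch of that first suggestion (all names invented for the example): the table stores element offsets into the buffer, so a reallocation never invalidates an entry:

    ```cpp
    #include <cstddef>
    #include <unordered_map>
    #include <vector>

    // Instead of Type* entries that dangle whenever the buffer reallocates,
    // the hash table stores element offsets into the buffer.
    struct repeat_index {
        std::vector<char> buffer;  // grows by doubling internally, like the scheme above
        std::unordered_map<std::size_t, std::size_t> table;  // hash -> offset

        std::size_t append(char byte) {
            buffer.push_back(byte);  // may reallocate; stored offsets stay valid
            return buffer.size() - 1;
        }
        char at(std::size_t offset) const { return buffer[offset]; }
    };

    int main() {
        repeat_index idx;
        std::size_t where = idx.append('A');
        idx.table[42u] = where;                            // "hash 42 seen at this offset"
        for (int i = 0; i < 10000; ++i) idx.append('B');   // forces several reallocations
        return idx.at(idx.table[42u]) == 'A' ? 0 : 1;      // offset still valid
    }
    ```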
     
  4. President Zippy


    Zombies rule Belgium! Member
    It sounds like you're doing dictionary-based compression, e.g. the LZ family of compression algorithms. If you wish to avoid fragmentation while maintaining a reasonable amount of wasted memory, I think redesigning your compressor around a memory pool is the way to go.

    Luckily, Boost has a mem pool library ripe for use:
    https://www.boost.org/doc/libs/1_46_1/libs/pool/doc/concepts.html
    https://www.boost.org/doc/libs/1_78_0/libs/pool/doc/html/index.html

    The idea of a pool is that unlike malloc() (and by extension, the built-in implementation of new), all objects in a pool are of equal size, i.e. no fragmentation. For a compressor, I think this is a reasonable tradeoff if you have a reliable estimate of the average compression ratio of an N-byte chunk of data, and the standard deviation from this average is relatively small. You can always make the default object size for a memory pool equal to (avg + 1 std dev), which accounts for 84% of the blocks of data being compressed. Rather than fitting each allocation request to the then-current size of data and reallocating later, you can always request a larger (but not too large) chunk ahead of time, and expand into it. For the 16% of the time (size > 1 std dev) where you need more than 1 chunk, you can request 2 chunks.

    Each of your memory blocks would be some smart C++ object like this:
    Code (Text):
    class pool_object {
    public:
        // A non-owning pointer to another pool object holding overflow data.
        // Stays empty unless size > block_size.
        std::weak_ptr<pool_object> next;
        unsigned int size;        // bytes used
        char data[block_size];
    };

    // UserAllocator is some surrogate for the built-in new which takes
    // advantage of the fixed object size.
    boost::object_pool<pool_object, UserAllocator> my_pool;
    Instead of updating your dictionary on overflow, you just allocate another chunk and store a pointer to it in the data chunk contained in your dictionary. Ideally, you can just use std::unordered_map as your dictionary and define your own hash function if the built-in one is too slow.

    The most important part is that this approach wastes on average only (std dev) bytes of memory per chunk, which is more predictable than the doubling scheme's anywhere from zero to 100% overhead, and usually much less. I need to think this through a little harder to give you a mathematician's answer, but it's simpler, more predictable, more type-safe, and avoids dangerous pointer arithmetic.
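    To sketch the unordered_map-with-custom-hash idea (the key type and the FNV-1a hash below are stand-ins I made up for the example, not anything from Boost):

    ```cpp
    #include <cstddef>
    #include <string>
    #include <unordered_map>

    // A hypothetical dictionary key: the first bytes of an uncompressed pattern.
    struct pattern_key {
        std::string prefix;
        bool operator==(const pattern_key& o) const { return prefix == o.prefix; }
    };

    // A cheap FNV-1a hash as a stand-in for "your own hash function if the
    // built-in one is too slow".
    struct pattern_hash {
        std::size_t operator()(const pattern_key& k) const {
            std::size_t h = 1469598103934665603ull;
            for (unsigned char c : k.prefix) { h ^= c; h *= 1099511628211ull; }
            return h;
        }
    };

    int main() {
        // The third template parameter plugs the custom hash into the map.
        std::unordered_map<pattern_key, std::size_t, pattern_hash> dict;
        dict[pattern_key{"abab"}] = 7;  // pattern -> chunk offset
        return dict.at(pattern_key{"abab"}) == 7 ? 0 : 1;
    }
    ```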

    EDIT: Also worth mentioning, since you're writing the compressor rather than the decompressor, the cost of C++ abstractions is much lower on the host hardware than on something like a Sega MD/Genesis.
     
    Last edited: Sep 19, 2022
  5. MarkeyJester


    Original, No substitute Resident Jester
    Well no, I mean, I wasn't actually writing an LZ compressor; it was merely an example of one of a few cases where one would want to search for repeats. For LZ algorithms I would probably create a small window the size of the offset + length, make copies of the uncompressed bytes, and wrap the index around that, thus not needing any reallocation for the indexing itself anyway... But I digress...

    ...my question was meant to be generic, and I only threw in compression as an example, my bad.

    The idea of allocating brand new segments in separate blocks sounds like a reasonable idea, as does storing the relative address (or slot index) rather than the absolute address (though I suspect for the latter there will be instances where, in order to obtain the relative address or slot, I would need to perform the stupid pointer-arithmetic avoidance methods above anyway, but it's still a reasonable compromise).

    Thanks for your suggestions guys, not really the solutions I was hoping for, but they are useful ideas nonetheless~
     
  6. President Zippy


    Zombies rule Belgium! Member
    I think I would have led you astray anyway, since my idea was a way of managing a dictionary of uncompressed data patterns, while your idea of only storing pointers to the patterns as they exist in the file was not just a design choice but an implied requirement.

    In any case, this thread probably isn't what you're looking for, since the point of modern C++ is to outright avoid raw pointers entirely, i.e. solving the kind of problems where you're doing it wrong if you have any raw pointer variables (as opposed to std::unique_ptr, std::shared_ptr, std::weak_ptr, or ideally just references). Data compression algorithms are relatively self-contained, so you don't need all the type safety of C++, or "object-oriented programming" for that matter.

    C++ also compiles into much bigger machine code than C, and statically linking libc++ into a ROM file for anything with as little memory as a 20th century video game console is asking too much. Anytime you hear a Boost contributor propose a new "zero cost abstraction", just know that they're promising you the moon.
     
  7. MainMemory


    Kate the Wolf Tech Member
    I don't believe he has any intention of using C++ on the Mega Drive, rather making tools to make files for MD software.
     
  8. President Zippy


    Zombies rule Belgium! Member
    I suspected that from the outset, but the point seemed nonetheless worth mentioning in case someone who isn't MarkeyJester or MainMemory gets ideas about trying to compile it into embeddable 68k ASM. I proposed writing libraries in C and compiling them into 68k ASM before, but I wouldn't do the same with C++. If you statically link libc or libc++, you only link in the functions and variables that actually get used, and while it is feasible to write C code without using the C library functions at all, it is next to impossible in C++. If you try to use std::vector (the dynamic array), I suspect you'll pull in at least 10 other classes and possibly even the exception machinery.
     
  9. President Zippy


    Zombies rule Belgium! Member
    Value-adding thread bump: a very informative 1-hour talk on the inner workings of clang++, and a great example of how a lot of C++ compiler bugs in the wild stem either from ambiguity in the language spec or from the sheer incomprehensibility of the massive spec.



    "C++ is the most complicated language there is, the most complicated language there ever was, and the most complicated language there ever will be".
    -Bret Hart, probably
     
    Last edited: Nov 15, 2023