C++ Memory Tracking - Bad to have the allocation information before the real alloc?

10 comments, last by Shaarigan 4 years, 2 months ago

Hi everybody!
Is it bad to store the memory allocation information right before the real allocation?
Is that a discouraged way to achieve memory tracking?
Is it better to keep the allocation information in a hash table somewhere else?
Basically, is it bad to do: malloc(SizeOfInfo + SizeAlloc)
Will it change the program's behavior when memory tracking is enabled?
Does it change the memory alignment and other things?
Thanks a lot!


You seem to think so, given how you phrase the questions.

I would say it all depends on what you aim to achieve. That is, what is the purpose of tracking memory allocations?

Folding that data in a header before the “real” data is one option, which has the advantage that it expands as you have more allocations without much work. On the other hand, getting an overview of what you have is more complicated, as the administration is scattered across your entire memory space. Also, your tracking data is in-between “real” data, thus if code managing the real data writes or reads out of bounds, your tracking data may get corrupted. Depending on what you aim for, this may or may not worry you.

Obviously, malloc is still going to return the start address of the entire block, so yes, it needs a bit of care to ensure the “SizeAlloc” part is still properly aligned for its members (for instance by making “SizeOfInfo” a multiple of the required alignment), and somewhere you need to manage the “SizeOfInfo” offset. Otherwise, I think it's not much of a problem: malloc() returns an arbitrary address anyway, so if you offset that by a bit, nobody should notice.
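A minimal sketch of the header-in-front approach, to make the alignment point concrete. The names `AllocHeader`, `TrackedMalloc`, `TrackedFree`, `g_liveBytes` and `g_liveCount` are made up for illustration and do not come from this thread. It assumes the header can simply be padded to `alignof(std::max_align_t)`, so the pointer handed to the caller keeps the same alignment guarantee as plain malloc.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical tracking header stored directly in front of the user data.
// alignas pads it to a multiple of the strictest fundamental alignment, so
// the pointer handed to the caller stays as aligned as malloc's own result.
struct alignas(std::max_align_t) AllocHeader
{
    std::size_t size; // bytes requested by the caller
};

static std::size_t g_liveBytes = 0; // not thread-safe; illustration only
static std::size_t g_liveCount = 0;

void* TrackedMalloc(std::size_t size)
{
    void* raw = std::malloc(sizeof(AllocHeader) + size);
    if (!raw)
        return nullptr;
    AllocHeader* header = static_cast<AllocHeader*>(raw);
    header->size = size;
    g_liveBytes += size;
    ++g_liveCount;
    return header + 1; // user data starts right after the header
}

void TrackedFree(void* ptr)
{
    if (!ptr)
        return;
    // Step back over the header to recover the pointer malloc returned.
    AllocHeader* header = static_cast<AllocHeader*>(ptr) - 1;
    assert(g_liveCount > 0 && g_liveBytes >= header->size);
    g_liveBytes -= header->size;
    --g_liveCount;
    std::free(header);
}
```

The cost of this choice is that every allocation grows by `sizeof(AllocHeader)`, which is typically 16 bytes on 64-bit desktop platforms even though only one `size_t` is stored.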

Last but not least, C++ allows overriding ‘new’ so you can do your own memory management of a class. That is arguably a better path for what you're trying to do, at least useful to explore as an alternative imho.
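As a rough illustration of the class-level route, here is a sketch with a hypothetical `Particle` class that counts its own heap usage. It relies on the sized form of the class-scope `operator delete`, so no header is needed for this simple counter; whether that is enough depends on what you want to track.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical class that counts its own heap allocations by overloading
// operator new/delete at class scope; other types are unaffected.
class Particle
{
public:
    static std::size_t liveBytes; // bytes currently allocated for Particle objects

    static void* operator new(std::size_t size)
    {
        void* ptr = std::malloc(size);
        if (!ptr)
            throw std::bad_alloc();
        liveBytes += size;
        return ptr;
    }

    // The sized form lets the compiler pass the object size back in, so the
    // counter can be decremented without storing the size anywhere.
    static void operator delete(void* ptr, std::size_t size) noexcept
    {
        if (!ptr)
            return;
        assert(liveBytes >= size);
        liveBytes -= size;
        std::free(ptr);
    }

    float x = 0.0f, y = 0.0f, z = 0.0f;
};

std::size_t Particle::liveBytes = 0;
```

With this in place, `new Particle()` and `delete` go through the two functions above, while array forms (`new Particle[n]`) would still use the global operators unless `operator new[]` and `operator delete[]` are overloaded too.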

Thanks for the answer!

The goal is to know the memory usage per category at any time, to be able to show it on screen, and also to save it to a file.
Of course, the result file saved on disk will also contain the memory leaks.
This file can also be used during CI (continuous integration) to produce graphs and possibly block merges.

As you said, overloading new and delete in classes is part of the task; they will call the special malloc/free, which track the memory.

Basically:
1) Custom malloc, custom free, they just call malloc and free + memory tracking code inside.
2) Macro to have a custom new.
3) Macro to have classes implement the custom new/delete overloading.

That way, malloc, free, new and classes are all managed, and the per-category tracking only covers your own code.
The plan is clear, but the question is how to keep the behavior intact while having a clean implementation of memory tracking.
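The three steps could be sketched roughly like this. Everything here (`MyMalloc`, `MyFree`, `TrackedTag`, `MY_NEW`, `MyDelete`, `MY_TRACKED_CLASS`, `Mesh`) is a hypothetical name chosen for the example, and categories are left out to keep it short; it only counts live allocations.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

static std::size_t g_liveAllocs = 0; // live tracked allocations (not thread-safe)

// 1) Custom malloc/free: the real malloc/free plus the tracking code.
inline void* MyMalloc(std::size_t size)
{
    void* ptr = std::malloc(size);
    if (ptr)
        ++g_liveAllocs;
    return ptr;
}

inline void MyFree(void* ptr)
{
    if (!ptr)
        return;
    assert(g_liveAllocs > 0);
    --g_liveAllocs;
    std::free(ptr);
}

// Tag type that selects the tracked overload of operator new.
struct TrackedTag {};

inline void* operator new(std::size_t size, TrackedTag)
{
    void* ptr = MyMalloc(size);
    if (!ptr)
        throw std::bad_alloc();
    return ptr;
}

// Matching form; the compiler calls it only if a constructor throws.
inline void operator delete(void* ptr, TrackedTag) noexcept { MyFree(ptr); }

// 2) Macro for a tracked new, plus a helper that destroys and frees.
#define MY_NEW new (TrackedTag{})

template <typename T>
void MyDelete(T* object)
{
    if (!object)
        return;
    object->~T();
    MyFree(object);
}

// 3) Macro that gives a class its own tracked new/delete.
#define MY_TRACKED_CLASS                                            \
    static void* operator new(std::size_t size)                     \
    {                                                               \
        void* ptr = MyMalloc(size);                                 \
        if (!ptr)                                                   \
            throw std::bad_alloc();                                 \
        return ptr;                                                 \
    }                                                               \
    static void operator delete(void* ptr) noexcept { MyFree(ptr); }

struct Mesh
{
    MY_TRACKED_CLASS
    int vertexCount = 0;
};
```

Usage would look like `int* value = MY_NEW int(5);` paired with `MyDelete(value);`, while `new Mesh()` and `delete mesh` keep their normal syntax. Objects created through `MY_NEW` must not be released with a plain `delete`, which is the main thing a cleaner design would have to guard against.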

Did you look for memory analysis tools already? There are quite nifty tools around, developed further than you could ever manage by yourself. “valgrind” comes to mind (not sure it runs on Windows), which can produce a list of allocations and also detects e.g. use after free, double ‘free’, or out-of-bounds writes. There is another tool that produces fancy “heat” diagrams that relate functions and memory use (I don't remember its name, unfortunately).

As for your question, I would likely avoid big changes between development and deployment, and simply keep all the stuff in the deployed version as well, except for showing and saving (although that could be nice for crash analysis?).

If you don't want that, the simplest path to removing tracking is likely to keep the custom ‘new’ (so the classes don't change), but implement it with a malloc of just the requested data size, making it basically a “no-op” step.
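One way to sketch that switch is a compile-time flag that turns the tracking hooks into empty functions while the allocation entry points stay the same. The flag name `MEMORY_TRACKING` and the functions `EngineMalloc`, `EngineFree`, `OnAlloc` and `OnFree` are assumptions for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical build switch; define MEMORY_TRACKING=0 for shipping builds.
#ifndef MEMORY_TRACKING
#define MEMORY_TRACKING 1
#endif

#if MEMORY_TRACKING
static std::size_t g_trackedAllocs = 0; // live allocations, tracking builds only
inline void OnAlloc() { ++g_trackedAllocs; }
inline void OnFree()  { assert(g_trackedAllocs > 0); --g_trackedAllocs; }
#else
// Tracking disabled: the hooks compile down to nothing.
inline void OnAlloc() {}
inline void OnFree()  {}
#endif

// Callers always use these two functions, whatever the build configuration.
inline void* EngineMalloc(std::size_t size)
{
    void* ptr = std::malloc(size);
    if (ptr)
        OnAlloc();
    return ptr;
}

inline void EngineFree(void* ptr)
{
    if (!ptr)
        return;
    OnFree();
    std::free(ptr);
}
```

Because this variant keeps its counter outside the allocation, both configurations return exactly what malloc returns. A header-based tracker would additionally change the block size and address between the two builds.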

What I do in my engine is two kinds of memory tracking. My allocations always have a small number of header bytes placed before the real data, which contain the total amount of data allocated minus the header bytes. I use this in all circumstances to keep track of allocations per allocator, so I can detect and assert on memory leaks early if an allocator goes out of scope but still has memory it hasn't released. I can also keep track, in the allocator, of the total number of allocations and bytes requested from memory. I think this is what you want, because you can track the amount of memory used at runtime.

As this is not very good for profiling exactly which function allocated how much memory and when, I also added some profiling data to my allocators. This data is not saved and is only available in profiling builds, but it contains the amount of memory requested and the trace of which function requested it. The data is sent via UDP to my profiler frontend or the editor, and I can see the total memory consumption as a realtime graph.
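The per-allocator bookkeeping described above might look roughly like the following. This is not the engine's actual code; `TrackingAllocator` and its members are hypothetical, and the header is padded to `alignof(std::max_align_t)` as a simplifying assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical allocator that knows how much memory it still owns and
// asserts if it is destroyed while allocations are outstanding.
class TrackingAllocator
{
public:
    ~TrackingAllocator()
    {
        // Going out of scope with live allocations means something leaked.
        assert(m_liveCount == 0 && m_liveBytes == 0);
    }

    void* Allocate(std::size_t size)
    {
        void* raw = std::malloc(sizeof(Header) + size);
        if (!raw)
            return nullptr;
        Header* header = static_cast<Header*>(raw);
        header->size = size; // user bytes, not counting the header itself
        m_liveBytes += size;
        ++m_liveCount;
        ++m_totalAllocations;
        return header + 1;
    }

    void Free(void* ptr)
    {
        if (!ptr)
            return;
        Header* header = static_cast<Header*>(ptr) - 1;
        assert(m_liveCount > 0 && m_liveBytes >= header->size);
        m_liveBytes -= header->size;
        --m_liveCount;
        std::free(header);
    }

    std::size_t LiveBytes() const { return m_liveBytes; }
    std::size_t TotalAllocations() const { return m_totalAllocations; }

private:
    struct alignas(std::max_align_t) Header { std::size_t size; };

    std::size_t m_liveBytes = 0;
    std::size_t m_liveCount = 0;
    std::size_t m_totalAllocations = 0;
};
```

`LiveBytes()` is the number you would draw on screen or write to the report file, and the destructor assert is what catches a leak as soon as the owning system shuts down.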

This is exactly, 100%, what I'm looking for.
It looks like you have no issues at all with the header allocation method, so it's surely not that bad in the end.
I have 2 questions:
1) Do you get the trace using the Win32 API? That raises a cross-platform issue then.
2) Is it really good to have different allocator classes instead of just one malloc/free with the category name as a parameter?

I don't have any issues with it so far, just some wasted bytes when aligning memory to a certain alignment for performance reasons. It is quite simple: allocate as much space as you want plus the header, and then shift the final pointer before handing it to the caller.

For the traces I use the CaptureStackBackTrace WinAPI call, which provides a session-unique list of function pointers. I guess backtrace is the Linux alternative (as I don't develop for Apple platforms, this is all I need). I'm not using any names or descriptor flags, just tracking when a function acquires and releases memory using the obtained pointer address.
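A small sketch of how the two calls could sit behind one function. `CallTrace`, `CaptureTrace` and `kMaxFrames` are made-up names, and the non-Windows branch assumes a C library that ships `<execinfo.h>` (glibc does; others may not).

```cpp
#include <cassert>
#include <cstddef>

#if defined(_WIN32)
#include <windows.h>
#else
#include <execinfo.h> // assumption: glibc or another libc providing backtrace()
#endif

constexpr int kMaxFrames = 16;

// Raw return addresses only; no symbol names are resolved here.
struct CallTrace
{
    void* frames[kMaxFrames];
    int count;
};

inline CallTrace CaptureTrace()
{
    CallTrace trace = {};
#if defined(_WIN32)
    // Skip 0 frames, capture up to kMaxFrames, no hash requested.
    trace.count = CaptureStackBackTrace(0, kMaxFrames, trace.frames, nullptr);
#else
    trace.count = backtrace(trace.frames, kMaxFrames);
#endif
    assert(trace.count <= kMaxFrames);
    return trace;
}
```

The captured addresses are only meaningful within the running process, so resolving them to function names would be a separate step done by the profiler or an offline tool.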

I have different allocators for different allocation strategies, and possibly the same allocator in multiple instances. This depends on what code is calling it and other circumstances. My allocation model is a strict must-pass parameter in any function that uses memory, using a default allocator (that has to be defined first) for convenience. If you want to allocate memory by category, simply add an optional 1 or 2 byte parameter to every memory allocation you have and put that into the header. Your allocator then has to keep track of which ID the allocation/release belongs to and add it into some kind of static array, where each index is the category ID. This way you don't have to maintain additional memory for a map structure.
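A sketch of the category idea, assuming a one-byte ID stored in the header and a fixed-size array indexed by that ID. The enum values and the names `CategoryHeader`, `CategoryMalloc`, `CategoryFree` and `g_bytesPerCategory` are invented for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Hypothetical category IDs; MEM_COUNT is the array size, not a category.
enum MemCategory : std::uint8_t { MEM_GENERAL, MEM_RENDER, MEM_AUDIO, MEM_COUNT };

// One slot per category ID, so no map or extra heap memory is needed.
static std::size_t g_bytesPerCategory[MEM_COUNT] = {};

struct alignas(std::max_align_t) CategoryHeader
{
    std::size_t size;     // bytes requested by the caller
    MemCategory category; // the 1-byte category ID stored with the allocation
};

inline void* CategoryMalloc(std::size_t size, MemCategory category = MEM_GENERAL)
{
    void* raw = std::malloc(sizeof(CategoryHeader) + size);
    if (!raw)
        return nullptr;
    CategoryHeader* header = static_cast<CategoryHeader*>(raw);
    header->size = size;
    header->category = category;
    g_bytesPerCategory[category] += size;
    return header + 1;
}

inline void CategoryFree(void* ptr)
{
    if (!ptr)
        return;
    // The header tells us which category to credit, so free needs no parameter.
    CategoryHeader* header = static_cast<CategoryHeader*>(ptr) - 1;
    assert(header->category < MEM_COUNT);
    assert(g_bytesPerCategory[header->category] >= header->size);
    g_bytesPerCategory[header->category] -= header->size;
    std::free(header);
}
```

Dumping `g_bytesPerCategory` once per frame, or at shutdown, would give the per-category numbers for the on-screen display and the report file.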

Thanks for the answers, that's extremely helpful.
About allocators, I only know regular malloc/new and pool allocation. Are there others?
About the alignment, I guess the special “MallocWithAlignment” now has to take the tracking header into account when tracking is enabled.

I've seen and used a few more allocators over time:

  • Pooled/bucket allocators, for example to minimize memory fragmentation. A common game engine strategy is to sort memory requests into pools by size, depending on the size of the object and the boundaries of the pool. This way requests fit better into the available memory and you don't waste too much over time, because small and large objects don't share the same pool.
  • Paged allocators that reserve memory in page-sized chunks, useful for handling virtual memory or fiber stacks.
  • Stack allocators, if you can rely on the order of allocation/deallocation.

There was also a special-case allocator for when I wanted fiber-local storage per call context rather than per thread. In general, it doesn't matter where the memory comes from, as long as the caller is not bothered with the allocation strategy.
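To give one of these a concrete shape, here is a minimal stack allocator sketch over a fixed buffer. `StackAllocator`, `Marker`, `GetMarker` and `FreeToMarker` are hypothetical names, and the marker-based rollback is just one common way to express "free in reverse order".

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical fixed-capacity stack (LIFO) allocator: allocations bump an
// offset, and memory is released by rolling back to an earlier marker.
template <std::size_t Capacity>
class StackAllocator
{
public:
    using Marker = std::size_t;

    void* Allocate(std::size_t size, std::size_t alignment = alignof(std::max_align_t))
    {
        assert(alignment != 0 && (alignment & (alignment - 1)) == 0); // power of two
        std::uintptr_t base = reinterpret_cast<std::uintptr_t>(m_buffer);
        // Round the current top of the stack up to the requested alignment.
        std::uintptr_t aligned =
            (base + m_offset + alignment - 1) & ~(std::uintptr_t(alignment) - 1);
        std::size_t start = static_cast<std::size_t>(aligned - base);
        if (start > Capacity || size > Capacity - start)
            return nullptr; // out of space
        m_offset = start + size;
        return m_buffer + start;
    }

    Marker GetMarker() const { return m_offset; }

    // Frees everything allocated after the marker was taken.
    void FreeToMarker(Marker marker)
    {
        assert(marker <= m_offset);
        m_offset = marker;
    }

private:
    alignas(std::max_align_t) unsigned char m_buffer[Capacity];
    std::size_t m_offset = 0;
};
```

Individual frees do not exist here by design. That is what makes the allocator cheap, and also why it only fits code whose lifetimes really are nested, such as per-frame or per-scope scratch memory.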

As my philosophy is “as much as needed, as little as possible” when it comes to dependencies and API calls in my engine, I also wrote my own aligned allocation code. I know there is an _aligned_malloc call in the CRT, but this one also takes the header bytes into account.

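// byte, uint16, uptrint, varying, se_null and Runtime::malloc are not defined in this post;
// presumably they are the engine's own typedefs and wrappers.
typedef byte AlignmentHeader;

// Rounds ptr up to the next multiple of alignment (must be a power of two).
inline void* Align(void* ptr, uint16 alignment)
{
    uptrint mod = (uptrint)(ptr) & (uptrint)(alignment - 1);
    // Already aligned: return ptr unchanged. Adding a full alignment step here
    // would push the block one byte past the end of the allocation below.
    return mod ? (byte*)ptr + (alignment - mod) : ptr;
}

// Allocates size bytes aligned to alignment, with header bytes reserved in front.
inline void* Allocate(varying size, uint16 alignment, uint16 header)
{
    void* result = se_null;
    void* ptr = Runtime::malloc(size + (alignment - 1) + header);
    if (ptr)
    {
        result = Align((byte*)ptr + header, alignment);
        // First header byte: offset from the malloc pointer to the returned pointer.
        *(AlignmentHeader*)((byte*)result - header) = static_cast<AlignmentHeader>((uptrint)(result) - (uptrint)(ptr));
    }
    return result;
}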

I shift the memory pointer by the desired header size and perform the final alignment after that. So the header ends up wherever it lands, which I don't care about because it isn't performance critical, and the final memory pointer is properly aligned to the desired boundary. The first byte of the header stores the offset from the original pointer that results from the chosen alignment, as I think more than a single byte isn't needed to store the value (an alignment of 256 and more sounds insane; you want any of 2, 4, 8 or rarely 16, sometimes also 32).
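The post does not show the matching free, so here is a self-contained sketch of the same scheme with standard types, including one possible way to release the block. `AlignedAllocate` and `AlignedFree` are hypothetical names, and the one-byte offset limit is enforced with an assert as an assumption about intended use.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Allocates size bytes aligned to alignment, with header bytes reserved
// directly in front of the returned pointer.
inline void* AlignedAllocate(std::size_t size, std::size_t alignment, std::size_t header)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0); // power of two
    assert(header >= 1 && header + alignment - 1 <= 255);         // offset must fit in one byte
    unsigned char* raw =
        static_cast<unsigned char*>(std::malloc(size + (alignment - 1) + header));
    if (!raw)
        return nullptr;
    // Skip the header first, then round up to the requested alignment.
    std::uintptr_t address = reinterpret_cast<std::uintptr_t>(raw) + header;
    std::uintptr_t aligned = (address + alignment - 1) & ~(std::uintptr_t(alignment) - 1);
    unsigned char* result = raw + (aligned - reinterpret_cast<std::uintptr_t>(raw));
    // First header byte: distance from the malloc pointer to the user pointer.
    *(result - header) = static_cast<unsigned char>(result - raw);
    return result;
}

// Needs the same header size that was passed to AlignedAllocate.
inline void AlignedFree(void* ptr, std::size_t header)
{
    if (!ptr)
        return;
    unsigned char* result = static_cast<unsigned char*>(ptr);
    std::size_t offset = *(result - header);
    std::free(result - offset); // step back to what malloc originally returned
}
```

The remaining `header - 1` bytes after the offset byte are free for tracking data such as a size or a category ID.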

Ok yes, I see: malloc the header + data with room for the alignment, but align after the header.
Allocators look like a big topic; I wonder how to write something very clean that users can use in a clean way.
