Memory usage is often overlooked as Moore’s law brings us larger amounts of memory year after year; still, it matters significantly for multiple reasons. First, even though the total memory available to an application keeps increasing, memory speed and cache sizes have not kept pace. This means that the more memory an application has to walk, the more likely it is to fall out of the cache and slow down, which is a problem for performance-critical tasks. Access latency differs by roughly two orders of magnitude between the cache and main memory; as a rule of thumb, you can assume about 1 ns for a cache hit and 100 ns for a main memory access.
Something else you might not realize is that fetching data from the memory subsystem costs energy. Stated like this it seems obvious, but it has a real consequence: consuming less memory also means consuming less energy. Some hardware can push this even further by dynamically turning off unused memory banks to avoid even the refresh cost of that memory; this is one of the possible uses of Linux memory hotplug. One final, more obvious reason to restrict memory usage is that when applications use less memory, more of them can run smoothly at the same time.
Just like everything else you don’t pay attention to, application memory usage will get worse if left unchecked. This short checklist should provide a helpful starting point.
The Memory Optimization Checklist
To get started, as with all optimization work, you need at least one benchmark, if not a set of benchmarks, that reproduces a meaningful, real-life scenario. Once you have this, you can run it under Valgrind’s Massif tool to produce a heap profile that can be visualized with massif-visualizer.
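If you have never read a Massif profile, it can help to first try the tool on a toy program whose behavior you already know. The sketch below is not a real benchmark: `run_leaky_workload` and `CHUNK_SIZE` are hypothetical names made up for illustration. It allocates two buffers per round and deliberately frees only one, so heap usage should grow steadily for as long as it runs.

```c
#include <stdlib.h>
#include <string.h>

enum { CHUNK_SIZE = 4096 };

/*
 * Hypothetical toy workload, for illustration only.
 * Each round allocates two chunks but frees only one of them,
 * so the heap grows by CHUNK_SIZE bytes per round.
 * Returns the number of bytes leaked on purpose.
 */
size_t run_leaky_workload(size_t rounds)
{
    size_t leaked = 0;

    for (size_t i = 0; i < rounds; i++) {
        char *scratch = malloc(CHUNK_SIZE);
        char *kept = malloc(CHUNK_SIZE);

        if (scratch == NULL || kept == NULL) {
            /* Out of memory: release what we got and stop early. */
            free(scratch);
            free(kept);
            break;
        }

        /* Touch the memory so the allocations are actually used. */
        memset(scratch, 0, CHUNK_SIZE);
        memset(kept, 0, CHUNK_SIZE);

        free(scratch);
        /* Deliberate bug: 'kept' is never freed. */
        leaked += CHUNK_SIZE;
    }

    return leaked;
}
```

Call `run_leaky_workload` from a small `main`, build without optimization (for example `cc -g -O0`) so the compiler does not remove the allocations, and run the binary with `valgrind --tool=massif`. Massif writes a `massif.out.<pid>` file that you can open in massif-visualizer or print with `ms_print`.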
With a profile in hand, here’s a simple checklist to work through.
- Look for any trend in the curve that suggests memory usage grows forever. This indicates a memory leak, which should be fixed first!
- Look for any initialization of libraries that are unexpected for the specific use case.
- Look for the biggest memory consumers and check each structure to ensure the following:
  - The structure actually has a use,
  - Every field of the structure is used,
  - The structure fields are ordered from largest to smallest (basically sorting by sizeof, which keeps padding to a minimum),
  - Costly, hidden structures like pthread mutexes aren’t overused, and
  - Arrays are properly aligned at both ends (if the array size is yours to choose, it is best to pick it so that the total size of the structure is a multiple of 4K, to avoid memory waste).
- Finally, look at how the different fields are used, find the ones that are accessed together, and pack them next to each other so they are more likely to share a cache line.
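The structure checks above are easy to try on a small example. The following is a minimal sketch, assuming a made-up record type: `item_unsorted`, `item_sorted`, `item_locked`, and `report_layouts` are hypothetical names, not taken from any real code base. It compares the same fields in arbitrary order and sorted by size, and shows what embedding a pthread mutex in every object costs.

```c
#include <pthread.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical record with fields in the order they happened to be added.
 * The compiler inserts padding after the small fields so that the larger
 * ones stay correctly aligned. */
struct item_unsorted {
    uint8_t  flags;
    void    *payload;
    uint16_t kind;
    uint64_t timestamp;
    uint8_t  priority;
};

/* The same fields ordered from largest to smallest: the small fields
 * now share what used to be padding. */
struct item_sorted {
    uint64_t timestamp;
    void    *payload;
    uint16_t kind;
    uint8_t  flags;
    uint8_t  priority;
};

/* Embedding a lock in every object adds sizeof(pthread_mutex_t) to each
 * instance, which can exceed the size of the data it protects. */
struct item_locked {
    struct item_sorted item;
    pthread_mutex_t    lock;
};

/* Print the sizes so the layouts can be compared on the target platform. */
void report_layouts(void)
{
    printf("item_unsorted:   %zu bytes\n", sizeof(struct item_unsorted));
    printf("item_sorted:     %zu bytes\n", sizeof(struct item_sorted));
    printf("pthread_mutex_t: %zu bytes\n", sizeof(pthread_mutex_t));
    printf("item_locked:     %zu bytes\n", sizeof(struct item_locked));
}
```

The exact numbers depend on the ABI. On a typical x86-64 Linux build you would expect roughly 40 bytes for the unsorted layout, 24 for the sorted one, and a mutex that is larger than the sorted record itself. Multiplied by millions of instances, this kind of difference shows up clearly in a Massif profile.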
This checklist should provide a good starting point to optimize your application’s memory usage! If you have any tips of your own, feel free to leave them in a comment.