Jump to content

Optimize Grid memory layout for faster memory I/O


Recommended Posts

The class Grid contains a lot of the map data. What's interesting here is a bunch of pointers, which seems to be used for arrays, which in turn each stores a cached value for each cell.

The concept seems to be good. It grants quick access to data, which needs to be accessed quickly for performance reasons (likely TheSIM). The problem is that each array is allocated at a random location in memory meaning they are optimized for looping through all cells, but they are bad when getting data for the same cell from multiple arrays.

I propose making one array of struct instances. The class would then contain data like temperature, mass, elementIdx etc. This will result in an array where all cells are after each other like right now, but if you read say the temperature, when you need to read the mass next, it's already in the level 1 cache because those two numbers are right next to each other in memory. You might also be able to save some overhead if you implement a get method, which returns a class reference instead of fetching from an index each time you need a variable.

Keep in mind that for this to work efficiently, the class layout matters alot. Changing pack size could add additional overhead, meaning it's better to order the variables manually for minimal memory footprint.

Update: removed all references to structs. Structs are returned by value while classes are returned by reference. Since we want references, classes, not structs should be used.

Link to comment
Share on other sites

20 minutes ago, Nightinggale said:

The problem is that each array is allocated at a random location in memory meaning they are optimized for looping through all cells, but they are bad when getting data for the same cell from multiple arrays.

It's fast. We are talking about BaseAddr + Offset levels of access. Sure you are going to end up getting cache misses, but if you aren't accessing all the data in a given cell, then packing them in structures (larger data that needs to be pre-fetched) you might end up loosing performance due to walking over the cache-line, no?

Link to comment
Share on other sites

1 minute ago, Moonkis said:

packing them in structures (larger data that needs to be pre-fetched) you might end up loosing performance due to walking over the cache-line, no?

It's possible. Which approach is the fastest will entirely depend on how the calls are made from the most performance intensive part of the code, which I will assume to be TheSIM. Since it's the part fo the code we don't have access to, we can't tell for sure. Trying different layouts of this and benchmarking each would be the correct way of figuring out which approach is the fastest. There is enough difference in performance from cache misses vs avoiding cache misses that it would be worth the time to investigate, particularly if the code in question is viewed as working too slow.

Link to comment
Share on other sites

I agree that both solutions have advantages and disadvantages. Therfore, which is best can only be determined by actual benchmark testing.

For example, when performing temperature calculations on the whole map, it may be best to store all temperature related information (temperature, thermal capacity, thermal conductivity) seperately from all other cell data, so that the processors' caches are not clogged with data that is irrelevant for these calculations.

Storing the data in seperate arrays also has the advantage that it makes SIMD instructions (e.g. Intel SSE and AVX) more efficient, since it is not necessary to gather the individual data out of structs before performing the actual calculations.

Link to comment
Share on other sites

I just came across something, which might be a very valid reason for the current memory layout. TheSIM.dll makes use of those arrays (presumably) and I just learned that this DLL file is written in C++. Maybe the rather simple data structure is used due to the C#/C++ data bridge.

Link to comment
Share on other sites

Archived

This topic is now archived and is closed to further replies.

Please be aware that the content of this thread may be outdated and no longer applicable.

×
  • Create New...