Getting system parameters in order to improve data structures

Dear programmer, there are many situations in which you need very efficient data structures to get good performance. An important characteristic of a data structure is its granularity. How big should the data structure be? What is the optimal size of its elements?

Of course, there is no recipe or magic formula that gives you the answer immediately. But since we run our programs on computers, if there were such a formula, it would take many system characteristics as its input. By system characteristics we mean the size of main memory, the number and topology of cores, the number of cache levels, the size of each level, and their sharing mode (which cache levels are shared between cores and which are not).

When you know these parameters, you can plug them into the code that decides the granularity of your data structures.
Here's how to get some of them on Linux, using the library function long int sysconf(int name), declared in unistd.h:

#include <unistd.h>

// gets the size of the level 1 data cache, in bytes
long l1d_size = sysconf(_SC_LEVEL1_DCACHE_SIZE);
// gets the size of a line of the level 1 data cache, in bytes
long l1d_line = sysconf(_SC_LEVEL1_DCACHE_LINESIZE);
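_SC_LEVEL1_DCACHE_SIZE, _SC_LEVEL1_DCACHE_ASSOC, _SC_LEVEL1_DCACHE_LINESIZE and many others are defined in bits/confname.h, which unistd.h includes for you. These cache constants are glibc extensions rather than part of POSIX, and sysconf may return 0 or -1 when a value is not available on the machine.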
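
Since sysconf may not report a cache parameter on every machine, it can help to wrap the call. The following is only a minimal sketch, assuming glibc on Linux. The helper names sysconf_or_default and print_l1d_info are made up for this example, and the fallback values (32 KiB, 64-byte lines, 8 ways) are guesses, not values read from any system.

```c
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper (not part of any library): query a sysconf value
 * and return a caller-supplied default when the system does not report
 * a positive value for it. */
long sysconf_or_default(int name, long fallback)
{
    assert(fallback > 0);
    long value = sysconf(name);
    return value > 0 ? value : fallback;
}

/* Print the level 1 data cache geometry and the number of online cores.
 * The defaults are assumptions used only when sysconf has no answer. */
void print_l1d_info(void)
{
    long size  = sysconf_or_default(_SC_LEVEL1_DCACHE_SIZE, 32 * 1024);
    long line  = sysconf_or_default(_SC_LEVEL1_DCACHE_LINESIZE, 64);
    long assoc = sysconf_or_default(_SC_LEVEL1_DCACHE_ASSOC, 8);
    long cores = sysconf_or_default(_SC_NPROCESSORS_ONLN, 1);

    printf("L1d: %ld bytes, %ld-byte lines, %ld-way, %ld online cores\n",
           size, line, assoc, cores);
}
```

Calling print_l1d_info() from main prints one summary line. Whether a silent fallback is acceptable depends on the program; some may prefer to detect the missing value and choose a different strategy.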

To get information about the amount of system memory, you can use
int sysinfo(struct sysinfo *info), declared in sys/sysinfo.h.
The sysinfo structure it fills in contains fields such as totalram, freeram and totalswap; these memory sizes are expressed in units of mem_unit bytes.
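
As an illustration, here is a small sketch that reads the total and free RAM and converts them to bytes. It assumes Linux with glibc. The helper names sysinfo_bytes and ram_bytes are invented for this example; only sysinfo and the structure fields come from the system headers.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/sysinfo.h>

/* Hypothetical helper: convert one of the memory counters of a
 * struct sysinfo to bytes by multiplying it by mem_unit. */
unsigned long long sysinfo_bytes(const struct sysinfo *si, unsigned long field)
{
    assert(si != NULL);
    return (unsigned long long)field * si->mem_unit;
}

/* Store the total and free RAM, in bytes, in *total and *free_ram.
 * Returns 0 on success and -1 if the sysinfo call fails. */
int ram_bytes(unsigned long long *total, unsigned long long *free_ram)
{
    struct sysinfo si;

    if (sysinfo(&si) != 0)
        return -1;

    *total    = sysinfo_bytes(&si, si.totalram);
    *free_ram = sysinfo_bytes(&si, si.freeram);
    return 0;
}
```

The multiplication is done in unsigned long long so that the result does not overflow on systems where unsigned long is 32 bits wide.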

Now, when you split data into chunks, you can take these system parameters into account. For example, you may build chunks no larger than the level 1 data cache and pass them to tasks, one at a time. While the algorithm works inside the task, most memory accesses will be fast, because the data is in the cache. The closest cache, the non-shared one :)
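
The chunking idea above could be sketched like this. It is one possible approach, not a definitive implementation: the names chunk_fn, l1d_chunk_elems, for_each_chunk and sum_chunk are hypothetical, the 32 KiB fallback is an assumption, and the "tasks" are plain function calls here rather than work handed to a threading library.

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical task signature: process `count` doubles starting at `chunk`. */
typedef void (*chunk_fn)(double *chunk, size_t count, void *ctx);

/* Number of elements of `elem_size` bytes that fit in the L1 data cache.
 * Falls back to an assumed 32 KiB when sysconf reports nothing useful. */
size_t l1d_chunk_elems(size_t elem_size)
{
    assert(elem_size > 0);
    long cache = sysconf(_SC_LEVEL1_DCACHE_SIZE);
    if (cache <= 0)
        cache = 32 * 1024;              /* assumption, not a measured value */
    size_t n = (size_t)cache / elem_size;
    return n > 0 ? n : 1;
}

/* Walk `data` in chunks of at most `chunk_elems` elements and hand each
 * chunk to `task`, one at a time. Returns the number of chunks processed. */
size_t for_each_chunk(double *data, size_t count, size_t chunk_elems,
                      chunk_fn task, void *ctx)
{
    assert(chunk_elems > 0);
    size_t chunks = 0;
    for (size_t off = 0; off < count; off += chunk_elems) {
        size_t left = count - off;
        size_t n = left < chunk_elems ? left : chunk_elems;
        task(data + off, n, ctx);
        chunks++;
    }
    return chunks;
}

/* Example task: add the chunk's elements to the double that ctx points to. */
void sum_chunk(double *chunk, size_t count, void *ctx)
{
    double *total = ctx;
    for (size_t i = 0; i < count; i++)
        *total += chunk[i];
}
```

A call such as for_each_chunk(data, n, l1d_chunk_elems(sizeof(double)), sum_chunk, &total) would then process the array one cache-sized chunk at a time. In practice you may want chunks somewhat smaller than the whole cache, since the stack and other data compete for the same lines.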