Last updated: Sat Mar 13 05:04:37 CET 2010

How big is the design space?

This table lists the size of the design space (number of points, including non-valid ones) per cache mechanism and cache level.

Cache Mechanism                               L1 cache  L2 cache  L3 cache  L4 cache
Blocking Cache [3]                                  60        36        36        18
Non-blocking Cache [10]                            540       324       324       162
Victim Cache [13]                                 8640      3888      3888      1296
Timekeeping Victim Cache [11]                   155520     69984     69984     23328
Stride Prefetcher [8]                             4860      2916      2916      1458
Content-Directed Prefetcher [5]                   4860      2916      2916      1458
Stride + Content-Directed Prefetcher [12]        43740     26244     26244     13122
Tag Prefetcher [7]                                2160      1728      1728       864
Global History Prefetcher [9]                    17280     10368     10368      5184
Blocking Skewed Associative Cache [21]             240       144        72        54
Delta Correlating Prediction Tables [26]        207360    124416    124416     62208
[28]                                            207360    124416    124416     62208
Test1700 [30]                                   207360    124416    124416     62208
DemoTuesday [31]                                207360    124416    124416     62208

Design space                   Points          Order of magnitude
Parametric                     2799360         ≈ 10^6
Structural                     5533090560      ≈ 10^10
Hierarchical                   6.213557e+26    ≈ 10^27
Compiler flags                 2723            ≈ 10^3
Joint Compiler/Architecture    1.691951e+30    ≈ 10^30
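
The summary figures can be cross-checked with a few lines of C++. This is only a sketch: the function names are hypothetical, and the assumption that the joint space is the product of the hierarchical space and the compiler-flag space is inferred from the numbers (6.213557e+26 x 2723 is about 1.6919e+30), not stated by the framework.

```cpp
#include <cassert>
#include <cmath>

// Order of magnitude as used in the table: the nearest power of ten.
int order_of_magnitude(double points) {
    assert(points > 0.0);  // log10 is only defined for positive sizes
    return static_cast<int>(std::lround(std::log10(points)));
}

// Assumption (inferred from the figures): the joint compiler/architecture
// space is the product of the two independent design spaces.
double joint_space(double hierarchical, double compiler_flags) {
    return hierarchical * compiler_flags;
}
```

For example, order_of_magnitude(2799360.0) gives 6 and order_of_magnitude(5533090560.0) gives 10, matching the parametric and structural rows.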

How to make my cache mechanism compatible with the framework?

The easiest way is to start from an existing cache module, e.g. the write-back non-blocking cache, and integrate your new features into it. Nathanaël Prémillieu successfully used this technique to implement a skewed data cache proposed by André Seznec.

How to register a parameter?

Use the construct 'parameters.add()' in your NewModule.sim.
    // Registering nCPUtoCacheDataPathSize
    parameters.add("nCPUtoCacheDataPathSize",nCPUtoCacheDataPathSize);
    // Registering nAssociativity with range specified
    parameters.add("nAssociativity",nAssociativity,2,4);
The three basic cache parameters have fixed names: nCacheLines, nAssociativity, and nLineSize. You may not change them. All other parameters can be named freely.

How to replace a cache?

Assume you want the Stride Prefetcher instead of the default non-blocking cache. In dse.uni.cxx, replace
    #include "CacheWBNB.sim"
with
    #include "CacheWBNBSP.sim"
and change the MyL1Cache instantiation to match the class signature of CacheWBNBSP.sim:
    typedef MicrolibCacheWrapperWBNBSP<Instruction,
        __CacheWBNBSP_nCPUtoCacheDataPathSize,__CacheWBNBSP_nCachetoCPUDataPathSize,
        __CacheWBNBSP_nMemtoCacheDataPathSize,__CacheWBNBSP_nCachetoMemDataPathSize,
        __CacheWBNBSP_nLineSize,__CacheWBNBSP_nCacheLines,__CacheWBNBSP_nAssociativity,
        __CacheWBNBSP_nStages,__CacheWBNBSP_nDelay,1,__CacheWBNBSP_nMSHR,
        __CacheWBNBSP_nMSHRRead,__CacheWBNBSP_nSPEntries,__CacheWBNBSP_nSPPCShift,0> MyL1Cache;
Then recompile by typing 'make'.

How to create my NewModule.default.h?

For every registered parameter, you need to specify a default value.
If your NewModule.sim contains
    parameters.add("nCPUtoCacheDataPathSize",nCPUtoCacheDataPathSize);
then your NewModule.default.h should include
    #ifndef __NewModule_nCPUtoCacheDataPathSize
    #define __NewModule_nCPUtoCacheDataPathSize 8
    #endif
If your NewModule.sim contains
    parameters.add("nAssociativity",nAssociativity,2,4);
then your NewModule.default.h should include
    #ifndef __NewModule_nAssociativity
    #define __NewModule_nAssociativity 2
    #endif
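
The #ifndef guards mean that a value defined earlier (for instance on the compiler command line with -D) takes precedence over the default. The sketch below illustrates this preprocessor behaviour only; that the framework overrides defaults in exactly this way is an assumption, and the kDataPath and kAssociativity constants are hypothetical names used for illustration.

```cpp
// Simulates an external override, as -D__NewModule_nAssociativity=4 would do.
// (The double-underscore names follow the framework's convention.)
#define __NewModule_nAssociativity 4

// Excerpt in the style of NewModule.default.h
#ifndef __NewModule_nCPUtoCacheDataPathSize
#define __NewModule_nCPUtoCacheDataPathSize 8
#endif
#ifndef __NewModule_nAssociativity
#define __NewModule_nAssociativity 2
#endif

// No override was given for the data path size, so the default (8) applies.
constexpr int kDataPath = __NewModule_nCPUtoCacheDataPathSize;
// The earlier definition wins, so the default (2) is skipped.
constexpr int kAssociativity = __NewModule_nAssociativity;

static_assert(kDataPath == 8, "default value is used when nothing overrides it");
static_assert(kAssociativity == 4, "an earlier definition overrides the default");
```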

How to create my NewModule.range?

For every registered parameter, you need a single line in your NewModule.range that either enumerates the values to be explored or specifies that the value depends on settings outside your module.
Use -1 to specify that the DataPath-like parameters are set from outside NewModule.h:
    nCPUtoCacheDataPathSize -1
    nCachetoCPUDataPathSize -1
    nMemtoCacheDataPathSize -1
    nCachetoMemDataPathSize -1
Enumerate the values to be explored:
    nLineSize 32 64 128
    nCacheLines 256 1024 4096 16384 65536
    nStages 1
    nDelay 1
    Snooping false
You can allow direct-mapped caches by using 1 and fully associative caches by using 999:
    nAssociativity 1 2 4 999
You can specify separate ranges for the different cache levels by using a keyword to separate the levels (see the sample in NewModule.range):
    MODULE_AS_L2
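
To get a feel for how a range file translates into a number of design points, the following sketch multiplies the number of values listed per parameter. It is a hypothetical helper, not part of the framework, and it assumes that the design space is the plain Cartesian product of the enumerated values, that a -1 entry contributes a single value, and that keyword-only lines such as MODULE_AS_L2 can be skipped.

```cpp
#include <cassert>
#include <limits>
#include <sstream>
#include <string>

// Hypothetical helper: counts design points in range-file text as the
// product of the number of values listed on each parameter line.
unsigned long long count_design_points(const std::string& range_text) {
    std::istringstream lines(range_text);
    std::string line;
    unsigned long long points = 1;
    while (std::getline(lines, line)) {
        std::istringstream tokens(line);
        std::string name, value;
        if (!(tokens >> name)) continue;   // blank line
        unsigned long long values = 0;
        while (tokens >> value) ++values;  // each token after the name is one value
        if (values == 0) continue;         // keyword-only line, e.g. MODULE_AS_L2
        // Guard against overflow before multiplying.
        assert(points <= std::numeric_limits<unsigned long long>::max() / values);
        points *= values;                  // "-1" counts as one externally set value
    }
    return points;
}
```

Applied to the sample lines above, this gives 3 x 5 x 4 = 60 points, which equals the L1 figure listed for the Blocking Cache; the source does not state that the sample belongs to that module.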

How can I verify area and latency of my configuration?

For latency and area estimation we rely on Cacti. Every run generates two files, 'areachecks' and 'latencychecks', containing the calls to Cacti that are used to determine the latency and area estimates. If you install Cacti yourself, you can see the estimates.
The option '--checking' stops the simulation after the check files have been generated.
    ./dse --checking ../benchmarks/powerpc/hello

How can latency and area be set properly?

For latency and area estimation we rely on Cacti. To do so, we ask you to implement a 'get_area()' and a 'get_latency()' function that simply generate the external Cacti calls to estimate latency and area for your cache. For area, you may generate multiple AreaEstimator calls if your cache consists of several blocks. Latency and area will be set automatically to the values reported by Cacti.
Sample for estimating the area:
    fprintf(fp, "AreaEstimator %d %d %d 1 0 0 0 1 %f %d 0 0 0 0 %d\n",
            (nAssociativity==nCacheLines ? nLineSize*nAssociativity :
                                           nLineSize*nAssociativity*nCacheLines),
            nLineSize,
            (nAssociativity==nCacheLines ? 0 : nAssociativity),
            TECHNOLOGY,
            nCachetoCPUDataPathSize, extra_tag_bits);
Use LatencyEstimator instead of AreaEstimator in 'get_latency()'.
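
The area sample can be turned into a small self-contained sketch that builds both call lines from one helper. Everything beyond the format string and the parameter names is an assumption made for illustration: the CacheConfig struct, the function names, and passing the technology value as a field instead of the TECHNOLOGY macro are all hypothetical.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical bundle of the values used by the sample above.
struct CacheConfig {
    int nLineSize;
    int nCacheLines;
    int nAssociativity;
    int nCachetoCPUDataPathSize;
    int extra_tag_bits;
    double technology;  // stands in for the TECHNOLOGY macro
};

// Builds one estimator call line in the format shown in the sample.
std::string estimator_call(const char* estimator, const CacheConfig& c) {
    assert(c.nLineSize >= 1 && c.nCacheLines >= 1 && c.nAssociativity >= 1);
    // As in the sample, nAssociativity == nCacheLines marks a fully
    // associative cache, which is passed to Cacti as associativity 0.
    const bool fully = (c.nAssociativity == c.nCacheLines);
    const int size = fully ? c.nLineSize * c.nAssociativity
                           : c.nLineSize * c.nAssociativity * c.nCacheLines;
    const int assoc = fully ? 0 : c.nAssociativity;
    char buf[256];
    std::snprintf(buf, sizeof buf,
                  "%s %d %d %d 1 0 0 0 1 %f %d 0 0 0 0 %d\n",
                  estimator, size, c.nLineSize, assoc, c.technology,
                  c.nCachetoCPUDataPathSize, c.extra_tag_bits);
    return std::string(buf);
}

std::string area_call(const CacheConfig& c) {
    return estimator_call("AreaEstimator", c);
}

std::string latency_call(const CacheConfig& c) {
    return estimator_call("LatencyEstimator", c);
}
```

In a real module, 'get_area()' and 'get_latency()' would write these lines to the file handle they are given, as the fprintf sample does.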

Why is nAssociativity limited to 16?

The current version of Cacti does not provide latency, area, or power estimates for associativities higher than 16, except for fully associative caches. This limitation therefore only applies if you rely on Cacti for latency, area, or power estimates.
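
The rule can be expressed as a small predicate, for example to filter configurations before generating Cacti calls. This is a sketch with a hypothetical function name; it assumes, as the area sample does, that a cache is fully associative when nAssociativity equals nCacheLines.

```cpp
#include <cassert>

// Hypothetical check: can Cacti estimate this configuration?
bool cacti_can_estimate(int nAssociativity, int nCacheLines) {
    assert(nAssociativity >= 1 && nCacheLines >= 1);
    // Assumption taken from the area sample: equal values mean fully associative.
    const bool fully_associative = (nAssociativity == nCacheLines);
    return fully_associative || nAssociativity <= 16;
}
```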

I have evaluated millions of design points myself. Can I add my results?

No. The reason is that we cannot verify your results without re-simulating. However, you can submit your module with your best configuration as its default settings and see how it ranks.