The answer depends on your CPU architecture.
That said, if you are using gcc or clang, you can use the __builtin_prefetch built-in function to try to generate a prefetch instruction. On Pentium 3 and later x86-type architectures, this will generate a PREFETCHh instruction, which requests a load into the data cache hierarchy. Because the L2 and higher-level caches on these architectures are unified (shared by code and data), it may help.
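The function looks like this, where rw (0 to prepare for a read, which is the default, or 1 to prepare for a write) and locality are both optional:
__builtin_prefetch(const void *address, int rw, int locality);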
The locality argument must be a compile-time constant in the range 0 to 3. Assuming locality maps directly to the h part of the PREFETCHh instruction, you want to pass 1 or 2, which ask for the data to be loaded into the L2 and higher caches. See the Intel® 64 and IA-32 Architectures Software Developer's Manual, Volume 2B: Instruction Set Reference, M-Z, page 4-277.
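As a rough illustration of the call, here is a minimal sketch for gcc or clang that prefetches array elements a fixed distance ahead of the one being read. The function name sum_with_prefetch and the PREFETCH_DISTANCE constant are made up for this example, and the distance of 16 is an untuned guess; whether the hint helps at all depends on the CPU and the access pattern.

```c
#include <stddef.h>

/* Hypothetical tuning knob: how many elements ahead to prefetch. */
#define PREFETCH_DISTANCE 16

/* Sums an array while hinting that elements further ahead will be read soon. */
long sum_with_prefetch(const long *values, size_t count)
{
    long total = 0;
    for (size_t i = 0; i < count; i++) {
        if (i + PREFETCH_DISTANCE < count) {
            /* rw = 0 (read), locality = 2; both must be compile-time constants. */
            __builtin_prefetch(&values[i + PREFETCH_DISTANCE], 0, 2);
        }
        total += values[i];
    }
    return total;
}
```

The prefetch is only a hint, so the result is the same with or without it; calling sum_with_prefetch on an array holding 1 through 8 returns 36 either way.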
If you're using another compiler that doesn't have __builtin_prefetch, see whether it has the _mm_prefetch function. You may need to include a header file to get that function. For example, on OS X, that function, along with constants for the locality argument, is declared in xmmintrin.h.
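A sketch of the _mm_prefetch route might look like the following. The PREFETCH_L2 macro, the count_nonzero_bytes function, and the 64-byte line size are assumptions made for this example; _MM_HINT_T1 is the xmmintrin.h constant that corresponds to PREFETCHT1. On non-x86 targets the macro falls back to a no-op so the code still compiles.

```c
#include <stddef.h>

#if defined(__x86_64__) || defined(__i386__) || defined(_M_X64) || defined(_M_IX86)
#include <xmmintrin.h>  /* declares _mm_prefetch and the _MM_HINT_* constants */
#define PREFETCH_L2(p) _mm_prefetch((const char *)(p), _MM_HINT_T1)
#else
#define PREFETCH_L2(p) ((void)(p))  /* no-op fallback on other targets */
#endif

/* Assumed cache line size; check your CPU's documentation. */
#define CACHE_LINE_BYTES 64

/* Counts nonzero bytes, hinting two cache lines ahead of the current one. */
size_t count_nonzero_bytes(const unsigned char *buffer, size_t length)
{
    size_t nonzero = 0;
    for (size_t i = 0; i < length; i++) {
        /* Issue one hint per line, only while the target is still in bounds. */
        if (i % CACHE_LINE_BYTES == 0 && i + 2 * CACHE_LINE_BYTES < length) {
            PREFETCH_L2(buffer + i + 2 * CACHE_LINE_BYTES);
        }
        if (buffer[i] != 0) {
            nonzero++;
        }
    }
    return nonzero;
}
```

Wrapping the intrinsic in a macro keeps the compiler-specific part in one place, so swapping in __builtin_prefetch or another hint constant only touches a single line.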