Showing posts with label cache. Show all posts

Wednesday, December 3, 2014

How Does a Cache Work

1. instruction cache
2. data cache: L1, L2, L3
3. TLB

Data cache:
http://en.wikipedia.org/wiki/CPU_cache

Example:
The original Pentium 4 processor had a four-way set associative L1 data cache of 8 KB in size, with 64-byte cache blocks. Hence, there are 8 KB / 64 = 128 cache blocks. The number of sets is equal to the number of cache blocks divided by the number of ways of associativity, which leads to 128 / 4 = 32 sets, and hence 2^5 = 32 different indices. There are 2^6 = 64 possible offsets. Since the CPU address is 32 bits wide, this implies 21 + 5 + 6 = 32, and hence 21 bits for the tag field.
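The original Pentium 4 processor also had an eight-way set associative L2 integrated cache 256 KB in size, with 128-byte cache blocks. Hence, there are 256 KB / 128 = 2048 cache blocks and 2048 / 8 = 256 sets, so 2^8 = 256 different indices and 2^7 = 128 possible offsets. This implies 17 + 8 + 7 = 32, and hence 17 bits for the tag field.[3]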
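
The arithmetic in the two examples above can be sketched in C. This is only an illustration: the names cache_fields, log2_exact and cache_split are made up for this post, and the code assumes every size is a power of two.

```c
#include <assert.h>
#include <stdint.h>

/* Bit widths of the three fields a set-associative cache splits an address into. */
struct cache_fields {
    unsigned tag_bits;
    unsigned index_bits;
    unsigned offset_bits;
};

/* log2 of a power of two (assumption: all sizes passed in are powers of two). */
static unsigned log2_exact(uint64_t x)
{
    unsigned n = 0;
    assert(x != 0 && (x & (x - 1)) == 0);
    while (x > 1) {
        x >>= 1;
        n++;
    }
    return n;
}

/* Hypothetical helper: derive the field widths from the cache geometry. */
static struct cache_fields cache_split(uint64_t cache_bytes, unsigned ways,
                                       unsigned block_bytes, unsigned addr_bits)
{
    struct cache_fields f;
    uint64_t blocks = cache_bytes / block_bytes; /* e.g. 8 KB / 64 = 128 */
    uint64_t sets = blocks / ways;               /* e.g. 128 / 4 = 32    */

    f.offset_bits = log2_exact(block_bytes);
    f.index_bits = log2_exact(sets);
    f.tag_bits = addr_bits - f.index_bits - f.offset_bits;
    return f;
}
```

Calling cache_split(8 * 1024, 4, 64, 32) gives 21 tag bits, 5 index bits and 6 offset bits, and cache_split(256 * 1024, 8, 128, 32) gives 17, 8 and 7, matching the numbers quoted above.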

Saturday, November 3, 2012

Flush the Instruction Cache


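// flush the cache; eip holds the address whose cache line should be flushed

// write back and invalidate all caches (wbinvd is privileged: ring 0 only)
__asm__ __volatile__("wbinvd" : : : "memory");

// flush only the cache line that contains the address in eip
__asm__ __volatile__(
"clflush (%0)"
: /* no outputs */
: "r"(eip) /* eip is an input operand, not an output */
: "memory");

wbinvd: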

Writes back all modified cache lines in the processor’s internal cache to main memory and invalidates (flushes) the internal caches. The instruction then issues a special-function bus cycle that directs external caches to also write back modified data and another bus cycle to indicate that the external caches should be invalidated.


clflush:

Invalidates the cache line that contains the linear address specified with the source operand from all levels of the processor cache hierarchy (data and instruction). The invalidation is broadcast throughout the cache coherence domain. If, at any level of the cache hierarchy, the line is inconsistent with memory (dirty), it is written to memory before invalidation.
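
Since clflush works on one cache line at a time, flushing a whole buffer means walking it line by line. Below is a rough sketch of that loop, not a tested recipe. The function name flush_range and the 64-byte line size are assumptions made for this post (the real line size should be read from CPUID), and the trailing mfence is there so the flushes are ordered before later memory accesses. On non-x86 targets the asm is compiled out and the function only counts lines.

```c
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE_BYTES 64 /* assumption: 64-byte lines, common on current x86 */

/* Hypothetical helper: flush every cache line that overlaps [addr, addr + len).
 * Returns the number of lines visited. */
static size_t flush_range(const void *addr, size_t len)
{
    /* round the start address down to a cache line boundary */
    uintptr_t p = (uintptr_t)addr & ~(uintptr_t)(CACHE_LINE_BYTES - 1);
    uintptr_t end = (uintptr_t)addr + len;
    size_t lines = 0;

    for (; p < end; p += CACHE_LINE_BYTES) {
#if defined(__x86_64__) || defined(__i386__)
        __asm__ __volatile__("clflush (%0)" : : "r"(p) : "memory");
#endif
        lines++;
    }
#if defined(__x86_64__) || defined(__i386__)
    __asm__ __volatile__("mfence" : : : "memory");
#endif
    return lines;
}
```

Unlike wbinvd, clflush can be executed from user mode, so this loop can be tried in an ordinary program.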


Please see the instruction set reference volume of the Intel manual for more details.