[GTALUG] Thinkpad T420 as a VM host

D. Hugh Redelmeier hugh at mimosa.com
Tue Jul 21 13:54:51 UTC 2015


| From: Lennart Sorensen <lsorense at csclub.uwaterloo.ca>

| On Mon, Jul 20, 2015 at 01:48:45PM -0400, Scott Allen wrote:
| > Probably not much slower, though. Perhaps just a few percent. This
| > test is rather old but the situation is somewhat similar:
| > <http://www.tomshardware.com/reviews/Intel-Core-i7-Nehalem,2057-13.html>

It's an interesting test.  But a bit superficial.

DDR memory modules are 64 bits wide (if I remember correctly).

So typically, the bottom 3 bits of an address select which byte in a
module is being used (this need not be the case).

The easy way to interleave memory access across a power-of-two number
of channels is to use the next low-order address bits as the channel
select.  That spreads the load between channels and exploits their
parallel operation quite well (because nearby references are common).
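
As a rough sketch of the bit-select scheme just described: the bottom
3 bits pick the byte, and the next bits pick the channel.  This is a toy
model under those assumptions only; the name decode() is made up for
illustration, and real memory controllers may interleave at a coarser
granularity (a cache line, say).

```python
BYTE_BITS = 3  # assumed 64-bit (8-byte) wide module: low 3 bits pick the byte


def decode(addr, channels):
    """Split an address into (channel, word, byte) by simple bit selection.

    Toy model: 'channels' must be a power of two so that the channel
    number can be taken directly from address bits.
    """
    if channels < 1 or channels & (channels - 1):
        raise ValueError("channels must be a power of two")
    channel_bits = channels.bit_length() - 1
    byte = addr & ((1 << BYTE_BITS) - 1)            # bottom 3 bits
    channel = (addr >> BYTE_BITS) & (channels - 1)  # next low-order bits
    word = addr >> (BYTE_BITS + channel_bits)       # remaining high bits
    return channel, word, byte
```

With two channels, consecutive 8-byte words alternate: addresses 0, 8,
16 and 24 land on channels 0, 1, 0 and 1, so a sequential scan keeps
both channels busy.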

When you have three channels (as in the baseline for this test), I
don't know where you put the third channel in the address space.
That's me being lazy: it is surely written up somewhere.  But I'd
guess that it isn't as useful as the power-of-two case, and the
figures seem to support that.

It would have been nice to see some synthetic benchmarks in that test.
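
By "synthetic" I mean something along these lines: a minimal sketch
that times a large buffer copy and reports apparent bandwidth.  The
function name, buffer size and repeat count are arbitrary choices of
mine, and a serious tool (run once per memory configuration) would
control for far more than this does.

```python
import time


def copy_bandwidth(size_bytes=256 * 1024 * 1024, repeats=5):
    """Return the best observed copy bandwidth in GB/s.

    The buffer should be much larger than the caches so that the copy
    actually goes to main memory.
    """
    src = bytearray(size_bytes)
    dst = bytearray(size_bytes)
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        dst[:] = src  # bulk copy of the whole buffer
        elapsed = time.perf_counter() - start
        best = min(best, elapsed)
    best = max(best, 1e-12)  # guard against a zero reading on a coarse timer
    # Each copy reads size_bytes and writes size_bytes.
    return 2 * size_bytes / best / 1e9
```

Calling print(copy_bandwidth()) with one module installed and again
with two would give a crude single- versus dual-channel comparison.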

| I remember going from 1 to 2 sticks of ram on a Core 2 Duo machine in
| the past made the intel graphics (using shared ram) a lot faster.
| So at least on some CPUs it has made a big difference.

Perhaps the much-improved memory system of the i series makes a
difference?  See section 5 of that article.

Maybe caches are so big that real memory access is infrequent for many
programs.

Perhaps page-mode access to RAM captures a lot of the possible
advantage of interleaving?

In any case, my intuition would have predicted a bigger difference.

As always, careful, informed, exploratory measurement trumps
intuition when it comes to performance.

