[GTALUG] Running Dell branded Nvidia gtx 1060 in non-dell system

D. Hugh Redelmeier hugh at mimosa.com
Mon Aug 12 21:01:42 EDT 2019


| From: xerofoify via talk <talk at gtalug.org>
| 
| On Mon, Aug 12, 2019 at 6:11 PM D. Hugh Redelmeier via talk
| <talk at gtalug.org> wrote:
| > <https://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-software-developer-vol-1-manual.pdf>
| >
| > "Intel(TM) 64 and IA-32 Architectures Software Developer's Manual"
| > Volume 1 of 9.
| >
| > I don't see PCIe mentioned there.  Nor would I expect it.  There is
| > mention of PCI in an example of using the MOVNTDQA instruction.
| >
| It was odd before but instructions can touch or swap with PCI so that's why. PCI
| is not like USB or other protocols it requires overhead on the CPU side if that
| makes sense including lanes/instructions to a lesser degree. It may not be
| mentioned for assembly manuals directly but in other hardware documentation
| very likely.

The architecture seen by a program is often separated from bus issues.
PCI has historically been addressed as part of the memory address
space (as opposed to the IO address space).

Once caches were introduced, software needed to be able to make sure
that it didn't cause misbehaviour in PCI bus operations.  When talking
to a device, you usually (but not always) wish the cache to be
bypassed.

Historically on x86 (post i486), you did that using the MTRRs (Memory
Type Range Registers).  I'm sure that has changed by now, since there
were too few of those.  But the ideas are there.  See 18.3.1
"Memory-Mapped I/O".

If you look at 12.10.3 "Streaming Load Hint Instruction", you will see
a discussion of this issue and the MOVNTDQA instruction.  That's the
context of the example referencing PCI.  There is no need for the PCIe
version to bleed into the abstract x86 architecture.

BTW "WC" means "Write Combining".  Memory so-designated (e.g. by an
MTRR) is uncached but the processor may combine writes.  This, for
example, is often used for accessing graphics card buffers.  Without
write combining, many more separate bus writes would be required.
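
To make the saving concrete, here is a toy model in Python.  It is
only a sketch under stated assumptions: the 64-byte buffer size, the
function name count_bus_writes and the "frame buffer" fill pattern are
all made up for illustration, and real processors have several
write-combining buffers with more complicated flush rules.

```python
LINE_SIZE = 64  # assumed write-combining buffer size in bytes


def count_bus_writes(stores, combine):
    """Count bus transactions for a sequence of (address, size) stores.

    Toy model: without combining, every store is its own transaction.
    With combining, consecutive stores that land in the same
    LINE_SIZE-aligned block are merged into one transaction.
    Stores are assumed not to straddle a block boundary.
    """
    if not combine:
        return len(stores)
    transactions = 0
    current_block = None
    for addr, _size in stores:
        block = addr // LINE_SIZE
        if block != current_block:
            # A new block means the previous buffer is flushed
            # and a new one is started.
            transactions += 1
            current_block = block
    return transactions


# Fill a 4 KiB region of a hypothetical frame buffer with 4-byte stores.
stores = [(addr, 4) for addr in range(0, 4096, 4)]
print(count_bus_writes(stores, combine=False))  # 1024
print(count_bus_writes(stores, combine=True))   # 64
```

In this model, sequential 4-byte stores go out as 64 transactions
instead of 1024.  The ratio depends entirely on the assumed buffer
size and on the stores being sequential.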

Interestingly, on the machine I'm using to compose this email,
/proc/mtrr shows 7 registers with write-back and one uncachable.  None
is write combining.
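
If you want to do the same tally on your own machine, something like
the following Python sketch should work.  It assumes the usual Linux
/proc/mtrr line format, where the memory type is the text after the
last ": " on each "regNN:" line.  The function name count_mtrr_types
and the sample text are made up for illustration; the sample is not
the output from my machine.

```python
from collections import Counter


def count_mtrr_types(text):
    """Tally memory types from the contents of /proc/mtrr.

    Assumes lines of the form
      reg00: base=0x000000000 (    0MB), size= 2048MB, count=1: write-back
    and takes the type to be whatever follows the last ": ".
    """
    counts = Counter()
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("reg"):
            continue  # skip blank or unexpected lines
        counts[line.rsplit(": ", 1)[-1]] += 1
    return counts


# Made-up sample in the assumed format.
SAMPLE = """\
reg00: base=0x000000000 (    0MB), size= 2048MB, count=1: write-back
reg01: base=0x080000000 ( 2048MB), size= 1024MB, count=1: write-back
reg02: base=0x0c0000000 ( 3072MB), size=  256MB, count=1: uncachable
"""
print(dict(count_mtrr_types(SAMPLE)))

# On a real Linux box, read the live file if it exists and is readable.
try:
    with open("/proc/mtrr") as f:
        print(dict(count_mtrr_types(f.read())))
except OSError:
    pass  # no MTRR interface on this system
```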

