[GTALUG] Running Dell branded Nvidia gtx 1060 in non-dell system

Mon Jul 22 13:09:28 EDT 2019

Hey Hugh,

Thank you for your reply, see my comments below.

On 2019-07-22 9:30 a.m., D. Hugh Redelmeier via talk wrote:
> | From: Alex Volkov via talk <talk at gtalug.org>
>
> | I'm looking to buy used Nvidia GeForce GTX 1060 to run some ML
> | tutorials.
>
> The advantage of nvidia over AMD is wider support.  CUDA is nvidia-only
> (but AMD's ROCm is intended to be easy to port to from CUDA).

I have another system with a Ryzen 5 2400G and was hoping to run ROCm on
it, but as it turns out, ROCm doesn't fully support AMD chips with
built-in graphics. I can still install a discrete card in that system,
but that solution is not as cheap as getting a used GTX off craigslist.

> The disadvantage is that nvidia's stuff is closed source.  Yuck.
>
> nvidia also has terrible licensing terms that can force you to buy
> more expensive cards.  You probably won't be hit by this:
> <https://www.theregister.co.uk/2018/01/03/nvidia_server_gpus/>

Yes. I haven't yet figured out how to fix screen tearing with 
proprietary Nvidia drivers.

As for ML, I had to register on their website to download the cuDNN
packages required for tensorflow. The packages are for ubuntu, but they
seem to work on debian.

Nvidia asks some pretty invasive questions about how the card is going to
be used (which I don't yet know), so randomly checking off boxes and
giving them one of the email addresses where I dump all of my
subscriptions helps.

> In general, nvidia does more "price discrimination".  But AMD is not
> immune: AMD sells "workstation" cards for extra money.

I like AMD's approach more: they just publish a tensorflow-rocm binary on
PyPI -- https://pypi.org/project/tensorflow-rocm/

Nvidia, however, makes you jump through some hoops -- 
https://developer.nvidia.com/cudnn

> For raw computing power per dollar, my impression is that AMD can be a
> better deal.

That seems to start working out around the $500 price point, but I want
something to toy with, so I can't justify that expense right now. A used
GTX 1060 gives enough performance to try things out; if I grow out
of it, I'll switch to cards that have good ROCm support.

> | I got a good deal on Dell OEM one. Are there any pitfalls in running one
> | in a non-dell system?  More specifically nothing even close to that i.e
> | amd fx CPU and AMD 970 chipset.
>
> There are no problems that I know of.  I used a Dell OEM nvidia card
> many years ago without issue.
>
> Some OEM cards are a little crippled.  My 5-year-old desktop came with
> an OEM AMD card.  The specs said "1920x1200 max resolution" but also
> said "Dual Link DVI" (which is only needed for higher resolutions).
> So I assumed that it could do 2560x1600 like the non-OEM versions.  It
> could not. (My best guess is that they cheaped-out on a TMDS chip and
> did not, in fact, support dual-link, but I had no way to test.)
> So check out the specs.

Turns out there are a lot of GTX 1060 cards listed at $200 - $250 on
kijiji, but the price they actually sell for is much lower. I
offered $160 to 3 sellers and got replies from two: one was the Dell
card, which I was unsure about, the other was a non-OEM MSI with dual
fans. I went with the MSI.

More fans more better.

> Bonus hint: before buying the card, make sure it will fit in your
> system:
>
> - I had a problem with an RX 570 being too long for my computer's
>    motherboard
>
> - many cards now require extra power connectors that your power supply
>    might not support.  And the number of pins on those connectors
>    changed in recent years.
>
> - you may need a power supply with more capacity.
>
> - with more power comes more heat -- will your case handle that?
>    (Probably)

I'm really glad that back in 2013 I bought a decent mid-ATX Antec
(Sonata II?) case with a 500W power supply, enough room inside, and 2x
120mm fans. It has 2x 6-pin 12V connectors. After a minor AMDectomy,
where I removed the old Radeon that didn't really do anything besides
displaying things, I was able to install and run the card without any
hardware changes to the rest of the system.

I got to the point where I'm able to run some tutorials on tensorflow,
which I believe run on the CPU because there seems to be a software
bug somewhere related to LD_LIBRARY_PATH that I still haven't figured out.

2019-07-21 22:29:58.204355: I 
tensorflow/stream_executor/platform/default/dso_loader.cc:42] 
Successfully opened dynamic library libcuda.so.1
2019-07-21 22:29:58.367642: I 
tensorflow/core/common_runtime/gpu/gpu_device.cc:1640] Found device 0 
with properties:
name: GeForce GTX 1060 6GB major: 6 minor: 1 memoryClockRate(GHz): 1.7715
pciBusID: 0000:01:00.0
2019-07-21 22:29:58.367858: I 
tensorflow/stream_executor/platform/default/dso_loader.cc:53] Could not 
dlopen library 'libcudart.so.10.0'; dlerror: libcudart.so.10.0: cannot 
open shared object file: No such file or directory
2019-07-21 22:29:58.367982: I 
tensorflow/stream_executor/platform/default/dso_loader.cc:53] Could not 
dlopen library 'libcublas.so.10.0'; dlerror: libcublas.so.10.0: cannot 
open shared object file: No such file or directory
2019-07-21 22:29:58.368112: I 
tensorflow/stream_executor/platform/default/dso_loader.cc:53] Could not 
dlopen library 'libcufft.so.10.0'; dlerror: libcufft.so.10.0: cannot 
open shared object file: No such file or directory
2019-07-21 22:29:58.368234: I 
tensorflow/stream_executor/platform/default/dso_loader.cc:53] Could not 
dlopen library 'libcurand.so.10.0'; dlerror: libcurand.so.10.0: cannot 
open shared object file: No such file or directory
2019-07-21 22:29:58.368369: I 
tensorflow/stream_executor/platform/default/dso_loader.cc:53] Could not 
dlopen library 'libcusolver.so.10.0'; dlerror: libcusolver.so.10.0: 
cannot open shared object file: No such file or directory
2019-07-21 22:29:58.368498: I 
tensorflow/stream_executor/platform/default/dso_loader.cc:53] Could not 
dlopen library 'libcusparse.so.10.0'; dlerror: libcusparse.so.10.0: 
cannot open shared object file: No such file or directory
2019-07-21 22:29:58.374333: I 
tensorflow/stream_executor/platform/default/dso_loader.cc:42] 
Successfully opened dynamic library libcudnn.so.7
2019-07-21 22:29:58.374376: W 
tensorflow/core/common_runtime/gpu/gpu_device.cc:1663] Cannot dlopen 
some GPU libraries. Skipping registering GPU devices...
2019-07-21 22:29:58.374999: I 
tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports 
instructions that this TensorFlow binary was not compiled to use: FMA
2019-07-21 22:29:58.405862: I 
tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 
3214800000 Hz
2019-07-21 22:29:58.407424: I 
tensorflow/compiler/xla/service/service.cc:168] XLA service 
0x55a967b9ab00 executing computations on platform Host. Devices:
2019-07-21 22:29:58.407486: I 
tensorflow/compiler/xla/service/service.cc:175] StreamExecutor device 
(0): <undefined>, <undefined>
2019-07-21 22:29:58.563751: I 
tensorflow/compiler/xla/service/service.cc:168] XLA service 
0x55a967b97fa0 executing computations on platform CUDA. Devices:
2019-07-21 22:29:58.563835: I 
tensorflow/compiler/xla/service/service.cc:175] StreamExecutor device 
(0): GeForce GTX 1060 6GB, Compute Capability 6.1
2019-07-21 22:29:58.564029: I 
tensorflow/core/common_runtime/gpu/gpu_device.cc:1181] Device 
interconnect StreamExecutor with strength 1 edge matrix:
2019-07-21 22:29:58.564055: I 
tensorflow/core/common_runtime/gpu/gpu_device.cc:1187]
2019-07-21 22:30:00.791745: W 
tensorflow/compiler/jit/mark_for_compilation_pass.cc:1412] (One-time 
warning): Not using XLA:CPU for cluster because envvar 
TF_XLA_FLAGS=--tf_xla_cpu_global_jit was not set. If you want XLA:CPU, 
either set that envvar, or use experimental_jit_scope to enable XLA:CPU. 
To confirm that XLA is active, pass --vmodule=xla_compilation_cache=1 
(as a proper command-line flag, not via TF_XLA_FLAGS) or set the envvar 
XLA_FLAGS=--xla_hlo_profile.
W0721 22:30:01.203895 140325207855424 deprecation_wrapper.py:119] From 
classify_image.py:85: The name tf.gfile.GFile is deprecated. Please use 
tf.io.gfile.GFile instead.

giant panda, panda, panda bear, coon bear, Ailuropoda melanoleuca (score 
= 0.89107)
indri, indris, Indri indri, Indri brevicaudatus (score = 0.00779)
lesser panda, red panda, panda, bear cat, cat bear, Ailurus fulgens 
(score = 0.00296)
custard apple (score = 0.00147)
earthstar (score = 0.00117)
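
The "Could not dlopen library" lines in the log above name six CUDA 10.0
libraries that the loader failed to find, after which TensorFlow skipped
registering the GPU. Below is a minimal sketch for narrowing this down,
assuming a Linux system and Python 3. It asks the dynamic loader for each
library by name and reports whether any directory on LD_LIBRARY_PATH
contains the file. The helper names find_unloadable and find_on_path are
made up for this example.

```python
import ctypes
import os

# Library names copied from the "Could not dlopen library" lines in the log.
CUDA_LIBS = [
    "libcudart.so.10.0",
    "libcublas.so.10.0",
    "libcufft.so.10.0",
    "libcurand.so.10.0",
    "libcusolver.so.10.0",
    "libcusparse.so.10.0",
]


def find_unloadable(names):
    """Return the names that the dynamic loader cannot open."""
    missing = []
    for name in names:
        try:
            ctypes.CDLL(name)  # uses dlopen() on Linux
        except OSError:
            missing.append(name)
    return missing


def find_on_path(name, search_path):
    """Return the directories in a path-separator-delimited string that contain `name`."""
    dirs = [d for d in search_path.split(os.pathsep) if d]
    return [d for d in dirs if os.path.exists(os.path.join(d, name))]


if __name__ == "__main__":
    ld_path = os.environ.get("LD_LIBRARY_PATH", "")
    print("LD_LIBRARY_PATH =", ld_path or "(not set)")
    for lib in find_unloadable(CUDA_LIBS):
        hits = find_on_path(lib, ld_path)
        print(lib, "-> not loadable; found in:", hits or "no LD_LIBRARY_PATH directory")
```

If the same libraries are reported as not loadable here, one possibility
is that a CUDA release other than 10.0 is installed, or that its library
directory is not on the loader's search path. That is a guess from the
log, not something verified on this system.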

Alex.