Nvidia Tesla K20c on Ubuntu 16.04


It took me two days to get an Nvidia Tesla K20c working on Ubuntu 16.04, so here is what finally worked.

My hardware and OS environment:

  • Nvidia Tesla K20c
  • Nvidia GT 520 (graphics card driving the monitor)
  • Intel i5 CPU
  • 16GB RAM
  • Ubuntu 16.04 on SSD
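Before installing anything, it can be worth confirming that both cards show up on the PCI bus. This is a minimal sketch, assuming `lspci` (from pciutils) is installed; the function name `filter_nvidia` is only for illustration and is not part of any tool:

```shell
# Keep only NVIDIA devices from lspci-style output read on stdin.
# filter_nvidia is a hypothetical helper name used for this example.
filter_nvidia() {
    grep -i 'nvidia' || true   # "no match" is not treated as a failure
}

# Usage on a real machine (needs pciutils):
#   lspci | filter_nvidia
```

With the hardware listed above, two lines would be expected: one for the Tesla and one for the GT 520.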

Initially, I installed CUDA and the official Nvidia driver using:

  • sudo apt-get install nvidia-cuda-toolkit
  • sudo ./NVIDIA-Linux-x86_64-331.89.run

But the driver installer then failed while building the kernel module. The full installer log is below.

“`
victor@ubuntu-tesla:~/Downloads$ cat /var/log/nvidia-installer.log
nvidia-installer log file ‘/var/log/nvidia-installer.log’
creation time: Sat Feb 4 11:15:23 2017
installer version: 331.89

PATH: /usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/snap/bin

nvidia-installer command line:
./nvidia-installer
--no-cc-version-check

Using: nvidia-installer ncurses user interface
-> License accepted.
-> Installing NVIDIA driver version 331.89.
-> There appears to already be a driver installed on your system (version: 331.89). As part of installing this driver (version: 331.89), the existing driver will be uninstalled. Are you sure you want to continue? (‘no’ will abort installation) (Answer: Yes)
-> Running distribution scripts
executing: ‘/usr/lib/nvidia/pre-install’…
-> done.
-> The distribution-provided pre-install script failed! Continue installation anyway? (Answer: Yes)
-> Would you like to register the kernel module sources with DKMS? This will allow DKMS to automatically build a new module, if you install a different kernel later. (Answer: No)
-> Performing CC sanity check with CC=”cc”.
-> Kernel source path: ‘/lib/modules/4.4.0-59-generic/build’
-> Kernel output path: ‘/lib/modules/4.4.0-59-generic/build’
-> Performing rivafb check.
-> Performing nvidiafb check.
-> Performing Xen check.
-> Performing PREEMPT_RT check.
-> Cleaning kernel module build directory.
executing: ‘cd ./kernel; make clean’…
-> Building NVIDIA kernel module:
executing: ‘cd ./kernel; make module SYSSRC=/lib/modules/4.4.0-59-generic/build SYSOUT=/lib/modules/4.4.0-59-generic/build NV_BUILD_MODULE_INSTANCES=’…
NVIDIA: calling KBUILD…
make[1]: Entering directory ‘/usr/src/linux-headers-4.4.0-59-generic’
test -e include/generated/autoconf.h -a -e include/config/auto.conf || ( \
echo >&2; \
echo >&2 ” ERROR: Kernel configuration is invalid.”; \
echo >&2 ” include/generated/autoconf.h or include/config/auto.conf are missing.”;\
echo >&2 ” Run ‘make oldconfig && make prepare’ on kernel src to fix it.”; \
echo >&2 ; \
/bin/false)
mkdir -p /tmp/selfgz26270/NVIDIA-Linux-x86_64-331.89/kernel/.tmp_versions ; rm -f /tmp/selfgz26270/NVIDIA-Linux-x86_64-331.89/kernel/.tmp_versions/*
make -f ./scripts/Makefile.build obj=/tmp/selfgz26270/NVIDIA-Linux-x86_64-331.89/kernel
cc -Wp,-MD,/tmp/selfgz26270/NVIDIA-Linux-x86_64-331.89/kernel/.nv.o.d -nostdinc -isystem /usr/lib/gcc/x86_64-linux-gnu/5/include -I./arch/x86/include -Iarch/x86/include/generated/uapi -Iarch/x86/include/generated -Iinclude -I./arch/x86/include/uapi -Iarch/x86/include/generated/uapi -I./include/uapi -Iinclude/generated/uapi –
include ./include/linux/kconfig.h -Iubuntu/include -D__KERNEL__ -fno-pie -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -Werror-implicit-function-declaration -Wno-format-security -std=gnu89 -fno-PIE -fno-pie -no-pie -mno-sse -mno-mmx -mno-sse2 -mno-3dnow -mno-avx -m64 -falign-jumps=1 -falign-loops=1 -mno-80387 -mno-fp-ret-in-387 -mpreferred-stack-boundary=3 -mskip-rax-setup -mtune=generic -mno-red-zone -mcmodel=kernel -funit-at-a-time -maccumulate-outgoing-args -DCONFIG_X86_X32_ABI -DCONFIG_AS_CFI=1 -DCONFIG_AS_CFI_SIGNAL_FRAME=1 -DCONFIG_AS_CFI_SECTIONS=1 -DCONFIG_AS_FXSAVEQ=1 -DCONFIG_AS_SSSE3=1 -DCONFIG_AS_CRC32=1 -DCONFIG_AS_AVX=1 -DCONFIG_AS_AVX2=1 -DCONFIG_AS_SHA1_NI=1 -DCONFIG_AS_SHA256_NI=1 -pipe -Wno-sign-compare -fno-asynchronous-unwind-tables -fno-delete-null-pointer-checks -Wno-maybe-uninitialized -O2 –param=allow-store-data-races=0 -Wframe-larger-than=1024 -fstack-protector-strong -Wno-unused-but-set-variable -fno-omit-frame-pointer -fno-optimize-sibling-calls
-fno-var-tracking-assignments -pg -mfentry -DCC_USING_FENTRY -Wdeclaration-after-statement -Wno-pointer-sign -fno-strict-overflow -fconserve-stack -Werror=implicit-int -Werror=strict-prototypes -Werror=date-time -DCC_HAVE_ASM_GOTO -DNV_MODULE_INSTANCE=0 -DNV_BUILD_MODULE_INSTANCES=0 -UDEBUG -U_DEBUG -DNDEBUG -I/tmp/selfgz26270/NVIDIA-Linux-x86_64-331.89/kernel -Wall -MD -Wsign-compare -Wno-cast-qual -Wno-error -D__KERNEL__ -DMODULE -DNVRM -DNV_VERSION_STRING=\”331.89\” -Wno-unused-function -Wuninitialized -mno-red-zone -mcmodel=kernel -DNV_UVM_ENABLE -D__linux__ -DNV_DEV_NAME=\”nvidia\” -DMODULE -D”KBUILD_STR(s)=#s” -D”KBUILD_BASENAME=KBUILD_STR(nv)” -D”KBUILD_MODNAME=KBUILD_STR(nvidia)” -c -o /tmp/selfgz26270/NVIDIA-Linux-x86_64-331.89/kernel/.tmp_nv.o /tmp/selfgz26270/NVIDIA-Linux-x86_64-331.89/kernel/nv.c
In file included from include/uapi/linux/stddef.h:1:0,
from include/linux/stddef.h:4,
from ./include/uapi/linux/posix_types.h:4,
from include/uapi/linux/types.h:13,
from include/linux/types.h:5,
from include/uapi/linux/capability.h:16,
from include/linux/capability.h:15,
from include/linux/sched.h:15,
from include/linux/utsname.h:5,
from /tmp/selfgz26270/NVIDIA-Linux-x86_64-331.89/kernel/nv-linux.h:44,
from /tmp/selfgz26270/NVIDIA-Linux-x86_64-331.89/kernel/nv.c:13:
include/asm-generic/qrwlock.h: In function ‘queued_write_trylock’:
include/asm-generic/qrwlock.h:93:36: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
cnts, cnts | _QW_LOCKED) == cnts);
^
include/linux/compiler.h:165:40: note: in definition of macro ‘likely’
# define likely(x) __builtin_expect(!!(x), 1)
^
In file included from ./arch/x86/include/asm/preempt.h:5:0,
from include/linux/preempt.h:59,
from include/linux/spinlock.h:50,
from include/linux/seqlock.h:35,
from include/linux/time.h:5,
from include/uapi/linux/timex.h:56,
from include/linux/timex.h:56,
from include/linux/sched.h:19,
from include/linux/utsname.h:5,
from /tmp/selfgz26270/NVIDIA-Linux-x86_64-331.89/kernel/nv-linux.h:44,
from /tmp/selfgz26270/NVIDIA-Linux-x86_64-331.89/kernel/nv.c:13:
include/linux/percpu-refcount.h: In function ‘percpu_ref_get_many’:
./arch/x86/include/asm/percpu.h:130:31: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
((val) == 1 || (val) == -1)) ? \
^
./arch/x86/include/asm/percpu.h:419:34: note: in expansion of macro ‘percpu_add_op’
#define this_cpu_add_1(pcp, val) percpu_add_op((pcp), val)
^
include/linux/percpu-defs.h:364:11: note: in expansion of macro ‘this_cpu_add_1’
case 1: stem##1(variable, __VA_ARGS__);break; \
^
include/linux/percpu-defs.h:496:33: note: in expansion of macro ‘__pcpu_size_call’
#define this_cpu_add(pcp, val) __pcpu_size_call(this_cpu_add_, pcp, val)
^
include/linux/percpu-refcount.h:177:3: note: in expansion of macro ‘this_cpu_add’
this_cpu_add(*percpu_count, nr);
^
./arch/x86/include/asm/percpu.h:130:31: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
((val) == 1 || (val) == -1)) ? \
^
./arch/x86/include/asm/percpu.h:420:34: note: in expansion of macro ‘percpu_add_op’
#define this_cpu_add_2(pcp, val) percpu_add_op((pcp), val)
^
include/linux/percpu-defs.h:365:11: note: in expansion of macro ‘this_cpu_add_2’
case 2: stem##2(variable, __VA_ARGS__);break; \
^
include/linux/percpu-defs.h:496:33: note: in expansion of macro ‘__pcpu_size_call’
#define this_cpu_add(pcp, val) __pcpu_size_call(this_cpu_add_, pcp, val)
^
include/linux/percpu-refcount.h:177:3: note: in expansion of macro ‘this_cpu_add’
this_cpu_add(*percpu_count, nr);
^
./arch/x86/include/asm/percpu.h:130:31: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
((val) == 1 || (val) == -1)) ? \
^
./arch/x86/include/asm/percpu.h:421:34: note: in expansion of macro ‘percpu_add_op’
#define this_cpu_add_4(pcp, val) percpu_add_op((pcp), val)
^
include/linux/percpu-defs.h:366:11: note: in expansion of macro ‘this_cpu_add_4’
case 4: stem##4(variable, __VA_ARGS__);break; \
^
include/linux/percpu-defs.h:496:33: note: in expansion of macro ‘__pcpu_size_call’
#define this_cpu_add(pcp, val) __pcpu_size_call(this_cpu_add_, pcp, val)
^
include/linux/percpu-refcount.h:177:3: note: in expansion of macro ‘this_cpu_add’
this_cpu_add(*percpu_count, nr);
^
./arch/x86/include/asm/percpu.h:130:31: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
((val) == 1 || (val) == -1)) ? \
^
./arch/x86/include/asm/percpu.h:478:35: note: in expansion of macro ‘percpu_add_op’
#define this_cpu_add_8(pcp, val) percpu_add_op((pcp), val)
^
include/linux/percpu-defs.h:367:11: note: in expansion of macro ‘this_cpu_add_8’
case 8: stem##8(variable, __VA_ARGS__);break; \
^
include/linux/percpu-defs.h:496:33: note: in expansion of macro ‘__pcpu_size_call’
#define this_cpu_add(pcp, val) __pcpu_size_call(this_cpu_add_, pcp, val)
^
include/linux/percpu-refcount.h:177:3: note: in expansion of macro ‘this_cpu_add’
this_cpu_add(*percpu_count, nr);
^
include/linux/percpu-refcount.h: In function ‘percpu_ref_put_many’:
./arch/x86/include/asm/percpu.h:130:31: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
((val) == 1 || (val) == -1)) ? \
^
./arch/x86/include/asm/percpu.h:419:34: note: in expansion of macro ‘percpu_add_op’
#define this_cpu_add_1(pcp, val) percpu_add_op((pcp), val)
^
include/linux/percpu-defs.h:364:11: note: in expansion of macro ‘this_cpu_add_1’
case 1: stem##1(variable, __VA_ARGS__);break; \
^
include/linux/percpu-defs.h:496:33: note: in expansion of macro ‘__pcpu_size_call’
#define this_cpu_add(pcp, val) __pcpu_size_call(this_cpu_add_, pcp, val)
^
include/linux/percpu-defs.h:506:33: note: in expansion of macro ‘this_cpu_add’
#define this_cpu_sub(pcp, val) this_cpu_add(pcp, -(typeof(pcp))(val))
^
include/linux/percpu-refcount.h:276:3: note: in expansion of macro ‘this_cpu_sub’
this_cpu_sub(*percpu_count, nr);
^
./arch/x86/include/asm/percpu.h:130:31: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
((val) == 1 || (val) == -1)) ? \
^
./arch/x86/include/asm/percpu.h:420:34: note: in expansion of macro ‘percpu_add_op’
#define this_cpu_add_2(pcp, val) percpu_add_op((pcp), val)
^
include/linux/percpu-defs.h:365:11: note: in expansion of macro ‘this_cpu_add_2’
case 2: stem##2(variable, __VA_ARGS__);break; \
^
include/linux/percpu-defs.h:496:33: note: in expansion of macro ‘__pcpu_size_call’
#define this_cpu_add(pcp, val) __pcpu_size_call(this_cpu_add_, pcp, val)
^
include/linux/percpu-defs.h:506:33: note: in expansion of macro ‘this_cpu_add’
#define this_cpu_sub(pcp, val) this_cpu_add(pcp, -(typeof(pcp))(val))
^
include/linux/percpu-refcount.h:276:3: note: in expansion of macro ‘this_cpu_sub’
this_cpu_sub(*percpu_count, nr);
^
./arch/x86/include/asm/percpu.h:130:31: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
((val) == 1 || (val) == -1)) ? \
^
./arch/x86/include/asm/percpu.h:421:34: note: in expansion of macro ‘percpu_add_op’
#define this_cpu_add_4(pcp, val) percpu_add_op((pcp), val)
^
include/linux/percpu-defs.h:366:11: note: in expansion of macro ‘this_cpu_add_4’
case 4: stem##4(variable, __VA_ARGS__);break; \
^
include/linux/percpu-defs.h:496:33: note: in expansion of macro ‘__pcpu_size_call’
#define this_cpu_add(pcp, val) __pcpu_size_call(this_cpu_add_, pcp, val)
^
include/linux/percpu-defs.h:506:33: note: in expansion of macro ‘this_cpu_add’
#define this_cpu_sub(pcp, val) this_cpu_add(pcp, -(typeof(pcp))(val))
^
include/linux/percpu-refcount.h:276:3: note: in expansion of macro ‘this_cpu_sub’
this_cpu_sub(*percpu_count, nr);
^
./arch/x86/include/asm/percpu.h:130:31: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
((val) == 1 || (val) == -1)) ? \
^
./arch/x86/include/asm/percpu.h:478:35: note: in expansion of macro ‘percpu_add_op’
#define this_cpu_add_8(pcp, val) percpu_add_op((pcp), val)
^
include/linux/percpu-defs.h:367:11: note: in expansion of macro ‘this_cpu_add_8’
case 8: stem##8(variable, __VA_ARGS__);break; \
^
include/linux/percpu-defs.h:496:33: note: in expansion of macro ‘__pcpu_size_call’
#define this_cpu_add(pcp, val) __pcpu_size_call(this_cpu_add_, pcp, val)
^
include/linux/percpu-defs.h:506:33: note: in expansion of macro ‘this_cpu_add’
#define this_cpu_sub(pcp, val) this_cpu_add(pcp, -(typeof(pcp))(val))
^
include/linux/percpu-refcount.h:276:3: note: in expansion of macro ‘this_cpu_sub’
this_cpu_sub(*percpu_count, nr);
^
In file included from include/uapi/linux/stddef.h:1:0,
from include/linux/stddef.h:4,
from ./include/uapi/linux/posix_types.h:4,
from include/uapi/linux/types.h:13,
from include/linux/types.h:5,
from include/uapi/linux/capability.h:16,
from include/linux/capability.h:15,
from include/linux/sched.h:15,
from include/linux/utsname.h:5,
from /tmp/selfgz26270/NVIDIA-Linux-x86_64-331.89/kernel/nv-linux.h:44,
from /tmp/selfgz26270/NVIDIA-Linux-x86_64-331.89/kernel/nv.c:13:
./arch/x86/include/asm/uaccess.h: In function ‘copy_from_user’:
./arch/x86/include/asm/uaccess.h:717:26: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
if (likely(sz < 0 || sz >= n))
^
include/linux/compiler.h:165:40: note: in definition of macro ‘likely’
# define likely(x) __builtin_expect(!!(x), 1)
^
./arch/x86/include/asm/uaccess.h: In function ‘copy_to_user’:
./arch/x86/include/asm/uaccess.h:735:26: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
if (likely(sz < 0 || sz >= n))
^
include/linux/compiler.h:165:40: note: in definition of macro ‘likely’
# define likely(x) __builtin_expect(!!(x), 1)
^
/tmp/selfgz26270/NVIDIA-Linux-x86_64-331.89/kernel/nv.c: In function ‘nvidia_unlocked_ioctl’:
/tmp/selfgz26270/NVIDIA-Linux-x86_64-331.89/kernel/nv.c:2027:29: error: ‘struct file’ has no member named ‘f_dentry’
return nvidia_ioctl(file->f_dentry->d_inode, file, cmd, i_arg);
^
scripts/Makefile.build:258: recipe for target ‘/tmp/selfgz26270/NVIDIA-Linux-x86_64-331.89/kernel/nv.o’ failed
make[2]: *** [/tmp/selfgz26270/NVIDIA-Linux-x86_64-331.89/kernel/nv.o] Error 1
Makefile:1420: recipe for target ‘_module_/tmp/selfgz26270/NVIDIA-Linux-x86_64-331.89/kernel’ failed
make[1]: *** [_module_/tmp/selfgz26270/NVIDIA-Linux-x86_64-331.89/kernel] Error 2
make[1]: Leaving directory ‘/usr/src/linux-headers-4.4.0-59-generic’
NVIDIA: left KBUILD.
nvidia.ko failed to build!
Makefile:178: recipe for target ‘nvidia.ko’ failed
make: *** [nvidia.ko] Error 1
-> Error.
ERROR: Unable to build the NVIDIA kernel module.
ERROR: Installation has failed. Please see the file ‘/var/log/nvidia-installer.log’ for details. You may find suggestions on fixing installation problems in the README available on the Linux driver download page at http://www.nvidia.com.

“`
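Most of that log is `-Wsign-compare` warnings from kernel headers; the line that actually stops the build is the `f_dentry` error near the end. As far as I know, the `f_dentry` alias was removed from the kernel around 3.19, which would explain why a driver as old as 331.89 cannot compile against the 4.4 kernel here. A small sketch for pulling only the hard errors out of an installer log follows; the helper name `installer_errors` is made up for this example, and the patterns are an assumption based on the log above:

```shell
# Print only the lines of an nvidia-installer log that report real errors,
# skipping the many compiler warnings and notes.
# installer_errors is a hypothetical helper; the log path is its argument.
installer_errors() {
    log_file=$1
    # " error: " is a compiler error, "ERROR: " at line start is the
    # installer's own summary, and "***" marks a failed make target.
    grep -E ' error: |^ERROR: |\*\*\*' "$log_file" || true
}

# Usage:
#   installer_errors /var/log/nvidia-installer.log
```

Run against the log above, this should leave just the `f_dentry` compile error, the failed make targets, and the installer's closing ERROR lines.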

 

The solution is DKMS: remove the stale module registration, then install the packaged driver.

```
sudo dkms remove nvidia-current-updates/331.89
sudo apt-get install nvidia-current-updates
```
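To see what DKMS has registered before and after those commands, `dkms status` lists each module with its version and state. This is a hedged sketch that filters that listing for NVIDIA entries; `nvidia_dkms_entries` is an illustrative name, and the exact module names on your system may differ:

```shell
# Filter `dkms status`-style output (read on stdin) down to lines whose
# module name starts with "nvidia".
# nvidia_dkms_entries is a hypothetical helper name.
nvidia_dkms_entries() {
    grep -i '^nvidia' || true   # an empty result is not an error
}

# Usage:
#   dkms status | nvidia_dkms_entries
```

An empty result after the `dkms remove` step would suggest the old registration is gone.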

Then re-install them:

```
sudo ./NVIDIA-Linux-x86_64-331.89.run
sudo apt-get install nvidia-cuda-toolkit
```

Ta-da! It all works! (nvidia-smi reports driver version 367.57.)

```
victor@ubuntu-tesla:~/$ nvidia-smi
Sat Feb 4 12:11:39 2017
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 367.57                 Driver Version: 367.57                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla K20c          Off  | 0000:01:00.0     Off |                    0 |
| 30%   35C    P8    27W / 225W |      0MiB /  4742MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   1  GeForce GT 520      Off  | 0000:02:00.0     N/A |                  N/A |
| 40%   40C    P0    N/A /  N/A |    348MiB /   955MiB |     N/A      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
|    1                  Not Supported                                         |
+-----------------------------------------------------------------------------+
```
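For scripts, the same information is easier to consume in CSV form. This sketch assumes the `--query-gpu` and `--format=csv,noheader` options of `nvidia-smi` are available in your driver version; `has_gpu` is a hypothetical helper, not something shipped with the driver:

```shell
# Succeed if the CSV listing on stdin contains a GPU whose name matches $1.
# Expected input: one "name, driver_version" line per GPU.
# has_gpu is a hypothetical helper name used for this example.
has_gpu() {
    wanted=$1
    # The GPU name is the first comma-separated field of each line.
    cut -d',' -f1 | grep -qF "$wanted"
}

# Usage:
#   nvidia-smi --query-gpu=name,driver_version --format=csv,noheader \
#       | has_gpu 'Tesla K20c' && echo "Tesla is visible"
```

Checking by name rather than by index avoids depending on the order in which the two cards are enumerated.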

 

 
