Installing the NVIDIA driver on a lab server with an RTX 2080 Ti

Posted 2020-08-27, updated 2021-04-10, category: ubuntu

Ubuntu 16.04, RTX 2080 Ti

I. Installing the NVIDIA driver

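1. Download the driver file
Download the driver that matches your GPU from the official site; it comes as a .run file. Download page: https://www.nvidia.cn/Download/index.aspx

Search for your card, then download the result. Remember which folder it was saved to, because you will need it later.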
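2. With the file downloaded, a few preparations are needed before installing.
(1) Remove any existing driver (optional): sudo apt-get remove --purge 'nvidia*'
When I ran this, it reported that no existing driver was installed.
(2) Disable nouveau. The open-source driver that ships with the system has to be disabled before the NVIDIA driver can be installed.
Open the file: sudo gedit /etc/modprobe.d/blacklist.conf
Append the following lines at the end:

blacklist nouveau
options nouveau modeset=0

Save and exit.
Run: sudo update-initramfs -u
Reboot, then run: lsmod | grep nouveau
If there is no output, nouveau has been disabled.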
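If you would rather not edit the file by hand, the two steps above can be sketched as small shell functions. This is only a sketch: the function names are mine, and the config file is passed as an argument so you can try it on a scratch file before touching /etc/modprobe.d (which needs root).

```shell
# Sketch only: the function names and the file argument are illustrative.

# Append the two nouveau lines to a modprobe config file, skipping any
# line that is already present so the function can be re-run safely.
add_nouveau_blacklist() {
    conf_file=$1
    for line in "blacklist nouveau" "options nouveau modeset=0"; do
        # -x matches the whole line, -F treats the pattern as a fixed string
        if ! grep -qxF "$line" "$conf_file" 2>/dev/null; then
            printf '%s\n' "$line" >> "$conf_file"
        fi
    done
}

# Read lsmod-style text on stdin; succeed if a nouveau module is listed.
nouveau_loaded() {
    awk '$1 == "nouveau" { found = 1 } END { exit !found }'
}
```

After the reboot, something like `lsmod | nouveau_loaded && echo "nouveau is still loaded"` gives a yes/no answer instead of relying on empty output.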

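3. Install the NVIDIA driver (have a second computer at hand for reading this guide or looking up commands)

Stop the graphical session: sudo service lightdm stop
Press Ctrl+Alt+F1 to switch to a text console and log in with your username and password (Ctrl+Alt+F7 switches back to the graphical session).
The download is a .run file, so it first needs execute permission. Start by changing to the download directory. (Important enough to say three times: mind the flags, mind the flags, mind the flags.)
Change to the folder: cd ~/download/ (use the path where you saved the .run file, usually the default download folder. If this fails, for example because the file cannot be found or the path contains Chinese characters, move the .run file into your home directory and run cd /home/your-username/ instead.)
Then run: sudo chmod a+x NVIDIA-Linux-x86_64-410.78.run (use the name of the file you downloaded)
This prints nothing.
Now install: sudo ./NVIDIA-Linux-x86_64-410.78.run --no-opengl-files

--no-opengl-files installs only the driver files and skips the OpenGL files. This is the most important flag.
--no-x-check skips the check for a running X server during installation.
--no-nouveau-check skips the check for nouveau during installation.
The last two flags are optional.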
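Since most problems at this step come from a wrong path or file name, here is a minimal sketch of a helper that checks the file before making it executable. The name prepare_installer is my own, not part of any NVIDIA tool, and it assumes you own the downloaded file (otherwise chmod needs sudo).

```shell
# Sketch only: prepare_installer is an illustrative helper name.
# Check that the .run file exists, then give it execute permission.
prepare_installer() {
    run_file=$1
    if [ ! -f "$run_file" ]; then
        echo "not found: $run_file (check the download folder and file name)" >&2
        return 1
    fi
    chmod a+x "$run_file"
}
```

For example: `prepare_installer ~/download/NVIDIA-Linux-x86_64-410.78.run && sudo ~/download/NVIDIA-Linux-x86_64-410.78.run --no-opengl-files`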
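From here on, accept the default choice at every prompt.
Then load the driver module: sudo modprobe nvidia
Finally, check that the installation worked: nvidia-smi

If nvidia-smi prints its GPU status table, the driver is installed. You can now restart the graphical session: sudo service lightdm start. The remaining steps can be run in an ordinary terminal.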
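For a check that is easier to read in a script than the full table, nvidia-smi can print selected fields as CSV. A rough sketch, assuming nvidia-smi is on the PATH once the driver is installed; check_driver is a name I made up.

```shell
# Sketch only: check_driver is an illustrative helper name.
check_driver() {
    if ! command -v nvidia-smi >/dev/null 2>&1; then
        echo "nvidia-smi not found: the driver is probably not installed" >&2
        return 1
    fi
    # Print one line per GPU with its name and the driver version.
    nvidia-smi --query-gpu=name,driver_version --format=csv,noheader
}
```

On a machine with two cards this should print two lines, one per GPU.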

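II. Installing CUDA 10.0

Install a version compatible with your GPU; I chose CUDA 10.0.

1. Download the file

Official site: https://developer.nvidia.com/cuda-zone (pick the version that suits your setup)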

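2. Install

# grant execute permission
sudo chmod a+x cuda_10.0.130_410.48_linux.run
sudo sh cuda_10.0.130_410.48_linux.run

A long license text appears first. Keep pressing Enter until you reach the end, then answer the prompts.

Because the NVIDIA driver was already installed separately, choose not to install the driver bundled with CUDA. When the installation finishes, a warning says that a driver still needs to be installed. That only appears because the bundled driver was skipped, so it can be ignored.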
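Skipping the bundled driver is only safe if the driver you installed yourself is new enough. As far as I understand the naming, the second number in the runfile name (410.48 here) is the driver version shipped with that toolkit, so it works as a lower bound. A small sketch for comparing the two; the helper names are hypothetical, and version_ge relies on GNU sort -V.

```shell
# Sketch only: the helper names are illustrative.

# Extract the driver version embedded in a CUDA runfile name, e.g.
# cuda_10.0.130_410.48_linux.run -> 410.48
bundled_driver_version() {
    basename "$1" | sed -n 's/^cuda_[0-9.]*_\([0-9.]*\)_linux\.run$/\1/p'
}

# Succeed if version $1 is greater than or equal to version $2.
# Relies on GNU sort -V (version sort), which Ubuntu provides.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}
```

With the 410.78 driver from part I, `version_ge 410.78 "$(bundled_driver_version cuda_10.0.130_410.48_linux.run)"` succeeds, so the separately installed driver is recent enough.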

3. Add the environment variables

Open .bashrc: gedit ~/.bashrc
Add the following three lines to the file:

export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda-10.0/lib64
export PATH=$PATH:/usr/local/cuda-10.0/bin
export CUDA_HOME=/usr/local/cuda-10.0

Save and close, then run:

source ~/.bashrc
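One thing to be aware of: every time .bashrc is sourced, plain appends add the same directories again, and when LD_LIBRARY_PATH starts out empty the result begins with a stray colon. A possible variant that avoids both is sketched below; append_path is my own helper, not a standard command.

```shell
# Sketch only: append_path is an illustrative helper name.
# Print the colon-separated list $1 with directory $2 appended,
# unless $2 is already one of its entries.
append_path() {
    case ":$1:" in
        *":$2:"*) printf '%s\n' "$1" ;;
        *)
            if [ -n "$1" ]; then
                printf '%s\n' "$1:$2"
            else
                printf '%s\n' "$2"
            fi
            ;;
    esac
}

export CUDA_HOME=/usr/local/cuda-10.0
export PATH="$(append_path "$PATH" "$CUDA_HOME/bin")"
export LD_LIBRARY_PATH="$(append_path "${LD_LIBRARY_PATH:-}" "$CUDA_HOME/lib64")"
```

These lines would go into ~/.bashrc in place of the three export lines; sourcing the file twice then leaves PATH unchanged the second time.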

4. In a terminal, run

nvcc --version

This prints the CUDA version information.

(Screenshot: successful installation)
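If you want to check the version in a script rather than by eye, the release number can be pulled out of the nvcc output. A minimal sketch, assuming the output contains a line of the form "Cuda compilation tools, release 10.0, V10.0.130"; cuda_release is a name I chose.

```shell
# Sketch only: cuda_release is an illustrative helper name.
# Read `nvcc --version` output on stdin and print the release number.
cuda_release() {
    sed -n 's/.*release \([0-9][0-9.]*\),.*/\1/p'
}
```

For example: `[ "$(nvcc --version | cuda_release)" = "10.0" ] && echo "CUDA 10.0 is on the PATH"`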

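5. Test the CUDA samples

Why build the CUDA samples? They are useful later for learning CUDA, and they verify that CUDA is really installed correctly. If all samples compile without a single error (warnings can be ignored), the installation is good. If the last line says PASS but the build printed errors, search for the error messages online and resolve them first.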

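# change to the directory that holds the CUDA samples
cd /usr/local/cuda-10.0/samples    # or: cd ~/NVIDIA_CUDA-10.0_Samples
# if make is missing, install it first with: sudo apt-get install make
# -j builds in parallel to use as much CPU as possible and speed up compilation
make -j
# after the build, change to the release directory (full path: /usr/local/cuda-10.0/samples/bin/x86_64/linux/release)
cd ./bin/x86_64/linux/release

# run a sample to check that it works
./deviceQuery

# read the output carefully: it shows the details of your NVIDIA GPUs, and a final Result = PASS means success

Note: once you see Result = PASS, you are done!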
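A side note on the build: with GNU make, -j without a number puts no limit on the number of parallel jobs, and on a tree as large as the samples that can use a lot of memory. A gentler option is to cap the jobs at the CPU count, roughly like this (job_count is a hypothetical helper):

```shell
# Sketch only: job_count is an illustrative helper name.
# Print the number of CPUs to use for make -j, falling back to 1.
job_count() {
    n=$(nproc 2>/dev/null || getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)
    echo "$n"
}
```

The build command then becomes `make -j"$(job_count)"`.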

The output looks like this:

Test result

./deviceQuery Starting...

 CUDA Device Query (Runtime API) version (CUDART static linking)

Detected 2 CUDA Capable device(s)

Device 0: "GeForce RTX 2080 Ti"
  CUDA Driver Version / Runtime Version          11.0 / 10.0
  CUDA Capability Major/Minor version number:    7.5
  Total amount of global memory:                 11019 MBytes (11554717696 bytes)
  (68) Multiprocessors, ( 64) CUDA Cores/MP:     4352 CUDA Cores
  GPU Max Clock rate:                            1545 MHz (1.54 GHz)
  Memory Clock rate:                             7000 Mhz
  Memory Bus Width:                              352-bit
  L2 Cache Size:                                 5767168 bytes
  Maximum Texture Dimension Size (x,y,z)         1D=(131072), 2D=(131072, 65536), 3D=(16384, 16384, 16384)
  Maximum Layered 1D Texture Size, (num) layers  1D=(32768), 2048 layers
  Maximum Layered 2D Texture Size, (num) layers  2D=(32768, 32768), 2048 layers
  Total amount of constant memory:               65536 bytes
  Total amount of shared memory per block:       49152 bytes
  Total number of registers available per block: 65536
  Warp size:                                     32
  Maximum number of threads per multiprocessor:  1024
  Maximum number of threads per block:           1024
  Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
  Max dimension size of a grid size    (x,y,z): (2147483647, 65535, 65535)
  Maximum memory pitch:                          2147483647 bytes
  Texture alignment:                             512 bytes
  Concurrent copy and kernel execution:          Yes with 3 copy engine(s)
  Run time limit on kernels:                     No
  Integrated GPU sharing Host Memory:            No
  Support host page-locked memory mapping:       Yes
  Alignment requirement for Surfaces:            Yes
  Device has ECC support:                        Disabled
  Device supports Unified Addressing (UVA):      Yes
  Device supports Compute Preemption:            Yes
  Supports Cooperative Kernel Launch:            Yes
  Supports MultiDevice Co-op Kernel Launch:      Yes
  Device PCI Domain ID / Bus ID / location ID:   0 / 26 / 0
  Compute Mode:
     < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >

Device 1: "GeForce RTX 2080 Ti"
  CUDA Driver Version / Runtime Version          11.0 / 10.0
  CUDA Capability Major/Minor version number:    7.5
  Total amount of global memory:                 11016 MBytes (11551440896 bytes)
  (68) Multiprocessors, ( 64) CUDA Cores/MP:     4352 CUDA Cores
  GPU Max Clock rate:                            1545 MHz (1.54 GHz)
  Memory Clock rate:                             7000 Mhz
  Memory Bus Width:                              352-bit
  L2 Cache Size:                                 5767168 bytes
  Maximum Texture Dimension Size (x,y,z)         1D=(131072), 2D=(131072, 65536), 3D=(16384, 16384, 16384)
  Maximum Layered 1D Texture Size, (num) layers  1D=(32768), 2048 layers
  Maximum Layered 2D Texture Size, (num) layers  2D=(32768, 32768), 2048 layers
  Total amount of constant memory:               65536 bytes
  Total amount of shared memory per block:       49152 bytes
  Total number of registers available per block: 65536
  Warp size:                                     32
  Maximum number of threads per multiprocessor:  1024
  Maximum number of threads per block:           1024
  Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
  Max dimension size of a grid size    (x,y,z): (2147483647, 65535, 65535)
  Maximum memory pitch:                          2147483647 bytes
  Texture alignment:                             512 bytes
  Concurrent copy and kernel execution:          Yes with 3 copy engine(s)
  Run time limit on kernels:                     Yes
  Integrated GPU sharing Host Memory:            No
  Support host page-locked memory mapping:       Yes
  Alignment requirement for Surfaces:            Yes
  Device has ECC support:                        Disabled
  Device supports Unified Addressing (UVA):      Yes
  Device supports Compute Preemption:            Yes
  Supports Cooperative Kernel Launch:            Yes
  Supports MultiDevice Co-op Kernel Launch:      Yes
  Device PCI Domain ID / Bus ID / location ID:   0 / 104 / 0
  Compute Mode:
     < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >
> Peer access from GeForce RTX 2080 Ti (GPU0) -> GeForce RTX 2080 Ti (GPU1) : No
> Peer access from GeForce RTX 2080 Ti (GPU1) -> GeForce RTX 2080 Ti (GPU0) : No

deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 11.0, CUDA Runtime Version = 10.0, NumDevs = 2
Result = PASS
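With this much output it is easy to miss the last line, so the check can also be scripted. A small sketch that reads the deviceQuery output from stdin; both function names are mine, and the patterns assume the output format shown above.

```shell
# Sketch only: the helper names are illustrative.
# Both functions read deviceQuery output on stdin.

# Succeed only if the output contains a line starting with "Result = PASS".
device_query_passed() {
    grep -q '^Result = PASS'
}

# Print the number of devices from the "Detected N CUDA Capable device(s)" line.
device_count() {
    sed -n 's/^Detected \([0-9][0-9]*\) CUDA Capable device.*/\1/p'
}
```

For example: `./deviceQuery | device_query_passed && echo "CUDA works"`, or `./deviceQuery | device_count` to confirm that both cards were detected.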