Installing and Using Nvidia vGPU (Nvidia Virtualized GPU) in an oVirt Environment

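Overview

The M60 card is currently in use, but a Tesla P4 is sitting idle, so this walkthrough uses the P4. The same procedure also applies to other Nvidia GPUs that support virtualization. The oVirt environment used here is version 4.4; the procedure should also apply to version 4.3.

The Nvidia vGPU driver packages used in this article can be downloaded from this site's package download page (Nvidia Grid vGPU for oVirt).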

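Procedure

Insert the Nvidia Tesla P4 card into the physical server. The P4 does not need a separate power connection, but note that some cards, such as the M60, do.

Shut down or migrate away all virtual machines running on the host.

Put the host into maintenance mode ("Management" -> "Maintenance").

Edit the host, open the Kernel tab, and select both "Hostdev Passthrough & SR-IOV" and "Blacklist Nouveau".

Reinstall the host ("Installation" -> "Reinstall").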

Add the line "blacklist nouveau" to /etc/modprobe.d/blacklist.conf.
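The edit in this step can be scripted so that running it twice does not duplicate the line. The following is a minimal sketch, not part of the original procedure; the function name `ensure_nouveau_blacklisted` is hypothetical, and the default path is the file named in the step above.

```shell
# Append "blacklist nouveau" to a modprobe config file unless the line is already present.
# ensure_nouveau_blacklisted is a hypothetical helper name used for illustration only.
ensure_nouveau_blacklisted() {
    local conf="${1:-/etc/modprobe.d/blacklist.conf}"
    # -q: quiet, -x: match the whole line, -F: fixed string.
    # 2>/dev/null hides the error grep prints when the file does not exist yet.
    if ! grep -qxF 'blacklist nouveau' "$conf" 2>/dev/null; then
        echo 'blacklist nouveau' >> "$conf"
    fi
}
```

Run as root on the host, `ensure_nouveau_blacklisted` with no argument edits the default file; passing another path is useful for trying the helper on a scratch file first.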

When that is done, reboot the host.

SSH into the host and run cat /proc/cmdline | grep nouveau to confirm that the kernel parameter change has taken effect.
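The same check can be wrapped in two small functions that return an exit status instead of printing the matching line. This is a sketch under assumptions: the function names are hypothetical, and the second check (reading /proc/modules to see whether the nouveau module is actually loaded) is an extra sanity check that the original step does not perform.

```shell
# Succeeds if the kernel command line mentions nouveau (same test as the grep above).
# Both function names are hypothetical; the file arguments exist only to make testing easy.
cmdline_mentions_nouveau() {
    local cmdline_file="${1:-/proc/cmdline}"
    grep -q 'nouveau' "$cmdline_file"
}

# Succeeds if the nouveau kernel module is currently loaded.
# Each line of /proc/modules starts with the module name followed by a space.
nouveau_module_loaded() {
    local modules_file="${1:-/proc/modules}"
    grep -q '^nouveau ' "$modules_file"
}
```

On a correctly prepared host one would expect `cmdline_mentions_nouveau` to succeed and `nouveau_module_loaded` to fail.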

Copy the Nvidia vGPU driver package to this host and install it:

[root@nodep4 ~]# rpm -ivh NVIDIA-vGPU-rhel-8.2-450.55.x86_64.rpm
Verifying... ################################# [100%]
Preparing... ################################# [100%]
Updating / installing...
  1:NVIDIA-vGPU-rhel-1:8.2-450.55 ################################# [100%]

Check the status with lsmod | grep nvidia_vgpu_vfio and systemctl status nvidia-vgpu-mgr.

If either check shows a problem, reboot the host.
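The two status checks above can be combined into one pass/fail helper, which is convenient when preparing several hosts. This is a minimal sketch: `module_loaded` and `vgpu_host_ready` are hypothetical names, and reading /proc/modules is used here in place of lsmod (lsmod formats the same data).

```shell
# Succeeds if the named kernel module appears in /proc/modules.
# module_loaded and vgpu_host_ready are hypothetical helper names.
module_loaded() {
    local name="$1" modules_file="${2:-/proc/modules}"
    grep -q "^${name} " "$modules_file"
}

# Checks the vGPU kernel module first, then the nvidia-vgpu-mgr service.
vgpu_host_ready() {
    local modules_file="${1:-/proc/modules}"
    if ! module_loaded nvidia_vgpu_vfio "$modules_file"; then
        echo "nvidia_vgpu_vfio module is not loaded" >&2
        return 1
    fi
    # is-active --quiet prints nothing and reports the state through its exit status
    if ! systemctl is-active --quiet nvidia-vgpu-mgr; then
        echo "nvidia-vgpu-mgr service is not active" >&2
        return 1
    fi
    echo "vGPU module and service look healthy"
}
```

If `vgpu_host_ready` fails, the message on stderr says which of the two checks to look at before rebooting.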

Check the state of the GPU with nvidia-smi.

Run "vdsm-client Host hostdevListByCaps" to list the available vGPU types and look for the "mdev" section of the output, as follows:

"mdev": {
        "nvidia-157": {
          "available_instances": "4",
          "description": "num_heads=4, frl_config=45, framebuffer=2048M, max_resolution=5120x2880, max_instance=4",
          "name": "GRID P4-2B"
        },
        "nvidia-214": {
          "available_instances": "4",
          "description": "num_heads=4, frl_config=45, framebuffer=2048M, max_resolution=5120x2880, max_instance=4",
          "name": "GRID P4-2B4"
        },
        "nvidia-243": {
          "available_instances": "8",
          "description": "num_heads=4, frl_config=45, framebuffer=1024M, max_resolution=5120x2880, max_instance=8",
          "name": "GRID P4-1B4"
        },
        "nvidia-288": {
          "available_instances": "2",
          "description": "num_heads=1, frl_config=60, framebuffer=4096M, max_resolution=4096x2160, max_instance=2",
          "name": "GRID P4-4C"
        },
        "nvidia-289": {
          "available_instances": "1",
          "description": "num_heads=1, frl_config=60, framebuffer=8192M, max_resolution=4096x2160, max_instance=1",
          "name": "GRID P4-8C"
        },
        "nvidia-63": {
          "available_instances": "8",
          "description": "num_heads=4, frl_config=60, framebuffer=1024M, max_resolution=5120x2880, max_instance=8",
          "name": "GRID P4-1Q"
        },
        "nvidia-64": {
          "available_instances": "4",
          "description": "num_heads=4, frl_config=60, framebuffer=2048M, max_resolution=7680x4320, max_instance=4",
          "name": "GRID P4-2Q"
        },
        "nvidia-65": {
          "available_instances": "2",
          "description": "num_heads=4, frl_config=60, framebuffer=4096M, max_resolution=7680x4320, max_instance=2",
          "name": "GRID P4-4Q"
        },
        "nvidia-66": {
          "available_instances": "1",
          "description": "num_heads=4, frl_config=60, framebuffer=8192M, max_resolution=7680x4320, max_instance=1",
          "name": "GRID P4-8Q"
        },
        "nvidia-67": {
          "available_instances": "8",
          "description": "num_heads=1, frl_config=60, framebuffer=1024M, max_resolution=1280x1024, max_instance=8",
          "name": "GRID P4-1A"
        },
        "nvidia-68": {
          "available_instances": "4",
          "description": "num_heads=1, frl_config=60, framebuffer=2048M, max_resolution=1280x1024, max_instance=4",
          "name": "GRID P4-2A"
        },
        "nvidia-69": {
          "available_instances": "2",
          "description": "num_heads=1, frl_config=60, framebuffer=4096M, max_resolution=1280x1024, max_instance=2",
          "name": "GRID P4-4A"
        },
        "nvidia-70": {
          "available_instances": "1",
          "description": "num_heads=1, frl_config=60, framebuffer=8192M, max_resolution=1280x1024, max_instance=1",
          "name": "GRID P4-8A"
        },
        "nvidia-71": {
          "available_instances": "8",
          "description": "num_heads=4, frl_config=45, framebuffer=1024M, max_resolution=5120x2880, max_instance=8",
          "name": "GRID P4-1B"
        }
      },

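Pick the mdev type that fits your needs. Each mdev type corresponds to a vGPU profile: for example, "nvidia-157" is "GRID P4-2B", which splits the card into 4 vGPUs, each with a 2048M framebuffer and a maximum resolution of 5120x2880. We use "nvidia-157" for this test.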
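For a more compact view than the JSON output, the same information can usually be read from the kernel's mediated-device sysfs tree. The sketch below is not the method used in this article; it assumes the host exposes the standard /sys/class/mdev_bus/<device>/mdev_supported_types layout, and `list_mdev_types` is a hypothetical helper name.

```shell
# Print one line per supported mdev type: type id, profile name, available instances.
# Assumes the standard kernel mdev sysfs layout; list_mdev_types is a hypothetical name.
list_mdev_types() {
    local bus_root="${1:-/sys/class/mdev_bus}"
    local type_dir
    for type_dir in "$bus_root"/*/mdev_supported_types/*; do
        # Skip the unexpanded pattern when no device or type directory exists
        [ -d "$type_dir" ] || continue
        printf '%s\t%s\t%s\n' \
            "$(basename "$type_dir")" \
            "$(cat "$type_dir/name")" \
            "$(cat "$type_dir/available_instances")"
    done
}
```

Calling `list_mdev_types | grep 'P4-2B'` would then narrow the list down to the B-series 2 GB profiles, if the sysfs tree is present on the host.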

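Create a new Windows 10 VM, install the operating system (steps omitted), install the guest tools (virtio-win, steps omitted), configure an IP address, and enable Remote Desktop access.

Then shut down the VM, edit its custom properties, and add mdev_type with the value "nvidia-157".

Power on the Windows 10 VM and connect to it with Windows Remote Desktop.

Copy the Nvidia vGPU driver for Windows 10 into the VM, double-click it, and follow the installer. When the installation finishes, reboot the VM and reconnect with Remote Desktop. Under This PC -> Device Manager -> Display adapters you should now see an "Nvidia GRID P4-2B" device, which confirms that the driver installed successfully.

Open the Nvidia Control Panel from the notification area at the right of the taskbar, select "Manage License", enter the license server's IP address and port number, and confirm. That completes the setup.

Note that SPICE does not support physical GPUs, which is why the VM is accessed here through Windows Remote Desktop (RDP); with SPICE the screen stays black. The next article will describe a way to use this vGPU for display output. (At present, VMs with an Nvidia vGPU can also be used over SPICE. It is officially listed as unsupported, but testing shows that it works, although the results are not yet very good.)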

PS: When reposting this article, please credit the source: oVirt Chinese Community (www.cnovirt.com)

