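Problems and Fixes When Deploying oVirt Inside VMware

Preface

Quite a few readers have reported problems deploying oVirt inside VMware virtual machines. I tried it myself and, sure enough, there are pitfalls everywhere. For readers with limited hardware who can only study oVirt inside VMware, I am sharing the problems I ran into and how I solved them.

Note: although it can be made to work, running oVirt inside VMware is still not recommended; use it for learning only. This article covers the HostedEngine deployment method.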

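Environment

Hardware: ThinkPad W510, 4 cores / 8 threads, 16 GB RAM;

VMware environment: VMware Workstation 12; (VMware Workstation 15 is known to have the same problems. VMware vSphere was not tested; if you hit problems there, the fixes in this article are worth trying.)

Host VM (used as the oVirt node): 2 CPUs with 4 cores each, 8 GB RAM, 200 GB disk;

oVirt version: 4.3.9;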

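Notes:

  1. Every fix below has to be applied on the host VM over SSH;
  2. Make sure all the other steps work correctly; if you are unsure about them, see the tutorials on this site or the official documentation. Pay particular attention to the hostname mappings in /etc/hosts;
  3. If you are deploying oVirt in VMware Workstation, it is best to apply every fix below before you start, and only then run the deployment;
  4. oVirt deployment differs between versions and its internals change often, so later versions may no longer have these problems;
  5. Performance inside a VM is limited, which can cause unexpected problems. Try a few more times!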
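
Because several of the failures discussed here come down to /etc/hosts, a quick check before deploying may save a retry. The following is only a minimal sketch: the helper names parse_hosts and check_mapping are made up for illustration, and the hostnames and addresses are the example values that appear later in this post, not values you should expect on your system.

```python
def parse_hosts(text):
    """Map each hostname in hosts-file text to its IP (first entry wins)."""
    mapping = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        fields = line.split()
        ip, names = fields[0], fields[1:]
        for name in names:
            mapping.setdefault(name.lower(), ip)
    return mapping


def check_mapping(text, expected):
    """Return a list of problems; an empty list means every name maps as expected."""
    mapping = parse_hosts(text)
    problems = []
    for name, ip in sorted(expected.items()):
        found = mapping.get(name.lower())
        if found is None:
            problems.append("%s: no entry" % name)
        elif found != ip:
            problems.append("%s: maps to %s, expected %s" % (name, found, ip))
    return problems


# Example values only (assumed): replace with your own node and engine names.
SAMPLE_HOSTS = """127.0.0.1 localhost
192.168.8.221 node221.com
192.168.8.222 engine222.com
"""
EXPECTED = {"node221.com": "192.168.8.221", "engine222.com": "192.168.8.222"}
print(check_mapping(SAMPLE_HOSTS, EXPECTED))
```

On the node itself you would pass the contents of /etc/hosts instead of SAMPLE_HOSTS. The sketch only inspects the file; it does not tell you whether DNS returns something different for the same name.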

Problems and solutions

  • Stuck at "Get local VM IP", then an error:

[ INFO ] TASK [ovirt.hosted_engine_setup : Get local VM IP]

 

[ ERROR ] fatal: [localhost]: FAILED! => {"attempts": 90, "changed": true, "cmd": "virsh -r net-dhcp-leases default | grep -i 00:16:3e:03:01:30 | awk '{ print $5 }' | cut -f1 -d'/'", "delta": "0:00:00.050951", "end": "2020-04-23 09:57:00.043571", "rc": 0, "start": "2020-04-23 09:56:59.992620", "stderr": "", "stderr_lines": [], "stdout": "", "stdout_lines": []}

Solution:

In /usr/share/ansible/roles/ovirt.hosted-engine-setup/tasks/bootstrap_local_vm/02_create_local_vm.yml, find the "virt-install" section and change "--machine pc-i440fx-rhel7.6.0" to "--machine pc-i440fx-rhel7.1.0", as shown in the screenshot below:
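
If you would rather script this edit than make it by hand, something along these lines could work. It is a sketch, not part of oVirt: replace_in_file is a hypothetical helper, and it assumes the option appears in the file exactly as written here, with two ASCII hyphens.

```python
import shutil

OLD = "--machine pc-i440fx-rhel7.6.0"
NEW = "--machine pc-i440fx-rhel7.1.0"


def replace_in_file(path, old, new, backup_suffix=".orig"):
    """Replace every occurrence of old with new in the file at path.

    Returns the number of occurrences replaced. A backup copy is written
    first; the file is left untouched when old does not occur in it.
    """
    with open(path) as fh:
        text = fh.read()
    count = text.count(old)
    if count:
        shutil.copy2(path, path + backup_suffix)  # keep the original around
        with open(path, "w") as fh:
            fh.write(text.replace(old, new))
    return count
```

Calling replace_in_file(path_to_02_create_local_vm_yml, OLD, NEW) should return 1 on a matching file. A return value of 0 means the string was not found, which may indicate a different oVirt version or an already patched file.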

[Screenshot: the edited virt-install command in 02_create_local_vm.yml]
  • Stuck at "Inject network configuration with guestfish":

[ INFO ] TASK [ovirt.hosted_engine_setup : Generate DHCP network configuration for the engine VM]

[ INFO ] skipping: [localhost]

[ INFO ] TASK [ovirt.hosted_engine_setup : Generate static network configuration for the engine VM, IPv4]

[ INFO ] changed: [localhost]

[ INFO ] TASK [ovirt.hosted_engine_setup : Generate static network configuration for the engine VM, IPv6]

[ INFO ] skipping: [localhost]

[ INFO ] TASK [ovirt.hosted_engine_setup : Inject network configuration with guestfish]

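Solution:

In /usr/share/ansible/roles/ovirt.hosted_engine_setup/tasks/create_target_vm/03_hosted_engine_final_tasks.yml, find the "guestfish" tasks (note that there are two of them) and add the guestfish environment variable LIBGUESTFS_BACKEND_SETTINGS: force_tcg to each, as shown in the screenshots below: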
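
As background: according to the libguestfs documentation, the force_tcg backend setting makes libguestfs run its helper appliance under software emulation (TCG) instead of KVM, which is presumably why it helps when the node is itself a VMware guest. To try the same setting outside Ansible, a sketch like the following may be useful; force_tcg_env and run_with_force_tcg are hypothetical helper names, not part of any oVirt or libguestfs API.

```python
import os
import subprocess


def force_tcg_env(base=None):
    """Return a copy of the environment with libguestfs forced to use TCG."""
    env = dict(os.environ if base is None else base)
    env["LIBGUESTFS_BACKEND_SETTINGS"] = "force_tcg"
    return env


def run_with_force_tcg(argv):
    """Run a libguestfs-based tool with force_tcg set; return its exit code."""
    return subprocess.call(argv, env=force_tcg_env())
```

For example, run_with_force_tcg(["libguestfs-test-tool"]) runs the libguestfs self-test with the setting applied. Comparing that run with a plain run of the same tool may show whether the KVM-backed appliance is what hangs on your host.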

[Screenshots: the two edited guestfish tasks in 03_hosted_engine_final_tasks.yml]
  • Stuck at "Extract /etc/hosts from the Hosted Engine VM":

[ INFO ] TASK [ovirt.hosted_engine_setup : Inject network configuration with guestfish]

[ INFO ] changed: [localhost]

[ INFO ] TASK [ovirt.hosted_engine_setup : Extract /etc/hosts from the Hosted Engine VM]

Solution:

In /usr/share/ansible/roles/ovirt.hosted_engine_setup/tasks/create_target_vm/03_hosted_engine_final_tasks.yml, find the "virt-copy-out" task and add the environment variable LIBGUESTFS_BACKEND_SETTINGS: force_tcg, as shown in the screenshot below:

[Screenshot: the edited virt-copy-out task in 03_hosted_engine_final_tasks.yml]
  • Stuck at "Wait for the host to be up":

[ INFO ] TASK [ovirt.hosted_engine_setup : include_tasks]

[ INFO ] skipping: [localhost]

[ INFO ] TASK [ovirt.hosted_engine_setup : include_tasks]

[ INFO ] ok: [localhost]

[ INFO ] TASK [ovirt.hosted_engine_setup : Always revoke the SSO token]

[ INFO ] TASK [ovirt.hosted_engine_setup : include_tasks]

[ INFO ] ok: [localhost]

[ INFO ] TASK [ovirt.hosted_engine_setup : Obtain SSO token using username/password credentials]

[ INFO ] ok: [localhost]

[ INFO ] TASK [ovirt.hosted_engine_setup : Wait for the host to be up]

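Important:

This may be a VM performance issue: adding the host is slow, the task times out, and later steps then fail. It is also possible that the add-host operation was never executed at all, in which case you have to add the host manually. If you are not sure which it is, log in to the engine and check; the steps are described below.

First, lengthen the timeout for adding the host.

Open /usr/share/ansible/roles/ovirt.hosted_engine_setup/tasks/bootstrap_local_vm/05_add_host.yml, find the "Wait for the host to be up" task, and change "retries: 120" to "retries: 360", as shown in the screenshot below: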

[Screenshot: the edited "Wait for the host to be up" task in 05_add_host.yml]

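Then, when the deployment reaches this step, pay close attention.

On your local machine (the one running your browser), add a hostname mapping for the node (that is, this host VM) to the hosts file, for example: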

192.168.8.221 node221.com

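Then open the engine VM's temporary management console by hostname. The address is https://node221.com:6900: the port is 6900, and the hostname is that of your node, not of the engine. Once the page opens, click "Administration Portal" and log in with the admin account; the password is the one you entered on the deployment page.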

[Screenshot: the engine VM's temporary management console]

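Then go to "Compute" -> "Hosts" and check whether the host list contains a host that is currently being installed. If the list is empty, click "New" in the upper right and add the host manually: enter the name, address, and root password, as shown in the screenshot below: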

[Screenshot: the New Host dialog]

Click OK and wait until the host is activated and the deployment page is no longer stuck.

  • Stuck at "Check engine VM health":

[ INFO ] TASK [ovirt.hosted_engine_setup : Check engine VM health]

[ ERROR ] fatal: [localhost]: FAILED! => {"attempts": 180, "changed": true, "cmd": ["hosted-engine", "--vm-status", "--json"], "delta": "0:00:00.901148", "end": "2020-04-28 05:22:06.475416", "rc": 0, "start": "2020-04-28 05:22:05.574268", "stderr": "", "stderr_lines": [], "stdout": "{\"1\": {\"conf_on_shared_storage\": true, \"live-data\": true, \"extra\": \"metadata_parse_version=1\\nmetadata_feature_version=1\\ntimestamp=10895 (Tue Apr 28 05:21:55 2020)\\nhost-id=1\\nscore=3400\\nvm_conf_refresh_time=10895 (Tue Apr 28 05:21:55 2020)\\nconf_on_shared_storage=True\\nmaintenance=False\\nstate=EngineStarting\\nstopped=False\\n\", \"hostname\": \"node221.com\", \"host-id\": 1, \"engine-status\": {\"reason\": \"failed liveliness check\", \"health\": \"bad\", \"vm\": \"up\", \"detail\": \"Up\"}, \"score\": 3400, \"stopped\": false, \"maintenance\": false, \"crc32\": \"35c7eb12\", \"local_conf_timestamp\": 10895, \"host-ts\": 10895}, \"global_maintenance\": false}", "stdout_lines": ["{\"1\": {\"conf_on_shared_storage\": true, \"live-data\": true, \"extra\": \"metadata_parse_version=1\\nmetadata_feature_version=1\\ntimestamp=10895 (Tue Apr 28 05:21:55 2020)\\nhost-id=1\\nscore=3400\\nvm_conf_refresh_time=10895 (Tue Apr 28 05:21:55 2020)\\nconf_on_shared_storage=True\\nmaintenance=False\\nstate=EngineStarting\\nstopped=False\\n\", \"hostname\": \"node221.com\", \"host-id\": 1, \"engine-status\": {\"reason\": \"failed liveliness check\", \"health\": \"bad\", \"vm\": \"up\", \"detail\": \"Up\"}, \"score\": 3400, \"stopped\": false, \"maintenance\": false, \"crc32\": \"35c7eb12\", \"local_conf_timestamp\": 10895, \"host-ts\": 10895}, \"global_maintenance\": false}"]}

[ INFO ] TASK [ovirt.hosted_engine_setup : Check VM status at virt level]

[ INFO ] changed: [localhost]

[ INFO ] TASK [ovirt.hosted_engine_setup : Fail if engine VM is not running]

[ INFO ] skipping: [localhost]

[ INFO ] TASK [ovirt.hosted_engine_setup : Get target engine VM IP address]

[ INFO ] changed: [localhost]

[ INFO ] TASK [ovirt.hosted_engine_setup : Get VDSM’s target engine VM stats]

[ INFO ] changed: [localhost]

[ INFO ] TASK [ovirt.hosted_engine_setup : Convert stats to JSON format]

[ INFO ] ok: [localhost]

[ INFO ] TASK [ovirt.hosted_engine_setup : Get target engine VM IP address from VDSM stats]

[ INFO ] ok: [localhost]

[ INFO ] TASK [ovirt.hosted_engine_setup : Fail if Engine IP is different from engine’s he_fqdn resolved IP]

[ ERROR ] fatal: [localhost]: FAILED! => {"changed": false, "msg": "Engine VM IP address is while the engine's he_fqdn engine222.com resolves to 192.168.8.222. If you are using DHCP, check your DHCP reservation configuration"}
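
The health task in the log above runs hosted-engine --vm-status --json, and the useful part of its output (the engine-status object) is easy to miss inside the escaped string. The sketch below pulls it out; engine_health is a hypothetical helper, and SAMPLE is a trimmed copy of the JSON from the log, so the field names are taken from that output rather than from any documented schema.

```python
import json


def engine_health(vm_status_json):
    """Summarize `hosted-engine --vm-status --json` output.

    Returns {hostname: (vm, health, reason)} for every host entry.
    """
    status = json.loads(vm_status_json)
    result = {}
    for key, host in status.items():
        if not isinstance(host, dict):  # skip e.g. "global_maintenance": false
            continue
        engine = host.get("engine-status", {})
        result[host.get("hostname", key)] = (
            engine.get("vm"), engine.get("health"), engine.get("reason"))
    return result


# Trimmed from the log above.
SAMPLE = ('{"1": {"hostname": "node221.com", "host-id": 1, "engine-status": '
          '{"reason": "failed liveliness check", "health": "bad", "vm": "up", '
          '"detail": "Up"}, "score": 3400}, "global_maintenance": false}')
print(engine_health(SAMPLE))
```

Here the VM is "up" but its health is "bad" with reason "failed liveliness check": the engine VM has been started, yet the engine service inside it is not answering.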

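Solution:

In the /usr/libexec/vdsm/hooks/before_vm_start directory, create a file named "50_vm_rhel7_1_0" and make it executable (for example with chmod 755, as discussed in the replies below). The file contents are: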

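#!/usr/bin/python2
# VDSM before_vm_start hook: downgrade the machine type of the VM that is
# about to start, so that it can boot when the node is a VMware guest.

import hooking

OLD_MACHINE = 'pc-i440fx-rhel7.6.0'
NEW_MACHINE = 'pc-i440fx-rhel7.1.0'

domxml = hooking.read_domxml()

# The machine type lives in <os><type machine="..."> of the libvirt XML.
for os_elem in domxml.getElementsByTagName('os'):
    for type_elem in os_elem.getElementsByTagName('type'):
        # getAttribute returns '' when the attribute is missing, so a
        # <type> element without "machine" is skipped instead of raising.
        if type_elem.getAttribute('machine') == OLD_MACHINE:
            type_elem.setAttribute('machine', NEW_MACHINE)

hooking.write_domxml(domxml)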
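
The hook can only run inside VDSM, where the hooking module is available, but its XML rewrite can be tried on any machine. The sketch below assumes that hooking.read_domxml() returns an xml.dom.minidom document, which is my understanding of VDSM's hooking module; downgrade_machine and the sample domain XML are made up for illustration.

```python
from xml.dom import minidom

OLD_MACHINE = "pc-i440fx-rhel7.6.0"
NEW_MACHINE = "pc-i440fx-rhel7.1.0"


def downgrade_machine(domxml):
    """Rewrite <os><type machine=...> in place; return True if anything changed."""
    changed = False
    for os_elem in domxml.getElementsByTagName("os"):
        for type_elem in os_elem.getElementsByTagName("type"):
            if type_elem.getAttribute("machine") == OLD_MACHINE:
                type_elem.setAttribute("machine", NEW_MACHINE)
                changed = True
    return changed


# Minimal stand-in for the domain XML that VDSM hands to the hook (assumed shape).
SAMPLE = """<domain type='kvm'>
  <os><type arch='x86_64' machine='pc-i440fx-rhel7.6.0'>hvm</type></os>
</domain>"""

doc = minidom.parseString(SAMPLE)
print(downgrade_machine(doc))
print(doc.getElementsByTagName("type")[0].getAttribute("machine"))
```

If the first print shows False on your real domain XML, the VM is probably being started with a machine type other than pc-i440fx-rhel7.6.0, and the constant would need to match what your oVirt version actually uses.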
  • Stuck at "Copy engine logs":

[ INFO ] TASK [ovirt.hosted_engine_setup : Add additional gluster hosts to engine]

[ INFO ] skipping: [localhost]

[ INFO ] TASK [ovirt.hosted_engine_setup : Add additional glusterfs storage domains]

[ INFO ] skipping: [localhost]

[ INFO ] TASK [ovirt.hosted_engine_setup : Fetch logs from the engine VM]

[ INFO ] ok: [localhost]

[ INFO ] TASK [ovirt.hosted_engine_setup : Set destination directory path]

[ INFO ] ok: [localhost]

[ INFO ] TASK [ovirt.hosted_engine_setup : Create destination directory]

[ INFO ] changed: [localhost]

[ INFO ] TASK [ovirt.hosted_engine_setup : include_tasks]

[ INFO ] ok: [localhost]

[ INFO ] TASK [ovirt.hosted_engine_setup : Find the local appliance image]

[ INFO ] ok: [localhost]

[ INFO ] TASK [ovirt.hosted_engine_setup : Set local_vm_disk_path]

[ INFO ] ok: [localhost]

[ INFO ] TASK [ovirt.hosted_engine_setup : Give the vm time to flush dirty buffers]

[ INFO ] ok: [localhost -> localhost]

[ INFO ] TASK [ovirt.hosted_engine_setup : Copy engine logs]

Solution:

In /usr/share/ansible/roles/ovirt.hosted_engine_setup/tasks/fetch_engine_logs.yml, find the "Copy engine logs" task and add the environment variable LIBGUESTFS_BACKEND_SETTINGS: force_tcg to its virt-copy-out call, as shown in the screenshot below:

[Screenshot: the edited "Copy engine logs" task in fetch_engine_logs.yml]

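Note: whenever a deployment fails, it is best to run the following cleanup steps before deploying again:

1. Kill any qemu VMs that may still be running

pkill qemu-kvm

2. Run the ovirt-hosted-engine cleanup command (enter y to confirm the cleanup)

ovirt-hosted-engine-cleanup

3. Remove the temporary files

rm -rf /var/tmp/localvm*

4. Remove the log files

rm -rf /var/log/ovirt-hosted-engine-setup/*

5. Clean the storage that holds the engine (using NFS on the local machine as an example)

rm -rf /data/images/nfs/*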
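
Since these steps tend to be repeated after every failed attempt, they can be wrapped in a small script. The following is only a sketch with hypothetical helper names (cleanup_plan, execute); it defaults to a dry run that prints what would happen, and the storage path is the NFS example from step 5, which you would need to adjust.

```python
import glob
import os
import shutil
import subprocess


def cleanup_plan(storage_dir="/data/images/nfs"):
    """Return the cleanup steps listed above, in order."""
    return [
        ("run", ["pkill", "qemu-kvm"]),
        ("run", ["ovirt-hosted-engine-cleanup"]),  # prompts for confirmation (y)
        ("remove", "/var/tmp/localvm*"),
        ("remove", "/var/log/ovirt-hosted-engine-setup/*"),
        ("remove", os.path.join(storage_dir, "*")),
    ]


def execute(plan, dry_run=True):
    """Print each step; change the system only when dry_run is False."""
    for kind, target in plan:
        if kind == "run":
            print("run: " + " ".join(target))
            if not dry_run:
                subprocess.call(target)
            continue
        for path in glob.glob(target):  # same matching as the shell globs above
            print("remove: " + path)
            if dry_run:
                continue
            if os.path.isdir(path) and not os.path.islink(path):
                shutil.rmtree(path)
            else:
                os.remove(path)
```

Run execute(cleanup_plan()) first and read the output; only call execute(cleanup_plan(), dry_run=False) once the listed paths are really the ones you want to delete.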

Good luck!

PS: If you repost this article, please credit the source: oVirt Chinese Community (www.cnovirt.com)

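58 replies
  1. Hi, I created a virtual host inside oVirt to test this deployment method and got stuck at this step; it keeps looping until it times out. I also looked at its yml file but cannot tell what it is trying to fetch. Could you give me some pointers? Thanks.

    • Could it be a network problem? For nested virtualization in oVirt you need to change the network settings; see this article: http://www.cnovirt.com/archives/446

    • Solved. The name that the node resolves to in /etc/hosts must match the host's hostname. I hope nobody else falls into this trap.

    • Disable IPv6 in the network settings of the node host.

  2. VMware Workstation 15: the installation failed, stuck at Check engine VM health. I had already created the 50_vm_rhel7_1_0 file as described in your article, added the contents, and set the file's permissions to 755.

    • Don't create it directly with vi, because it has to be an executable file. You can copy another file in that directory and then change its contents and name.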

  3. [ FAILED ] Check engine VM health
    [ 00:02 ] Check VM status at virt level
    [ 00:02 ] Get target engine VM IP address
    [ 00:02 ] Get VDSM’s target engine VM stats
    [ 00:01 ] Convert stats to JSON format
    [ 00:02 ] Get target engine VM IP address from VDSM stats
    [ FAILED ] Fail if Engine IP is different from engine’s he_fqdn resolved IP
    2020-05-08 20:27:07,256+0800 DEBUG ansible on_any args kwargs

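    • The error says the engine's IP and hostname don't match. Check the hostname mappings in /etc/hosts on the node; the IP and hostname must be the same as the ones you entered during deployment.

    • I've tried many times and it gets stuck at this point every time, and I have already checked that the mappings in /etc/hosts are fine.

    • Make the changes the author describes and re-initialize the environment; it won't go wrong.

    • Logging in to the node's Cockpit, I can see that the hosted engine VM is running, but I cannot connect to it with VNC or virt-viewer, and the hosted engine's address does not respond to ping. I tried force-restarting the hosted engine VM, and after that it would no longer start.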

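  4. After setting up a hyperconverged deployment with GFS, deploying the hosted engine complained that there is less than 61G available. How can I expand node1.com:/engine? Is there a parameter somewhere that I can change?

    node1.com:/engine 67G 5.2G 59G 9% /rhev/data-center/mnt/glusterSD/node1.com:_engine

    • When you enter the LV sizes during deployment, make the engine one larger; 65G or more is recommended;
      Use at least two disks for the deployment: one for installing the node and one for storage. The storage disk should be 100G or larger in total;

    • I had no problems at all when I deployed with vm.

  5. That one is solved; it worked after I changed the gfs disk to 200.
    My engine and the hyperconverged setup are both working now, but I have run into another problem.
    The engine shows: Engine status : {"health": "good", "vm": "up", "detail": "Up"}
    But the web page just won't load. These machines are VMs on ESXi; could this be related to ESXi?

    • What error do you get when the page won't load?

  6. It looks just like when hosts resolution is not configured: "This site can't be reached". But I did configure my local hosts file.

  7. Also, the deployed engine can only ping the IP of its host node; it cannot ping anything else on the subnet.

    • Same problem here.

    • VMs created in VMware need to use bridged networking.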

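  8. When Cockpit crashes during a deployment, that deployment is doomed to fail.

  9. Could you explain why this environment variable needs to be added: LIBGUESTFS_BACKEND_SETTINGS: force_tcg

  10. Could the author share how the network is set up in your VMware Workstation lab environment?

    • Use bridged networking for the VM, bridged directly to the physical NIC, with an IP on the same subnet as the host machine.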

  11. [ INFO ] TASK [ovirt.hosted_engine_setup : Wait for the host to be up]
    [ ERROR ] fatal: [localhost]: FAILED! => {“ansible_facts”: {“ovirt_hosts”: [{“address”: “ovirt.allin1.com”, “affinity_labels”: [], “auto_numa_status”: “unknown”, “certificate”: {“organization”: “test.com”, “subject”: “O=test.com,CN=ovirt.allin1.com”}, “cluster”: {“href”: “/ovirt-engine/api/clusters/ae098e12-a3bd-11ea-a916-00163e0c3513”, “id”: “ae098e12-a3bd-11ea-a916-00163e0c3513”}, “comment”: “”, “cpu”: {“speed”: 0.0, “topology”: {}}, “device_passthrough”: {“enabled”: false}, “devices”: [], “external_network_provider_configurations”: [], “external_status”: “ok”, “hardware_information”: {“supported_rng_sources”: []}, “hooks”: [], “href”: “/ovirt-engine/api/hosts/8f5d4e02-acdb-4c0c-929e-4c3a805b8681”, “id”: “8f5d4e02-acdb-4c0c-929e-4c3a805b8681”, “katello_errata”: [], “kdump_status”: “unknown”, “ksm”: {“enabled”: false}, “max_scheduling_memory”: 0, “memory”: 0, “name”: “ovirt.allin1.com”, “network_attachments”: [], “nics”: [], “numa_nodes”: [], “numa_supported”: false, “os”: {“custom_kernel_cmdline”: “”}, “permissions”: [], “port”: 54321, “power_management”: {“automatic_pm_enabled”: true, “enabled”: false, “kdump_detection”: true, “pm_proxies”: []}, “protocol”: “stomp”, “se_linux”: {}, “spm”: {“priority”: 5, “status”: “none”}, “ssh”: {“fingerprint”: “SHA256:dDfyxWmNDGYlPWj3i5CD8S0P+zFhYg9qoJSEj7bDStA”, “port”: 22}, “statistics”: [], “status”: “installing”, “storage_connection_extensions”: [], “summary”: {“total”: 0}, “tags”: [], “transparent_huge_pages”: {“enabled”: false}, “type”: “ovirt_node”, “unmanaged_networks”: [], “update_available”: false, “vgpu_placement”: “consolidated”}]}, “attempts”: 120, “changed”: false, “deprecations”: [{“msg”: “The ‘ovirt_host_facts’ module has been renamed to ‘ovirt_host_info’, and the renamed one no longer returns ansible_facts”, “version”: “2.13”}]}

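    Following the author's instructions, I logged in to the engine's "Administration Portal" and saw that the host had been added automatically, but its status stays at installing. How should I troubleshoot this?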

  12. [ INFO ] changed: [localhost]
    [ INFO ] TASK [ovirt.hosted_engine_setup : Check engine VM health]
    [ ERROR ] fatal: [localhost]: FAILED! => {“attempts”: 180, “changed”: true, “cmd”: [“hosted-engine”, “–vm-status”, “–json”], “delta”: “0:00:01.775044”, “end”: “2020-06-01 18:24:04.225416”, “rc”: 0, “start”: “2020-06-01 18:24:02.450372”, “stderr”: “”, “stderr_lines”: [], “stdout”: “{\”1\”: {\”conf_on_shared_storage\”: true, \”live-data\”: true, \”extra\”: \”metadata_parse_version=1\\nmetadata_feature_version=1\\ntimestamp=25845 (Mon Jun 1 18:23:54 2020)\\nhost-id=1\\nscore=2830\\nvm_conf_refresh_time=25847 (Mon Jun 1 18:23:56 2020)\\nconf_on_shared_storage=True\\nmaintenance=False\\nstate=EngineStarting\\nstopped=False\\n\”, \”hostname\”: \”ovirt.allin1.com\”, \”host-id\”: 1, \”engine-status\”: {\”reason\”: \”failed liveliness check\”, \”health\”: \”bad\”, \”vm\”: \”up\”, \”detail\”: \”Up\”}, \”score\”: 2830, \”stopped\”: false, \”maintenance\”: false, \”crc32\”: \”21872a3f\”, \”local_conf_timestamp\”: 25847, \”host-ts\”: 25845}, \”global_maintenance\”: false}”, “stdout_lines”: [“{\”1\”: {\”conf_on_shared_storage\”: true, \”live-data\”: true, \”extra\”: \”metadata_parse_version=1\\nmetadata_feature_version=1\\ntimestamp=25845 (Mon Jun 1 18:23:54 2020)\\nhost-id=1\\nscore=2830\\nvm_conf_refresh_time=25847 (Mon Jun 1 18:23:56 2020)\\nconf_on_shared_storage=True\\nmaintenance=False\\nstate=EngineStarting\\nstopped=False\\n\”, \”hostname\”: \”ovirt.allin1.com\”, \”host-id\”: 1, \”engine-status\”: {\”reason\”: \”failed liveliness check\”, \”health\”: \”bad\”, \”vm\”: \”up\”, \”detail\”: \”Up\”}, \”score\”: 2830, \”stopped\”: false, \”maintenance\”: false, \”crc32\”: \”21872a3f\”, \”local_conf_timestamp\”: 25847, \”host-ts\”: 25845}, \”global_maintenance\”: false}”]}
    [ INFO ] TASK [ovirt.hosted_engine_setup : Check VM status at virt level]
    [ INFO ] changed: [localhost]
    [ INFO ] TASK [ovirt.hosted_engine_setup : Fail if engine VM is not running]
    [ INFO ] skipping: [localhost]
    [ INFO ] TASK [ovirt.hosted_engine_setup : Get target engine VM IP address]
    [ INFO ] changed: [localhost]
    [ INFO ] TASK [ovirt.hosted_engine_setup : Get VDSM’s target engine VM stats]
    [ INFO ] changed: [localhost]
    [ INFO ] TASK [ovirt.hosted_engine_setup : Convert stats to JSON format]
    [ INFO ] ok: [localhost]
    [ INFO ] TASK [ovirt.hosted_engine_setup : Get target engine VM IP address from VDSM stats]
    [ INFO ] ok: [localhost]
    [ INFO ] TASK [ovirt.hosted_engine_setup : Fail if Engine IP is different from engine’s he_fqdn resolved IP]
    [ ERROR ] fatal: [localhost]: FAILED! => {“changed”: false, “msg”: “Engine VM IP address is while the engine’s he_fqdn engine.test.com resolves to 69.172.200.109. If you are using DHCP, check your DHCP reservation configuration”}

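    • How did you solve this? I'm stuck here too.

    • I'm stuck here as well. I've tried dozens of times and it fails at this point every time, and I have already checked that the mappings in /etc/hosts are fine.

    • Check whether the hostname you configured for the engine is reachable on the public internet; it may be resolving to a public address. Disconnect from the internet or use a different hostname.

    • No luck, it still fails at the same place.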
      [ INFO ] TASK [ovirt.hosted_engine_setup : Check engine VM health]
      [ ERROR ] fatal: [localhost]: FAILED! => {“attempts”: 180, “changed”: true, “cmd”: [“hosted-engine”, “–vm-status”, “–json”], “delta”: “0:00:00.337352”, “end”: “2020-06-08 21:44:50.035289”, “rc”: 0, “start”: “2020-06-08 21:44:49.697937”, “stderr”: “”, “stderr_lines”: [], “stdout”: “{\”1\”: {\”conf_on_shared_storage\”: true, \”live-data\”: true, \”extra\”: \”metadata_parse_version=1\\nmetadata_feature_version=1\\ntimestamp=5017 (Mon Jun 8 21:44:45 2020)\\nhost-id=1\\nscore=3400\\nvm_conf_refresh_time=5017 (Mon Jun 8 21:44:46 2020)\\nconf_on_shared_storage=True\\nmaintenance=False\\nstate=EngineStarting\\nstopped=False\\n\”, \”hostname\”: \”hjq344.node1.com\”, \”host-id\”: 1, \”engine-status\”: {\”reason\”: \”vm not running on this host\”, \”health\”: \”bad\”, \”vm\”: \”down_unexpected\”, \”detail\”: \”unknown\”}, \”score\”: 3400, \”stopped\”: false, \”maintenance\”: false, \”crc32\”: \”db66f9a6\”, \”local_conf_timestamp\”: 5017, \”host-ts\”: 5017}, \”global_maintenance\”: false}”, “stdout_lines”: [“{\”1\”: {\”conf_on_shared_storage\”: true, \”live-data\”: true, \”extra\”: \”metadata_parse_version=1\\nmetadata_feature_version=1\\ntimestamp=5017 (Mon Jun 8 21:44:45 2020)\\nhost-id=1\\nscore=3400\\nvm_conf_refresh_time=5017 (Mon Jun 8 21:44:46 2020)\\nconf_on_shared_storage=True\\nmaintenance=False\\nstate=EngineStarting\\nstopped=False\\n\”, \”hostname\”: \”hjq344.node1.com\”, \”host-id\”: 1, \”engine-status\”: {\”reason\”: \”vm not running on this host\”, \”health\”: \”bad\”, \”vm\”: \”down_unexpected\”, \”detail\”: \”unknown\”}, \”score\”: 3400, \”stopped\”: false, \”maintenance\”: false, \”crc32\”: \”db66f9a6\”, \”local_conf_timestamp\”: 5017, \”host-ts\”: 5017}, \”global_maintenance\”: false}”]}
      [ INFO ] TASK [ovirt.hosted_engine_setup : Check VM status at virt level]
      [ INFO ] changed: [localhost]
      [ INFO ] TASK [ovirt.hosted_engine_setup : Fail if engine VM is not running]
      [ INFO ] skipping: [localhost]
      [ INFO ] TASK [ovirt.hosted_engine_setup : Get target engine VM IP address]
      [ INFO ] changed: [localhost]
      [ INFO ] TASK [ovirt.hosted_engine_setup : Get VDSM’s target engine VM stats]
      [ INFO ] changed: [localhost]
      [ INFO ] TASK [ovirt.hosted_engine_setup : Convert stats to JSON format]
      [ INFO ] ok: [localhost]
      [ INFO ] TASK [ovirt.hosted_engine_setup : Get target engine VM IP address from VDSM stats]
      [ INFO ] ok: [localhost]
      [ INFO ] TASK [ovirt.hosted_engine_setup : Fail if Engine IP is different from engine’s he_fqdn resolved IP]
      [ ERROR ] fatal: [localhost]: FAILED! => {“changed”: false, “msg”: “Engine VM IP address is while the engine’s he_fqdn hjq344.engine1.com resolves to 192.168.0.100. If you are using DHCP, check your DHCP reservation configuration”}

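    • Did you apply the fix under "Stuck at Check engine VM health" above?
      In the /usr/libexec/vdsm/hooks/before_vm_start directory, create a file named "50_vm_rhel7_1_0" ...

    • 1. I have already done everything your article asks: every configuration change and every file for each step that can get stuck. Even with the hosted engine VM created with the rhel7.1 machine type, it still often gets stuck at Get local VM IP, and it takes several redeployments to get past that step.
      2. After many tests I found another problem: when it is stuck at Check engine VM health, the node1 Cockpit UI shows the hosted engine VM as running, but I cannot connect to it through VNC or virt-viewer in Cockpit, or over SSH. If I force-restart the hosted engine VM, it will not start again. Personally I wonder whether this is still a compatibility problem with the rhel7.1 machine type.

    • Are you using Workstation or ESXi? (I haven't tried ESXi, so it may have other problems. On Workstation, if you use the same oVirt version as I did and follow the steps, there shouldn't be any problems.)
      What is the VM's configuration?

    • VMware Workstation 15.5.5, 2 x Intel E5-2670, 2x4 vCPUs, 24 GB RAM, 500 GB disk, bridged networking.

    • That configuration should be fine. Find the failing step in the Ansible deployment scripts and analyze it.

    • The logs contain the following errors: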
      Message from syslogd@node1 at Jun 9 15:05:10 …
      kernel:NMI watchdog: BUG: soft lockup – CPU#0 stuck for 21s! [CPU 2/KVM:31794]

      Message from syslogd@node1 at Jun 9 15:05:10 …
      kernel:NMI watchdog: BUG: soft lockup – CPU#3 stuck for 21s! [kworker/3:0:25]

      2020-06-09 15:12:20,519+0800 ERROR ansible failed {‘status’: ‘FAILED’, ‘ansible_type’: ‘task’, ‘ansible_task’: u’Check engine VM health’, ‘ansible_result’: u’type: \nstr: {\’stderr_lines\’: [], u\’changed\’: True, u\’end\’: u\’2020-06-09 15:12:19.490055\’, \’_ansible_no_log\’: False, u\’stdout\’: u\'{“1”: {“conf_on_shared_storage”: true, “live-data”: true, “extra”: “metadata_parse_version=1\\\\nmetadata_feature_version=1\\\\ntimestamp=3899 (Tue Jun 9 15:12:15 2020)\\\\nhost-id=1\\\\nsco’, ‘task_duration’: 1027, ‘ansible_host’: u’localhost’, ‘ansible_playbook’: u’/usr/share/ovirt-hosted-engine-setup/ansible/trigger_role.yml’}
      2020-06-09 15:12:37,859+0800 ERROR ansible failed {‘status’: ‘FAILED’, ‘ansible_type’: ‘task’, ‘ansible_task’: u”Fail if Engine IP is different from engine’s he_fqdn resolved IP”, ‘ansible_result’: u’type: \nstr: {\’msg\’: u”Engine VM IP address is while the engine\’s he_fqdn engine.hjq.com resolves to 192.168.35.100. If you are using DHCP, check your DHCP reservation configuration”, \’changed\’: False, \’_ansible_no_log\’: False}’, ‘task_duration’: 1, ‘ansible_host’: u’localhost’, ‘ansible_playbook’: u’/usr/share/ovirt-hosted-engine-setup/ansible/trigger_role.yml’}

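    • Don't create it directly with vi, because it has to be an executable file. You can copy another file in that directory and then change its contents and name.

    • Disconnect from the internet; the hostname you are using, engine.test.com, resolves on the public internet.

    • Don't create it directly with vi, because it has to be an executable file. You can copy another file in that directory and then change its contents and name.

  13. Just one step away from success, and it failed again.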

  14. [ INFO ] TASK [ovirt.hosted_engine_setup : Execute just a specific set of steps]
    [ INFO ] ok: [localhost]
    [ INFO ] TASK [ovirt.hosted_engine_setup : Force facts gathering]
    [ INFO ] ok: [localhost]
    [ INFO ] TASK [ovirt.hosted_engine_setup : Check local VM dir stat]
    [ INFO ] ok: [localhost]
    [ INFO ] TASK [ovirt.hosted_engine_setup : Enforce local VM dir existence]
    [ INFO ] skipping: [localhost]
    [ INFO ] TASK [ovirt.hosted_engine_setup : include_tasks]
    [ INFO ] ok: [localhost]
    [ INFO ] TASK [ovirt.hosted_engine_setup : Obtain SSO token using username/password credentials]
    [ ERROR ] ConnectionError: Error while sending HTTP request: (7, ‘Failed connect to engine.poc.com:443; \xe8\xbf\x9e\xe6\x8e\xa5\xe8\xb6\x85\xe6\x97\xb6’)
    [ ERROR ] fatal: [localhost]: FAILED! => {“attempts”: 50, “changed”: false, “msg”: “Error while sending HTTP request: (7, ‘Failed connect to engine.poc.com:443; \\xe8\\xbf\\x9e\\xe6\\x8e\\xa5\\xe8\\xb6\\x85\\xe6\\x97\\xb6’)”}

    I'm stuck here. Help, please!

  15. [ INFO ] skipping: [localhost]
    [ INFO ] TASK [ovirt.hosted_engine_setup : Get host address resolution]
    [ INFO ] changed: [localhost]
    [ INFO ] TASK [ovirt.hosted_engine_setup : Check address resolution]
    [ INFO ] skipping: [localhost]
    [ INFO ] TASK [ovirt.hosted_engine_setup : Parse host address resolution]
    [ INFO ] ok: [localhost]
    [ INFO ] skipping: [localhost]
    [ INFO ] TASK [ovirt.hosted_engine_setup : Fail if host’s ip is empty]
    [ INFO ] skipping: [localhost]
    [ INFO ] TASK [ovirt.hosted_engine_setup : Avoid localhost]
    [ INFO ] skipping: [localhost]
    [ INFO ] TASK [ovirt.hosted_engine_setup : Ensure host address resolves locally]
    [ INFO ] skipping: [localhost]
    [ INFO ] TASK [ovirt.hosted_engine_setup : Get target address from selected interface (IPv4)]
    [ INFO ] changed: [localhost]
    [ INFO ] TASK [ovirt.hosted_engine_setup : Get target address from selected interface (IPv6)]
    [ INFO ] changed: [localhost]
    [ INFO ] TASK [ovirt.hosted_engine_setup : Check the resolved address resolves on the selected interface]
    [ ERROR ] fatal: [localhost]: FAILED! => {“changed”: false, “msg”: “The resolved address doesn’t resolve on the selected interface\n”}

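    Could the experts here help: what is this problem, and how do I fix it?

    • It is caused by a mismatch between the hostname and the /etc/hosts configuration.

    • It can also be a network configuration error: on the first settings page you have to select the correct NIC. My deployment uses bond0; selecting eno8 produces this error.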

  16. [ INFO ] TASK [ovirt.hosted_engine_setup : Add iSCSI storage domain]
    [ INFO ] skipping: [localhost]
    [ INFO ] TASK [ovirt.hosted_engine_setup : Add Fibre Channel storage domain]
    [ INFO ] skipping: [localhost]
    [ INFO ] TASK [ovirt.hosted_engine_setup : Get storage domain details]
    [ INFO ] ok: [localhost]
    [ INFO ] TASK [ovirt.hosted_engine_setup : Find the appliance OVF]
    [ INFO ] ok: [localhost]
    [ INFO ] TASK [ovirt.hosted_engine_setup : Parse OVF]
    [ INFO ] ok: [localhost]
    [ INFO ] TASK [ovirt.hosted_engine_setup : Get required size]
    [ INFO ] ok: [localhost]
    [ INFO ] TASK [ovirt.hosted_engine_setup : Remove unsuitable storage domain]
    [ INFO ] skipping: [localhost]
    [ INFO ] TASK [ovirt.hosted_engine_setup : Check storage domain free space]
    [ INFO ] skipping: [localhost]
    [ INFO ] TASK [ovirt.hosted_engine_setup : Activate storage domain]
    [ ERROR ] Error: Fault reason is “Operation Failed”. Fault detail is “[]”. HTTP response code is 400.
    [ ERROR ] fatal: [localhost]: FAILED! => {“changed”: false, “msg”: “Fault reason is \”Operation Failed\”. Fault detail is \”[]\”. HTTP response code is 400.”}

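    Could the experts here tell me what this problem is and how to fix it?

    • How did you solve this one?

  17. […] If you are deploying in VMware Workstation, see the article "Problems and Fixes When Deploying oVirt Inside VMware" […]

  18. oVirt 4.3.10 on CentOS 7.8 (2003) with VMware 15.5.6. I applied basically all of the changes described here, created the "50_vm_rhel7_1_0" file, and also tried 7.2.0 as a substitute, but nothing worked. (I also tried the plain engine-setup method; after installation, VMs had to use 7.2.0 to install an OS.)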

    [ INFO ] TASK [ovirt.hosted_engine_setup : Check engine VM health]
    [ ERROR ] fatal: [localhost]: FAILED! => {“attempts”: 180, “changed”: true, “cmd”: [“hosted-engine”, “–vm-status”, “–json”], “delta”: “0:00:00.297898”, “end”: “2020-07-13 23:08:03.429984”, “rc”: 0, “start”: “2020-07-13 23:08:03.132086”, “stderr”: “”, “stderr_lines”: [], “stdout”: “{\”1\”: {\”conf_on_shared_storage\”: true, \”live-data\”: true, \”extra\”: \”metadata_parse_version=1\\nmetadata_feature_version=1\\ntimestamp=5043 (Mon Jul 13 23:07:51 2020)\\nhost-id=1\\nscore=3400\\nvm_conf_refresh_time=5046 (Mon Jul 13 23:07:53 2020)\\nconf_on_shared_storage=True\\nmaintenance=False\\nstate=EngineDown\\nstopped=False\\n\”, \”hostname\”: \”engine.com\”, \”host-id\”: 1, \”engine-status\”: {\”reason\”: \”failed liveliness check\”, \”health\”: \”bad\”, \”vm\”: \”up\”, \”detail\”: \”Up\”}, \”score\”: 3400, \”stopped\”: false, \”maintenance\”: false, \”crc32\”: \”25b46cca\”, \”local_conf_timestamp\”: 5046, \”host-ts\”: 5043}, \”global_maintenance\”: false}”, “stdout_lines”: [“{\”1\”: {\”conf_on_shared_storage\”: true, \”live-data\”: true, \”extra\”: \”metadata_parse_version=1\\nmetadata_feature_version=1\\ntimestamp=5043 (Mon Jul 13 23:07:51 2020)\\nhost-id=1\\nscore=3400\\nvm_conf_refresh_time=5046 (Mon Jul 13 23:07:53 2020)\\nconf_on_shared_storage=True\\nmaintenance=False\\nstate=EngineDown\\nstopped=False\\n\”, \”hostname\”: \”engine.com\”, \”host-id\”: 1, \”engine-status\”: {\”reason\”: \”failed liveliness check\”, \”health\”: \”bad\”, \”vm\”: \”up\”, \”detail\”: \”Up\”}, \”score\”: 3400, \”stopped\”: false, \”maintenance\”: false, \”crc32\”: \”25b46cca\”, \”local_conf_timestamp\”: 5046, \”host-ts\”: 5043}, \”global_maintenance\”: false}”]}
    [ INFO ] TASK [ovirt.hosted_engine_setup : Check VM status at virt level]
    [ INFO ] TASK [ovirt.hosted_engine_setup : Fail if engine VM is not running]
    [ ERROR ] fatal: [localhost]: FAILED! => {“changed”: false, “msg”: “Engine VM is not running, please check vdsm logs”}

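    • Did you solve your problem? I get the same error.

    • Don't create it directly with vi, because it has to be an executable file. You can copy another file in that directory and then change its contents and name.

    • Wouldn't chmod +x be enough?

    • Did you change the timeout from 120 to 360? I went through the author's settings once and it passed perfectly on Workstation 16.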

  19. [ ERROR ] fatal: [localhost]: FAILED! => {“changed”: false, “msg”: “The selected network interface is not valid”} [ ERROR ] Failed to execute stage ‘Closing up’: Failed executing ansible-playbook
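    Could someone help me with this error?

    • The error message already tells you: the selected network interface is not valid.

  20. First of all, thanks to the author for such a thorough write-up. One thing for anyone copying and pasting, though: where the article says to change it to "–machine pc-i440fx-rhel7.1.0", it should be "--machine pc-i440fx-rhel7.1.0" (two hyphens). The author's screenshot has it right.

  21. [ INFO ] TASK [ovirt.hosted_engine_setup : Inject network configuration with guestfish]
    It stays stuck at this step. I applied the fix, but it still doesn't seem to work. Help, please!

  22. After installing the engine, opening the web page gives a 500 Internal Server Error.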
