排查部署 Linux 虚拟机时出现的问题

适用于:✔️ Linux VM

注意

本文中引用的 CentOS 是 Linux 分发版,将达到生命周期结束(EOL)。 请相应地考虑使用和规划。 有关详细信息,请参阅 CentOS 生命周期指南

尝试创建新的 Azure 虚拟机 (VM) 时,遇到的常见错误是预配失败或分配失败。

  • 当由于准备步骤不当,或者在从门户捕获映像期间选择了错误的设置而导致 OS 映像无法加载时,将发生预配失败。
  • 当群集或区域没有可用的资源或无法支持所请求的 VM 大小时,将发生分配失败。

如果本文未解决 Azure 问题,请访问 MSDN 和 Stack Overflow 上的 Azure 论坛。 可将问题发布到这些论坛上,或发布到 Twitter 上的 @AzureSupport。 还可提交 Azure 支持请求。 若要提交支持请求,请在 Azure 支持页上,选择“获取支持”。

现象

在创建自定义映像后,会出现典型的预配失败情况,然后从中部署 VM,然后遇到未显示 creatingVM 状态的 40 分钟,并看到以下错误消息:

Provisioning state Provisioning failed. 
OS Provisioning for VM 'sentilo' did not finish in the allotted time. 
The VM may still finish provisioning successfully. Please check provisioning state later. 
Also, make sure the image has been properly prepared (generalized). * Instructions for Windows: https://azure.microsoft.com/documentation/articles/virtual-machines-windows-upload-image/ * Instructions for Linux: https://azure.microsoft.com/documentation/articles/virtual-machines-linux-capture-image/.

或:

Deployment failed. Correlation ID: f9dcb33a-4e6e-45c5-9c9d-b29dd73da2e0. {
  "status": "Failed",
  "error": {
    "code": "ResourceDeploymentFailure",
    "message": "The resource operation completed with terminal provisioning state 'Failed'.",
    "details": [
      {
        "code": "OSProvisioningInternalError",
        "message": "OS Provisioning failed for VM 'iWishThisWouldCreateVM01' due to an internal error: The VM encountered an error during deployment. Please visit https://aka.ms/linuxprovisioningerror for more information on remediation."
      }
    ]
  }
}

然后,会看到标记为“ failedVM”状态。

为何会发生预配失败?

通常,预配失败的原因有多种,例如:

  • 缺少预配/配置错误代理

    需要确保代理存在且正常运行,应使用 cloud-init,或者映像不支持此功能,可以查看这些步骤

  • 映像配置不正确

    我们有有关如何使用 cloud-init 和其他 Azure 映像要求设置映像的指导,请查看这一点。

排查预配失败问题

若要确定预配失败的原因,需要从串行日志开始,可以使用 Azure 启动诊断部署 VM。

需要为 VM 部署启用了启动诊断的新 VM,该 VM 的映像失败,才能访问串行日志中的预配事件。

# create resource group
resourceGroup=myBrokenImageRG
location=westus2
az group create --name $resourceGroup --location $location
# create storage account
storageacct=mydiagdata$RANDOM
az storage account create \
  --resource-group $resourceGroup \
  --name $storageacct \
  --sku Standard_LRS \
  --location $location
# create VM
vmName=iWishThisWouldCreateVM01
brokenImageName=<ResourceID of brokenImage>
sshPubkeyPath=""
az vm create \
    --resource-group $resourceGroup \
    --name $vmName \
    --image $brokenImageName \
    --admin-username azadmin \
    --ssh-key-value $sshPubkeyPath \
    --boot-diagnostics-storage $storageacct

若要查看串行日志,可以转到门户,或运行以下命令以下载 serialConsoleLogBlobUri 日志:

az vm boot-diagnostics get-boot-log-uris --name $vmName --resource-group $resourceGroup

了解系统事件和预配事件的串行日志

首次创建 VM 时,cloud-init 将启动并尝试装载 ISO、建立网络连接、设置在创建 VM 期间传递的属性、装载临时磁盘(在受支持的 VM 大小上),并向 Azure 平台发出信号,指出初始 OS 配置已完成。

系统事件和密钥信息 串行日志 备注
内核版本和内核版本 [ 0.000000] Linux version 5.4.0-1031-azure (buildd@lcy01-amd64-021) (gcc version 7.5.0 (Ubuntu 7.5.0-3ubuntu1~18.04)) #32~18.04.1-Ubuntu SMP Tue Oct 6 10:03:22 UTC 2020 (Ubuntu 5.4.0-1031.32~18.04.1-azure 5.4.65) 显示在串行日志的开头。
内核命令行选项 [ 0.000000] Command line: BOOT_IMAGE=/boot/vmlinuz-5.4.0-1031-azure root=UUID=8c0a4742-2f51-40b4-b659-357cfb0bb2a3 ro console=tty1 console=ttyS0 earlyprintk=ttyS0
[ 0.503399] Kernel command line: BOOT_IMAGE=/boot/vmlinuz-5.4.0-1031-azure root=UUID=8c0a4742-2f51-40b4-b659-357cfb0bb2a3 ro console=tty1 console=ttyS0 earlyprintk=ttyS0
显示在串行日志的开头。 搜索 command line:
Systemd 版本 [ 8.626739] systemd[1]: systemd 237 running in system mode. (+PAM +AUDIT +SELINUX +IMA +APPARMOR +SMACK +SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS +ACL +XZ +LZ4 +SECCOMP +BLKID +ELFUTILS +KMOD -IDN2 +IDN -PCRE2 default-hierarchy=hybrid) 搜索 systemd
已达到系统目标 [ [0;32m OK [0m] Reached target Swap.
[ [0;32m OK [0m] Reached target User and Group Name Lookups.
[ [0;32m OK [0m] Reached target Slices.
[ [0;32m OK [0m] Reached target Local File Systems (Pre).
[ [0;32m OK [0m] Reached target Local Encrypted Volumes.
[ [0;32m OK [0m] Reached target Local File Systems.
[ [0;32m OK [0m] Reached target System Time Synchronized.
[ [0;32m OK [0m] Reached target Network (Pre).
[ [0;32m OK [0m] Reached target Network.
[ [0;32m OK [0m] Reached target Host and Network Name Lookups.
[ [0;32m OK [0m] Reached target Cloud-config availability.
[ [0;32m OK [0m] Reached target System Initialization
[ [0;32m OK [0m] Reached target Timers.
[ [0;32m OK [0m] Reached target Paths.
[ [0;32m OK [0m] Reached target Network is Online.
[ [0;32m OK [0m] Reached target Remote File Systems (Pre).
[ [0;32m OK [0m] Reached target Remote File Systems.
[ [0;32m OK [0m] Reached target Sockets.
[ [0;32m OK [0m] Reached target Basic System.
[ [0;32m OK [0m] Reached target Login Prompts.
搜索 Reached target
跨不同发行版的通用系统网络目标 [ [0;32m OK [0m] Reached target Network (Pre).
[ [0;32m OK [0m] Reached target Network.
[ [0;32m OK [0m] Reached target Network is Online.
搜索 Reached target Network
Ubuntu 的深入网络状态和网络目标,以及系统网络由 systemd-network其管理的发行版。 Starting Network Time Synchronization...
[ [0;32m OK [0m] Started Network Time Synchronization.
Starting Initial cloud-init job (pre-networking)...
[ [0;32m OK [0m] Started Initial cloud-init job (pre-networking).
[ [0;32m OK [0m] Reached target Network (Pre).
Starting Network Service...
[ [0;32m OK [0m] Started Network Service.
Starting Wait for Network to be Configured...
Starting Network Name Resolution...
[ [0;32m OK [0m] Started Network Name Resolution.
[ [0;32m OK [0m] Reached target Network.
[ [0;32m OK [0m] Reached target Host and Network Name Lookups.
[ [0;32m OK [0m] Started Wait for Network to be Configured.
[ [0;32m OK [0m] Reached target Network is Online.
Starting Dispatcher daemon for systemd-networkd...
[ [0;32m OK [0m] Started Dispatcher daemon for systemd-networkd.
搜索 networknetworkd
RHEL/CentOS 的深入网络状态和网络目标,以及系统网络由 Network Manager其管理的发行版。 Starting Read and set NIS domainname from /etc/sysconfig/network...
[ [32m OK [0m] Started Read and set NIS domainname from /etc/sysconfig/network.
Starting Import network configuration from initramfs...
[ [32m OK [0m] Started Import network configuration from initramfs.
Starting Initial cloud-init job (pre-networking)...
[ [32m OK [0m] Started Initial cloud-init job (pre-networking).
[ [32m OK [0m] Reached target Network (Pre).
Starting Network Manager...
[ [32m OK [0m] Started Network Manager.
Starting Network Manager Wait Online...
Starting Network Manager Script Dispatcher Service...
[ [32m OK [0m] Started Network Manager Script Dispatcher Service.
[ [32m OK [0m] Started Network Manager Wait Online.
Starting LSB: Bring up/down networking...
[ [32m OK [0m] Started LSB: Bring up/down networking.
[ [32m OK [0m] Reached target Network.
[ [32m OK [0m] Reached target Network is Online.
搜索 networkNetwork Manager
SUSE/SLES 的深入网络状态和网络目标,以及系统网络由 Wicked其管理的发行版。 Starting Initial cloud-init job (pre-networking)...
[ [0;32m OK [0m] Reached target Host and Network Name Lookups.
[ [0;32m OK [0m] Started Initial cloud-init job (pre-networking).
[ [0;32m OK [0m] Reached target Network (Pre).
Starting wicked DHCPv6 supplicant service...
Starting wicked DHCPv4 supplicant service...
Starting wicked AutoIPv4 supplicant service...
[ [0;32m OK [0m] Started wicked DHCPv6 supplicant service.
[ [0;32m OK [0m] Started wicked DHCPv4 supplicant service.
[ [0;32m OK [0m] Started wicked AutoIPv4 supplicant service.
Starting wicked network management service daemon...
[ [0;32m OK [0m] Started wicked network management service daemon.
Starting wicked network nanny service...
[ [0;32m OK [0m] Started wicked network nanny service.
Starting wicked managed network interfaces...
[ [0;31m* [0;1;31m* [0m [0;31m* [0m] A start job is running for wicked m…etwork interfaces (22s / no limit)
[K[ [0;31m* [0;1;31m* [0m [0;31m* [0m] A start job is running for wicked m…etwork interfaces (28s / no limit)
[K[ [0;31m* [0;1;31m* [0m [0;31m* [0m] A start job is running for wicked m…etwork interfaces (32s / no limit)
[K[ [0;32m OK [0m] Started wicked managed network interfaces.
[ [0;32m OK [0m] Reached target Network.
[ [0;32m OK [0m] Reached target Network is Online.
搜索 networkwicked
启动范围是否足以让 cloud-init 开始? Starting Initial cloud-init job (pre-networking)...
Starting Initial cloud-init job (metadata service crawler)...
搜索 Starting Initial cloud-init job
已达到 Cloud-init 版本和 cloud-init 阶段 [ 22.446387] cloud-init[703]: Cloud-init v. 20.3-2-g371b392c-0ubuntu1~18.04.1 running 'init-local' at Wed, 28 Oct 2020 17:46:30 +0000. Up 21.23 seconds.
[ 28.357120] cloud-init[837]: Cloud-init v. 20.3-2-g371b392c-0ubuntu1~18.04.1 running 'init' at Wed, 28 Oct 2020 17:46:34 +0000. Up 24.52 seconds.
[ 50.421009] cloud-init[1445]: Cloud-init v. 20.3-2-g371b392c-0ubuntu1~18.04.1 running 'modules:config' at Wed, 28 Oct 2020 17:46:57 +0000. Up 48.21 seconds.
[ 51.338792] cloud-init[1541]: Cloud-init v. 20.3-2-g371b392c-0ubuntu1~18.04.1 running 'modules:final' at Wed, 28 Oct 2020 17:47:00 +0000. Up 51.01 seconds.
[ 51.366837] cloud-init[1541]: Cloud-init v. 20.3-2-g371b392c-0ubuntu1~18.04.1 finished at Wed, 28 Oct 2020 17:47:01 +0000. Datasource DataSourceAzure [seed=/dev/sr0]. Up 51.32 seconds
搜索 Cloud-init v
网络接口(NIC)、NIC 状态(向上/向下)和 NIC IP 地址。 显示是否已正确配置和分配 NIC IP 地址。 IP 地址分配可以通过 DHCP 动态分配,也可以通过静态配置。 [ 28.381544] cloud-init[837]: ci-info: ++++++++++++++++++++++++++++++++++++++Net device info+++++++++++++++++++++++++++++++++++++++
[ 28.396781] cloud-init[837]: ci-info: +--------+------+-----------------------------+---------------+--------+-------------------+
[ 28.416501] cloud-init[837]: ci-info: | Device | Up | Address | Mask | Scope | Hw-Address |
[ 28.427493] cloud-init[837]: ci-info: +--------+------+-----------------------------+---------------+--------+-------------------+
[ 28.446544] cloud-init[837]: ci-info: | eth0 | True | 10.0.0.4 | 255.255.255.0 | global | 00:0d:3a:c6:17:d5 |
[ 28.460031] cloud-init[837]: ci-info: | eth0 | True | fe80::20d:3aff:fec6:17d5/64 | . | link | 00:0d:3a:c6:17:d5 |
[ 28.476415] cloud-init[837]: ci-info: | lo | True | 127.0.0.1 | 255.0.0.0 | host | . |
[ 28.487962] cloud-init[837]: ci-info: | lo | True | ::1/128 | . | host | . |
[ 28.498191] cloud-init[837]: ci-info: +--------+------+-----------------------------+---------------+--------+-------------------+
搜索 ci-infoNet device info
IP 路由(IPv4 和 IPv6)。 显示各种终结点的 IP 路由,例如 VNet 子网、Azure 终结点(168.63.129.16)和 Azure 实例元数据服务器/IMDS 终结点(169.254.169.254)。 [ 28.508190] cloud-init[837]: ci-info: ++++++++++++++++++++++++++++++Route IPv4 info+++++++++++++++++++++++++++++++
[ 28.522189] cloud-init[837]: ci-info: +-------+-----------------+----------+-----------------+-----------+-------+
[ 28.531173] cloud-init[837]: ci-info: | Route | Destination | Gateway | Genmask | Interface | Flags |
[ 28.549782] cloud-init[837]: ci-info: +-------+-----------------+----------+-----------------+-----------+-------+
[ 28.562896] cloud-init[837]: ci-info: | 0 | 0.0.0.0 | 10.0.0.1 | 0.0.0.0 | eth0 | UG |
[ 28.571653] cloud-init[837]: ci-info: | 1 | 10.0.0.0 | 0.0.0.0 | 255.255.255.0 | eth0 | U |
[ 28.580192] cloud-init[837]: ci-info: | 2 | 168.63.129.16 | 10.0.0.1 | 255.255.255.255 | eth0 | UGH |
[ 28.587633] cloud-init[837]: ci-info: | 3 | 169.254.169.254 | 10.0.0.1 | 255.255.255.255 | eth0 | UGH |
[ 28.600728] cloud-init[837]: ci-info: +-------+-----------------+----------+-----------------+-----------+-------+
[ 28.611117] cloud-init[837]: ci-info: +++++++++++++++++++Route IPv6 info+++++++++++++++++++
[ 28.619534] cloud-init[837]: ci-info: +-------+-------------+---------+-----------+-------+
[ 28.629292] cloud-init[837]: ci-info: | Route | Destination | Gateway | Interface | Flags |
[ 28.638596] cloud-init[837]: ci-info: +-------+-------------+---------+-----------+-------+
[ 28.647791] cloud-init[837]: ci-info: | 1 | fe80::/64 | :: | eth0 | U |
[ 28.660622] cloud-init[837]: ci-info: | 3 | local | :: | eth0 | U |
[ 28.670776] cloud-init[837]: ci-info: | 4 | ff00::/8 | :: | eth0 | U |
[ 28.691506] cloud-init[837]: ci-info: +-------+-------------+---------+-----------+-------+
搜索 ci-infoRoute IPv4 infoRoute IPv6 info
VM 上用户的 SSH 授权密钥。 authorized_keys SSH 中的文件指定了可用于登录到为其配置文件的用户帐户的 SSH 密钥。 ci-info: ++++++++++++++++++++++++++Authorized keys from /home/azureuser/.ssh/authorized_keys for user azureuser+++++++++++++++++++++++++++
ci-info: +---------+-------------------------------------------------------------------------------------------------+---------+---------+
ci-info: | Keytype | Fingerprint (sha256) | Options | Comment |
ci-info: +---------+-------------------------------------------------------------------------------------------------+---------+---------+
ci-info: | ssh-rsa | 88:b0:2a:ce:f5:91:49:a2:01:07:a4:e5:db:b3:8c:3e:7e:1f:52:83:53:3c:83:4f:a3:a7:17:13:65:a3:47:e2 | - | - |
ci-info: +---------+-------------------------------------------------------------------------------------------------+---------+---------+
搜索 Authorized keys
SSH 主机密钥生成。 主机密钥是用于对 SSH 协议中的计算机进行身份验证的加密密钥。 主机密钥是密钥对,通常使用 RSA、DSA 或 ECDSA 算法。 公共主机密钥存储在 SSH 客户端上和/或分发到 SSH 客户端,私钥存储在 SSH 服务器上。 Starting OpenSSH Server Key Generation...
[ [32m OK [0m] Started OpenSSH Server Key Generation.
[ 40.437735] cloud-init[837]: Generating public/private rsa key pair.
[ 40.451048] cloud-init[837]: Your identification has been saved in /etc/ssh/ssh_host_rsa_key.
[ 40.473777] cloud-init[837]: Your public key has been saved in /etc/ssh/ssh_host_rsa_key.pub.
[ 40.489730] cloud-init[837]: The key fingerprint is:
[ 40.501705] cloud-init[837]: SHA256:NGxA6sf9EAMtczaFSBSJqiGkafEZuPUykNLxefbXofM root@myVmName
[ 40.686610] cloud-init[837]: Generating public/private dsa key pair.
[ 40.712350] cloud-init[837]: Your identification has been saved in /etc/ssh/ssh_host_dsa_key.
[ 40.721901] cloud-init[837]: Your public key has been saved in /etc/ssh/ssh_host_dsa_key.pub.
[ 40.721966] cloud-init[837]: The key fingerprint is:
[ 40.722011] cloud-init[837]: SHA256:QjoxEw9PNOg0P3LW6wnSZzjsfQQ4vhW8S0dAuNWkWHM root@myVmName
[ 40.722606] cloud-init[837]: Generating public/private ecdsa key pair.
[ 40.722650] cloud-init[837]: Your identification has been saved in /etc/ssh/ssh_host_ecdsa_key.
[ 40.722690] cloud-init[837]: Your public key has been saved in /etc/ssh/ssh_host_ecdsa_key.pub.
[ 40.722734] cloud-init[837]: The key fingerprint is:
[ 40.722774] cloud-init[837]: SHA256:BaFqan71k4blzY8TQrLQOavMWoKHgUDgxEAuB0ouJCo root@myVmName
[ 41.063239] cloud-init[837]: Generating public/private ed25519 key pair.
[ 41.091125] cloud-init[837]: Your identification has been saved in /etc/ssh/ssh_host_ed25519_key.
[ 41.120794] cloud-init[837]: Your public key has been saved in /etc/ssh/ssh_host_ed25519_key.pub.
[ 41.154126] cloud-init[837]: The key fingerprint is:
[ 41.157135] cloud-init[837]: SHA256:KsKfIKjwGpMgbYYved5v5oNE6v6eeUwI4AxeeigXk14 root@myVmName
搜索Generating public/privateYour identification has been saved inThe key fingerprint is:SHA
密钥指纹的 ssh 主机转储。 <14>Oct 28 17:47:00 ec2: #############################################################
<14>Oct 28 17:47:00 ec2: -----BEGIN SSH HOST KEY FINGERPRINTS-----
<14>Oct 28 17:47:00 ec2: 1024 SHA256:QjoxEw9PNOg0P3LW6wnSZzjsfQQ4vhW8S0dAuNWkWHM root@myVmName (DSA)
<14>Oct 28 17:47:00 ec2: 256 SHA256:BaFqan71k4blzY8TQrLQOavMWoKHgUDgxEAuB0ouJCo root@myVmName (ECDSA)
<14>Oct 28 17:47:00 ec2: 256 SHA256:KsKfIKjwGpMgbYYved5v5oNE6v6eeUwI4AxeeigXk14 root@myVmName (ED25519)
<14>Oct 28 17:47:00 ec2: 2048 SHA256:NGxA6sf9EAMtczaFSBSJqiGkafEZuPUykNLxefbXofM root@myVmName (RSA)
<14>Oct 28 17:47:00 ec2: -----END SSH HOST KEY FINGERPRINTS-----
<14>Oct 28 17:47:00 ec2: #############################################################
BEGIN SSH HOST KEY FINGERPRINTS搜索和 END SSH HOST KEY FINGERPRINTS
转储 ssh 主机密钥。 -----BEGIN SSH HOST KEY KEYS-----
ecdsa-sha2-nistp256 AAAAE2VjZHNhLXNoYTItbmlzdHAyNTYAAAAIbmlzdHAyNTYAAABBBFzu1pBMlq3g/8ztkQo+ZukigmLzQ02/ogL7Xe8aKjbuM8q4ibo1kWnXB0UuGkGE0DotVyBQsoyUNorTj96G2Xo= root@myVmName
ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIIbGOVk/IMfL+RZBDo6YlfbKncVTIBy7wSrqL5ixX6yZ root@myVmName
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDnH5sIIEFi2ne6CMk1jscVQ289i4idOMJt3WwzHR1lOgJf9kPY+WzmFw71Ai9ZEpqSTpYWxgt+z26ujxAE3R1LvOn1QKetlsPLT5FH8oIZESXmYDb/KL/4k81aDelzko1Xipk5SSai8LeX1qglKUEyGevht9S+QQTHK8Ed++UDzNidCk02iAdhpX/0E5d09NE4r+v5wAojOazLnq6JaESYV07SI7rBOGO7hCdSrQwWodYnhyTQRP3FbqjUeNRqBl3uqlH3+rgMAAPsCpToFTCperTRmyBrCbspzpxIpQSEFbf639EL/7Cst/Ff2ND0D0zVAaSdrmFZisYUcO+VRanZ root@myVmName
-----END SSH HOST KEY KEYS-----
BEGIN SSH HOST KEY KEYS搜索和 END SSH HOST KEY KEYS
SSH 服务器是否启动? Starting OpenBSD Secure Shell server...
[ [0;32m OK [0m] Started OpenBSD Secure Shell server.
Starting OpenSSH server daemon...
[ [32m OK [0m] Started OpenSSH server daemon.
Starting OpenSSH Daemon...
[ [0;32m OK [0m] Started OpenSSH Daemon.
搜索 Secure Shell serverOpenSSH server daemonOpenSSH Daemon
是否允许用户会话和用户登录? VM 是否显示用户登录提示? Starting Accounts Service...
Starting Permit User Sessions...
Starting Login Service...
[ [0;32m OK [0m] Started Permit User Sessions.
[ [0;32m OK [0m] Started Login Service.
[ [0;32m OK [0m] Reached target Login Prompts.
[ [0;32m OK [0m] Started Accounts Service.
Ubuntu 18.04.5 LTS myVmName ttyS0
myVmName login:
搜索Accounts ServicePermit User SessionsLogin ServiceLogin Promptslogin:
Azure Linux 代理是否已成功启动? [ [0;32m OK [0m] Started Azure Linux Agent.
2020/10/28 17:46:52.082569 INFO Daemon Azure Linux Agent Version:2.2.45
搜索 Azure Linux Agent
从 Azure Linux 代理的角度来看,VM 是否已成功完成预配? 预配成功后,Azure Linux 代理是否启动 VM 扩展处理程序? 如果 Azure Linux 代理检测到 VM 预配成功,则仅启动 VM 扩展处理程序。 2020/10/28 17:46:52.586765 INFO Daemon Finished provisioning 搜索 INFO Daemon Finished provisioning
串行日志中是否有任何错误、失败或异常? 在串行日志中搜索failerrorwarn搜索和exception查找。

常见错误

禁用的 UDF 模块

串行日志中的错误

[   10.855501] cloud-init[732]: Cloud-init v. 20.4.1-0ubuntu1~18.04.1 running 'init-local' at Thu, 28 Jan 2021 23:43:02 +0000. Up 10.68 seconds.
[   10.869581] cloud-init[732]: 2021-01-28 23:43:03,097 - azure.py[WARNING]: /dev/sr0 was not mountable
[   10.875608] cloud-init[732]: 2021-01-28 23:43:03,106 - azure.py[ERROR]: No Azure metadata found
[   10.885776] cloud-init[732]: 2021-01-28 23:43:03,107 - azure.py[ERROR]: Could not crawl Azure metadata: No Azure metadata found
[   14.634117] cloud-init[732]: 2021-01-28 23:43:06,876 - azure.py[WARNING]: Reported failure to Azure fabric.

waagent.log错误

"UDF driver Blocklisted 2020/09/11 19:16:40.240016 ERROR Daemon Provisioning failed: [ProtocolError] [CopyOvfEnv] Error mounting dvd: [OSUtilError] Failed to mount dvd deviceInner error: [mount -o ro -t udf,iso9660 /dev/sr0 /mnt/cdrom/secure] returned 32: mount: /mnt/cdrom/secure: wrong fs type, bad option, bad superblock on /dev/sr0, missing codepage or helper program, or other error."

原因:UDF 驱动程序未加载到内核中,这是 VM 预配所必需的,请参阅 映像要求

首次在 Azure 上预配 VM 时,Azure 主机向 VM 提供“预配 cdrom iso 磁盘”。 此预配磁盘通常通过 /dev/sr0 提供给 VM。 在预配磁盘中,有一个预配清单,其中包含 VM 的预配信息。 VM 内预配代理应装载预配磁盘、读取预配清单并相应地预配 VM。

由于预配磁盘是一个 cdrom iso disk,因此内核需要 Linux UDF 驱动程序才能成功装载此磁盘。 Linux 映像上的Microsoft文档中引用了这一点。 对于此 VM,日志指示预配磁盘无法装载,这导致 VM 预配失败。 最有可能的原因是 UDF 驱动程序缺失或被阻止。

解决方案:确保 UDF 驱动程序配置为在内核中加载。

阻止 UDF 驱动程序的一种常见方法是通过内部 /etc/modprobe.d/的配置。 请与客户/映像所有者合作,确保 Linux UDF 驱动程序存在且未阻止。 有关阻止/取消阻止内核驱动程序,请参阅本文。

VM 标记中的 Unicode 字符问题

cloud-init.log错误

  File "/usr/lib/python2.7/site-packages/cloudinit/sources/DataSourceAzure.py", line 1316, in _get_metadata_from_imds
    except json.decoder.JSONDecodeError:
AttributeError: 'module' object has no attribute 'JSONDecodeError'

原因:发生这种情况是因为 VM 标记具有非 ascii 字符,cloud-init 的版本早于 20.3。

解决方案:使用或确保映像支持 cloud-init 20.3 或更高版本,或者从 VM 标记中删除非 ascii 字符。

包含 unicode 字符的密码

cloud-init.log错误

File "/usr/lib/python2.7/site-packages/cloudinit/sources/DataSourceAzure.py", line 1153, in encrypt_pass
    return crypt.crypt(password, salt_id + util.rand_str(strlen=16))
  File "/usr/lib64/python2.7/crypt.py", line 55, in crypt
    return _crypt.crypt(word, salt)
UnicodeEncodeError: 'ascii' codec can't encode characters in position 10-11: ordinal not in range(128)

原因:发生这种情况是因为提供的密码具有不受支持的字符(非 ascii)。

解决方案:提供仅包含 ascii 字符的密码。

Dhclient 权限

cloud-init.log错误:

Command: ['/var/tmp/cloud-init/cloud-init-dhcp-yd8mvxud/dhclient', '-1', '-v', '-lf', '/var/tmp/cloud-init/cloud-init-dhcp-yd8mvxud/dhcp.leases', '-pf', '/var/tmp/cloud-init/cloud-init-dhcp-yd8mvxud/dhclient.pid', 'eth0', '-sf', '/bin/true']
Exit code: -
Reason: [Errno 13] Permission denied: b'/var/tmp/cloud-init/cloud-init-dhcp-yd8mvxud/dhclient'

原因:较旧版本的 cloud-init(版本 20.3 之前)通过复制和执行 dhclient 内部 /var/tmp执行 DHCP。 如果 /var/tmp VM 装载为noexec(无执行),则 DHCP 将因无权在内部/var/tmp执行而失败dhclient

Cloud-init 版本 >= 20.3 包含一个修复程序,它回退并执行 dhclient “原样”(如果不复制并执行它, /var/tmp 如果存在权限问题)。

解决方案:对于运行低于版本 20.3 的 cloud-init 的 VM,请配置 VM, /var/tmp 使其未装载为 noexec。 或者,将 VM 的 cloud-init 包升级到版本 >= 20.3。

注意

dhclient cloud-init 22.4 及更高版本中解决了权限问题。 有关详细信息,请参阅 cloud-init 问题 3956

获取更多日志

如果发现需要 VM 中的更多日志来了解问题,则可以使用 烘焙到映像的用户通过串行控制台 通过 SSH 连接到 VM。 如果没有用户烘焙,则可以使用用户重新创建映像,或使用 AZ VM 修复工具 将无法预配的 VM 的 OS 磁盘装载到另一个 VM。

az vm repair create  \
    --resource-group $resourceGroup \
    --name $vmName \
    --repair-username repairadm \
    --repair-password AnotherPassword123! \
    --repair-vm-name repairVM \
    --verbose

了解cloud-init.log

当有权访问 cloud-init 日志时,请查看 cloud-init 故障排除文档

收集活动日志

若要开始故障排除,请收集活动日志,以识别与问题相关的错误。 以下链接包含有关要遵循的过程的详细信息。

查看部署操作

通过查看活动日志管理 Azure 资源

获取支持

如果你已经参考了本指南,面仍无法排查出问题,则可以开启支持案例。 执行此操作时,请选择正确的产品和支持主题,这样做将吸引正确的支持团队。

选择案例产品:

Product Family: Azure
Product: Virtual Machine Running (Window\Linux)
Support Topic: <COMPLETE>
Support Subtopic: <COMPLETE>

联系我们寻求帮助

如果你有任何疑问或需要帮助,请创建支持请求联系 Azure 社区支持。 你还可以将产品反馈提交到 Azure 反馈社区