
Bug#1106378: marked as done (azure-vm-utils: Request to Adjust NVMe Timeout Defaults for Azure Compatibility)



Your message dated Sat, 24 May 2025 20:45:59 +0000
with message-id <[email protected]>
and subject line Bug#1106378: fixed in azure-vm-utils 0.6.0-3
has caused the Debian Bug report #1106378,
regarding azure-vm-utils: Request to Adjust NVMe Timeout Defaults for Azure Compatibility
to be marked as done.

This means that you claim that the problem has been dealt with.
If this is not the case it is now your responsibility to reopen the
Bug report if necessary, and/or fix the problem forthwith.

(NB: If you are a system administrator and have no idea what this
message is talking about, this may indicate a serious mail system
misconfiguration somewhere. Please contact [email protected]
immediately.)


-- 
1106378: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1106378
Debian Bug Tracking System
Contact [email protected] with problems
--- Begin Message ---
Package: cloud.debian.org
Severity: normal
User: [email protected]

Microsoft would like us to adjust the default NVMe timeout settings on our
bookworm images to improve reliability on Azure.

Azure VMs use NVMe for ephemeral storage, and newer VM sizes use it for their
root volumes.  Microsoft has received reports of Linux systems using the
kernel default settings failing under certain circumstances with dmesg
containing messages such as those shown below.

Microsoft recommends a timeout of 240 seconds; see
https://github.com/Azure/SAP-on-Azure-Scripts-and-Utilities/blob/432d8b3ccd1061aeb95552afc645f5390f1449d1/NVMe-Preflight-Check/azure-nvme-preflight-check.sh#L121-L161
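For anyone who wants to check a running system against that value, here is
a minimal sketch. It assumes the timeout is exposed as the nvme_core module
parameter io_timeout under /sys/module, and it treats any value of at least
240 as acceptable; both are my assumptions and are not taken from the
Microsoft script. The function names are made up for illustration.

```python
from pathlib import Path

# Value cited in this report as Microsoft's recommendation.
RECOMMENDED_TIMEOUT_S = 240

# Assumption: the timeout is the nvme_core module parameter io_timeout.
DEFAULT_PARAM = Path("/sys/module/nvme_core/parameters/io_timeout")


def read_io_timeout(param_path=DEFAULT_PARAM):
    """Return the current timeout in seconds, or None if it cannot be read."""
    try:
        return int(Path(param_path).read_text().strip())
    except (OSError, ValueError):
        # Module not loaded, file missing, or unexpected content.
        return None


def timeout_ok(current, recommended=RECOMMENDED_TIMEOUT_S):
    """True if a timeout was read and is at least the recommended value."""
    return current is not None and current >= recommended


if __name__ == "__main__":
    current = read_io_timeout()
    if current is None:
        print("could not read", DEFAULT_PARAM)
    elif timeout_ok(current):
        print(f"io_timeout is {current}s, meets the {RECOMMENDED_TIMEOUT_S}s recommendation")
    else:
        print(f"io_timeout is {current}s, below the {RECOMMENDED_TIMEOUT_S}s recommendation")
```

This only reports the current value. It does not change anything, and how
the setting gets applied persistently is a separate question.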

Additional details on NVMe on Azure are at
https://learn.microsoft.com/en-us/azure/virtual-machines/nvme-linux

I thought azure-vm-utils was already doing this, but apparently it's not. I've
requested that feature upstream at
https://github.com/Azure/azure-vm-utils/issues/80.  If it gets implemented
upstream in the near term, we should be able to get the change into trixie.
However, since that package isn't present in bookworm, we'd need to come up
with another approach there.

dmesg symptoms:
[169365.182748] nvme nvme0: I/O tag 246 (60f6) opcode 0x2 (Read) QID 21 timeout, aborting req_op:READ(0) size:262144
[169365.183193] nvme nvme0: Abort status: 0x0
[169365.183506] nvme nvme0: I/O tag 249 (80f9) opcode 0x2 (Read) QID 21 timeout, aborting req_op:READ(0) size:262144
[169365.183880] nvme nvme0: Abort status: 0x0
[169365.184197] nvme nvme0: I/O tag 250 (e0fa) opcode 0x2 (Read) QID 21 timeout, aborting req_op:READ(0) size:262144
[169365.184564] nvme nvme0: Abort status: 0x0
[169365.184893] nvme nvme0: I/O tag 251 (d0fb) opcode 0x2 (Read) QID 21 timeout, aborting req_op:READ(0) size:262144
[169365.185313] nvme nvme0: Abort status: 0x0
[169365.185627] nvme nvme0: I/O tag 252 (f0fc) opcode 0x2 (Read) QID 21 timeout, aborting req_op:READ(0) size:262144
[169365.186019] nvme nvme0: Abort status: 0x0
[169365.186335] nvme nvme0: I/O tag 253 (90fd) opcode 0x2 (Read) QID 21 timeout, aborting req_op:READ(0) size:69632
[169365.186697] nvme nvme0: Abort status: 0x0
[169365.497993] nvme nvme0: I/O tag 164 (e0a4) opcode 0x2 (Read) QID 9 timeout, reset controller
[169368.888085] nvme_log_error: 108 callbacks suppressed
[169368.888551] nvme0n9: Read(0x2) @ LBA 1179738368, 64 blocks, Host Aborted Command (sct 0x3 / sc 0x71) 
[169368.888995] I/O error, dev nvme0n9, sector 9437906944 op 0x0:(READ) flags 0x84700 phys_seg 64 prio class 2
[169368.889723] nvme0n9: Read(0x2) @ LBA 1179738432, 64 blocks, Host Aborted Command (sct 0x3 / sc 0x71) 
[169368.890119] I/O error, dev nvme0n9, sector 9437907456 op 0x0:(READ) flags 0x84700 phys_seg 64 prio class 2
[169368.890439] nvme0n9: Read(0x2) @ LBA 1179738496, 30 blocks, Host Aborted Command (sct 0x3 / sc 0x71) 
[169368.890757] I/O error, dev nvme0n9, sector 9437907968 op 0x0:(READ) flags 0x84700 phys_seg 30 prio class 2
[169368.891124] nvme0n9: Read(0x2) @ LBA 1179738526, 64 blocks, Host Aborted Command (sct 0x3 / sc 0x71) 
[169368.891444] I/O error, dev nvme0n9, sector 9437908208 op 0x0:(READ) flags 0x84700 phys_seg 64 prio class 2
[169368.891764] nvme0n9: Read(0x2) @ LBA 1179738590, 64 blocks, Host Aborted Command (sct 0x3 / sc 0x71) 
[169368.892124] I/O error, dev nvme0n9, sector 9437908720 op 0x0:(READ) flags 0x84700 phys_seg 64 prio class 2
[169368.892443] nvme0n9: Read(0x2) @ LBA 1179738654, 64 blocks, Host Aborted Command (sct 0x3 / sc 0x71) 
[169368.892759] I/O error, dev nvme0n9, sector 9437909232 op 0x0:(READ) flags 0x84700 phys_seg 64 prio class 2
[169368.893118] nvme0n9: Read(0x2) @ LBA 1179738718, 64 blocks, Host Aborted Command (sct 0x3 / sc 0x71) 
[169368.893438] I/O error, dev nvme0n9, sector 9437909744 op 0x0:(READ) flags 0x84700 phys_seg 64 prio class 2
[169368.893766] nvme0n9: Read(0x2) @ LBA 1179738782, 64 blocks, Host Aborted Command (sct 0x3 / sc 0x71) 
[169368.894126] I/O error, dev nvme0n9, sector 9437910256 op 0x0:(READ) flags 0x84700 phys_seg 64 prio class 2
[169368.894448] nvme0n9: Read(0x2) @ LBA 1179738846, 64 blocks, Host Aborted Command (sct 0x3 / sc 0x71) 
[169368.894768] I/O error, dev nvme0n9, sector 9437910768 op 0x0:(READ) flags 0x84700 phys_seg 64 prio class 2
[169368.895126] nvme0n9: Read(0x2) @ LBA 1179738910, 64 blocks, Host Aborted Command (sct 0x3 / sc 0x71) 
[169368.895448] I/O error, dev nvme0n9, sector 9437911280 op 0x0:(READ) flags 0x84700 phys_seg 64 prio class 2
[169369.255892] nvme nvme0: 48/0/0 default/read/poll queues
[169399.478284] nvme nvme0: I/O tag 43 (a02b) QID 15 timeout, disable controller
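
To check whether a system is showing the same symptoms, the log above can
be scanned for the timeout lines. A rough sketch follows. The regular
expression is written against the message formats seen in this report only,
other kernel versions may word them differently, and the function name is
hypothetical.

```python
import re
from collections import Counter

# Matches lines such as:
#   nvme nvme0: I/O tag 246 (60f6) opcode 0x2 (Read) QID 21 timeout, aborting ...
#   nvme nvme0: I/O tag 43 (a02b) QID 15 timeout, disable controller
# Assumption: only the three actions seen in this report are of interest.
TIMEOUT_RE = re.compile(
    r"nvme (nvme\d+): I/O tag \d+ .*?QID (\d+) timeout, "
    r"(aborting|reset controller|disable controller)"
)


def summarize_timeouts(lines):
    """Count timeout events per (controller, action) in an iterable of log lines."""
    counts = Counter()
    for line in lines:
        match = TIMEOUT_RE.search(line)
        if match:
            controller, _qid, action = match.groups()
            counts[(controller, action)] += 1
    return counts
```

For example, with dmesg output piped in on standard input, calling
summarize_timeouts(sys.stdin) gives the counts. Any "reset controller" or
"disable controller" entries correspond to the more serious lines in the
log above.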

--- End Message ---
--- Begin Message ---
Source: azure-vm-utils
Source-Version: 0.6.0-3
Done: Noah Meyerhans <[email protected]>

We believe that the bug you reported is fixed in the latest version of
azure-vm-utils, which is due to be installed in the Debian FTP archive.

A summary of the changes between this version and the previous one is
attached.

Thank you for reporting the bug, which will now be closed.  If you
have further comments please address them to [email protected],
and the maintainer will reopen the bug report if appropriate.

Debian distribution maintenance software
pp.
Noah Meyerhans <[email protected]> (supplier of updated azure-vm-utils package)

(This message was generated automatically at their request; if you
believe that there is a problem with it please contact the archive
administrators by mailing [email protected])


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA512

Format: 1.8
Date: Fri, 23 May 2025 12:49:19 -0400
Source: azure-vm-utils
Architecture: source
Version: 0.6.0-3
Distribution: unstable
Urgency: medium
Maintainer: Debian Cloud Team <[email protected]>
Changed-By: Noah Meyerhans <[email protected]>
Closes: 1106378
Changes:
 azure-vm-utils (0.6.0-3) unstable; urgency=medium
 .
   * Set nvme parameters per Microsoft recommendations (Closes: #1106378)
Checksums-Sha1:
 81c616e4121c095f536414761bae2075213a8656 2109 azure-vm-utils_0.6.0-3.dsc
 b20299701bad0b1c4ae5de34b890f7662e952278 3052 azure-vm-utils_0.6.0-3.debian.tar.xz
 94f363f67124ab54a3b3f00f22fadc7b4702d81c 7688 azure-vm-utils_0.6.0-3_source.buildinfo
Checksums-Sha256:
 cf2286fa95f6f4574a5854ad9aa4a1a940adce37ee9daf2b651efddbf20c1005 2109 azure-vm-utils_0.6.0-3.dsc
 c7ed2cc080f95e7f524344fc32051e4566210cf430679e6f41e3e7818df3cd21 3052 azure-vm-utils_0.6.0-3.debian.tar.xz
 9a25e6edda7d135cab9fd07ea916fb34ad01669e66c13c869ac747e9fee71545 7688 azure-vm-utils_0.6.0-3_source.buildinfo
Files:
 a1957e6d53e45e1b425b432564275c3c 2109 admin optional azure-vm-utils_0.6.0-3.dsc
 bbbeb7711e2c92784395ef94a7c88d16 3052 admin optional azure-vm-utils_0.6.0-3.debian.tar.xz
 2c8df125dcbb95285765f2a4b8c801bb 7688 admin optional azure-vm-utils_0.6.0-3_source.buildinfo

-----BEGIN PGP SIGNATURE-----

iQIzBAEBCgAdFiEE5G+E0xEKhJuZ7RJ34+c1IpshdTUFAmgyJAcACgkQ4+c1Ipsh
dTXm3hAArP+TwYsS3xKmzxZFwno5sZ+qSw7iw4svsQM1rxYyeW1Y16rwk3rxBzEW
OAMCG/Py+a35FUt3Uj+NxaTypqcrxyZKo71d3JahsLi99hJfQrSi4UZinJO+WjcS
+1KF0lOpuHGTR5fpOP7MjRk1IGTMWLkYtW7Nv32yoM7/QqmL0BIRNJKvoOzCLzhv
g/APs2Qs6niHQhZ512G+SPF4t+gED0F3Eylf07hctFoydsoEkBZW6DMhGFXTHDTX
LyVc4KOZVMDMK3DRRIe6TBL9q/mfl/T1rhTGKPz7Xs2Hfq1UbwShIU3nkDiPqy2M
1b01OurpuIViJhOBjsyYtDFGINIZMx+WWE8rC/AQl8zQDpIXSw5ZYd3l6eizEfld
XvlYUuQbyNwsG+rMpAXPniGC+CdjIhXbM1GQUX7FafwfjtcLDB4V+9K+vaj9oi/+
R894bGPrloGTusPcxaXYnIryTYst67j/IbXmyEavTKj4Zk3b3xL9X5wr4EVjbfu8
uEiKt7cYHiq5ZAWeZEutxvStKcDkUKjvyP3DZnkdKgFgVlJDC/xgCLH15aCedqjQ
+2bcZ47tzNh/SmnvfZf1fug+htwM9qW0V64NW+Zp3qMdHe//mF1cyd/5olm+mylJ
xatN2Cq0BAvUPJOUfn0qpRCmoZOFfm9KAR0NvW1hB4thS7JL9KE=
=YVCp
-----END PGP SIGNATURE-----



--- End Message ---
