* [dm-devel] Device Mapper being derailed in tboot launch
From: Tony Camuso @ 2022-06-06 15:43 UTC
To: dm-devel
Only one system I've encountered exhibits this problem, the
lenovo st250v2. When I boot normally, all the devices are
found and boot succeeds.
When I boot through tboot, DM does not seem to be able to
find the root and swap devices.
Below are bootlog snippets from a successful boot and a failing
boot.
Can somebody tell me where to look for the source of this failure?
Is there debug code or DM utilities to help identify the problem?
Is there source code I can implement to isolate the failure point?
Successful bootlog snippet:
[ 3.843911] sd 5:0:0:0: [sda] Attached SCSI disk
[ 3.848370] sd 6:0:0:0: [sdb] Attached SCSI disk
[ 3.925639] md126: detected capacity change from 0 to 1900382519296
[ 3.946307] md126: p1 p2 p3
[ OK ] Found device /dev/mapper/rhel_lenovo--st250v2--02-root.
[ OK ] Reached target Initrd Root Device.
[ OK ] Found device /dev/mapper/rhel_lenovo--st250v2--02-swap.
Starting Resume from hibernation us…r/rhel_lenovo--st250v2--02-swap...
[ OK ] Started Resume from hibernation usi…per/rhel_lenovo--st250v2--02-swap.
[ OK ] Reached target Local File Systems (Pre).
Failing bootlog snippet:
[ 4.578205] sd 5:0:0:0: [sda] Attached SCSI disk
[ 4.581000] sd 6:0:0:0: [sdb] Attached SCSI disk
[ TIME ] Timed out waiting for device dev-ma…dst250v2\x2d\x2d02\x2dswap.device.
[DEPEND] Dependency failed for Resume from h…per/rhel_lenovo--st250v2--02-swap.
--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel
* Re: [dm-devel] Device Mapper being derailed in tboot launch
From: Bryn M. Reeves @ 2022-06-07 9:57 UTC
To: Tony Camuso; +Cc: dm-devel
On Mon, Jun 06, 2022 at 11:43:58AM -0400, Tony Camuso wrote:
> Successful bootlog snippet:
>
> [ 3.843911] sd 5:0:0:0: [sda] Attached SCSI disk
> [ 3.848370] sd 6:0:0:0: [sdb] Attached SCSI disk
> [ 3.925639] md126: detected capacity change from 0 to 1900382519296
> [ 3.946307] md126: p1 p2 p3
Are the MD array partitions being used as the PVs for the rhel_lenovo
volume group? It's the major difference in the two snippets other than
timing, and would account for why the volume group cannot be discovered
in the tboot case.
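One way to answer that question from a working boot is sketched below. This is a minimal sketch, assuming the lvm2 tools are installed and the command is run as root; the helper name pvs_on_md is made up for illustration. It filters `pvs` output for physical volumes that live on /dev/md* devices.

```shell
#!/bin/sh
# Print every LVM physical volume that sits on an MD device, together with
# the volume group it belongs to. Expects the two-column output of
# `pvs --noheadings -o pv_name,vg_name` on stdin.
# pvs_on_md is a hypothetical helper name, not part of lvm2.
pvs_on_md() {
    awk '$1 ~ /^\/dev\/md/ { print $1, $2 }'
}

# Intended use on the affected host (as root):
#   pvs --noheadings -o pv_name,vg_name | pvs_on_md
```

If this prints a line for the rhel_lenovo volume group, the VG depends on the MD array being assembled first.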
> [ OK ] Found device /dev/mapper/rhel_lenovo--st250v2--02-root.
> [ OK ] Reached target Initrd Root Device.
> [ OK ] Found device /dev/mapper/rhel_lenovo--st250v2--02-swap.
> Starting Resume from hibernation us…r/rhel_lenovo--st250v2--02-swap...
> [ OK ] Started Resume from hibernation usi…per/rhel_lenovo--st250v2--02-swap.
> [ OK ] Reached target Local File Systems (Pre).
>
> Failing bootlog snippet:
>
> [ 4.578205] sd 5:0:0:0: [sda] Attached SCSI disk
> [ 4.581000] sd 6:0:0:0: [sdb] Attached SCSI disk
> [ TIME ] Timed out waiting for device dev-ma…dst250v2\x2d\x2d02\x2dswap.device.
> [DEPEND] Dependency failed for Resume from h…per/rhel_lenovo--st250v2--02-swap.
Any differences in kernel command line/dracut arguments between the two
cases? Especially the rd.md.* bits?
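A quick way to compare the two sets of arguments is sketched below. It is only a sketch: the function names are made up, and it assumes each command line has been captured as a single string (for example from /proc/cmdline, or copied from the grub entry with the line continuations joined).

```shell
#!/bin/sh
# Split a kernel command line into one argument per line, sorted and
# de-duplicated so that comm(1) can compare two of them.
split_args() {
    printf '%s\n' "$1" | tr -s ' ' '\n' | sed '/^$/d' | LC_ALL=C sort -u
}

# Print the arguments that appear in only one of two command lines.
# "<" marks arguments unique to the first, ">" those unique to the second.
cmdline_diff() {
    first=$(mktemp) || return 1
    second=$(mktemp) || { rm -f "$first"; return 1; }
    split_args "$1" > "$first"
    split_args "$2" > "$second"
    LC_ALL=C comm -23 "$first" "$second" | sed 's/^/< /'
    LC_ALL=C comm -13 "$first" "$second" | sed 's/^/> /'
    rm -f "$first" "$second"
}

# Intended use, with the two captured strings as arguments:
#   cmdline_diff "$normal_cmdline" "$tboot_cmdline"
```

Repeated options such as rd.md.uuid= are kept as separate entries because the whole "name=value" token is compared.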
Regards,
Bryn.
* Re: [dm-devel] Device Mapper being derailed in tboot launch
From: Tony Camuso @ 2022-06-07 12:15 UTC
To: Bryn M. Reeves; +Cc: dm-devel
On 6/7/2022 5:57 AM, Bryn M. Reeves wrote:
Many thanks for the reply.
> On Mon, Jun 06, 2022 at 11:43:58AM -0400, Tony Camuso wrote:
>
>> Successful bootlog snippet:
>>
>> [ 3.843911] sd 5:0:0:0: [sda] Attached SCSI disk
>> [ 3.848370] sd 6:0:0:0: [sdb] Attached SCSI disk
>> [ 3.925639] md126: detected capacity change from 0 to 1900382519296
>> [ 3.946307] md126: p1 p2 p3
>
> Are the MD array partitions being used as the PVs for the rhel_lenovo
> volume group? It's the major difference in the two snippets other than
> timing, and would account for why the volume group cannot be discovered
> in the tboot case.
It would appear from the respective grub command lines that they are.
See below.
>
>> [ OK ] Found device /dev/mapper/rhel_lenovo--st250v2--02-root.
>> [ OK ] Reached target Initrd Root Device.
>> [ OK ] Found device /dev/mapper/rhel_lenovo--st250v2--02-swap.
>> Starting Resume from hibernation us…r/rhel_lenovo--st250v2--02-swap...
>> [ OK ] Started Resume from hibernation usi…per/rhel_lenovo--st250v2--02-swap.
>> [ OK ] Reached target Local File Systems (Pre).
>>
>> Failing bootlog snippet:
>>
>> [ 4.578205] sd 5:0:0:0: [sda] Attached SCSI disk
>> [ 4.581000] sd 6:0:0:0: [sdb] Attached SCSI disk
>> [ TIME ] Timed out waiting for device dev-ma…dst250v2\x2d\x2d02\x2dswap.device.
>> [DEPEND] Dependency failed for Resume from h…per/rhel_lenovo--st250v2--02-swap.
>
> Any differences in kernel command line/dracut arguments between the two
> cases? Especially the rd.md.* bits?
======================================================================
Here is the kernel command line in grub for the normal boot (succeeds)
----------------------------------------------------------------------
set gfx_payload=keep
insmod gzio
linux ($root)/vmlinuz-4.18.0-348.el8.x86_64 root=/dev/mapper/rhel_lenovo--st25\
0v2--02-root ro crashkernel=auto resume=/dev/mapper/rhel_lenovo--st250v2--02-s\
wap rd.md.uuid=8061c4cf:06de8a59:a9eefb7e:3edb011a rd.md.uuid=549c2ba4:1e03463\
b:d429e75b:398c67a3 rd.lvm.lv=rhel_lenovo-st250v2-02/root rd.lvm.lv=rhel_lenov\
o-st250v2-02/swap console=ttyS0,115200N81
initrd ($root)/initramfs-4.18.0-348.el8.x86_64.img $tuned_initrd
=============================================================
And here is the kernel command line in grub for tboot (fails)
-------------------------------------------------------------
echo 'Loading tboot 1.10.5 ...'
multiboot2 /tboot.gz logging=serial,memory,vga
echo 'Loading Linux 4.18.0-348.el8.x86_64 ...'
module2 /vmlinuz-4.18.0-348.el8.x86_64 root=/dev/mapper/rhel_lenovo--s\
t250v2--02-root ro crashkernel=auto resume=/dev/mapper/rhel_lenovo--st250v2--0\
2-swap rd.md.uuid=8061c4cf:06de8a59:a9eefb7e:3edb011a rd.md.uuid=549c2ba4:1e03\
463b:d429e75b:398c67a3 rd.lvm.lv=rhel_lenovo-st250v2-02/root rd.lvm.lv=rhel_le\
novo-st250v2-02/swap console=ttyS0,115200N81 intel_iommu=on noefi
echo 'Loading initial ramdisk ...'
module2 /initramfs-4.18.0-348.el8.x86_64.img
===========================================
Here is the RAID info
-------------------------------------------
# cat /etc/mdadm.conf
# mdadm.conf written out by anaconda
MAILADDR root
AUTO +imsm +1.x -all
ARRAY /dev/md/Volume0_0 UUID=8061c4cf:06de8a59:a9eefb7e:3edb011a
ARRAY /dev/md/imsm UUID=549c2ba4:1e03463b:d429e75b:398c67a3
# cat /proc/mdstat
Personalities : [raid0]
md126 : active raid0 sda[1] sdb[0]
1855842304 blocks super external:/md127/0 128k chunks
md127 : inactive sdb[1](S) sda[0](S)
10402 blocks super external:imsm
unused devices: <none>
===========================================
Here is the disk,pv,vg,lv info
-------------------------------------------
# lsblk -f
NAME                                FSTYPE          LABEL UUID                                   MOUNTPOINT
sda                                 isw_raid_member
└─md126
  ├─md126p1                         vfat                  881C-8097                              /boot/efi
  ├─md126p2                         xfs                   bbe36a4d-8f22-4bd0-828c-f45a174b37ea   /boot
  └─md126p3                         LVM2_member           xrxpri-0EMi-idX3-4xwI-PqEe-wB3k-oVGcNd
    ├─rhel_lenovo--st250v2--02-root xfs                   bbead1a6-603b-441c-a99f-665a534012f0   /
    ├─rhel_lenovo--st250v2--02-swap swap                  66f4dd05-466e-4a1f-8b78-57864c3aa328   [SWAP]
    ├─rhel_lenovo--st250v2--02-home xfs                   fd974fa9-484a-4ed0-b32e-69fc62ca52a3   /home
    └─rhel_lenovo--st250v2--02-work xfs                   1594e6cc-b170-49d2-8000-eb0b13e8d2c4   /work
sdb                                 isw_raid_member
└─md126
  ├─md126p1                         vfat                  881C-8097                              /boot/efi
  ├─md126p2                         xfs                   bbe36a4d-8f22-4bd0-828c-f45a174b37ea   /boot
  └─md126p3                         LVM2_member           xrxpri-0EMi-idX3-4xwI-PqEe-wB3k-oVGcNd
    ├─rhel_lenovo--st250v2--02-root xfs                   bbead1a6-603b-441c-a99f-665a534012f0   /
    ├─rhel_lenovo--st250v2--02-swap swap                  66f4dd05-466e-4a1f-8b78-57864c3aa328   [SWAP]
    ├─rhel_lenovo--st250v2--02-home xfs                   fd974fa9-484a-4ed0-b32e-69fc62ca52a3   /home
    └─rhel_lenovo--st250v2--02-work xfs                   1594e6cc-b170-49d2-8000-eb0b13e8d2c4   /work
-----------------------------------------------------
# pvs
PV VG Fmt Attr PSize PFree
/dev/md126p3 rhel_lenovo-st250v2-02 lvm2 a-- <1.73t 0
-----------------------------------------------------
# vgs
VG #PV #LV #SN Attr VSize VFree
rhel_lenovo-st250v2-02 1 4 0 wz--n- <1.73t 0
-----------------------------------------------------
# lvs
LV VG Attr LSize Pool Origin Data% Meta% Move Log Cpy%Sync Convert
home rhel_lenovo-st250v2-02 -wi-ao---- 500.00g
root rhel_lenovo-st250v2-02 -wi-ao---- 70.00g
swap rhel_lenovo-st250v2-02 -wi-ao---- 31.47g
work rhel_lenovo-st250v2-02 -wi-ao---- <1.14t
-----------------------------------------------------
# pvdisplay
--- Physical volume ---
PV Name /dev/md126p3
VG Name rhel_lenovo-st250v2-02
PV Size <1.73 TiB / not usable 4.00 MiB
Allocatable yes (but full)
PE Size 4.00 MiB
Total PE 452679
Free PE 0
Allocated PE 452679
PV UUID xrxpri-0EMi-idX3-4xwI-PqEe-wB3k-oVGcNd
-----------------------------------------------------
# vgdisplay
--- Volume group ---
VG Name rhel_lenovo-st250v2-02
System ID
Format lvm2
Metadata Areas 1
Metadata Sequence No 8
VG Access read/write
VG Status resizable
MAX LV 0
Cur LV 4
Open LV 4
Max PV 0
Cur PV 1
Act PV 1
VG Size <1.73 TiB
PE Size 4.00 MiB
Total PE 452679
Alloc PE / Size 452679 / <1.73 TiB
Free PE / Size 0 / 0
VG UUID WABoR8-sjXC-WWXv-UIJR-stt5-lX2y-HkhUxz
-----------------------------------------------------
# lvdisplay
--- Logical volume ---
LV Path /dev/rhel_lenovo-st250v2-02/swap
LV Name swap
VG Name rhel_lenovo-st250v2-02
LV UUID 1A22Bb-FlN7-6fcs-v0JB-Mpqt-oZIT-1l9e3T
LV Write Access read/write
LV Creation host, time lenovo-st250v2-02.ml3.eng.bos.redhat.com, 2022-06-02 12:42:36 -0400
LV Status available
# open 2
LV Size 31.47 GiB
Current LE 8057
Segments 1
Allocation inherit
Read ahead sectors auto
- currently set to 8192
Block device 253:1
--- Logical volume ---
LV Path /dev/rhel_lenovo-st250v2-02/root
LV Name root
VG Name rhel_lenovo-st250v2-02
LV UUID UyMrdL-Qcwf-6T9X-wGJj-VcjI-6m6A-UBMG2e
LV Write Access read/write
LV Creation host, time lenovo-st250v2-02.ml3.eng.bos.redhat.com, 2022-06-02 12:42:41 -0400
LV Status available
# open 1
LV Size 70.00 GiB
Current LE 17920
Segments 1
Allocation inherit
Read ahead sectors auto
- currently set to 8192
Block device 253:0
--- Logical volume ---
LV Path /dev/rhel_lenovo-st250v2-02/home
LV Name home
VG Name rhel_lenovo-st250v2-02
LV UUID Y82eN4-rQGL-D6oe-72wC-zO2C-GKep-XnZ1bA
LV Write Access read/write
LV Creation host, time lenovo-st250v2-02.ml3.eng.bos.redhat.com, 2022-06-02 13:01:25 -0400
LV Status available
# open 1
LV Size 500.00 GiB
Current LE 128000
Segments 1
Allocation inherit
Read ahead sectors auto
- currently set to 8192
Block device 253:2
--- Logical volume ---
LV Path /dev/rhel_lenovo-st250v2-02/work
LV Name work
VG Name rhel_lenovo-st250v2-02
LV UUID U09E7P-eS5R-fn0V-CwcR-PRdm-J8ip-07I8b9
LV Write Access read/write
LV Creation host, time lenovo-st250v2-02.ml3.eng.bos.redhat.com, 2022-06-02 13:01:31 -0400
LV Status available
# open 1
LV Size <1.14 TiB
Current LE 298702
Segments 1
Allocation inherit
Read ahead sectors auto
- currently set to 8192
Block device 253:3
-----------------------------------------------------
>
> Regards,
> Bryn.
>
>
* Re: [dm-devel] Device Mapper being derailed in tboot launch
From: Bryn M. Reeves @ 2022-06-08 13:25 UTC
To: Tony Camuso; +Cc: dm-devel
On Tue, Jun 07, 2022 at 08:15:16AM -0400, Tony Camuso wrote:
> On 6/7/2022 5:57 AM, Bryn M. Reeves wrote:
> > On Mon, Jun 06, 2022 at 11:43:58AM -0400, Tony Camuso wrote:
> > > Successful bootlog snippet:
> > >
> > > [ 3.843911] sd 5:0:0:0: [sda] Attached SCSI disk
> > > [ 3.848370] sd 6:0:0:0: [sdb] Attached SCSI disk
> > > [ 3.925639] md126: detected capacity change from 0 to 1900382519296
> > > [ 3.946307] md126: p1 p2 p3
> >
> > Are the MD array partitions being used as the PVs for the rhel_lenovo
> > volume group? It's the major difference in the two snippets other than
> > timing, and would account for why the volume group cannot be discovered
> > in the tboot case.
>
> It would appear from the respective grub command lines that they are.
> See below.
OK great - that explains why the LVM devices are timing out in the tboot
case.
> ======================================================================
> Here is the kernel command line in grub for the normal boot (succeeds)
> ----------------------------------------------------------------------
>
> set gfx_payload=keep
> insmod gzio
> linux ($root)/vmlinuz-4.18.0-348.el8.x86_64 root=/dev/mapper/rhel_lenovo--st25\
> 0v2--02-root ro crashkernel=auto resume=/dev/mapper/rhel_lenovo--st250v2--02-s\
> wap rd.md.uuid=8061c4cf:06de8a59:a9eefb7e:3edb011a rd.md.uuid=549c2ba4:1e03463\
> b:d429e75b:398c67a3 rd.lvm.lv=rhel_lenovo-st250v2-02/root rd.lvm.lv=rhel_lenov\
> o-st250v2-02/swap console=ttyS0,115200N81
> initrd ($root)/initramfs-4.18.0-348.el8.x86_64.img $tuned_initrd
>
> =============================================================
> And here is the kernel command line in grub for tboot (fails)
> -------------------------------------------------------------
>
> echo 'Loading tboot 1.10.5 ...'
> multiboot2 /tboot.gz logging=serial,memory,vga
> echo 'Loading Linux 4.18.0-348.el8.x86_64 ...'
> module2 /vmlinuz-4.18.0-348.el8.x86_64 root=/dev/mapper/rhel_lenovo--s\
> t250v2--02-root ro crashkernel=auto resume=/dev/mapper/rhel_lenovo--st250v2--0\
> 2-swap rd.md.uuid=8061c4cf:06de8a59:a9eefb7e:3edb011a rd.md.uuid=549c2ba4:1e03\
> 463b:d429e75b:398c67a3 rd.lvm.lv=rhel_lenovo-st250v2-02/root rd.lvm.lv=rhel_le\
> novo-st250v2-02/swap console=ttyS0,115200N81 intel_iommu=on noefi
> echo 'Loading initial ramdisk ...'
> module2 /initramfs-4.18.0-348.el8.x86_64.img
There are some minor differences here, in particular these two arguments
that are only present in the tboot entry:
intel_iommu=on
noefi
The first doesn't seem likely to be involved - if forcing the IOMMU on
did anything to affect this I would expect it to break the SCSI driver
and prevent the disks from being discovered, but we see the sd log
messages in the tboot case so that isn't happening.
The noefi is a bit more interesting - a lot of modern systems ship the
motherboard RAID configuration tools as an EFI application now, and I
wonder if forcing EFI off with noefi is somehow breaking the discovery
of the imsm RAID set? The full dmesg for the two cases might give some
more hints about this.
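A rough way to compare the two dmesg captures is sketched below. It assumes each boot's kernel log has been saved to a file (for example from the serial console); the function names and the grep pattern are illustrative guesses at what is relevant, not a definitive filter.

```shell
#!/bin/sh
# Remove the leading "[ seconds.micros]" timestamp so that lines from two
# boots compare equal despite timing differences.
strip_dmesg_ts() {
    sed -E 's/^\[ *[0-9]+\.[0-9]+\] ?//'
}

# Print MD/IMSM/EFI related lines present in the first log but absent from
# the second.
# $1: dmesg saved from the working boot, $2: dmesg saved from the tboot boot.
storage_lines_missing() {
    good=$(mktemp) || return 1
    bad=$(mktemp) || { rm -f "$good"; return 1; }
    strip_dmesg_ts < "$1" | grep -Ei 'md[0-9]+|imsm|raid|efi' | LC_ALL=C sort -u > "$good"
    strip_dmesg_ts < "$2" | grep -Ei 'md[0-9]+|imsm|raid|efi' | LC_ALL=C sort -u > "$bad"
    LC_ALL=C comm -23 "$good" "$bad"
    rm -f "$good" "$bad"
}

# Intended use (file names are placeholders):
#   storage_lines_missing normal-boot.log tboot-boot.log
```

Swapping the two arguments shows the lines that only appear in the tboot case.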
My other possible guess was to check whether the initramfs image for the
tboot case was missing MD support, however from the above it looks as
though the two entries are using the same image (one has a ($root)
prefix, but that's where grub should look for / anyway).
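For completeness, MD support in an image can be checked roughly as sketched below. This assumes dracut's lsinitrd tool is available and that the image is at /boot/initramfs-4.18.0-348.el8.x86_64.img on the running system (the /boot prefix is an assumption based on the lsblk output). The helper name is made up, and looking for the mdadm binary is only a heuristic for "MD support is included".

```shell
#!/bin/sh
# Succeed if an initramfs file listing on stdin contains an mdadm binary.
# Expects `lsinitrd <image>` style output, where each line ends in a path.
# initrd_has_mdadm is a hypothetical helper name.
initrd_has_mdadm() {
    grep -Eq '/mdadm$'
}

# Intended use:
#   lsinitrd /boot/initramfs-4.18.0-348.el8.x86_64.img | initrd_has_mdadm \
#       && echo "mdadm is in the image" \
#       || echo "mdadm not found in the image"
```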
Regards,
Bryn.