* GRUB bug with Btrfs multiple devices
@ 2019-11-26  4:05 Chris Murphy
  2019-11-26 21:11 ` Goffredo Baroncelli
  0 siblings, 1 reply; 23+ messages in thread
From: Chris Murphy @ 2019-11-26  4:05 UTC (permalink / raw)
  To: Btrfs BTRFS

grub2-efi-x64-2.02-100.fc31.x86_64
kernel-5.3.13-300.fc31.x86_64

I've seen this before, so it isn't a regression in either of the above
versions. But I'm also not certain when the regression occurred,
because the last time I tested Btrfs multiple devices (specifically
data single profile), was years ago and I didn't run into this.

The gist to reproduce:
1. btrfs single device, single profile data, single profile metadata
2. device starts to run out of space; no problem 'btrfs device add
/dev/' voila it works, reboots, keeps on working for a while, but
then...
3. install a kernel or two or three or four

I suspect that at some point kernels end up on the newly added device
due to new block groups eventually being created there, and GRUB
subsequently gets confused, starts spewing a bunch of error
information which I have to page through. Eventually it does find
everything and does boot. But it's kinda ugly and I'm not really sure
how to gather more information.

Shaky cam video of the boot is here:
https://photos.app.goo.gl/wvJbB6kBEFzNwogo6

-- 
Chris Murphy

^ permalink raw reply [flat|nested] 23+ messages in thread
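For anyone who wants to try steps 1 and 2 of the gist in a scratch VM, they could be scripted roughly as below. This is only a sketch under stated assumptions: the device names /dev/vdb and /dev/vdc and the mount point /mnt/test are hypothetical, mkfs.btrfs stands in for the original installation, and the function only builds the command lines without running any of them. Step 3 (installing kernels) depends on the distribution and is left out.

```python
from typing import List


def repro_commands(first_dev: str, second_dev: str, mountpoint: str) -> List[List[str]]:
    """Build (but do not execute) the commands for steps 1 and 2 of the gist.

    All arguments are hypothetical placeholders chosen by the caller.
    """
    return [
        # Step 1: single device, single profile for data and metadata.
        ["mkfs.btrfs", "-d", "single", "-m", "single", first_dev],
        ["mount", first_dev, mountpoint],
        # Step 2: the volume runs low on space, so a second device is added.
        ["btrfs", "device", "add", second_dev, mountpoint],
        # Show how allocated space is spread across the two devices.
        ["btrfs", "filesystem", "usage", mountpoint],
    ]


if __name__ == "__main__":
    # Print the commands for review; running them would destroy data on the devices.
    for argv in repro_commands("/dev/vdb", "/dev/vdc", "/mnt/test"):
        print(" ".join(argv))
```

Printing rather than executing is deliberate, since mkfs.btrfs is destructive and the right devices differ per machine.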
* Re: GRUB bug with Btrfs multiple devices
From: Goffredo Baroncelli @ 2019-11-26 21:11 UTC
  To: Chris Murphy, Btrfs BTRFS

On 26/11/2019 05.05, Chris Murphy wrote:
> grub2-efi-x64-2.02-100.fc31.x86_64
> kernel-5.3.13-300.fc31.x86_64
>
> I've seen this before, so it isn't a regression in either of the above
> versions. But I'm also not certain when the regression occurred,
> because the last time I tested Btrfs multiple devices (specifically
> data single profile), was years ago and I didn't run into this.

From the video, it seems that GRUB complains about a "failure reading".
However, GRUB is still able to boot, and because the profiles are
"single" (no redundancy), this looks like a "false positive" error.

When I added RAID5/6 support to GRUB, I remember seeing errors like the
ones you showed. However, that was a year ago, so my memory may be wrong.

I noticed that GRUB tests a lot of disks (hd0 ... hd3). Could you
share the disk layout? Most of the errors are something like "failure
reading sector 0xXX", but I can't read the XX number: could you tell us
what "XX" is? It seems to be 0x80... but my eyes are bad and your video
is even worse :-)

I think that the errors are due to the "rescan" logic (see GRUB commit
[1]). Could you try a more recent GRUB (2.04 instead of 2.02)?

> The gist to reproduce:
> 1. btrfs single device, single profile data, single profile metadata
> 2. device starts to run out of space; no problem 'btrfs device add
> /dev/' voila it works, reboots, keeps on working for a while, but
> then...
> 3. install a kernel or two or three or four
>
> I suspect that at some point kernels end up on the newly added device
> due to new block groups eventually being created there, and GRUB
> subsequently gets confused, starts spewing a bunch of error
> information which I have to page through. Eventually it does find
> everything and does boot. But it's kinda ugly and I'm not really sure
> how to gather more information.
>
> Shaky cam video of the boot is here:
> https://photos.app.goo.gl/wvJbB6kBEFzNwogo6

[1] http://git.savannah.gnu.org/cgit/grub.git/commit/grub-core/fs/btrfs.c?id=fd5a1d82f1d6a0482f5fe201ce646ddba8574bab

-- 
gpg @keyserver.linux.it: Goffredo Baroncelli <kreijackATinwind.it>
Key fingerprint BBF5 1610 0B64 DAC6 5F7D 17B2 0EDA 9B37 8B82 E0B5
* Re: GRUB bug with Btrfs multiple devices
From: Chris Murphy @ 2019-11-26 23:53 UTC
  To: Goffredo Baroncelli; +Cc: Chris Murphy, Btrfs BTRFS

On Tue, Nov 26, 2019 at 2:11 PM Goffredo Baroncelli <kreijack@libero.it> wrote:
>
> [...]
> I noticed that GRUB tests a lot of disks (hd0 ... hd3). Could you
> share the disk layout? Most of the errors are something like "failure
> reading sector 0xXX", but I can't read the XX number: could you tell
> us what "XX" is? It seems to be 0x80... but my eyes are bad and your
> video is even worse :-)

It was a dark room and the shaky cam was hunting for focus :-D It's 0x80.

The storage is one CD-ROM drive and one SSD drive. That's it. So I
don't know why there's hd2 and hd3; it seems like GRUB is confused
about how many drives there are, but that pre-dates this problem.

> I think that the errors are due to the "rescan" logic (see GRUB
> commit [1]). Could you try a more recent GRUB (2.04 instead of 2.02)?

Yes, Fedora Rawhide has 2.04 in it, so I'll give that a shot next time
I rebuild this particular laptop, which should be relatively soon; or
maybe I can even reproduce this problem in a VM with two virtio
devices.

-- 
Chris Murphy
* Re: GRUB bug with Btrfs multiple devices
From: Chris Murphy @ 2019-11-27  1:35 UTC
  To: Goffredo Baroncelli; +Cc: Btrfs BTRFS

On Tue, Nov 26, 2019 at 4:53 PM Chris Murphy <lists@colorremedies.com> wrote:
>
> On Tue, Nov 26, 2019 at 2:11 PM Goffredo Baroncelli <kreijack@libero.it> wrote:
> >
> > I think that the errors are due to the "rescan" logic (see GRUB
> > commit [1]). Could you try a more recent GRUB (2.04 instead of 2.02)?
>
> Yes, Fedora Rawhide has 2.04 in it, so I'll give that a shot next time
> I rebuild this particular laptop, which should be relatively soon; or
> maybe I can even reproduce this problem in a VM with two virtio
> devices.

I was able to just update to the Fedora 2.04-4.fc32 packages. It's not
upstream's, but it's a quick and dirty way to give it a shot. Turns
out the same errors happen, although the line number for efidisk.c
has changed:
https://photos.app.goo.gl/aKWRYhJkkJRDtC1W7

For grins, I dropped to a grub prompt and issued ls, and got a
different result:
https://photos.app.goo.gl/MvL9QZa6zGsiktAf9

Also, for what it's worth, the Btrfs in question is on hd5,gpt4 and
hd5,gpt5 - same physical device, different partitions.

-- 
Chris Murphy
* Re: GRUB bug with Btrfs multiple devices
From: Goffredo Baroncelli @ 2019-11-27  6:07 UTC
  To: Chris Murphy; +Cc: Btrfs BTRFS

On 27/11/2019 02.35, Chris Murphy wrote:
> [...]
> I was able to just update to the Fedora 2.04-4.fc32 packages. It's not
> upstream's, but it's a quick and dirty way to give it a shot. Turns
> out the same errors happen, although the line number for efidisk.c
> has changed:
> https://photos.app.goo.gl/aKWRYhJkkJRDtC1W7
>
> For grins, I dropped to a grub prompt and issued ls, and got a
> different result:
> https://photos.app.goo.gl/MvL9QZa6zGsiktAf9

Looking at the second picture, it seems that GRUB has problems
accessing disks hd0..hd3, and not only during Btrfs activity.
There is no problem accessing hd4 and hd5*.

Could you enable debugging with the following?

set pager=1
set debug=all

> Also, for what it's worth, the Btrfs in question is on hd5,gpt4 and
> hd5,gpt5 - same physical device, different partitions.

-- 
gpg @keyserver.linux.it: Goffredo Baroncelli <kreijackATinwind.it>
Key fingerprint BBF5 1610 0B64 DAC6 5F7D 17B2 0EDA 9B37 8B82 E0B5
* Re: GRUB bug with Btrfs multiple devices
From: Chris Murphy @ 2019-11-28  0:42 UTC
  To: Goffredo Baroncelli; +Cc: Btrfs BTRFS

On Tue, Nov 26, 2019 at 11:07 PM Goffredo Baroncelli <kreijack@inwind.it> wrote:
>
> [...]
> Looking at the second picture, it seems that GRUB has problems
> accessing disks hd0..hd3, and not only during Btrfs activity.
> There is no problem accessing hd4 and hd5*.
>
> Could you enable debugging with the following?
>
> set pager=1
> set debug=all

I need to narrow the scope. With 'set debug=all' there's just way
too much to video: minutes of pages, holding down the space bar the
whole time, which is even too fast to video. There must be over 1000
pages; a tiny minority contain efidisk.c references, the vast majority
are btrfs.c references. As many pages as there are, I was never able to
stop right on a boundary between efidisk.c and btrfs.c. So I gave up
on that approach.

Since the errors happen with efidisk.c, I've enabled 'set
debug=efidisk' and captured 74 photos, available at the link below
(they are in pager order):
https://photos.app.goo.gl/nuDH5hFMRxUVKXpX6

It does seem that the errors only happen in efidisk.c and only when
trying to read from what might be phantom devices; I do not know how a
second device in a Btrfs volume triggers this though. There must be
some interaction between efidisk.c and btrfs.c? The grubx64.efi,
grubenv, grub.cfg, and grub modules are all on an HFS+ (no journal)
file system acting as the EFI System partition (as has been the default
behavior in Fedora on Macs for many years now). Only vmlinuz and
initramfs are on Btrfs. So I'm not really even sure why btrfs.c gets
called before the GRUB menu is displayed.

I'll see about reproducing this with a VM using edk2 UEFI and two
virtio devices, to at least get to a cleaner environment so we're not
confusing multiple system-specific weird things. And I can also leave
this particular Mac laptop as it is for further study.

-- 
Chris Murphy
* Re: GRUB bug with Btrfs multiple devices
From: Goffredo Baroncelli @ 2019-11-28 17:58 UTC
  To: Chris Murphy; +Cc: Btrfs BTRFS

On 28/11/2019 01.42, Chris Murphy wrote:
> On Tue, Nov 26, 2019 at 11:07 PM Goffredo Baroncelli <kreijack@inwind.it> wrote:
>>
[...]
>> Could you enable the debug, doing
>>
>> set pager=1
>> set debug=all
>
> I need to narrow the scope. Adding 'set debug=all', there's just way
> too much to video, minutes of pages just holding down space bar full
> time which is even too fast to video. There must be over 1000 pages, a
> tiny minority contain efidisk.c references, the vast majority are
> btrfs.c references. As many pages as there are, I was never able to
> stop right on a boundary between efidisk.c and btrfs.c. So I gave up
> on that approach.

If I remember correctly, in a previous email you reported that even a
simple "ls" at the grub prompt raises an error. So you could watch what
happens when doing something simpler, like "ls" or "ls (hd0)".

> [...]

-- 
gpg @keyserver.linux.it: Goffredo Baroncelli <kreijackATinwind.it>
Key fingerprint BBF5 1610 0B64 DAC6 5F7D 17B2 0EDA 9B37 8B82 E0B5
* Re: GRUB bug with Btrfs multiple devices
From: Chris Murphy @ 2019-11-28 20:05 UTC
  To: Goffredo Baroncelli; +Cc: Chris Murphy, Btrfs BTRFS

On Thu, Nov 28, 2019 at 10:58 AM Goffredo Baroncelli <kreijack@inwind.it> wrote:
>
> [...]
> If I remember correctly, in a previous email you reported that even a
> simple "ls" at the grub prompt raises an error. So you could watch
> what happens when doing something simpler, like "ls" or "ls (hd0)".

Errors with only ls.
https://photos.app.goo.gl/BJpsLvwpL6yf19uj6

Errors with ls per device
https://photos.app.goo.gl/pgxQDdj1JDjq86mZ9

But without rebooting, just repeating the ls for the same devices, I
don't get the error for hd4 again.
https://photos.app.goo.gl/M6yraHfgfAsMigaP8

From the first ls, it shows GPT on hd5; shouldn't 'ls (hd5)' report
GPT rather than no file system? gdisk finds no problem with the GPT on
/dev/sda, which is hd5.

-- 
Chris Murphy
* Re: GRUB bug with Btrfs multiple devices
From: Goffredo Baroncelli @ 2019-11-28 21:57 UTC
  To: Chris Murphy; +Cc: Btrfs BTRFS

On 28/11/2019 21.05, Chris Murphy wrote:
> [...]
> Errors with only ls.
> https://photos.app.goo.gl/BJpsLvwpL6yf19uj6

It seems that my supposition is true: the problem exists independently
of Btrfs. It would be useful to see the debug output (set debug=all +
set pager=1) when doing "ls". It is not such a huge amount of
information (although it spans a few pages).

> Errors with ls per device
> https://photos.app.goo.gl/pgxQDdj1JDjq86mZ9

GRUB sees hd0..hd3 as disks of ~120GB; to be exact, the size is
125753602048 bytes. The error is reported as being unable to access
sector 0xea3bfc8, which is located at 0xea3bf00*512=125753491456 bytes,
which is less than the previous value...

It seems that GRUB is correct in complaining. It is trying to access a
valid disk location which returns an error.
Why is GRUB trying to access this location? My supposition is that GRUB
is trying to probe a filesystem (or a partition type...).

The problem seems to be related to the first 4 disks, which all have
the same size and are "phantom" disks...
Maybe the problem is that GRUB incorrectly detects disks?

> But without rebooting, just repeating the ls for the same devices, I
> don't get the error for hd4 again.
> https://photos.app.goo.gl/M6yraHfgfAsMigaP8

My understanding is that GRUB tried to load some external modules (zfs,
ufs2, ...) without success. However, this attempt was made only the
first time. This could explain why the error appeared only once.

> From the first ls, it shows GPT on hd5, shouldn't 'ls (hd5)' report
> GPT rather than no file system? gdisk finds no problem with the GPT on
> /dev/sda which is hd5.

It seems not:

-----------------------------------------------------------------------
                        GNU GRUB  version 2.03

   Minimal BASH-like line editing is supported. For the first word, TAB
   lists possible command completions. Anywhere else TAB lists possible
   device or file completions.

grub> ls
(proc) (hd11) (hd13) (hd14) (hd15) (hd19) (hd20) (hd31) (hd31,msdos1)
(hd32) (hd32,msdos2) (hd32,msdos1) (hd51) (hd52) (hd53) (hd61) (hd62)
(hd63) (hd64) (hd71) (hd72) (hd73) (hd74) (hd99) (hd99,gpt2)
(hd99,gpt1) (host) (md/0)
grub> ls (hd99)
Device hd99: No known filesystem detected - Sector size 512B - Total
size 10485760KiB
grub>
-----------------------------------------------------------------------

-- 
gpg @keyserver.linux.it: Goffredo Baroncelli <kreijackATinwind.it>
Key fingerprint BBF5 1610 0B64 DAC6 5F7D 17B2 0EDA 9B37 8B82 E0B5
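The sector arithmetic in the message above can be rechecked with a few lines of Python. The message names sector 0xea3bfc8 but multiplies 0xea3bf00, so this sketch evaluates both; which one actually appeared on screen cannot be settled from the text. The 512-byte sector size is the one the message assumes, and the function names are only illustrative.

```python
SECTOR_SIZE = 512                # bytes per sector, as assumed in the message
DISK_SIZE_BYTES = 125753602048   # size GRUB reports for hd0..hd3


def sector_offset(sector: int, sector_size: int = SECTOR_SIZE) -> int:
    """Byte offset at which the given sector starts."""
    return sector * sector_size


def sectors_before_end(sector: int,
                       disk_size: int = DISK_SIZE_BYTES,
                       sector_size: int = SECTOR_SIZE) -> int:
    """How many sectors lie between this sector and the end of the disk."""
    return disk_size // sector_size - sector


if __name__ == "__main__":
    for sector in (0xEA3BFC8, 0xEA3BF00):
        print(hex(sector), sector_offset(sector), sectors_before_end(sector))
```

Under these assumptions both candidate sectors start below the reported disk size (16 and 216 sectors before the end, respectively), so the conclusion that the read targets a valid location holds for either value.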
* Re: GRUB bug with Btrfs multiple devices
From: Chris Murphy @ 2019-11-29 17:57 UTC
  To: Goffredo Baroncelli; +Cc: Chris Murphy, Btrfs BTRFS

On Thu, Nov 28, 2019 at 2:57 PM Goffredo Baroncelli <kreijack@inwind.it> wrote:
>
> It seems that my supposition is true: the problem exists independently
> of Btrfs. It would be useful to see the debug output (set debug=all +
> set pager=1) when doing "ls". It is not such a huge amount of
> information (although it spans a few pages).

OK, I did debug=all on the grub command line instead of in the
grub.cfg, and it's much more manageable.
https://photos.app.goo.gl/75Lbobg39R4D9QUk6

It's a very strange coincidence that these errors only began soon
after the Btrfs volume became a two device fs. I forgot to mention
that while grub.cfg is on hfsplus, Fedora GRUB now uses blscfg.mod by
default, which goes looking for BLS snippets, which happen to be in
/boot/loader/entries, which is on Btrfs. So even drawing the GRUB menu
does in fact need to read from the 2 device Btrfs.

> GRUB sees hd0..hd3 as disks of ~120GB; to be exact, the size is
> 125753602048 bytes. The error is reported as being unable to access
> sector 0xea3bfc8, which is located at 0xea3bf00*512=125753491456
> bytes, which is less than the previous value...

Looks to me that hd0, hd1, hd2, hd3, hd4 are all phantom devices. hd5
is the SSD, /dev/sda. cd0 is the empty dvd-rom drive.

> [...]
> My understanding is that GRUB tried to load some external modules
> (zfs, ufs2, ...) without success. However, this attempt was made only
> the first time. This could explain why the error appeared only once.

These errors may be misleading because the Fedora grubx64.efi doesn't
contain those modules, and I've only copied a few GRUB modules from
/usr/lib/grub/x86_64-efi to /boot/efi/EFI/fedora/x86_64-efi

The default installation on Fedora doesn't copy external modules to
the ESP at all, so only the ones already in the grubx64.efi are
available.

-- 
Chris Murphy
* Re: GRUB bug with Btrfs multiple devices
From: Goffredo Baroncelli @ 2019-11-29 19:54 UTC
  To: Chris Murphy; +Cc: Btrfs BTRFS

On 29/11/2019 18.57, Chris Murphy wrote:
> [...]
> Looks to me that hd0, hd1, hd2, hd3, hd4 are all phantom devices. hd5
> is the SSD, /dev/sda. cd0 is the empty dvd-rom drive.

Based on this information, it seems that when "ls" is run the errors
come from the fact that:
- hd0..hd3 return errors when read (even before the end of the device)
- hd4 returns an error because its size is 0 (as reported by GRUB)

However, Btrfs does not seem to be related to these errors.

Regarding the errors at boot time: my hypothesis is that while loading
the kernel, GRUB tries (but I don't know why) to read from hd0..hd4
and gets an error.

Unfortunately the video is not available anymore. Could you kindly
share pictures of the loading of the kernel/initramdisk? Something like:

grub> set debug=all
grub> initrd /boot/initrd....

I hope that the errors come quickly. I don't think that we need
pictures of the whole load; pictures up to the first (or better, the
second) error would be sufficient.

BR
G.Baroncelli

> [...]

-- 
gpg @keyserver.linux.it: Goffredo Baroncelli <kreijackATinwind.it>
Key fingerprint BBF5 1610 0B64 DAC6 5F7D 17B2 0EDA 9B37 8B82 E0B5
* Re: GRUB bug with Btrfs multiple devices
From: Chris Murphy @ 2019-11-29 21:17 UTC
  To: Goffredo Baroncelli; +Cc: Btrfs BTRFS

On Fri, Nov 29, 2019 at 12:54 PM Goffredo Baroncelli <kreijack@inwind.it> wrote:
> Could you kindly share pictures of the loading of the
> kernel/initramdisk? Something like:
>
> grub> set debug=all
> grub> initrd /boot/initrd....
>
> I hope that the errors come quickly. I don't think that we need
> pictures of the whole load; pictures up to the first (or better, the
> second) error would be sufficient.

I paged through it for minutes, hundreds of pages, and never found any
errors. But these are the first pages. This might actually be some
kind of search, not a load of the kernel, because I pressed tab to
autocomplete. But it didn't autocomplete; it immediately started
spitting out debug pages.

https://photos.app.goo.gl/kpa7dJ9spAy29yj26

Is it possible to redirect grub debug output to a FAT file?

-- 
Chris Murphy
* Re: GRUB bug with Btrfs multiple devices
From: Andrei Borzenkov @ 2019-11-30  7:33 UTC
  To: Chris Murphy, Goffredo Baroncelli; +Cc: Btrfs BTRFS

30.11.2019 00:17, Chris Murphy wrote:
>
> Is it possible to redirect grub debug output to a FAT file?
>

No.
* Re: GRUB bug with Btrfs multiple devices
From: Goffredo Baroncelli @ 2019-11-30  8:12 UTC
  To: Chris Murphy; +Cc: Btrfs BTRFS

On 29/11/2019 22.17, Chris Murphy wrote:
> [...]
> I paged through it for minutes, hundreds of pages, and never found any
> errors. But these are the first pages. This might actually be some
> kind of search, not a load of the kernel, because I pressed tab to
> autocomplete. But it didn't autocomplete; it immediately started
> spitting out debug pages.
>
> https://photos.app.goo.gl/kpa7dJ9spAy29yj26
>
> Is it possible to redirect grub debug output to a FAT file?

It is possible to redirect it to a serial console...
Does the machine have a serial port?

-- 
gpg @keyserver.linux.it: Goffredo Baroncelli <kreijackATinwind.it>
Key fingerprint BBF5 1610 0B64 DAC6 5F7D 17B2 0EDA 9B37 8B82 E0B5
* Re: GRUB bug with Btrfs multiple devices
From: Chris Murphy @ 2019-11-30 16:38 UTC
  To: Goffredo Baroncelli; +Cc: Chris Murphy, Btrfs BTRFS

On Sat, Nov 30, 2019 at 1:12 AM Goffredo Baroncelli <kreijack@inwind.it> wrote:
>
> [...]
> It is possible to redirect it to a serial console...
> Does the machine have a serial port?

USB and wired ethernet only.

So far I'm unable to reproduce in a VM with 2 partitions used for a
2 device Btrfs. It might be a multi-layer bug where the 1st bug must
happen before the 2nd one has a chance of being revealed. The 1st bug
being the issue of phantom devices, which *are* present when the Btrfs
is a single device volume, but none of the errors showed up in the
GRUB/pre-boot environment until the 2nd device was added (and a new
kernel installed).

It's too bad GRUB doesn't have a debug option to write a file to a FAT
file system. The btrfs debug output is extremely long.

-- 
Chris Murphy
* Re: GRUB bug with Btrfs multiple devices
  2019-11-26 23:53 ` Chris Murphy
  2019-11-27 1:35 ` Chris Murphy
@ 2019-11-27 6:09 ` Goffredo Baroncelli
  2019-11-29 20:50 ` Andrei Borzenkov
  2 siblings, 0 replies; 23+ messages in thread
From: Goffredo Baroncelli @ 2019-11-27 6:09 UTC (permalink / raw)
To: Chris Murphy; +Cc: Btrfs BTRFS

On 27/11/2019 00.53, Chris Murphy wrote:
> On Tue, Nov 26, 2019 at 2:11 PM Goffredo Baroncelli <kreijack@libero.it> wrote:
>>
>> On 26/11/2019 05.05, Chris Murphy wrote:
>>> grub2-efi-x64-2.02-100.fc31.x86_64
>>> kernel-5.3.13-300.fc31.x86_64
>>>
>>> I've seen this before, so it isn't a regression in either of the above
>>> versions. But I'm also not certain when the regression occurred,
>>> because the last time I tested Btrfs multiple devices (specifically
>>> data single profile), was years ago and I didn't run into this.
>>
>> From the video, it seems that GRUB complains about a "failure reading". However, GRUB is able to boot, and because the profiles are "single" (no redundancy), it seems to be a "false positive" error.
>>
>> When I added RAID5/6 support to grub, I remember seeing errors like the ones you showed. However, that was a year ago, so my memory may be wrong.
>> I noticed that GRUB tests a lot of disks (hd0 ... hd3). Could you be so kind as to share the disk layout? Most errors are something like "failure reading sector 0xXX". However, I can't read the XX number: could you tell us which number "XX" is? It seems to be 0x80... but my eyes are bad and your video is even worse :-)
>
> It was a dark room and the shaky cam was hunting for focus :-D It's 0x80.
>
> The storage is one CD-ROM drive and one SSD drive. That's it. So I
> don't know why there's hd2 and hd3, it seems like GRUB is confused
> about how many drives there are, but that pre-dates this problem.

If these drives are phantom ones, they could be the root of the problem...

>
>
>> I think the errors are due to the "rescan" logic (see grub commit [1]). Could you try a more recent grub (2.04 instead of 2.02)?
>
> Yes, Fedora Rawhide has 2.04 in it, so I'll give that a shot next time
> I rebuild this particular laptop, which should be relatively soon; or
> even maybe I can reproduce this problem in a VM with two virtio
> devices.
>
>

-- 
gpg @keyserver.linux.it: Goffredo Baroncelli <kreijackATinwind.it>
Key fingerprint BBF5 1610 0B64 DAC6 5F7D 17B2 0EDA 9B37 8B82 E0B5
* Re: GRUB bug with Btrfs multiple devices
  2019-11-26 23:53 ` Chris Murphy
  2019-11-27 1:35 ` Chris Murphy
  2019-11-27 6:09 ` Goffredo Baroncelli
@ 2019-11-29 20:50 ` Andrei Borzenkov
  2019-11-29 21:11 ` Chris Murphy
  2 siblings, 1 reply; 23+ messages in thread
From: Andrei Borzenkov @ 2019-11-29 20:50 UTC (permalink / raw)
To: Chris Murphy, Goffredo Baroncelli; +Cc: Btrfs BTRFS

27.11.2019 02:53, Chris Murphy wrote:
>
> The storage is one CD-ROM drive and one SSD drive. That's it. So I
> don't know why there's hd2 and hd3, it seems like GRUB is confused
> about how many drives there are, but that pre-dates this problem.
>

grub enumerates what EFI provides. What does "lsefi" in grub say?
* Re: GRUB bug with Btrfs multiple devices
  2019-11-29 20:50 ` Andrei Borzenkov
@ 2019-11-29 21:11 ` Chris Murphy
  2019-11-30 7:31 ` Andrei Borzenkov
  0 siblings, 1 reply; 23+ messages in thread
From: Chris Murphy @ 2019-11-29 21:11 UTC (permalink / raw)
To: Andrei Borzenkov; +Cc: Chris Murphy, Goffredo Baroncelli, Btrfs BTRFS

On Fri, Nov 29, 2019 at 1:50 PM Andrei Borzenkov <arvidjaar@gmail.com> wrote:
>
> 27.11.2019 02:53, Chris Murphy wrote:
> >
> > The storage is one CD-ROM drive and one SSD drive. That's it. So I
> > don't know why there's hd2 and hd3, it seems like GRUB is confused
> > about how many drives there are, but that pre-dates this problem.
> >
>
> grub enumerates what EFI provides. What does "lsefi" in grub say?

https://photos.app.goo.gl/pBxLJNdbzz6J9Vo56

-- 
Chris Murphy
* Re: GRUB bug with Btrfs multiple devices
  2019-11-29 21:11 ` Chris Murphy
@ 2019-11-30 7:31 ` Andrei Borzenkov
  2019-11-30 16:31 ` Chris Murphy
  0 siblings, 1 reply; 23+ messages in thread
From: Andrei Borzenkov @ 2019-11-30 7:31 UTC (permalink / raw)
To: Chris Murphy; +Cc: Goffredo Baroncelli, Btrfs BTRFS

30.11.2019 00:11, Chris Murphy wrote:
> On Fri, Nov 29, 2019 at 1:50 PM Andrei Borzenkov <arvidjaar@gmail.com> wrote:
>>
>> 27.11.2019 02:53, Chris Murphy wrote:
>>>
>>> The storage is one CD-ROM drive and one SSD drive. That's it. So I
>>> don't know why there's hd2 and hd3, it seems like GRUB is confused
>>> about how many drives there are, but that pre-dates this problem.
>>>
>>
>> grub enumerates what EFI provides. What does "lsefi" in grub say?
>
> https://photos.app.goo.gl/pBxLJNdbzz6J9Vo56
>

These are vendor media device path handles that are children of (some)
disk partitions. GRUB already tries to skip such handles:


      /* Ghosts proudly presented by Apple. */
      if (GRUB_EFI_DEVICE_PATH_TYPE (dp) == GRUB_EFI_MEDIA_DEVICE_PATH_TYPE
          && GRUB_EFI_DEVICE_PATH_SUBTYPE (dp)
          == GRUB_EFI_VENDOR_MEDIA_DEVICE_PATH_SUBTYPE)
        {
          grub_efi_vendor_device_path_t *vendor =
            (grub_efi_vendor_device_path_t *) dp;
          const struct grub_efi_guid apple = GRUB_EFI_VENDOR_APPLE_GUID;

          if (vendor->header.length == sizeof (*vendor)
              && grub_memcmp (&vendor->vendor_guid, &apple,
                              sizeof (vendor->vendor_guid)) == 0
              && find_parent_device (devices, d))
            continue;
        }

but these have a different GUID. A Google search turns up something
that still hints at Apple (like
https://www.macos86.it/topic/1136-asus-x202e-hm76-vs-n56vb-hm76/page/2/?tab=comments#comment-31186).
Device paths look like

PciRoot(0x0)\Pci(0x1F,0x2)\Sata(0x0,0xFFFF,0x0)\HD(4,GPT,A640EF60-F7E9-4945-81A9-B04CCE53EE97,0x176F4800,0x482FC88)\VenMedia(BE74FCF7-0B7C-49F3-9147-01F4042E6842,4F20CFA89785973FAAF730597BFC41BA)

where the vendor GUID is BE74FCF7-0B7C-49F3-9147-01F4042E6842

So we have a hard disk, then a partition as its child, and then this
vendor media as a child of the partition.

This should certainly be reported to the grub list. What system is it - is
it Apple?
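The ghost-handle check discussed in this message keys on the node type, the node subtype, and the vendor GUID. As a rough, standalone illustration (not a GRUB patch), the sketch below tests a device path node against a caller-supplied list of GUIDs, using the GUID from the VenMedia(...) path reported in this thread. All `sketch_*` names and types are hypothetical stand-ins, not GRUB's own definitions. The type and subtype values (0x04, 0x03) follow the UEFI device path layout. The length test is relaxed to "at least header plus GUID" because the reported node carries extra vendor data; whether that is the right condition for GRUB is an assumption.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for the UEFI device path node layout.
   These are NOT GRUB's types. */
typedef struct {
  uint32_t data1;
  uint16_t data2;
  uint16_t data3;
  uint8_t  data4[8];
} sketch_guid_t;

typedef struct {
  uint8_t  type;
  uint8_t  subtype;
  uint16_t length;   /* total node length in bytes, header included */
} sketch_dp_header_t;

typedef struct {
  sketch_dp_header_t header;
  sketch_guid_t      vendor_guid;
  /* vendor-defined data may follow */
} sketch_vendor_dp_t;

#define SKETCH_MEDIA_DEVICE_PATH_TYPE 0x04
#define SKETCH_VENDOR_MEDIA_SUBTYPE   0x03

/* BE74FCF7-0B7C-49F3-9147-01F4042E6842, as seen in the lsefi output. */
const sketch_guid_t sketch_observed_guid = {
  0xBE74FCF7, 0x0B7C, 0x49F3,
  { 0x91, 0x47, 0x01, 0xF4, 0x04, 0x2E, 0x68, 0x42 }
};

/* Field-wise comparison, so struct padding never matters. */
bool sketch_guid_equal(const sketch_guid_t *a, const sketch_guid_t *b)
{
  return a->data1 == b->data1
      && a->data2 == b->data2
      && a->data3 == b->data3
      && memcmp(a->data4, b->data4, sizeof a->data4) == 0;
}

/* Returns true when dp is a vendor media node whose GUID appears in
   ghosts[0..n_ghosts). dp must point at the header of a full node. */
bool sketch_is_ghost_node(const sketch_dp_header_t *dp,
                          const sketch_guid_t *ghosts, size_t n_ghosts)
{
  if (dp->type != SKETCH_MEDIA_DEVICE_PATH_TYPE
      || dp->subtype != SKETCH_VENDOR_MEDIA_SUBTYPE)
    return false;

  /* Assumption: accept nodes that carry trailing vendor data. */
  if (dp->length < sizeof(sketch_vendor_dp_t))
    return false;

  const sketch_vendor_dp_t *vendor = (const sketch_vendor_dp_t *) dp;
  for (size_t i = 0; i < n_ghosts; i++)
    if (sketch_guid_equal(&vendor->vendor_guid, &ghosts[i]))
      return true;
  return false;
}
```

A caller would pass the header of each enumerated node together with the list of GUIDs to ignore. The quoted GRUB code additionally requires that a parent device exists before skipping the handle, which this sketch leaves out.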
* Re: GRUB bug with Btrfs multiple devices
  2019-11-30 7:31 ` Andrei Borzenkov
@ 2019-11-30 16:31 ` Chris Murphy
  2019-11-30 17:02 ` Andrei Borzenkov
  0 siblings, 1 reply; 23+ messages in thread
From: Chris Murphy @ 2019-11-30 16:31 UTC (permalink / raw)
To: Andrei Borzenkov; +Cc: Chris Murphy, Goffredo Baroncelli, Btrfs BTRFS

On Sat, Nov 30, 2019 at 12:31 AM Andrei Borzenkov <arvidjaar@gmail.com> wrote:
>
> 30.11.2019 00:11, Chris Murphy wrote:
> > On Fri, Nov 29, 2019 at 1:50 PM Andrei Borzenkov <arvidjaar@gmail.com> wrote:
> >>
> >> 27.11.2019 02:53, Chris Murphy wrote:
> >>>
> >>> The storage is one CD-ROM drive and one SSD drive. That's it. So I
> >>> don't know why there's hd2 and hd3, it seems like GRUB is confused
> >>> about how many drives there are, but that pre-dates this problem.
> >>>
> >>
> >> grub enumerates what EFI provides. What does "lsefi" in grub say?
> >
> > https://photos.app.goo.gl/pBxLJNdbzz6J9Vo56
> >
>
> These are vendor media device path handles that are children of (some)
> disk partitions. GRUB already tries to skip such handles:
>
>
>       /* Ghosts proudly presented by Apple. */
>       if (GRUB_EFI_DEVICE_PATH_TYPE (dp) == GRUB_EFI_MEDIA_DEVICE_PATH_TYPE
>           && GRUB_EFI_DEVICE_PATH_SUBTYPE (dp)
>           == GRUB_EFI_VENDOR_MEDIA_DEVICE_PATH_SUBTYPE)
>         {
>           grub_efi_vendor_device_path_t *vendor =
>             (grub_efi_vendor_device_path_t *) dp;
>           const struct grub_efi_guid apple = GRUB_EFI_VENDOR_APPLE_GUID;
>
>           if (vendor->header.length == sizeof (*vendor)
>               && grub_memcmp (&vendor->vendor_guid, &apple,
>                               sizeof (vendor->vendor_guid)) == 0
>               && find_parent_device (devices, d))
>             continue;
>         }
>
> but these have a different GUID. A Google search turns up something
> that still hints at Apple (like
> https://www.macos86.it/topic/1136-asus-x202e-hm76-vs-n56vb-hm76/page/2/?tab=comments#comment-31186).
> Device paths look like
>
> PciRoot(0x0)\Pci(0x1F,0x2)\Sata(0x0,0xFFFF,0x0)\HD(4,GPT,A640EF60-F7E9-4945-81A9-B04CCE53EE97,0x176F4800,0x482FC88)\VenMedia(BE74FCF7-0B7C-49F3-9147-01F4042E6842,4F20CFA89785973FAAF730597BFC41BA)
>
> where the vendor GUID is BE74FCF7-0B7C-49F3-9147-01F4042E6842
>
> So we have a hard disk, then a partition as its child, and then this
> vendor media as a child of the partition.
>
> This should certainly be reported to the grub list. What system is it - is
> it Apple?

Yes. Macbook Pro 8,2 (2011). I'll report the phantom device problem to
grub-devel@

But still an open question is what's the instigator or secondary
factor, because this wasn't happening before adding an unused but
already existing partition as a 2nd Btrfs device. Last time this
happened, all I did was remove the 2nd device and the problem went
away. I'm ready to try that again (remove the 2nd device) and see if
the problem goes away, but has enough information been collected about
the present state?

-- 
Chris Murphy
* Re: GRUB bug with Btrfs multiple devices
  2019-11-30 16:31 ` Chris Murphy
@ 2019-11-30 17:02 ` Andrei Borzenkov
  2019-11-30 17:14 ` Chris Murphy
  0 siblings, 1 reply; 23+ messages in thread
From: Andrei Borzenkov @ 2019-11-30 17:02 UTC (permalink / raw)
To: Chris Murphy; +Cc: Goffredo Baroncelli, Btrfs BTRFS

30.11.2019 19:31, Chris Murphy wrote:
> On Sat, Nov 30, 2019 at 12:31 AM Andrei Borzenkov <arvidjaar@gmail.com> wrote:
>>
>> 30.11.2019 00:11, Chris Murphy wrote:
>>> On Fri, Nov 29, 2019 at 1:50 PM Andrei Borzenkov <arvidjaar@gmail.com> wrote:
>>>>
>>>> 27.11.2019 02:53, Chris Murphy wrote:
>>>>>
>>>>> The storage is one CD-ROM drive and one SSD drive. That's it. So I
>>>>> don't know why there's hd2 and hd3, it seems like GRUB is confused
>>>>> about how many drives there are, but that pre-dates this problem.
>>>>>
>>>>
>>>> grub enumerates what EFI provides. What does "lsefi" in grub say?
>>>
>>> https://photos.app.goo.gl/pBxLJNdbzz6J9Vo56
>>>
>>
>> These are vendor media device path handles that are children of (some)
>> disk partitions. GRUB already tries to skip such handles:
>>
>>
>>       /* Ghosts proudly presented by Apple. */
>>       if (GRUB_EFI_DEVICE_PATH_TYPE (dp) == GRUB_EFI_MEDIA_DEVICE_PATH_TYPE
>>           && GRUB_EFI_DEVICE_PATH_SUBTYPE (dp)
>>           == GRUB_EFI_VENDOR_MEDIA_DEVICE_PATH_SUBTYPE)
>>         {
>>           grub_efi_vendor_device_path_t *vendor =
>>             (grub_efi_vendor_device_path_t *) dp;
>>           const struct grub_efi_guid apple = GRUB_EFI_VENDOR_APPLE_GUID;
>>
>>           if (vendor->header.length == sizeof (*vendor)
>>               && grub_memcmp (&vendor->vendor_guid, &apple,
>>                               sizeof (vendor->vendor_guid)) == 0
>>               && find_parent_device (devices, d))
>>             continue;
>>         }
>>
>> but these have a different GUID. A Google search turns up something
>> that still hints at Apple (like
>> https://www.macos86.it/topic/1136-asus-x202e-hm76-vs-n56vb-hm76/page/2/?tab=comments#comment-31186).
>> Device paths look like
>>
>> PciRoot(0x0)\Pci(0x1F,0x2)\Sata(0x0,0xFFFF,0x0)\HD(4,GPT,A640EF60-F7E9-4945-81A9-B04CCE53EE97,0x176F4800,0x482FC88)\VenMedia(BE74FCF7-0B7C-49F3-9147-01F4042E6842,4F20CFA89785973FAAF730597BFC41BA)
>>
>> where the vendor GUID is BE74FCF7-0B7C-49F3-9147-01F4042E6842
>>
>> So we have a hard disk, then a partition as its child, and then this
>> vendor media as a child of the partition.
>>
>> This should certainly be reported to the grub list. What system is it - is
>> it Apple?
>
> Yes. Macbook Pro 8,2 (2011). I'll report the phantom device problem to
> grub-devel@
>
> But still an open question is what's the instigator or secondary
> factor, because this wasn't happening before adding an unused but
> already existing partition as a 2nd Btrfs device.

GRUB normally uses hints: grub-install (and grub-mkconfig) tries to
guess the firmware device name. At boot time grub tries to access the
hinted device first; if that succeeds, it does not try anything else.
With a second btrfs partition, grub needs to find the second device at
boot time, so it now probes everything and hits those vendor media
devices.

At least this explains what you see as well as ...

> Last time this
> happened, all I did was remove the 2nd device and the problem went
> away.

... this.

If you go into the grub shell in this state (without errors), do you see
those ghost devices?

> I'm ready to try that again (remove the 2nd device) and see if
> the problem goes away, but has enough information been collected about
> the present state?
>
>

If you are reasonably sure that all errors are related to those phantom
devices - I would say yes, the reason for these phantom devices to exist
is already clear.
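The hint-first behaviour described in this message can be pictured with a toy model. The sketch below is not GRUB code: `sketch_dev_t`, `sketch_find_device`, the device names, and the identifiers are all hypothetical, and a NULL identifier simply stands in for a device whose reads fail (such as the phantom vendor media handles). It only shows why a lookup that can use a hint touches one device, while a lookup without a usable hint walks every enumerated device and trips over the unreadable ones on the way.

```c
#include <stddef.h>
#include <string.h>

/* Toy model of a firmware-enumerated disk; hypothetical, not a GRUB type. */
typedef struct {
  const char *name;    /* e.g. "hd1,gpt4" */
  const char *dev_id;  /* NULL models a device whose reads fail */
} sketch_dev_t;

/* Looks for the device carrying dev_id. Returns its index, or -1.
   *probes counts devices touched, *read_errors counts failed reads. */
int sketch_find_device(const sketch_dev_t *devs, size_t n,
                       const char *hint, const char *dev_id,
                       size_t *probes, size_t *read_errors)
{
  *probes = 0;
  *read_errors = 0;

  /* Step 1: try only the hinted device, if a hint was recorded. */
  if (hint != NULL) {
    for (size_t i = 0; i < n; i++) {
      if (strcmp(devs[i].name, hint) != 0)
        continue;
      (*probes)++;
      if (devs[i].dev_id == NULL)
        (*read_errors)++;
      else if (strcmp(devs[i].dev_id, dev_id) == 0)
        return (int) i;   /* hint worked, nothing else is touched */
      break;
    }
  }

  /* Step 2: no usable hint, so probe every enumerated device. */
  for (size_t i = 0; i < n; i++) {
    (*probes)++;
    if (devs[i].dev_id == NULL) {
      (*read_errors)++;   /* a phantom device produces an error here */
      continue;
    }
    if (strcmp(devs[i].dev_id, dev_id) == 0)
      return (int) i;
  }
  return -1;
}
```

In this model, a single-device volume with a correct hint costs one probe and no errors. A second member without a hint forces the full scan, which matches the pattern reported in the thread: errors are printed for the phantom devices, yet the real device is still found and the boot succeeds.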
* Re: GRUB bug with Btrfs multiple devices
  2019-11-30 17:02 ` Andrei Borzenkov
@ 2019-11-30 17:14 ` Chris Murphy
  2019-11-30 17:34 ` Chris Murphy
  0 siblings, 1 reply; 23+ messages in thread
From: Chris Murphy @ 2019-11-30 17:14 UTC (permalink / raw)
To: Andrei Borzenkov; +Cc: Chris Murphy, Goffredo Baroncelli, Btrfs BTRFS

On Sat, Nov 30, 2019 at 10:02 AM Andrei Borzenkov <arvidjaar@gmail.com> wrote:
>
> GRUB normally uses hints: grub-install (and grub-mkconfig) tries to
> guess the firmware device name. At boot time grub tries to access the
> hinted device first; if that succeeds, it does not try anything else.
> With a second btrfs partition, grub needs to find the second device at
> boot time, so it now probes everything and hits those vendor media
> devices.
>
> At least this explains what you see as well as ...
>
> > Last time this
> > happened, all I did was remove the 2nd device and the problem went
> > away.
>
> ... this.

Ahhh, that makes complete sense. So it is Btrfs multiple device
related, but not a bug in btrfs.c per se.

>
> If you go into the grub shell in this state (without errors), do you see
> those ghost devices?

Uncertain. My vague recollection is that yes, they are there, because
I found their existence strange and different compared to pre-GRUB
2.02, where on this same system I'd see only either hd0 or hd1 (one
without the other), along with cd0. But something changed, either with
a firmware update from Apple or with GRUB, that resulted in additional
GRUB devices: hd2, hd3, hd4, hd5.

> > I'm ready to try that again (remove the 2nd device) and see if
> > the problem goes away, but has enough information been collected about
> > the present state?
> >
> >
>
> If you are reasonably sure that all errors are related to those phantom
> devices - I would say yes, the reason for these phantom devices to exist
> is already clear.

I'll give it a shot in a bit.

-- 
Chris Murphy
* Re: GRUB bug with Btrfs multiple devices
  2019-11-30 17:14 ` Chris Murphy
@ 2019-11-30 17:34 ` Chris Murphy
  0 siblings, 0 replies; 23+ messages in thread
From: Chris Murphy @ 2019-11-30 17:34 UTC (permalink / raw)
To: Chris Murphy; +Cc: Andrei Borzenkov, Goffredo Baroncelli, Btrfs BTRFS

On Sat, Nov 30, 2019 at 10:14 AM Chris Murphy <lists@colorremedies.com> wrote:
>
> On Sat, Nov 30, 2019 at 10:02 AM Andrei Borzenkov <arvidjaar@gmail.com> wrote:
> >
> > GRUB normally uses hints: grub-install (and grub-mkconfig) tries to
> > guess the firmware device name. At boot time grub tries to access the
> > hinted device first; if that succeeds, it does not try anything else.
> > With a second btrfs partition, grub needs to find the second device at
> > boot time, so it now probes everything and hits those vendor media
> > devices.
> >
> > At least this explains what you see as well as ...
> >
> > > Last time this
> > > happened, all I did was remove the 2nd device and the problem went
> > > away.
> >
> > ... this.
>
> Ahhh, that makes complete sense. So it is Btrfs multiple device
> related, but not a bug in btrfs.c per se.
>
> >
> > If you go into the grub shell in this state (without errors), do you see
> > those ghost devices?
>
> Uncertain. My vague recollection is that yes, they are there, because
> I found their existence strange and different compared to pre-GRUB
> 2.02, where on this same system I'd see only either hd0 or hd1 (one
> without the other), along with cd0. But something changed, either with
> a firmware update from Apple or with GRUB, that resulted in additional
> GRUB devices: hd2, hd3, hd4, hd5.

OK, my vague memory is correct with respect to the phantom devices
still being present after Btrfs device removal.

>
> > > I'm ready to try that again (remove the 2nd device) and see if
> > > the problem goes away, but has enough information been collected about
> > > the present state?
> > >
> > >
> >
> > If you are reasonably sure that all errors are related to those phantom
> > devices - I would say yes, the reason for these phantom devices to exist
> > is already clear.
>
> I'll give it a shot in a bit.

Yep, the errors no longer happen, but the phantom devices are still
there. I've posted to grub-devel@ and updated it with this latest
information.

-- 
Chris Murphy