Linux-BTRFS Archive on lore.kernel.org
* Does GRUB btrfs support log tree?
From: Chris Murphy @ 2019-10-25  9:47 UTC
  To: Btrfs BTRFS

I see references to root and chunk trees, but not the log tree.

If boot-related files (kernel, initramfs, bootloader configuration
files) are stored on Btrfs, and they are changed in such a way as
to rely on the log tree, and then there's a crash, what's the
worst-case effect?

At first glance, if the bootloader doesn't support the log tree, it would
have a stale view of the file system. Since log tree writes mean a
full file system update hasn't happened, the old file system state
hasn't been dereferenced, so even in an SSD + discard case, the system
should still be bootable. At that point the Btrfs kernel code does log
replay and catches the system up, and the next boot will use the
new state.

Correct?

-- 
Chris Murphy

* Re: Does GRUB btrfs support log tree?
From: Chris Murphy @ 2019-10-25  9:50 UTC
  To: Btrfs BTRFS

On Fri, Oct 25, 2019 at 11:47 AM Chris Murphy <lists@colorremedies.com> wrote:
>
> I see references to root and chunk trees, but not the log tree.
>
> If boot related files: kernel, initramfs, bootloader configuration
> files, are stored on Btrfs; and if they are changed in such a way as
> to rely on the log tree; and then there's a crash; what's the worse
> case scenario effect?
>
> At first glance, if the bootloader doesn't support log tree, it would
> have a stale view of the file system. Since log tree writes means a
> full file system update hasn't happened, the old file system state
> hasn't been dereferenced, so even in an SSD + discard case, the system
> should still be bootable. And at that point Btrfs kernel code does log
> replay, and catches the system up, and the next update will boot the
> new state.
>
> Correct?

Pretty sure this is the current and self-contained Btrfs code for GRUB
http://git.savannah.gnu.org/cgit/grub.git/tree/grub-core/fs/btrfs.c

-- 
Chris Murphy

* Re: Does GRUB btrfs support log tree?
From: Andrei Borzenkov @ 2019-10-26  7:12 UTC
  To: Chris Murphy, Btrfs BTRFS

25.10.2019 12:47, Chris Murphy wrote:
> I see references to root and chunk trees, but not the log tree.
> 
> If boot related files: kernel, initramfs, bootloader configuration
> files, are stored on Btrfs; and if they are changed in such a way as
> to rely on the log tree; and then there's a crash; what's the worse
> case scenario effect?
> 
> At first glance, if the bootloader doesn't support log tree, it would
> have a stale view of the file system.

Yes, happened to me several times on ext4.

> Since log tree writes means a
> full file system update hasn't happened, the old file system state
> hasn't been dereferenced, so even in an SSD + discard case, the system
> should still be bootable. And at that point Btrfs kernel code does log
> replay, and catches the system up, and the next update will boot the
> new state.
> 
> Correct?
> 

Yes. If we speak about grub here, it actually tries very hard to ensure
writes have hit the disk (it fsyncs files as it writes them and it flushes
raw devices). But I guess that fsync on btrfs just goes into the log and
does not force a transaction commit. Is it possible to force a transaction
commit on btrfs from user space?

* Re: Does GRUB btrfs support log tree?
From: Chris Murphy @ 2019-10-27 20:05 UTC
  To: Andrei Borzenkov; +Cc: Chris Murphy, Btrfs BTRFS

On Sat, Oct 26, 2019 at 9:12 AM Andrei Borzenkov <arvidjaar@gmail.com> wrote:
>
> 25.10.2019 12:47, Chris Murphy wrote:
> > I see references to root and chunk trees, but not the log tree.
> >
> > If boot related files: kernel, initramfs, bootloader configuration
> > files, are stored on Btrfs; and if they are changed in such a way as
> > to rely on the log tree; and then there's a crash; what's the worse
> > case scenario effect?
> >
> > At first glance, if the bootloader doesn't support log tree, it would
> > have a stale view of the file system.
>
> Yes, happened to me several times on ext4.

Yeah, I have a reproducer on XFS, but was only able to get it to happen
once on ext4 and not again after half a dozen attempts.

>
> > Since log tree writes means a
> > full file system update hasn't happened, the old file system state
> > hasn't been dereferenced, so even in an SSD + discard case, the system
> > should still be bootable. And at that point Btrfs kernel code does log
> > replay, and catches the system up, and the next update will boot the
> > new state.
> >
> > Correct?
> >
>
> Yes. If we speak about grub here, it actually tries very hard to ensure
> writes has hit disk (it fsyncs files as it writes them and it flushes
> raw devices). But I guess that fsync on btrfs just goes into log and
> does not force transaction. Is it possible to force transaction on btrfs
> from user space?

The only fsync I ever see Fedora's grub2-mkconfig do is for grubenv.
The grub.cfg is not fsync'd. When I do a strace of grub2-mkconfig,
it's so incredibly complicated. Using -ff -o options, I get over 1800
separate PID files exported. From what I can tell, it creates a brand
new file "grub.cfg.new" and writes to that. Then does a cat from
"grub.cfg.new" into "grub.cfg" - maybe it's file system specific
behavior, I'm not sure.

I'm pretty sure "sync" will do what you want: it calls syncfs() and,
as best I can tell, it does a full file system sync and doesn't use the
log tree. I'd argue grub-mkconfig should write all of its files, and
then sync that file system, rather than doing any fsync at all.



-- 
Chris Murphy

* Re: Does GRUB btrfs support log tree?
From: David Sterba @ 2019-11-04 19:34 UTC
  To: Chris Murphy; +Cc: Andrei Borzenkov, Btrfs BTRFS

On Sun, Oct 27, 2019 at 09:05:54PM +0100, Chris Murphy wrote:
> > > Since log tree writes means a
> > > full file system update hasn't happened, the old file system state
> > > hasn't been dereferenced, so even in an SSD + discard case, the system
> > > should still be bootable. And at that point Btrfs kernel code does log
> > > replay, and catches the system up, and the next update will boot the
> > > new state.
> > >
> > > Correct?
> > >
> >
> > Yes. If we speak about grub here, it actually tries very hard to ensure
> > writes has hit disk (it fsyncs files as it writes them and it flushes
> > raw devices). But I guess that fsync on btrfs just goes into log and
> > does not force transaction. Is it possible to force transaction on btrfs
> > from user space?

* sync/syncfs
* the ioctl BTRFS_IOC_SYNC (calls syncfs)
* ioctls BTRFS_IOC_START_SYNC + BTRFS_IOC_WAIT_SYNC
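
The first option in the list above can be sketched from user space roughly as follows. This is only a sketch: the function name sync_filesystem is made up for the illustration, and it assumes the C library exports syncfs() so that it can be reached through ctypes. If it does not, the sketch falls back to a system-wide sync. The ioctl variants are not shown.

```python
import ctypes
import os


def sync_filesystem(path):
    """Ask the kernel to sync the whole filesystem that contains `path`.

    Returns the name of the mechanism that was used ("syncfs" or "sync").
    """
    fd = os.open(path, os.O_RDONLY)
    try:
        try:
            # Assumption: the running libc exports syncfs().
            libc = ctypes.CDLL(None, use_errno=True)
            syncfs = libc.syncfs
        except (OSError, AttributeError):
            # No syncfs() available: fall back to a system-wide sync.
            os.sync()
            return "sync"
        if syncfs(fd) != 0:
            err = ctypes.get_errno()
            raise OSError(err, os.strerror(err))
        return "syncfs"
    finally:
        os.close(fd)
```

Calling sync_filesystem("/boot") after all boot files have been written would be the "write everything, then sync once" pattern discussed in this thread.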

> The only fsync I ever see Fedora's grub2-mkconfig do is for grubenv.
> The grub.cfg is not fsync'd. When I do a strace of grub2-mkconfig,
> it's so incredibly complicated. Using -ff -o options, I get over 1800
> separate PID files exported. From what I can tell, it creates a brand
> new file "grub.cfg.new" and writes to that. Then does a cat from
> "grub.cfg.new" into "grub.cfg" - maybe it's file system specific
> behavior, I'm not sure.
> 
> I'm pretty sure "sync" will do what you want, it calls syncfs() and
> best as I can tell it does a full file system sync, doesn't use the
> log tree. I'd argue grub-mkconfig should write all of its files, and
> then sync that file system, rather than doing any fsync at all.

This would work in most cases. I'm not sure, but the update does not
seem to be atomic, i.e. a state where either all old kernels match the
old grub.cfg, or the new kernels match the new cfg.

Even if there are no fsyncs and just the final sync, some other activity
in the filesystem can trigger a sync between the updates of the kernels
and grub.cfg. Like this:

start:

- kernel1
- grub.cfg (v1)

update:

- add kernel2
- remove kernel1
- <something calls sync>
- update grub.cfg (v2)
- grub calls sync

If the crash happens after that intermediate sync and before the grub.cfg
update, kernel1 won't be reachable and kernel2 won't be in the grub.cfg.
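
The sequence above can be replayed with a toy model. This is a simplification, not how Btrfs works internally: it assumes that only a sync makes state durable and that a crash falls back to the last synced state as a whole. All names in it are invented for the illustration.

```python
def boot_view_after_crash(steps, crash_after):
    """Replay `steps` and return the last state that a sync made durable.

    Each step is either the string "sync" or a function that mutates the
    in-memory state. `crash_after` is the number of steps that complete
    before the crash.
    """
    live = {"kernels": {"kernel1"}, "grub_cfg": "kernel1"}
    durable = {"kernels": set(live["kernels"]), "grub_cfg": live["grub_cfg"]}
    for step in steps[:crash_after]:
        if step == "sync":
            # Snapshot the live state: this is what survives a crash.
            durable = {"kernels": set(live["kernels"]),
                       "grub_cfg": live["grub_cfg"]}
        else:
            step(live)
    return durable


def add_kernel2(state):
    state["kernels"].add("kernel2")


def remove_kernel1(state):
    state["kernels"].discard("kernel1")


def update_grub_cfg(state):
    state["grub_cfg"] = "kernel2"


def is_bootable(state):
    # Bootable means grub.cfg points at a kernel that still exists.
    return state["grub_cfg"] in state["kernels"]


update = [add_kernel2, remove_kernel1, "sync", update_grub_cfg, "sync"]
for n in range(len(update) + 1):
    print(n, is_bootable(boot_view_after_crash(update, n)))
```

In this model the only unbootable results come from a crash between the intermediate sync and the final one, which is the window described above.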

* Re: Does GRUB btrfs support log tree?
From: Chris Murphy @ 2019-11-11 19:37 UTC
  To: David Sterba, Chris Murphy, Andrei Borzenkov, Btrfs BTRFS

On Mon, Nov 4, 2019 at 7:34 PM David Sterba <dsterba@suse.cz> wrote:
>
> On Sun, Oct 27, 2019 at 09:05:54PM +0100, Chris Murphy wrote:
> > > > Since log tree writes means a
> > > > full file system update hasn't happened, the old file system state
> > > > hasn't been dereferenced, so even in an SSD + discard case, the system
> > > > should still be bootable. And at that point Btrfs kernel code does log
> > > > replay, and catches the system up, and the next update will boot the
> > > > new state.
> > > >
> > > > Correct?
> > > >
> > >
> > > Yes. If we speak about grub here, it actually tries very hard to ensure
> > > writes has hit disk (it fsyncs files as it writes them and it flushes
> > > raw devices). But I guess that fsync on btrfs just goes into log and
> > > does not force transaction. Is it possible to force transaction on btrfs
> > > from user space?
>
> * sync/syncfs
> * the ioctl BTRFS_IOC_SYNC (calls syncfs)
> * ioctls BTRFS_IOC_START_SYNC + BTRFS_IOC_WAIT_SYNC
>
> > The only fsync I ever see Fedora's grub2-mkconfig do is for grubenv.
> > The grub.cfg is not fsync'd. When I do a strace of grub2-mkconfig,
> > it's so incredibly complicated. Using -ff -o options, I get over 1800
> > separate PID files exported. From what I can tell, it creates a brand
> > new file "grub.cfg.new" and writes to that. Then does a cat from
> > "grub.cfg.new" into "grub.cfg" - maybe it's file system specific
> > behavior, I'm not sure.
> >
> > I'm pretty sure "sync" will do what you want, it calls syncfs() and
> > best as I can tell it does a full file system sync, doesn't use the
> > log tree. I'd argue grub-mkconfig should write all of its files, and
> > then sync that file system, rather than doing any fsync at all.
>
> This would work in most cases. I'm not sure, but the update does not
> seem to be atomic. Ie. all old kernels match the old grub.cfg, or there
> are new kernels that match the new cfg.
>
> Even if there's not fsyncs and just the final sync, some other activity
> in the filesystem can do the sync before between updates of kernels and
> grub.cfg. Like this
>
> start:
>
> - kernel1
> - grub.cfg (v1)
>
> update:
>
> - add kernel2
> - remove kernel1
> - <something calls sync>
> - update grub.cfg (v2)
> - grub calls sync
>
> If the crash happens after sync and before update, kernel1 won't be
> reachable and kernel2 won't be in the grub.cfg.

Right. It's probably a bad practice to remove the fallback kernel,
which would be variably defined depending on the distribution, unless
the method of updating the kernel is atomic by design, proven by
testing.

In the single kernel case it could be done atomically with generic
filenaming, i.e. vmlinuz and initramfs, no versioning in the filename,
and a static bootloader configuration that's never updated, only ever
looks for vmlinuz and initramfs. The update would write out
vmlinuz.new and initramfs.new, and then sync. And then rename()
vmlinuz.new vmlinuz, and initramfs.new initramfs. Since it's two
files, it's not strictly atomic, likely more than one sector changes.
But it might be good enough?
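
A rough sketch of that staged update, assuming generic file names and a boot directory that sits on a single file system. The function name replace_boot_files is hypothetical, and os.sync() is used here as the full sync. As noted above, the two renames are still two separate steps, so this is not atomic across both files.

```python
import os


def replace_boot_files(boot_dir, new_contents):
    """Stage every file as NAME.new, sync, then rename each over NAME.

    `new_contents` maps a file name such as "vmlinuz" to its new bytes.
    """
    for name, data in new_contents.items():
        with open(os.path.join(boot_dir, name + ".new"), "wb") as f:
            f.write(data)
    # Full sync so the staged files are durable before any rename.
    os.sync()
    for name in new_contents:
        staged = os.path.join(boot_dir, name + ".new")
        # rename() swaps in the new file in a single step per file.
        os.replace(staged, os.path.join(boot_dir, name))
    # Second sync so the renames themselves reach the disk.
    os.sync()
```

A crash between the two renames would leave a new vmlinuz next to an old initramfs, which is the remaining gap.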

I'm not really sure what the best practice is though. I asked about
this in a UEFI, EFI System partition (and thus FAT) context and it
seems like there really aren't any atomicity guarantees possible at
all, which is a bit troubling. About the only way to do it is like on
Android, with A and B partitions that hold the kernel and initramfs as
blobs rather than as files on a file system, and a partition attribute
that tells the bootloader which of A or B has priority, with the other
being the fallback.
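
To make the A/B idea concrete, here is a small model of priority-based slot selection. It is not Android's actual attribute layout: the Slot fields and both function names are assumptions made for this sketch.

```python
from dataclasses import dataclass


@dataclass
class Slot:
    name: str
    priority: int      # higher value is tried first
    bootable: bool     # False if the image is missing or known bad


def boot_order(slots):
    """Return the names of bootable slots, highest priority first."""
    usable = [s for s in slots if s.bootable]
    ordered = sorted(usable, key=lambda s: s.priority, reverse=True)
    return [s.name for s in ordered]


def switch_to(slots, name):
    """Give `name` the highest priority; other slots remain as fallback."""
    top = max(s.priority for s in slots)
    for s in slots:
        if s.name == name:
            s.priority = top + 1
```

An update would write the new blobs to the inactive slot first and call switch_to() only afterwards, so the only change that has to be atomic is the priority flip.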

Anyway, the lack of a generic (file system independent) way to handle
this use case is actually a bit concerning.

-- 
Chris Murphy

* Re: Does GRUB btrfs support log tree?
From: Goffredo Baroncelli @ 2019-11-12 20:04 UTC
  To: Chris Murphy, David Sterba, Andrei Borzenkov, Btrfs BTRFS

On 11/11/2019 20.37, Chris Murphy wrote:
> Anyway, the lack of a generic (file system independent) way to handle
> this use case is actually a bit concerning.

I think that a simpler approach would be to develop a GRUB fs, where it is the kernel that adapts to the needs of GRUB...
So we can lower the requirements...

The GRUB-fs should have the following main requirements:
- allow an atomicity guarantee
- allow a multi-disk setup
- allow grub to update some files (grubenv comes to mind first)
- it should require a simple implementation (easy to port to multiple systems, which basically means linux, *bsd and solaris ?)
- speed should not be important


Anyway GRUB on BTRFS suffers from a big limitation: GRUB can't update the grubenv file; and until GRUB learns how to update a COW filesystem, this limit will be impossible to avoid (*)

G.Baroncelli


(*) Even though implementing the update of a NOCSUM file should not be so difficult...



* Re: Does GRUB btrfs support log tree?
From: Chris Murphy @ 2019-11-13 17:00 UTC
  To: Goffredo Baroncelli
  Cc: Chris Murphy, David Sterba, Andrei Borzenkov, Btrfs BTRFS

On Tue, Nov 12, 2019 at 8:04 PM Goffredo Baroncelli <kreijack@libero.it> wrote:
>
> On 11/11/2019 20.37, Chris Murphy wrote:
> > Anyway, the lack of a generic (file system independent) way to handle
> > this use case is actually a bit concerning.
>
> I think that a more simpler approach would be developing a GRUB fs, where is the kernel to be adapted to the needing of GRUB...
> So we can lowering the requirement...

I do really agree with this. It seems like a neat idea that a
bootloader can just read any file system, but when it cannot have a
true/complete view of the file system state because it just flat out
ignores critical parts of the file system? Egads.


> The GRUB-fs should have the following main requirements:
> - allow the atomicity guarantee
> - allow molti-disk setup
> - allow grub to update some file (grubenv come me as first)
> - it should require a simple implementation (easy to porting to multiple system, which basically means linux, *bsd and solaris ?)
> - the speed should be not important

Plausibly we're most of the way there already, adapting the existing
"BIOS Boot" partition.

>
>
> Anyway GRUB on BTRFS suffers of a big limitation: GRUB can't update the grubenv file; and until GRUB will learn how update a COW filesystem, this limit will be impossible to avoid (*)

Yep. And I've discussed it with XFS and ext4 devs and they're not keen
on anything writing into file system space outside of their (kernel or
user space repair tool) code, which is a reasonable concern. XFS
doesn't have inline extents yet, but it's proposed. ext4 does have
inline extents I think but not enabled by default, and I also think it
takes a non-default inode size to support the ~1KiB typical grubenv
file size: but inline extents would be subject to metadata
checksumming, same as on Btrfs. So in effect, there are valid use
cases that are, or may soon become, invalid for grubenv use as
currently implemented, on the most common Linux file systems.

> (*) Even tough implementing the update of a NOCSUM file should be not so difficult...

So far I've seen 1KiB grubenv is pretty much always an inline extent
on Btrfs. Even if flagged nocow it ends up being subject to leaf
checksum. If GRUB modifies this grubenv, now that whole leaf is
invalid which could mean data loss for things not even related to the
grubenv, depending on what else is in the leaf.
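
The effect can be shown with a toy leaf. This is a deliberate simplification: Btrfs's real leaf layout and checksum algorithm differ, the leaf size and field widths below are arbitrary, and zlib.crc32 only stands in for the real checksum. The point is just that one checksum covers everything stored in the block.

```python
import zlib

LEAF_SIZE = 16384      # assumed leaf size for the illustration
CSUM_SIZE = 4          # room reserved here for the checksum


def seal_leaf(payload):
    """Build a leaf: checksum field followed by the zero-padded payload."""
    body = payload.ljust(LEAF_SIZE - CSUM_SIZE, b"\0")
    return zlib.crc32(body).to_bytes(CSUM_SIZE, "little") + body


def leaf_is_valid(leaf):
    """Recompute the checksum over the body and compare with the stored one."""
    stored = int.from_bytes(leaf[:CSUM_SIZE], "little")
    return stored == zlib.crc32(leaf[CSUM_SIZE:])


grubenv = b"# GRUB Environment Block\nsaved_entry=0\n"
other_item = b"unrelated inline file data"
leaf = seal_leaf(other_item + grubenv)

# An in-place writer that does not know about the checksum changes one byte.
patched = bytearray(leaf)
offset = leaf.index(b"saved_entry=0") + len(b"saved_entry=")
patched[offset:offset + 1] = b"1"
```

After the patch, leaf_is_valid(bytes(patched)) is false, so a reader that verifies the checksum would reject the whole block, including other_item, which was never touched.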

I mean, GRUB is very cool in many ways, but it's so complicated that
maintaining it all I think is a real challenge and concern. And then
on top of that, the various distributions actively fork it into their
own mutually incompatible flavors. It's like GRUB is a set of LEGOs
and everyone can really optionally build their own whatever.

-- 
Chris Murphy

* Re: Does GRUB btrfs support log tree?
From: Goffredo Baroncelli @ 2019-11-13 18:54 UTC
  To: Chris Murphy; +Cc: David Sterba, Andrei Borzenkov, Btrfs BTRFS

On 13/11/2019 18.00, Chris Murphy wrote:
>> The GRUB-fs should have the following main requirements:
>> - allow the atomicity guarantee
>> - allow molti-disk setup
>> - allow grub to update some file (grubenv come me as first)
>> - it should require a simple implementation (easy to porting to multiple system, which basically means linux, *bsd and solaris ?)
>> - the speed should be not important
> Plausibly we're most of the way there already, adapting the existing
> "BIOS Boot" partition.
> 
Unfortunately the BIOS Boot partition (which basically means FAT) doesn't have support for "atomicity" or multidisk.

BR
G.Baroncelli

-- 
gpg @keyserver.linux.it: Goffredo Baroncelli <kreijackATinwind.it>
Key fingerprint BBF5 1610 0B64 DAC6 5F7D  17B2 0EDA 9B37 8B82 E0B5

* Re: Does GRUB btrfs support log tree?
From: Chris Murphy @ 2019-11-13 21:50 UTC
  To: Goffredo Baroncelli
  Cc: Chris Murphy, David Sterba, Andrei Borzenkov, Btrfs BTRFS

On Wed, Nov 13, 2019 at 6:54 PM Goffredo Baroncelli <kreijack@inwind.it> wrote:
>
> On 13/11/2019 18.00, Chris Murphy wrote:
> >> The GRUB-fs should have the following main requirements:
> >> - allow the atomicity guarantee
> >> - allow molti-disk setup
> >> - allow grub to update some file (grubenv come me as first)
> >> - it should require a simple implementation (easy to porting to multiple system, which basically means linux, *bsd and solaris ?)
> >> - the speed should be not important
> > Plausibly we're most of the way there already, adapting the existing
> > "BIOS Boot" partition.
> >
> Unfortunately the BIOS Boot partition (which means basically FAT), doesn't have support for "atomicity" nor multidisk..

It's definitely not FAT. It's a blob of space owned by the bootloader.
No file system at all. As far as I know only the BIOS variant of GRUB
uses it. But GRUB does have a way of detecting core.img on it, and
avoids overwriting it by preferring to write in free space within that
partition, ostensibly to support multiple instances of GRUB (multiple
distributions), and some degree of atomicity as the core.img is
written first to this partition before the boot.img or "jump code" is
written in the first 440 bytes of the MBR.

Obviously this is BIOS specific, which is also x86 specific. So it
needs to grow to be more arch and firmware agnostic. But it's so
simple it might actually be more practical than alternatives like a
new file system or building a transactional based FAT.

I'm sorta annoyed with the UEFI spec using FAT, having not solved the
problem of atomic updating of the EFI System partition. But we could
agree to only use the EFI System partition for the sole purpose of the
firmware loading an EFI file system driver, immediately allowing the
firmware to read/write to a more reliable file system.

www.datalight.com/assets/files/secure/resources/Where%20Does%20FAT%20Fail%202016.pdf
https://elinux.org/images/5/54/Elc2011_munegowda.pdf

Those PDFs are kind of interesting.




-- 
Chris Murphy

* Re: Does GRUB btrfs support log tree?
From: Andrei Borzenkov @ 2019-11-14  8:18 UTC
  To: Chris Murphy; +Cc: Goffredo Baroncelli, David Sterba, Btrfs BTRFS

On Thu, Nov 14, 2019 at 12:50 AM Chris Murphy <lists@colorremedies.com> wrote:
>
> On Wed, Nov 13, 2019 at 6:54 PM Goffredo Baroncelli <kreijack@inwind.it> wrote:
> >
> > On 13/11/2019 18.00, Chris Murphy wrote:
> > >> The GRUB-fs should have the following main requirements:
> > >> - allow the atomicity guarantee
> > >> - allow molti-disk setup
> > >> - allow grub to update some file (grubenv come me as first)
> > >> - it should require a simple implementation (easy to porting to multiple system, which basically means linux, *bsd and solaris ?)
> > >> - the speed should be not important
> > > Plausibly we're most of the way there already, adapting the existing
> > > "BIOS Boot" partition.
> > >
> > Unfortunately the BIOS Boot partition (which means basically FAT), doesn't have support for "atomicity" nor multidisk..
>
> It's definitely not FAT. It's a blob of space owned by the bootloader.
> No file system at all. As far as I know only the BIOS variant of GRUB
> uses it.

And only on GPT.

> But GRUB does have a way of detecting core.img on it, and

No. GRUB does not "detect" core.img at all. On Legacy BIOS, the stage0 code
in the MBR includes the hardcoded absolute disk location of core.img (as a list
of extents). Stage0 does not care whether this location is the post-MBR
gap, a BIOS boot partition or a file inside another file system; it simply
loads absolute disk blocks and jumps to the loaded code.
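
As an illustration of such a loader, here is a sketch of reading a hardcoded extent list from a disk image. It is not GRUB's code. The sector numbers, the 512-byte sector size and the name load_blocklist are assumptions for the example.

```python
import io

SECTOR = 512


def load_blocklist(disk, extents):
    """Read (start_sector, sector_count) extents in order and join them."""
    image = bytearray()
    for start, count in extents:
        disk.seek(start * SECTOR)
        image += disk.read(count * SECTOR)
    return bytes(image)


# A fake 64-sector disk with a "core.img" placed at sectors 34..37.
disk = io.BytesIO(bytes(64 * SECTOR))
core_v1 = b"A" * (4 * SECTOR)
disk.seek(34 * SECTOR)
disk.write(core_v1)
recorded_extents = [(34, 4)]
```

load_blocklist(disk, recorded_extents) returns whatever bytes currently sit at those sectors. There is no name lookup and no metadata check, so a partial in-place overwrite is loaded just the same.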

> avoids overwriting it by preferring to write in free space within that
> partition, ostensibly to support multiple instances of GRUB (multiple
> distributions),

Sorry? What are you talking about? grub itself (code executed at boot
time) does not write anything anywhere except very limited support for
environment block. grub-install simply writes either to post-MBR gap
or to BIOS Boot partition; it has absolutely no way to reliably detect
presence of "another" core.img there. BIOS Boot partition does not
have any metadata at all.

> and some degree of atomicity as the core.img is
> written first to this partition before the boot.img or "jump code" is
> written in the first 440 bytes of the MBR.
>

core.img must match the block list recorded in the MBR; as soon as core.img is
overwritten in place, you cannot guarantee that whatever stage0 will
read matches what has been written if the stage0 update was aborted for
whatever reason.

> Obviously this is BIOS specific, which is also x86 specific. So it
> needs to grow to be more arch and firmware agnostic. But it's so
> simple it might actually be more practical than alternatives like a
> new file system or building a transactional based FAT.
>
> I'm sorta annoyed with the UEFI spec using FAT, having not solved the
> problem of atomic updating of the EFI System partition. But we could
> agree to only use the EFI System partition for the sole purpose of the
> firmware loading an EFI file system driver, immediately allowing the
> firmware to read/write to a more reliable file system.
>

This is outside the scope of EFI, really. GRUB consists of two parts -
the kernel (which is implicitly embedded in core.img/core.efi) and
loadable modules. They must match. So to ensure an atomic update on any
architecture one has to

1. Write new core.img.
2. Write new /boot/grub/$platform content (new modules).
3. Switch boot information to use new version.

On EFI this would simply mean writing grubx64.efi with a different name
or location on the ESP and then updating the EFI boot variable to point to it.
Like

\EFI\vendor\image1\grubx64.efi
\EFI\vendor\image2\grubx64.efi

If you want, make it alternate between two independent ESPs for
additional redundancy.

/boot/grub/$platform is more involved, as a lot of code in grub2
assumes the location is always under /boot/grub ($prefix more precisely).
SUSE had to introduce the concept of "mounting" subvolumes on btrfs as a
quick hack to overcome it.

On Legacy BIOS, having two copies of core.img is even more involved, as it
likely really needs some primitive filesystem to manage multiple
copies.

> www.datalight.com/assets/files/secure/resources/Where%20Does%20FAT%20Fail%202016.pdf
> https://elinux.org/images/5/54/Elc2011_munegowda.pdf
>
> Those PDFs are kind interesting.
>

* Re: Does GRUB btrfs support log tree?
From: Chris Murphy @ 2019-11-17 23:24 UTC
  To: Andrei Borzenkov
  Cc: Chris Murphy, Goffredo Baroncelli, David Sterba, Btrfs BTRFS

On Thu, Nov 14, 2019 at 9:19 AM Andrei Borzenkov <arvidjaar@gmail.com> wrote:
>
> On Thu, Nov 14, 2019 at 12:50 AM Chris Murphy <lists@colorremedies.com> wrote:
> > But GRUB does have a way of detecting core.img on it, and
>
> No. GRUB does not "detect" core.img at all. On Legacy BIOS stage0 code
> in MBR includes hardcoded absolute disk location of core.img (as list
> of extents). Stage0 does not care whether this location is post-MBR
> gap, BIOS boot partition or file inside another file system, it simply
> loads absolute disk blocks and jumps to loaded code.
>
> > avoids overwriting it by preferring to write in free space within that
> > partition, ostensibly to support multiple instances of GRUB (multiple
> > distributions),
>
> Sorry? What are you talking about?

grub-install does this, at least it's what someone on grub-devel@ list
told me ages ago.


> > and some degree of atomicity as the core.img is
> > written first to this partition before the boot.img or "jump code" is
> > written in the first 440 bytes of the MBR.
> >
>
> core.img must match block list recorded in MBR; as soon as core.img is
> overwritten in-place you cannot guarantee that whatever stage0 will
> read matches what has been written if stage0 update was aborted for
> whatever reasons.

Yeah what I was told is grub-install tries not to overwrite core.img.
Obviously if it's unavoidable, the update can't be atomic.


> This is outside of scope of EFI, really. GRUB consists of two parts -
> kernel (which is implicitly embedded in core.img/core.efi) and
> loadable modules. They must match. So to ensure atomic update on any
> architecture one has to
>
> 1. Write new core.img.
> 2. Write new /boot/grub/$platform content (new modules).
> 3. Switch boot information to use new version.

The need to update modules is sort of a problem for atomicity too.


>
> On EFI this would simple mean to write grubx64.efi with different name
> or location on ESP and then update EFI boot variable to point to it.
> Like
>
> \EFI\vendor\image1\grubx64.efi
> \EFI\vendor\image2\grubx64.efi

Yes, although I'm not a fan of writes to NVRAM. They really should be
minimized. It's not high wear NVRAM, and no way to replace it if it
starts crapping out.

>
> If you want make it alternate between two independent ESP for
> additional redundancy.
>
> /boot/grub/$platform is more involved, as a lot of code in grub2
> assumes location is always under /boot/grub ($prefix more precisely).
> SUSE had to introduce concept of "mounting" subvolumes on btrfs as
> quick hack to overcome it.
>
> On Legacy BIOS having two copy of core.img even more involved as it
> likely really needs some primitive filesystem to manage multiple
> copies.

GRUB is just too complicated. I'd really like to see it simplified and
for sure no longer depend on os-prober. When distributions are
considering things like petitboot, and Linux as a bootloader,
something's gotten too complicated.


-- 
Chris Murphy
