* raid1 root device with efi
@ 2024-03-08 20:39 Matt Zagrabelny
  2024-03-08 21:02 ` Qu Wenruo
  2024-03-08 21:46 ` Matthew Warren
  0 siblings, 2 replies; 14+ messages in thread
From: Matt Zagrabelny @ 2024-03-08 20:39 UTC (permalink / raw)
  To: Btrfs BTRFS

Greetings,

I've read some conflicting info online about the best way to have a
raid1 btrfs root device.

I've got two disks, with identical partitioning and I tried the
following scenario (call it scenario 1):

partition 1: EFI
partition 2: btrfs RAID1 (/)

There are some docs that claim that the above is possible and others
that say you need the following scenario, call it scenario 2:

partition 1: EFI
partition 2: MD RAID1 (/boot)
partition 3: btrfs RAID1 (/)

What do folks think? Is the first scenario possible? Or is the
second setup the preferred way to achieve a btrfs RAID1 root
filesystem?

The reason I ask is that I followed a guide (for scenario 1) and
rebooted the computer after each step to verify that things worked.
After I finished the whole guide, I unplugged one of the disks (with
the system off) and the BIOS could no longer find the disk. I then
plugged the disk back in and the BIOS could still not find the disk,
so something is amiss.

Thanks for any commentary and help!

-m

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: raid1 root device with efi
  2024-03-08 20:39 raid1 root device with efi Matt Zagrabelny
@ 2024-03-08 21:02 ` Qu Wenruo
  2024-03-08 21:46 ` Matthew Warren
  1 sibling, 0 replies; 14+ messages in thread
From: Qu Wenruo @ 2024-03-08 21:02 UTC (permalink / raw)
  To: Matt Zagrabelny, Btrfs BTRFS



On 2024/3/9 07:09, Matt Zagrabelny wrote:
> Greetings,
>
> I've read some conflicting info online about the best way to have a
> raid1 btrfs root device.
>
> I've got two disks, with identical partitioning and I tried the
> following scenario (call it scenario 1):
>
> partition 1: EFI
> partition 2: btrfs RAID1 (/)

This really depends on how you want to set up your bootloader.

In my case, I hate GRUB2 so much that I always go with:

- Partition 1: EFI (also as /boot)
   This holds the EFI bootloader (systemd-boot) along with the
   kernel and its initramfs.

   The disadvantage is, if you go LUKS, your kernel/initramfs will never
   be encrypted, thus exposing some risk (e.g. an attacker with
   physical access to your system could implant a rootkit into your
   initramfs and steal your credentials)

- Partition 2: Whatever you want
   It can be btrfs RAID*, or even other fs over LUKS over LVM etc.
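A minimal /etc/fstab sketch of that layout (the UUIDs, mount options, and
the subvolume name are illustrative placeholders, not taken from any real
system):

```
# Partition 1: ESP doubles as /boot, holding systemd-boot + kernel + initramfs
UUID=XXXX-XXXX                             /boot  vfat   umask=0077         0  2
# Partition 2: e.g. btrfs RAID1 as root
UUID=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx  /      btrfs  defaults,subvol=@  0  0
```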


>
> There are some docs that claim that the above is possible and others
> that say you need the following scenario, call it scenario 2:
>
> partition 1: EFI
> partition 2: MD RAID1 (/boot)
> partition 3: btrfs RAID1 (/)

At least openSUSE Tumbleweed doesn't need a dedicated /boot, and it is
using GRUB2.

So I don't think a dedicated /boot is mandatory.

Thanks,
Qu

> [...]


* Re: raid1 root device with efi
  2024-03-08 20:39 raid1 root device with efi Matt Zagrabelny
  2024-03-08 21:02 ` Qu Wenruo
@ 2024-03-08 21:46 ` Matthew Warren
  2024-03-08 21:48   ` Matt Zagrabelny
  1 sibling, 1 reply; 14+ messages in thread
From: Matthew Warren @ 2024-03-08 21:46 UTC (permalink / raw)
  To: Matt Zagrabelny; +Cc: Btrfs BTRFS

On Fri, Mar 8, 2024 at 3:46 PM Matt Zagrabelny <mzagrabe@d.umn.edu> wrote:
>
> Greetings,
>
> I've read some conflicting info online about the best way to have a
> raid1 btrfs root device.
>
> I've got two disks, with identical partitioning and I tried the
> following scenario (call it scenario 1):
>
> partition 1: EFI
> partition 2: btrfs RAID1 (/)
>
> There are some docs that claim that the above is possible...

This is definitely possible. I use it on both my server and desktop with GRUB.

Matthew Warren


* Re: raid1 root device with efi
  2024-03-08 21:46 ` Matthew Warren
@ 2024-03-08 21:48   ` Matt Zagrabelny
  2024-03-08 21:54     ` Matthew Warren
  0 siblings, 1 reply; 14+ messages in thread
From: Matt Zagrabelny @ 2024-03-08 21:48 UTC (permalink / raw)
  To: Matthew Warren; +Cc: Btrfs BTRFS

Hi Qu and Matthew,

On Fri, Mar 8, 2024 at 3:46 PM Matthew Warren
<matthewwarren101010@gmail.com> wrote:
>
> On Fri, Mar 8, 2024 at 3:46 PM Matt Zagrabelny <mzagrabe@d.umn.edu> wrote:
> > [...]
>
> This is definitely possible. I use it on both my server and desktop with GRUB.

Are there any docs you follow for this setup?

Thanks for the info!

-m


* Re: raid1 root device with efi
  2024-03-08 21:48   ` Matt Zagrabelny
@ 2024-03-08 21:54     ` Matthew Warren
  2024-03-08 21:58       ` Matt Zagrabelny
  0 siblings, 1 reply; 14+ messages in thread
From: Matthew Warren @ 2024-03-08 21:54 UTC (permalink / raw)
  To: Matt Zagrabelny; +Cc: Btrfs BTRFS

On Fri, Mar 8, 2024 at 4:48 PM Matt Zagrabelny <mzagrabe@d.umn.edu> wrote:
>
> Hi Qu and Matthew,
>
> On Fri, Mar 8, 2024 at 3:46 PM Matthew Warren
> <matthewwarren101010@gmail.com> wrote:
> > [...]
> >
> > This is definitely possible. I use it on both my server and desktop with GRUB.
>
> Are there any docs you follow for this setup?
>
> Thanks for the info!
>
> -m

The most important thing is that mdadm has several metadata versions.
Versions 0.9 and 1.0 store the metadata at the end of the partition,
which lets UEFI see the filesystem as a plain ESP rather than mdadm
RAID.
https://raid.wiki.kernel.org/index.php/RAID_superblock_formats#Sub-versions_of_the_version-1_superblock

I followed the Arch Wiki when setting mine up:
https://wiki.archlinux.org/title/EFI_system_partition#ESP_on_software_RAID1
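Following that approach, the array creation step looks roughly like this
(a sketch only: the device names are examples, and these commands destroy
existing data on the partitions):

```shell
# metadata 1.0 puts the md superblock at the END of each member, so the
# firmware sees an ordinary FAT filesystem at the start of the partition:
mdadm --create /dev/md/esp --level=1 --raid-devices=2 --metadata=1.0 \
      /dev/sda1 /dev/sdb1
mkfs.fat -F32 /dev/md/esp
```

Note that the firmware writes to one member directly, bypassing md, so
the mirror can drift; the Arch Wiki page above discusses the caveats.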

Matthew Warren


* Re: raid1 root device with efi
  2024-03-08 21:54     ` Matthew Warren
@ 2024-03-08 21:58       ` Matt Zagrabelny
  2024-03-10 17:57         ` Forza
  0 siblings, 1 reply; 14+ messages in thread
From: Matt Zagrabelny @ 2024-03-08 21:58 UTC (permalink / raw)
  To: Matthew Warren; +Cc: Btrfs BTRFS

On Fri, Mar 8, 2024 at 3:54 PM Matthew Warren
<matthewwarren101010@gmail.com> wrote:
>
> [...]
>
> The main important thing is that mdadm has several metadata versions.
> Versions 0.9 and 1.0 store the metadata at the end of the partition
> which allows UEFI to think the filesystem is EFI rather than mdadm
> raid.
> https://raid.wiki.kernel.org/index.php/RAID_superblock_formats#Sub-versions_of_the_version-1_superblock
>
> I followed the arch wiki for setting it up, so here's what I followed.
> https://wiki.archlinux.org/title/EFI_system_partition#ESP_on_software_RAID1

Thanks for the hints. Hopefully there aren't any more unexpected issues.

Cheers!

-m


* Re: raid1 root device with efi
  2024-03-08 21:58       ` Matt Zagrabelny
@ 2024-03-10 17:57         ` Forza
  2024-03-11 14:34           ` Kai Krakow
  2024-03-27 20:21           ` Matt Zagrabelny
  0 siblings, 2 replies; 14+ messages in thread
From: Forza @ 2024-03-10 17:57 UTC (permalink / raw)
  To: Btrfs BTRFS; +Cc: Matthew Warren



On 2024-03-08 22:58, Matt Zagrabelny wrote:
> On Fri, Mar 8, 2024 at 3:54 PM Matthew Warren
> <matthewwarren101010@gmail.com> wrote:
>> [...]
>> The main important thing is that mdadm has several metadata versions.
>> Versions 0.9 and 1.0 store the metadata at the end of the partition
>> which allows UEFI to think the filesystem is EFI rather than mdadm
>> raid.
>> https://raid.wiki.kernel.org/index.php/RAID_superblock_formats#Sub-versions_of_the_version-1_superblock
>>
>> I followed the arch wiki for setting it up, so here's what I followed.
>> https://wiki.archlinux.org/title/EFI_system_partition#ESP_on_software_RAID1
> 
> Thanks for the hints. Hopefully there aren't any more unexpected issues.
> 
> Cheers!
> 
> -m
> 

An alternative to mdadm is to simply have separate ESP partitions on 
each device. You can manually copy the contents from one to the other 
whenever you update the EFI bootloader. This way you can keep the 
'other' ESP as a backup during GRUB/EFI updates.
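That manual copy can be wrapped in a tiny helper (a sketch; the function
name and the /efi, /efi2 paths in the usage line are made-up examples):

```shell
#!/bin/sh
# Mirror the contents of one ESP onto the other.
# Usage example (paths are illustrative): mirror_esp /efi /efi2
mirror_esp() {
    src="$1"
    dst="$2"
    mkdir -p "$dst"
    # cp -a preserves the directory layout, timestamps and attributes.
    cp -a "$src"/. "$dst"/
}
```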

This solution is what I use on one of my servers. GRUB2 supports Btrfs 
RAID1 so you do not need to have the kernel and initramfs on the ESP, 
though that works very well too.

Good Luck!

~Forza


* Re: raid1 root device with efi
  2024-03-10 17:57         ` Forza
@ 2024-03-11 14:34           ` Kai Krakow
  2024-03-11 19:26             ` Goffredo Baroncelli
  2024-03-27 20:21           ` Matt Zagrabelny
  1 sibling, 1 reply; 14+ messages in thread
From: Kai Krakow @ 2024-03-11 14:34 UTC (permalink / raw)
  To: Forza; +Cc: Btrfs BTRFS, Matthew Warren

Hello!

On Sun, 10 March 2024 at 19:18, Forza <forza@tnonline.net> wrote:
> >>>>> [...]
> >>>>> I've read some conflicting info online about the best way to have a
> >>>>> raid1 btrfs root device.

I think the main issue here that leads to conflicting ideas is:

Grub records the locations (or extent index) of the boot files during
re-configuration for non-trivial filesystems. If you later move the
files, or need to switch to the mirror, it will no longer be able to
read the boot files. Grub doesn't have a full btrfs implementation to
read all the metadata, nor does it know or detect the member devices
of the pool. So in this context, it supports btrfs raid1 under certain
conditions, if, and only if, just two devices are used, and the grub
device remains the same. If you add a third device, both raid1 stripes
for boot files may end up on devices of the pool that grub doesn't
consider. As an example, bees is known to mess up grub boot on btrfs
because it relocates the boot files without letting grub know:
https://github.com/Zygo/bees/issues/249

I'd argue that grub can only boot reliably from single-device btrfs,
and even then only if nothing moves the boot file extents without
re-configuring it. Grub only has very basic support for btrfs.

mdadm for the ESP is not supported for very similar reasons: the
firmware doesn't open the filesystem read-only, so its writes will
break the mirror.

The best way, as outlined in the thread already, is to have two ESPs,
put the kernel boot files in the ESP instead of in btrfs, and adjust
your kernel-install plugins to mirror the boot files to the other ESP
partition.
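One way to wire that up is a hypothetical kernel-install drop-in (the
plugin filename, the mount-point defaults, and the env-var overrides are
all assumptions, not an existing plugin):

```shell
#!/bin/sh
# Sketch of a hypothetical /etc/kernel/install.d/99-mirror-esp.install:
# after kernel-install writes the new kernel/initramfs to the primary
# ESP, mirror the whole tree to the backup ESP on the other disk.
# PRIMARY_ESP/BACKUP_ESP defaults are assumptions -- adjust to your mounts.
mirror_esp() {
    mkdir -p "${BACKUP_ESP:-/efi2}"
    cp -a "${PRIMARY_ESP:-/efi}"/. "${BACKUP_ESP:-/efi2}"/
}

# kernel-install passes the action ("add" or "remove") as the first argument.
case "${1:-}" in
    add|remove) mirror_esp ;;
esac
```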

Personally, I've got a USB stick where I keep a copy of my ESP created
with major configuration changes (e.g. major kernel update, boot
configuration changes), and the ESP is also included in my daily
backup. I keep blank reserve partitions on all other devices, which I
can copy the ESP to in case of disaster. This serves the additional
purpose of keeping some part of the devices trimmed for wear-leveling.


> [...]
>
> An alternative to mdadm is to simply have separate ESP partitions on
> each device. You can manually copy the contents between the two if you
> were to update the EFI bootloader. This way you can keep the 'other' ESP
> as backup during GRUB/EFI updates.
>
> This solution is what I use on one of my servers. GRUB2 supports Btrfs
> RAID1 so you do not need to have the kernel and initramfs on the ESP,
> though that works very well too.
>
> Good Luck!
>
> ~Forza

Regards,
Kai


* Re: raid1 root device with efi
  2024-03-11 14:34           ` Kai Krakow
@ 2024-03-11 19:26             ` Goffredo Baroncelli
  2024-03-12 12:39               ` Kai Krakow
  0 siblings, 1 reply; 14+ messages in thread
From: Goffredo Baroncelli @ 2024-03-11 19:26 UTC (permalink / raw)
  To: Kai Krakow, Forza; +Cc: Btrfs BTRFS, Matthew Warren

On 11/03/2024 15.34, Kai Krakow wrote:
> Hello!
> 
> On Sun, 10 March 2024 at 19:18, Forza <forza@tnonline.net> wrote:
>>>>>>> [...]
>>>>>>> I've read some conflicting info online about the best way to have a
>>>>>>> raid1 btrfs root device.
> 
> I think the main issue here that leads to conflicting ideas is:
> 
> Grub records the locations (or extent index) of the boot files during
> re-configuration for non-trivial filesystems. If you later move the
> files, or need to switch to the mirror, it will no longer be able to
> read the boot files. Grub doesn't have a full btrfs implementation to
> read all the metadata, nor does it know or detect the member devices
> of the pool.

I don't think what you describe is really accurate. Grub (in the non-UEFI version)
stores some code in the first 2MB of the disk (this is one of the reasons why fdisk by
default starts the first partition at the first MB of the disk). That code is mapped as
you wrote, and if you mess with this disk area grub gets confused.

And the btrfs grub module is stored in this area. After this module is loaded, grub
has full access to a btrfs partition.

The fact that in some conditions grub is no longer able to access a btrfs filesystem
has more to do with the immature btrfs implementation in grub.

I am quite sure that grub accesses a btrfs filesystem dynamically, without using a
pre-stored table with the location of a file.

To verify that, try to access a random file or directory in a btrfs location (e.g.
ls /bin) that is not related to the 'boot' process.

> [...]
> Regards,
> Kai
> 

-- 
gpg @keyserver.linux.it: Goffredo Baroncelli <kreijackATinwind.it>
Key fingerprint BBF5 1610 0B64 DAC6 5F7D  17B2 0EDA 9B37 8B82 E0B5



* Re: raid1 root device with efi
  2024-03-11 19:26             ` Goffredo Baroncelli
@ 2024-03-12 12:39               ` Kai Krakow
  2024-03-12 18:55                 ` Goffredo Baroncelli
  0 siblings, 1 reply; 14+ messages in thread
From: Kai Krakow @ 2024-03-12 12:39 UTC (permalink / raw)
  To: kreijack; +Cc: Forza, Btrfs BTRFS, Matthew Warren

On Mon, 11 March 2024 at 20:26, Goffredo Baroncelli
<kreijack@libero.it> wrote:
>
> On 11/03/2024 15.34, Kai Krakow wrote:
> > [...]
> >
> > I think the main issue here that leads to conflicting ideas is:
> >
> > Grub records the locations (or extent index) of the boot files during
> > re-configuration for non-trivial filesystems. If you later move the
> > files, or need to switch to the mirror, it will no longer be able to
> > read the boot files. Grub doesn't have a full btrfs implementation to
> > read all the metadata, nor does it know or detect the member devices
> > of the pool.
>
> I don't think that what you describe is really accurate. Grub (in the NON uefi version)
> stores some code in the first 2MB of the disk (this is one of the reason why fdisk by
> default starts the first partition at the 1st MB of the disk). This code is mapped as
> you wrote. And if you mess with this disk area grub gets confused.

I've looked into the source code, and it seems the btrfs code is very
basic. It looks like it could handle multiple devices. But it clearly
cannot handle some of the extent flags, doesn't handle compressed
extents (bees, as in my example, does create such extents), has
problems with holes and inline extents (indicated by the source code
comments) and requires extents to be contiguous to read data reliably.

> And the btrfs grub module is stored in this area. After this module is loaded, grub
> has a full access to a btrfs partition.
>
> The fact in some condition grub is not able to access anymore to a btrfs filesystem
> is more related to a not mature btrfs implementation in grub.

Agreed.

> I am quite sure that grub accesses a btrfs filesystem dynamically, without using a
> pre-stored table with the location of a file.

Yes, it probably can work its way through the various trees, but the
extent resolver and reader are very basic.

This means, at least for now: do not let anything touch the boot files
grub is using, and do not use compression; then it SHOULD work well
most of the time.
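A hypothetical way to sanity-check those conditions (the file paths are
examples; filefrag comes from e2fsprogs, and an "encoded" flag on an
extent indicates a compressed extent):

```shell
# Inspect the on-disk extents of the files grub must read at boot:
filefrag -v /boot/grub/grub.cfg /boot/vmlinuz-linux
```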

I'd still avoid complex filesystems involved in the grub booting process.


> [...]


* Re: raid1 root device with efi
  2024-03-12 12:39               ` Kai Krakow
@ 2024-03-12 18:55                 ` Goffredo Baroncelli
  0 siblings, 0 replies; 14+ messages in thread
From: Goffredo Baroncelli @ 2024-03-12 18:55 UTC (permalink / raw)
  To: Kai Krakow; +Cc: Forza, Btrfs BTRFS, Matthew Warren

On 12/03/2024 13.39, Kai Krakow wrote:
> On Mon, 11 March 2024 at 20:26, Goffredo Baroncelli
> <kreijack@libero.it> wrote:
[...]
>> I am quite sure that grub access a btrfs filesystem dynamically, without using a
>> pre-stored table with the location of a file.
> 
> Yes, it probably can work its way through the various trees but the
> extent resolver and reader is very basic.
> 
> This means, at least for now: do not let anything touch the boot files
> grub is using, and do not use compression, then it SHOULD work well
> most of the time.
> 
> I'd still avoid complex filesystems involved in the grub booting process.
> 
The other options are not better: using a dedicated filesystem is not without
shortcomings. Do you think that the VFAT implementations in the UEFI BIOSes
are better? The fact that the FAT filesystem is far simpler is a plus but...

As an sd-boot user, I like the simplicity of this bootloader. But in some
contexts it is limited. My EFI partition has a size of 1GB, but now the
kernel+initrd images are on the order of 100MB...so it can't contain more than
10-12 images. This is fine for the standard user, but it may not be enough
for some hobbyists...

grub is over-complicated, but it can withstand complex filesystem layouts...

Maybe a kernel with a dedicated initrd which kexecs the real kernel?

BR

-- 
gpg @keyserver.linux.it: Goffredo Baroncelli <kreijackATinwind.it>
Key fingerprint BBF5 1610 0B64 DAC6 5F7D  17B2 0EDA 9B37 8B82 E0B5



* Re: raid1 root device with efi
  2024-03-10 17:57         ` Forza
  2024-03-11 14:34           ` Kai Krakow
@ 2024-03-27 20:21           ` Matt Zagrabelny
  2024-03-28  4:03             ` Andrei Borzenkov
  2024-03-28 11:42             ` Forza
  1 sibling, 2 replies; 14+ messages in thread
From: Matt Zagrabelny @ 2024-03-27 20:21 UTC (permalink / raw)
  To: Forza; +Cc: Btrfs BTRFS, Matthew Warren

Hi Forza and others,

On Sun, Mar 10, 2024 at 1:20 PM Forza <forza@tnonline.net> wrote:
>
>
>
> On 2024-03-08 22:58, Matt Zagrabelny wrote:
> > On Fri, Mar 8, 2024 at 3:54 PM Matthew Warren
> > <matthewwarren101010@gmail.com> wrote:
> >>
> >> On Fri, Mar 8, 2024 at 4:48 PM Matt Zagrabelny <mzagrabe@d.umn.edu> wrote:
> >>>
> >>> Hi Qu and Matthew,
> >>>
> >>> On Fri, Mar 8, 2024 at 3:46 PM Matthew Warren
> >>> <matthewwarren101010@gmail.com> wrote:
> >>>>
> >>>> On Fri, Mar 8, 2024 at 3:46 PM Matt Zagrabelny <mzagrabe@d.umn.edu> wrote:
> >>>>>
> >>>>> Greetings,
> >>>>>
> >>>>> I've read some conflicting info online about the best way to have a
> >>>>> raid1 btrfs root device.
> >>>>>
> >>>>> I've got two disks, with identical partitioning and I tried the
> >>>>> following scenario (call it scenario 1):
> >>>>>
> >>>>> partition 1: EFI
> >>>>> partition 2: btrfs RAID1 (/)
> >>>>>
> >>>>> There are some docs that claim that the above is possible...
> >>>>
> >>>> This is definitely possible. I use it on both my server and desktop with GRUB.
> >>>
> >>> Are there any docs you follow for this setup?
> >>>
> >>> Thanks for the info!
> >>>
> >>> -m
> >>
> >> The main important thing is that mdadm has several metadata versions.
> >> Versions 0.9 and 1.0 store the metadata at the end of the partition
> >> which allows UEFI to think the filesystem is EFI rather than mdadm
> >> raid.
> >> https://raid.wiki.kernel.org/index.php/RAID_superblock_formats#Sub-versions_of_the_version-1_superblock
> >>
> >> I followed the arch wiki for setting it up, so here's what I followed.
> >> https://wiki.archlinux.org/title/EFI_system_partition#ESP_on_software_RAID1
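For concreteness, the mdadm ESP setup described in the quoted wiki pages could be sketched roughly like this (device names /dev/sda1 and /dev/sdb1 are assumptions; adapt to your layout before running anything):

```shell
# Hypothetical sketch of an ESP on mdadm RAID1. The key detail is
# --metadata=1.0, which stores the RAID superblock at the END of each
# partition, so UEFI firmware still sees every member as a plain FAT
# filesystem and can boot from either disk.
make_esp_raid() {
    esp_a="$1"   # e.g. /dev/sda1
    esp_b="$2"   # e.g. /dev/sdb1
    mdadm --create /dev/md/esp --level=1 --raid-devices=2 \
          --metadata=1.0 "$esp_a" "$esp_b"
    mkfs.fat -F 32 /dev/md/esp
}
# Usage (as root): make_esp_raid /dev/sda1 /dev/sdb1
```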
> >
> > Thanks for the hints. Hopefully there aren't any more unexpected issues.
> >
> > Cheers!
> >
> > -m
> >
>
> An alternative to mdadm is to simply have separate ESP partitions on
> each device. You can manually copy the contents between the two if you
> were to update the EFI bootloader. This way you can keep the 'other' ESP
> as backup during GRUB/EFI updates.
>
> This solution is what I use on one of my servers. GRUB2 supports Btrfs
> RAID1 so you do not need to have the kernel and initramfs on the ESP,
> though that works very well too.

Are folks using the "degraded" option in /etc/fstab or the grub mount
options for the btrfs mount?

I've read online [0] that the degraded option can cause issues due to
timeouts being exceeded.

Also, I'm seeing some confusing results of looking at the UUID of my disks:

root@achilles:~# blkid | grep /dev/sd
/dev/sdb2: UUID="9a46a8ad-de37-48c0-ad96-2c54df42dd5a"
UUID_SUB="7737fc5f-036d-4126-9d7c-f1726d550444" BLOCK_SIZE="4096"
TYPE="btrfs" PARTUUID="3a22621c-a4e1-8641-aa0f-990a824fabf4"
/dev/sdb1: UUID="BD42-AEB1" BLOCK_SIZE="512" TYPE="vfat"
PARTUUID="43e432b1-6c68-4b5c-9c30-793fcc10a700"
/dev/sda2: UUID="9a46a8ad-de37-48c0-ad96-2c54df42dd5a"
UUID_SUB="9436f570-6d15-4c74-aff8-5bd85995d92d" BLOCK_SIZE="4096"
TYPE="btrfs" PARTUUID="e3b4b268-99e8-4043-a879-acfc8318232b"
/dev/sda1: UUID="BD42-AEB1" BLOCK_SIZE="512" TYPE="vfat"
PARTUUID="02568ee9-db21-4d03-a898-3d1a106ecbec"

...why does /dev/sdb2 show up in the following /dev/disk/by-uuid, but
not /dev/sda2:

root@achilles:~# ls -alh /dev/disk/by-uuid/
total 0
drwxr-xr-x 2 root root  80 Mar 25 21:16 .
drwxr-xr-x 7 root root 140 Mar 25 21:16 ..
lrwxrwxrwx 1 root root  10 Mar 25 21:16
9a46a8ad-de37-48c0-ad96-2c54df42dd5a -> ../../sdb2
lrwxrwxrwx 1 root root  10 Mar 25 21:16 BD42-AEB1 -> ../../sda1

What do folks think about the following fstab?

root@achilles:~# cat /etc/fstab
# /etc/fstab: static file system information.
#
# Use 'blkid' to print the universally unique identifier for a
# device; this may be used with UUID= as a more robust way to name devices
# that works even if disks are added and removed. See fstab(5).
#
# systemd generates mount units based on this file, see systemd.mount(5).
# Please run 'systemctl daemon-reload' after making changes here.
#
# <file system>                           <mount point>   <type>
<options>                      <dump>  <pass>
# / was on /dev/sda2 during installation
UUID=9a46a8ad-de37-48c0-ad96-2c54df42dd5a /               btrfs
defaults,degraded,subvol=@     0       0
UUID=9a46a8ad-de37-48c0-ad96-2c54df42dd5a /home           btrfs
defaults,degraded,subvol=@home 0       0
# /boot/efi was on /dev/sda1 during installation
UUID=BD42-AEB1                            /boot/efi       vfat
umask=0077                     0       1

Some extra info, in case it is useful...

root@achilles:~# mount | grep /dev/sd
/dev/sda2 on / type btrfs
(rw,relatime,degraded,ssd,space_cache=v2,subvolid=256,subvol=/@)
/dev/sda2 on /home type btrfs
(rw,relatime,degraded,ssd,space_cache=v2,subvolid=257,subvol=/@home)
/dev/sda1 on /boot/efi type vfat
(rw,relatime,fmask=0077,dmask=0077,codepage=437,iocharset=ascii,shortname=mixed,utf8,errors=remount-ro)


root@achilles:~# btrfs device usage /
/dev/sda2, ID: 1
   Device size:           237.97GiB
   Device slack:              0.00B
   Data,RAID1:              2.00GiB
   Metadata,RAID1:          1.00GiB
   System,RAID1:           32.00MiB
   Unallocated:           234.94GiB

/dev/sdb2, ID: 2
   Device size:           237.97GiB
   Device slack:              0.00B
   Data,RAID1:              2.00GiB
   Metadata,RAID1:          1.00GiB
   System,RAID1:           32.00MiB
   Unallocated:           234.94GiB



Thanks for any help and/or advice!

-m


[0] https://www.reddit.com/r/btrfs/comments/kguqsg/degraded_boot_with_systemd/

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: raid1 root device with efi
  2024-03-27 20:21           ` Matt Zagrabelny
@ 2024-03-28  4:03             ` Andrei Borzenkov
  2024-03-28 11:42             ` Forza
  1 sibling, 0 replies; 14+ messages in thread
From: Andrei Borzenkov @ 2024-03-28  4:03 UTC (permalink / raw)
  To: Matt Zagrabelny, Forza; +Cc: Btrfs BTRFS, Matthew Warren

On 27.03.2024 23:21, Matt Zagrabelny wrote:
> 
> Are folks using the "degraded" option in /etc/fstab or the grub mount
> options for the btrfs mount?
> 
> I've read online [0] that the degraded option can cause issues due to
> timeouts being exceeded.
> 

That is incorrect. The "degraded" option does not "cause" issues due to 
timeouts; those timeouts happen long before this option is even seen. It 
is true that the option does not fix them.
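For context, the timeout in question is the boot-time wait for the missing device (on systemd-based distributions the default is 90 seconds per device unit), which expires before any mount option is consulted. It can be tuned per mount in fstab, e.g. (a sketch reusing the UUID from this thread):

```
# Hypothetical fstab line: wait at most 10s for the device instead of
# the default 90s before the mount job fails.
UUID=9a46a8ad-de37-48c0-ad96-2c54df42dd5a / btrfs subvol=@,x-systemd.device-timeout=10s 0 0
```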

> Also, I'm seeing some confusing results of looking at the UUID of my disks:
> 
> root@achilles:~# blkid | grep /dev/sd
> /dev/sdb2: UUID="9a46a8ad-de37-48c0-ad96-2c54df42dd5a"
> UUID_SUB="7737fc5f-036d-4126-9d7c-f1726d550444" BLOCK_SIZE="4096"
> TYPE="btrfs" PARTUUID="3a22621c-a4e1-8641-aa0f-990a824fabf4"
> /dev/sdb1: UUID="BD42-AEB1" BLOCK_SIZE="512" TYPE="vfat"
> PARTUUID="43e432b1-6c68-4b5c-9c30-793fcc10a700"
> /dev/sda2: UUID="9a46a8ad-de37-48c0-ad96-2c54df42dd5a"
> UUID_SUB="9436f570-6d15-4c74-aff8-5bd85995d92d" BLOCK_SIZE="4096"
> TYPE="btrfs" PARTUUID="e3b4b268-99e8-4043-a879-acfc8318232b"
> /dev/sda1: UUID="BD42-AEB1" BLOCK_SIZE="512" TYPE="vfat"
> PARTUUID="02568ee9-db21-4d03-a898-3d1a106ecbec"
> 
> ...why does /dev/sdb2 show up in the following /dev/disk/by-uuid, but
> not /dev/sda2:
> 
> root@achilles:~# ls -alh /dev/disk/by-uuid/
> total 0
> drwxr-xr-x 2 root root  80 Mar 25 21:16 .
> drwxr-xr-x 7 root root 140 Mar 25 21:16 ..
> lrwxrwxrwx 1 root root  10 Mar 25 21:16
> 9a46a8ad-de37-48c0-ad96-2c54df42dd5a -> ../../sdb2
> lrwxrwxrwx 1 root root  10 Mar 25 21:16 BD42-AEB1 -> ../../sda1
> 

How do you expect one symlink to point to multiple destinations?



^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: raid1 root device with efi
  2024-03-27 20:21           ` Matt Zagrabelny
  2024-03-28  4:03             ` Andrei Borzenkov
@ 2024-03-28 11:42             ` Forza
  1 sibling, 0 replies; 14+ messages in thread
From: Forza @ 2024-03-28 11:42 UTC (permalink / raw)
  To: Matt Zagrabelny; +Cc: Btrfs BTRFS, Matthew Warren



---- From: Matt Zagrabelny <mzagrabe@d.umn.edu> -- Sent: 2024-03-27 - 21:21 ----
> 
> Are folks using the "degraded" option in /etc/fstab or the grub mount
> options for the btrfs mount?
> 
> I've read online [0] that the degraded option can cause issues due to
> timeouts being exceeded.

No, I do not recommend it. The issue isn't with GRUB: if a device is missing during boot (hotplug, cables, errors, etc.), the system continues on the remaining disk, and the missing disk later comes back online, you end up with a split-brain situation. As far as I know btrfs has no specific handling for this, and the recommended way is to wipefs a removed device before adding it back. 

For this reason it is better to be forced to handle the situation manually. You could, for example, keep a second boot entry for degraded mode, etc. 
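The manual recovery described above (wipe the stale member, then re-add it and re-mirror) might look roughly like this; treat it as a sketch under the assumption of a two-device RAID1 booted degraded, and double-check against current btrfs documentation before running:

```shell
# Hypothetical recovery after booting degraded on the surviving disk.
# The returning device's old superblock is wiped first, so btrfs never
# sees two diverged copies of the same filesystem (the split-brain
# case described above).
readd_stale_device() {
    stale="$1"   # e.g. /dev/sdb2 (the device that was offline)
    mnt="$2"     # e.g. /         (the degraded btrfs mount)
    wipefs -a "$stale"                  # drop the stale btrfs signature
    btrfs device add "$stale" "$mnt"    # join as a fresh device
    btrfs device remove missing "$mnt"  # forget the old, absent member
    btrfs balance start -dconvert=raid1 -mconvert=raid1 "$mnt"
}
# Usage (as root): readd_stale_device /dev/sdb2 /
```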


> 
> Also, I'm seeing some confusing results of looking at the UUID of my disks:
> 
> root@achilles:~# blkid | grep /dev/sd
> /dev/sdb2: UUID="9a46a8ad-de37-48c0-ad96-2c54df42dd5a"
> UUID_SUB="7737fc5f-036d-4126-9d7c-f1726d550444" BLOCK_SIZE="4096"
> TYPE="btrfs" PARTUUID="3a22621c-a4e1-8641-aa0f-990a824fabf4"
> /dev/sdb1: UUID="BD42-AEB1" BLOCK_SIZE="512" TYPE="vfat"
> PARTUUID="43e432b1-6c68-4b5c-9c30-793fcc10a700"
> /dev/sda2: UUID="9a46a8ad-de37-48c0-ad96-2c54df42dd5a"
> UUID_SUB="9436f570-6d15-4c74-aff8-5bd85995d92d" BLOCK_SIZE="4096"
> TYPE="btrfs" PARTUUID="e3b4b268-99e8-4043-a879-acfc8318232b"
> /dev/sda1: UUID="BD42-AEB1" BLOCK_SIZE="512" TYPE="vfat"
> PARTUUID="02568ee9-db21-4d03-a898-3d1a106ecbec"
> 
> ...why does /dev/sdb2 show up in the following /dev/disk/by-uuid, but
> not /dev/sda2:
> 
> root@achilles:~# ls -alh /dev/disk/by-uuid/
> total 0
> drwxr-xr-x 2 root root  80 Mar 25 21:16 .
> drwxr-xr-x 7 root root 140 Mar 25 21:16 ..
> lrwxrwxrwx 1 root root  10 Mar 25 21:16
> 9a46a8ad-de37-48c0-ad96-2c54df42dd5a -> ../../sdb2
> lrwxrwxrwx 1 root root  10 Mar 25 21:16 BD42-AEB1 -> ../../sda1

I believe only the first device found gets the link; it is not possible for one UUID symlink to point to several devices. 
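To see every device behind a shared filesystem UUID despite the single symlink, one can query blkid directly (a sketch; the UUID is the one from this thread):

```shell
# List all block devices whose filesystem UUID matches the argument.
# /dev/disk/by-uuid can only ever point at one of them; blkid reports
# them all, and btrfs members are told apart by their UUID_SUB.
devices_for_uuid() {
    blkid -t "UUID=$1" -o device
}
# Usage: devices_for_uuid 9a46a8ad-de37-48c0-ad96-2c54df42dd5a
```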

> 
> What do folks think about the following fstab?
> 
> root@achilles:~# cat /etc/fstab

> #
> # <file system>                           <mount point>   <type>
> <options>                      <dump>  <pass>
> # / was on /dev/sda2 during installation
> UUID=9a46a8ad-de37-48c0-ad96-2c54df42dd5a /               btrfs
> defaults,degraded,subvol=@     0       0
> UUID=9a46a8ad-de37-48c0-ad96-2c54df42dd5a /home           btrfs
> defaults,degraded,subvol=@home 0       0
> # /boot/efi was on /dev/sda1 during installation
> UUID=BD42-AEB1                            /boot/efi       vfat
> umask=0077                     0       1
> 

I recommend against using degraded. You may instead want noatime, unless you have strong reasons for keeping atime system-wide. The keyword "defaults" is mostly a placeholder and isn't needed once you specify other options. 

https://github.com/kdave/btrfs-progs/issues/377
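Applying those suggestions to the fstab quoted above gives something like this (a sketch, keeping the UUIDs from this thread):

```
UUID=9a46a8ad-de37-48c0-ad96-2c54df42dd5a /         btrfs noatime,subvol=@     0 0
UUID=9a46a8ad-de37-48c0-ad96-2c54df42dd5a /home     btrfs noatime,subvol=@home 0 0
UUID=BD42-AEB1                            /boot/efi vfat  umask=0077           0 1
```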

~ Forza 



^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2024-03-28 12:11 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-03-08 20:39 raid1 root device with efi Matt Zagrabelny
2024-03-08 21:02 ` Qu Wenruo
2024-03-08 21:46 ` Matthew Warren
2024-03-08 21:48   ` Matt Zagrabelny
2024-03-08 21:54     ` Matthew Warren
2024-03-08 21:58       ` Matt Zagrabelny
2024-03-10 17:57         ` Forza
2024-03-11 14:34           ` Kai Krakow
2024-03-11 19:26             ` Goffredo Baroncelli
2024-03-12 12:39               ` Kai Krakow
2024-03-12 18:55                 ` Goffredo Baroncelli
2024-03-27 20:21           ` Matt Zagrabelny
2024-03-28  4:03             ` Andrei Borzenkov
2024-03-28 11:42             ` Forza
