All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH V4] Add support for BTRFS raid5/6 to GRUB
@ 2018-05-16 18:48 Goffredo Baroncelli
  2018-05-16 18:48 ` [PATCH 1/9] btrfs: add support for reading a filesystem with a RAID 5 or RAID 6 profile Goffredo Baroncelli
                   ` (8 more replies)
  0 siblings, 9 replies; 48+ messages in thread
From: Goffredo Baroncelli @ 2018-05-16 18:48 UTC (permalink / raw)
  To: grub-devel


Hi All,

the aim of this patches set is to provide support for a BTRFS raid5/6
filesystem in GRUB.

The first patch, implements the basic support for raid5/6. I.e this works when
all the disks are present.

The next 5 patches, are preparatory ones.

The 7th patch implements the raid5 recovery for btrfs (i.e. handling the
disappearing of 1 disk).
The 8th patch makes the code for handling the raid6 recovery more generic.
The last one implements the raid6 recovery for btrfs (i.e. handling the
disappearing up to two disks).

I tested the code in grub-emu, and it works (for me) both with all the disks,
and with some disks missing. I checked the crc32 calculated from grub and
from linux and these matched. Finally I checked if the support for md raid6
still works properly, and it does (with all the drives and with up to 2 drives
missing)

Comments are welcome.

Changelog
v1: initial support for btrfs raid5/6. No recovery allowed
v2: full support for btrfs raid5/6. Recovery allowed
v3: some minor cleanup suggested by Daniel Kiper; reusing the
    original raid6 recovery code of grub
v4: Several spell fix; better description of the RAID layout
    in btrfs, and the variables which describes the stripe
    positioning; split the patch #5 in two (#5 and #6)


BR
G.Baroncelli--
gpg @keyserver.linux.it: Goffredo Baroncelli <kreijackATinwind.it>
Key fingerprint BBF5 1610 0B64 DAC6 5F7D  17B2 0EDA 9B37 8B82 E0B5




^ permalink raw reply	[flat|nested] 48+ messages in thread

* [PATCH 1/9] btrfs: add support for reading a filesystem with a RAID 5 or RAID 6 profile.
  2018-05-16 18:48 [PATCH V4] Add support for BTRFS raid5/6 to GRUB Goffredo Baroncelli
@ 2018-05-16 18:48 ` Goffredo Baroncelli
  2018-05-30 10:07   ` Daniel Kiper
  2018-05-16 18:48 ` [PATCH 2/9] btrfs: add helper to check the btrfs header Goffredo Baroncelli
                   ` (7 subsequent siblings)
  8 siblings, 1 reply; 48+ messages in thread
From: Goffredo Baroncelli @ 2018-05-16 18:48 UTC (permalink / raw)
  To: grub-devel; +Cc: Goffredo Baroncelli

Signed-off-by: Goffredo Baroncelli <kreijack@inwind.it>
---
 grub-core/fs/btrfs.c | 68 ++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 68 insertions(+)

diff --git a/grub-core/fs/btrfs.c b/grub-core/fs/btrfs.c
index be195448d..394611cbb 100644
--- a/grub-core/fs/btrfs.c
+++ b/grub-core/fs/btrfs.c
@@ -119,6 +119,8 @@ struct grub_btrfs_chunk_item
 #define GRUB_BTRFS_CHUNK_TYPE_RAID1         0x10
 #define GRUB_BTRFS_CHUNK_TYPE_DUPLICATED    0x20
 #define GRUB_BTRFS_CHUNK_TYPE_RAID10        0x40
+#define GRUB_BTRFS_CHUNK_TYPE_RAID5         0x80
+#define GRUB_BTRFS_CHUNK_TYPE_RAID6         0x100
   grub_uint8_t dummy2[0xc];
   grub_uint16_t nstripes;
   grub_uint16_t nsubstripes;
@@ -764,6 +766,72 @@ grub_btrfs_read_logical (struct grub_btrfs_data *data, grub_disk_addr_t addr,
 	      stripe_offset = low + chunk_stripe_length
 		* high;
 	      csize = chunk_stripe_length - low;
+	      break;
+	    }
+	  case GRUB_BTRFS_CHUNK_TYPE_RAID5:
+	  case GRUB_BTRFS_CHUNK_TYPE_RAID6:
+	    {
+	      grub_uint64_t nparities, stripe_nr, high, low;
+
+	      redundancy = 1;	/* no redundancy for now */
+
+	      if (grub_le_to_cpu64 (chunk->type) & GRUB_BTRFS_CHUNK_TYPE_RAID5)
+		{
+		  grub_dprintf ("btrfs", "RAID5\n");
+		  nparities = 1;
+		}
+	      else
+		{
+		  grub_dprintf ("btrfs", "RAID6\n");
+		  nparities = 2;
+		}
+
+	      /*
+	       * Below an example of a RAID6 layout and the meaning of the
+	       * variables. The same apply to RAID5. The only differences is that
+	       * there is only one parity instead of two.
+	       *
+	       * A RAID6 6 layout consists in several stripes spread
+	       * on the disks, following a layout like the one below
+	       *
+	       *   Disk1  Disk2  Disk3  Ddisk4
+	       *
+	       *    A1     B1     P1     Q1
+	       *    Q2     A2     B2     P2  
+	       *    P3     Q3     A3     B3
+	       *  [...]
+	       *
+	       *  Note that the placement of the parities depends on row index;
+	       *  In the code below:
+	       *  stripe_nr -> is the stripe number not considering the parities
+	       *               (A1=0, B1=1, A2 = 2, B2 = 3, ...)
+	       *  high -> is the row number (0 for A1...Q1, 1 for Q2..P2, ...)
+	       *  stripen -> is the column number (or disk number)
+	       *  off -> logical address to read (from the beginning of the
+	       *         chunk space)
+	       *  chunk_stripe_length -> size of a stripe (typically 64k)
+	       *  nstripes -> number of disks
+	       *  low -> offset of the data inside a stripe
+	       * 
+	       */
+	      stripe_nr = grub_divmod64 (off, chunk_stripe_length, &low);
+
+	      /*
+	       * In the line below stripen is evaluated without considering
+	       * the parities (0 for A1, A2, A3... 1 for B1, B2...);
+	       */
+	      high = grub_divmod64 (stripe_nr, nstripes - nparities, &stripen);
+
+	      /*
+	       * In the line below stripen, now consider also the parities (0
+	       * for A1, 1 for A2, 2 for A3....); the math is done "modulo"
+	       * number of disks
+	       */
+	      grub_divmod64 (high + stripen, nstripes, &stripen);
+
+	      stripe_offset = low + chunk_stripe_length * high;
+	      csize = chunk_stripe_length - low;
+
 	      break;
 	    }
 	  default:
-- 
2.17.0



^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH 2/9] btrfs: add helper to check the btrfs header.
  2018-05-16 18:48 [PATCH V4] Add support for BTRFS raid5/6 to GRUB Goffredo Baroncelli
  2018-05-16 18:48 ` [PATCH 1/9] btrfs: add support for reading a filesystem with a RAID 5 or RAID 6 profile Goffredo Baroncelli
@ 2018-05-16 18:48 ` Goffredo Baroncelli
  2018-05-30 10:12   ` Daniel Kiper
  2018-05-16 18:48 ` [PATCH 3/9] btrfs: move the error logging from find_device() to its callee Goffredo Baroncelli
                   ` (6 subsequent siblings)
  8 siblings, 1 reply; 48+ messages in thread
From: Goffredo Baroncelli @ 2018-05-16 18:48 UTC (permalink / raw)
  To: grub-devel; +Cc: Goffredo Baroncelli

This helper will be used in few places to help the debugging. As
conservative approach, in case of error it is only logged. This
because I am not sure if this can change something in the error
handling of the currently existing code.

Signed-off-by: Goffredo Baroncelli <kreijack@inwind.it>
---
 grub-core/fs/btrfs.c | 24 +++++++++++++++++++++++-
 1 file changed, 23 insertions(+), 1 deletion(-)

diff --git a/grub-core/fs/btrfs.c b/grub-core/fs/btrfs.c
index 394611cbb..123bfbfc1 100644
--- a/grub-core/fs/btrfs.c
+++ b/grub-core/fs/btrfs.c
@@ -77,7 +77,8 @@ struct btrfs_header
 {
   grub_btrfs_checksum_t checksum;
   grub_btrfs_uuid_t uuid;
-  grub_uint8_t dummy[0x30];
+  grub_uint64_t bytenr;
+  grub_uint8_t dummy[0x28];
   grub_uint32_t nitems;
   grub_uint8_t level;
 } GRUB_PACKED;
@@ -286,6 +287,25 @@ free_iterator (struct grub_btrfs_leaf_descriptor *desc)
   grub_free (desc->data);
 }
 
+static grub_err_t
+check_btrfs_header (struct grub_btrfs_data *data, struct btrfs_header *header,
+                    grub_disk_addr_t addr)
+{
+  if (grub_le_to_cpu64 (header->bytenr) != addr)
+    {
+      grub_dprintf ("btrfs", "btrfs_header.bytenr is not equal node addr\n");
+      return grub_error (GRUB_ERR_BAD_FS,
+			 "header bytenr is not equal node addr");
+    }
+  if (grub_memcmp (data->sblock.uuid, header->uuid, sizeof(grub_btrfs_uuid_t)))
+    {
+      grub_dprintf ("btrfs", "btrfs_header.uuid doesn't match sblock uuid\n");
+      return grub_error (GRUB_ERR_BAD_FS,
+			 "header uuid doesn't match sblock uuid");
+    }
+  return GRUB_ERR_NONE;
+}
+
 static grub_err_t
 save_ref (struct grub_btrfs_leaf_descriptor *desc,
 	  grub_disk_addr_t addr, unsigned i, unsigned m, int l)
@@ -341,6 +361,7 @@ next (struct grub_btrfs_data *data,
 
       err = grub_btrfs_read_logical (data, grub_le_to_cpu64 (node.addr),
 				     &head, sizeof (head), 0);
+      check_btrfs_header (data, &head, grub_le_to_cpu64 (node.addr));
       if (err)
 	return -err;
 
@@ -402,6 +423,7 @@ lower_bound (struct grub_btrfs_data *data,
       /* FIXME: preread few nodes into buffer. */
       err = grub_btrfs_read_logical (data, addr, &head, sizeof (head),
 				     recursion_depth + 1);
+      check_btrfs_header (data, &head, addr);
       if (err)
 	return err;
       addr += sizeof (head);
-- 
2.17.0



^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH 3/9] btrfs: move the error logging from find_device() to its callee.
  2018-05-16 18:48 [PATCH V4] Add support for BTRFS raid5/6 to GRUB Goffredo Baroncelli
  2018-05-16 18:48 ` [PATCH 1/9] btrfs: add support for reading a filesystem with a RAID 5 or RAID 6 profile Goffredo Baroncelli
  2018-05-16 18:48 ` [PATCH 2/9] btrfs: add helper to check the btrfs header Goffredo Baroncelli
@ 2018-05-16 18:48 ` Goffredo Baroncelli
  2018-05-30 10:25   ` Daniel Kiper
  2018-05-16 18:48 ` [PATCH 4/9] btrfs: avoiding a rescan for a device which was already not founded Goffredo Baroncelli
                   ` (5 subsequent siblings)
  8 siblings, 1 reply; 48+ messages in thread
From: Goffredo Baroncelli @ 2018-05-16 18:48 UTC (permalink / raw)
  To: grub-devel; +Cc: Goffredo Baroncelli

This is a preparatory patch. The callee knows better if this
error is fatal or not, i.e. another available disk.


Signed-off-by: Goffredo Baroncelli <kreijack@inwind.it>
---
 grub-core/fs/btrfs.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/grub-core/fs/btrfs.c b/grub-core/fs/btrfs.c
index 123bfbfc1..fd4225d11 100644
--- a/grub-core/fs/btrfs.c
+++ b/grub-core/fs/btrfs.c
@@ -604,9 +604,6 @@ find_device (struct grub_btrfs_data *data, grub_uint64_t id, int do_rescan)
     grub_device_iterate (find_device_iter, &ctx);
   if (!ctx.dev_found)
     {
-      grub_error (GRUB_ERR_BAD_FS,
-		  N_("couldn't find a necessary member device "
-		     "of multi-device filesystem"));
       return NULL;
     }
   data->n_devices_attached++;
@@ -900,6 +897,9 @@ grub_btrfs_read_logical (struct grub_btrfs_data *data, grub_disk_addr_t addr,
 		dev = find_device (data, stripe->device_id, j);
 		if (!dev)
 		  {
+		    grub_dprintf ("btrfs",
+				  "couldn't find a necessary member device "
+				  "of multi-device filesystem\n");
 		    err = grub_errno;
 		    grub_errno = GRUB_ERR_NONE;
 		    continue;
-- 
2.17.0



^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH 4/9] btrfs: avoiding a rescan for a device which was already not founded.
  2018-05-16 18:48 [PATCH V4] Add support for BTRFS raid5/6 to GRUB Goffredo Baroncelli
                   ` (2 preceding siblings ...)
  2018-05-16 18:48 ` [PATCH 3/9] btrfs: move the error logging from find_device() to its callee Goffredo Baroncelli
@ 2018-05-16 18:48 ` Goffredo Baroncelli
  2018-05-30 10:43   ` Daniel Kiper
  2018-05-16 18:48 ` [PATCH 5/9] btrfs: move logging code in grub_btrfs_read_logical() Goffredo Baroncelli
                   ` (4 subsequent siblings)
  8 siblings, 1 reply; 48+ messages in thread
From: Goffredo Baroncelli @ 2018-05-16 18:48 UTC (permalink / raw)
  To: grub-devel; +Cc: Goffredo Baroncelli

If a device is not found, record this failure storing NULL in
data->devices_attached[]. This avoid un-necessary devices rescan,
speeding the reading in case of a degraded array.

Signed-off-by: Goffredo Baroncelli <kreikack@inwind.it>
---
 grub-core/fs/btrfs.c | 19 +++++++++----------
 1 file changed, 9 insertions(+), 10 deletions(-)

diff --git a/grub-core/fs/btrfs.c b/grub-core/fs/btrfs.c
index fd4225d11..4920dd3b1 100644
--- a/grub-core/fs/btrfs.c
+++ b/grub-core/fs/btrfs.c
@@ -588,7 +588,7 @@ find_device_iter (const char *name, void *data)
 }
 
 static grub_device_t
-find_device (struct grub_btrfs_data *data, grub_uint64_t id, int do_rescan)
+find_device (struct grub_btrfs_data *data, grub_uint64_t id)
 {
   struct find_device_ctx ctx = {
     .data = data,
@@ -600,12 +600,9 @@ find_device (struct grub_btrfs_data *data, grub_uint64_t id, int do_rescan)
   for (i = 0; i < data->n_devices_attached; i++)
     if (id == data->devices_attached[i].id)
       return data->devices_attached[i].dev;
-  if (do_rescan)
-    grub_device_iterate (find_device_iter, &ctx);
-  if (!ctx.dev_found)
-    {
-      return NULL;
-    }
+
+  grub_device_iterate (find_device_iter, &ctx);
+
   data->n_devices_attached++;
   if (data->n_devices_attached > data->n_devices_allocated)
     {
@@ -617,7 +614,8 @@ find_device (struct grub_btrfs_data *data, grub_uint64_t id, int do_rescan)
 			* sizeof (data->devices_attached[0]));
       if (!data->devices_attached)
 	{
-	  grub_device_close (ctx.dev_found);
+	  if (ctx.dev_found)
+	    grub_device_close (ctx.dev_found);
 	  data->devices_attached = tmp;
 	  return NULL;
 	}
@@ -894,7 +892,7 @@ grub_btrfs_read_logical (struct grub_btrfs_data *data, grub_disk_addr_t addr,
 			      " for laddr 0x%" PRIxGRUB_UINT64_T "\n", paddr,
 			      addr);
 
-		dev = find_device (data, stripe->device_id, j);
+		dev = find_device (data, stripe->device_id);
 		if (!dev)
 		  {
 		    grub_dprintf ("btrfs",
@@ -971,7 +969,8 @@ grub_btrfs_unmount (struct grub_btrfs_data *data)
   unsigned i;
   /* The device 0 is closed one layer upper.  */
   for (i = 1; i < data->n_devices_attached; i++)
-    grub_device_close (data->devices_attached[i].dev);
+    if (data->devices_attached[i].dev)
+        grub_device_close (data->devices_attached[i].dev);
   grub_free (data->devices_attached);
   grub_free (data->extent);
   grub_free (data);
-- 
2.17.0



^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH 5/9] btrfs: move logging code in grub_btrfs_read_logical()
  2018-05-16 18:48 [PATCH V4] Add support for BTRFS raid5/6 to GRUB Goffredo Baroncelli
                   ` (3 preceding siblings ...)
  2018-05-16 18:48 ` [PATCH 4/9] btrfs: avoiding a rescan for a device which was already not founded Goffredo Baroncelli
@ 2018-05-16 18:48 ` Goffredo Baroncelli
  2018-05-30 10:55   ` Daniel Kiper
  2018-05-16 18:48 ` [PATCH 6/9] btrfs: refactor the code that read from disk Goffredo Baroncelli
                   ` (3 subsequent siblings)
  8 siblings, 1 reply; 48+ messages in thread
From: Goffredo Baroncelli @ 2018-05-16 18:48 UTC (permalink / raw)
  To: grub-devel; +Cc: Goffredo Baroncelli

A portion of the logging code is moved outside a for(;;). The part that is
left inside is the one which depends by the for(;;) index.
This is a preparatory patch: in the next one it will be possible to refactor
the code inside the for(;;) in an another function.
---
 grub-core/fs/btrfs.c | 25 ++++++++++++++-----------
 1 file changed, 14 insertions(+), 11 deletions(-)

diff --git a/grub-core/fs/btrfs.c b/grub-core/fs/btrfs.c
index 4920dd3b1..51f300829 100644
--- a/grub-core/fs/btrfs.c
+++ b/grub-core/fs/btrfs.c
@@ -865,6 +865,18 @@ grub_btrfs_read_logical (struct grub_btrfs_data *data, grub_disk_addr_t addr,
 
 	for (j = 0; j < 2; j++)
 	  {
+	    grub_dprintf ("btrfs", "chunk 0x%" PRIxGRUB_UINT64_T
+			  "+0x%" PRIxGRUB_UINT64_T
+			  " (%d stripes (%d substripes) of %"
+			  PRIxGRUB_UINT64_T ") \n",
+			  grub_le_to_cpu64 (key->offset),
+			  grub_le_to_cpu64 (chunk->size),
+			  grub_le_to_cpu16 (chunk->nstripes),
+			  grub_le_to_cpu16 (chunk->nsubstripes),
+			  grub_le_to_cpu64 (chunk->stripe_length));
+	    grub_dprintf ("btrfs", "reading laddr 0x%" PRIxGRUB_UINT64_T "\n",
+			  addr);
+
 	    for (i = 0; i < redundancy; i++)
 	      {
 		struct grub_btrfs_chunk_stripe *stripe;
@@ -877,20 +889,11 @@ grub_btrfs_read_logical (struct grub_btrfs_data *data, grub_disk_addr_t addr,
 
 		paddr = grub_le_to_cpu64 (stripe->offset) + stripe_offset;
 
-		grub_dprintf ("btrfs", "chunk 0x%" PRIxGRUB_UINT64_T
-			      "+0x%" PRIxGRUB_UINT64_T
-			      " (%d stripes (%d substripes) of %"
-			      PRIxGRUB_UINT64_T ") stripe %" PRIxGRUB_UINT64_T
+		grub_dprintf ("btrfs", "stripe %" PRIxGRUB_UINT64_T
 			      " maps to 0x%" PRIxGRUB_UINT64_T "\n",
-			      grub_le_to_cpu64 (key->offset),
-			      grub_le_to_cpu64 (chunk->size),
-			      grub_le_to_cpu16 (chunk->nstripes),
-			      grub_le_to_cpu16 (chunk->nsubstripes),
-			      grub_le_to_cpu64 (chunk->stripe_length),
 			      stripen, stripe->offset);
 		grub_dprintf ("btrfs", "reading paddr 0x%" PRIxGRUB_UINT64_T
-			      " for laddr 0x%" PRIxGRUB_UINT64_T "\n", paddr,
-			      addr);
+			      "\n", paddr);
 
 		dev = find_device (data, stripe->device_id);
 		if (!dev)
-- 
2.17.0



^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH 6/9] btrfs: refactor the code that read from disk
  2018-05-16 18:48 [PATCH V4] Add support for BTRFS raid5/6 to GRUB Goffredo Baroncelli
                   ` (4 preceding siblings ...)
  2018-05-16 18:48 ` [PATCH 5/9] btrfs: move logging code in grub_btrfs_read_logical() Goffredo Baroncelli
@ 2018-05-16 18:48 ` Goffredo Baroncelli
  2018-05-30 11:07   ` Daniel Kiper
  2018-05-16 18:48 ` [PATCH 7/9] btrfs: add support for recovery for a RAID 5 btrfs profiles Goffredo Baroncelli
                   ` (2 subsequent siblings)
  8 siblings, 1 reply; 48+ messages in thread
From: Goffredo Baroncelli @ 2018-05-16 18:48 UTC (permalink / raw)
  To: grub-devel; +Cc: Goffredo Baroncelli

This is a preparatory patch, to help the adding of the RAID 5/6 recovery
code. In case of availability of all disks, this code is good for all the
RAID profiles. However in case of failure, the error handling is quite
different. Refactoring this code increases the general readability.

Signed-off-by: Goffredo Baroncelli <kreijack@inwind.it>
---
 grub-core/fs/btrfs.c | 85 +++++++++++++++++++++++++-------------------
 1 file changed, 49 insertions(+), 36 deletions(-)

diff --git a/grub-core/fs/btrfs.c b/grub-core/fs/btrfs.c
index 51f300829..63651928b 100644
--- a/grub-core/fs/btrfs.c
+++ b/grub-core/fs/btrfs.c
@@ -625,6 +625,47 @@ find_device (struct grub_btrfs_data *data, grub_uint64_t id)
   return ctx.dev_found;
 }
 
+static grub_err_t
+btrfs_read_from_chunk (struct grub_btrfs_data *data,
+		       struct grub_btrfs_chunk_item *chunk,
+		       grub_uint64_t stripen, grub_uint64_t stripe_offset,
+		       int redundancy, grub_uint64_t csize,
+		       void *buf)
+{
+
+    struct grub_btrfs_chunk_stripe *stripe;
+    grub_disk_addr_t paddr;
+    grub_device_t dev;
+    grub_err_t err;
+
+    stripe = (struct grub_btrfs_chunk_stripe *) (chunk + 1);
+    /* Right now the redundancy handling is easy.
+       With RAID5-like it will be more difficult.  */
+    stripe += stripen + redundancy;
+
+    paddr = grub_le_to_cpu64 (stripe->offset) + stripe_offset;
+
+    grub_dprintf ("btrfs", "stripe %" PRIxGRUB_UINT64_T
+		  " maps to 0x%" PRIxGRUB_UINT64_T "\n",
+		  stripen, stripe->offset);
+    grub_dprintf ("btrfs", "reading paddr 0x%" PRIxGRUB_UINT64_T "\n", paddr);
+
+    dev = find_device (data, stripe->device_id);
+    if (!dev)
+      {
+	grub_dprintf ("btrfs",
+		      "couldn't find a necessary member device "
+		      "of multi-device filesystem\n");
+	grub_errno = GRUB_ERR_NONE;
+	return GRUB_ERR_READ_ERROR;
+      }
+
+    err = grub_disk_read (dev->disk, paddr >> GRUB_DISK_SECTOR_BITS,
+			  paddr & (GRUB_DISK_SECTOR_SIZE - 1),
+			  csize, buf);
+    return err;
+}
+
 static grub_err_t
 grub_btrfs_read_logical (struct grub_btrfs_data *data, grub_disk_addr_t addr,
 			 void *buf, grub_size_t size, int recursion_depth)
@@ -638,7 +679,6 @@ grub_btrfs_read_logical (struct grub_btrfs_data *data, grub_disk_addr_t addr,
       grub_err_t err = 0;
       struct grub_btrfs_key key_out;
       int challoc = 0;
-      grub_device_t dev;
       struct grub_btrfs_key key_in;
       grub_size_t chsize;
       grub_disk_addr_t chaddr;
@@ -879,42 +919,15 @@ grub_btrfs_read_logical (struct grub_btrfs_data *data, grub_disk_addr_t addr,
 
 	    for (i = 0; i < redundancy; i++)
 	      {
-		struct grub_btrfs_chunk_stripe *stripe;
-		grub_disk_addr_t paddr;
-
-		stripe = (struct grub_btrfs_chunk_stripe *) (chunk + 1);
-		/* Right now the redundancy handling is easy.
-		   With RAID5-like it will be more difficult.  */
-		stripe += stripen + i;
-
-		paddr = grub_le_to_cpu64 (stripe->offset) + stripe_offset;
-
-		grub_dprintf ("btrfs", "stripe %" PRIxGRUB_UINT64_T
-			      " maps to 0x%" PRIxGRUB_UINT64_T "\n",
-			      stripen, stripe->offset);
-		grub_dprintf ("btrfs", "reading paddr 0x%" PRIxGRUB_UINT64_T
-			      "\n", paddr);
-
-		dev = find_device (data, stripe->device_id);
-		if (!dev)
-		  {
-		    grub_dprintf ("btrfs",
-				  "couldn't find a necessary member device "
-				  "of multi-device filesystem\n");
-		    err = grub_errno;
-		    grub_errno = GRUB_ERR_NONE;
-		    continue;
-		  }
-
-		err = grub_disk_read (dev->disk, paddr >> GRUB_DISK_SECTOR_BITS,
-				      paddr & (GRUB_DISK_SECTOR_SIZE - 1),
-				      csize, buf);
-		if (!err)
-		  break;
-		grub_errno = GRUB_ERR_NONE;
+		err = btrfs_read_from_chunk (data, chunk, stripen,
+					     stripe_offset,
+					     i,     /* redundancy */
+					     csize, buf);
+		if (err == GRUB_ERR_NONE)
+		    break;
 	      }
-	    if (i != redundancy)
-	      break;
+	    if (err == GRUB_ERR_NONE)
+		break;
 	  }
 	if (err)
 	  return grub_errno = err;
-- 
2.17.0



^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH 7/9] btrfs: add support for recovery for a RAID 5 btrfs profiles.
  2018-05-16 18:48 [PATCH V4] Add support for BTRFS raid5/6 to GRUB Goffredo Baroncelli
                   ` (5 preceding siblings ...)
  2018-05-16 18:48 ` [PATCH 6/9] btrfs: refactor the code that read from disk Goffredo Baroncelli
@ 2018-05-16 18:48 ` Goffredo Baroncelli
  2018-05-30 11:30   ` Daniel Kiper
  2018-05-16 18:48 ` [PATCH 8/9] btrfs: make more generic the code for RAID 6 rebuilding Goffredo Baroncelli
  2018-05-16 18:48 ` [PATCH 9/9] btrfs: add RAID 6 recovery for a btrfs filesystem Goffredo Baroncelli
  8 siblings, 1 reply; 48+ messages in thread
From: Goffredo Baroncelli @ 2018-05-16 18:48 UTC (permalink / raw)
  To: grub-devel; +Cc: Goffredo Baroncelli

Signed-off-by: Goffredo Baroncelli <kreijack@inwind.it>
---
 grub-core/fs/btrfs.c | 174 ++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 170 insertions(+), 4 deletions(-)

diff --git a/grub-core/fs/btrfs.c b/grub-core/fs/btrfs.c
index 63651928b..5fcaad86f 100644
--- a/grub-core/fs/btrfs.c
+++ b/grub-core/fs/btrfs.c
@@ -29,6 +29,7 @@
 #include <minilzo.h>
 #include <grub/i18n.h>
 #include <grub/btrfs.h>
+#include <grub/crypto.h>
 
 GRUB_MOD_LICENSE ("GPLv3+");
 
@@ -666,6 +667,150 @@ btrfs_read_from_chunk (struct grub_btrfs_data *data,
     return err;
 }
 
+struct raid56_buffer {
+  void *buf;
+  int  data_is_valid;
+};
+
+static void
+rebuild_raid5 (struct raid56_buffer *buffers, grub_uint64_t nstripes,
+               grub_uint64_t csize)
+{
+  grub_uint64_t target = 0, i;
+
+  while (buffers[target].data_is_valid && target < nstripes)
+    ++target;
+
+  if (target == nstripes)
+    {
+      grub_dprintf ("btrfs", "called rebuild_raid5(), but all disks are OK\n");
+      return;
+    }
+
+  grub_dprintf ("btrfs", "rebuilding raid5 stripe #%" PRIuGRUB_UINT64_T "\n",
+	        target);
+  for (i = 0; i < nstripes; i++)
+    if (i != target)
+      grub_crypto_xor (buffers[target].buf, buffers[target].buf, buffers[i].buf,
+                       csize);
+}
+
+static grub_err_t
+raid56_read_retry (struct grub_btrfs_data *data,
+		   struct grub_btrfs_chunk_item *chunk,
+		   grub_uint64_t stripe_offset, grub_uint64_t stripen,
+		   grub_uint64_t csize, void *buf)
+{
+
+  struct raid56_buffer *buffers = NULL;
+  grub_uint64_t nstripes = grub_le_to_cpu16 (chunk->nstripes);
+  grub_uint64_t chunk_type = grub_le_to_cpu64 (chunk->type);
+  grub_err_t ret = GRUB_ERR_NONE;
+  grub_uint64_t i, failed_devices;
+
+  buffers = grub_zalloc (sizeof(*buffers) * nstripes);
+  if (!buffers)
+    {
+      ret = GRUB_ERR_OUT_OF_MEMORY;
+      goto cleanup;
+    }
+
+  for (i = 0; i < nstripes; i++)
+    {
+      buffers[i].buf = grub_zalloc (csize);
+      if (!buffers[i].buf)
+	{
+	  ret = GRUB_ERR_OUT_OF_MEMORY;
+	  goto cleanup;
+	}
+    }
+
+  for (i = 0; i < nstripes; i++)
+    {
+      struct grub_btrfs_chunk_stripe *stripe;
+      grub_disk_addr_t paddr;
+      grub_device_t dev;
+      grub_err_t err2;
+
+      stripe = (struct grub_btrfs_chunk_stripe *) (chunk + 1);
+      stripe += i;
+
+      paddr = grub_le_to_cpu64 (stripe->offset) + stripe_offset;
+      grub_dprintf ("btrfs", "reading paddr %" PRIxGRUB_UINT64_T
+                    " from stripe ID %" PRIxGRUB_UINT64_T "\n", paddr,
+                    stripe->device_id);
+
+      dev = find_device (data, stripe->device_id);
+      if (!dev)
+	{
+	  buffers[i].data_is_valid = 0;
+	  grub_dprintf ("btrfs", "stripe %" PRIuGRUB_UINT64_T " FAILED (dev ID %"
+			PRIxGRUB_UINT64_T ")\n", i, stripe->device_id);
+	  continue;
+	}
+
+      err2 = grub_disk_read (dev->disk, paddr >> GRUB_DISK_SECTOR_BITS,
+			     paddr & (GRUB_DISK_SECTOR_SIZE - 1),
+			     csize, buffers[i].buf);
+      if (err2 == GRUB_ERR_NONE)
+	{
+	  buffers[i].data_is_valid = 1;
+	  grub_dprintf ("btrfs", "stripe %" PRIuGRUB_UINT64_T " Ok (dev ID %"
+			PRIxGRUB_UINT64_T ")\n", i, stripe->device_id);
+	}
+      else
+	{
+	  buffers[i].data_is_valid = 0;
+	  grub_dprintf ("btrfs", "stripe %" PRIuGRUB_UINT64_T
+			" FAILED (dev ID %" PRIxGRUB_UINT64_T ")\n", i,
+			stripe->device_id);
+	}
+    }
+
+  failed_devices = 0;
+  for (i = 0; i < nstripes; i++)
+    if (!buffers[i].data_is_valid)
+      ++failed_devices;
+  if (failed_devices > 1 && (chunk_type & GRUB_BTRFS_CHUNK_TYPE_RAID5))
+    {
+      grub_dprintf ("btrfs",
+		    "not enough disks for raid5: total %" PRIuGRUB_UINT64_T
+		    ", missing %" PRIuGRUB_UINT64_T "\n",
+		    nstripes, failed_devices);
+      ret = GRUB_ERR_READ_ERROR;
+      goto cleanup;
+    }
+  else
+    {
+      grub_dprintf ("btrfs",
+                    "enough disks for raid5/6 rebuilding: total %"
+		    PRIuGRUB_UINT64_T ", missing %" PRIuGRUB_UINT64_T "\n",
+                    nstripes, failed_devices);
+    }
+
+  /* if these are enough, try to rebuild the data */
+  if (chunk_type & GRUB_BTRFS_CHUNK_TYPE_RAID5)
+    {
+      rebuild_raid5 (buffers, nstripes, csize);
+      grub_memcpy (buf, buffers[stripen].buf, csize);
+    }
+  else
+    {
+      grub_dprintf ("btrfs", "called rebuild_raid6(), NOT IMPLEMENTED\n");
+    }
+
+cleanup:
+  if (buffers)
+    {
+      for (i = 0; i < nstripes; i++)
+	if (buffers[i].buf)
+	  grub_free(buffers[i].buf);
+      grub_free(buffers);
+    }
+
+  return ret;
+}
+
 static grub_err_t
 grub_btrfs_read_logical (struct grub_btrfs_data *data, grub_disk_addr_t addr,
 			 void *buf, grub_size_t size, int recursion_depth)
@@ -743,6 +888,11 @@ grub_btrfs_read_logical (struct grub_btrfs_data *data, grub_disk_addr_t addr,
 	grub_uint16_t nstripes;
 	unsigned redundancy = 1;
 	unsigned i, j;
+	int is_raid56;
+	grub_uint64_t parities_pos = 0;
+
+	is_raid56 = !!(grub_le_to_cpu64 (chunk->type) &
+		       GRUB_BTRFS_CHUNK_TYPE_RAID5);
 
 	if (grub_le_to_cpu64 (chunk->size) <= off)
 	  {
@@ -885,6 +1035,8 @@ grub_btrfs_read_logical (struct grub_btrfs_data *data, grub_disk_addr_t addr,
 	       * number of disks
 	       */
 	      grub_divmod64 (high + stripen, nstripes, &stripen);
+	      grub_divmod64 (high + nstripes - nparities, nstripes,
+			     &parities_pos);
 
 	      stripe_offset = low + chunk_stripe_length * high;
 	      csize = chunk_stripe_length - low;
@@ -917,15 +1069,29 @@ grub_btrfs_read_logical (struct grub_btrfs_data *data, grub_disk_addr_t addr,
 	    grub_dprintf ("btrfs", "reading laddr 0x%" PRIxGRUB_UINT64_T "\n",
 			  addr);
 
-	    for (i = 0; i < redundancy; i++)
+	    if (!is_raid56)
+	      {
+		for (i = 0; i < redundancy; i++)
+		  {
+		    err = btrfs_read_from_chunk (data, chunk, stripen,
+						 stripe_offset,
+						 i,     /* redundancy */
+						 csize, buf);
+		    if (err == GRUB_ERR_NONE)
+			break;
+		  }
+	      }
+	    else
 	      {
 		err = btrfs_read_from_chunk (data, chunk, stripen,
 					     stripe_offset,
-					     i,     /* redundancy */
+					     0,     /* no mirror */
 					     csize, buf);
-		if (err == GRUB_ERR_NONE)
-		    break;
+		if (err != GRUB_ERR_NONE)
+		  err = raid56_read_retry (data, chunk, stripe_offset,
+					   stripen, csize, buf);
 	      }
+
 	    if (err == GRUB_ERR_NONE)
 		break;
 	  }
-- 
2.17.0



^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH 8/9] btrfs: make more generic the code for RAID 6 rebuilding
  2018-05-16 18:48 [PATCH V4] Add support for BTRFS raid5/6 to GRUB Goffredo Baroncelli
                   ` (6 preceding siblings ...)
  2018-05-16 18:48 ` [PATCH 7/9] btrfs: add support for recovery for a RAID 5 btrfs profiles Goffredo Baroncelli
@ 2018-05-16 18:48 ` Goffredo Baroncelli
  2018-05-30 12:05   ` Daniel Kiper
  2018-05-16 18:48 ` [PATCH 9/9] btrfs: add RAID 6 recovery for a btrfs filesystem Goffredo Baroncelli
  8 siblings, 1 reply; 48+ messages in thread
From: Goffredo Baroncelli @ 2018-05-16 18:48 UTC (permalink / raw)
  To: grub-devel; +Cc: Goffredo Baroncelli

The original code which handles the recovery of a RAID 6 disks array
assumes that all reads are multiple of 1 << GRUB_DISK_SECTOR_BITS and it
assumes that all the I/O is done via the struct grub_diskfilter_segment.
This is not true for the btrfs code. In order to reuse the native
grub_raid6_recover() code, it is modified to not call
grub_diskfilter_read_node() directly, but to call an handler passed
as argument.

Signed-off-by: Goffredo Baroncelli <kreijack@inwind.it>
---
 grub-core/disk/raid6_recover.c | 53 ++++++++++++++++++++++------------
 include/grub/diskfilter.h      |  9 ++++++
 2 files changed, 44 insertions(+), 18 deletions(-)

diff --git a/grub-core/disk/raid6_recover.c b/grub-core/disk/raid6_recover.c
index aa674f6ca..10ee495e2 100644
--- a/grub-core/disk/raid6_recover.c
+++ b/grub-core/disk/raid6_recover.c
@@ -74,14 +74,25 @@ mod_255 (unsigned x)
 }
 
 static grub_err_t
-grub_raid6_recover (struct grub_diskfilter_segment *array, int disknr, int p,
-                    char *buf, grub_disk_addr_t sector, grub_size_t size)
+raid6_recover_read_node (void *data, int disk_nr,
+				    grub_uint64_t sector,
+				    void *buf, grub_size_t size)
+{
+    struct grub_diskfilter_segment *array = data;
+    return grub_diskfilter_read_node (&array->nodes[disk_nr],
+				      (grub_disk_addr_t)sector,
+				      size >> GRUB_DISK_SECTOR_BITS, buf);
+}
+
+grub_err_t
+grub_raid6_recover_gen (void *data, grub_uint64_t nstripes, int disknr, int p,
+			    char *buf, grub_uint64_t sector, grub_size_t size,
+			    raid_recover_read_t read_func, int layout)
 {
   int i, q, pos;
   int bad1 = -1, bad2 = -1;
   char *pbuf = 0, *qbuf = 0;
 
-  size <<= GRUB_DISK_SECTOR_BITS;
   pbuf = grub_zalloc (size);
   if (!pbuf)
     goto quit;
@@ -91,17 +102,17 @@ grub_raid6_recover (struct grub_diskfilter_segment *array, int disknr, int p,
     goto quit;
 
   q = p + 1;
-  if (q == (int) array->node_count)
+  if (q == (int) nstripes)
     q = 0;
 
   pos = q + 1;
-  if (pos == (int) array->node_count)
+  if (pos == (int) nstripes)
     pos = 0;
 
-  for (i = 0; i < (int) array->node_count - 2; i++)
+  for (i = 0; i < (int) nstripes - 2; i++)
     {
       int c;
-      if (array->layout & GRUB_RAID_LAYOUT_MUL_FROM_POS)
+      if (layout & GRUB_RAID_LAYOUT_MUL_FROM_POS)
 	c = pos;
       else
 	c = i;
@@ -109,8 +120,7 @@ grub_raid6_recover (struct grub_diskfilter_segment *array, int disknr, int p,
         bad1 = c;
       else
         {
-          if (! grub_diskfilter_read_node (&array->nodes[pos], sector,
-					   size >> GRUB_DISK_SECTOR_BITS, buf))
+	  if (!read_func(data, pos, sector, buf, size))
             {
               grub_crypto_xor (pbuf, pbuf, buf, size);
               grub_raid_block_mulx (c, buf, size);
@@ -128,7 +138,7 @@ grub_raid6_recover (struct grub_diskfilter_segment *array, int disknr, int p,
         }
 
       pos++;
-      if (pos == (int) array->node_count)
+      if (pos == (int) nstripes)
         pos = 0;
     }
 
@@ -139,16 +149,14 @@ grub_raid6_recover (struct grub_diskfilter_segment *array, int disknr, int p,
   if (bad2 < 0)
     {
       /* One bad device */
-      if ((! grub_diskfilter_read_node (&array->nodes[p], sector,
-					size >> GRUB_DISK_SECTOR_BITS, buf)))
+      if (!read_func(data, p, sector, buf, size))
         {
           grub_crypto_xor (buf, buf, pbuf, size);
           goto quit;
         }
 
       grub_errno = GRUB_ERR_NONE;
-      if (grub_diskfilter_read_node (&array->nodes[q], sector,
-				     size >> GRUB_DISK_SECTOR_BITS, buf))
+      if (read_func(data, q, sector, buf, size))
         goto quit;
 
       grub_crypto_xor (buf, buf, qbuf, size);
@@ -160,14 +168,12 @@ grub_raid6_recover (struct grub_diskfilter_segment *array, int disknr, int p,
       /* Two bad devices */
       unsigned c;
 
-      if (grub_diskfilter_read_node (&array->nodes[p], sector,
-				     size >> GRUB_DISK_SECTOR_BITS, buf))
+      if (read_func(data, p, sector, buf, size))
         goto quit;
 
       grub_crypto_xor (pbuf, pbuf, buf, size);
 
-      if (grub_diskfilter_read_node (&array->nodes[q], sector,
-				     size >> GRUB_DISK_SECTOR_BITS, buf))
+      if (read_func(data, q, sector, buf, size))
         goto quit;
 
       grub_crypto_xor (qbuf, qbuf, buf, size);
@@ -190,6 +196,17 @@ quit:
   return grub_errno;
 }
 
+static grub_err_t
+grub_raid6_recover (struct grub_diskfilter_segment *array, int disknr, int p,
+                    char *buf, grub_disk_addr_t sector, grub_size_t size)
+{
+
+  return grub_raid6_recover_gen (array, array->node_count, disknr, p, buf,
+				     sector, size << GRUB_DISK_SECTOR_BITS,
+				     raid6_recover_read_node,
+				     array->layout);
+}
+
 GRUB_MOD_INIT(raid6rec)
 {
   grub_raid6_init_table ();
diff --git a/include/grub/diskfilter.h b/include/grub/diskfilter.h
index d89273c1b..911888e33 100644
--- a/include/grub/diskfilter.h
+++ b/include/grub/diskfilter.h
@@ -189,6 +189,15 @@ typedef grub_err_t (*grub_raid6_recover_func_t) (struct grub_diskfilter_segment
 extern grub_raid5_recover_func_t grub_raid5_recover_func;
 extern grub_raid6_recover_func_t grub_raid6_recover_func;
 
+typedef grub_err_t (* raid_recover_read_t)(void *data, int disk_nr,
+					   grub_uint64_t addr, void *dest,
+					   grub_size_t size);
+
+grub_err_t
+grub_raid6_recover_gen (void *data, grub_uint64_t nstripes, int disknr, int p,
+			    char *buf, grub_uint64_t sector, grub_size_t size,
+			    raid_recover_read_t read_func, int layout);
+
 grub_err_t grub_diskfilter_vg_register (struct grub_diskfilter_vg *vg);
 
 grub_err_t
-- 
2.17.0



^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH 9/9] btrfs: add RAID 6 recovery for a btrfs filesystem.
  2018-05-16 18:48 [PATCH V4] Add support for BTRFS raid5/6 to GRUB Goffredo Baroncelli
                   ` (7 preceding siblings ...)
  2018-05-16 18:48 ` [PATCH 8/9] btrfs: make more generic the code for RAID 6 rebuilding Goffredo Baroncelli
@ 2018-05-16 18:48 ` Goffredo Baroncelli
  2018-05-30 12:15   ` Daniel Kiper
  8 siblings, 1 reply; 48+ messages in thread
From: Goffredo Baroncelli @ 2018-05-16 18:48 UTC (permalink / raw)
  To: grub-devel; +Cc: Goffredo Baroncelli

Add the RAID 6 recovery, in order to use a RAID 6 filesystem even if some
disks (up to two) are missing. This code use the old md RAID 6 code already
present in grub.

Signed-off-by: Goffredo Baroncelli <kreijack@inwind.it>
---
 grub-core/fs/btrfs.c | 45 +++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 40 insertions(+), 5 deletions(-)

diff --git a/grub-core/fs/btrfs.c b/grub-core/fs/btrfs.c
index 5fcaad86f..3d71b954e 100644
--- a/grub-core/fs/btrfs.c
+++ b/grub-core/fs/btrfs.c
@@ -30,6 +30,7 @@
 #include <grub/i18n.h>
 #include <grub/btrfs.h>
 #include <grub/crypto.h>
+#include <grub/diskfilter.h>
 
 GRUB_MOD_LICENSE ("GPLv3+");
 
@@ -695,11 +696,35 @@ rebuild_raid5 (struct raid56_buffer *buffers, grub_uint64_t nstripes,
                        csize);
 }
 
+static grub_err_t
+raid6_recover_read_buffer (void *data, int disk_nr,
+			   grub_uint64_t addr __attribute__ ((unused)),
+			   void *dest, grub_size_t size)
+{
+    struct raid56_buffer *buffers = data;
+
+    grub_memcpy(dest, buffers[disk_nr].buf, size);
+
+    grub_errno = buffers[disk_nr].data_is_valid ? GRUB_ERR_NONE : 
+		 GRUB_ERR_READ_ERROR;
+    return grub_errno;
+}
+
+static void
+rebuild_raid6 (struct raid56_buffer *buffers, grub_uint64_t nstripes,
+               grub_uint64_t csize, grub_uint64_t parities_pos, void *dest,
+               grub_uint64_t stripen)
+
+{
+  grub_raid6_recover_gen (buffers, nstripes, stripen, parities_pos,
+                          dest, 0, csize, raid6_recover_read_buffer, 0);
+}
+
 static grub_err_t
 raid56_read_retry (struct grub_btrfs_data *data,
 		   struct grub_btrfs_chunk_item *chunk,
 		   grub_uint64_t stripe_offset, grub_uint64_t stripen,
-		   grub_uint64_t csize, void *buf)
+		   grub_uint64_t csize, void *buf, grub_uint64_t parities_pos)
 {
 
   struct raid56_buffer *buffers = NULL;
@@ -780,6 +805,15 @@ raid56_read_retry (struct grub_btrfs_data *data,
       ret = GRUB_ERR_READ_ERROR;
       goto cleanup;
     }
+  else if (failed_devices > 2 && (chunk_type & GRUB_BTRFS_CHUNK_TYPE_RAID6))
+    {
+      grub_dprintf ("btrfs",
+		    "not enough disks for raid6: total %" PRIuGRUB_UINT64_T
+		    ", missing %" PRIuGRUB_UINT64_T "\n",
+		    nstripes, failed_devices);
+      ret = GRUB_ERR_READ_ERROR;
+      goto cleanup;
+    }
   else
     {
       grub_dprintf ("btrfs",
@@ -796,7 +830,7 @@ raid56_read_retry (struct grub_btrfs_data *data,
     }
   else
     {
-      grub_dprintf ("btrfs", "called rebuild_raid6(), NOT IMPLEMENTED\n");
+      rebuild_raid6 (buffers, nstripes, csize, parities_pos, buf, stripen);
     }
 
 cleanup:
@@ -891,8 +925,9 @@ grub_btrfs_read_logical (struct grub_btrfs_data *data, grub_disk_addr_t addr,
 	int is_raid56;
 	grub_uint64_t parities_pos = 0;
 
-	is_raid56 = !!(grub_le_to_cpu64 (chunk->type) &
-		       GRUB_BTRFS_CHUNK_TYPE_RAID5);
+        is_raid56 = !!(grub_le_to_cpu64 (chunk->type) &
+		       (GRUB_BTRFS_CHUNK_TYPE_RAID5|
+		        GRUB_BTRFS_CHUNK_TYPE_RAID6));
 
 	if (grub_le_to_cpu64 (chunk->size) <= off)
 	  {
@@ -1089,7 +1124,7 @@ grub_btrfs_read_logical (struct grub_btrfs_data *data, grub_disk_addr_t addr,
 					     csize, buf);
 		if (err != GRUB_ERR_NONE)
 		  err = raid56_read_retry (data, chunk, stripe_offset,
-					   stripen, csize, buf);
+					   stripen, csize, buf, parities_pos);
 	      }
 
 	    if (err == GRUB_ERR_NONE)
-- 
2.17.0



^ permalink raw reply related	[flat|nested] 48+ messages in thread

* Re: [PATCH 1/9] btrfs: add support for reading a filesystem with a RAID 5 or RAID 6 profile.
  2018-05-16 18:48 ` [PATCH 1/9] btrfs: add support for reading a filesystem with a RAID 5 or RAID 6 profile Goffredo Baroncelli
@ 2018-05-30 10:07   ` Daniel Kiper
  2018-06-01 18:50     ` Goffredo Baroncelli
  0 siblings, 1 reply; 48+ messages in thread
From: Daniel Kiper @ 2018-05-30 10:07 UTC (permalink / raw)
  To: The development of GNU GRUB; +Cc: Goffredo Baroncelli

On Wed, May 16, 2018 at 08:48:11PM +0200, Goffredo Baroncelli wrote:
> Signed-off-by: Goffredo Baroncelli <kreijack@inwind.it>
> ---
>  grub-core/fs/btrfs.c | 68 ++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 68 insertions(+)
>
> diff --git a/grub-core/fs/btrfs.c b/grub-core/fs/btrfs.c
> index be195448d..394611cbb 100644
> --- a/grub-core/fs/btrfs.c
> +++ b/grub-core/fs/btrfs.c
> @@ -119,6 +119,8 @@ struct grub_btrfs_chunk_item
>  #define GRUB_BTRFS_CHUNK_TYPE_RAID1         0x10
>  #define GRUB_BTRFS_CHUNK_TYPE_DUPLICATED    0x20
>  #define GRUB_BTRFS_CHUNK_TYPE_RAID10        0x40
> +#define GRUB_BTRFS_CHUNK_TYPE_RAID5         0x80
> +#define GRUB_BTRFS_CHUNK_TYPE_RAID6         0x100
>    grub_uint8_t dummy2[0xc];
>    grub_uint16_t nstripes;
>    grub_uint16_t nsubstripes;
> @@ -764,6 +766,72 @@ grub_btrfs_read_logical (struct grub_btrfs_data *data, grub_disk_addr_t addr,
>  	      stripe_offset = low + chunk_stripe_length
>  		* high;
>  	      csize = chunk_stripe_length - low;
> +	      break;
> +	    }
> +	  case GRUB_BTRFS_CHUNK_TYPE_RAID5:
> +	  case GRUB_BTRFS_CHUNK_TYPE_RAID6:
> +	    {
> +	      grub_uint64_t nparities, stripe_nr, high, low;
> +
> +	      redundancy = 1;	/* no redundancy for now */
> +
> +	      if (grub_le_to_cpu64 (chunk->type) & GRUB_BTRFS_CHUNK_TYPE_RAID5)
> +		{
> +		  grub_dprintf ("btrfs", "RAID5\n");
> +		  nparities = 1;
> +		}
> +	      else
> +		{
> +		  grub_dprintf ("btrfs", "RAID6\n");
> +		  nparities = 2;
> +		}
> +
> +	      /*
> +	       * Below an example of a RAID6 layout and the meaning of the

s/an example of a RAID6/is an example of a RAID 6/

> +	       * variables. The same apply to RAID5. The only differences is that

s/The same apply to RAID5/The same applies to RAID 5/

> +	       * there is only one parity instead of two.

s/parity/parity disk/

> +	       *
> +	       * A RAID6 6 layout consists in several stripes spread

s/RAID6 6/RAID 6/
s/consists in/consists of/

> +	       * on the disks, following a layout like the one below

s/on the disks/over the disks/

> +	       *
> +	       *   Disk1  Disk2  Disk3  Ddisk4
> +	       *
> +	       *    A1     B1     P1     Q1
> +	       *    Q2     A2     B2     P2
> +	       *    P3     Q3     A3     B3
> +	       *  [...]
> +	       *
> +	       *  Note that the placement of the parities depends on row index;

Please, please, please, do not abuse ";". Full stop is preffered at the
end of sentence. Please fix this in all patches.

> +	       *  In the code below:
> +	       *  stripe_nr -> is the stripe number not considering the parities
> +	       *               (A1=0, B1=1, A2 = 2, B2 = 3, ...)

May I ask you to do this like that.

The variables below have following meaning:
  - stripe_nr is the stripe number not considering the parities
    (A1 = 0, B1 = 1, A2 = 2, B2 = 3, ...),
  - high is the row number (0 for A1...Q1, 1 for Q2...P2, ...),
    ...
  - low is the offset of the data inside a stripe.

> +	       *  high -> is the row number (0 for A1...Q1, 1 for Q2..P2, ...)
> +	       *  stripen -> is the column number (or disk number)
> +	       *  off -> logical address to read (from the beginning of the
> +	       *         chunk space)

What do you mean by "chunk"? Is it e.g. A1 + B1 region? Please make it
clear what do you mean by "chunk".

> +	       *  chunk_stripe_length -> size of a stripe (typically 64k)
> +	       *  nstripes -> number of disks
> +	       *  low -> offset of the data inside a stripe

It looks that stripe_offset and csize explanation is missing here.

> +	       *
> +	       */
> +	      stripe_nr = grub_divmod64 (off, chunk_stripe_length, &low);
> +
> +	      /*
> +	       * In the line below stripen is evaluated without considering

s/In the line below //

> +	       * the parities (0 for A1, A2, A3... 1 for B1, B2...);

s/;/./

> +	       */
> +	      high = grub_divmod64 (stripe_nr, nstripes - nparities, &stripen);
> +
> +	      /*
> +	       * In the line below stripen, now consider also the parities (0

s/In the line below stripen, now/Now/

> +	       * for A1, 1 for A2, 2 for A3....); the math is done "modulo"

s/; the/. The/
s/"modulo"/modulo/

> +	       * number of disks

Full stop at the end of sentence please.

Daniel


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 2/9] btrfs: add helper to check the btrfs header.
  2018-05-16 18:48 ` [PATCH 2/9] btrfs: add helper to check the btrfs header Goffredo Baroncelli
@ 2018-05-30 10:12   ` Daniel Kiper
  0 siblings, 0 replies; 48+ messages in thread
From: Daniel Kiper @ 2018-05-30 10:12 UTC (permalink / raw)
  To: The development of GNU GRUB; +Cc: Goffredo Baroncelli

On Wed, May 16, 2018 at 08:48:12PM +0200, Goffredo Baroncelli wrote:
> This helper will be used in few places to help the debugging. As

s/in few/in a few/

> conservative approach, in case of error it is only logged. This

s/This/This is/

Otherwise LGTM.

Daniel


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 3/9] btrfs: move the error logging from find_device() to its callee.
  2018-05-16 18:48 ` [PATCH 3/9] btrfs: move the error logging from find_device() to its callee Goffredo Baroncelli
@ 2018-05-30 10:25   ` Daniel Kiper
  0 siblings, 0 replies; 48+ messages in thread
From: Daniel Kiper @ 2018-05-30 10:25 UTC (permalink / raw)
  To: The development of GNU GRUB; +Cc: Goffredo Baroncelli

On Wed, May 16, 2018 at 08:48:13PM +0200, Goffredo Baroncelli wrote:
> This is a preparatory patch. The callee knows better if this

s/callee/caller/

> error is fatal or not, i.e. another available disk.

s/i.e. another available disk./e.g. another disk is available or not./

And I think that you can reverse order of the sentences above.

WRT the subject:
  s/Move the error logging logic from find_device() to the callee./
    Move the error logging from find_device() to its caller/

Otherwise LGTM.

Daniel


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 4/9] btrfs: avoiding a rescan for a device which was already not founded.
  2018-05-16 18:48 ` [PATCH 4/9] btrfs: avoiding a rescan for a device which was already not founded Goffredo Baroncelli
@ 2018-05-30 10:43   ` Daniel Kiper
  0 siblings, 0 replies; 48+ messages in thread
From: Daniel Kiper @ 2018-05-30 10:43 UTC (permalink / raw)
  To: The development of GNU GRUB; +Cc: Goffredo Baroncelli

WRT the subject:
s/btrfs: avoiding a rescan for a device which was already not founded./
  btrfs: Avoid a rescan for a device which was already not found/

On Wed, May 16, 2018 at 08:48:14PM +0200, Goffredo Baroncelli wrote:
> If a device is not found, record this failure storing NULL in

s/failure storing/failure by storing/

> data->devices_attached[]. This avoid un-necessary devices rescan,
> speeding the reading in case of a degraded array.

Last sentence should sound more or less like that:

  This way we avoid unnecessary devices rescan and speedup the reads
  in case of degraded array.

Otherwise LGTM.

Daniel


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 5/9] btrfs: move logging code in grub_btrfs_read_logical()
  2018-05-16 18:48 ` [PATCH 5/9] btrfs: move logging code in grub_btrfs_read_logical() Goffredo Baroncelli
@ 2018-05-30 10:55   ` Daniel Kiper
  0 siblings, 0 replies; 48+ messages in thread
From: Daniel Kiper @ 2018-05-30 10:55 UTC (permalink / raw)
  To: The development of GNU GRUB; +Cc: Goffredo Baroncelli

On Wed, May 16, 2018 at 08:48:15PM +0200, Goffredo Baroncelli wrote:
> A portion of the logging code is moved outside a for(;;). The part that is

s/outside a for/outside of for/

> left inside is the one which depends by the for(;;) index.

s/depends by the for/depends on the for/

By the way, there are 2 for(;;). You have to be clear which one.

> This is a preparatory patch: in the next one it will be possible to refactor
> the code inside the for(;;) in an another function.

I think that you can drop these sentences. However, it seems to me that you
are no giving the reason why this move is needed for subsequent patches.
Please explain that in the commit message.

And missing SOB.

The code itself LGTM.

Daniel


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 6/9] btrfs: refactor the code that read from disk
  2018-05-16 18:48 ` [PATCH 6/9] btrfs: refactor the code that read from disk Goffredo Baroncelli
@ 2018-05-30 11:07   ` Daniel Kiper
  2018-06-01 18:50     ` Goffredo Baroncelli
  0 siblings, 1 reply; 48+ messages in thread
From: Daniel Kiper @ 2018-05-30 11:07 UTC (permalink / raw)
  To: The development of GNU GRUB; +Cc: Goffredo Baroncelli

On Wed, May 16, 2018 at 08:48:16PM +0200, Goffredo Baroncelli wrote:
> This is a preparatory patch, to help the adding of the RAID 5/6 recovery

Please move "This is a preparatory patch" sentence below...

> code. In case of availability of all disks, this code is good for all the
> RAID profiles. However in case of failure, the error handling is quite
> different. Refactoring this code increases the general readability.

s/readability/readability too/?

You can put "This is a preparatory patch" in separate line here.
Same for the other patches.

> Signed-off-by: Goffredo Baroncelli <kreijack@inwind.it>
> ---
>  grub-core/fs/btrfs.c | 85 +++++++++++++++++++++++++-------------------
>  1 file changed, 49 insertions(+), 36 deletions(-)
>
> diff --git a/grub-core/fs/btrfs.c b/grub-core/fs/btrfs.c
> index 51f300829..63651928b 100644
> --- a/grub-core/fs/btrfs.c
> +++ b/grub-core/fs/btrfs.c
> @@ -625,6 +625,47 @@ find_device (struct grub_btrfs_data *data, grub_uint64_t id)
>    return ctx.dev_found;
>  }
>
> +static grub_err_t
> +btrfs_read_from_chunk (struct grub_btrfs_data *data,
> +		       struct grub_btrfs_chunk_item *chunk,
> +		       grub_uint64_t stripen, grub_uint64_t stripe_offset,
> +		       int redundancy, grub_uint64_t csize,
> +		       void *buf)
> +{
> +
> +    struct grub_btrfs_chunk_stripe *stripe;
> +    grub_disk_addr_t paddr;
> +    grub_device_t dev;
> +    grub_err_t err;
> +
> +    stripe = (struct grub_btrfs_chunk_stripe *) (chunk + 1);
> +    /* Right now the redundancy handling is easy.
> +       With RAID5-like it will be more difficult.  */
> +    stripe += stripen + redundancy;
> +
> +    paddr = grub_le_to_cpu64 (stripe->offset) + stripe_offset;
> +
> +    grub_dprintf ("btrfs", "stripe %" PRIxGRUB_UINT64_T
> +		  " maps to 0x%" PRIxGRUB_UINT64_T "\n",
> +		  stripen, stripe->offset);
> +    grub_dprintf ("btrfs", "reading paddr 0x%" PRIxGRUB_UINT64_T "\n", paddr);
> +
> +    dev = find_device (data, stripe->device_id);
> +    if (!dev)
> +      {
> +	grub_dprintf ("btrfs",
> +		      "couldn't find a necessary member device "
> +		      "of multi-device filesystem\n");
> +	grub_errno = GRUB_ERR_NONE;
> +	return GRUB_ERR_READ_ERROR;
> +      }
> +
> +    err = grub_disk_read (dev->disk, paddr >> GRUB_DISK_SECTOR_BITS,
> +			  paddr & (GRUB_DISK_SECTOR_SIZE - 1),
> +			  csize, buf);
> +    return err;
> +}
> +
>  static grub_err_t
>  grub_btrfs_read_logical (struct grub_btrfs_data *data, grub_disk_addr_t addr,
>  			 void *buf, grub_size_t size, int recursion_depth)
> @@ -638,7 +679,6 @@ grub_btrfs_read_logical (struct grub_btrfs_data *data, grub_disk_addr_t addr,
>        grub_err_t err = 0;
>        struct grub_btrfs_key key_out;
>        int challoc = 0;
> -      grub_device_t dev;
>        struct grub_btrfs_key key_in;
>        grub_size_t chsize;
>        grub_disk_addr_t chaddr;
> @@ -879,42 +919,15 @@ grub_btrfs_read_logical (struct grub_btrfs_data *data, grub_disk_addr_t addr,
>
>  	    for (i = 0; i < redundancy; i++)
>  	      {
> -		struct grub_btrfs_chunk_stripe *stripe;
> -		grub_disk_addr_t paddr;
> -
> -		stripe = (struct grub_btrfs_chunk_stripe *) (chunk + 1);
> -		/* Right now the redundancy handling is easy.
> -		   With RAID5-like it will be more difficult.  */
> -		stripe += stripen + i;
> -
> -		paddr = grub_le_to_cpu64 (stripe->offset) + stripe_offset;
> -
> -		grub_dprintf ("btrfs", "stripe %" PRIxGRUB_UINT64_T
> -			      " maps to 0x%" PRIxGRUB_UINT64_T "\n",
> -			      stripen, stripe->offset);
> -		grub_dprintf ("btrfs", "reading paddr 0x%" PRIxGRUB_UINT64_T
> -			      "\n", paddr);
> -
> -		dev = find_device (data, stripe->device_id);
> -		if (!dev)
> -		  {
> -		    grub_dprintf ("btrfs",
> -				  "couldn't find a necessary member device "
> -				  "of multi-device filesystem\n");
> -		    err = grub_errno;
> -		    grub_errno = GRUB_ERR_NONE;
> -		    continue;
> -		  }
> -
> -		err = grub_disk_read (dev->disk, paddr >> GRUB_DISK_SECTOR_BITS,
> -				      paddr & (GRUB_DISK_SECTOR_SIZE - 1),
> -				      csize, buf);
> -		if (!err)
> -		  break;
> -		grub_errno = GRUB_ERR_NONE;

If you drop this line then you change behavior of this function.
I have a feeling that this should stay as is. At least for now.
If you need this change then probably it should be a part of the
other patch.

> +		err = btrfs_read_from_chunk (data, chunk, stripen,
> +					     stripe_offset,
> +					     i,     /* redundancy */
> +					     csize, buf);
> +		if (err == GRUB_ERR_NONE)
> +		    break;
>  	      }
> -	    if (i != redundancy)
> -	      break;
> +	    if (err == GRUB_ERR_NONE)
> +		break;

Ditto.

Daniel


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 7/9] btrfs: add support for recovery for a RAID 5 btrfs profiles.
  2018-05-16 18:48 ` [PATCH 7/9] btrfs: add support for recovery for a RAID 5 btrfs profiles Goffredo Baroncelli
@ 2018-05-30 11:30   ` Daniel Kiper
  2018-06-01 18:50     ` Goffredo Baroncelli
  0 siblings, 1 reply; 48+ messages in thread
From: Daniel Kiper @ 2018-05-30 11:30 UTC (permalink / raw)
  To: The development of GNU GRUB; +Cc: Goffredo Baroncelli

On Wed, May 16, 2018 at 08:48:17PM +0200, Goffredo Baroncelli wrote:
> Signed-off-by: Goffredo Baroncelli <kreijack@inwind.it>
> ---
>  grub-core/fs/btrfs.c | 174 ++++++++++++++++++++++++++++++++++++++++++-
>  1 file changed, 170 insertions(+), 4 deletions(-)
>
> diff --git a/grub-core/fs/btrfs.c b/grub-core/fs/btrfs.c
> index 63651928b..5fcaad86f 100644
> --- a/grub-core/fs/btrfs.c
> +++ b/grub-core/fs/btrfs.c
> @@ -29,6 +29,7 @@
>  #include <minilzo.h>
>  #include <grub/i18n.h>
>  #include <grub/btrfs.h>
> +#include <grub/crypto.h>
>
>  GRUB_MOD_LICENSE ("GPLv3+");
>
> @@ -666,6 +667,150 @@ btrfs_read_from_chunk (struct grub_btrfs_data *data,
>      return err;
>  }
>
> +struct raid56_buffer {
> +  void *buf;
> +  int  data_is_valid;
> +};
> +
> +static void
> +rebuild_raid5 (struct raid56_buffer *buffers, grub_uint64_t nstripes,
> +               grub_uint64_t csize)
> +{
> +  grub_uint64_t target = 0, i;
> +
> +  while (buffers[target].data_is_valid && target < nstripes)
> +    ++target;
> +
> +  if (target == nstripes)
> +    {
> +      grub_dprintf ("btrfs", "called rebuild_raid5(), but all disks are OK\n");
> +      return;
> +    }
> +
> +  grub_dprintf ("btrfs", "rebuilding raid5 stripe #%" PRIuGRUB_UINT64_T "\n",
> +	        target);
> +  for (i = 0; i < nstripes; i++)
> +    if (i != target)
> +      grub_crypto_xor (buffers[target].buf, buffers[target].buf, buffers[i].buf,
> +                       csize);
> +}
> +
> +static grub_err_t
> +raid56_read_retry (struct grub_btrfs_data *data,
> +		   struct grub_btrfs_chunk_item *chunk,
> +		   grub_uint64_t stripe_offset, grub_uint64_t stripen,
> +		   grub_uint64_t csize, void *buf)
> +{
> +
> +  struct raid56_buffer *buffers = NULL;
> +  grub_uint64_t nstripes = grub_le_to_cpu16 (chunk->nstripes);
> +  grub_uint64_t chunk_type = grub_le_to_cpu64 (chunk->type);
> +  grub_err_t ret = GRUB_ERR_NONE;
> +  grub_uint64_t i, failed_devices;
> +
> +  buffers = grub_zalloc (sizeof(*buffers) * nstripes);
> +  if (!buffers)
> +    {
> +      ret = GRUB_ERR_OUT_OF_MEMORY;
> +      goto cleanup;
> +    }
> +
> +  for (i = 0; i < nstripes; i++)
> +    {
> +      buffers[i].buf = grub_zalloc (csize);
> +      if (!buffers[i].buf)
> +	{
> +	  ret = GRUB_ERR_OUT_OF_MEMORY;
> +	  goto cleanup;
> +	}
> +    }
> +
> +  for (i = 0; i < nstripes; i++)
> +    {
> +      struct grub_btrfs_chunk_stripe *stripe;
> +      grub_disk_addr_t paddr;
> +      grub_device_t dev;
> +      grub_err_t err2;
> +
> +      stripe = (struct grub_btrfs_chunk_stripe *) (chunk + 1);
> +      stripe += i;
> +
> +      paddr = grub_le_to_cpu64 (stripe->offset) + stripe_offset;
> +      grub_dprintf ("btrfs", "reading paddr %" PRIxGRUB_UINT64_T
> +                    " from stripe ID %" PRIxGRUB_UINT64_T "\n", paddr,
> +                    stripe->device_id);
> +
> +      dev = find_device (data, stripe->device_id);
> +      if (!dev)
> +	{
> +	  buffers[i].data_is_valid = 0;
> +	  grub_dprintf ("btrfs", "stripe %" PRIuGRUB_UINT64_T " FAILED (dev ID %"
> +			PRIxGRUB_UINT64_T ")\n", i, stripe->device_id);
> +	  continue;

What will happen if more than one stripe is broken?

> +	}
> +
> +      err2 = grub_disk_read (dev->disk, paddr >> GRUB_DISK_SECTOR_BITS,
> +			     paddr & (GRUB_DISK_SECTOR_SIZE - 1),
> +			     csize, buffers[i].buf);
> +      if (err2 == GRUB_ERR_NONE)
> +	{
> +	  buffers[i].data_is_valid = 1;
> +	  grub_dprintf ("btrfs", "stripe %" PRIuGRUB_UINT64_T " Ok (dev ID %"
> +			PRIxGRUB_UINT64_T ")\n", i, stripe->device_id);
> +	}
> +      else
> +	{
> +	  buffers[i].data_is_valid = 0;
> +	  grub_dprintf ("btrfs", "stripe %" PRIuGRUB_UINT64_T
> +			" FAILED (dev ID %" PRIxGRUB_UINT64_T ")\n", i,
> +			stripe->device_id);

Ditto?

> +	}
> +    }
> +
> +  failed_devices = 0;
> +  for (i = 0; i < nstripes; i++)
> +    if (!buffers[i].data_is_valid)
> +      ++failed_devices;
> +  if (failed_devices > 1 && (chunk_type & GRUB_BTRFS_CHUNK_TYPE_RAID5))
> +    {
> +      grub_dprintf ("btrfs",
> +		    "not enough disks for raid5: total %" PRIuGRUB_UINT64_T

s/raid5/RAID5/ Please fix this and the rest of messages in the other patches.

> +		    ", missing %" PRIuGRUB_UINT64_T "\n",
> +		    nstripes, failed_devices);
> +      ret = GRUB_ERR_READ_ERROR;
> +      goto cleanup;

Ahhh... Here it is! Perfect!

> +    }
> +  else
> +    {
> +      grub_dprintf ("btrfs",
> +                    "enough disks for raid5/6 rebuilding: total %"

s#raid5/6#RAID5# This is the patch for RAID 5. Right?
Please do not mix RAID 5 changes with RAID 6 changes.

> +		    PRIuGRUB_UINT64_T ", missing %" PRIuGRUB_UINT64_T "\n",
> +                    nstripes, failed_devices);
> +    }
> +
> +  /* if these are enough, try to rebuild the data */
> +  if (chunk_type & GRUB_BTRFS_CHUNK_TYPE_RAID5)
> +    {
> +      rebuild_raid5 (buffers, nstripes, csize);
> +      grub_memcpy (buf, buffers[stripen].buf, csize);

Hmmm... Do we need this grub_memcpy()? Why rebuild_raid5() could
not fill buf directly? If it is not possible then I think that
grub_memcpy() should be called from rebuild_raid5().

> +    }
> +  else
> +    {
> +      grub_dprintf ("btrfs", "called rebuild_raid6(), NOT IMPLEMENTED\n");
> +    }

Please drop this curly brackets.

> +
> +cleanup:

Space before the label please.

> +  if (buffers)
> +    {
> +      for (i = 0; i < nstripes; i++)
> +	if (buffers[i].buf)
> +	  grub_free(buffers[i].buf);
> +      grub_free(buffers);
> +    }
> +
> +  return ret;
> +}
> +
>  static grub_err_t
>  grub_btrfs_read_logical (struct grub_btrfs_data *data, grub_disk_addr_t addr,
>  			 void *buf, grub_size_t size, int recursion_depth)
> @@ -743,6 +888,11 @@ grub_btrfs_read_logical (struct grub_btrfs_data *data, grub_disk_addr_t addr,
>  	grub_uint16_t nstripes;
>  	unsigned redundancy = 1;
>  	unsigned i, j;
> +	int is_raid56;
> +	grub_uint64_t parities_pos = 0;
> +
> +	is_raid56 = !!(grub_le_to_cpu64 (chunk->type) &
> +		       GRUB_BTRFS_CHUNK_TYPE_RAID5);

OK, I would leave this as is. However, I would mention
in the commit message that this patch also do some minor
preparatory work for RAID 6 support.

>
>  	if (grub_le_to_cpu64 (chunk->size) <= off)
>  	  {
> @@ -885,6 +1035,8 @@ grub_btrfs_read_logical (struct grub_btrfs_data *data, grub_disk_addr_t addr,
>  	       * number of disks
>  	       */
>  	      grub_divmod64 (high + stripen, nstripes, &stripen);
> +	      grub_divmod64 (high + nstripes - nparities, nstripes,
> +			     &parities_pos);
>
>  	      stripe_offset = low + chunk_stripe_length * high;
>  	      csize = chunk_stripe_length - low;
> @@ -917,15 +1069,29 @@ grub_btrfs_read_logical (struct grub_btrfs_data *data, grub_disk_addr_t addr,
>  	    grub_dprintf ("btrfs", "reading laddr 0x%" PRIxGRUB_UINT64_T "\n",
>  			  addr);
>
> -	    for (i = 0; i < redundancy; i++)
> +	    if (!is_raid56)
> +	      {
> +		for (i = 0; i < redundancy; i++)
> +		  {
> +		    err = btrfs_read_from_chunk (data, chunk, stripen,
> +						 stripe_offset,
> +						 i,     /* redundancy */
> +						 csize, buf);
> +		    if (err == GRUB_ERR_NONE)
> +			break;
> +		  }
> +	      }
> +	    else
>  	      {
>  		err = btrfs_read_from_chunk (data, chunk, stripen,
>  					     stripe_offset,
> -					     i,     /* redundancy */
> +					     0,     /* no mirror */
>  					     csize, buf);

I do not understand this change... Why?

> -		if (err == GRUB_ERR_NONE)
> -		    break;
> +		if (err != GRUB_ERR_NONE)
> +		  err = raid56_read_retry (data, chunk, stripe_offset,
> +					   stripen, csize, buf);

I have a feeling that this change is somehow related to my comment
for patch #6.

>  	      }
> +

Drop this change please.

>  	    if (err == GRUB_ERR_NONE)
>  		break;
>  	  }

Daniel


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 8/9] btrfs: make more generic the code for RAID 6 rebuilding
  2018-05-16 18:48 ` [PATCH 8/9] btrfs: make more generic the code for RAID 6 rebuilding Goffredo Baroncelli
@ 2018-05-30 12:05   ` Daniel Kiper
  2018-06-01 18:50     ` Goffredo Baroncelli
  0 siblings, 1 reply; 48+ messages in thread
From: Daniel Kiper @ 2018-05-30 12:05 UTC (permalink / raw)
  To: The development of GNU GRUB; +Cc: Goffredo Baroncelli

On Wed, May 16, 2018 at 08:48:18PM +0200, Goffredo Baroncelli wrote:
> The original code which handles the recovery of a RAID 6 disks array
> assumes that all reads are multiple of 1 << GRUB_DISK_SECTOR_BITS and it
> assumes that all the I/O is done via the struct grub_diskfilter_segment.
> This is not true for the btrfs code. In order to reuse the native
> grub_raid6_recover() code, it is modified to not call
> grub_diskfilter_read_node() directly, but to call an handler passed
> as argument.

s/as argument/as an argument/

> Signed-off-by: Goffredo Baroncelli <kreijack@inwind.it>
> ---
>  grub-core/disk/raid6_recover.c | 53 ++++++++++++++++++++++------------
>  include/grub/diskfilter.h      |  9 ++++++
>  2 files changed, 44 insertions(+), 18 deletions(-)
>
> diff --git a/grub-core/disk/raid6_recover.c b/grub-core/disk/raid6_recover.c
> index aa674f6ca..10ee495e2 100644
> --- a/grub-core/disk/raid6_recover.c
> +++ b/grub-core/disk/raid6_recover.c
> @@ -74,14 +74,25 @@ mod_255 (unsigned x)
>  }
>
>  static grub_err_t
> -grub_raid6_recover (struct grub_diskfilter_segment *array, int disknr, int p,
> -                    char *buf, grub_disk_addr_t sector, grub_size_t size)
> +raid6_recover_read_node (void *data, int disk_nr,

s/disk_nr/disknr/

> +				    grub_uint64_t sector,
> +				    void *buf, grub_size_t size)
> +{
> +    struct grub_diskfilter_segment *array = data;

Please add one empty line here.

> +    return grub_diskfilter_read_node (&array->nodes[disk_nr],
> +				      (grub_disk_addr_t)sector,
> +				      size >> GRUB_DISK_SECTOR_BITS, buf);
> +}
> +
> +grub_err_t
> +grub_raid6_recover_gen (void *data, grub_uint64_t nstripes, int disknr, int p,
> +			    char *buf, grub_uint64_t sector, grub_size_t size,
> +			    raid_recover_read_t read_func, int layout)

Could you change the order of read_func and layout arguments?
I mean make read_func the last one and put the layout before it.

>  {
>    int i, q, pos;
>    int bad1 = -1, bad2 = -1;
>    char *pbuf = 0, *qbuf = 0;
>
> -  size <<= GRUB_DISK_SECTOR_BITS;
>    pbuf = grub_zalloc (size);
>    if (!pbuf)
>      goto quit;
> @@ -91,17 +102,17 @@ grub_raid6_recover (struct grub_diskfilter_segment *array, int disknr, int p,
>      goto quit;
>
>    q = p + 1;
> -  if (q == (int) array->node_count)
> +  if (q == (int) nstripes)
>      q = 0;
>
>    pos = q + 1;
> -  if (pos == (int) array->node_count)
> +  if (pos == (int) nstripes)
>      pos = 0;
>
> -  for (i = 0; i < (int) array->node_count - 2; i++)
> +  for (i = 0; i < (int) nstripes - 2; i++)
>      {
>        int c;
> -      if (array->layout & GRUB_RAID_LAYOUT_MUL_FROM_POS)
> +      if (layout & GRUB_RAID_LAYOUT_MUL_FROM_POS)
>  	c = pos;
>        else
>  	c = i;
> @@ -109,8 +120,7 @@ grub_raid6_recover (struct grub_diskfilter_segment *array, int disknr, int p,
>          bad1 = c;
>        else
>          {
> -          if (! grub_diskfilter_read_node (&array->nodes[pos], sector,
> -					   size >> GRUB_DISK_SECTOR_BITS, buf))
> +	  if (!read_func(data, pos, sector, buf, size))
>              {
>                grub_crypto_xor (pbuf, pbuf, buf, size);
>                grub_raid_block_mulx (c, buf, size);
> @@ -128,7 +138,7 @@ grub_raid6_recover (struct grub_diskfilter_segment *array, int disknr, int p,
>          }
>
>        pos++;
> -      if (pos == (int) array->node_count)
> +      if (pos == (int) nstripes)
>          pos = 0;
>      }
>
> @@ -139,16 +149,14 @@ grub_raid6_recover (struct grub_diskfilter_segment *array, int disknr, int p,
>    if (bad2 < 0)
>      {
>        /* One bad device */
> -      if ((! grub_diskfilter_read_node (&array->nodes[p], sector,
> -					size >> GRUB_DISK_SECTOR_BITS, buf)))
> +      if (!read_func(data, p, sector, buf, size))
>          {
>            grub_crypto_xor (buf, buf, pbuf, size);
>            goto quit;
>          }
>
>        grub_errno = GRUB_ERR_NONE;
> -      if (grub_diskfilter_read_node (&array->nodes[q], sector,
> -				     size >> GRUB_DISK_SECTOR_BITS, buf))
> +      if (read_func(data, q, sector, buf, size))
>          goto quit;
>
>        grub_crypto_xor (buf, buf, qbuf, size);
> @@ -160,14 +168,12 @@ grub_raid6_recover (struct grub_diskfilter_segment *array, int disknr, int p,
>        /* Two bad devices */
>        unsigned c;
>
> -      if (grub_diskfilter_read_node (&array->nodes[p], sector,
> -				     size >> GRUB_DISK_SECTOR_BITS, buf))
> +      if (read_func(data, p, sector, buf, size))
>          goto quit;
>
>        grub_crypto_xor (pbuf, pbuf, buf, size);
>
> -      if (grub_diskfilter_read_node (&array->nodes[q], sector,
> -				     size >> GRUB_DISK_SECTOR_BITS, buf))
> +      if (read_func(data, q, sector, buf, size))
>          goto quit;
>
>        grub_crypto_xor (qbuf, qbuf, buf, size);
> @@ -190,6 +196,17 @@ quit:
>    return grub_errno;
>  }
>
> +static grub_err_t
> +grub_raid6_recover (struct grub_diskfilter_segment *array, int disknr, int p,
> +                    char *buf, grub_disk_addr_t sector, grub_size_t size)
> +{
> +

Drop this empty line.

> +  return grub_raid6_recover_gen (array, array->node_count, disknr, p, buf,
> +				     sector, size << GRUB_DISK_SECTOR_BITS,
> +				     raid6_recover_read_node,
> +				     array->layout);
> +}
> +
>  GRUB_MOD_INIT(raid6rec)
>  {
>    grub_raid6_init_table ();
> diff --git a/include/grub/diskfilter.h b/include/grub/diskfilter.h
> index d89273c1b..911888e33 100644
> --- a/include/grub/diskfilter.h
> +++ b/include/grub/diskfilter.h
> @@ -189,6 +189,15 @@ typedef grub_err_t (*grub_raid6_recover_func_t) (struct grub_diskfilter_segment
>  extern grub_raid5_recover_func_t grub_raid5_recover_func;
>  extern grub_raid6_recover_func_t grub_raid6_recover_func;
>
> +typedef grub_err_t (* raid_recover_read_t)(void *data, int disk_nr,
> +					   grub_uint64_t addr, void *dest,
> +					   grub_size_t size);
> +
> +grub_err_t
> +grub_raid6_recover_gen (void *data, grub_uint64_t nstripes, int disknr, int p,
> +			    char *buf, grub_uint64_t sector, grub_size_t size,
> +			    raid_recover_read_t read_func, int layout);

Please add extern here.

Daniel


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 9/9] btrfs: add RAID 6 recovery for a btrfs filesystem.
  2018-05-16 18:48 ` [PATCH 9/9] btrfs: add RAID 6 recovery for a btrfs filesystem Goffredo Baroncelli
@ 2018-05-30 12:15   ` Daniel Kiper
  2018-06-01 18:50     ` Goffredo Baroncelli
  0 siblings, 1 reply; 48+ messages in thread
From: Daniel Kiper @ 2018-05-30 12:15 UTC (permalink / raw)
  To: The development of GNU GRUB; +Cc: Goffredo Baroncelli

On Wed, May 16, 2018 at 08:48:19PM +0200, Goffredo Baroncelli wrote:
> Add the RAID 6 recovery, in order to use a RAID 6 filesystem even if some
> disks (up to two) are missing. This code use the old md RAID 6 code already

s/the old md/the md/

> present in grub.
>
> Signed-off-by: Goffredo Baroncelli <kreijack@inwind.it>
> ---
>  grub-core/fs/btrfs.c | 45 +++++++++++++++++++++++++++++++++++++++-----
>  1 file changed, 40 insertions(+), 5 deletions(-)
>
> diff --git a/grub-core/fs/btrfs.c b/grub-core/fs/btrfs.c
> index 5fcaad86f..3d71b954e 100644
> --- a/grub-core/fs/btrfs.c
> +++ b/grub-core/fs/btrfs.c
> @@ -30,6 +30,7 @@
>  #include <grub/i18n.h>
>  #include <grub/btrfs.h>
>  #include <grub/crypto.h>
> +#include <grub/diskfilter.h>
>
>  GRUB_MOD_LICENSE ("GPLv3+");
>
> @@ -695,11 +696,35 @@ rebuild_raid5 (struct raid56_buffer *buffers, grub_uint64_t nstripes,
>                         csize);
>  }
>
> +static grub_err_t
> +raid6_recover_read_buffer (void *data, int disk_nr,
> +			   grub_uint64_t addr __attribute__ ((unused)),
> +			   void *dest, grub_size_t size)
> +{
> +    struct raid56_buffer *buffers = data;
> +
> +    grub_memcpy(dest, buffers[disk_nr].buf, size);

Could we avoid this grub_memcpy() call somehow?

> +    grub_errno = buffers[disk_nr].data_is_valid ? GRUB_ERR_NONE :
> +		 GRUB_ERR_READ_ERROR;
> +    return grub_errno;
> +}
> +
> +static void
> +rebuild_raid6 (struct raid56_buffer *buffers, grub_uint64_t nstripes,
> +               grub_uint64_t csize, grub_uint64_t parities_pos, void *dest,
> +               grub_uint64_t stripen)
> +
> +{
> +  grub_raid6_recover_gen (buffers, nstripes, stripen, parities_pos,
> +                          dest, 0, csize, raid6_recover_read_buffer, 0);
> +}
> +
>  static grub_err_t
>  raid56_read_retry (struct grub_btrfs_data *data,
>  		   struct grub_btrfs_chunk_item *chunk,
>  		   grub_uint64_t stripe_offset, grub_uint64_t stripen,
> -		   grub_uint64_t csize, void *buf)
> +		   grub_uint64_t csize, void *buf, grub_uint64_t parities_pos)
>  {
>
>    struct raid56_buffer *buffers = NULL;
> @@ -780,6 +805,15 @@ raid56_read_retry (struct grub_btrfs_data *data,
>        ret = GRUB_ERR_READ_ERROR;
>        goto cleanup;
>      }
> +  else if (failed_devices > 2 && (chunk_type & GRUB_BTRFS_CHUNK_TYPE_RAID6))
> +    {
> +      grub_dprintf ("btrfs",
> +		    "not enough disks for raid6: total %" PRIuGRUB_UINT64_T
> +		    ", missing %" PRIuGRUB_UINT64_T "\n",
> +		    nstripes, failed_devices);
> +      ret = GRUB_ERR_READ_ERROR;
> +      goto cleanup;
> +    }
>    else
>      {
>        grub_dprintf ("btrfs",
> @@ -796,7 +830,7 @@ raid56_read_retry (struct grub_btrfs_data *data,
>      }
>    else
>      {
> -      grub_dprintf ("btrfs", "called rebuild_raid6(), NOT IMPLEMENTED\n");
> +      rebuild_raid6 (buffers, nstripes, csize, parities_pos, buf, stripen);
>      }
>
>  cleanup:
> @@ -891,8 +925,9 @@ grub_btrfs_read_logical (struct grub_btrfs_data *data, grub_disk_addr_t addr,
>  	int is_raid56;
>  	grub_uint64_t parities_pos = 0;
>
> -	is_raid56 = !!(grub_le_to_cpu64 (chunk->type) &
> -		       GRUB_BTRFS_CHUNK_TYPE_RAID5);
> +        is_raid56 = !!(grub_le_to_cpu64 (chunk->type) &
> +		       (GRUB_BTRFS_CHUNK_TYPE_RAID5|

Space before "|" please?

> +		        GRUB_BTRFS_CHUNK_TYPE_RAID6));
>
>  	if (grub_le_to_cpu64 (chunk->size) <= off)
>  	  {
> @@ -1089,7 +1124,7 @@ grub_btrfs_read_logical (struct grub_btrfs_data *data, grub_disk_addr_t addr,
>  					     csize, buf);
>  		if (err != GRUB_ERR_NONE)
>  		  err = raid56_read_retry (data, chunk, stripe_offset,
> -					   stripen, csize, buf);
> +					   stripen, csize, buf, parities_pos);

I am not sure what is parities_pos and where it is initialized.
If it is a part of earlier patch but it is not used then I think
that it should be a part of this patch.

Daniel


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 9/9] btrfs: add RAID 6 recovery for a btrfs filesystem.
  2018-05-30 12:15   ` Daniel Kiper
@ 2018-06-01 18:50     ` Goffredo Baroncelli
  0 siblings, 0 replies; 48+ messages in thread
From: Goffredo Baroncelli @ 2018-06-01 18:50 UTC (permalink / raw)
  To: Daniel Kiper, The development of GNU GRUB

On 05/30/2018 02:15 PM, Daniel Kiper wrote:
> On Wed, May 16, 2018 at 08:48:19PM +0200, Goffredo Baroncelli wrote:
>> Add the RAID 6 recovery, in order to use a RAID 6 filesystem even if some
>> disks (up to two) are missing. This code use the old md RAID 6 code already
> 
> s/the old md/the md/

ok

>> present in grub.
>>
>> Signed-off-by: Goffredo Baroncelli <kreijack@inwind.it>
>> ---
>>  grub-core/fs/btrfs.c | 45 +++++++++++++++++++++++++++++++++++++++-----
>>  1 file changed, 40 insertions(+), 5 deletions(-)
>>
>> diff --git a/grub-core/fs/btrfs.c b/grub-core/fs/btrfs.c
>> index 5fcaad86f..3d71b954e 100644
>> --- a/grub-core/fs/btrfs.c
>> +++ b/grub-core/fs/btrfs.c
>> @@ -30,6 +30,7 @@
>>  #include <grub/i18n.h>
>>  #include <grub/btrfs.h>
>>  #include <grub/crypto.h>
>> +#include <grub/diskfilter.h>
>>
>>  GRUB_MOD_LICENSE ("GPLv3+");
>>
>> @@ -695,11 +696,35 @@ rebuild_raid5 (struct raid56_buffer *buffers, grub_uint64_t nstripes,
>>                         csize);
>>  }
>>
>> +static grub_err_t
>> +raid6_recover_read_buffer (void *data, int disk_nr,
>> +			   grub_uint64_t addr __attribute__ ((unused)),
>> +			   void *dest, grub_size_t size)
>> +{
>> +    struct raid56_buffer *buffers = data;
>> +
>> +    grub_memcpy(dest, buffers[disk_nr].buf, size);
> 
> Could we avoid this grub_memcpy() call somehow?

I would like; however this would mean that the handling of the raid 5 and raid 6 rebuilding code would be totally different. This has a big impact on the patches set and the related tests performed until now.
Are you sure that this effort will be really needed ?
> 
>> +    grub_errno = buffers[disk_nr].data_is_valid ? GRUB_ERR_NONE :
>> +		 GRUB_ERR_READ_ERROR;
>> +    return grub_errno;
>> +}
>> +
>> +static void
>> +rebuild_raid6 (struct raid56_buffer *buffers, grub_uint64_t nstripes,
>> +               grub_uint64_t csize, grub_uint64_t parities_pos, void *dest,
>> +               grub_uint64_t stripen)
>> +
>> +{
>> +  grub_raid6_recover_gen (buffers, nstripes, stripen, parities_pos,
>> +                          dest, 0, csize, raid6_recover_read_buffer, 0);
>> +}
>> +
>>  static grub_err_t
>>  raid56_read_retry (struct grub_btrfs_data *data,
>>  		   struct grub_btrfs_chunk_item *chunk,
>>  		   grub_uint64_t stripe_offset, grub_uint64_t stripen,
>> -		   grub_uint64_t csize, void *buf)
>> +		   grub_uint64_t csize, void *buf, grub_uint64_t parities_pos)
>>  {
>>
>>    struct raid56_buffer *buffers = NULL;
>> @@ -780,6 +805,15 @@ raid56_read_retry (struct grub_btrfs_data *data,
>>        ret = GRUB_ERR_READ_ERROR;
>>        goto cleanup;
>>      }
>> +  else if (failed_devices > 2 && (chunk_type & GRUB_BTRFS_CHUNK_TYPE_RAID6))
>> +    {
>> +      grub_dprintf ("btrfs",
>> +		    "not enough disks for raid6: total %" PRIuGRUB_UINT64_T
>> +		    ", missing %" PRIuGRUB_UINT64_T "\n",
>> +		    nstripes, failed_devices);
>> +      ret = GRUB_ERR_READ_ERROR;
>> +      goto cleanup;
>> +    }
>>    else
>>      {
>>        grub_dprintf ("btrfs",
>> @@ -796,7 +830,7 @@ raid56_read_retry (struct grub_btrfs_data *data,
>>      }
>>    else
>>      {
>> -      grub_dprintf ("btrfs", "called rebuild_raid6(), NOT IMPLEMENTED\n");
>> +      rebuild_raid6 (buffers, nstripes, csize, parities_pos, buf, stripen);
>>      }
>>
>>  cleanup:
>> @@ -891,8 +925,9 @@ grub_btrfs_read_logical (struct grub_btrfs_data *data, grub_disk_addr_t addr,
>>  	int is_raid56;
>>  	grub_uint64_t parities_pos = 0;
>>
>> -	is_raid56 = !!(grub_le_to_cpu64 (chunk->type) &
>> -		       GRUB_BTRFS_CHUNK_TYPE_RAID5);
>> +        is_raid56 = !!(grub_le_to_cpu64 (chunk->type) &
>> +		       (GRUB_BTRFS_CHUNK_TYPE_RAID5|
> 
> Space before "|" please?
OK
> 
>> +		        GRUB_BTRFS_CHUNK_TYPE_RAID6));
>>
>>  	if (grub_le_to_cpu64 (chunk->size) <= off)
>>  	  {
>> @@ -1089,7 +1124,7 @@ grub_btrfs_read_logical (struct grub_btrfs_data *data, grub_disk_addr_t addr,
>>  					     csize, buf);
>>  		if (err != GRUB_ERR_NONE)
>>  		  err = raid56_read_retry (data, chunk, stripe_offset,
>> -					   stripen, csize, buf);
>> +					   stripen, csize, buf, parities_pos);
> 
> I am not sure what is parities_pos and where it is initialized.
> If it is a part of earlier patch but it is not used then I think
> that it should be a part of this patch.
You are right
> 
> Daniel
> 

I think that I implemented all your requests in the next patches set. The only exception is the removing the memcpy in the raid6 recovery path.
Now I am starting the test. when I will completed the test phase, I will resend the patches.

BR
G.Baroncelli
-- 
gpg @keyserver.linux.it: Goffredo Baroncelli <kreijackATinwind.it>
Key fingerprint BBF5 1610 0B64 DAC6 5F7D  17B2 0EDA 9B37 8B82 E0B5


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 7/9] btrfs: add support for recovery for a RAID 5 btrfs profiles.
  2018-05-30 11:30   ` Daniel Kiper
@ 2018-06-01 18:50     ` Goffredo Baroncelli
  0 siblings, 0 replies; 48+ messages in thread
From: Goffredo Baroncelli @ 2018-06-01 18:50 UTC (permalink / raw)
  To: Daniel Kiper, The development of GNU GRUB

On 05/30/2018 01:30 PM, Daniel Kiper wrote:
> On Wed, May 16, 2018 at 08:48:17PM +0200, Goffredo Baroncelli wrote:
>> Signed-off-by: Goffredo Baroncelli <kreijack@inwind.it>
>> ---
>>  grub-core/fs/btrfs.c | 174 ++++++++++++++++++++++++++++++++++++++++++-
>>  1 file changed, 170 insertions(+), 4 deletions(-)
>>
>> diff --git a/grub-core/fs/btrfs.c b/grub-core/fs/btrfs.c
>> index 63651928b..5fcaad86f 100644
>> --- a/grub-core/fs/btrfs.c
>> +++ b/grub-core/fs/btrfs.c
>> @@ -29,6 +29,7 @@
>>  #include <minilzo.h>
>>  #include <grub/i18n.h>
>>  #include <grub/btrfs.h>
>> +#include <grub/crypto.h>
>>
>>  GRUB_MOD_LICENSE ("GPLv3+");
>>
>> @@ -666,6 +667,150 @@ btrfs_read_from_chunk (struct grub_btrfs_data *data,
>>      return err;
>>  }
>>
>> +struct raid56_buffer {
>> +  void *buf;
>> +  int  data_is_valid;
>> +};
>> +
>> +static void
>> +rebuild_raid5 (struct raid56_buffer *buffers, grub_uint64_t nstripes,
>> +               grub_uint64_t csize)
>> +{
>> +  grub_uint64_t target = 0, i;
>> +
>> +  while (buffers[target].data_is_valid && target < nstripes)
>> +    ++target;
>> +
>> +  if (target == nstripes)
>> +    {
>> +      grub_dprintf ("btrfs", "called rebuild_raid5(), but all disks are OK\n");
>> +      return;
>> +    }
>> +
>> +  grub_dprintf ("btrfs", "rebuilding raid5 stripe #%" PRIuGRUB_UINT64_T "\n",
>> +	        target);
>> +  for (i = 0; i < nstripes; i++)
>> +    if (i != target)
>> +      grub_crypto_xor (buffers[target].buf, buffers[target].buf, buffers[i].buf,
>> +                       csize);
>> +}
>> +
>> +static grub_err_t
>> +raid56_read_retry (struct grub_btrfs_data *data,
>> +		   struct grub_btrfs_chunk_item *chunk,
>> +		   grub_uint64_t stripe_offset, grub_uint64_t stripen,
>> +		   grub_uint64_t csize, void *buf)
>> +{
>> +
>> +  struct raid56_buffer *buffers = NULL;
>> +  grub_uint64_t nstripes = grub_le_to_cpu16 (chunk->nstripes);
>> +  grub_uint64_t chunk_type = grub_le_to_cpu64 (chunk->type);
>> +  grub_err_t ret = GRUB_ERR_NONE;
>> +  grub_uint64_t i, failed_devices;
>> +
>> +  buffers = grub_zalloc (sizeof(*buffers) * nstripes);
>> +  if (!buffers)
>> +    {
>> +      ret = GRUB_ERR_OUT_OF_MEMORY;
>> +      goto cleanup;
>> +    }
>> +
>> +  for (i = 0; i < nstripes; i++)
>> +    {
>> +      buffers[i].buf = grub_zalloc (csize);
>> +      if (!buffers[i].buf)
>> +	{
>> +	  ret = GRUB_ERR_OUT_OF_MEMORY;
>> +	  goto cleanup;
>> +	}
>> +    }
>> +
>> +  for (i = 0; i < nstripes; i++)
>> +    {
>> +      struct grub_btrfs_chunk_stripe *stripe;
>> +      grub_disk_addr_t paddr;
>> +      grub_device_t dev;
>> +      grub_err_t err2;
>> +
>> +      stripe = (struct grub_btrfs_chunk_stripe *) (chunk + 1);
>> +      stripe += i;
>> +
>> +      paddr = grub_le_to_cpu64 (stripe->offset) + stripe_offset;
>> +      grub_dprintf ("btrfs", "reading paddr %" PRIxGRUB_UINT64_T
>> +                    " from stripe ID %" PRIxGRUB_UINT64_T "\n", paddr,
>> +                    stripe->device_id);
>> +
>> +      dev = find_device (data, stripe->device_id);
>> +      if (!dev)
>> +	{
>> +	  buffers[i].data_is_valid = 0;
>> +	  grub_dprintf ("btrfs", "stripe %" PRIuGRUB_UINT64_T " FAILED (dev ID %"
>> +			PRIxGRUB_UINT64_T ")\n", i, stripe->device_id);
>> +	  continue;
> 
> What will happen if more than one stripe is broken?
See below
> 
>> +	}
>> +
>> +      err2 = grub_disk_read (dev->disk, paddr >> GRUB_DISK_SECTOR_BITS,
>> +			     paddr & (GRUB_DISK_SECTOR_SIZE - 1),
>> +			     csize, buffers[i].buf);
>> +      if (err2 == GRUB_ERR_NONE)
>> +	{
>> +	  buffers[i].data_is_valid = 1;
>> +	  grub_dprintf ("btrfs", "stripe %" PRIuGRUB_UINT64_T " Ok (dev ID %"
>> +			PRIxGRUB_UINT64_T ")\n", i, stripe->device_id);
>> +	}
>> +      else
>> +	{
>> +	  buffers[i].data_is_valid = 0;
>> +	  grub_dprintf ("btrfs", "stripe %" PRIuGRUB_UINT64_T
>> +			" FAILED (dev ID %" PRIxGRUB_UINT64_T ")\n", i,
>> +			stripe->device_id);
> 
> Ditto?
See below
> 
>> +	}
>> +    }
>> +
>> +  failed_devices = 0;
>> +  for (i = 0; i < nstripes; i++)
>> +    if (!buffers[i].data_is_valid)
>> +      ++failed_devices;
>> +  if (failed_devices > 1 && (chunk_type & GRUB_BTRFS_CHUNK_TYPE_RAID5))
>> +    {
>> +      grub_dprintf ("btrfs",
>> +		    "not enough disks for raid5: total %" PRIuGRUB_UINT64_T
> 
> s/raid5/RAID5/ Please fix this and the rest of messages in the other patches.
OK
> 
>> +		    ", missing %" PRIuGRUB_UINT64_T "\n",
>> +		    nstripes, failed_devices);
>> +      ret = GRUB_ERR_READ_ERROR;
>> +      goto cleanup;
> 
> Ahhh... Here it is! Perfect!
Right !
> 
>> +    }
>> +  else
>> +    {
>> +      grub_dprintf ("btrfs",
>> +                    "enough disks for raid5/6 rebuilding: total %"
> 
> s#raid5/6#RAID5# This is the patch for RAID 5. Right?
> Please do not mix RAID 5 changes with RAID 6 changes.
ok
> 
>> +		    PRIuGRUB_UINT64_T ", missing %" PRIuGRUB_UINT64_T "\n",
>> +                    nstripes, failed_devices);
>> +    }
>> +
>> +  /* if these are enough, try to rebuild the data */
>> +  if (chunk_type & GRUB_BTRFS_CHUNK_TYPE_RAID5)
>> +    {
>> +      rebuild_raid5 (buffers, nstripes, csize);
>> +      grub_memcpy (buf, buffers[stripen].buf, csize);
> 
> Hmmm... Do we need this grub_memcpy()? Why rebuild_raid5() could
> not fill buf directly? If it is not possible then I think that
> grub_memcpy() should be called from rebuild_raid5().
I removed it
> 
>> +    }
>> +  else
>> +    {
>> +      grub_dprintf ("btrfs", "called rebuild_raid6(), NOT IMPLEMENTED\n");
>> +    }
> 
> Please drop this curly brackets.
Ok, 
> 
>> +
>> +cleanup:
> 
> Space before the label please.
ok
> 
>> +  if (buffers)
>> +    {
>> +      for (i = 0; i < nstripes; i++)
>> +	if (buffers[i].buf)
>> +	  grub_free(buffers[i].buf);
>> +      grub_free(buffers);
>> +    }
>> +
>> +  return ret;
>> +}
>> +
>>  static grub_err_t
>>  grub_btrfs_read_logical (struct grub_btrfs_data *data, grub_disk_addr_t addr,
>>  			 void *buf, grub_size_t size, int recursion_depth)
>> @@ -743,6 +888,11 @@ grub_btrfs_read_logical (struct grub_btrfs_data *data, grub_disk_addr_t addr,
>>  	grub_uint16_t nstripes;
>>  	unsigned redundancy = 1;
>>  	unsigned i, j;
>> +	int is_raid56;
>> +	grub_uint64_t parities_pos = 0;
>> +
>> +	is_raid56 = !!(grub_le_to_cpu64 (chunk->type) &
>> +		       GRUB_BTRFS_CHUNK_TYPE_RAID5);
> 
> OK, I would leave this as is. However, I would mention
> in the commit message that this patch also do some minor
> preparatory work for RAID 6 support.
OK
> 
>>
>>  	if (grub_le_to_cpu64 (chunk->size) <= off)
>>  	  {
>> @@ -885,6 +1035,8 @@ grub_btrfs_read_logical (struct grub_btrfs_data *data, grub_disk_addr_t addr,
>>  	       * number of disks
>>  	       */
>>  	      grub_divmod64 (high + stripen, nstripes, &stripen);
>> +	      grub_divmod64 (high + nstripes - nparities, nstripes,
>> +			     &parities_pos);
>>
>>  	      stripe_offset = low + chunk_stripe_length * high;
>>  	      csize = chunk_stripe_length - low;
>> @@ -917,15 +1069,29 @@ grub_btrfs_read_logical (struct grub_btrfs_data *data, grub_disk_addr_t addr,
>>  	    grub_dprintf ("btrfs", "reading laddr 0x%" PRIxGRUB_UINT64_T "\n",
>>  			  addr);
>>
>> -	    for (i = 0; i < redundancy; i++)
>> +	    if (!is_raid56)
>> +	      {
>> +		for (i = 0; i < redundancy; i++)
>> +		  {
>> +		    err = btrfs_read_from_chunk (data, chunk, stripen,
>> +						 stripe_offset,
>> +						 i,     /* redundancy */
>> +						 csize, buf);
>> +		    if (err == GRUB_ERR_NONE)
>> +			break;
>> +		  }
>> +	      }
>> +	    else
>>  	      {
>>  		err = btrfs_read_from_chunk (data, chunk, stripen,
>>  					     stripe_offset,
>> -					     i,     /* redundancy */
>> +					     0,     /* no mirror */
>>  					     csize, buf);
> 
> I do not understand this change... Why?
In RAID1, RAID10 you have several disks from which you can read without further effort. In raid5 and 6, you must read from a specific disk, otherwise you need to rebuild the stripe. Is very different.
Pay attention that the diff output is very confusing, because the old code is shifted to left and diff think that it is a new code....
> 
>> -		if (err == GRUB_ERR_NONE)
>> -		    break;
>> +		if (err != GRUB_ERR_NONE)
>> +		  err = raid56_read_retry (data, chunk, stripe_offset,
>> +					   stripen, csize, buf);
> 
> I have a feeling that this change is somehow related to my comment
> for patch #6.
IS very different, the former is "exit from the retry-from-another-mirror loop if the read is ok", the latter is "in case of error rebuild the stripe". In the first case it is possible to check the iterator index; in the second one I can check only the exit code.
> 
>>  	      }
>> +
> 
> Drop this change please.
ok
> 
>>  	    if (err == GRUB_ERR_NONE)
>>  		break;
>>  	  }
> 
> Daniel
> 


-- 
gpg @keyserver.linux.it: Goffredo Baroncelli <kreijackATinwind.it>
Key fingerprint BBF5 1610 0B64 DAC6 5F7D  17B2 0EDA 9B37 8B82 E0B5


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 6/9] btrfs: refactor the code that read from disk
  2018-05-30 11:07   ` Daniel Kiper
@ 2018-06-01 18:50     ` Goffredo Baroncelli
  0 siblings, 0 replies; 48+ messages in thread
From: Goffredo Baroncelli @ 2018-06-01 18:50 UTC (permalink / raw)
  To: Daniel Kiper, The development of GNU GRUB

On 05/30/2018 01:07 PM, Daniel Kiper wrote:
> On Wed, May 16, 2018 at 08:48:16PM +0200, Goffredo Baroncelli wrote:
>> This is a preparatory patch, to help the adding of the RAID 5/6 recovery
> 
> Please move "This is a preparatory patch" sentence below...
> 
>> code. In case of availability of all disks, this code is good for all the
>> RAID profiles. However in case of failure, the error handling is quite
>> different. Refactoring this code increases the general readability.
> 
> s/readability/readability too/?
> 
> You can put "This is a preparatory patch" in separate line here.
> Same for the other patches.

What about
----
btrfs: Refactor the code that read from disk

Move the code in charge to read the data from disk in a separate
function. This helps to separate the error handling logic (which depend by 
the different raid profiles) from the read from disk logic.
Refactoring this code increases the general readability too.

This is a preparatory patch, to help the adding of the RAID 5/6 recovery
code. 

----


> 
>> Signed-off-by: Goffredo Baroncelli <kreijack@inwind.it>
>> ---
>>  grub-core/fs/btrfs.c | 85 +++++++++++++++++++++++++-------------------
>>  1 file changed, 49 insertions(+), 36 deletions(-)
>>
>> diff --git a/grub-core/fs/btrfs.c b/grub-core/fs/btrfs.c
>> index 51f300829..63651928b 100644
>> --- a/grub-core/fs/btrfs.c
>> +++ b/grub-core/fs/btrfs.c
>> @@ -625,6 +625,47 @@ find_device (struct grub_btrfs_data *data, grub_uint64_t id)
>>    return ctx.dev_found;
>>  }
>>
>> +static grub_err_t
>> +btrfs_read_from_chunk (struct grub_btrfs_data *data,
>> +		       struct grub_btrfs_chunk_item *chunk,
>> +		       grub_uint64_t stripen, grub_uint64_t stripe_offset,
>> +		       int redundancy, grub_uint64_t csize,
>> +		       void *buf)
>> +{
>> +
>> +    struct grub_btrfs_chunk_stripe *stripe;
>> +    grub_disk_addr_t paddr;
>> +    grub_device_t dev;
>> +    grub_err_t err;
>> +
>> +    stripe = (struct grub_btrfs_chunk_stripe *) (chunk + 1);
>> +    /* Right now the redundancy handling is easy.
>> +       With RAID5-like it will be more difficult.  */
>> +    stripe += stripen + redundancy;
>> +
>> +    paddr = grub_le_to_cpu64 (stripe->offset) + stripe_offset;
>> +
>> +    grub_dprintf ("btrfs", "stripe %" PRIxGRUB_UINT64_T
>> +		  " maps to 0x%" PRIxGRUB_UINT64_T "\n",
>> +		  stripen, stripe->offset);
>> +    grub_dprintf ("btrfs", "reading paddr 0x%" PRIxGRUB_UINT64_T "\n", paddr);
>> +
>> +    dev = find_device (data, stripe->device_id);
>> +    if (!dev)
>> +      {
>> +	grub_dprintf ("btrfs",
>> +		      "couldn't find a necessary member device "
>> +		      "of multi-device filesystem\n");
>> +	grub_errno = GRUB_ERR_NONE;
>> +	return GRUB_ERR_READ_ERROR;
>> +      }
>> +
>> +    err = grub_disk_read (dev->disk, paddr >> GRUB_DISK_SECTOR_BITS,
>> +			  paddr & (GRUB_DISK_SECTOR_SIZE - 1),
>> +			  csize, buf);
>> +    return err;
>> +}
>> +
>>  static grub_err_t
>>  grub_btrfs_read_logical (struct grub_btrfs_data *data, grub_disk_addr_t addr,
>>  			 void *buf, grub_size_t size, int recursion_depth)
>> @@ -638,7 +679,6 @@ grub_btrfs_read_logical (struct grub_btrfs_data *data, grub_disk_addr_t addr,
>>        grub_err_t err = 0;
>>        struct grub_btrfs_key key_out;
>>        int challoc = 0;
>> -      grub_device_t dev;
>>        struct grub_btrfs_key key_in;
>>        grub_size_t chsize;
>>        grub_disk_addr_t chaddr;
>> @@ -879,42 +919,15 @@ grub_btrfs_read_logical (struct grub_btrfs_data *data, grub_disk_addr_t addr,
>>
>>  	    for (i = 0; i < redundancy; i++)
>>  	      {
>> -		struct grub_btrfs_chunk_stripe *stripe;
>> -		grub_disk_addr_t paddr;
>> -
>> -		stripe = (struct grub_btrfs_chunk_stripe *) (chunk + 1);
>> -		/* Right now the redundancy handling is easy.
>> -		   With RAID5-like it will be more difficult.  */
>> -		stripe += stripen + i;
>> -
>> -		paddr = grub_le_to_cpu64 (stripe->offset) + stripe_offset;
>> -
>> -		grub_dprintf ("btrfs", "stripe %" PRIxGRUB_UINT64_T
>> -			      " maps to 0x%" PRIxGRUB_UINT64_T "\n",
>> -			      stripen, stripe->offset);
>> -		grub_dprintf ("btrfs", "reading paddr 0x%" PRIxGRUB_UINT64_T
>> -			      "\n", paddr);
>> -
>> -		dev = find_device (data, stripe->device_id);
>> -		if (!dev)
>> -		  {
>> -		    grub_dprintf ("btrfs",
>> -				  "couldn't find a necessary member device "
>> -				  "of multi-device filesystem\n");
>> -		    err = grub_errno;
>> -		    grub_errno = GRUB_ERR_NONE;
>> -		    continue;
>> -		  }
>> -
>> -		err = grub_disk_read (dev->disk, paddr >> GRUB_DISK_SECTOR_BITS,
>> -				      paddr & (GRUB_DISK_SECTOR_SIZE - 1),
>> -				      csize, buf);
>> -		if (!err)
>> -		  break;
>> -		grub_errno = GRUB_ERR_NONE;
> 
> If you drop this line then you change behavior of this function.
> I have a feeling that this should stay as is. At least for now.
> If you need this change then probably it should be a part of the
> other patch.
OK
> 
>> +		err = btrfs_read_from_chunk (data, chunk, stripen,
>> +					     stripe_offset,
>> +					     i,     /* redundancy */
>> +					     csize, buf);
>> +		if (err == GRUB_ERR_NONE)
>> +		    break;
>>  	      }
>> -	    if (i != redundancy)
>> -	      break;
>> +	    if (err == GRUB_ERR_NONE)
>> +		break;
OK
> 
> Ditto.
> 
> Daniel
> 


-- 
gpg @keyserver.linux.it: Goffredo Baroncelli <kreijackATinwind.it>
Key fingerprint BBF5 1610 0B64 DAC6 5F7D  17B2 0EDA 9B37 8B82 E0B5


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 8/9] btrfs: make more generic the code for RAID 6 rebuilding
  2018-05-30 12:05   ` Daniel Kiper
@ 2018-06-01 18:50     ` Goffredo Baroncelli
  0 siblings, 0 replies; 48+ messages in thread
From: Goffredo Baroncelli @ 2018-06-01 18:50 UTC (permalink / raw)
  To: Daniel Kiper, The development of GNU GRUB

On 05/30/2018 02:05 PM, Daniel Kiper wrote:
> On Wed, May 16, 2018 at 08:48:18PM +0200, Goffredo Baroncelli wrote:
>> The original code which handles the recovery of a RAID 6 disks array
>> assumes that all reads are multiple of 1 << GRUB_DISK_SECTOR_BITS and it
>> assumes that all the I/O is done via the struct grub_diskfilter_segment.
>> This is not true for the btrfs code. In order to reuse the native
>> grub_raid6_recover() code, it is modified to not call
>> grub_diskfilter_read_node() directly, but to call an handler passed
>> as argument.
> 
> s/as argument/as an argument/
ok
> 
>> Signed-off-by: Goffredo Baroncelli <kreijack@inwind.it>
>> ---
>>  grub-core/disk/raid6_recover.c | 53 ++++++++++++++++++++++------------
>>  include/grub/diskfilter.h      |  9 ++++++
>>  2 files changed, 44 insertions(+), 18 deletions(-)
>>
>> diff --git a/grub-core/disk/raid6_recover.c b/grub-core/disk/raid6_recover.c
>> index aa674f6ca..10ee495e2 100644
>> --- a/grub-core/disk/raid6_recover.c
>> +++ b/grub-core/disk/raid6_recover.c
>> @@ -74,14 +74,25 @@ mod_255 (unsigned x)
>>  }
>>
>>  static grub_err_t
>> -grub_raid6_recover (struct grub_diskfilter_segment *array, int disknr, int p,
>> -                    char *buf, grub_disk_addr_t sector, grub_size_t size)
>> +raid6_recover_read_node (void *data, int disk_nr,
> 
> s/disk_nr/disknr/

OK

>> +				    grub_uint64_t sector,
>> +				    void *buf, grub_size_t size)
>> +{
>> +    struct grub_diskfilter_segment *array = data;
> 
> Please add one empty line here.
> 
>> +    return grub_diskfilter_read_node (&array->nodes[disk_nr],
>> +				      (grub_disk_addr_t)sector,
>> +				      size >> GRUB_DISK_SECTOR_BITS, buf);
>> +}
>> +
>> +grub_err_t
>> +grub_raid6_recover_gen (void *data, grub_uint64_t nstripes, int disknr, int p,
>> +			    char *buf, grub_uint64_t sector, grub_size_t size,
>> +			    raid_recover_read_t read_func, int layout)
> 
> Could you change the order of read_func and layout arguments?
> I mean make read_func the last one and put the layout before it.
OK
> 
>>  {
>>    int i, q, pos;
>>    int bad1 = -1, bad2 = -1;
>>    char *pbuf = 0, *qbuf = 0;
>>
>> -  size <<= GRUB_DISK_SECTOR_BITS;
>>    pbuf = grub_zalloc (size);
>>    if (!pbuf)
>>      goto quit;
>> @@ -91,17 +102,17 @@ grub_raid6_recover (struct grub_diskfilter_segment *array, int disknr, int p,
>>      goto quit;
>>
>>    q = p + 1;
>> -  if (q == (int) array->node_count)
>> +  if (q == (int) nstripes)
>>      q = 0;
>>
>>    pos = q + 1;
>> -  if (pos == (int) array->node_count)
>> +  if (pos == (int) nstripes)
>>      pos = 0;
>>
>> -  for (i = 0; i < (int) array->node_count - 2; i++)
>> +  for (i = 0; i < (int) nstripes - 2; i++)
>>      {
>>        int c;
>> -      if (array->layout & GRUB_RAID_LAYOUT_MUL_FROM_POS)
>> +      if (layout & GRUB_RAID_LAYOUT_MUL_FROM_POS)
>>  	c = pos;
>>        else
>>  	c = i;
>> @@ -109,8 +120,7 @@ grub_raid6_recover (struct grub_diskfilter_segment *array, int disknr, int p,
>>          bad1 = c;
>>        else
>>          {
>> -          if (! grub_diskfilter_read_node (&array->nodes[pos], sector,
>> -					   size >> GRUB_DISK_SECTOR_BITS, buf))
>> +	  if (!read_func(data, pos, sector, buf, size))
>>              {
>>                grub_crypto_xor (pbuf, pbuf, buf, size);
>>                grub_raid_block_mulx (c, buf, size);
>> @@ -128,7 +138,7 @@ grub_raid6_recover (struct grub_diskfilter_segment *array, int disknr, int p,
>>          }
>>
>>        pos++;
>> -      if (pos == (int) array->node_count)
>> +      if (pos == (int) nstripes)
>>          pos = 0;
>>      }
>>
>> @@ -139,16 +149,14 @@ grub_raid6_recover (struct grub_diskfilter_segment *array, int disknr, int p,
>>    if (bad2 < 0)
>>      {
>>        /* One bad device */
>> -      if ((! grub_diskfilter_read_node (&array->nodes[p], sector,
>> -					size >> GRUB_DISK_SECTOR_BITS, buf)))
>> +      if (!read_func(data, p, sector, buf, size))
>>          {
>>            grub_crypto_xor (buf, buf, pbuf, size);
>>            goto quit;
>>          }
>>
>>        grub_errno = GRUB_ERR_NONE;
>> -      if (grub_diskfilter_read_node (&array->nodes[q], sector,
>> -				     size >> GRUB_DISK_SECTOR_BITS, buf))
>> +      if (read_func(data, q, sector, buf, size))
>>          goto quit;
>>
>>        grub_crypto_xor (buf, buf, qbuf, size);
>> @@ -160,14 +168,12 @@ grub_raid6_recover (struct grub_diskfilter_segment *array, int disknr, int p,
>>        /* Two bad devices */
>>        unsigned c;
>>
>> -      if (grub_diskfilter_read_node (&array->nodes[p], sector,
>> -				     size >> GRUB_DISK_SECTOR_BITS, buf))
>> +      if (read_func(data, p, sector, buf, size))
>>          goto quit;
>>
>>        grub_crypto_xor (pbuf, pbuf, buf, size);
>>
>> -      if (grub_diskfilter_read_node (&array->nodes[q], sector,
>> -				     size >> GRUB_DISK_SECTOR_BITS, buf))
>> +      if (read_func(data, q, sector, buf, size))
>>          goto quit;
>>
>>        grub_crypto_xor (qbuf, qbuf, buf, size);
>> @@ -190,6 +196,17 @@ quit:
>>    return grub_errno;
>>  }
>>
>> +static grub_err_t
>> +grub_raid6_recover (struct grub_diskfilter_segment *array, int disknr, int p,
>> +                    char *buf, grub_disk_addr_t sector, grub_size_t size)
>> +{
>> +
> 
> Drop this empty line.
ok
> 
>> +  return grub_raid6_recover_gen (array, array->node_count, disknr, p, buf,
>> +				     sector, size << GRUB_DISK_SECTOR_BITS,
>> +				     raid6_recover_read_node,
>> +				     array->layout);
>> +}
>> +
>>  GRUB_MOD_INIT(raid6rec)
>>  {
>>    grub_raid6_init_table ();
>> diff --git a/include/grub/diskfilter.h b/include/grub/diskfilter.h
>> index d89273c1b..911888e33 100644
>> --- a/include/grub/diskfilter.h
>> +++ b/include/grub/diskfilter.h
>> @@ -189,6 +189,15 @@ typedef grub_err_t (*grub_raid6_recover_func_t) (struct grub_diskfilter_segment
>>  extern grub_raid5_recover_func_t grub_raid5_recover_func;
>>  extern grub_raid6_recover_func_t grub_raid6_recover_func;
>>
>> +typedef grub_err_t (* raid_recover_read_t)(void *data, int disk_nr,
>> +					   grub_uint64_t addr, void *dest,
>> +					   grub_size_t size);
>> +
>> +grub_err_t
>> +grub_raid6_recover_gen (void *data, grub_uint64_t nstripes, int disknr, int p,
>> +			    char *buf, grub_uint64_t sector, grub_size_t size,
>> +			    raid_recover_read_t read_func, int layout);
> 
> Please add extern here.
ok
> 
> Daniel
> 


-- 
gpg @keyserver.linux.it: Goffredo Baroncelli <kreijackATinwind.it>
Key fingerprint BBF5 1610 0B64 DAC6 5F7D  17B2 0EDA 9B37 8B82 E0B5


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 1/9] btrfs: add support for reading a filesystem with a RAID 5 or RAID 6 profile.
  2018-05-30 10:07   ` Daniel Kiper
@ 2018-06-01 18:50     ` Goffredo Baroncelli
  0 siblings, 0 replies; 48+ messages in thread
From: Goffredo Baroncelli @ 2018-06-01 18:50 UTC (permalink / raw)
  To: Daniel Kiper, The development of GNU GRUB

Hi Daniel,

my comments below
On 05/30/2018 12:07 PM, Daniel Kiper wrote:
> On Wed, May 16, 2018 at 08:48:11PM +0200, Goffredo Baroncelli wrote:
>> Signed-off-by: Goffredo Baroncelli <kreijack@inwind.it>
[...]

>> +	       *  off -> logical address to read (from the beginning of the
>> +	       *         chunk space)
> 
> What do you mean by "chunk"? Is it e.g. A1 + B1 region? Please make it
> clear what do you mean by "chunk".

Chunk is a very pervasive concept in BTRFS. A reader who looks to a BTRFS
raid code, must know very well this concepts. I am not sure that GRUB code
is the right place to contain a full BTRFS chunk description.

Basically, the BTRFS logical space is mapped to the physical one via the
chunks (aka block group). Each chunk maps a range of the logical address to a
range in the disk(s). If more disks are involved, different profile
might be used (linear, raid0... raid5/6). E.g.: a chunk maps a logical
address (say 0.1GB) to a physical address (say 1GB..2GB of disk1, 3GB..to 
4GB of disk2, in raid1 mode).
The chunks are a low layer. All the BTRFS metadata are in term of the 
logical address (upper layer).

[...]
>> +	      /*
>> +	       * In the line below stripen is evaluated without considering
> 
> s/In the line below //
> 
>> +	       * the parities (0 for A1, A2, A3... 1 for B1, B2...);
> 
> s/;/./
> 
>> +	       */
>> +	      high = grub_divmod64 (stripe_nr, nstripes - nparities, &stripen);
>> +
>> +	      /*
>> +	       * In the line below stripen, now consider also the parities (0
> 
> s/In the line below stripen, now/Now/

I think that "stripen" must be re-added

> 
>> +	       * for A1, 1 for A2, 2 for A3....); the math is done "modulo"
> 
> s/; the/. The/
> s/"modulo"/modulo/
> 
>> +	       * number of disks
> 
> Full stop at the end of sentence please.
> 
> Daniel
> 


-- 
gpg @keyserver.linux.it: Goffredo Baroncelli <kreijackATinwind.it>
Key fingerprint BBF5 1610 0B64 DAC6 5F7D  17B2 0EDA 9B37 8B82 E0B5


^ permalink raw reply	[flat|nested] 48+ messages in thread

* [PATCH 1/9] btrfs: Add support for reading a filesystem with a RAID 5 or RAID 6 profile.
  2018-10-22 17:29 [PATCH V11] Add support for BTRFS raid5/6 to GRUB Goffredo Baroncelli
@ 2018-10-22 17:29 ` Goffredo Baroncelli
  0 siblings, 0 replies; 48+ messages in thread
From: Goffredo Baroncelli @ 2018-10-22 17:29 UTC (permalink / raw)
  To: grub-devel; +Cc: Daniel Kiper, linux-btrfs, Goffredo Baroncelli

From: Goffredo Baroncelli <kreijack@inwind.it>

Signed-off-by: Goffredo Baroncelli <kreijack@inwind.it>
Signed-off-by: Daniel Kiper <daniel.kiper@oracle.com>
Reviewed-by: Daniel Kiper <daniel.kiper@oracle.com>
---
 grub-core/fs/btrfs.c | 73 ++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 73 insertions(+)

diff --git a/grub-core/fs/btrfs.c b/grub-core/fs/btrfs.c
index be195448d..9122169aa 100644
--- a/grub-core/fs/btrfs.c
+++ b/grub-core/fs/btrfs.c
@@ -119,6 +119,8 @@ struct grub_btrfs_chunk_item
 #define GRUB_BTRFS_CHUNK_TYPE_RAID1         0x10
 #define GRUB_BTRFS_CHUNK_TYPE_DUPLICATED    0x20
 #define GRUB_BTRFS_CHUNK_TYPE_RAID10        0x40
+#define GRUB_BTRFS_CHUNK_TYPE_RAID5         0x80
+#define GRUB_BTRFS_CHUNK_TYPE_RAID6         0x100
   grub_uint8_t dummy2[0xc];
   grub_uint16_t nstripes;
   grub_uint16_t nsubstripes;
@@ -764,6 +766,77 @@ grub_btrfs_read_logical (struct grub_btrfs_data *data, grub_disk_addr_t addr,
 	      stripe_offset = low + chunk_stripe_length
 		* high;
 	      csize = chunk_stripe_length - low;
+	      break;
+	    }
+	  case GRUB_BTRFS_CHUNK_TYPE_RAID5:
+	  case GRUB_BTRFS_CHUNK_TYPE_RAID6:
+	    {
+	      grub_uint64_t nparities, stripe_nr, high, low;
+
+	      redundancy = 1;	/* no redundancy for now */
+
+	      if (grub_le_to_cpu64 (chunk->type) & GRUB_BTRFS_CHUNK_TYPE_RAID5)
+		{
+		  grub_dprintf ("btrfs", "RAID5\n");
+		  nparities = 1;
+		}
+	      else
+		{
+		  grub_dprintf ("btrfs", "RAID6\n");
+		  nparities = 2;
+		}
+
+	      /*
+	       * RAID 6 layout consists of several stripes spread over
+	       * the disks, e.g.:
+	       *
+	       *   Disk_0  Disk_1  Disk_2  Disk_3
+	       *     A0      B0      P0      Q0
+	       *     Q1      A1      B1      P1
+	       *     P2      Q2      A2      B2
+	       *
+	       * Note: placement of the parities depend on row number.
+	       *
+	       * Pay attention that the btrfs terminology may differ from
+	       * terminology used in other RAID implementations, e.g. LVM,
+	       * dm or md. The main difference is that btrfs calls contiguous
+	       * block of data on a given disk, e.g. A0, stripe instead of chunk.
+	       *
+	       * The variables listed below have following meaning:
+	       *   - stripe_nr is the stripe number excluding the parities
+	       *     (A0 = 0, B0 = 1, A1 = 2, B1 = 3, etc.),
+	       *   - high is the row number (0 for A0...Q0, 1 for Q1...P1, etc.),
+	       *   - stripen is the disk number in a row (0 for A0, Q1, P2,
+	       *     1 for B0, A1, Q2, etc.),
+	       *   - off is the logical address to read,
+	       *   - chunk_stripe_length is the size of a stripe (typically 64 KiB),
+	       *   - nstripes is the number of disks in a row,
+	       *   - low is the offset of the data inside a stripe,
+	       *   - stripe_offset is the data offset in an array,
+	       *   - csize is the "potential" data to read; it will be reduced
+	       *     to size if the latter is smaller,
+	       *   - nparities is the number of parities (1 for RAID 5, 2 for
+	       *     RAID 6); used only in RAID 5/6 code.
+	       */
+	      stripe_nr = grub_divmod64 (off, chunk_stripe_length, &low);
+
+	      /*
+	       * stripen is computed without the parities
+	       * (0 for A0, A1, A2, 1 for B0, B1, B2, etc.).
+	       */
+	      high = grub_divmod64 (stripe_nr, nstripes - nparities, &stripen);
+
+	      /*
+	       * The stripes are spread over the disks. Every each row their
+	       * positions are shifted by 1 place. So, the real disks number
+	       * change. Hence, we have to take into account current row number
+	       * modulo nstripes (0 for A0, 1 for A1, 2 for A2, etc.).
+	       */
+	      grub_divmod64 (high + stripen, nstripes, &stripen);
+
+	      stripe_offset = low + chunk_stripe_length * high;
+	      csize = chunk_stripe_length - low;
+
 	      break;
 	    }
 	  default:
-- 
2.19.1


^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH 1/9] btrfs: Add support for reading a filesystem with a RAID 5 or RAID 6 profile.
  2018-10-18 17:55 [PATCH V10] Add support for BTRFS raid5/6 to GRUB Goffredo Baroncelli
@ 2018-10-18 17:55 ` Goffredo Baroncelli
  0 siblings, 0 replies; 48+ messages in thread
From: Goffredo Baroncelli @ 2018-10-18 17:55 UTC (permalink / raw)
  To: grub-devel; +Cc: Daniel Kiper, linux-btrfs, Goffredo Baroncelli

From: Goffredo Baroncelli <kreijack@inwind.it>

Signed-off-by: Goffredo Baroncelli <kreijack@inwind.it>
Signed-off-by: Daniel Kiper <daniel.kiper@oracle.com>
Reviewed-by: Daniel Kiper <daniel.kiper@oracle.com>
---
 grub-core/fs/btrfs.c | 73 ++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 73 insertions(+)

diff --git a/grub-core/fs/btrfs.c b/grub-core/fs/btrfs.c
index be195448d..9122169aa 100644
--- a/grub-core/fs/btrfs.c
+++ b/grub-core/fs/btrfs.c
@@ -119,6 +119,8 @@ struct grub_btrfs_chunk_item
 #define GRUB_BTRFS_CHUNK_TYPE_RAID1         0x10
 #define GRUB_BTRFS_CHUNK_TYPE_DUPLICATED    0x20
 #define GRUB_BTRFS_CHUNK_TYPE_RAID10        0x40
+#define GRUB_BTRFS_CHUNK_TYPE_RAID5         0x80
+#define GRUB_BTRFS_CHUNK_TYPE_RAID6         0x100
   grub_uint8_t dummy2[0xc];
   grub_uint16_t nstripes;
   grub_uint16_t nsubstripes;
@@ -764,6 +766,77 @@ grub_btrfs_read_logical (struct grub_btrfs_data *data, grub_disk_addr_t addr,
 	      stripe_offset = low + chunk_stripe_length
 		* high;
 	      csize = chunk_stripe_length - low;
+	      break;
+	    }
+	  case GRUB_BTRFS_CHUNK_TYPE_RAID5:
+	  case GRUB_BTRFS_CHUNK_TYPE_RAID6:
+	    {
+	      grub_uint64_t nparities, stripe_nr, high, low;
+
+	      redundancy = 1;	/* no redundancy for now */
+
+	      if (grub_le_to_cpu64 (chunk->type) & GRUB_BTRFS_CHUNK_TYPE_RAID5)
+		{
+		  grub_dprintf ("btrfs", "RAID5\n");
+		  nparities = 1;
+		}
+	      else
+		{
+		  grub_dprintf ("btrfs", "RAID6\n");
+		  nparities = 2;
+		}
+
+	      /*
+	       * RAID 6 layout consists of several stripes spread over
+	       * the disks, e.g.:
+	       *
+	       *   Disk_0  Disk_1  Disk_2  Disk_3
+	       *     A0      B0      P0      Q0
+	       *     Q1      A1      B1      P1
+	       *     P2      Q2      A2      B2
+	       *
+	       * Note: placement of the parities depend on row number.
+	       *
+	       * Pay attention that the btrfs terminology may differ from
+	       * terminology used in other RAID implementations, e.g. LVM,
+	       * dm or md. The main difference is that btrfs calls contiguous
+	       * block of data on a given disk, e.g. A0, stripe instead of chunk.
+	       *
+	       * The variables listed below have following meaning:
+	       *   - stripe_nr is the stripe number excluding the parities
+	       *     (A0 = 0, B0 = 1, A1 = 2, B1 = 3, etc.),
+	       *   - high is the row number (0 for A0...Q0, 1 for Q1...P1, etc.),
+	       *   - stripen is the disk number in a row (0 for A0, Q1, P2,
+	       *     1 for B0, A1, Q2, etc.),
+	       *   - off is the logical address to read,
+	       *   - chunk_stripe_length is the size of a stripe (typically 64 KiB),
+	       *   - nstripes is the number of disks in a row,
+	       *   - low is the offset of the data inside a stripe,
+	       *   - stripe_offset is the data offset in an array,
+	       *   - csize is the "potential" data to read; it will be reduced
+	       *     to size if the latter is smaller,
+	       *   - nparities is the number of parities (1 for RAID 5, 2 for
+	       *     RAID 6); used only in RAID 5/6 code.
+	       */
+	      stripe_nr = grub_divmod64 (off, chunk_stripe_length, &low);
+
+	      /*
+	       * stripen is computed without the parities
+	       * (0 for A0, A1, A2, 1 for B0, B1, B2, etc.).
+	       */
+	      high = grub_divmod64 (stripe_nr, nstripes - nparities, &stripen);
+
+	      /*
+	       * The stripes are spread over the disks. Every each row their
+	       * positions are shifted by 1 place. So, the real disks number
+	       * change. Hence, we have to take into account current row number
+	       * modulo nstripes (0 for A0, 1 for A1, 2 for A2, etc.).
+	       */
+	      grub_divmod64 (high + stripen, nstripes, &stripen);
+
+	      stripe_offset = low + chunk_stripe_length * high;
+	      csize = chunk_stripe_length - low;
+
 	      break;
 	    }
 	  default:
-- 
2.19.1


^ permalink raw reply related	[flat|nested] 48+ messages in thread

* Re: [PATCH 1/9] btrfs: Add support for reading a filesystem with a RAID 5 or RAID 6 profile.
  2018-10-11 18:50 ` [PATCH 1/9] btrfs: Add support for reading a filesystem with a RAID 5 or RAID 6 profile Goffredo Baroncelli
@ 2018-10-17 13:46   ` Daniel Kiper
  0 siblings, 0 replies; 48+ messages in thread
From: Daniel Kiper @ 2018-10-17 13:46 UTC (permalink / raw)
  To: Goffredo Baroncelli
  Cc: grub-devel, Daniel Kiper, linux-btrfs, Goffredo Baroncelli

On Thu, Oct 11, 2018 at 08:50:55PM +0200, Goffredo Baroncelli wrote:
> From: Goffredo Baroncelli <kreijack@inwind.it>
>
> Signed-off-by: Goffredo Baroncelli <kreijack@inwind.it>
> Signed-off-by: Daniel Kiper <daniel.kiper@oracle.com>

One nit pick below...

Otherwise you can add Reviewed-by: Daniel Kiper <daniel.kiper@oracle.com>

> ---
>  grub-core/fs/btrfs.c | 73 ++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 73 insertions(+)
>
> diff --git a/grub-core/fs/btrfs.c b/grub-core/fs/btrfs.c
> index be195448d..933a57d3b 100644
> --- a/grub-core/fs/btrfs.c
> +++ b/grub-core/fs/btrfs.c
> @@ -119,6 +119,8 @@ struct grub_btrfs_chunk_item
>  #define GRUB_BTRFS_CHUNK_TYPE_RAID1         0x10
>  #define GRUB_BTRFS_CHUNK_TYPE_DUPLICATED    0x20
>  #define GRUB_BTRFS_CHUNK_TYPE_RAID10        0x40
> +#define GRUB_BTRFS_CHUNK_TYPE_RAID5         0x80
> +#define GRUB_BTRFS_CHUNK_TYPE_RAID6         0x100
>    grub_uint8_t dummy2[0xc];
>    grub_uint16_t nstripes;
>    grub_uint16_t nsubstripes;
> @@ -764,6 +766,77 @@ grub_btrfs_read_logical (struct grub_btrfs_data *data, grub_disk_addr_t addr,
>  	      stripe_offset = low + chunk_stripe_length
>  		* high;
>  	      csize = chunk_stripe_length - low;
> +	      break;
> +	    }
> +	  case GRUB_BTRFS_CHUNK_TYPE_RAID5:
> +	  case GRUB_BTRFS_CHUNK_TYPE_RAID6:
> +	    {
> +	      grub_uint64_t nparities, stripe_nr, high, low;
> +
> +	      redundancy = 1;	/* no redundancy for now */
> +
> +	      if (grub_le_to_cpu64 (chunk->type) & GRUB_BTRFS_CHUNK_TYPE_RAID5)
> +		{
> +		  grub_dprintf ("btrfs", "RAID5\n");
> +		  nparities = 1;
> +		}
> +	      else
> +		{
> +		  grub_dprintf ("btrfs", "RAID6\n");
> +		  nparities = 2;
> +		}
> +
> +	      /*
> +	       * RAID 6 layout consists of several stripes spread over
> +	       * the disks, e.g.:
> +	       *
> +	       *   Disk_0  Disk_1  Disk_2  Disk_3
> +	       *     A0      B0      P0      Q0
> +	       *     Q1      A1      B1      P1
> +	       *     P2      Q2      A2      B2
> +	       *
> +	       * Note: placement of the parities depend on row number.
> +	       *
> +	       * Pay attention that the btrfs terminology may differ from
> +	       * terminology used in other RAID implementations, e.g. LVM,
> +	       * dm or md. The main difference is that btrfs calls contiguous
> +	       * block of data on a given disk, e.g. A0, stripe instead of chunk.
> +	       *
> +	       * The variables listed below have following meaning:
> +	       *   - stripe_nr is the stripe number excluding the parities
> +	       *     (A0 = 0, B0 = 1, A1 = 2, B1 = 3, etc.),
> +	       *   - high is the row number (0 for A0...Q0, 1 for Q1...P1, etc.),
> +	       *   - stripen is the disk number in a row (0 for A0, Q1, P2,
> +	       *     1 for B0, A1, Q2, etc.),
> +	       *   - off is the logical address to read,
> +	       *   - chunk_stripe_length is the size of a stripe (typically 64 KiB),
> +	       *   - nstripes is the number of disks in a row,
> +	       *   - low is the offset of the data inside a stripe,
> +	       *   - stripe_offset is the data offset in an array,
> +	       *   - csize is the "potential" data to read; it will be reduced
> +	       *     to size if the latter is smaller,
> +	       *   - nparities is the number of parities (1 for RAID 5, 2 for
> +	       *     RAID 6); used only in RAID 5/6 code.
> +	       */
> +	      stripe_nr = grub_divmod64 (off, chunk_stripe_length, &low);
> +
> +	      /*
> +	       * stripen is computed without the parities
> +	       * (0 for A0, A1, A2, 1 for B0, B1, B2, etc.).
> +	       */
> +	      high = grub_divmod64 (stripe_nr, nstripes - nparities, &stripen);
> +
> +	      /*
> +	       * The stripes are spread over the disks. Every each row their
> +	       * positions are shifted by 1 place. So, the real disks number
> +	       * change. Hence, we have to take current row number modulo
> +	       * nstripes into account (0 for A0, 1 for A1, 2 for A2, etc.).

s/current row number modulo nstripes into account/into account current row number modulo nstripes/

Daniel

^ permalink raw reply	[flat|nested] 48+ messages in thread

* [PATCH 1/9] btrfs: Add support for reading a filesystem with a RAID 5 or RAID 6 profile.
  2018-10-11 18:50 [PATCH V9] Add support for BTRFS raid5/6 to GRUB Goffredo Baroncelli
@ 2018-10-11 18:50 ` Goffredo Baroncelli
  2018-10-17 13:46   ` Daniel Kiper
  0 siblings, 1 reply; 48+ messages in thread
From: Goffredo Baroncelli @ 2018-10-11 18:50 UTC (permalink / raw)
  To: grub-devel; +Cc: Daniel Kiper, linux-btrfs, Goffredo Baroncelli

From: Goffredo Baroncelli <kreijack@inwind.it>

Signed-off-by: Goffredo Baroncelli <kreijack@inwind.it>
Signed-off-by: Daniel Kiper <daniel.kiper@oracle.com>
---
 grub-core/fs/btrfs.c | 73 ++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 73 insertions(+)

diff --git a/grub-core/fs/btrfs.c b/grub-core/fs/btrfs.c
index be195448d..933a57d3b 100644
--- a/grub-core/fs/btrfs.c
+++ b/grub-core/fs/btrfs.c
@@ -119,6 +119,8 @@ struct grub_btrfs_chunk_item
 #define GRUB_BTRFS_CHUNK_TYPE_RAID1         0x10
 #define GRUB_BTRFS_CHUNK_TYPE_DUPLICATED    0x20
 #define GRUB_BTRFS_CHUNK_TYPE_RAID10        0x40
+#define GRUB_BTRFS_CHUNK_TYPE_RAID5         0x80
+#define GRUB_BTRFS_CHUNK_TYPE_RAID6         0x100
   grub_uint8_t dummy2[0xc];
   grub_uint16_t nstripes;
   grub_uint16_t nsubstripes;
@@ -764,6 +766,77 @@ grub_btrfs_read_logical (struct grub_btrfs_data *data, grub_disk_addr_t addr,
 	      stripe_offset = low + chunk_stripe_length
 		* high;
 	      csize = chunk_stripe_length - low;
+	      break;
+	    }
+	  case GRUB_BTRFS_CHUNK_TYPE_RAID5:
+	  case GRUB_BTRFS_CHUNK_TYPE_RAID6:
+	    {
+	      grub_uint64_t nparities, stripe_nr, high, low;
+
+	      redundancy = 1;	/* no redundancy for now */
+
+	      if (grub_le_to_cpu64 (chunk->type) & GRUB_BTRFS_CHUNK_TYPE_RAID5)
+		{
+		  grub_dprintf ("btrfs", "RAID5\n");
+		  nparities = 1;
+		}
+	      else
+		{
+		  grub_dprintf ("btrfs", "RAID6\n");
+		  nparities = 2;
+		}
+
+	      /*
+	       * RAID 6 layout consists of several stripes spread over
+	       * the disks, e.g.:
+	       *
+	       *   Disk_0  Disk_1  Disk_2  Disk_3
+	       *     A0      B0      P0      Q0
+	       *     Q1      A1      B1      P1
+	       *     P2      Q2      A2      B2
+	       *
+	       * Note: placement of the parities depend on row number.
+	       *
+	       * Pay attention that the btrfs terminology may differ from
+	       * terminology used in other RAID implementations, e.g. LVM,
+	       * dm or md. The main difference is that btrfs calls contiguous
+	       * block of data on a given disk, e.g. A0, stripe instead of chunk.
+	       *
+	       * The variables listed below have following meaning:
+	       *   - stripe_nr is the stripe number excluding the parities
+	       *     (A0 = 0, B0 = 1, A1 = 2, B1 = 3, etc.),
+	       *   - high is the row number (0 for A0...Q0, 1 for Q1...P1, etc.),
+	       *   - stripen is the disk number in a row (0 for A0, Q1, P2,
+	       *     1 for B0, A1, Q2, etc.),
+	       *   - off is the logical address to read,
+	       *   - chunk_stripe_length is the size of a stripe (typically 64 KiB),
+	       *   - nstripes is the number of disks in a row,
+	       *   - low is the offset of the data inside a stripe,
+	       *   - stripe_offset is the data offset in an array,
+	       *   - csize is the "potential" data to read; it will be reduced
+	       *     to size if the latter is smaller,
+	       *   - nparities is the number of parities (1 for RAID 5, 2 for
+	       *     RAID 6); used only in RAID 5/6 code.
+	       */
+	      stripe_nr = grub_divmod64 (off, chunk_stripe_length, &low);
+
+	      /*
+	       * stripen is computed without the parities
+	       * (0 for A0, A1, A2, 1 for B0, B1, B2, etc.).
+	       */
+	      high = grub_divmod64 (stripe_nr, nstripes - nparities, &stripen);
+
+	      /*
+	       * The stripes are spread over the disks. Every each row their
+	       * positions are shifted by 1 place. So, the real disks number
+	       * change. Hence, we have to take current row number modulo
+	       * nstripes into account (0 for A0, 1 for A1, 2 for A2, etc.).
+	       */
+	      grub_divmod64 (high + stripen, nstripes, &stripen);
+
+	      stripe_offset = low + chunk_stripe_length * high;
+	      csize = chunk_stripe_length - low;
+
 	      break;
 	    }
 	  default:
-- 
2.19.1


^ permalink raw reply related	[flat|nested] 48+ messages in thread

* Re: [PATCH 1/9] btrfs: Add support for reading a filesystem with a RAID 5 or RAID 6 profile.
  2018-10-09 17:51   ` Daniel Kiper
@ 2018-10-11 13:17     ` Daniel Kiper
  0 siblings, 0 replies; 48+ messages in thread
From: Daniel Kiper @ 2018-10-11 13:17 UTC (permalink / raw)
  To: kreijack; +Cc: dkiper, grub-devel, linux-btrfs, Goffredo Baroncelli

On Tue, Oct 09, 2018 at 07:51:01PM +0200, Daniel Kiper wrote:
> On Thu, Sep 27, 2018 at 08:34:56PM +0200, Goffredo Baroncelli wrote:
> > From: Goffredo Baroncelli <kreijack@inwind.it>
> >
> > Signed-off-by: Goffredo Baroncelli <kreijack@inwind.it>
> 
> Code LGTM. Though comment begs improvement. I will send you updated
> comment for approval shortly.

Below you can find updated patch. Please check I have not messed up something.

Daniel

From ecefb12a10d39bdd09e1d2b8fbbcbdb1b35274f8 Mon Sep 17 00:00:00 2001
From: Goffredo Baroncelli <kreijack@inwind.it>
Date: Thu, 27 Sep 2018 20:34:56 +0200
Subject: [PATCH 1/1] btrfs: Add support for reading a filesystem with a RAID
 5 or RAID 6 profile.

Signed-off-by: Goffredo Baroncelli <kreijack@inwind.it>
Signed-off-by: Daniel Kiper <daniel.kiper@oracle.com>
---
 grub-core/fs/btrfs.c |   73 ++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 73 insertions(+)

diff --git a/grub-core/fs/btrfs.c b/grub-core/fs/btrfs.c
index be19544..933a57d 100644
--- a/grub-core/fs/btrfs.c
+++ b/grub-core/fs/btrfs.c
@@ -119,6 +119,8 @@ struct grub_btrfs_chunk_item
 #define GRUB_BTRFS_CHUNK_TYPE_RAID1         0x10
 #define GRUB_BTRFS_CHUNK_TYPE_DUPLICATED    0x20
 #define GRUB_BTRFS_CHUNK_TYPE_RAID10        0x40
+#define GRUB_BTRFS_CHUNK_TYPE_RAID5         0x80
+#define GRUB_BTRFS_CHUNK_TYPE_RAID6         0x100
   grub_uint8_t dummy2[0xc];
   grub_uint16_t nstripes;
   grub_uint16_t nsubstripes;
@@ -766,6 +768,77 @@ grub_btrfs_read_logical (struct grub_btrfs_data *data, grub_disk_addr_t addr,
 	      csize = chunk_stripe_length - low;
 	      break;
 	    }
+	  case GRUB_BTRFS_CHUNK_TYPE_RAID5:
+	  case GRUB_BTRFS_CHUNK_TYPE_RAID6:
+	    {
+	      grub_uint64_t nparities, stripe_nr, high, low;
+
+	      redundancy = 1;	/* no redundancy for now */
+
+	      if (grub_le_to_cpu64 (chunk->type) & GRUB_BTRFS_CHUNK_TYPE_RAID5)
+		{
+		  grub_dprintf ("btrfs", "RAID5\n");
+		  nparities = 1;
+		}
+	      else
+		{
+		  grub_dprintf ("btrfs", "RAID6\n");
+		  nparities = 2;
+		}
+
+	      /*
+	       * RAID 6 layout consists of several stripes spread over
+	       * the disks, e.g.:
+	       *
+	       *   Disk_0  Disk_1  Disk_2  Disk_3
+	       *     A0      B0      P0      Q0
+	       *     Q1      A1      B1      P1
+	       *     P2      Q2      A2      B2
+	       *
+	       * Note: placement of the parities depend on row number.
+	       *
+	       * Pay attention that the btrfs terminology may differ from
+	       * terminology used in other RAID implementations, e.g. LVM,
+	       * dm or md. The main difference is that btrfs calls contiguous
+	       * block of data on a given disk, e.g. A0, stripe instead of chunk.
+	       *
+	       * The variables listed below have following meaning:
+	       *   - stripe_nr is the stripe number excluding the parities
+	       *     (A0 = 0, B0 = 1, A1 = 2, B1 = 3, etc.),
+	       *   - high is the row number (0 for A0...Q0, 1 for Q1...P1, etc.),
+	       *   - stripen is the disk number in a row (0 for A0, Q1, P2,
+	       *     1 for B0, A1, Q2, etc.),
+	       *   - off is the logical address to read,
+	       *   - chunk_stripe_length is the size of a stripe (typically 64 KiB),
+	       *   - nstripes is the number of disks in a row,
+	       *   - low is the offset of the data inside a stripe,
+	       *   - stripe_offset is the data offset in an array,
+	       *   - csize is the "potential" data to read; it will be reduced
+	       *     to size if the latter is smaller,
+	       *   - nparities is the number of parities (1 for RAID 5, 2 for
+	       *     RAID 6); used only in RAID 5/6 code.
+	       */
+	      stripe_nr = grub_divmod64 (off, chunk_stripe_length, &low);
+
+	      /*
+	       * stripen is computed without the parities
+	       * (0 for A0, A1, A2, 1 for B0, B1, B2, etc.).
+	       */
+	      high = grub_divmod64 (stripe_nr, nstripes - nparities, &stripen);
+
+	      /*
+	       * The stripes are spread over the disks. Every each row their
+	       * positions are shifted by 1 place. So, the real disks number
+	       * change. Hence, we have to take current row number modulo
+	       * nstripes into account (0 for A0, 1 for A1, 2 for A2, etc.).
+	       */
+	      grub_divmod64 (high + stripen, nstripes, &stripen);
+
+	      stripe_offset = low + chunk_stripe_length * high;
+	      csize = chunk_stripe_length - low;
+
+	      break;
+	    }
 	  default:
 	    grub_dprintf ("btrfs", "unsupported RAID\n");
 	    return grub_error (GRUB_ERR_NOT_IMPLEMENTED_YET,
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* Re: [PATCH 1/9] btrfs: Add support for reading a filesystem with a RAID 5 or RAID 6 profile.
  2018-09-27 18:34 ` [PATCH 1/9] btrfs: Add support for reading a filesystem with a RAID 5 or RAID 6 profile Goffredo Baroncelli
@ 2018-10-09 17:51   ` Daniel Kiper
  2018-10-11 13:17     ` Daniel Kiper
  0 siblings, 1 reply; 48+ messages in thread
From: Daniel Kiper @ 2018-10-09 17:51 UTC (permalink / raw)
  To: Goffredo Baroncelli
  Cc: grub-devel, Daniel Kiper, linux-btrfs, Goffredo Baroncelli

On Thu, Sep 27, 2018 at 08:34:56PM +0200, Goffredo Baroncelli wrote:
> From: Goffredo Baroncelli <kreijack@inwind.it>
>
> Signed-off-by: Goffredo Baroncelli <kreijack@inwind.it>

Code LGTM. Though comment begs improvement. I will send you updated
comment for approval shortly.

Daniel

^ permalink raw reply	[flat|nested] 48+ messages in thread

* [PATCH 1/9] btrfs: Add support for reading a filesystem with a RAID 5 or RAID 6 profile.
  2018-09-27 18:34 [PATCH V8] Add support for BTRFS raid5/6 to GRUB Goffredo Baroncelli
@ 2018-09-27 18:34 ` Goffredo Baroncelli
  2018-10-09 17:51   ` Daniel Kiper
  0 siblings, 1 reply; 48+ messages in thread
From: Goffredo Baroncelli @ 2018-09-27 18:34 UTC (permalink / raw)
  To: grub-devel; +Cc: Daniel Kiper, linux-btrfs, Goffredo Baroncelli

From: Goffredo Baroncelli <kreijack@inwind.it>

Signed-off-by: Goffredo Baroncelli <kreijack@inwind.it>
---
 grub-core/fs/btrfs.c | 74 ++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 74 insertions(+)

diff --git a/grub-core/fs/btrfs.c b/grub-core/fs/btrfs.c
index be195448d..9bc6d399d 100644
--- a/grub-core/fs/btrfs.c
+++ b/grub-core/fs/btrfs.c
@@ -119,6 +119,8 @@ struct grub_btrfs_chunk_item
 #define GRUB_BTRFS_CHUNK_TYPE_RAID1         0x10
 #define GRUB_BTRFS_CHUNK_TYPE_DUPLICATED    0x20
 #define GRUB_BTRFS_CHUNK_TYPE_RAID10        0x40
+#define GRUB_BTRFS_CHUNK_TYPE_RAID5         0x80
+#define GRUB_BTRFS_CHUNK_TYPE_RAID6         0x100
   grub_uint8_t dummy2[0xc];
   grub_uint16_t nstripes;
   grub_uint16_t nsubstripes;
@@ -764,6 +766,78 @@ grub_btrfs_read_logical (struct grub_btrfs_data *data, grub_disk_addr_t addr,
 	      stripe_offset = low + chunk_stripe_length
 		* high;
 	      csize = chunk_stripe_length - low;
+	      break;
+	    }
+	  case GRUB_BTRFS_CHUNK_TYPE_RAID5:
+	  case GRUB_BTRFS_CHUNK_TYPE_RAID6:
+	    {
+	      grub_uint64_t nparities, stripe_nr, high, low;
+
+	      redundancy = 1;	/* no redundancy for now */
+
+	      if (grub_le_to_cpu64 (chunk->type) & GRUB_BTRFS_CHUNK_TYPE_RAID5)
+		{
+		  grub_dprintf ("btrfs", "RAID5\n");
+		  nparities = 1;
+		}
+	      else
+		{
+		  grub_dprintf ("btrfs", "RAID6\n");
+		  nparities = 2;
+		}
+
+	      /*
+	       * A RAID 6 layout consists of several stripes spread on the
+	       * disks, following a layout like the one below
+	       *
+	       *   Disk0  Disk1  Disk2  Disk3
+	       *
+	       *    A1     B1     P1     Q1
+	       *    Q2     A2     B2     P2
+	       *    P3     Q3     A3     B3
+	       *  [...]
+	       *
+	       *  Note that the placement of the parities depends on row index.
+	       *  Pay attention that the BTRFS terminolgy may be different
+	       *  from others RAID implementation (e.g. lvm/dm or md). In BTRFS
+	       *  a contiguous block of data of a disk (like A1) is called
+	       *  stripe.
+	       *  In the code below:
+	       *  - stripe_nr is the stripe number without the parities
+	       *    (A1 = 0, B1 = 1, A2 = 2, B2 = 3, ...),
+	       *  - high is the row number (0 for A1...Q1, 1 for Q2...P2, ...),
+	       *  - stripen is the disk number in a row (0 for A1,Q2,P3, 1 for
+	       *    B1...),
+	       *  - off is the logical address to read,
+	       *  - chunk_stripe_length is the size of a stripe (typically 64k),
+	       *  - nstripes is the number of disks of a row
+	       *  - low is the offset of the data inside a stripe,
+	       *  - stripe_offset is the data offset in an array,
+	       *  - csize is the "potential" data to read. It will be reduced to
+	       *    size if the latter is smaller.
+	       *  - nparities is the number of parities (1 for RAID5, 2 for
+	       *    RAID6); used only in RAID5/6 code.
+	       */
+	      stripe_nr = grub_divmod64 (off, chunk_stripe_length, &low);
+
+	      /*
+	       * stripen is computed without the parities (0 for A1, A2, A3...
+	       * 1 for B1, B2...).
+	       */
+	      high = grub_divmod64 (stripe_nr, nstripes - nparities, &stripen);
+
+	      /*
+	       * the stripes are spread across the disks, each row their
+	       * position is shifted by 1 place. So to compute the real disk
+	       * number occuped by a stripe, we need to sum also the
+	       * "row number" in modulo nstripes (0 for A1, 1 for A2, 2 for
+	       * A3....).
+	       */
+	      grub_divmod64 (high + stripen, nstripes, &stripen);
+
+	      stripe_offset = low + chunk_stripe_length * high;
+	      csize = chunk_stripe_length - low;
+
 	      break;
 	    }
 	  default:
-- 
2.19.0


^ permalink raw reply related	[flat|nested] 48+ messages in thread

* Re: [PATCH 1/9] btrfs: Add support for reading a filesystem with a RAID 5 or RAID 6 profile.
  2018-09-26 20:40     ` Goffredo Baroncelli
@ 2018-09-27 15:47       ` Daniel Kiper
  0 siblings, 0 replies; 48+ messages in thread
From: Daniel Kiper @ 2018-09-27 15:47 UTC (permalink / raw)
  To: kreijack; +Cc: Daniel Kiper, grub-devel, linux-btrfs

On Wed, Sep 26, 2018 at 10:40:32PM +0200, Goffredo Baroncelli wrote:
> On 25/09/2018 17.31, Daniel Kiper wrote:
> > On Wed, Sep 19, 2018 at 08:40:32PM +0200, Goffredo Baroncelli wrote:
> >> From: Goffredo Baroncelli <kreijack@inwind.it>
> >>
> >> Signed-off-by: Goffredo Baroncelli <kreijack@inwind.it>
> >> ---
> >>  grub-core/fs/btrfs.c | 66 ++++++++++++++++++++++++++++++++++++++++++++
> >>  1 file changed, 66 insertions(+)
> >>
> >> diff --git a/grub-core/fs/btrfs.c b/grub-core/fs/btrfs.c
> >> index be195448d..56c42746d 100644
> >> --- a/grub-core/fs/btrfs.c
> >> +++ b/grub-core/fs/btrfs.c
> >> @@ -119,6 +119,8 @@ struct grub_btrfs_chunk_item
> >>  #define GRUB_BTRFS_CHUNK_TYPE_RAID1         0x10
> >>  #define GRUB_BTRFS_CHUNK_TYPE_DUPLICATED    0x20
> >>  #define GRUB_BTRFS_CHUNK_TYPE_RAID10        0x40
> >> +#define GRUB_BTRFS_CHUNK_TYPE_RAID5         0x80
> >> +#define GRUB_BTRFS_CHUNK_TYPE_RAID6         0x100
> >>    grub_uint8_t dummy2[0xc];
> >>    grub_uint16_t nstripes;
> >>    grub_uint16_t nsubstripes;
> >> @@ -764,6 +766,70 @@ grub_btrfs_read_logical (struct grub_btrfs_data *data, grub_disk_addr_t addr,
> >>  	      stripe_offset = low + chunk_stripe_length
> >>  		* high;
> >>  	      csize = chunk_stripe_length - low;
> >> +	      break;
> >> +	    }
> >> +	  case GRUB_BTRFS_CHUNK_TYPE_RAID5:
> >> +	  case GRUB_BTRFS_CHUNK_TYPE_RAID6:
> >> +	    {
> >> +	      grub_uint64_t nparities, block_nr, high, low;
> >> +
> >> +	      redundancy = 1;	/* no redundancy for now */
> >> +
> >> +	      if (grub_le_to_cpu64 (chunk->type) & GRUB_BTRFS_CHUNK_TYPE_RAID5)
> >> +		{
> >> +		  grub_dprintf ("btrfs", "RAID5\n");
> >> +		  nparities = 1;
> >> +		}
> >> +	      else
> >> +		{
> >> +		  grub_dprintf ("btrfs", "RAID6\n");
> >> +		  nparities = 2;
> >> +		}
> >> +
> >> +	      /*
> >> +	       * A RAID 6 layout consists of several blocks spread on the disks.
> >> +	       * The raid terminology is used to call all the blocks of a row
> >> +	       * "stripe". Unfortunately the BTRFS terminology confuses block
> >
> > Stripe is data set or parity (parity stripe) on one disk. Block has
> > different meaning. Please stick to btrfs terminology and say it clearly
> > in the comment. And even add a link to btrfs wiki page to ease reading.
> >
> > I think about this one:
> >   https://btrfs.wiki.kernel.org/index.php/Manpage/mkfs.btrfs#BLOCK_GROUPS.2C_CHUNKS.2C_RAID
> >
> >> +	       * and stripe.
> >
> > I do not think so. Or at least not so much...
>
> Trust me, generally speaking stripe is the "row" in the disks (without the parity); looking at the ext3 man page:
>
> ....
>                    stride=stride-size
>                           Configure  the  filesystem  for  a  RAID  array with
>                           stride-size filesystem blocks. This is the number of
>                           blocks  read or written to disk before moving to the
>                           next disk, which is sometimes  referred  to  as  the
>                           chunk   size.   This  mostly  affects  placement  of
>                           filesystem metadata like bitmaps at mke2fs  time  to
>                           avoid  placing them on a single disk, which can hurt
>                           performance.  It may also be used by the block allo???
>                           cator.
>
>                    stripe_width=stripe-width
>                           Configure  the  filesystem  for  a  RAID  array with
>                           stripe-width filesystem blocks per stripe.  This  is
>                           typically  stride-size * N, where N is the number of
>                           data-bearing disks in the  RAID  (e.g.  for  RAID  5
>                           there is one parity disk, so N will be the number of
>                           disks in the array minus 1).  This allows the  block
>                           allocator to prevent read-modify-write of the parity
>                           in a RAID stripe if possible when the data is  writ???
>                           ten.
>
> ....
> Looking at the RAID5 wikipedia page, it seems that the term "stripe"
> is coherent with the ext3 man page.

Ugh... It looks that I have messed up things. Sorry about that.

> I suspect that the confusion is due to the fact that in RAID1 a stripe
> is in ONE disk (because the others are like parities). In BTRFS the
> RAID5/6 code uses the structure of RAID1 with some minimal
> extensions...
>
> To be clear, I don't have problem to be adherent to the BTRFS
> terminology. However I found this very confusing because I was used to
> a different terminology. I am bit worried about the fact that grub

Yeah, I have the same feeling. However, I think that in btrfs code we
should stay with btrfs terms. Though I think that it make sense to
underline differences between btrfs and well known RAID here.

> uses both MD/DM code and BTRFS code; a quick look to the code (eg
> ./grub-core/disk/diskfilter.c) shows that the stripe_size field seems
> to be related to a disks row without parities.
>
> And... yes in BTRFS "chunk" is a completely different beast than what
> it is reported in the ext3 man page :-)

As I said above, please say it in the comment. This will ease reading
for people who are not used to btrfs terms.

> >> +	       *
> >> +	       *   Disk0  Disk1  Disk2  Disk3
> >> +	       *
> >> +	       *    A1     B1     P1     Q1
> >> +	       *    Q2     A2     B2     P2
> >> +	       *    P3     Q3     A3     B3
> >> +	       *  [...]
> >> +	       *
> >> +	       *  Note that the placement of the parities depends on row index.
> >> +	       *  In the code below:
> >> +	       *  - block_nr is the block number without the parities
> >
> > Well, it seems to me that the btrfs code introduces confusion not the
> > spec itself. I would leave code as is but s/block number/stripe number/.
>
> Ok I will replace the two terms. However I have to put a warning that this is a "BTRFS" terminology :-)

Yep, and please explain the differences at the beginning of the comment.

> >> +	       *    (A1 = 0, B1 = 1, A2 = 2, B2 = 3, ...),
> >> +	       *  - high is the row number (0 for A1...Q1, 1 for Q2...P2, ...),
> >> +	       *  - stripen is the disk number (0 for A1,Q2,P3, 1 for B1...),
> >
> > s/disk number/disk number in a row/
>
> This value doesn't depend by the row. So "number of disk" is more correct

Yes, but without "row" it is a bit confusing because it suggests that
it is an arbitrary number. Even if you give an example next to the
description. So, I am insisting on adding "in a row" here.

> >> +	       *  - off is the logical address to read
> >> +	       *  - chunk_stripe_length is the size of a block (typically 64k),
> >
> > s/a block/a stripe/

Taking into account discussion above I am not sure right now which one
is correct. Please double check and fix it if it is needed.

> >> +	       *  - nstripes is the number of disks,
> >
> > s/number of disks/number of disks in a row/
> ditto

As above, Is it total number of disks in array? I do not think so.
Hence, I think that "in a row" helps a bit. Even if it is not very
precise. However, if you come up with something better I am not
against it.

> > I miss the description of nparities here...
>
> Right:
> +              *  - nparities is the number of parities (1 for RAID5, 2 for RAID6);
> +              *    used only in RAID5/6 code.

LGTM.

> >> +	       *  - low is the offset of the data inside a stripe,
> >> +	       *  - stripe_offset is the disk offset,
> >
> > s/the disk offset/the data offset in an array/?
>
> Yes
>
> >
> >> +	       *  - csize is the "potential" data to read. It will be reduced to
> >> +	       *    size if the latter is smaller.
> >> +	       */
> >> +	      block_nr = grub_divmod64 (off, chunk_stripe_length, &low);
> >> +
> >> +	      /*
> >> +	       * stripen is computed without the parities (0 for A1, A2, A3...
> >> +	       * 1 for B1, B2...).
> >> +	       */
> >> +	      high = grub_divmod64 (block_nr, nstripes - nparities, &stripen);
> >
> > This is clear...
> >
> >> +	      /*
> >> +	       * stripen is recomputed considering the parities (0 for A1, 1 for
> >> +	       * A2, 2 for A3....).
> >> +	       */
> >> +	      grub_divmod64 (high + stripen, nstripes, &stripen);
> >
> > ... but this looks strange... You add disk number to row number. Hmmm...
> > It looks that it works but this is not obvious at first sight. Could you
> > explain that?
>
> What about
> +	      /*
> +	       * stripen is recomputed considering the parities: different row have
> +              * a different offset, we have to add to stripen the number of row ("high") in
> +              * modulo nstripes (0 for A1, 1 for A2, 2 for A3....).
> +	       */

This is better but not much. You are repeating what code does.
I am especially interested in why this math is correct. It is not
obvious at first sight. If it is not it should be explained.
Otherwise we will forget in a few months why it is correct.

Daniel

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 1/9] btrfs: Add support for reading a filesystem with a RAID 5 or RAID 6 profile.
  2018-09-25 15:31   ` Daniel Kiper
@ 2018-09-26 20:40     ` Goffredo Baroncelli
  2018-09-27 15:47       ` Daniel Kiper
  0 siblings, 1 reply; 48+ messages in thread
From: Goffredo Baroncelli @ 2018-09-26 20:40 UTC (permalink / raw)
  To: Daniel Kiper; +Cc: grub-devel, linux-btrfs, Goffredo Baroncelli

On 25/09/2018 17.31, Daniel Kiper wrote:
> On Wed, Sep 19, 2018 at 08:40:32PM +0200, Goffredo Baroncelli wrote:
>> From: Goffredo Baroncelli <kreijack@inwind.it>
>>
>> Signed-off-by: Goffredo Baroncelli <kreijack@inwind.it>
>> ---
>>  grub-core/fs/btrfs.c | 66 ++++++++++++++++++++++++++++++++++++++++++++
>>  1 file changed, 66 insertions(+)
>>
>> diff --git a/grub-core/fs/btrfs.c b/grub-core/fs/btrfs.c
>> index be195448d..56c42746d 100644
>> --- a/grub-core/fs/btrfs.c
>> +++ b/grub-core/fs/btrfs.c
>> @@ -119,6 +119,8 @@ struct grub_btrfs_chunk_item
>>  #define GRUB_BTRFS_CHUNK_TYPE_RAID1         0x10
>>  #define GRUB_BTRFS_CHUNK_TYPE_DUPLICATED    0x20
>>  #define GRUB_BTRFS_CHUNK_TYPE_RAID10        0x40
>> +#define GRUB_BTRFS_CHUNK_TYPE_RAID5         0x80
>> +#define GRUB_BTRFS_CHUNK_TYPE_RAID6         0x100
>>    grub_uint8_t dummy2[0xc];
>>    grub_uint16_t nstripes;
>>    grub_uint16_t nsubstripes;
>> @@ -764,6 +766,70 @@ grub_btrfs_read_logical (struct grub_btrfs_data *data, grub_disk_addr_t addr,
>>  	      stripe_offset = low + chunk_stripe_length
>>  		* high;
>>  	      csize = chunk_stripe_length - low;
>> +	      break;
>> +	    }
>> +	  case GRUB_BTRFS_CHUNK_TYPE_RAID5:
>> +	  case GRUB_BTRFS_CHUNK_TYPE_RAID6:
>> +	    {
>> +	      grub_uint64_t nparities, block_nr, high, low;
>> +
>> +	      redundancy = 1;	/* no redundancy for now */
>> +
>> +	      if (grub_le_to_cpu64 (chunk->type) & GRUB_BTRFS_CHUNK_TYPE_RAID5)
>> +		{
>> +		  grub_dprintf ("btrfs", "RAID5\n");
>> +		  nparities = 1;
>> +		}
>> +	      else
>> +		{
>> +		  grub_dprintf ("btrfs", "RAID6\n");
>> +		  nparities = 2;
>> +		}
>> +
>> +	      /*
>> +	       * A RAID 6 layout consists of several blocks spread on the disks.
>> +	       * The raid terminology is used to call all the blocks of a row
>> +	       * "stripe". Unfortunately the BTRFS terminology confuses block
> 
> Stripe is data set or parity (parity stripe) on one disk. Block has
> different meaning. Please stick to btrfs terminology and say it clearly
> in the comment. And even add a link to btrfs wiki page to ease reading.
> 
> I think about this one:
>   https://btrfs.wiki.kernel.org/index.php/Manpage/mkfs.btrfs#BLOCK_GROUPS.2C_CHUNKS.2C_RAID
> 
>> +	       * and stripe.
> 
> I do not think so. Or at least not so much...

Trust me, generally speaking stripe is the "row" in the disks (without the parity); looking at the ext3 man page:

....
                   stride=stride-size
                          Configure  the  filesystem  for  a  RAID  array with
                          stride-size filesystem blocks. This is the number of
                          blocks  read or written to disk before moving to the
                          next disk, which is sometimes  referred  to  as  the
                          chunk   size.   This  mostly  affects  placement  of
                          filesystem metadata like bitmaps at mke2fs  time  to
                          avoid  placing them on a single disk, which can hurt
                          performance.  It may also be used by the block allo‐
                          cator.

                   stripe_width=stripe-width
                          Configure  the  filesystem  for  a  RAID  array with
                          stripe-width filesystem blocks per stripe.  This  is
                          typically  stride-size * N, where N is the number of
                          data-bearing disks in the  RAID  (e.g.  for  RAID  5
                          there is one parity disk, so N will be the number of
                          disks in the array minus 1).  This allows the  block
                          allocator to prevent read-modify-write of the parity
                          in a RAID stripe if possible when the data is  writ‐
                          ten.

....
Looking at the RAID5 wikipedia page, it seems that the term "stripe" is coherent with the ext3 man page.

I suspect that the confusion is due to the fact that in RAID1 a stripe is in ONE disk (because the others are like parities). In BTRFS the RAID5/6 code uses the structure of RAID1 with some minimal extensions...

To be clear, I don't have problem to be adherent to the BTRFS terminology. However I found this very confusing because I was used to a different terminology. I am bit worried about the fact that grub uses both MD/DM code and BTRFS code; a quick look to the code (eg ./grub-core/disk/diskfilter.c) shows that the stripe_size field seems to be related to a disks row without parities.

And... yes in BTRFS "chunk" is a completely different beast than what it is reported in the ext3 man page :-)


> 
>> +	       *
>> +	       *   Disk0  Disk1  Disk2  Disk3
>> +	       *
>> +	       *    A1     B1     P1     Q1
>> +	       *    Q2     A2     B2     P2
>> +	       *    P3     Q3     A3     B3
>> +	       *  [...]
>> +	       *
>> +	       *  Note that the placement of the parities depends on row index.
>> +	       *  In the code below:
>> +	       *  - block_nr is the block number without the parities
> 
> Well, it seems to me that the btrfs code introduces confusion not the
> spec itself. I would leave code as is but s/block number/stripe number/.

Ok I will replace the two terms. However I have to put a warning that this is a "BTRFS" terminology :-)
> 
>> +	       *    (A1 = 0, B1 = 1, A2 = 2, B2 = 3, ...),
>> +	       *  - high is the row number (0 for A1...Q1, 1 for Q2...P2, ...),
>> +	       *  - stripen is the disk number (0 for A1,Q2,P3, 1 for B1...),
> 
> s/disk number/disk number in a row/

This value doesn't depend by the row. So "number of disk" is more correct
> 
>> +	       *  - off is the logical address to read
>> +	       *  - chunk_stripe_length is the size of a block (typically 64k),
> 
> s/a block/a stripe/
> 
>> +	       *  - nstripes is the number of disks,
> 
> s/number of disks/number of disks in a row/
ditto


> 
> I miss the description of nparities here...

Right:
+              *  - nparities is the number of parities (1 for RAID5, 2 for RAID6);
+              *    used only in RAID5/6 code.

> 
>> +	       *  - low is the offset of the data inside a stripe,
>> +	       *  - stripe_offset is the disk offset,
> 
> s/the disk offset/the data offset in an array/?

Yes

> 
>> +	       *  - csize is the "potential" data to read. It will be reduced to
>> +	       *    size if the latter is smaller.
>> +	       */
>> +	      block_nr = grub_divmod64 (off, chunk_stripe_length, &low);
>> +
>> +	      /*
>> +	       * stripen is computed without the parities (0 for A1, A2, A3...
>> +	       * 1 for B1, B2...).
>> +	       */
>> +	      high = grub_divmod64 (block_nr, nstripes - nparities, &stripen);
> 
> This is clear...
> 
>> +	      /*
>> +	       * stripen is recomputed considering the parities (0 for A1, 1 for
>> +	       * A2, 2 for A3....).
>> +	       */
>> +	      grub_divmod64 (high + stripen, nstripes, &stripen);
> 
> ... but this looks strange... You add disk number to row number. Hmmm...
> It looks that it works but this is not obvious at first sight. Could you
> explain that?

What about
+	      /*
+	       * stripen is recomputed considering the parities: different row have
+              * a different offset, we have to add to stripen the number of row ("high") in 
+              * modulo nstripes (0 for A1, 1 for A2, 2 for A3....).
+	       */

> 
>> +	      stripe_offset = low + chunk_stripe_length * high;
>> +	      csize = chunk_stripe_length - low;
>> +
>>  	      break;
>>  	    }
>>  	  default:
> 
> Daniel
> 


-- 
gpg @keyserver.linux.it: Goffredo Baroncelli <kreijackATinwind.it>
Key fingerprint BBF5 1610 0B64 DAC6 5F7D  17B2 0EDA 9B37 8B82 E0B5

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 1/9] btrfs: Add support for reading a filesystem with a RAID 5 or RAID 6 profile.
  2018-09-19 18:40 ` [PATCH 1/9] btrfs: Add support for reading a filesystem with a RAID 5 or RAID 6 profile Goffredo Baroncelli
@ 2018-09-25 15:31   ` Daniel Kiper
  2018-09-26 20:40     ` Goffredo Baroncelli
  0 siblings, 1 reply; 48+ messages in thread
From: Daniel Kiper @ 2018-09-25 15:31 UTC (permalink / raw)
  To: Goffredo Baroncelli
  Cc: grub-devel, Daniel Kiper, linux-btrfs, Goffredo Baroncelli

On Wed, Sep 19, 2018 at 08:40:32PM +0200, Goffredo Baroncelli wrote:
> From: Goffredo Baroncelli <kreijack@inwind.it>
>
> Signed-off-by: Goffredo Baroncelli <kreijack@inwind.it>
> ---
>  grub-core/fs/btrfs.c | 66 ++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 66 insertions(+)
>
> diff --git a/grub-core/fs/btrfs.c b/grub-core/fs/btrfs.c
> index be195448d..56c42746d 100644
> --- a/grub-core/fs/btrfs.c
> +++ b/grub-core/fs/btrfs.c
> @@ -119,6 +119,8 @@ struct grub_btrfs_chunk_item
>  #define GRUB_BTRFS_CHUNK_TYPE_RAID1         0x10
>  #define GRUB_BTRFS_CHUNK_TYPE_DUPLICATED    0x20
>  #define GRUB_BTRFS_CHUNK_TYPE_RAID10        0x40
> +#define GRUB_BTRFS_CHUNK_TYPE_RAID5         0x80
> +#define GRUB_BTRFS_CHUNK_TYPE_RAID6         0x100
>    grub_uint8_t dummy2[0xc];
>    grub_uint16_t nstripes;
>    grub_uint16_t nsubstripes;
> @@ -764,6 +766,70 @@ grub_btrfs_read_logical (struct grub_btrfs_data *data, grub_disk_addr_t addr,
>  	      stripe_offset = low + chunk_stripe_length
>  		* high;
>  	      csize = chunk_stripe_length - low;
> +	      break;
> +	    }
> +	  case GRUB_BTRFS_CHUNK_TYPE_RAID5:
> +	  case GRUB_BTRFS_CHUNK_TYPE_RAID6:
> +	    {
> +	      grub_uint64_t nparities, block_nr, high, low;
> +
> +	      redundancy = 1;	/* no redundancy for now */
> +
> +	      if (grub_le_to_cpu64 (chunk->type) & GRUB_BTRFS_CHUNK_TYPE_RAID5)
> +		{
> +		  grub_dprintf ("btrfs", "RAID5\n");
> +		  nparities = 1;
> +		}
> +	      else
> +		{
> +		  grub_dprintf ("btrfs", "RAID6\n");
> +		  nparities = 2;
> +		}
> +
> +	      /*
> +	       * A RAID 6 layout consists of several blocks spread on the disks.
> +	       * The raid terminology is used to call all the blocks of a row
> +	       * "stripe". Unfortunately the BTRFS terminology confuses block

Stripe is data set or parity (parity stripe) on one disk. Block has
different meaning. Please stick to btrfs terminology and say it clearly
in the comment. And even add a link to btrfs wiki page to ease reading.

I think about this one:
  https://btrfs.wiki.kernel.org/index.php/Manpage/mkfs.btrfs#BLOCK_GROUPS.2C_CHUNKS.2C_RAID

> +	       * and stripe.

I do not think so. Or at least not so much...

> +	       *
> +	       *   Disk0  Disk1  Disk2  Disk3
> +	       *
> +	       *    A1     B1     P1     Q1
> +	       *    Q2     A2     B2     P2
> +	       *    P3     Q3     A3     B3
> +	       *  [...]
> +	       *
> +	       *  Note that the placement of the parities depends on row index.
> +	       *  In the code below:
> +	       *  - block_nr is the block number without the parities

Well, it seems to me that the btrfs code introduces confusion not the
spec itself. I would leave code as is but s/block number/stripe number/.

> +	       *    (A1 = 0, B1 = 1, A2 = 2, B2 = 3, ...),
> +	       *  - high is the row number (0 for A1...Q1, 1 for Q2...P2, ...),
> +	       *  - stripen is the disk number (0 for A1,Q2,P3, 1 for B1...),

s/disk number/disk number in a row/

> +	       *  - off is the logical address to read
> +	       *  - chunk_stripe_length is the size of a block (typically 64k),

s/a block/a stripe/

> +	       *  - nstripes is the number of disks,

s/number of disks/number of disks in a row/

I miss the description of nparities here...

> +	       *  - low is the offset of the data inside a stripe,
> +	       *  - stripe_offset is the disk offset,

s/the disk offset/the data offset in an array/?

> +	       *  - csize is the "potential" data to read. It will be reduced to
> +	       *    size if the latter is smaller.
> +	       */
> +	      block_nr = grub_divmod64 (off, chunk_stripe_length, &low);
> +
> +	      /*
> +	       * stripen is computed without the parities (0 for A1, A2, A3...
> +	       * 1 for B1, B2...).
> +	       */
> +	      high = grub_divmod64 (block_nr, nstripes - nparities, &stripen);

This is clear...

> +	      /*
> +	       * stripen is recomputed considering the parities (0 for A1, 1 for
> +	       * A2, 2 for A3....).
> +	       */
> +	      grub_divmod64 (high + stripen, nstripes, &stripen);

... but this looks strange... You add disk number to row number. Hmmm...
It looks that it works but this is not obvious at first sight. Could you
explain that?

> +	      stripe_offset = low + chunk_stripe_length * high;
> +	      csize = chunk_stripe_length - low;
> +
>  	      break;
>  	    }
>  	  default:

Daniel

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 1/9] btrfs: Add support for reading a filesystem with a RAID 5 or RAID 6 profile.
  2018-09-19 18:36 Goffredo Baroncelli
@ 2018-09-19 18:42 ` Goffredo Baroncelli
  0 siblings, 0 replies; 48+ messages in thread
From: Goffredo Baroncelli @ 2018-09-19 18:42 UTC (permalink / raw)
  To: Goffredo Baroncelli, grub-devel

Please ignore this email

On 19/09/2018 20.36, Goffredo Baroncelli wrote:
> From: Goffredo Baroncelli <kreijack@inwind.it>
> 
> Signed-off-by: Goffredo Baroncelli <kreijack@inwind.it>
> ---
>  grub-core/fs/btrfs.c | 66 ++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 66 insertions(+)
> 
> diff --git a/grub-core/fs/btrfs.c b/grub-core/fs/btrfs.c
> index be195448d..56c42746d 100644
> --- a/grub-core/fs/btrfs.c
> +++ b/grub-core/fs/btrfs.c
> @@ -119,6 +119,8 @@ struct grub_btrfs_chunk_item
>  #define GRUB_BTRFS_CHUNK_TYPE_RAID1         0x10
>  #define GRUB_BTRFS_CHUNK_TYPE_DUPLICATED    0x20
>  #define GRUB_BTRFS_CHUNK_TYPE_RAID10        0x40
> +#define GRUB_BTRFS_CHUNK_TYPE_RAID5         0x80
> +#define GRUB_BTRFS_CHUNK_TYPE_RAID6         0x100
>    grub_uint8_t dummy2[0xc];
>    grub_uint16_t nstripes;
>    grub_uint16_t nsubstripes;
> @@ -764,6 +766,70 @@ grub_btrfs_read_logical (struct grub_btrfs_data *data, grub_disk_addr_t addr,
>  	      stripe_offset = low + chunk_stripe_length
>  		* high;
>  	      csize = chunk_stripe_length - low;
> +	      break;
> +	    }
> +	  case GRUB_BTRFS_CHUNK_TYPE_RAID5:
> +	  case GRUB_BTRFS_CHUNK_TYPE_RAID6:
> +	    {
> +	      grub_uint64_t nparities, block_nr, high, low;
> +
> +	      redundancy = 1;	/* no redundancy for now */
> +
> +	      if (grub_le_to_cpu64 (chunk->type) & GRUB_BTRFS_CHUNK_TYPE_RAID5)
> +		{
> +		  grub_dprintf ("btrfs", "RAID5\n");
> +		  nparities = 1;
> +		}
> +	      else
> +		{
> +		  grub_dprintf ("btrfs", "RAID6\n");
> +		  nparities = 2;
> +		}
> +
> +	      /*
> +	       * A RAID 6 layout consists of several blocks spread on the disks.
> +	       * The raid terminology is used to call all the blocks of a row
> +	       * "stripe". Unfortunately the BTRFS terminology confuses block
> +	       * and stripe.
> +	       *
> +	       *   Disk0  Disk1  Disk2  Disk3
> +	       *
> +	       *    A1     B1     P1     Q1
> +	       *    Q2     A2     B2     P2
> +	       *    P3     Q3     A3     B3
> +	       *  [...]
> +	       *
> +	       *  Note that the placement of the parities depends on row index.
> +	       *  In the code below:
> +	       *  - block_nr is the block number without the parities
> +	       *    (A1 = 0, B1 = 1, A2 = 2, B2 = 3, ...),
> +	       *  - high is the row number (0 for A1...Q1, 1 for Q2...P2, ...),
> +	       *  - stripen is the disk number (0 for A1,Q2,P3, 1 for B1...),
> +	       *  - off is the logical address to read
> +	       *  - chunk_stripe_length is the size of a block (typically 64k),
> +	       *  - nstripes is the number of disks,
> +	       *  - low is the offset of the data inside a stripe,
> +	       *  - stripe_offset is the disk offset,
> +	       *  - csize is the "potential" data to read. It will be reduced to
> +	       *    size if the latter is smaller.
> +	       */
> +	      block_nr = grub_divmod64 (off, chunk_stripe_length, &low);
> +
> +	      /*
> +	       * stripen is computed without the parities (0 for A1, A2, A3...
> +	       * 1 for B1, B2...).
> +	       */
> +	      high = grub_divmod64 (block_nr, nstripes - nparities, &stripen);
> +
> +	      /*
> +	       * stripen is recomputed considering the parities (0 for A1, 1 for
> +	       * A2, 2 for A3....).
> +	       */
> +	      grub_divmod64 (high + stripen, nstripes, &stripen);
> +
> +	      stripe_offset = low + chunk_stripe_length * high;
> +	      csize = chunk_stripe_length - low;
> +
>  	      break;
>  	    }
>  	  default:
> 


-- 
gpg @keyserver.linux.it: Goffredo Baroncelli <kreijackATinwind.it>
Key fingerprint BBF5 1610 0B64 DAC6 5F7D  17B2 0EDA 9B37 8B82 E0B5


^ permalink raw reply	[flat|nested] 48+ messages in thread

* [PATCH 1/9] btrfs: Add support for reading a filesystem with a RAID 5 or RAID 6 profile.
  2018-09-19 18:40 [PATCH V7] Add support for BTRFS raid5/6 to GRUB Goffredo Baroncelli
@ 2018-09-19 18:40 ` Goffredo Baroncelli
  2018-09-25 15:31   ` Daniel Kiper
  0 siblings, 1 reply; 48+ messages in thread
From: Goffredo Baroncelli @ 2018-09-19 18:40 UTC (permalink / raw)
  To: grub-devel; +Cc: Daniel Kiper, linux-btrfs, Goffredo Baroncelli

From: Goffredo Baroncelli <kreijack@inwind.it>

Signed-off-by: Goffredo Baroncelli <kreijack@inwind.it>
---
 grub-core/fs/btrfs.c | 66 ++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 66 insertions(+)

diff --git a/grub-core/fs/btrfs.c b/grub-core/fs/btrfs.c
index be195448d..56c42746d 100644
--- a/grub-core/fs/btrfs.c
+++ b/grub-core/fs/btrfs.c
@@ -119,6 +119,8 @@ struct grub_btrfs_chunk_item
 #define GRUB_BTRFS_CHUNK_TYPE_RAID1         0x10
 #define GRUB_BTRFS_CHUNK_TYPE_DUPLICATED    0x20
 #define GRUB_BTRFS_CHUNK_TYPE_RAID10        0x40
+#define GRUB_BTRFS_CHUNK_TYPE_RAID5         0x80
+#define GRUB_BTRFS_CHUNK_TYPE_RAID6         0x100
   grub_uint8_t dummy2[0xc];
   grub_uint16_t nstripes;
   grub_uint16_t nsubstripes;
@@ -764,6 +766,70 @@ grub_btrfs_read_logical (struct grub_btrfs_data *data, grub_disk_addr_t addr,
 	      stripe_offset = low + chunk_stripe_length
 		* high;
 	      csize = chunk_stripe_length - low;
+	      break;
+	    }
+	  case GRUB_BTRFS_CHUNK_TYPE_RAID5:
+	  case GRUB_BTRFS_CHUNK_TYPE_RAID6:
+	    {
+	      grub_uint64_t nparities, block_nr, high, low;
+
+	      redundancy = 1;	/* no redundancy for now */
+
+	      if (grub_le_to_cpu64 (chunk->type) & GRUB_BTRFS_CHUNK_TYPE_RAID5)
+		{
+		  grub_dprintf ("btrfs", "RAID5\n");
+		  nparities = 1;
+		}
+	      else
+		{
+		  grub_dprintf ("btrfs", "RAID6\n");
+		  nparities = 2;
+		}
+
+	      /*
+	       * A RAID 6 layout consists of several blocks spread on the disks.
+	       * The raid terminology is used to call all the blocks of a row
+	       * "stripe". Unfortunately the BTRFS terminology confuses block
+	       * and stripe.
+	       *
+	       *   Disk0  Disk1  Disk2  Disk3
+	       *
+	       *    A1     B1     P1     Q1
+	       *    Q2     A2     B2     P2
+	       *    P3     Q3     A3     B3
+	       *  [...]
+	       *
+	       *  Note that the placement of the parities depends on row index.
+	       *  In the code below:
+	       *  - block_nr is the block number without the parities
+	       *    (A1 = 0, B1 = 1, A2 = 2, B2 = 3, ...),
+	       *  - high is the row number (0 for A1...Q1, 1 for Q2...P2, ...),
+	       *  - stripen is the disk number (0 for A1,Q2,P3, 1 for B1...),
+	       *  - off is the logical address to read
+	       *  - chunk_stripe_length is the size of a block (typically 64k),
+	       *  - nstripes is the number of disks,
+	       *  - low is the offset of the data inside a stripe,
+	       *  - stripe_offset is the disk offset,
+	       *  - csize is the "potential" data to read. It will be reduced to
+	       *    size if the latter is smaller.
+	       */
+	      block_nr = grub_divmod64 (off, chunk_stripe_length, &low);
+
+	      /*
+	       * stripen is computed without the parities (0 for A1, A2, A3...
+	       * 1 for B1, B2...).
+	       */
+	      high = grub_divmod64 (block_nr, nstripes - nparities, &stripen);
+
+	      /*
+	       * stripen is recomputed considering the parities (0 for A1, 1 for
+	       * A2, 2 for A3....).
+	       */
+	      grub_divmod64 (high + stripen, nstripes, &stripen);
+
+	      stripe_offset = low + chunk_stripe_length * high;
+	      csize = chunk_stripe_length - low;
+
 	      break;
 	    }
 	  default:
-- 
2.19.0

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH 1/9] btrfs: Add support for reading a filesystem with a RAID 5 or RAID 6 profile.
@ 2018-09-19 18:36 Goffredo Baroncelli
  2018-09-19 18:42 ` Goffredo Baroncelli
  0 siblings, 1 reply; 48+ messages in thread
From: Goffredo Baroncelli @ 2018-09-19 18:36 UTC (permalink / raw)
  To: grub-devel; +Cc: Goffredo Baroncelli

From: Goffredo Baroncelli <kreijack@inwind.it>

Signed-off-by: Goffredo Baroncelli <kreijack@inwind.it>
---
 grub-core/fs/btrfs.c | 66 ++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 66 insertions(+)

diff --git a/grub-core/fs/btrfs.c b/grub-core/fs/btrfs.c
index be195448d..56c42746d 100644
--- a/grub-core/fs/btrfs.c
+++ b/grub-core/fs/btrfs.c
@@ -119,6 +119,8 @@ struct grub_btrfs_chunk_item
 #define GRUB_BTRFS_CHUNK_TYPE_RAID1         0x10
 #define GRUB_BTRFS_CHUNK_TYPE_DUPLICATED    0x20
 #define GRUB_BTRFS_CHUNK_TYPE_RAID10        0x40
+#define GRUB_BTRFS_CHUNK_TYPE_RAID5         0x80
+#define GRUB_BTRFS_CHUNK_TYPE_RAID6         0x100
   grub_uint8_t dummy2[0xc];
   grub_uint16_t nstripes;
   grub_uint16_t nsubstripes;
@@ -764,6 +766,70 @@ grub_btrfs_read_logical (struct grub_btrfs_data *data, grub_disk_addr_t addr,
 	      stripe_offset = low + chunk_stripe_length
 		* high;
 	      csize = chunk_stripe_length - low;
+	      break;
+	    }
+	  case GRUB_BTRFS_CHUNK_TYPE_RAID5:
+	  case GRUB_BTRFS_CHUNK_TYPE_RAID6:
+	    {
+	      grub_uint64_t nparities, block_nr, high, low;
+
+	      redundancy = 1;	/* no redundancy for now */
+
+	      if (grub_le_to_cpu64 (chunk->type) & GRUB_BTRFS_CHUNK_TYPE_RAID5)
+		{
+		  grub_dprintf ("btrfs", "RAID5\n");
+		  nparities = 1;
+		}
+	      else
+		{
+		  grub_dprintf ("btrfs", "RAID6\n");
+		  nparities = 2;
+		}
+
+	      /*
+	       * A RAID 6 layout consists of several blocks spread on the disks.
+	       * The raid terminology is used to call all the blocks of a row
+	       * "stripe". Unfortunately the BTRFS terminology confuses block
+	       * and stripe.
+	       *
+	       *   Disk0  Disk1  Disk2  Disk3
+	       *
+	       *    A1     B1     P1     Q1
+	       *    Q2     A2     B2     P2
+	       *    P3     Q3     A3     B3
+	       *  [...]
+	       *
+	       *  Note that the placement of the parities depends on row index.
+	       *  In the code below:
+	       *  - block_nr is the block number without the parities
+	       *    (A1 = 0, B1 = 1, A2 = 2, B2 = 3, ...),
+	       *  - high is the row number (0 for A1...Q1, 1 for Q2...P2, ...),
+	       *  - stripen is the disk number (0 for A1,Q2,P3, 1 for B1...),
+	       *  - off is the logical address to read
+	       *  - chunk_stripe_length is the size of a block (typically 64k),
+	       *  - nstripes is the number of disks,
+	       *  - low is the offset of the data inside a stripe,
+	       *  - stripe_offset is the disk offset,
+	       *  - csize is the "potential" data to read. It will be reduced to
+	       *    size if the latter is smaller.
+	       */
+	      block_nr = grub_divmod64 (off, chunk_stripe_length, &low);
+
+	      /*
+	       * stripen is computed without the parities (0 for A1, A2, A3...
+	       * 1 for B1, B2...).
+	       */
+	      high = grub_divmod64 (block_nr, nstripes - nparities, &stripen);
+
+	      /*
+	       * stripen is recomputed considering the parities (0 for A1, 1 for
+	       * A2, 2 for A3....).
+	       */
+	      grub_divmod64 (high + stripen, nstripes, &stripen);
+
+	      stripe_offset = low + chunk_stripe_length * high;
+	      csize = chunk_stripe_length - low;
+
 	      break;
 	    }
 	  default:
-- 
2.19.0



^ permalink raw reply related	[flat|nested] 48+ messages in thread

* Re: [PATCH 1/9] btrfs: Add support for reading a filesystem with a RAID 5 or RAID 6 profile.
  2018-07-18  6:24     ` Goffredo Baroncelli
@ 2018-09-03 12:52       ` Daniel Kiper
  0 siblings, 0 replies; 48+ messages in thread
From: Daniel Kiper @ 2018-09-03 12:52 UTC (permalink / raw)
  To: Goffredo Baroncelli; +Cc: Daniel Kiper, The development of GNU GRUB

On Wed, Jul 18, 2018 at 08:24:50AM +0200, Goffredo Baroncelli wrote:
> On 07/12/2018 03:46 PM, Daniel Kiper wrote:
> > On Tue, Jun 19, 2018 at 07:39:48PM +0200, Goffredo Baroncelli wrote:
> >> Signed-off-by: Goffredo Baroncelli <kreijack@inwind.it>
> >> ---
> >>  grub-core/fs/btrfs.c | 70 ++++++++++++++++++++++++++++++++++++++++++++
> >>  1 file changed, 70 insertions(+)
> >>
> >> diff --git a/grub-core/fs/btrfs.c b/grub-core/fs/btrfs.c
> >> index be195448d..fbabaebbe 100644
> >> --- a/grub-core/fs/btrfs.c
> >> +++ b/grub-core/fs/btrfs.c
> >> @@ -119,6 +119,8 @@ struct grub_btrfs_chunk_item
> >>  #define GRUB_BTRFS_CHUNK_TYPE_RAID1         0x10
> >>  #define GRUB_BTRFS_CHUNK_TYPE_DUPLICATED    0x20
> >>  #define GRUB_BTRFS_CHUNK_TYPE_RAID10        0x40
> >> +#define GRUB_BTRFS_CHUNK_TYPE_RAID5         0x80
> >> +#define GRUB_BTRFS_CHUNK_TYPE_RAID6         0x100
> >>    grub_uint8_t dummy2[0xc];
> >>    grub_uint16_t nstripes;
> >>    grub_uint16_t nsubstripes;
> >> @@ -764,6 +766,74 @@ grub_btrfs_read_logical (struct grub_btrfs_data *data, grub_disk_addr_t addr,
> >>  	      stripe_offset = low + chunk_stripe_length
> >>  		* high;
> >>  	      csize = chunk_stripe_length - low;
> >> +	      break;
> >> +	    }
> >> +	  case GRUB_BTRFS_CHUNK_TYPE_RAID5:
> >> +	  case GRUB_BTRFS_CHUNK_TYPE_RAID6:
> >> +	    {
> >> +	      grub_uint64_t nparities, stripe_nr, high, low;
> >> +
> >> +	      redundancy = 1;	/* no redundancy for now */
> >> +
> >> +	      if (grub_le_to_cpu64 (chunk->type) & GRUB_BTRFS_CHUNK_TYPE_RAID5)
> >> +		{
> >> +		  grub_dprintf ("btrfs", "RAID5\n");
> >> +		  nparities = 1;
> >> +		}
> >> +	      else
> >> +		{
> >> +		  grub_dprintf ("btrfs", "RAID6\n");
> >> +		  nparities = 2;
> >> +		}
> >> +
> >> +	      /*
> >> +	       * Below is an example of a RAID 6 layout and the meaning of the
> >> +	       * variables. The same applies to RAID 5. The only differences is
> >> +	       * that there is only one parity disk instead of two.
> >> +	       *
> >> +	       * A RAID 6 layout consists of several stripes spread
> >> +	       * on the disks, following a layout like the one below
> >> +	       *
> >> +	       *   Disk0  Disk1  Disk2  Ddisk3
> >
> > s/Ddisk3/Disk3/
> >
> >> +	       *
> >> +	       *    A1     B1     P1     Q1
> >> +	       *    Q2     A2     B2     P2
> >> +	       *    P3     Q3     A3     B3
> >> +	       *  [...]
> >> +	       *
> >> +	       *  Note that the placement of the parities depends on row index.
> >> +	       *  In the code below:
> >> +	       *  - stripe_nr is the stripe number not considering the parities
> >> +	       *    (A1 = 0, B1 = 1, A2 = 2, B2 = 3, ...),
> >> +	       *  - high is the row number (0 for A1...Q1, 1 for Q2...P2, ...),
> >> +	       *  - stripen is the column number (or disk number),
> >> +	       *  - off is the logical address to read (from the beginning of
> >> +	       *    the chunk space),
> >
> > What do you mean by chunk?
>
> Is a specific btrfs data structure (see
> https://btrfs.wiki.kernel.org/index.php/Manpage/mkfs.btrfs#BLOCK_GROUPS.2C_CHUNKS.2C_RAID).
> Most of the BTRFS internal structures (e.g. where the file data is
> stored) are in terms of *logical address*. At this level there is no
> concept of redundancy or raid nor disks. You can see the logical
> address as a flat space, large as the sum of the disks sizes (minus
> the space used by the parities or mirroring).
>
> The "logical space" is divided in "slices" called chunk. A chunk
> typically has a size in terms of hundred of megabytes (IIRC up to 1GB)
>
> A chunk (or block group) is a mapping. A chunk maps a "logical
> address" to a "physical address" on the disks.
>
> You can think a chunk as a structure like:
>
> 	u64 profile
> 	u64 logical_start, logical_end
> 	u64 disk1_id, disk1_start, disk1_end
> 	u64 disk2_id, disk2_start, disk2_end
> 	u64 disk3_id, disk3_start, disk3_end
> 	[...]
>
> In order to translate a "logical_address" to the pair of "disk" and "physical address:
> - find the chunk which maps the logical address (looking to the "logical_start", "logical_end")
> - then on the basis of "profile" value you can have the following cases:
> 	profile == SINGLE:
> 		is the simplest one: the logical address is mapped to the range
> 		(disk1_start...disk1_end) of the disk "disk_id1"
>
> 	profile == RAID1
> 		like the previous one; however if the disk1 is not available,
> 		you can find the data in the range
> 		(disk2_start...disk2_end) of the disk "disk_id2". NOTE: the two ranges
> 		must have the same sizes, but may have different starts
>
> 	profile == RAID0
> 		the data is without redundancy and it is spread on different disk each
> 		"chunk_stripe_length" bytes; so:
> 			off = logical_address - logical_start // this is what I call
> 							      // address from the
> 							      // beginning of the chunk
>
>
>
>
>
>
> 	and so on....
>
>
>
> >
> >> +	       *  - chunk_stripe_length is the size of a stripe (typically 64k),
> >> +	       *  - nstripes is the number of disks,
> >> +	       *  - low is the offset of the data inside a stripe,
> >> +	       *  - stripe_offset is the disk offset from the beginning
> >> +	       *    of the disk chunk mapping start,
> >
> > Err... I am confused here...
>
> As described above, the chunk data structure translate from "logical
> address" to the pair (disk, physical address on the disk). Each chunk
> translate a small portion of the space (typically 1GB); so it is more
> correct to say that a chunk maps a range in the logical address to a
> range in the disks. 'off' is the offset from the beginning of the
> range in the logical address. 'stripe_offset' is the offset from the
> beginning of the range in the physical space
>
> >> +	       *  - csize is the "potential" data to read. It will be reduced to
> >> +	       *    size if the latter is smaller.
> >> +	       */
> >> +	      stripe_nr = grub_divmod64 (off, chunk_stripe_length, &low);
> >
> > Here it seems to me that off is from the beginning of RAID volume space.
> > So, it does not agree with the comment above...
>
> RAID volume space == chunk logical address space
>
> >> +	      /*
> >> +	       * stripen is evaluated without considering
> >> +	       * the parities (0 for A1, A2, A3... 1 for B1, B2...).
> >> +	       */
> >> +	      high = grub_divmod64 (stripe_nr, nstripes - nparities, &stripen);
> >
> > This looks good.
> >
> >> +	      /*
> >> +	       * stripen now considers also the parities (0 for A1, 1 for A2,
> >> +	       * 2 for A3....). The math is performed modulo number of disks.
> >> +	       */
> >> +	      grub_divmod64 (high + stripen, nstripes, &stripen);
> >
> > Ditto.
> >
> >> +	      stripe_offset = low + chunk_stripe_length * high;
> >
> > And here I am confused. Why "* high"? If chunk_stripe_length is stripe
> > length then stripe_offset leads to nowhere. Am I missing something?
> > I have a feeling that comments does not agree with the code somehow.
>
> The "physical range" is divided in small range called(?) "stripe".
> These stripe are spread on the disks. 'high' is the offset from the
> beginning of the physical chunk range in unit of "chunk_stripe_length"
>
> I found quite confuse the btrfs terminology around the "stripe" word.
> And I have to point out that these variables were named before my
> patch.
>
> I am afraid that my comment doesn't help to "de-obfuscate" my code. I

It does. Please extend the code comment accordingly. You can add a link
to the BTRFS Wiki page. Though please try to use terms only from BRFS
specs/docs. Do not introduce new ones if it is not really needed. This
whole thing will ease reading new and old code.

> know that my English is far to be perfect. However this code is not so

Do not worry about that. Many people here learn English. Just keep doing it.

> different that the previous one (i.e. handling the btrfs
> raid1/dup/raid0....). So I am thinking to remove all the comments
> because it seems to me that the my comments increase the confusion
> instead of reducing it.

No, please do not do that. We needed it. If you take into account my
comments then everything should be OK.

Anyway, the code starts looking good. I think that we need 1-3
iterations and I will be able to get it into GRUB2 git repo.
So, please keep working on this feature.

Daniel


^ permalink raw reply	[flat|nested] 48+ messages in thread

* [PATCH 1/9] btrfs: Add support for reading a filesystem with a RAID 5 or RAID 6 profile.
  2018-07-12 13:46   ` Daniel Kiper
@ 2018-07-18  6:24     ` Goffredo Baroncelli
  2018-09-03 12:52       ` Daniel Kiper
  0 siblings, 1 reply; 48+ messages in thread
From: Goffredo Baroncelli @ 2018-07-18  6:24 UTC (permalink / raw)
  To: Daniel Kiper, The development of GNU GRUB

On 07/12/2018 03:46 PM, Daniel Kiper wrote:
> On Tue, Jun 19, 2018 at 07:39:48PM +0200, Goffredo Baroncelli wrote:
>> Signed-off-by: Goffredo Baroncelli <kreijack@inwind.it>
>> ---
>>  grub-core/fs/btrfs.c | 70 ++++++++++++++++++++++++++++++++++++++++++++
>>  1 file changed, 70 insertions(+)
>>
>> diff --git a/grub-core/fs/btrfs.c b/grub-core/fs/btrfs.c
>> index be195448d..fbabaebbe 100644
>> --- a/grub-core/fs/btrfs.c
>> +++ b/grub-core/fs/btrfs.c
>> @@ -119,6 +119,8 @@ struct grub_btrfs_chunk_item
>>  #define GRUB_BTRFS_CHUNK_TYPE_RAID1         0x10
>>  #define GRUB_BTRFS_CHUNK_TYPE_DUPLICATED    0x20
>>  #define GRUB_BTRFS_CHUNK_TYPE_RAID10        0x40
>> +#define GRUB_BTRFS_CHUNK_TYPE_RAID5         0x80
>> +#define GRUB_BTRFS_CHUNK_TYPE_RAID6         0x100
>>    grub_uint8_t dummy2[0xc];
>>    grub_uint16_t nstripes;
>>    grub_uint16_t nsubstripes;
>> @@ -764,6 +766,74 @@ grub_btrfs_read_logical (struct grub_btrfs_data *data, grub_disk_addr_t addr,
>>  	      stripe_offset = low + chunk_stripe_length
>>  		* high;
>>  	      csize = chunk_stripe_length - low;
>> +	      break;
>> +	    }
>> +	  case GRUB_BTRFS_CHUNK_TYPE_RAID5:
>> +	  case GRUB_BTRFS_CHUNK_TYPE_RAID6:
>> +	    {
>> +	      grub_uint64_t nparities, stripe_nr, high, low;
>> +
>> +	      redundancy = 1;	/* no redundancy for now */
>> +
>> +	      if (grub_le_to_cpu64 (chunk->type) & GRUB_BTRFS_CHUNK_TYPE_RAID5)
>> +		{
>> +		  grub_dprintf ("btrfs", "RAID5\n");
>> +		  nparities = 1;
>> +		}
>> +	      else
>> +		{
>> +		  grub_dprintf ("btrfs", "RAID6\n");
>> +		  nparities = 2;
>> +		}
>> +
>> +	      /*
>> +	       * Below is an example of a RAID 6 layout and the meaning of the
>> +	       * variables. The same applies to RAID 5. The only differences is
>> +	       * that there is only one parity disk instead of two.
>> +	       *
>> +	       * A RAID 6 layout consists of several stripes spread
>> +	       * on the disks, following a layout like the one below
>> +	       *
>> +	       *   Disk0  Disk1  Disk2  Ddisk3
> 
> s/Ddisk3/Disk3/
> 
>> +	       *
>> +	       *    A1     B1     P1     Q1
>> +	       *    Q2     A2     B2     P2
>> +	       *    P3     Q3     A3     B3
>> +	       *  [...]
>> +	       *
>> +	       *  Note that the placement of the parities depends on row index.
>> +	       *  In the code below:
>> +	       *  - stripe_nr is the stripe number not considering the parities
>> +	       *    (A1 = 0, B1 = 1, A2 = 2, B2 = 3, ...),
>> +	       *  - high is the row number (0 for A1...Q1, 1 for Q2...P2, ...),
>> +	       *  - stripen is the column number (or disk number),
>> +	       *  - off is the logical address to read (from the beginning of
>> +	       *    the chunk space),
> 
> What do you mean by chunk?

Is a specific btrfs data structure (see https://btrfs.wiki.kernel.org/index.php/Manpage/mkfs.btrfs#BLOCK_GROUPS.2C_CHUNKS.2C_RAID). Most of the BTRFS internal structures (e.g. where the file data is stored) are in terms of *logical address*. At this level there is no concept of redundancy or raid nor disks. You can see the logical address as a flat space, large as the sum of the disks sizes (minus the space used by the parities or mirroring).

The "logical space" is divided in "slices" called chunk. A chunk typically has a size in terms of hundred of megabytes (IIRC up to 1GB)

A chunk (or block group) is a mapping. A chunk maps a "logical address" to a "physical address" on the disks.

You can think a chunk as a structure like:

	u64 profile
	u64 logical_start, logical_end
	u64 disk1_id, disk1_start, disk1_end
	u64 disk2_id, disk2_start, disk2_end
	u64 disk3_id, disk3_start, disk3_end
	[...]

In order to translate a "logical_address" to the pair of "disk" and "physical address:
- find the chunk which maps the logical address (looking to the "logical_start", "logical_end")
- then on the basis of "profile" value you can have the following cases:
	profile == SINGLE:
		is the simplest one: the logical address is mapped to the range
		(disk1_start...disk1_end) of the disk "disk_id1"

	profile == RAID1
		like the previous one; however if the disk1 is not available, 
		you can find the data in the range 
		(disk2_start...disk2_end) of the disk "disk_id2". NOTE: the two ranges
		must have the same sizes, but may have different starts

	profile == RAID0
		the data is without redundancy and it is spread on different disk each
		"chunk_stripe_length" bytes; so:
			off = logical_address - logical_start // this is what I call
							      // address from the
							      // beginning of the chunk

			
							     
			
		

	and so on....



> 
>> +	       *  - chunk_stripe_length is the size of a stripe (typically 64k),
>> +	       *  - nstripes is the number of disks,
>> +	       *  - low is the offset of the data inside a stripe,
>> +	       *  - stripe_offset is the disk offset from the beginning
>> +	       *    of the disk chunk mapping start,
> 
> Err... I am confused here...

As described above, the chunk data structure translate from "logical address" to the pair (disk, physical address on the disk). Each chunk translate a small portion of the space (typically 1GB); so it is more correct to say that a chunk maps a range in the logical address to a range in the disks. 'off' is the offset from the beginning of the range in the logical address. 'stripe_offset' is the offset from the beginning of the range in the physical space



> 
>> +	       *  - csize is the "potential" data to read. It will be reduced to
>> +	       *    size if the latter is smaller.
>> +	       */
>> +	      stripe_nr = grub_divmod64 (off, chunk_stripe_length, &low);
> 
> Here it seems to me that off is from the beginning of RAID volume space.
> So, it does not agree with the comment above...

RAID volume space == chunk logical address space

> 
>> +	      /*
>> +	       * stripen is evaluated without considering
>> +	       * the parities (0 for A1, A2, A3... 1 for B1, B2...).
>> +	       */
>> +	      high = grub_divmod64 (stripe_nr, nstripes - nparities, &stripen);
> 
> This looks good.
> 
>> +	      /*
>> +	       * stripen now considers also the parities (0 for A1, 1 for A2,
>> +	       * 2 for A3....). The math is performed modulo number of disks.
>> +	       */
>> +	      grub_divmod64 (high + stripen, nstripes, &stripen);
> 
> Ditto.
> 
>> +	      stripe_offset = low + chunk_stripe_length * high;
> 
> And here I am confused. Why "* high"? If chunk_stripe_length is stripe
> length then stripe_offset leads to nowhere. Am I missing something?
> I have a feeling that comments does not agree with the code somehow.

The "physical range" is divided in small range called(?) "stripe". These stripe are spread on the disks. 'high' is the offset from the beginning of the physical chunk range in unit of "chunk_stripe_length"

I found quite confuse the btrfs terminology around the "stripe" word. And I have to point out that these variables were named before my patch.

I am afraid that my comment doesn't help to "de-obfuscate" my code. I know that my English is far to be perfect. However this code is not so different that the previous one (i.e. handling the btrfs raid1/dup/raid0....). So I am thinking to remove all the comments because it seems to me that the my comments increase the confusion instead of reducing it.

Please, let me know your opinion


> 
> Daniel
> 


-- 
gpg @keyserver.linux.it: Goffredo Baroncelli <kreijackATinwind.it>
Key fingerprint BBF5 1610 0B64 DAC6 5F7D  17B2 0EDA 9B37 8B82 E0B5


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 1/9] btrfs: Add support for reading a filesystem with a RAID 5 or RAID 6 profile.
  2018-06-19 17:39 ` [PATCH 1/9] btrfs: Add support for reading a filesystem with a RAID 5 or RAID 6 profile Goffredo Baroncelli
  2018-07-08 15:51   ` Goffredo Baroncelli
@ 2018-07-12 13:46   ` Daniel Kiper
  2018-07-18  6:24     ` Goffredo Baroncelli
  1 sibling, 1 reply; 48+ messages in thread
From: Daniel Kiper @ 2018-07-12 13:46 UTC (permalink / raw)
  To: The development of GNU GRUB; +Cc: Goffredo Baroncelli

On Tue, Jun 19, 2018 at 07:39:48PM +0200, Goffredo Baroncelli wrote:
> Signed-off-by: Goffredo Baroncelli <kreijack@inwind.it>
> ---
>  grub-core/fs/btrfs.c | 70 ++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 70 insertions(+)
>
> diff --git a/grub-core/fs/btrfs.c b/grub-core/fs/btrfs.c
> index be195448d..fbabaebbe 100644
> --- a/grub-core/fs/btrfs.c
> +++ b/grub-core/fs/btrfs.c
> @@ -119,6 +119,8 @@ struct grub_btrfs_chunk_item
>  #define GRUB_BTRFS_CHUNK_TYPE_RAID1         0x10
>  #define GRUB_BTRFS_CHUNK_TYPE_DUPLICATED    0x20
>  #define GRUB_BTRFS_CHUNK_TYPE_RAID10        0x40
> +#define GRUB_BTRFS_CHUNK_TYPE_RAID5         0x80
> +#define GRUB_BTRFS_CHUNK_TYPE_RAID6         0x100
>    grub_uint8_t dummy2[0xc];
>    grub_uint16_t nstripes;
>    grub_uint16_t nsubstripes;
> @@ -764,6 +766,74 @@ grub_btrfs_read_logical (struct grub_btrfs_data *data, grub_disk_addr_t addr,
>  	      stripe_offset = low + chunk_stripe_length
>  		* high;
>  	      csize = chunk_stripe_length - low;
> +	      break;
> +	    }
> +	  case GRUB_BTRFS_CHUNK_TYPE_RAID5:
> +	  case GRUB_BTRFS_CHUNK_TYPE_RAID6:
> +	    {
> +	      grub_uint64_t nparities, stripe_nr, high, low;
> +
> +	      redundancy = 1;	/* no redundancy for now */
> +
> +	      if (grub_le_to_cpu64 (chunk->type) & GRUB_BTRFS_CHUNK_TYPE_RAID5)
> +		{
> +		  grub_dprintf ("btrfs", "RAID5\n");
> +		  nparities = 1;
> +		}
> +	      else
> +		{
> +		  grub_dprintf ("btrfs", "RAID6\n");
> +		  nparities = 2;
> +		}
> +
> +	      /*
> +	       * Below is an example of a RAID 6 layout and the meaning of the
> +	       * variables. The same applies to RAID 5. The only differences is
> +	       * that there is only one parity disk instead of two.
> +	       *
> +	       * A RAID 6 layout consists of several stripes spread
> +	       * on the disks, following a layout like the one below
> +	       *
> +	       *   Disk0  Disk1  Disk2  Ddisk3

s/Ddisk3/Disk3/

> +	       *
> +	       *    A1     B1     P1     Q1
> +	       *    Q2     A2     B2     P2
> +	       *    P3     Q3     A3     B3
> +	       *  [...]
> +	       *
> +	       *  Note that the placement of the parities depends on row index.
> +	       *  In the code below:
> +	       *  - stripe_nr is the stripe number not considering the parities
> +	       *    (A1 = 0, B1 = 1, A2 = 2, B2 = 3, ...),
> +	       *  - high is the row number (0 for A1...Q1, 1 for Q2...P2, ...),
> +	       *  - stripen is the column number (or disk number),
> +	       *  - off is the logical address to read (from the beginning of
> +	       *    the chunk space),

What do you mean by chunk?

> +	       *  - chunk_stripe_length is the size of a stripe (typically 64k),
> +	       *  - nstripes is the number of disks,
> +	       *  - low is the offset of the data inside a stripe,
> +	       *  - stripe_offset is the disk offset from the beginning
> +	       *    of the disk chunk mapping start,

Err... I am confused here...

> +	       *  - csize is the "potential" data to read. It will be reduced to
> +	       *    size if the latter is smaller.
> +	       */
> +	      stripe_nr = grub_divmod64 (off, chunk_stripe_length, &low);

Here it seems to me that off is from the beginning of RAID volume space.
So, it does not agree with the comment above...

> +	      /*
> +	       * stripen is evaluated without considering
> +	       * the parities (0 for A1, A2, A3... 1 for B1, B2...).
> +	       */
> +	      high = grub_divmod64 (stripe_nr, nstripes - nparities, &stripen);

This looks good.

> +	      /*
> +	       * stripen now considers also the parities (0 for A1, 1 for A2,
> +	       * 2 for A3....). The math is performed modulo number of disks.
> +	       */
> +	      grub_divmod64 (high + stripen, nstripes, &stripen);

Ditto.

> +	      stripe_offset = low + chunk_stripe_length * high;

And here I am confused. Why "* high"? If chunk_stripe_length is stripe
length then stripe_offset leads to nowhere. Am I missing something?
I have a feeling that comments does not agree with the code somehow.

Daniel


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 1/9] btrfs: Add support for reading a filesystem with a RAID 5 or RAID 6 profile.
  2018-07-09 10:20     ` Daniel Kiper
@ 2018-07-09 16:29       ` Goffredo Baroncelli
  0 siblings, 0 replies; 48+ messages in thread
From: Goffredo Baroncelli @ 2018-07-09 16:29 UTC (permalink / raw)
  To: Daniel Kiper; +Cc: The development of GNU GRUB

On 07/09/2018 12:20 PM, Daniel Kiper wrote:
> On Sun, Jul 08, 2018 at 05:51:37PM +0200, Goffredo Baroncelli wrote:
>> A gentle ping.
> 
> Sorry for delay. I remember about this patchset but I am swamped with other stuff.
> I will take a look at it this week.
> 
> Daniel
> 
Thanks for the feedback: I was worried that my emails were lost somewhere in/with the spams :-)

BR
G.Baroncelli

-- 
gpg @keyserver.linux.it: Goffredo Baroncelli <kreijackATinwind.it>
Key fingerprint BBF5 1610 0B64 DAC6 5F7D  17B2 0EDA 9B37 8B82 E0B5


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 1/9] btrfs: Add support for reading a filesystem with a RAID 5 or RAID 6 profile.
  2018-07-08 15:51   ` Goffredo Baroncelli
@ 2018-07-09 10:20     ` Daniel Kiper
  2018-07-09 16:29       ` Goffredo Baroncelli
  0 siblings, 1 reply; 48+ messages in thread
From: Daniel Kiper @ 2018-07-09 10:20 UTC (permalink / raw)
  To: Goffredo Baroncelli; +Cc: The development of GNU GRUB, Daniel Kiper

On Sun, Jul 08, 2018 at 05:51:37PM +0200, Goffredo Baroncelli wrote:
> A gentle ping.

Sorry for delay. I remember about this patchset but I am swamped with other stuff.
I will take a look at it this week.

Daniel


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 1/9] btrfs: Add support for reading a filesystem with a RAID 5 or RAID 6 profile.
  2018-06-19 17:39 ` [PATCH 1/9] btrfs: Add support for reading a filesystem with a RAID 5 or RAID 6 profile Goffredo Baroncelli
@ 2018-07-08 15:51   ` Goffredo Baroncelli
  2018-07-09 10:20     ` Daniel Kiper
  2018-07-12 13:46   ` Daniel Kiper
  1 sibling, 1 reply; 48+ messages in thread
From: Goffredo Baroncelli @ 2018-07-08 15:51 UTC (permalink / raw)
  To: The development of GNU GRUB, Daniel Kiper

A gentle ping.

BR
G.Baroncelli


On 06/19/2018 07:39 PM, Goffredo Baroncelli wrote:
> Signed-off-by: Goffredo Baroncelli <kreijack@inwind.it>
> ---
>  grub-core/fs/btrfs.c | 70 ++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 70 insertions(+)
> 
> diff --git a/grub-core/fs/btrfs.c b/grub-core/fs/btrfs.c
> index be195448d..fbabaebbe 100644
> --- a/grub-core/fs/btrfs.c
> +++ b/grub-core/fs/btrfs.c
> @@ -119,6 +119,8 @@ struct grub_btrfs_chunk_item
>  #define GRUB_BTRFS_CHUNK_TYPE_RAID1         0x10
>  #define GRUB_BTRFS_CHUNK_TYPE_DUPLICATED    0x20
>  #define GRUB_BTRFS_CHUNK_TYPE_RAID10        0x40
> +#define GRUB_BTRFS_CHUNK_TYPE_RAID5         0x80
> +#define GRUB_BTRFS_CHUNK_TYPE_RAID6         0x100
>    grub_uint8_t dummy2[0xc];
>    grub_uint16_t nstripes;
>    grub_uint16_t nsubstripes;
> @@ -764,6 +766,74 @@ grub_btrfs_read_logical (struct grub_btrfs_data *data, grub_disk_addr_t addr,
>  	      stripe_offset = low + chunk_stripe_length
>  		* high;
>  	      csize = chunk_stripe_length - low;
> +	      break;
> +	    }
> +	  case GRUB_BTRFS_CHUNK_TYPE_RAID5:
> +	  case GRUB_BTRFS_CHUNK_TYPE_RAID6:
> +	    {
> +	      grub_uint64_t nparities, stripe_nr, high, low;
> +
> +	      redundancy = 1;	/* no redundancy for now */
> +
> +	      if (grub_le_to_cpu64 (chunk->type) & GRUB_BTRFS_CHUNK_TYPE_RAID5)
> +		{
> +		  grub_dprintf ("btrfs", "RAID5\n");
> +		  nparities = 1;
> +		}
> +	      else
> +		{
> +		  grub_dprintf ("btrfs", "RAID6\n");
> +		  nparities = 2;
> +		}
> +
> +	      /*
> +	       * Below is an example of a RAID 6 layout and the meaning of the
> +	       * variables. The same applies to RAID 5. The only differences is
> +	       * that there is only one parity disk instead of two.
> +	       *
> +	       * A RAID 6 layout consists of several stripes spread
> +	       * on the disks, following a layout like the one below
> +	       *
> +	       *   Disk0  Disk1  Disk2  Ddisk3
> +	       *
> +	       *    A1     B1     P1     Q1
> +	       *    Q2     A2     B2     P2
> +	       *    P3     Q3     A3     B3
> +	       *  [...]
> +	       *
> +	       *  Note that the placement of the parities depends on row index.
> +	       *  In the code below:
> +	       *  - stripe_nr is the stripe number not considering the parities
> +	       *    (A1 = 0, B1 = 1, A2 = 2, B2 = 3, ...),
> +	       *  - high is the row number (0 for A1...Q1, 1 for Q2...P2, ...),
> +	       *  - stripen is the column number (or disk number),
> +	       *  - off is the logical address to read (from the beginning of
> +	       *    the chunk space),
> +	       *  - chunk_stripe_length is the size of a stripe (typically 64k),
> +	       *  - nstripes is the number of disks,
> +	       *  - low is the offset of the data inside a stripe,
> +	       *  - stripe_offset is the disk offset from the beginning
> +	       *    of the disk chunk mapping start,
> +	       *  - csize is the "potential" data to read. It will be reduced to
> +	       *    size if the latter is smaller.
> +	       */
> +	      stripe_nr = grub_divmod64 (off, chunk_stripe_length, &low);
> +
> +	      /*
> +	       * stripen is evaluated without considering
> +	       * the parities (0 for A1, A2, A3... 1 for B1, B2...).
> +	       */
> +	      high = grub_divmod64 (stripe_nr, nstripes - nparities, &stripen);
> +
> +	      /*
> +	       * stripen now considers also the parities (0 for A1, 1 for A2,
> +	       * 2 for A3....). The math is performed modulo number of disks.
> +	       */
> +	      grub_divmod64 (high + stripen, nstripes, &stripen);
> +
> +	      stripe_offset = low + chunk_stripe_length * high;
> +	      csize = chunk_stripe_length - low;
> +
>  	      break;
>  	    }
>  	  default:
> 


-- 
gpg @keyserver.linux.it: Goffredo Baroncelli <kreijackATinwind.it>
Key fingerprint BBF5 1610 0B64 DAC6 5F7D  17B2 0EDA 9B37 8B82 E0B5


^ permalink raw reply	[flat|nested] 48+ messages in thread

* [PATCH 1/9] btrfs: Add support for reading a filesystem with a RAID 5 or RAID 6 profile.
  2018-06-19 17:39 [PATCH V6] Add support for BTRFS raid5/6 to GRUB Goffredo Baroncelli
@ 2018-06-19 17:39 ` Goffredo Baroncelli
  2018-07-08 15:51   ` Goffredo Baroncelli
  2018-07-12 13:46   ` Daniel Kiper
  0 siblings, 2 replies; 48+ messages in thread
From: Goffredo Baroncelli @ 2018-06-19 17:39 UTC (permalink / raw)
  To: grub-devel; +Cc: Goffredo Baroncelli

Signed-off-by: Goffredo Baroncelli <kreijack@inwind.it>
---
 grub-core/fs/btrfs.c | 70 ++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 70 insertions(+)

diff --git a/grub-core/fs/btrfs.c b/grub-core/fs/btrfs.c
index be195448d..fbabaebbe 100644
--- a/grub-core/fs/btrfs.c
+++ b/grub-core/fs/btrfs.c
@@ -119,6 +119,8 @@ struct grub_btrfs_chunk_item
 #define GRUB_BTRFS_CHUNK_TYPE_RAID1         0x10
 #define GRUB_BTRFS_CHUNK_TYPE_DUPLICATED    0x20
 #define GRUB_BTRFS_CHUNK_TYPE_RAID10        0x40
+#define GRUB_BTRFS_CHUNK_TYPE_RAID5         0x80
+#define GRUB_BTRFS_CHUNK_TYPE_RAID6         0x100
   grub_uint8_t dummy2[0xc];
   grub_uint16_t nstripes;
   grub_uint16_t nsubstripes;
@@ -764,6 +766,74 @@ grub_btrfs_read_logical (struct grub_btrfs_data *data, grub_disk_addr_t addr,
 	      stripe_offset = low + chunk_stripe_length
 		* high;
 	      csize = chunk_stripe_length - low;
+	      break;
+	    }
+	  case GRUB_BTRFS_CHUNK_TYPE_RAID5:
+	  case GRUB_BTRFS_CHUNK_TYPE_RAID6:
+	    {
+	      grub_uint64_t nparities, stripe_nr, high, low;
+
+	      redundancy = 1;	/* no redundancy for now */
+
+	      if (grub_le_to_cpu64 (chunk->type) & GRUB_BTRFS_CHUNK_TYPE_RAID5)
+		{
+		  grub_dprintf ("btrfs", "RAID5\n");
+		  nparities = 1;
+		}
+	      else
+		{
+		  grub_dprintf ("btrfs", "RAID6\n");
+		  nparities = 2;
+		}
+
+	      /*
+	       * Below is an example of a RAID 6 layout and the meaning of the
+	       * variables. The same applies to RAID 5. The only differences is
+	       * that there is only one parity disk instead of two.
+	       *
+	       * A RAID 6 layout consists of several stripes spread
+	       * on the disks, following a layout like the one below
+	       *
+	       *   Disk0  Disk1  Disk2  Ddisk3
+	       *
+	       *    A1     B1     P1     Q1
+	       *    Q2     A2     B2     P2
+	       *    P3     Q3     A3     B3
+	       *  [...]
+	       *
+	       *  Note that the placement of the parities depends on row index.
+	       *  In the code below:
+	       *  - stripe_nr is the stripe number not considering the parities
+	       *    (A1 = 0, B1 = 1, A2 = 2, B2 = 3, ...),
+	       *  - high is the row number (0 for A1...Q1, 1 for Q2...P2, ...),
+	       *  - stripen is the column number (or disk number),
+	       *  - off is the logical address to read (from the beginning of
+	       *    the chunk space),
+	       *  - chunk_stripe_length is the size of a stripe (typically 64k),
+	       *  - nstripes is the number of disks,
+	       *  - low is the offset of the data inside a stripe,
+	       *  - stripe_offset is the disk offset from the beginning
+	       *    of the disk chunk mapping start,
+	       *  - csize is the "potential" data to read. It will be reduced to
+	       *    size if the latter is smaller.
+	       */
+	      stripe_nr = grub_divmod64 (off, chunk_stripe_length, &low);
+
+	      /*
+	       * stripen is evaluated without considering
+	       * the parities (0 for A1, A2, A3... 1 for B1, B2...).
+	       */
+	      high = grub_divmod64 (stripe_nr, nstripes - nparities, &stripen);
+
+	      /*
+	       * stripen now considers also the parities (0 for A1, 1 for A2,
+	       * 2 for A3....). The math is performed modulo number of disks.
+	       */
+	      grub_divmod64 (high + stripen, nstripes, &stripen);
+
+	      stripe_offset = low + chunk_stripe_length * high;
+	      csize = chunk_stripe_length - low;
+
 	      break;
 	    }
 	  default:
-- 
2.18.0.rc2



^ permalink raw reply related	[flat|nested] 48+ messages in thread

* Re: [PATCH 1/9] btrfs: Add support for reading a filesystem with a RAID 5 or RAID 6 profile.
  2018-06-14 18:55     ` Goffredo Baroncelli
@ 2018-06-19 17:30       ` Goffredo Baroncelli
  0 siblings, 0 replies; 48+ messages in thread
From: Goffredo Baroncelli @ 2018-06-19 17:30 UTC (permalink / raw)
  To: grub-devel

On 06/14/2018 08:55 PM, Goffredo Baroncelli wrote:
>>> +	       *  - stripe_offset is the offset from the beginning of the chunk
>>> +	       *    disks physical address,
>> I am not sure that I understand. Could clarify this?
> - stripe_offset is the offset (in bytes) from the beginning of the chunk portion 
>   stored on disk.
> 
> You can think "stripe_offset" as the "row" in the drawing, but measured in bytes.
> 

After some thoughts, I will update this description as:

-              *  - stripe_offset is the offset from the beginning of the chunk
-              *    disks physical address,
+              *  - stripe_offset is the disk offset from the beginning
+              *    of the disk chunk mapping start,

-- 
gpg @keyserver.linux.it: Goffredo Baroncelli <kreijackATinwind.it>
Key fingerprint BBF5 1610 0B64 DAC6 5F7D  17B2 0EDA 9B37 8B82 E0B5


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 1/9] btrfs: Add support for reading a filesystem with a RAID 5 or RAID 6 profile.
  2018-06-14 11:17   ` Daniel Kiper
@ 2018-06-14 18:55     ` Goffredo Baroncelli
  2018-06-19 17:30       ` Goffredo Baroncelli
  0 siblings, 1 reply; 48+ messages in thread
From: Goffredo Baroncelli @ 2018-06-14 18:55 UTC (permalink / raw)
  To: grub-devel

On 06/14/2018 01:17 PM, Daniel Kiper wrote:
> On Sun, Jun 03, 2018 at 08:53:40PM +0200, Goffredo Baroncelli wrote:
>> Signed-off-by: Goffredo Baroncelli <kreijack@inwind.it>
>> ---
>>  grub-core/fs/btrfs.c | 70 ++++++++++++++++++++++++++++++++++++++++++++
>>  1 file changed, 70 insertions(+)
>>
>> diff --git a/grub-core/fs/btrfs.c b/grub-core/fs/btrfs.c
>> index be195448d..4d418859b 100644
>> --- a/grub-core/fs/btrfs.c
>> +++ b/grub-core/fs/btrfs.c
>> @@ -119,6 +119,8 @@ struct grub_btrfs_chunk_item
>>  #define GRUB_BTRFS_CHUNK_TYPE_RAID1         0x10
>>  #define GRUB_BTRFS_CHUNK_TYPE_DUPLICATED    0x20
>>  #define GRUB_BTRFS_CHUNK_TYPE_RAID10        0x40
>> +#define GRUB_BTRFS_CHUNK_TYPE_RAID5         0x80
>> +#define GRUB_BTRFS_CHUNK_TYPE_RAID6         0x100
>>    grub_uint8_t dummy2[0xc];
>>    grub_uint16_t nstripes;
>>    grub_uint16_t nsubstripes;
>> @@ -764,6 +766,74 @@ grub_btrfs_read_logical (struct grub_btrfs_data *data, grub_disk_addr_t addr,
>>  	      stripe_offset = low + chunk_stripe_length
>>  		* high;
>>  	      csize = chunk_stripe_length - low;
>> +	      break;
>> +	    }
>> +	  case GRUB_BTRFS_CHUNK_TYPE_RAID5:
>> +	  case GRUB_BTRFS_CHUNK_TYPE_RAID6:
>> +	    {
>> +	      grub_uint64_t nparities, stripe_nr, high, low;
>> +
>> +	      redundancy = 1;	/* no redundancy for now */
>> +
>> +	      if (grub_le_to_cpu64 (chunk->type) & GRUB_BTRFS_CHUNK_TYPE_RAID5)
>> +		{
>> +		  grub_dprintf ("btrfs", "RAID5\n");
>> +		  nparities = 1;
>> +		}
>> +	      else
>> +		{
>> +		  grub_dprintf ("btrfs", "RAID6\n");
>> +		  nparities = 2;
>> +		}
>> +
>> +	      /*
>> +	       * Below is an example of a RAID 6 layout and the meaning of the
>> +	       * variables. The same applies to RAID 5. The only differences is
>> +	       * that there is only one parity disk instead of two.
>> +	       *
>> +	       * A RAID 6 layout consists of several stripes spread
>> +	       * on the disks, following a layout like the one below
>> +	       *
>> +	       *   Disk1  Disk2  Disk3  Ddisk4
> 
> Numbering seems confusing to me. I think that it should be
>                    Disk0  Disk1  Disk2  Disk3
> 
>> +	       *
>> +	       *    A1     B1     P1     Q1
>> +	       *    Q2     A2     B2     P2
>> +	       *    P3     Q3     A3     B3
>> +	       *  [...]
>> +	       *
>> +	       *  Note that the placement of the parities depends on row index.
>> +	       *  In the code below:
>> +	       *  - stripe_nr is the stripe number not considering the parities
>> +	       *    (A1=0, B1=1, A2 = 2, B2 = 3, ...),
> 
> Please be consistent. A1 = 0, B1 = 1, A2 = 2, B2 = 3, ...
> 
>> +	       *  - high is the row number (0 for A1...Q1, 1 for Q2..P2, ...),
> 
> Ditto. Please always use "..." not "..".
> 
>> +	       *  - stripen is the column number (or disk number),
> 
> AIUI starting from 0. Right? If yes then I think that
> drawing above requires disks/columns renumbering.
right
> 
>> +	       *  - off is the logical address to read (from the beginning of
>> +	       *    the chunk space),
> 
> s/chunk space/chunk/?
> 
>> +	       *  - chunk_stripe_length is the size of a stripe (typically 64k),
>> +	       *  - nstripes is the number of disks,
>> +	       *  - low is the offset of the data inside a stripe,
>> +	       *  - stripe_offset is the offset from the beginning of the chunk
>> +	       *    disks physical address,
> 
> I am not sure that I understand. Could clarify this?
- stripe_offset is the offset (in bytes) from the beginning of the chunk portion 
  stored on disk.

You can think "stripe_offset" as the "row" in the drawing, but measured in bytes.

> 
>> +	       *  - csize is the "potential" data to read. It will be reduced to
>> +	       *    size if the latter is smaller.
>> +	       */
>> +	      stripe_nr = grub_divmod64 (off, chunk_stripe_length, &low);
> 
> OK.
> 
>> +	      /*
>> +	       * stripen is evaluated without considering
>> +	       * the parities (0 for A1, A2, A3... 1 for B1, B2...).
>> +	       */
>> +	      high = grub_divmod64 (stripe_nr, nstripes - nparities, &stripen);
> 
> OK.
> 
>> +	      /*
>> +	       * stripen now considers also the parities (0 for A1, 1 for A2,
>> +	       * 2 for A3....). The math is performed modulo number of disks.
>> +	       */
>> +	      grub_divmod64 (high + stripen, nstripes, &stripen);
> 
> OK.
> 
>> +	      stripe_offset = low + chunk_stripe_length * high;
> 
> Hmmm... I am confused. What does it mean?
> 
> Daniel
> 
> _______________________________________________
> Grub-devel mailing list
> Grub-devel@gnu.org
> https://lists.gnu.org/mailman/listinfo/grub-devel
> 


-- 
gpg @keyserver.linux.it: Goffredo Baroncelli <kreijackATinwind.it>
Key fingerprint BBF5 1610 0B64 DAC6 5F7D  17B2 0EDA 9B37 8B82 E0B5


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 1/9] btrfs: Add support for reading a filesystem with a RAID 5 or RAID 6 profile.
  2018-06-03 18:53 ` [PATCH 1/9] btrfs: Add support for reading a filesystem with a RAID 5 or RAID 6 profile Goffredo Baroncelli
@ 2018-06-14 11:17   ` Daniel Kiper
  2018-06-14 18:55     ` Goffredo Baroncelli
  0 siblings, 1 reply; 48+ messages in thread
From: Daniel Kiper @ 2018-06-14 11:17 UTC (permalink / raw)
  To: The development of GNU GRUB; +Cc: Goffredo Baroncelli

On Sun, Jun 03, 2018 at 08:53:40PM +0200, Goffredo Baroncelli wrote:
> Signed-off-by: Goffredo Baroncelli <kreijack@inwind.it>
> ---
>  grub-core/fs/btrfs.c | 70 ++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 70 insertions(+)
>
> diff --git a/grub-core/fs/btrfs.c b/grub-core/fs/btrfs.c
> index be195448d..4d418859b 100644
> --- a/grub-core/fs/btrfs.c
> +++ b/grub-core/fs/btrfs.c
> @@ -119,6 +119,8 @@ struct grub_btrfs_chunk_item
>  #define GRUB_BTRFS_CHUNK_TYPE_RAID1         0x10
>  #define GRUB_BTRFS_CHUNK_TYPE_DUPLICATED    0x20
>  #define GRUB_BTRFS_CHUNK_TYPE_RAID10        0x40
> +#define GRUB_BTRFS_CHUNK_TYPE_RAID5         0x80
> +#define GRUB_BTRFS_CHUNK_TYPE_RAID6         0x100
>    grub_uint8_t dummy2[0xc];
>    grub_uint16_t nstripes;
>    grub_uint16_t nsubstripes;
> @@ -764,6 +766,74 @@ grub_btrfs_read_logical (struct grub_btrfs_data *data, grub_disk_addr_t addr,
>  	      stripe_offset = low + chunk_stripe_length
>  		* high;
>  	      csize = chunk_stripe_length - low;
> +	      break;
> +	    }
> +	  case GRUB_BTRFS_CHUNK_TYPE_RAID5:
> +	  case GRUB_BTRFS_CHUNK_TYPE_RAID6:
> +	    {
> +	      grub_uint64_t nparities, stripe_nr, high, low;
> +
> +	      redundancy = 1;	/* no redundancy for now */
> +
> +	      if (grub_le_to_cpu64 (chunk->type) & GRUB_BTRFS_CHUNK_TYPE_RAID5)
> +		{
> +		  grub_dprintf ("btrfs", "RAID5\n");
> +		  nparities = 1;
> +		}
> +	      else
> +		{
> +		  grub_dprintf ("btrfs", "RAID6\n");
> +		  nparities = 2;
> +		}
> +
> +	      /*
> +	       * Below is an example of a RAID 6 layout and the meaning of the
> +	       * variables. The same applies to RAID 5. The only differences is
> +	       * that there is only one parity disk instead of two.
> +	       *
> +	       * A RAID 6 layout consists of several stripes spread
> +	       * on the disks, following a layout like the one below
> +	       *
> +	       *   Disk1  Disk2  Disk3  Ddisk4

Numbering seems confusing to me. I think that it should be
                   Disk0  Disk1  Disk2  Disk3

> +	       *
> +	       *    A1     B1     P1     Q1
> +	       *    Q2     A2     B2     P2
> +	       *    P3     Q3     A3     B3
> +	       *  [...]
> +	       *
> +	       *  Note that the placement of the parities depends on row index.
> +	       *  In the code below:
> +	       *  - stripe_nr is the stripe number not considering the parities
> +	       *    (A1=0, B1=1, A2 = 2, B2 = 3, ...),

Please be consistent. A1 = 0, B1 = 1, A2 = 2, B2 = 3, ...

> +	       *  - high is the row number (0 for A1...Q1, 1 for Q2..P2, ...),

Ditto. Please always use "..." not "..".

> +	       *  - stripen is the column number (or disk number),

AIUI starting from 0. Right? If yes then I think that
drawing above requires disks/columns renumbering.

> +	       *  - off is the logical address to read (from the beginning of
> +	       *    the chunk space),

s/chunk space/chunk/?

> +	       *  - chunk_stripe_length is the size of a stripe (typically 64k),
> +	       *  - nstripes is the number of disks,
> +	       *  - low is the offset of the data inside a stripe,
> +	       *  - stripe_offset is the offset from the beginning of the chunk
> +	       *    disks physical address,

I am not sure that I understand. Could clarify this?

> +	       *  - csize is the "potential" data to read. It will be reduced to
> +	       *    size if the latter is smaller.
> +	       */
> +	      stripe_nr = grub_divmod64 (off, chunk_stripe_length, &low);

OK.

> +	      /*
> +	       * stripen is evaluated without considering
> +	       * the parities (0 for A1, A2, A3... 1 for B1, B2...).
> +	       */
> +	      high = grub_divmod64 (stripe_nr, nstripes - nparities, &stripen);

OK.

> +	      /*
> +	       * stripen now considers also the parities (0 for A1, 1 for A2,
> +	       * 2 for A3....). The math is performed modulo number of disks.
> +	       */
> +	      grub_divmod64 (high + stripen, nstripes, &stripen);

OK.

> +	      stripe_offset = low + chunk_stripe_length * high;

Hmmm... I am confused. What does it mean?

Daniel


^ permalink raw reply	[flat|nested] 48+ messages in thread

* [PATCH 1/9] btrfs: Add support for reading a filesystem with a RAID 5 or RAID 6 profile.
  2018-06-03 18:53 [PATCH V5] Add support for BTRFS raid5/6 to GRUB Goffredo Baroncelli
@ 2018-06-03 18:53 ` Goffredo Baroncelli
  2018-06-14 11:17   ` Daniel Kiper
  0 siblings, 1 reply; 48+ messages in thread
From: Goffredo Baroncelli @ 2018-06-03 18:53 UTC (permalink / raw)
  To: grub-devel; +Cc: Goffredo Baroncelli

Signed-off-by: Goffredo Baroncelli <kreijack@inwind.it>
---
 grub-core/fs/btrfs.c | 70 ++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 70 insertions(+)

diff --git a/grub-core/fs/btrfs.c b/grub-core/fs/btrfs.c
index be195448d..4d418859b 100644
--- a/grub-core/fs/btrfs.c
+++ b/grub-core/fs/btrfs.c
@@ -119,6 +119,8 @@ struct grub_btrfs_chunk_item
 #define GRUB_BTRFS_CHUNK_TYPE_RAID1         0x10
 #define GRUB_BTRFS_CHUNK_TYPE_DUPLICATED    0x20
 #define GRUB_BTRFS_CHUNK_TYPE_RAID10        0x40
+#define GRUB_BTRFS_CHUNK_TYPE_RAID5         0x80
+#define GRUB_BTRFS_CHUNK_TYPE_RAID6         0x100
   grub_uint8_t dummy2[0xc];
   grub_uint16_t nstripes;
   grub_uint16_t nsubstripes;
@@ -764,6 +766,74 @@ grub_btrfs_read_logical (struct grub_btrfs_data *data, grub_disk_addr_t addr,
 	      stripe_offset = low + chunk_stripe_length
 		* high;
 	      csize = chunk_stripe_length - low;
+	      break;
+	    }
+	  case GRUB_BTRFS_CHUNK_TYPE_RAID5:
+	  case GRUB_BTRFS_CHUNK_TYPE_RAID6:
+	    {
+	      grub_uint64_t nparities, stripe_nr, high, low;
+
+	      redundancy = 1;	/* no redundancy for now */
+
+	      if (grub_le_to_cpu64 (chunk->type) & GRUB_BTRFS_CHUNK_TYPE_RAID5)
+		{
+		  grub_dprintf ("btrfs", "RAID5\n");
+		  nparities = 1;
+		}
+	      else
+		{
+		  grub_dprintf ("btrfs", "RAID6\n");
+		  nparities = 2;
+		}
+
+	      /*
+	       * Below is an example of a RAID 6 layout and the meaning of the
+	       * variables. The same applies to RAID 5. The only differences is
+	       * that there is only one parity disk instead of two.
+	       *
+	       * A RAID 6 layout consists of several stripes spread
+	       * on the disks, following a layout like the one below
+	       *
+	       *   Disk1  Disk2  Disk3  Ddisk4
+	       *
+	       *    A1     B1     P1     Q1
+	       *    Q2     A2     B2     P2  
+	       *    P3     Q3     A3     B3
+	       *  [...]
+	       *
+	       *  Note that the placement of the parities depends on row index.
+	       *  In the code below:
+	       *  - stripe_nr is the stripe number not considering the parities
+	       *    (A1=0, B1=1, A2 = 2, B2 = 3, ...),
+	       *  - high is the row number (0 for A1...Q1, 1 for Q2..P2, ...),
+	       *  - stripen is the column number (or disk number),
+	       *  - off is the logical address to read (from the beginning of
+	       *    the chunk space),
+	       *  - chunk_stripe_length is the size of a stripe (typically 64k),
+	       *  - nstripes is the number of disks,
+	       *  - low is the offset of the data inside a stripe,
+	       *  - stripe_offset is the offset from the beginning of the chunk
+	       *    disks physical address,
+	       *  - csize is the "potential" data to read. It will be reduced to
+	       *    size if the latter is smaller.
+	       */
+	      stripe_nr = grub_divmod64 (off, chunk_stripe_length, &low);
+
+	      /*
+	       * stripen is evaluated without considering
+	       * the parities (0 for A1, A2, A3... 1 for B1, B2...).
+	       */
+	      high = grub_divmod64 (stripe_nr, nstripes - nparities, &stripen);
+
+	      /*
+	       * stripen now considers also the parities (0 for A1, 1 for A2,
+	       * 2 for A3....). The math is performed modulo number of disks.
+	       */
+	      grub_divmod64 (high + stripen, nstripes, &stripen);
+
+	      stripe_offset = low + chunk_stripe_length * high;
+	      csize = chunk_stripe_length - low;
+
 	      break;
 	    }
 	  default:
-- 
2.17.1



^ permalink raw reply related	[flat|nested] 48+ messages in thread

end of thread, other threads:[~2018-10-22 17:30 UTC | newest]

Thread overview: 48+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-05-16 18:48 [PATCH V4] Add support for BTRFS raid5/6 to GRUB Goffredo Baroncelli
2018-05-16 18:48 ` [PATCH 1/9] btrfs: add support for reading a filesystem with a RAID 5 or RAID 6 profile Goffredo Baroncelli
2018-05-30 10:07   ` Daniel Kiper
2018-06-01 18:50     ` Goffredo Baroncelli
2018-05-16 18:48 ` [PATCH 2/9] btrfs: add helper to check the btrfs header Goffredo Baroncelli
2018-05-30 10:12   ` Daniel Kiper
2018-05-16 18:48 ` [PATCH 3/9] btrfs: move the error logging from find_device() to its callee Goffredo Baroncelli
2018-05-30 10:25   ` Daniel Kiper
2018-05-16 18:48 ` [PATCH 4/9] btrfs: avoiding a rescan for a device which was already not founded Goffredo Baroncelli
2018-05-30 10:43   ` Daniel Kiper
2018-05-16 18:48 ` [PATCH 5/9] btrfs: move logging code in grub_btrfs_read_logical() Goffredo Baroncelli
2018-05-30 10:55   ` Daniel Kiper
2018-05-16 18:48 ` [PATCH 6/9] btrfs: refactor the code that read from disk Goffredo Baroncelli
2018-05-30 11:07   ` Daniel Kiper
2018-06-01 18:50     ` Goffredo Baroncelli
2018-05-16 18:48 ` [PATCH 7/9] btrfs: add support for recovery for a RAID 5 btrfs profiles Goffredo Baroncelli
2018-05-30 11:30   ` Daniel Kiper
2018-06-01 18:50     ` Goffredo Baroncelli
2018-05-16 18:48 ` [PATCH 8/9] btrfs: make more generic the code for RAID 6 rebuilding Goffredo Baroncelli
2018-05-30 12:05   ` Daniel Kiper
2018-06-01 18:50     ` Goffredo Baroncelli
2018-05-16 18:48 ` [PATCH 9/9] btrfs: add RAID 6 recovery for a btrfs filesystem Goffredo Baroncelli
2018-05-30 12:15   ` Daniel Kiper
2018-06-01 18:50     ` Goffredo Baroncelli
2018-06-03 18:53 [PATCH V5] Add support for BTRFS raid5/6 to GRUB Goffredo Baroncelli
2018-06-03 18:53 ` [PATCH 1/9] btrfs: Add support for reading a filesystem with a RAID 5 or RAID 6 profile Goffredo Baroncelli
2018-06-14 11:17   ` Daniel Kiper
2018-06-14 18:55     ` Goffredo Baroncelli
2018-06-19 17:30       ` Goffredo Baroncelli
2018-06-19 17:39 [PATCH V6] Add support for BTRFS raid5/6 to GRUB Goffredo Baroncelli
2018-06-19 17:39 ` [PATCH 1/9] btrfs: Add support for reading a filesystem with a RAID 5 or RAID 6 profile Goffredo Baroncelli
2018-07-08 15:51   ` Goffredo Baroncelli
2018-07-09 10:20     ` Daniel Kiper
2018-07-09 16:29       ` Goffredo Baroncelli
2018-07-12 13:46   ` Daniel Kiper
2018-07-18  6:24     ` Goffredo Baroncelli
2018-09-03 12:52       ` Daniel Kiper
2018-09-19 18:36 Goffredo Baroncelli
2018-09-19 18:42 ` Goffredo Baroncelli
2018-09-19 18:40 [PATCH V7] Add support for BTRFS raid5/6 to GRUB Goffredo Baroncelli
2018-09-19 18:40 ` [PATCH 1/9] btrfs: Add support for reading a filesystem with a RAID 5 or RAID 6 profile Goffredo Baroncelli
2018-09-25 15:31   ` Daniel Kiper
2018-09-26 20:40     ` Goffredo Baroncelli
2018-09-27 15:47       ` Daniel Kiper
2018-09-27 18:34 [PATCH V8] Add support for BTRFS raid5/6 to GRUB Goffredo Baroncelli
2018-09-27 18:34 ` [PATCH 1/9] btrfs: Add support for reading a filesystem with a RAID 5 or RAID 6 profile Goffredo Baroncelli
2018-10-09 17:51   ` Daniel Kiper
2018-10-11 13:17     ` Daniel Kiper
2018-10-11 18:50 [PATCH V9] Add support for BTRFS raid5/6 to GRUB Goffredo Baroncelli
2018-10-11 18:50 ` [PATCH 1/9] btrfs: Add support for reading a filesystem with a RAID 5 or RAID 6 profile Goffredo Baroncelli
2018-10-17 13:46   ` Daniel Kiper
2018-10-18 17:55 [PATCH V10] Add support for BTRFS raid5/6 to GRUB Goffredo Baroncelli
2018-10-18 17:55 ` [PATCH 1/9] btrfs: Add support for reading a filesystem with a RAID 5 or RAID 6 profile Goffredo Baroncelli
2018-10-22 17:29 [PATCH V11] Add support for BTRFS raid5/6 to GRUB Goffredo Baroncelli
2018-10-22 17:29 ` [PATCH 1/9] btrfs: Add support for reading a filesystem with a RAID 5 or RAID 6 profile Goffredo Baroncelli

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.