* [Cluster-devel] Recording extents in GFS2
@ 2020-12-11 16:38 Abhijith Das
  2020-12-14 10:46 ` Steven Whitehouse
  2021-01-24  6:44 ` Abhijith Das
  0 siblings, 2 replies; 13+ messages in thread
From: Abhijith Das @ 2020-12-11 16:38 UTC (permalink / raw)
  To: cluster-devel.redhat.com

Hi all,

With a recent set of patches, we nearly eliminated the per_node statfs
change files by recording that info in the journal. The files and some
recovery code remain only for backward compatibility. Similarly, I'd
like to get rid of the per_node quota change files and record that
info in the journal as well.

I've been talking to Andreas and Bob a bit about this and I'm
investigating how we can record extents as we allocate and deallocate
blocks instead of writing whole blocks. I'm looking into how XFS does
this.

We could have a new journal block type that adds a list of extents to
inodes with alloc/dealloc info. We could add in quota (uid/gid) info
to this as well. If we can do this right, the representation of
alloc/dealloc becomes compact and consequently we use journal space
more efficiently. We can hopefully avoid cases where we need to zero
out blocks during allocation as well.

I'm sending this out to start a discussion and to get ideas/comments/pointers.

Cheers!
--Abhi




* [Cluster-devel] Recording extents in GFS2
  2020-12-11 16:38 [Cluster-devel] Recording extents in GFS2 Abhijith Das
@ 2020-12-14 10:46 ` Steven Whitehouse
  2021-01-24  6:44 ` Abhijith Das
  1 sibling, 0 replies; 13+ messages in thread
From: Steven Whitehouse @ 2020-12-14 10:46 UTC (permalink / raw)
  To: cluster-devel.redhat.com

Hi,

On 11/12/2020 16:38, Abhijith Das wrote:
> Hi all,
>
> With a recent set of patches, we nearly eliminated the per_node statfs
> change files by recording that info in the journal. The files and some
> recovery code remain only for backward compatibility. Similarly, I'd
> like to get rid of the per_node quota change files and record that
> info in the journal as well.
>
> I've been talking to Andreas and Bob a bit about this and I'm
> investigating how we can record extents as we allocate and deallocate
> blocks instead of writing whole blocks. I'm looking into how XFS does
> this.
>
> We could have a new journal block type that adds a list of extents to
> inodes with alloc/dealloc info. We could add in quota (uid/gid) info
> to this as well. If we can do this right, the representation of
> alloc/dealloc becomes compact and consequently we use journal space
> more efficiently. We can hopefully avoid cases where we need to zero
> out blocks during allocation as well.
>
> I'm sending this out to start a discussion and to get ideas/comments/pointers.
>
> Cheers!
> --Abhi
>
I think you need to propose something a bit more concrete. For example,
what will the data structures look like? How many entries will fit in a 
journal block at different block sizes? How will we ensure that this is 
backwards compatible? That will make it easier to have the discussions,

Steve.





* [Cluster-devel] Recording extents in GFS2
  2020-12-11 16:38 [Cluster-devel] Recording extents in GFS2 Abhijith Das
  2020-12-14 10:46 ` Steven Whitehouse
@ 2021-01-24  6:44 ` Abhijith Das
  2021-02-02 15:08   ` Bob Peterson
  2021-02-02 17:35   ` Steven Whitehouse
  1 sibling, 2 replies; 13+ messages in thread
From: Abhijith Das @ 2021-01-24  6:44 UTC (permalink / raw)
  To: cluster-devel.redhat.com

Hi all,

I've been looking at rgrp.c:gfs2_alloc_blocks(), which is called from
various places to allocate single/multiple blocks for inodes. I've come up
with some data structures to accomplish recording of these allocations as
extents.

I'm proposing we add a new metadata type for journal blocks that will hold
these extent records.

GFS2_METATYPE_EX 15 /* New metadata type for a block that will hold extents
 */

This structure below will be at the start of the block, followed by a
number of alloc_ext structures.

struct gfs2_extents { /* This structure is 32 bytes long */
    struct gfs2_meta_header ex_header;
    __be32 ex_count; /* count of number of alloc_ext structs that follow
this header. */
    __be32 __pad;
};
/* flags for the alloc_ext struct */
#define AE_FL_XXX

struct alloc_ext { /* This structure is 48 bytes long */
    struct gfs2_inum ae_num; /* The inode this allocation/deallocation
belongs to */
    __be32 ae_flags; /* specifies if we're allocating/deallocating,
data/metadata, etc. */
    __be64 ae_start; /* starting physical block number of the extent */
    __be64 ae_len;   /* length of the extent */
    __be32 ae_uid;   /* user this belongs to, for quota accounting */
    __be32 ae_gid;   /* group this belongs to, for quota accounting */
    __be32 __pad;
};

With 4k block sizes, we can fit 84 extents (10 for 512b, 20 for 1k, 42 for
2k block sizes) in one block. As we process more allocs/deallocs, we keep
creating more such alloc_ext records and tack them to the back of this
block if there's space or else create a new block. For smaller extents,
this might not be efficient, so we might just want to revert to the old
method of recording the bitmap blocks instead.
During journal replay, we decode these new blocks and flip the
corresponding bitmaps for each of the blocks represented in the extents.
For the ones where we just recorded the bitmap blocks the old-fashioned
way, we also replay them the old-fashioned way. This way we're also
backward compatible with an older version of gfs2 that only records the
bitmaps.
Since we record the uid/gid with each extent, we can do the quota
accounting without relying on the quota change file. We might need to keep
the quota change file around for backward compatibility and for the cases
where we might want to record allocs/deallocs the old-fashioned way.
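
A rough sketch of what that replay walk could look like, as a plain userspace
model: the structures below are simplified stand-ins for the proposed
gfs2_extents/alloc_ext layout (no endianness handling or packing), and
replay_set_bits() is a made-up placeholder for the existing bitmap code, so
this only shows the shape of the loop, not real gfs2 code.

#include <stdint.h>
#include <stdio.h>

/* Simplified stand-ins for the proposed on-disk structures; the real ones
   would use __be32/__be64, struct gfs2_meta_header and struct gfs2_inum,
   and would need explicit packing so that alloc_ext really is 48 bytes. */
struct ex_header {
    uint8_t  meta_header[24];   /* struct gfs2_meta_header */
    uint32_t ex_count;          /* number of alloc_ext records that follow */
    uint32_t pad;
};

struct alloc_ext {
    uint64_t no_formal_ino;     /* struct gfs2_inum, flattened */
    uint64_t no_addr;
    uint32_t flags;             /* alloc vs. dealloc, data/metadata, ... */
    uint64_t start;             /* first physical block of the extent */
    uint64_t len;               /* extent length in blocks */
    uint32_t uid, gid;          /* for quota accounting */
    uint32_t pad;
};

#define AE_FL_DEALLOC 0x1       /* hypothetical flag value */

/* Placeholder for the real bitmap manipulation done at replay time. */
static void replay_set_bits(uint64_t start, uint64_t len, int allocate)
{
    printf("%s blocks %llu..%llu\n", allocate ? "alloc" : "free",
           (unsigned long long)start,
           (unsigned long long)(start + len - 1));
}

/* Walk one GFS2_METATYPE_EX journal block and apply its extents. */
static void replay_extent_block(const void *block)
{
    const struct ex_header *hdr = block;
    const struct alloc_ext *ae = (const void *)(hdr + 1);
    uint32_t i;

    for (i = 0; i < hdr->ex_count; i++, ae++)
        replay_set_bits(ae->start, ae->len, !(ae->flags & AE_FL_DEALLOC));
}

int main(void)
{
    uint64_t raw[512] = { 0 };  /* one 4k journal block */
    struct ex_header *hdr = (struct ex_header *)raw;
    struct alloc_ext *ae = (struct alloc_ext *)(hdr + 1);

    hdr->ex_count = 2;
    ae[0] = (struct alloc_ext){ .no_addr = 42, .start = 10000, .len = 64 };
    ae[1] = (struct alloc_ext){ .no_addr = 42, .start = 20000, .len = 8,
                                .flags = AE_FL_DEALLOC };
    replay_extent_block(raw);
    return 0;
}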

I'm going to play around with this and come up with some patches to see if
this works and what kind of performance improvements we get. These data
structures will most likely need reworking and renaming, but this is the
general direction I'm thinking along.

Please let me know what you think.

Cheers!
--Abhi


* [Cluster-devel] Recording extents in GFS2
  2021-01-24  6:44 ` Abhijith Das
@ 2021-02-02 15:08   ` Bob Peterson
  2021-02-02 17:35   ` Steven Whitehouse
  1 sibling, 0 replies; 13+ messages in thread
From: Bob Peterson @ 2021-02-02 15:08 UTC (permalink / raw)
  To: cluster-devel.redhat.com

----- Original Message -----
> Hi all,
> 
> I've been looking at rgrp.c:gfs2_alloc_blocks(), which is called from
> various places to allocate single/multiple blocks for inodes. I've come up
> with some data structures to accomplish recording of these allocations as
> extents.
> 
> I'm proposing we add a new metadata type for journal blocks that will hold
> these extent records.
> 
> GFS2_METATYPE_EX 15 /* New metadata type for a block that will hold extents
>  */
> 
> This structure below will be at the start of the block, followed by a
> number of alloc_ext structures.
> 
> struct gfs2_extents { /* This structure is 32 bytes long */
>     struct gfs2_meta_header ex_header;
>     __be32 ex_count; /* count of number of alloc_ext structs that follow
> this header. */
>     __be32 __pad;
> };
> /* flags for the alloc_ext struct */
> #define AE_FL_XXX
> 
> struct alloc_ext { /* This structure is 48 bytes long */
>     struct gfs2_inum ae_num; /* The inode this allocation/deallocation
> belongs to */
>     __be32 ae_flags; /* specifies if we're allocating/deallocating,
> data/metadata, etc. */
>     __be64 ae_start; /* starting physical block number of the extent */
>     __be64 ae_len;   /* length of the extent */
>     __be32 ae_uid;   /* user this belongs to, for quota accounting */
>     __be32 ae_gid;   /* group this belongs to, for quota accounting */
>     __be32 __pad;
> };
> 
> With 4k block sizes, we can fit 84 extents (10 for 512b, 20 for 1k, 42 for
> 2k block sizes) in one block. As we process more allocs/deallocs, we keep
> creating more such alloc_ext records and tack them to the back of this
> block if there's space or else create a new block. For smaller extents,
> this might not be efficient, so we might just want to revert to the old
> method of recording the bitmap blocks instead.
> During journal replay, we decode these new blocks and flip the
> corresponding bitmaps for each of the blocks represented in the extents.
> For the ones where we just recorded the bitmap blocks the old-fashioned
> way, we also replay them the old-fashioned way. This way we're also
> backward compatible with an older version of gfs2 that only records the
> bitmaps.
> Since we record the uid/gid with each extent, we can do the quota
> accounting without relying on the quota change file. We might need to keep
> the quota change file around for backward compatibility and for the cases
> where we might want to record allocs/deallocs the old-fashioned way.
> 
> I'm going to play around with this and come up with some patches to see if
> this works and what kind of performance improvements we get. These data
> structures will mostly likely need reworking and renaming, but this is the
> general direction I'm thinking along.
> 
> Please let me know what you think.
> 
> Cheers!
> --Abhi
> 
Hi Abhi,

Thanks for working on this. I just want to throw some thoughts out,
as long as we're in the early stages of this.

I'm concerned about whether we need to worry about these new records
being encountered during a journal replay on an old kernel that knows
nothing about them, and how we handle that. We'll need a plan going in,
but we're already talking about changes to the on-disk format and
version numbers to keep that straight. So I assume we're okay there.

It sounds like a journal replay may encounter metadata records for
resource groups and bitmaps, as well as these new journal entries.
Since this is not really metadata, but a representation thereof, I wonder
if we should make this new record a new kind of journal block. After
all, they should only appear in journals. In other words, today we
have (1) log headers and (2) log descriptors. Maybe these should be
log modifiers or something? There may be advantages and disadvantages.

The reason I bring this up is: I'm concerned that journal replay might
get the ordering wrong. In other words, if journal replay encounters
a metadata block for a resource group bitmap and rewrites the in-place
block, and then encounters one of these new gfs2_extents blocks for the
same bitmap, it needs to get the order right with regard to whether the
extents should be carved out of the original bitmap or the replayed one.
The order of the metadata in the log descriptors depends entirely on
the order in which they appear within the ail lists as they're added to
the transaction. I'm not convinced we get the order "right" today, but
today it doesn't matter because there will only be one copy of the bitmap
per transaction. With extents, we will potentially have more than one,
which means we need to guarantee the order is correct, or at least
guard against illegal bitmap changes caused by them.

With today's scheme of journaling the entire bitmap, we use our scheme
of revoking the metadata that's been written once it's safely written
back. So we need a way to do something similar for these extent blocks.
This is mostly accomplished by way of the journal sequence numbers in
the log headers. So maybe we can leverage these same sequence numbers
to guarantee the order and ensure some kind of revoke process.
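
Just to make that thought concrete, a tiny illustrative fragment (all names
invented, loosely modelled on how the existing revoke check works): an extent
record would only be applied if no later revoke, identified by the log-header
sequence number, supersedes it.

#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative only: a replayed extent record (or bitmap image) is skipped
   if a revoke with an equal or newer log-header sequence number covers the
   same block, which is roughly how the existing revoke check orders things. */
struct revoke_entry {
    uint64_t blkno;
    uint64_t lh_seq;            /* sequence of the log header that revoked it */
};

bool superseded_by_revoke(const struct revoke_entry *revokes, size_t n,
                          uint64_t blkno, uint64_t entry_seq)
{
    size_t i;

    for (i = 0; i < n; i++)
        if (revokes[i].blkno == blkno && revokes[i].lh_seq >= entry_seq)
            return true;        /* a newer revoke covers it: skip */
    return false;
}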

It's likely to get messy. But probably still worth the effort.

Unless, of course, we go to an all-or-nothing scheme: IOW, all bitmap
changes are journaled either as (a) metadata or as (b) extents,
but not both. That could, as you brought up, affect performance for
smaller allocations.

Another thought is that maybe we could toss these records into the
log headers? I suppose there are probably better long-term uses for
that space though.

Regards,

Bob Peterson
Red Hat File Systems




* [Cluster-devel] Recording extents in GFS2
  2021-01-24  6:44 ` Abhijith Das
  2021-02-02 15:08   ` Bob Peterson
@ 2021-02-02 17:35   ` Steven Whitehouse
  2021-02-20  9:48     ` Andreas Gruenbacher
  2021-03-01 17:53     ` Andreas Gruenbacher
  1 sibling, 2 replies; 13+ messages in thread
From: Steven Whitehouse @ 2021-02-02 17:35 UTC (permalink / raw)
  To: cluster-devel.redhat.com

Hi,

On 24/01/2021 06:44, Abhijith Das wrote:
> Hi all,
>
> I've been looking at rgrp.c:gfs2_alloc_blocks(), which is called from 
> various places to allocate single/multiple blocks for inodes. I've 
> come up with some data structures to accomplish recording of these 
> allocations as extents.
>
> I'm proposing we add a new metadata type for journal blocks that will 
> hold these extent records.
>
> GFS2_METATYPE_EX 15 /* New metadata type for a block that will hold
> extents */
>
> This structure below will be at the start of the block, followed by a
> number of alloc_ext structures.
>
> struct gfs2_extents { /* This structure is 32 bytes long */
>     struct gfs2_meta_header ex_header;
>     __be32 ex_count; /* count of number of alloc_ext structs that follow
> this header. */
>     __be32 __pad;
> };
> /* flags for the alloc_ext struct */
> #define AE_FL_XXX
>
> struct alloc_ext { /* This structure is 48 bytes long */
>     struct gfs2_inum ae_num; /* The inode this allocation/deallocation
> belongs to */
>     __be32 ae_flags; /* specifies if we're allocating/deallocating,
> data/metadata, etc. */
>     __be64 ae_start; /* starting physical block number of the extent */
>     __be64 ae_len;   /* length of the extent */
>     __be32 ae_uid;   /* user this belongs to, for quota accounting */
>     __be32 ae_gid;   /* group this belongs to, for quota accounting */
>     __be32 __pad;
> };
>
The gfs2_inum structure is a bit OTT for this I think. A single 64 bit 
inode number should be enough? Also, it is quite likely we may have 
multiple extents for the same inode... so should we split this into two 
so we can have something like this? It is more complicated, but should 
save space in the average case.

struct alloc_hdr {

    __be64 inum;

    __be32 uid; /* This is duplicated from the inode... various options
here depending on whether we think this is something we should do.
Should we also consider logging chown using this structure? We will have
to carefully check chown sequence wrt to allocations/deallocations for
quota purposes */

    __be32 gid;

    __u8 num_extents; /* Never likely to have huge numbers of extents
per header, due to block size! */

    /* padding... or is there something else we could/should add here? */

};

followed by num_extents copies of:

struct alloc_extent {

    __be64 phys_start;

    __be64 logical_start; /* Do we need a logical & physical start?
Maybe we don't care about the logical start? */

    __be32 length; /* Max extent length is limited by rgrp length...
only need 32 bits */

    __be32 flags; /* Can we support unwritten, zero extents with this?
Need to indicate alloc/free/zero, data/metadata */

};

Just wondering if there is also some shorthand we might be able to use 
in case we have multiple extents all separated by either one metadata 
block, or a very small number of metadata blocks (which will be the case 
for streaming writes). Again it increases the complexity, but will 
likely reduce the amount we have to write into the new journal blocks 
quite a lot. Not much point having a 32 bit length, but never filling it 
with a value above 509 (4k block size)...
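
One possible shape for such a shorthand, purely as an illustration (names and
field sizes invented, not a worked-out format): store the absolute start once,
then a short run list of (length, skip) pairs, where skip is the small
metadata gap between data extents.

#include <stdint.h>

struct alloc_run_hdr {
    uint64_t phys_start;        /* absolute start of the first extent */
    uint16_t num_runs;          /* number of alloc_run entries that follow */
    uint16_t flags;
    uint32_t pad;
};

struct alloc_run {
    uint16_t length;            /* e.g. at most 509 data blocks at 4k */
    uint16_t skip;              /* blocks to skip before the next extent,
                                   e.g. 1 for a single indirect block */
};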


> With 4k block sizes, we can fit 84 extents (10 for 512b, 20 for 1k, 42 
> for 2k block sizes) in one block. As we process more allocs/deallocs, 
> we keep creating more such alloc_ext records and tack them to the back 
> of this block if there's space or else create a new block. For smaller 
> extents, this might not be efficient, so we might just want to revert 
> to the old method of recording the bitmap blocks instead.
> During journal replay, we decode these new blocks and flip the 
> corresponding bitmaps for each of the blocks represented in the 
> extents. For the ones where we just recorded the bitmap blocks the 
> old-fashioned way, we also replay them the old-fashioned way. This way 
> we're also backward compatible with an older version of gfs2 that only 
> records the bitmaps.
> Since we record the uid/gid with each extent, we can do the quota 
> accounting without relying on the quota change file. We might need to 
> keep the quota change file around for backward compatibility and for 
> the cases where we might want to record allocs/deallocs the 
> old-fashioned way.
>
> I'm going to play around with this and come up with some patches to 
> see if this works and what kind of performance improvements we get. 
> These data structures will mostly likely need reworking and renaming, 
> but this is the general direction I'm thinking along.
>
> Please let me know what you think.
>
> Cheers!
> --Abhi

That all sounds good. I'm sure it will take a little while to figure out 
how to get this right,

Steve.




* [Cluster-devel] Recording extents in GFS2
  2021-02-02 17:35   ` Steven Whitehouse
@ 2021-02-20  9:48     ` Andreas Gruenbacher
  2021-02-22 10:20       ` Steven Whitehouse
  2021-03-01 17:53     ` Andreas Gruenbacher
  1 sibling, 1 reply; 13+ messages in thread
From: Andreas Gruenbacher @ 2021-02-20  9:48 UTC (permalink / raw)
  To: cluster-devel.redhat.com

Hi all,

once we change the journal format, in addition to recording block numbers
as extents, there are some additional issues we should address at the same
time:

I. The current transaction format of our journals is as follows:

   - One METADATA log descriptor block for each [503 / 247 / 119 / 55]
   metadata blocks, followed by those metadata blocks. For each metadata
   block, the log descriptor records the 64-bit block number.
   - One JDATA log descriptor block for each [251 / 123 / 59 / 27] journaled
   data blocks, followed by those data blocks. For each data block, the log
   descriptor records the 64-bit block number and another 64-bit field for
   indicating whether the block needed escaping.
   - One REVOKE log descriptor block for the initial [503 / 247 / 119 / 55]
   revokes, followed by a metadata header (not to be confused with the log
   header) for each additional [509 / 253 / 125 / 61] revokes. Each revoke is
   recorded as a 64-bit block number in its REVOKE log descriptor or metadata
   header.
   - One log header with various necessary and useful metadata that acts as
   a COMMIT record. If the log header is incorrect or missing, the preceding
   log descriptors are ignored.

We should change that so that a single log descriptor contains a number of
records. There should be records for METADATA and JDATA blocks that follow,
as well as for REVOKES and for COMMIT. If a transaction contains metadata
and/or jdata blocks, those will obviously need a precursor and a commit
block like today, but we shouldn't need separate blocks for metadata and
journaled data in many cases. Small transactions that only consist of
revokes and of a commit should frequently fit into a single block entirely,
though.
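
As a sketch of that layout (record types and fields invented here purely for
illustration, not a concrete on-disk proposal), a single journal block could
carry a sequence of typed records:

#include <stdint.h>

enum log_rec_type {
    LOG_REC_METADATA = 1,       /* block numbers of metadata blocks that follow */
    LOG_REC_JDATA    = 2,       /* block numbers + escape flags of jdata blocks */
    LOG_REC_REVOKE   = 3,       /* revoked block numbers */
    LOG_REC_COMMIT   = 4,       /* closes the transaction */
};

struct log_rec_hdr {
    uint16_t type;              /* enum log_rec_type */
    uint16_t count;             /* number of 64-bit entries that follow */
    uint32_t pad;
    /* 'count' __be64 entries follow, then the next record; a transaction
       made up only of revokes and a commit fits in one such block */
};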

Right now, we're writing log headers ("commits") with REQ_PREFLUSH to make
sure all the log descriptors of a transaction make it to disk before the
log header. Depending on the device, this is often costly. If we can fit an
entire transaction into a single block, REQ_PREFLUSH won't be needed
anymore.

III. We could also checksum entire transactions to avoid REQ_PREFLUSH. At
replay time, all the blocks that make up a transaction will either be there
and the checksum will match, or the transaction will be invalid. This
should be less prohibitively expensive with CPU support for CRC32C
nowadays, but depending on the hardware, it may make sense to turn this off.
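
For instance (a hypothetical helper, sketched with the kernel's crc32c(); the
tr_checksum name and calling convention are made up), the commit record could
carry a CRC32C accumulated over all blocks of the transaction:

#include <linux/types.h>
#include <linux/crc32c.h>

/* Hypothetical: accumulate a CRC32C over every block of a transaction and
   store the result in the commit record, so that replay can validate the
   whole transaction without relying on REQ_PREFLUSH ordering. */
static u32 tr_checksum(void * const *blocks, unsigned int nblocks,
                       unsigned int bsize)
{
    u32 crc = 0;
    unsigned int i;

    for (i = 0; i < nblocks; i++)
        crc = crc32c(crc, blocks[i], bsize);
    return crc;
}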

IV. We need recording of unwritten blocks / extents (allocations /
fallocate) as this will significantly speed up moving glocks from one node
to another:

At the moment, data=ordered is implemented by keeping a list of all inodes
that did an ordered write. When it comes time to flush the log, the data of
all those ordered inodes is flushed first. When all we want is to flush a
single glock in order to move it to a different node, we currently flush
all the ordered inodes as well as the journal.

If we only flushed the ordered data of the glock being moved plus the
entire journal, the ordering guarantees for the other ordered inodes in the
journal would be violated. In that scenario, unwritten blocks could (and
would) show up in files after crashes.

If we instead record unwritten blocks in the journal, we'll know which
blocks need to be zeroed out at recovery time. Once an unwritten block is
written, we record a REVOKE entry for that block.
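
A toy userspace model of that life cycle (all names invented, and a fixed-size
table standing in for the journal), just to make the alloc/write/revoke flow
concrete:

#include <stdint.h>
#include <stdio.h>

struct unwritten_rec {
    uint64_t start, len;
    int revoked;                /* data hit disk; ignore at recovery */
};

static struct unwritten_rec jrnl[64];
static unsigned int jrnl_used;

static void log_unwritten(uint64_t start, uint64_t len)
{
    jrnl[jrnl_used++] = (struct unwritten_rec){ start, len, 0 };
}

static void revoke_unwritten(uint64_t start, uint64_t len)
{
    unsigned int i;

    for (i = 0; i < jrnl_used; i++)
        if (jrnl[i].start == start && jrnl[i].len == len)
            jrnl[i].revoked = 1;
}

/* At recovery, only extents that were never revoked still need zeroing. */
static void recover(void)
{
    unsigned int i;

    for (i = 0; i < jrnl_used; i++)
        if (!jrnl[i].revoked)
            printf("zero out blocks %llu+%llu\n",
                   (unsigned long long)jrnl[i].start,
                   (unsigned long long)jrnl[i].len);
}

int main(void)
{
    log_unwritten(1000, 8);     /* allocation / fallocate, data not on disk */
    log_unwritten(2000, 4);
    revoke_unwritten(1000, 8);  /* data for the first extent was written */
    recover();                  /* only 2000+4 would be zeroed */
    return 0;
}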

This comes at the cost of tracking those blocks of course, but with that in
place, moving a glock from one node to another will only require flushing
the underlying inode (assuming it's an inode glock) and the journal. And
most likely, we won't have to bother with implementing "simple"
transactions as described in
https://bugzilla.redhat.com/show_bug.cgi?id=1631499.

Thanks,
Andreas


* [Cluster-devel] Recording extents in GFS2
  2021-02-20  9:48     ` Andreas Gruenbacher
@ 2021-02-22 10:20       ` Steven Whitehouse
  2021-02-22 11:41         ` Andreas Gruenbacher
  0 siblings, 1 reply; 13+ messages in thread
From: Steven Whitehouse @ 2021-02-22 10:20 UTC (permalink / raw)
  To: cluster-devel.redhat.com

Hi,

On 20/02/2021 09:48, Andreas Gruenbacher wrote:
> Hi all,
>
> once we change the journal format, in addition to recording block 
> numbers as extents, there are some additional issues we should address 
> at the same time:
>
> I. The current transaction format of our journals is as follows:
>
>   * One METADATA log descriptor block for each [503 / 247 / 119 / 55]
>     metadata blocks, followed by those metadata blocks. For each
>     metadata block, the log descriptor records the 64-bit block number.
>   * One JDATA log descriptor block for each [251 / 123 / 59 / 27]
>     metadata blocks, followed by those metadata blocks. For each
>     metadata block, the log descriptor records the 64-bit block number
>     and another 64-bit field for indicating whether the block needed
>     escaping.
>   * One REVOKE log descriptor block for the initial [503 / 247 / 119 /
>     55] revokes, followed by a metadata header (not to be confused
>     with the log header) for each additional [509 / 253 / 125 / 61]
>     revokes. Each revoke is recorded as a 64-bit block number in its
>     REVOKE log descriptor or metadata header.
>   * One log header with various necessary and useful metadata that
>     acts as a COMMIT record. If the log header is incorrect or
>     missing, the preceding log descriptors are ignored.
>
^^^^ succeeding? (I hope!)
> We should change that so that a single log descriptor contains a 
> number of records. There should be records for METADATA and JDATA 
> blocks that follow, as well as for REVOKES and for COMMIT. If a 
> transaction contains metadata and/or jdata blocks, those will 
> obviously need a precursor and a commit block like today, but we 
> shouldn't need separate blocks for metadata and journaled data in many 
> cases. Small transactions that only consist of revokes and of a commit 
> should frequently fit into a single block entirely, though.
>
Yes, it makes sense to try and condense what we are writing. Why would 
we not need to have separate blocks for journaled data though? That one 
seems difficult to avoid, and since it is used so infrequently, perhaps 
not such an important issue.


> Right now, we're writing log headers ("commits") with REQ_PREFLUSH to 
> make sure all the log descriptors of a transaction make it to disk 
> before the log header. Depending on the device, this is often costly. 
> If we can fit an entire transaction into a single block, REQ_PREFLUSH 
> won't be needed anymore.

I'm not sure I agree. The purpose of the preflush is to ensure that the 
data and the preceding log blocks are really on disk before we write the 
commit record. That will still be required while we use ordered writes, 
even if we can use (as you suggest below) a checksum to cover the whole 
transaction, and thus check for a complete log record after the fact. 
Also, we would still have to issue the flush in the case of an
fsync-derived log flush too.


>
> III. We could also checksum entire transactions to avoid REQ_PREFLUSH. 
> At replay time, all the blocks that make up a transaction will either 
> be there and the checksum will match, or the transaction will be 
> invalid. This should be less prohibitively expensive with CPU support 
> for CRC32C nowadays, but depending on the hardware, it may make sense 
> to turn this off.
>
> IV. We need recording of unwritten blocks / extents (allocations / 
> fallocate) as this will significantly speed up moving glocks from one 
> node to another:

That would definitely be a step forward.


>
> At the moment, data=ordered is implemented by keeping a list of all 
> inodes that did an ordered write. When it comes time to flush the log, 
> the data of all those ordered inodes is flushed first. When all we 
> want is to flush a single glock in order to move it to a different 
> node, we currently flush all the ordered inodes as well as the journal.
>
> If we only flushed the ordered data of the glock being moved plus the 
> entire journal, the ordering guarantees for the other ordered inodes 
> in the journal would be violated. In that scenario, unwritten blocks 
> could (and would) show up in files after crashes.
>
> If we instead record unwritten blocks in the journal, we'll know which 
> blocks need to be zeroed out at recovery time. Once an unwritten block 
> is written, we record a REVOKE entry for that block.
>
> This comes at the cost of tracking those blocks of course, but with 
> that in place, moving a glock from one node to another will only 
> require flushing the underlying inode (assuming it's a inode glock) 
> and the journal. And most likely, we won't have to bother with 
> implementing "simple" transactions as described in 
> https://bugzilla.redhat.com/show_bug.cgi?id=1631499.
>
> Thanks,
> Andreas

That would be another way of looking at the problem, yes. It does add a 
lot to the complexity though, and it doesn't scale very well on systems 
with large amounts of memory (and therefore potentially lots of 
unwritten extents to record & keep track of). If there are lots of small 
transactions, then each one might be significantly expanded by the need 
to write the info to track the things which have not been written yet.

If we keep track of individual allocations/deallocations, as per Abhi's 
suggestion, then we know where the areas are which may potentially have 
unwritten data in them. That may allow us to avoid having to do the data 
writeback ahead of the journal flush in the first place - moving 
something more towards the XFS way of doing things. We would have to 
ensure that we did get data written back before the allocation records 
vanish from the active part of the log though, so a slightly different 
constraint to currently,

Steve.



* [Cluster-devel] Recording extents in GFS2
  2021-02-22 10:20       ` Steven Whitehouse
@ 2021-02-22 11:41         ` Andreas Gruenbacher
  2021-02-22 13:03           ` Andreas Gruenbacher
  0 siblings, 1 reply; 13+ messages in thread
From: Andreas Gruenbacher @ 2021-02-22 11:41 UTC (permalink / raw)
  To: cluster-devel.redhat.com

On Mon, Feb 22, 2021 at 11:21 AM Steven Whitehouse <swhiteho@redhat.com>
wrote:

> Hi,
> On 20/02/2021 09:48, Andreas Gruenbacher wrote:
>
> Hi all,
>
> once we change the journal format, in addition to recording block numbers
> as extents, there are some additional issues we should address at the same
> time:
>
> I. The current transaction format of our journals is as follows:
>
>    - One METADATA log descriptor block for each [503 / 247 / 119 / 55]
>    metadata blocks, followed by those metadata blocks. For each metadata
>    block, the log descriptor records the 64-bit block number.
>    - One JDATA log descriptor block for each [251 / 123 / 59 / 27]
>    metadata blocks, followed by those metadata blocks. For each metadata
>    block, the log descriptor records the 64-bit block number and another
>    64-bit field for indicating whether the block needed escaping.
>    - One REVOKE log descriptor block for the initial [503 / 247 / 119 /
>    55] revokes, followed by a metadata header (not to be confused with the log
>    header) for each additional [509 / 253 / 125 / 61] revokes. Each revoke is
>    recorded as a 64-bit block number in its REVOKE log descriptor or metadata
>    header.
>    - One log header with various necessary and useful metadata that acts
>    as a COMMIT record. If the log header is incorrect or missing, the
>    preceding log descriptors are ignored.
>
>                                                                   ^^^^
> succeeding? (I hope!)
>

No, we call lops_before_commit (which writes the various log descriptors,
metadata, and journaled data blocks) before writing the log header in
log_write_header -> gfs2_write_log_header. In that sense, we could call it
a trailer.

> We should change that so that a single log descriptor contains a number of
> records. There should be records for METADATA and JDATA blocks that follow,
> as well as for REVOKES and for COMMIT. If a transaction contains metadata
> and/or jdata blocks, those will obviously need a precursor and a commit
> block like today, but we shouldn't need separate blocks for metadata and
> journaled data in many cases. Small transactions that only consist of
> revokes and of a commit should frequently fit into a single block entirely,
> though.
>
> Yes, it makes sense to try and condense what we are writing. Why would we
> not need to have separate blocks for journaled data though? That one seems
> difficult to avoid, and since it is used so infrequently, perhaps not such
> an important issue.
>
Journaled data would of course still need to be written. We could have a
single log descriptor with METADATA and JDATA records, followed by the
metadata and journaled data blocks, followed by a log descriptor with a
COMMIT record.

> Right now, we're writing log headers ("commits") with REQ_PREFLUSH to make
> sure all the log descriptors of a transaction make it to disk before the
> log header. Depending on the device, this is often costly. If we can fit an
> entire transaction into a single block, REQ_PREFLUSH won't be needed
> anymore.
>
> I'm not sure I agree. The purpose of the preflush is to ensure that the
> data and the preceding log blocks are really on disk before we write the
> commit record. That will still be required while we use ordered writes,
> even if we can use (as you suggest below) a checksum to cover the whole
> transaction, and thus check for a complete log record after the fact. Also,
> we would still have to issue the flush in the case of a fsync derived log
> flush too.
>
>
>
> III. We could also checksum entire transactions to avoid REQ_PREFLUSH. At
> replay time, all the blocks that make up a transaction will either be there
> and the checksum will match, or the transaction will be invalid. This
> should be less prohibitively expensive with CPU support for CRC32C
> nowadays, but depending on the hardware, it may make sense to turn this off.
>
> IV. We need recording of unwritten blocks / extents (allocations /
> fallocate) as this will significantly speed up moving glocks from one node
> to another:
>
> That would definitely be a step forward.
>
>
>
> At the moment, data=ordered is implemented by keeping a list of all inodes
> that did an ordered write. When it comes time to flush the log, the data of
> all those ordered inodes is flushed first. When all we want is to flush a
> single glock in order to move it to a different node, we currently flush
> all the ordered inodes as well as the journal.
>
> If we only flushed the ordered data of the glock being moved plus the
> entire journal, the ordering guarantees for the other ordered inodes in the
> journal would be violated. In that scenario, unwritten blocks could (and
> would) show up in files after crashes.
>
> If we instead record unwritten blocks in the journal, we'll know which
> blocks need to be zeroed out at recovery time. Once an unwritten block is
> written, we record a REVOKE entry for that block.
>
> This comes at the cost of tracking those blocks of course, but with that
> in place, moving a glock from one node to another will only require
> flushing the underlying inode (assuming it's a inode glock) and the
> journal. And most likely, we won't have to bother with implementing "simple"
> transactions as described in
> https://bugzilla.redhat.com/show_bug.cgi?id=1631499.
>
> Thanks,
> Andreas
>
> That would be another way of looking at the problem, yes. It does add a
> lot to the complexity though, and it doesn't scale very well on systems
> with large amounts of memory (and therefore potentially lots of unwritten
> extents to record & keep track of). If there are lots of small
> transactions, then each one might be significantly expanded by the need to
> write the info to track the things which have not been written yet.
>
> If we keep track of individual allocations/deallocations, as per Abhi's
> suggestion, then we know where the areas are which may potentially have
> unwritten data in them. That may allow us to avoid having to do the data
> writeback ahead of the journal flush in the first place - moving something
> more towards the XFS way of doing things.
>
Well, allocations and unwritten data are essentially the same thing; I may
not have said that very clearly. So avoiding unnecessary ordered data
write-out is *exactly* what I'm proposing here. When moving a glock from
one node to another, we very certainly do want to write out the ordered
data of that specific inode, however. The problem is that tracking
allocations is worthless if we don't record one of the following things in
the journal: either (a) which of the unwritten blocks have been written
already, or (b) the fact that all unwritten blocks of an inode have been
written now. When moving a glock from one node to another, (b) may be
relatively easy to ascertain, but in a running system, we may never reach
that state.

If we don't "revoke" unwritten blocks some time soon after they are written
(i.e., mark allocated blocks as written), recovery will have no way of
knowing which of the newly allocated blocks to wipe out.

> We would have to ensure that we did get data written back before the
> allocation records vanish from the active part of the log though, so a
> slightly different constraint to currently,
>
Indeed.

Thanks,
Andreas


* [Cluster-devel] Recording extents in GFS2
  2021-02-22 11:41         ` Andreas Gruenbacher
@ 2021-02-22 13:03           ` Andreas Gruenbacher
  2021-02-25 18:48             ` Bob Peterson
  0 siblings, 1 reply; 13+ messages in thread
From: Andreas Gruenbacher @ 2021-02-22 13:03 UTC (permalink / raw)
  To: cluster-devel.redhat.com

On Mon, Feb 22, 2021 at 12:41 PM Andreas Gruenbacher <agruenba@redhat.com>
wrote:

> On Mon, Feb 22, 2021 at 11:21 AM Steven Whitehouse <swhiteho@redhat.com>
> wrote:
>
>> Hi,
>> On 20/02/2021 09:48, Andreas Gruenbacher wrote:
>>
>> Hi all,
>>
>> once we change the journal format, in addition to recording block numbers
>> as extents, there are some additional issues we should address at the same
>> time:
>>
>> I. The current transaction format of our journals is as follows:
>>
>>    - One METADATA log descriptor block for each [503 / 247 / 119 / 55]
>>    metadata blocks, followed by those metadata blocks. For each metadata
>>    block, the log descriptor records the 64-bit block number.
>>    - One JDATA log descriptor block for each [251 / 123 / 59 / 27]
>>    metadata blocks, followed by those metadata blocks. For each metadata
>>    block, the log descriptor records the 64-bit block number and another
>>    64-bit field for indicating whether the block needed escaping.
>>    - One REVOKE log descriptor block for the initial [503 / 247 / 119 /
>>    55] revokes, followed by a metadata header (not to be confused with the log
>>    header) for each additional [509 / 253 / 125 / 61] revokes. Each revoke is
>>    recorded as a 64-bit block number in its REVOKE log descriptor or metadata
>>    header.
>>    - One log header with various necessary and useful metadata that acts
>>    as a COMMIT record. If the log header is incorrect or missing, the
>>    preceding log descriptors are ignored.
>>
>>                                                                   ^^^^
>> succeeding? (I hope!)
>>
>
> No, we call lops_before_commit (which writes the various log descriptors,
> metadata, and journaled data blocks) before writing the log header in
> log_write_header -> gfs2_write_log_header. In that sense, we could call it
> a trailer.
>
>> We should change that so that a single log descriptor contains a number of
>> records. There should be records for METADATA and JDATA blocks that follow,
>> as well as for REVOKES and for COMMIT. If a transaction contains metadata
>> and/or jdata blocks, those will obviously need a precursor and a commit
>> block like today, but we shouldn't need separate blocks for metadata and
>> journaled data in many cases. Small transactions that only consist of
>> revokes and of a commit should frequently fit into a single block entirely,
>> though.
>>
>> Yes, it makes sense to try and condense what we are writing. Why would we
>> not need to have separate blocks for journaled data though? That one seems
>> difficult to avoid, and since it is used so infrequently, perhaps not such
>> an important issue.
>>
> Journaled data would of course still need to be written. We could have a
> single log descriptor with METADATA and JDATA records, followed by the
> metadata and journaled data blocks, followed by a log descriptor with a
> COMMIT record.
>
>> Right now, we're writing log headers ("commits") with REQ_PREFLUSH to
>> make sure all the log descriptors of a transaction make it to disk before
>> the log header. Depending on the device, this is often costly. If we can
>> fit an entire transaction into a single block, REQ_PREFLUSH won't be needed
>> anymore.
>>
>> I'm not sure I agree. The purpose of the preflush is to ensure that the
>> data and the preceding log blocks are really on disk before we write the
>> commit record. That will still be required while we use ordered writes,
>> even if we can use (as you suggest below) a checksum to cover the whole
>> transaction, and thus check for a complete log record after the fact. Also,
>> we would still have to issue the flush in the case of a fsync derived log
>> flush too.
>>
>>
>>
>> III. We could also checksum entire transactions to avoid REQ_PREFLUSH. At
>> replay time, all the blocks that make up a transaction will either be there
>> and the checksum will match, or the transaction will be invalid. This
>> should be less prohibitively expensive with CPU support for CRC32C
>> nowadays, but depending on the hardware, it may make sense to turn this off.
>>
>> IV. We need recording of unwritten blocks / extents (allocations /
>> fallocate) as this will significantly speed up moving glocks from one node
>> to another:
>>
>> That would definitely be a step forward.
>>
>>
>>
>> At the moment, data=ordered is implemented by keeping a list of all
>> inodes that did an ordered write. When it comes time to flush the log, the
>> data of all those ordered inodes is flushed first. When all we want is to
>> flush a single glock in order to move it to a different node, we currently
>> flush all the ordered inodes as well as the journal.
>>
>> If we only flushed the ordered data of the glock being moved plus the
>> entire journal, the ordering guarantees for the other ordered inodes in the
>> journal would be violated. In that scenario, unwritten blocks could (and
>> would) show up in files after crashes.
>>
>> If we instead record unwritten blocks in the journal, we'll know which
>> blocks need to be zeroed out at recovery time. Once an unwritten block is
>> written, we record a REVOKE entry for that block.
>>
>> This comes at the cost of tracking those blocks of course, but with that
>> in place, moving a glock from one node to another will only require
>> flushing the underlying inode (assuming it's a inode glock) and the
>> journal. And most likely, we won't have to bother with implementing "simple"
>> transactions as described in
>> https://bugzilla.redhat.com/show_bug.cgi?id=1631499.
>>
>> Thanks,
>> Andreas
>>
>> That would be another way of looking at the problem, yes. It does add a
>> lot to the complexity though, and it doesn't scale very well on systems
>> with large amounts of memory (and therefore potentially lots of unwritten
>> extents to record & keep track of). If there are lots of small
>> transactions, then each one might be significantly expanded by the need to
>> write the info to track the things which have not been written yet.
>>
>> If we keep track of individual allocations/deallocations, as per Abhi's
>> suggestion, then we know where the areas are which may potentially have
>> unwritten data in them. That may allow us to avoid having to do the data
>> writeback ahead of the journal flush in the first place - moving something
>> more towards the XFS way of doing things.
>>
> Well, allocations and unwritten data are essentially the same thing; I may
> not have said that very clearly. So avoiding unnecessary ordered data
> write-out is *exactly* what I'm proposing here. When moving a glock from
> one node to another, we very certainly do want to write out the ordered
> data of that specific inode, however. The problem is that tracking
> allocations is worthless if we don't record one of the following things in
> the journal: either (a) which of the unwritten blocks have been written
> already, or (b) the fact that all unwritten blocks of an inode have been
> written now. When moving a glock from one node to another, (b) may be
> relatively easy to ascertain, but in a running system, we may never reach
> that state.
>

To expand on this a little, fsync is a point at which (b) is achieved, due
to the fact that we don't allow multiple local processes concurrent "EX"
access to a file today. This isn't really a desired property of the
filesystem though; other filesystems allow a lot more concurrency. So
before too long, we might end up in a situation where an fsync only
guarantees that all previous writes will be synced to disk. The resource
group glock sharing is a move in that direction.

Andreas


* [Cluster-devel] Recording extents in GFS2
  2021-02-22 13:03           ` Andreas Gruenbacher
@ 2021-02-25 18:48             ` Bob Peterson
  2021-02-25 19:08               ` Andreas Gruenbacher
  0 siblings, 1 reply; 13+ messages in thread
From: Bob Peterson @ 2021-02-25 18:48 UTC (permalink / raw)
  To: cluster-devel.redhat.com

----- Original Message -----
> >> once we change the journal format, in addition to recording block numbers
> >> as extents, there are some additional issues we should address at the same
> >> time:

One thing I've always thought we should improve upon was the way we manage
our bitmaps. Right now, if you allocate or free a block, unless it's on the
first block of the rgrp, we need to write two blocks: (1) One for the bitmap
that needs to change and, (2) Another for the rgrp to adjust its allocated and
free numbers. The rgrplvb code will make this faster, but it would be nice if
we somehow kept "version 2" bitmaps such that each keeps its own statistics.

That way we only need to journal and write the affected bitmap, and not
necessarily its rgrp block as well. I could see us keep separate glocks
for each bitmap, for example, and allowing multiple nodes to work on the
same portion of the file system, but on unique bitmaps.
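
Roughly what such a "version 2" bitmap block header might carry (field names
and sizes are illustrative only, not a worked-out on-disk format):

#include <stdint.h>

struct bitmap_hdr_v2 {
    uint8_t  meta_header[24];   /* struct gfs2_meta_header */
    uint32_t bi_free;           /* free blocks covered by this bitmap */
    uint32_t bi_used_data;      /* allocated data blocks */
    uint32_t bi_used_meta;      /* allocated metadata blocks */
    uint32_t bi_dinodes;        /* allocated dinodes */
    /* bitmap data follows to the end of the block */
};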

Bob




* [Cluster-devel] Recording extents in GFS2
  2021-02-25 18:48             ` Bob Peterson
@ 2021-02-25 19:08               ` Andreas Gruenbacher
  2021-02-25 19:45                 ` Bob Peterson
  0 siblings, 1 reply; 13+ messages in thread
From: Andreas Gruenbacher @ 2021-02-25 19:08 UTC (permalink / raw)
  To: cluster-devel.redhat.com

On Thu, Feb 25, 2021 at 7:48 PM Bob Peterson <rpeterso@redhat.com> wrote:

> ----- Original Message -----
> > >> once we change the journal format, in addition to recording block
> numbers
> > >> as extents, there are some additional issues we should address at the
> same
> > >> time:
>
> One thing I've always thought we should improve upon was the way we manage
> our bitmaps. Right now, if you allocate or free a block, unless it's on the
> first block of the rgrp, we need to write two blocks: (1) One for the
> bitmap
> that needs to change and, (2) Another for the rgrp to adjust its allocated
> and
> free numbers. The rgrplvb code will make this faster, but it would be nice
> if
> we would somehow keep "version 2" bitmaps such that each keeps its own
> statistics.
>
> That way we only need to journal and write the affected bitmap, and not
> necessarily its rgrp block as well. I could see us keep separate glocks
> for each bitmap, for example, and allowing multiple nodes to work on the
> same portion of the file system, but on unique bitmaps.
>

On the other hand, we currently only need to look at the first block of
each resource group to figure out if a resource group is suitable for an
allocation. If we move that information into the bitmap blocks, we'll have
to look at each of those blocks instead. That's not going to improve our
performance.

Andreas


* [Cluster-devel] Recording extents in GFS2
  2021-02-25 19:08               ` Andreas Gruenbacher
@ 2021-02-25 19:45                 ` Bob Peterson
  0 siblings, 0 replies; 13+ messages in thread
From: Bob Peterson @ 2021-02-25 19:45 UTC (permalink / raw)
  To: cluster-devel.redhat.com

----- Original Message -----
> On Thu, Feb 25, 2021 at 7:48 PM Bob Peterson <rpeterso@redhat.com> wrote:
> 
> > ----- Original Message -----
> > > >> once we change the journal format, in addition to recording block
> > numbers
> > > >> as extents, there are some additional issues we should address at the
> > same
> > > >> time:
> >
> > One thing I've always thought we should improve upon was the way we manage
> > our bitmaps. Right now, if you allocate or free a block, unless it's on the
> > first block of the rgrp, we need to write two blocks: (1) One for the
> > bitmap
> > that needs to change and, (2) Another for the rgrp to adjust its allocated
> > and
> > free numbers. The rgrplvb code will make this faster, but it would be nice
> > if
> > we would somehow keep "version 2" bitmaps such that each keeps its own
> > statistics.
> >
> > That way we only need to journal and write the affected bitmap, and not
> > necessarily its rgrp block as well. I could see us keep separate glocks
> > for each bitmap, for example, and allowing multiple nodes to work on the
> > same portion of the file system, but on unique bitmaps.
> >
> 
> On the other hand, we currently only need to look at the first block of
> each resource group to figure out if a resource group is suitable for an
> allocation. If we move that information into the bitmap blocks, we'll have
> to look at each of those blocks instead. That's not going to improve our
> performance.
> 
> Andreas
> 
The LVBs will shield us from that, but they won't shield us from having
to rewrite 2 blocks instead of 1.

Bob




* [Cluster-devel] Recording extents in GFS2
  2021-02-02 17:35   ` Steven Whitehouse
  2021-02-20  9:48     ` Andreas Gruenbacher
@ 2021-03-01 17:53     ` Andreas Gruenbacher
  1 sibling, 0 replies; 13+ messages in thread
From: Andreas Gruenbacher @ 2021-03-01 17:53 UTC (permalink / raw)
  To: cluster-devel.redhat.com

On Tue, Feb 2, 2021 at 6:35 PM Steven Whitehouse <swhiteho@redhat.com>
wrote:

> Hi,
> On 24/01/2021 06:44, Abhijith Das wrote:
>
> Hi all,
>
> I've been looking at rgrp.c:gfs2_alloc_blocks(), which is called from
> various places to allocate single/multiple blocks for inodes. I've come up
> with some data structures to accomplish recording of these allocations as
> extents.
>
> I'm proposing we add a new metadata type for journal blocks that will hold
> these extent records.
>
> GFS2_METATYPE_EX 15 /* New metadata type for a block that will hold
> extents */
>
> This structure below will be at the start of the block, followed by a
> number of alloc_ext structures.
>
> struct gfs2_extents { /* This structure is 32 bytes long */
>     struct gfs2_meta_header ex_header;
>     __be32 ex_count; /* count of number of alloc_ext structs that follow
> this header. */
>     __be32 __pad;
> };
> /* flags for the alloc_ext struct */
> #define AE_FL_XXX
>
> struct alloc_ext { /* This structure is 48 bytes long */
>     struct gfs2_inum ae_num; /* The inode this allocation/deallocation
> belongs to */
>     __be32 ae_flags; /* specifies if we're allocating/deallocating,
> data/metadata, etc. */
>     __be64 ae_start; /* starting physical block number of the extent */
>     __be64 ae_len;   /* length of the extent */
>     __be32 ae_uid;   /* user this belongs to, for quota accounting */
>     __be32 ae_gid;   /* group this belongs to, for quota accounting */
>     __be32 __pad;
> };
>
> The gfs2_inum structure is a bit OTT for this I think. A single 64 bit
> inode number should be enough? Also, it is quite likely we may have
> multiple extents for the same inode... so should we split this into two so
> we can have something like this? It is more complicated, but should save
> space in the average case.
>
> struct alloc_hdr {
>
>     __be64 inum;
>
>     __be32 uid; /* This is duplicated from the inode... various options
> here depending on whether we think this is something we should do. Should
> we also consider logging chown using this structure? We will have to
> carefully check chown sequence wrt to allocations/deallocations for quota
> purposes */
>
>     __be32 gid;
>
>     __u8 num_extents; /* Never likely to have huge numbers of extents per
> header, due to block size! */
>
>     /* padding... or is there something else we could/should add here? */
>
> };
>
> followed by num_extents copies of:
>
> struct alloc_extent {
>
>     __be64 phys_start;
>
>     __be64 logical_start; /* Do we need a logical & physical start? Maybe
> we don't care about the logical start? */
>
>     __be32 length; /* Max extent length is limited by rgrp length... only
> need 32 bits */
>
>     __be32 flags; /* Can we support unwritten, zero extents with this?
> Need to indicate alloc/free/zero, data/metadata */
>
> };
>
We're trying to keep allocations relatively close together and within the
same resource group, so to store extent lists more compactly, we could
store the first extent's start address absolutely, and the start of each
successive extent within range as a signed 32-bit number relative to that.
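
For illustration (names and field sizes invented), such a relative encoding
might look like:

#include <stdint.h>

struct ext_list_hdr {
    uint64_t base;              /* absolute start of the first extent */
    uint32_t count;             /* number of rel_extent entries that follow */
    uint32_t pad;
};

struct rel_extent {
    int32_t  delta;             /* start = base + delta; signed, so extents
                                   before and after the base both work */
    uint32_t len;               /* length in blocks */
};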

> Just wondering if there is also some shorthand we might be able to use in
> case we have multiple extents all separated by either one metadata block,
> or a very small number of metadata blocks (which will be the case for
> streaming writes). Again it increases the complexity, but will likely
> reduce the amount we have to write into the new journal blocks quite a lot.
> Not much point having a 32 bit length, but never filling it with a value
> above 509 (4k block size)...
>
The current allocator fills at most one indirect block before allocating
the next indirect block(s), which is why we end up with the above described
pattern. Once we switch to extent-based inodes, we won't be allocating
indirect blocks anymore, so we also won't end up with those chopped-up
extents anymore. There will be the occasional node split in the inode
extent tree, but that will be a much less frequent occurrence, and it won't
happen when extending an existing extent. Delayed allocation would further
improve the on-disk allocation patterns. On the other hand, we'll end up
with more overhead when files are highly fragmented.

As long as we're only storing extents in the journal, I don't think those
509-block chunks are a problem; we'll still end up with more compact
metadata for mostly-contiguous files. We'll do much worse for test cases
that write every other block, for example.

> With 4k block sizes, we can fit 84 extents (10 for 512b, 20 for 1k, 42
> for 2k block sizes) in one block. As we process more allocs/deallocs, we
> keep creating more such alloc_ext records and tack them to the back of this
> block if there's space or else create a new block. For smaller extents,
> this might not be efficient, so we might just want to revert to the old
> method of recording the bitmap blocks instead.
> During journal replay, we decode these new blocks and flip the
> corresponding bitmaps for each of the blocks represented in the extents.
> For the ones where we just recorded the bitmap blocks the old-fashioned
> way, we also replay them the old-fashioned way. This way we're also
> backward compatible with an older version of gfs2 that only records the
> bitmaps.
> Since we record the uid/gid with each extent, we can do the quota
> accounting without relying on the quota change file. We might need to keep
> the quota change file around for backward compatibility and for the cases
> where we might want to record allocs/deallocs the old-fashioned way.
>
> I'm going to play around with this and come up with some patches to see if
> this works and what kind of performance improvements we get. These data
> structures will mostly likely need reworking and renaming, but this is the
> general direction I'm thinking along.
>
> Please let me know what you think.
>
> Cheers!
> --Abhi
>
> That all sounds good. I'm sure it will take a little while to figure out
> how to get this right,
>
> Steve.
>
Thanks,
Andreas


end of thread, other threads:[~2021-03-01 17:53 UTC | newest]

Thread overview: 13+ messages
2020-12-11 16:38 [Cluster-devel] Recording extents in GFS2 Abhijith Das
2020-12-14 10:46 ` Steven Whitehouse
2021-01-24  6:44 ` Abhijith Das
2021-02-02 15:08   ` Bob Peterson
2021-02-02 17:35   ` Steven Whitehouse
2021-02-20  9:48     ` Andreas Gruenbacher
2021-02-22 10:20       ` Steven Whitehouse
2021-02-22 11:41         ` Andreas Gruenbacher
2021-02-22 13:03           ` Andreas Gruenbacher
2021-02-25 18:48             ` Bob Peterson
2021-02-25 19:08               ` Andreas Gruenbacher
2021-02-25 19:45                 ` Bob Peterson
2021-03-01 17:53     ` Andreas Gruenbacher
