All of lore.kernel.org
 help / color / mirror / Atom feed
* [Cluster-devel] [PATCH] gfs2: ignore rindex_update failure in dinode_dealloc
@ 2023-05-04 17:43 Bob Peterson
  2023-05-05  7:44 ` Andrew Price
  0 siblings, 1 reply; 5+ messages in thread
From: Bob Peterson @ 2023-05-04 17:43 UTC (permalink / raw)
  To: cluster-devel.redhat.com

Before this patch function gfs2_dinode_dealloc would abort if it got a
bad return code from gfs2_rindex_update. The problem is that it left the
dinode in the unlinked (not free) state, which meant subsequent fsck
would clean it up and flag an error. That meant some of our QE tests
would fail.

The sole purpose of gfs2_rindex_update, in this code path, is to read in
any newer rgrps added by gfs2_grow. But since this is a delete operation
it won't actually use any of those new rgrps. It can really only twiddle
the bits from "Unlinked" to "Free" in an existing rgrp. Therefore the
error should not prevent the transition from unlinked to free.

This patch makes gfs2_dinode_dealloc ignore the bad return code and
proceed with freeing the dinode so the QE tests will not be tripped up.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
---
 fs/gfs2/super.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/fs/gfs2/super.c b/fs/gfs2/super.c
index d3b5c6278be0..1f23d7845123 100644
--- a/fs/gfs2/super.c
+++ b/fs/gfs2/super.c
@@ -1131,9 +1131,7 @@ static int gfs2_dinode_dealloc(struct gfs2_inode *ip)
 		return -EIO;
 	}
 
-	error = gfs2_rindex_update(sdp);
-	if (error)
-		return error;
+	gfs2_rindex_update(sdp);
 
 	error = gfs2_quota_hold(ip, NO_UID_QUOTA_CHANGE, NO_GID_QUOTA_CHANGE);
 	if (error)
-- 
2.40.0


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [Cluster-devel] [PATCH] gfs2: ignore rindex_update failure in dinode_dealloc
  2023-05-04 17:43 [Cluster-devel] [PATCH] gfs2: ignore rindex_update failure in dinode_dealloc Bob Peterson
@ 2023-05-05  7:44 ` Andrew Price
  2023-05-05  8:49   ` Steven Whitehouse
  2023-05-05 12:29   ` Bob Peterson
  0 siblings, 2 replies; 5+ messages in thread
From: Andrew Price @ 2023-05-05  7:44 UTC (permalink / raw)
  To: cluster-devel.redhat.com

Hi Bob,

On 04/05/2023 18:43, Bob Peterson wrote:
> Before this patch function gfs2_dinode_dealloc would abort if it got a
> bad return code from gfs2_rindex_update. The problem is that it left the
> dinode in the unlinked (not free) state, which meant subsequent fsck
> would clean it up and flag an error. That meant some of our QE tests
> would fail.

As I understand it the test is an interrupted rename loop workload and 
gfs2_grow at the same time, and the bad return code is -EINTR, right?

> The sole purpose of gfs2_rindex_update, in this code path, is to read in
> any newer rgrps added by gfs2_grow. But since this is a delete operation
> it won't actually use any of those new rgrps. It can really only twiddle
> the bits from "Unlinked" to "Free" in an existing rgrp. Therefore the
> error should not prevent the transition from unlinked to free.
> 
> This patch makes gfs2_dinode_dealloc ignore the bad return code and
> proceed with freeing the dinode so the QE tests will not be tripped up.

Is it really ok to ignore all potential errors here? I wonder if it 
should just ignore -EINTR (or whichever error the test produces) so that 
it can still fail well for errors like -EIO.

Cheers,
Andy

> 
> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
> ---
>   fs/gfs2/super.c | 4 +---
>   1 file changed, 1 insertion(+), 3 deletions(-)
> 
> diff --git a/fs/gfs2/super.c b/fs/gfs2/super.c
> index d3b5c6278be0..1f23d7845123 100644
> --- a/fs/gfs2/super.c
> +++ b/fs/gfs2/super.c
> @@ -1131,9 +1131,7 @@ static int gfs2_dinode_dealloc(struct gfs2_inode *ip)
>   		return -EIO;
>   	}
>   
> -	error = gfs2_rindex_update(sdp);
> -	if (error)
> -		return error;
> +	gfs2_rindex_update(sdp);



>   
>   	error = gfs2_quota_hold(ip, NO_UID_QUOTA_CHANGE, NO_GID_QUOTA_CHANGE);
>   	if (error)


^ permalink raw reply	[flat|nested] 5+ messages in thread

* [Cluster-devel] [PATCH] gfs2: ignore rindex_update failure in dinode_dealloc
  2023-05-05  7:44 ` Andrew Price
@ 2023-05-05  8:49   ` Steven Whitehouse
  2023-05-05 12:46     ` Bob Peterson
  2023-05-05 12:29   ` Bob Peterson
  1 sibling, 1 reply; 5+ messages in thread
From: Steven Whitehouse @ 2023-05-05  8:49 UTC (permalink / raw)
  To: cluster-devel.redhat.com

On Fri, 2023-05-05 at 08:44 +0100, Andrew Price wrote:
> Hi Bob,
> 
> On 04/05/2023 18:43, Bob Peterson wrote:
> > Before this patch function gfs2_dinode_dealloc would abort if it
> > got a
> > bad return code from gfs2_rindex_update. The problem is that it
> > left the
> > dinode in the unlinked (not free) state, which meant subsequent
> > fsck
> > would clean it up and flag an error. That meant some of our QE
> > tests
> > would fail.
> 
> As I understand it the test is an interrupted rename loop workload
> and 
> gfs2_grow at the same time, and the bad return code is -EINTR, right?
> 
> > The sole purpose of gfs2_rindex_update, in this code path, is to
> > read in
> > any newer rgrps added by gfs2_grow. But since this is a delete
> > operation
> > it won't actually use any of those new rgrps. It can really only
> > twiddle
> > the bits from "Unlinked" to "Free" in an existing rgrp. Therefore
> > the
> > error should not prevent the transition from unlinked to free.
> > 
> > This patch makes gfs2_dinode_dealloc ignore the bad return code and
> > proceed with freeing the dinode so the QE tests will not be tripped
> > up.
> 
> Is it really ok to ignore all potential errors here? I wonder if it 
> should just ignore -EINTR (or whichever error the test produces) so
> that 
> it can still fail well for errors like -EIO.
> 
> Cheers,
> Andy
> 
Perhaps the more important question is why there are errors there in
the first place?

Steve.

> > 
> > Signed-off-by: Bob Peterson <rpeterso@redhat.com>
> > ---
> > ? fs/gfs2/super.c | 4 +---
> > ? 1 file changed, 1 insertion(+), 3 deletions(-)
> > 
> > diff --git a/fs/gfs2/super.c b/fs/gfs2/super.c
> > index d3b5c6278be0..1f23d7845123 100644
> > --- a/fs/gfs2/super.c
> > +++ b/fs/gfs2/super.c
> > @@ -1131,9 +1131,7 @@ static int gfs2_dinode_dealloc(struct
> > gfs2_inode *ip)
> > ????????????????return -EIO;
> > ????????}
> > ? 
> > -???????error = gfs2_rindex_update(sdp);
> > -???????if (error)
> > -???????????????return error;
> > +???????gfs2_rindex_update(sdp);
> 
> 
> 
> > ? 
> > ????????error = gfs2_quota_hold(ip, NO_UID_QUOTA_CHANGE,
> > NO_GID_QUOTA_CHANGE);
> > ????????if (error)
> 


^ permalink raw reply	[flat|nested] 5+ messages in thread

* [Cluster-devel] [PATCH] gfs2: ignore rindex_update failure in dinode_dealloc
  2023-05-05  7:44 ` Andrew Price
  2023-05-05  8:49   ` Steven Whitehouse
@ 2023-05-05 12:29   ` Bob Peterson
  1 sibling, 0 replies; 5+ messages in thread
From: Bob Peterson @ 2023-05-05 12:29 UTC (permalink / raw)
  To: cluster-devel.redhat.com

Hi Andy,

On 5/5/23 3:44 AM, Andrew Price wrote:
> Hi Bob,
> 
> On 04/05/2023 18:43, Bob Peterson wrote:
>> Before this patch function gfs2_dinode_dealloc would abort if it got a
>> bad return code from gfs2_rindex_update. The problem is that it left the
>> dinode in the unlinked (not free) state, which meant subsequent fsck
>> would clean it up and flag an error. That meant some of our QE tests
>> would fail.
> 
> As I understand it the test is an interrupted rename loop workload and 
> gfs2_grow at the same time, and the bad return code is -EINTR, right?

Correct.

>> The sole purpose of gfs2_rindex_update, in this code path, is to read in
>> any newer rgrps added by gfs2_grow. But since this is a delete operation
>> it won't actually use any of those new rgrps. It can really only twiddle
>> the bits from "Unlinked" to "Free" in an existing rgrp. Therefore the
>> error should not prevent the transition from unlinked to free.
>>
>> This patch makes gfs2_dinode_dealloc ignore the bad return code and
>> proceed with freeing the dinode so the QE tests will not be tripped up.
> 
> Is it really ok to ignore all potential errors here? I wonder if it 
> should just ignore -EINTR (or whichever error the test produces) so that 
> it can still fail well for errors like -EIO.

Good question.

The call to gfs2_rindex_update is really not even needed in 
gfs2_dinode_dealloc because this is the last stage of the delete where 
we are freeing the dinode itself. I've even considered removing the call 
altogether. So to fail the operation for such an inconsequential 
action's failure seems like throwing the proverbial baby out with the 
bath water.

Maybe we should just remove the call to gfs2_rindex_update altogether 
and delegate it to earlier parts of the evict/delete process.

The original intent of calling gfs2_rindex_update in the evict/delete 
sequence was to ensure we have the newest resource groups from gfs2_grow 
because any file being evicted may have references to the new rgrps 
created by gfs2_grow that need to be freed, even if the dinode itself 
resides in an old rgrp. This is pretty much true for all parts of the 
process that evicts deleted dinodes except for gfs2_dinode_dealloc 
itself. For example, a new dinode might have an eattr, indirect block, 
data block, or whatever, in one of the new rgrps added by gfs2_grow.

However, since the inode was created/instantiated (which must be true in 
order for it to be evicted), the dinode itself must reside in a 
previously instantiated rgrp, and therefore the call to 
gfs2_rindex_update is not needed at all.

So if the call to it fails, imho, it shouldn't fail the rest of the 
gfs2_dinode_dealloc, regardless of the failure.

The next question you may ask is: why don't we get the -EINTR when 
reading in new rgrps for the purposes of deleting other parts of the 
file, its eattrs, indirect blocks, data blocks, etc.? The answer is: I 
don't know, but I suspect we have other bugs lurking in that area. I 
suspect if we try hard enough we can create other problems in which the 
punch_hole code doesn't read in new rgrps.

It may be tempting to think that this also cannot happen because the 
rgrps must also be instantiated for any eattrs, metadata, data to be 
assigned to the dinode being evicted/deleted. But that's non-clustered 
file system thinking.

In gfs2, it is possible for one cluster node to read in new rgrps from 
gfs2_grow, then assign those blocks to a new dinode that's already open 
on a different node, then delete that file, causing a second cluster 
node to evict, and try to reference those new blocks before the new 
rgrps are read in. So we need to be very careful.

We should probably spend some time trying to force these conditions to 
see if we can flush out more bugs.

For some reason, with this test, we only see this particular problem 
with gfs2_dinode_dealloc, and that's the problem I'm trying to fix with 
the patch.

Regards,

Bob Peterson


^ permalink raw reply	[flat|nested] 5+ messages in thread

* [Cluster-devel] [PATCH] gfs2: ignore rindex_update failure in dinode_dealloc
  2023-05-05  8:49   ` Steven Whitehouse
@ 2023-05-05 12:46     ` Bob Peterson
  0 siblings, 0 replies; 5+ messages in thread
From: Bob Peterson @ 2023-05-05 12:46 UTC (permalink / raw)
  To: cluster-devel.redhat.com

On 5/5/23 4:49 AM, Steven Whitehouse wrote:
> On Fri, 2023-05-05 at 08:44 +0100, Andrew Price wrote:
>>> This patch makes gfs2_dinode_dealloc ignore the bad return code and
>>> proceed with freeing the dinode so the QE tests will not be tripped
>>> up.
>>
>> Is it really ok to ignore all potential errors here? I wonder if it
>> should just ignore -EINTR (or whichever error the test produces) so
>> that
>> it can still fail well for errors like -EIO.
>>
>> Cheers,
>> Andy
>>
> Perhaps the more important question is why there are errors there in
> the first place?

Well, errors can be returned by gfs2_rindex_update for a number of 
reasons. The -EINTR is just one, when the test case is interrupted due 
to hitting its time limit. An -EIO may be caused by IO errors reading 
the storage media, pulled cables, failed drivers or whatever. But as I 
said in my reply to Andy's email, it's basically inconsequential to this 
code path. Errors like -EIO may prevent further reads or writes to the 
file system, its metadata, etc., in other places but that should be 
flagged up by other processes and hopefully gfs2's recovery, journal 
replays, etc., will ensure file system integrity.

Regards,

Bob Peterson


^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2023-05-05 12:46 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-05-04 17:43 [Cluster-devel] [PATCH] gfs2: ignore rindex_update failure in dinode_dealloc Bob Peterson
2023-05-05  7:44 ` Andrew Price
2023-05-05  8:49   ` Steven Whitehouse
2023-05-05 12:46     ` Bob Peterson
2023-05-05 12:29   ` Bob Peterson

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.