* [Cluster-devel] [PATCH] gfs2: ignore rindex_update failure in dinode_dealloc
@ 2023-05-04 17:43 Bob Peterson
2023-05-05 7:44 ` Andrew Price
0 siblings, 1 reply; 5+ messages in thread
From: Bob Peterson @ 2023-05-04 17:43 UTC (permalink / raw)
To: cluster-devel.redhat.com
Before this patch function gfs2_dinode_dealloc would abort if it got a
bad return code from gfs2_rindex_update. The problem is that it left the
dinode in the unlinked (not free) state, which meant subsequent fsck
would clean it up and flag an error. That meant some of our QE tests
would fail.
The sole purpose of gfs2_rindex_update, in this code path, is to read in
any newer rgrps added by gfs2_grow. But since this is a delete operation
it won't actually use any of those new rgrps. It can really only twiddle
the bits from "Unlinked" to "Free" in an existing rgrp. Therefore the
error should not prevent the transition from unlinked to free.
This patch makes gfs2_dinode_dealloc ignore the bad return code and
proceed with freeing the dinode so the QE tests will not be tripped up.
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
---
fs/gfs2/super.c | 4 +---
1 file changed, 1 insertion(+), 3 deletions(-)
diff --git a/fs/gfs2/super.c b/fs/gfs2/super.c
index d3b5c6278be0..1f23d7845123 100644
--- a/fs/gfs2/super.c
+++ b/fs/gfs2/super.c
@@ -1131,9 +1131,7 @@ static int gfs2_dinode_dealloc(struct gfs2_inode *ip)
return -EIO;
}
- error = gfs2_rindex_update(sdp);
- if (error)
- return error;
+ gfs2_rindex_update(sdp);
error = gfs2_quota_hold(ip, NO_UID_QUOTA_CHANGE, NO_GID_QUOTA_CHANGE);
if (error)
--
2.40.0
^ permalink raw reply related [flat|nested] 5+ messages in thread
* [Cluster-devel] [PATCH] gfs2: ignore rindex_update failure in dinode_dealloc
2023-05-04 17:43 [Cluster-devel] [PATCH] gfs2: ignore rindex_update failure in dinode_dealloc Bob Peterson
@ 2023-05-05 7:44 ` Andrew Price
2023-05-05 8:49 ` Steven Whitehouse
2023-05-05 12:29 ` Bob Peterson
0 siblings, 2 replies; 5+ messages in thread
From: Andrew Price @ 2023-05-05 7:44 UTC (permalink / raw)
To: cluster-devel.redhat.com
Hi Bob,
On 04/05/2023 18:43, Bob Peterson wrote:
> Before this patch function gfs2_dinode_dealloc would abort if it got a
> bad return code from gfs2_rindex_update. The problem is that it left the
> dinode in the unlinked (not free) state, which meant subsequent fsck
> would clean it up and flag an error. That meant some of our QE tests
> would fail.
As I understand it the test is an interrupted rename loop workload and
gfs2_grow at the same time, and the bad return code is -EINTR, right?
> The sole purpose of gfs2_rindex_update, in this code path, is to read in
> any newer rgrps added by gfs2_grow. But since this is a delete operation
> it won't actually use any of those new rgrps. It can really only twiddle
> the bits from "Unlinked" to "Free" in an existing rgrp. Therefore the
> error should not prevent the transition from unlinked to free.
>
> This patch makes gfs2_dinode_dealloc ignore the bad return code and
> proceed with freeing the dinode so the QE tests will not be tripped up.
Is it really ok to ignore all potential errors here? I wonder if it
should just ignore -EINTR (or whichever error the test produces) so that
it can still fail well for errors like -EIO.
Cheers,
Andy
>
> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
> ---
> fs/gfs2/super.c | 4 +---
> 1 file changed, 1 insertion(+), 3 deletions(-)
>
> diff --git a/fs/gfs2/super.c b/fs/gfs2/super.c
> index d3b5c6278be0..1f23d7845123 100644
> --- a/fs/gfs2/super.c
> +++ b/fs/gfs2/super.c
> @@ -1131,9 +1131,7 @@ static int gfs2_dinode_dealloc(struct gfs2_inode *ip)
> return -EIO;
> }
>
> - error = gfs2_rindex_update(sdp);
> - if (error)
> - return error;
> + gfs2_rindex_update(sdp);
>
> error = gfs2_quota_hold(ip, NO_UID_QUOTA_CHANGE, NO_GID_QUOTA_CHANGE);
> if (error)
^ permalink raw reply [flat|nested] 5+ messages in thread
* [Cluster-devel] [PATCH] gfs2: ignore rindex_update failure in dinode_dealloc
2023-05-05 7:44 ` Andrew Price
@ 2023-05-05 8:49 ` Steven Whitehouse
2023-05-05 12:46 ` Bob Peterson
2023-05-05 12:29 ` Bob Peterson
1 sibling, 1 reply; 5+ messages in thread
From: Steven Whitehouse @ 2023-05-05 8:49 UTC (permalink / raw)
To: cluster-devel.redhat.com
On Fri, 2023-05-05 at 08:44 +0100, Andrew Price wrote:
> Hi Bob,
>
> On 04/05/2023 18:43, Bob Peterson wrote:
> > Before this patch function gfs2_dinode_dealloc would abort if it
> > got a
> > bad return code from gfs2_rindex_update. The problem is that it
> > left the
> > dinode in the unlinked (not free) state, which meant subsequent
> > fsck
> > would clean it up and flag an error. That meant some of our QE
> > tests
> > would fail.
>
> As I understand it the test is an interrupted rename loop workload
> and
> gfs2_grow at the same time, and the bad return code is -EINTR, right?
>
> > The sole purpose of gfs2_rindex_update, in this code path, is to
> > read in
> > any newer rgrps added by gfs2_grow. But since this is a delete
> > operation
> > it won't actually use any of those new rgrps. It can really only
> > twiddle
> > the bits from "Unlinked" to "Free" in an existing rgrp. Therefore
> > the
> > error should not prevent the transition from unlinked to free.
> >
> > This patch makes gfs2_dinode_dealloc ignore the bad return code and
> > proceed with freeing the dinode so the QE tests will not be tripped
> > up.
>
> Is it really ok to ignore all potential errors here? I wonder if it
> should just ignore -EINTR (or whichever error the test produces) so
> that
> it can still fail well for errors like -EIO.
>
> Cheers,
> Andy
>
Perhaps the more important question is why there are errors there in
the first place?
Steve.
> >
> > Signed-off-by: Bob Peterson <rpeterso@redhat.com>
> > ---
> > ? fs/gfs2/super.c | 4 +---
> > ? 1 file changed, 1 insertion(+), 3 deletions(-)
> >
> > diff --git a/fs/gfs2/super.c b/fs/gfs2/super.c
> > index d3b5c6278be0..1f23d7845123 100644
> > --- a/fs/gfs2/super.c
> > +++ b/fs/gfs2/super.c
> > @@ -1131,9 +1131,7 @@ static int gfs2_dinode_dealloc(struct
> > gfs2_inode *ip)
> > ????????????????return -EIO;
> > ????????}
> > ?
> > -???????error = gfs2_rindex_update(sdp);
> > -???????if (error)
> > -???????????????return error;
> > +???????gfs2_rindex_update(sdp);
>
>
>
> > ?
> > ????????error = gfs2_quota_hold(ip, NO_UID_QUOTA_CHANGE,
> > NO_GID_QUOTA_CHANGE);
> > ????????if (error)
>
^ permalink raw reply [flat|nested] 5+ messages in thread
* [Cluster-devel] [PATCH] gfs2: ignore rindex_update failure in dinode_dealloc
2023-05-05 7:44 ` Andrew Price
2023-05-05 8:49 ` Steven Whitehouse
@ 2023-05-05 12:29 ` Bob Peterson
1 sibling, 0 replies; 5+ messages in thread
From: Bob Peterson @ 2023-05-05 12:29 UTC (permalink / raw)
To: cluster-devel.redhat.com
Hi Andy,
On 5/5/23 3:44 AM, Andrew Price wrote:
> Hi Bob,
>
> On 04/05/2023 18:43, Bob Peterson wrote:
>> Before this patch function gfs2_dinode_dealloc would abort if it got a
>> bad return code from gfs2_rindex_update. The problem is that it left the
>> dinode in the unlinked (not free) state, which meant subsequent fsck
>> would clean it up and flag an error. That meant some of our QE tests
>> would fail.
>
> As I understand it the test is an interrupted rename loop workload and
> gfs2_grow at the same time, and the bad return code is -EINTR, right?
Correct.
>> The sole purpose of gfs2_rindex_update, in this code path, is to read in
>> any newer rgrps added by gfs2_grow. But since this is a delete operation
>> it won't actually use any of those new rgrps. It can really only twiddle
>> the bits from "Unlinked" to "Free" in an existing rgrp. Therefore the
>> error should not prevent the transition from unlinked to free.
>>
>> This patch makes gfs2_dinode_dealloc ignore the bad return code and
>> proceed with freeing the dinode so the QE tests will not be tripped up.
>
> Is it really ok to ignore all potential errors here? I wonder if it
> should just ignore -EINTR (or whichever error the test produces) so that
> it can still fail well for errors like -EIO.
Good question.
The call to gfs2_rindex_update is really not even needed in
gfs2_dinode_dealloc because this is the last stage of the delete where
we are freeing the dinode itself. I've even considered removing the call
altogether. So to fail the operation for such an inconsequential
action's failure seems like throwing the proverbial baby out with the
bath water.
Maybe we should just remove the call to gfs2_rindex_update altogether
and delegate it to earlier parts of the evict/delete process.
The original intent of calling gfs2_rindex_update in the evict/delete
sequence was to ensure we have the newest resource groups from gfs2_grow
because any file being evicted may have references to the new rgrps
created by gfs2_grow that need to be freed, even if the dinode itself
resides in an old rgrp. This is pretty much true for all parts of the
process that evicts deleted dinodes except for gfs2_dinode_dealloc
itself. For example, a new dinode might have an eattr, indirect block,
data block, or whatever, in one of the new rgrps added by gfs2_grow.
However, since the inode was created/instantiated (which must be true in
order for it to be evicted), the dinode itself must reside in a
previously instantiated rgrp, and therefore the call to
gfs2_rindex_update is not needed at all.
So if the call to it fails, imho, it shouldn't fail the rest of the
gfs2_dinode_dealloc, regardless of the failure.
The next question you may ask is: why don't we get the -EINTR when
reading in new rgrps for the purposes of deleting other parts of the
file, its eattrs, indirect blocks, data blocks, etc.? The answer is: I
don't know, but I suspect we have other bugs lurking in that area. I
suspect if we try hard enough we can create other problems in which the
punch_hole code doesn't read in new rgrps.
It may be tempting to think that this also cannot happen because the
rgrps must also be instantiated for any eattrs, metadata, data to be
assigned to the dinode being evicted/deleted. But that's non-clustered
file system thinking.
In gfs2, it is possible for one cluster node to read in new rgrps from
gfs2_grow, then assign those blocks to a new dinode that's already open
on a different node, then delete that file, causing a second cluster
node to evict, and try to reference those new blocks before the new
rgrps are read in. So we need to be very careful.
We should probably spend some time trying to force these conditions to
see if we can flush out more bugs.
For some reason, with this test, we only see this particular problem
with gfs2_dinode_dealloc, and that's the problem I'm trying to fix with
the patch.
Regards,
Bob Peterson
^ permalink raw reply [flat|nested] 5+ messages in thread
* [Cluster-devel] [PATCH] gfs2: ignore rindex_update failure in dinode_dealloc
2023-05-05 8:49 ` Steven Whitehouse
@ 2023-05-05 12:46 ` Bob Peterson
0 siblings, 0 replies; 5+ messages in thread
From: Bob Peterson @ 2023-05-05 12:46 UTC (permalink / raw)
To: cluster-devel.redhat.com
On 5/5/23 4:49 AM, Steven Whitehouse wrote:
> On Fri, 2023-05-05 at 08:44 +0100, Andrew Price wrote:
>>> This patch makes gfs2_dinode_dealloc ignore the bad return code and
>>> proceed with freeing the dinode so the QE tests will not be tripped
>>> up.
>>
>> Is it really ok to ignore all potential errors here? I wonder if it
>> should just ignore -EINTR (or whichever error the test produces) so
>> that
>> it can still fail well for errors like -EIO.
>>
>> Cheers,
>> Andy
>>
> Perhaps the more important question is why there are errors there in
> the first place?
Well, errors can be returned by gfs2_rindex_update for a number of
reasons. The -EINTR is just one, when the test case is interrupted due
to hitting its time limit. An -EIO may be caused by IO errors reading
the storage media, pulled cables, failed drivers or whatever. But as I
said in my reply to Andy's email, it's basically inconsequential to this
code path. Errors like -EIO may prevent further reads or writes to the
file system, its metadata, etc., in other places but that should be
flagged up by other processes and hopefully gfs2's recovery, journal
replays, etc., will ensure file system integrity.
Regards,
Bob Peterson
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2023-05-05 12:46 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-05-04 17:43 [Cluster-devel] [PATCH] gfs2: ignore rindex_update failure in dinode_dealloc Bob Peterson
2023-05-05 7:44 ` Andrew Price
2023-05-05 8:49 ` Steven Whitehouse
2023-05-05 12:46 ` Bob Peterson
2023-05-05 12:29 ` Bob Peterson
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.