All of lore.kernel.org
 help / color / mirror / Atom feed
* [Cluster-devel] [GFS2 PATCH] gfs2: truncate glock address space pages during evict
       [not found] <454279735.16583317.1599751575623.JavaMail.zimbra@redhat.com>
@ 2020-09-10 15:26 ` Bob Peterson
  2020-09-11 12:58   ` Andreas Gruenbacher
  0 siblings, 1 reply; 4+ messages in thread
From: Bob Peterson @ 2020-09-10 15:26 UTC (permalink / raw)
  To: cluster-devel.redhat.com

Hi,

Before this patch, function gfs2_evict_inode would truncate the pages
for the inode's address space, but the address space associated with its
glock was only truncated if the inode was deleted. In that case, it
needs to do the truncate within a transaction because of this calling
sequence:

truncate_inode_pages
   truncate_inode_pages_range
      gfs2_invalidatepage
         gfs2_discard
            gfs2_remove_from_journal
               gfs2_trans_add_revoke <------- Needs a transaction

If, however, the inode is not deleted, it might still have pages for
the glock's address space because gfs2_getbuf() reads them in that way.
This may happen, for example, if a file is unlinked by another node and
this node discovers that fact by reading the dinode using its glock
during the process of doing delete_work_func -> create -> put -> evict.
But if the inode is not deleted, the truncate_inode_pages for the glock
address space was bypassed.

The problem is, in some circumstances (such as the system quota file during
unmount) an inode can have pages still queued to its glock's address space.
That causes a kernel panic when __gfs2_glock_put for the inode glock does:

    GLOCK_BUG_ON(gl, mapping && mapping->nrpages && !gfs2_withdrawn(sdp));

At that point, it is too late to start a transaction for the truncate
because the file system is then read-only, so trans_begin will return an
error.

When we free the glock, the pages shouldn't be dirty anyway, so this
patch adds truncate_inode_pages_final for the glock's address space,
metamapping.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
---
 fs/gfs2/super.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/fs/gfs2/super.c b/fs/gfs2/super.c
index 19add2da1013..3ced8d895eca 100644
--- a/fs/gfs2/super.c
+++ b/fs/gfs2/super.c
@@ -1456,6 +1456,10 @@ static void gfs2_evict_inode(struct inode *inode)
 		fs_warn(sdp, "gfs2_evict_inode: %d\n", error);
 out:
 	truncate_inode_pages_final(&inode->i_data);
+	if (ip->i_gl) {
+		metamapping = gfs2_glock2aspace(ip->i_gl);
+		truncate_inode_pages_final(metamapping);
+	}
 	if (ip->i_qadata)
 		gfs2_assert_warn(sdp, ip->i_qadata->qa_ref == 0);
 	gfs2_rs_delete(ip, NULL);



^ permalink raw reply related	[flat|nested] 4+ messages in thread

* [Cluster-devel] [GFS2 PATCH] gfs2: truncate glock address space pages during evict
  2020-09-10 15:26 ` [Cluster-devel] [GFS2 PATCH] gfs2: truncate glock address space pages during evict Bob Peterson
@ 2020-09-11 12:58   ` Andreas Gruenbacher
  2020-09-11 20:08     ` Bob Peterson
  0 siblings, 1 reply; 4+ messages in thread
From: Andreas Gruenbacher @ 2020-09-11 12:58 UTC (permalink / raw)
  To: cluster-devel.redhat.com

Bob,

On Thu, Sep 10, 2020 at 5:26 PM Bob Peterson <rpeterso@redhat.com> wrote:
> Before this patch, function gfs2_evict_inode would truncate the pages
> for the inode's address space, but the address space associated with its
> glock was only truncated if the inode was deleted. In that case, it
> needs to do the truncate within a transaction because of this calling
> sequence:
>
> truncate_inode_pages
>    truncate_inode_pages_range
>       gfs2_invalidatepage
>          gfs2_discard
>             gfs2_remove_from_journal
>                gfs2_trans_add_revoke <------- Needs a transaction
>
> If, however, the inode is not deleted, it might still have pages for
> the glock's address space because gfs2_getbuf() reads them in that way.
> This may happen, for example, if a file is unlinked by another node and
> this node discovers that fact by reading the dinode using its glock
> during the process of doing delete_work_func -> create -> put -> evict.
> But if the inode is not deleted, the truncate_inode_pages for the glock
> address space was bypassed.
>
> The problem is, in some circumstances (such as the system quota file during
> unmount) an inode can have pages still queued to its glock's address space.
> That causes a kernel panic when __gfs2_glock_put for the inode glock does:
>
>     GLOCK_BUG_ON(gl, mapping && mapping->nrpages && !gfs2_withdrawn(sdp));
>
> At that point, it is too late to start a transaction for the truncate
> because the file system is then read-only, so trans_begin will return an
> error.
>
> When we free the glock, the pages shouldn't be dirty anyway, so this
> patch adds truncate_inode_pages_final for the glock's address space,
> metamapping.

not sure why, but this patch breaks xfstests generic/311 and generic/467. Also,

> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
> ---
>  fs/gfs2/super.c | 4 ++++
>  1 file changed, 4 insertions(+)
>
> diff --git a/fs/gfs2/super.c b/fs/gfs2/super.c
> index 19add2da1013..3ced8d895eca 100644
> --- a/fs/gfs2/super.c
> +++ b/fs/gfs2/super.c
> @@ -1456,6 +1456,10 @@ static void gfs2_evict_inode(struct inode *inode)
>                 fs_warn(sdp, "gfs2_evict_inode: %d\n", error);
>  out:
>         truncate_inode_pages_final(&inode->i_data);
> +       if (ip->i_gl) {
> +               metamapping = gfs2_glock2aspace(ip->i_gl);

don't we want to initialize metamapping earlier and also call
truncate_inode_pages_final on it before we detach the glock?

(Mind that truncate_inode_pages_final leaves AS_EXITING set in
metamapping->flags, so we'll also need to do somthing about that.)

> +               truncate_inode_pages_final(metamapping);
> +       }
>         if (ip->i_qadata)
>                 gfs2_assert_warn(sdp, ip->i_qadata->qa_ref == 0);
>         gfs2_rs_delete(ip, NULL);

Thanks,
Andreas



^ permalink raw reply	[flat|nested] 4+ messages in thread

* [Cluster-devel] [GFS2 PATCH] gfs2: truncate glock address space pages during evict
  2020-09-11 12:58   ` Andreas Gruenbacher
@ 2020-09-11 20:08     ` Bob Peterson
  2020-09-11 23:00       ` Andreas Gruenbacher
  0 siblings, 1 reply; 4+ messages in thread
From: Bob Peterson @ 2020-09-11 20:08 UTC (permalink / raw)
  To: cluster-devel.redhat.com

----- Original Message -----
> Bob,
> 
> 
> not sure why, but this patch breaks xfstests generic/311 and generic/467.

Hm, that's odd. Can you tell me how it misbehaves?
fwiw, both 311 and 467 work for me, no hangs, panics, etc.

> Also,
> don't we want to initialize metamapping earlier and also call
> truncate_inode_pages_final on it before we detach the glock?

Yes, this is a great concern now you mention it.
I've created a 5-patch set that refactors function gfs2_evict_inode
to actually make it readable. I'll test and post that next week.
 
> (Mind that truncate_inode_pages_final leaves AS_EXITING set in
> metamapping->flags, so we'll also need to do somthing about that.)

Yes, that is another concern. I'll wait and see if Johannes W. responds
to your query and we'll go from there.

Regards,

Bob Peterson



^ permalink raw reply	[flat|nested] 4+ messages in thread

* [Cluster-devel] [GFS2 PATCH] gfs2: truncate glock address space pages during evict
  2020-09-11 20:08     ` Bob Peterson
@ 2020-09-11 23:00       ` Andreas Gruenbacher
  0 siblings, 0 replies; 4+ messages in thread
From: Andreas Gruenbacher @ 2020-09-11 23:00 UTC (permalink / raw)
  To: cluster-devel.redhat.com

On Fri, Sep 11, 2020 at 10:08 PM Bob Peterson <rpeterso@redhat.com> wrote:
> > not sure why, but this patch breaks xfstests generic/311 and generic/467.
>
> Hm, that's odd. Can you tell me how it misbehaves?
> fwiw, both 311 and 467 work for me, no hangs, panics, etc.

Here's the check output; I'm attaching the full results files as well.
I've been running this on top of for-next (v5.9-rc2-156-g2928eebec009)
as well as a version rebased on top of v5.9-rc4.

generic/311 157s ... [failed, exit status 1]- output mismatch (see
/root/git/xfstests/results//generic/311.out.bad)
    --- tests/generic/311.out    2017-08-16 18:58:42.628827547 +0200
    +++ /root/git/xfstests/results//generic/311.out.bad    2020-09-11
23:07:00.121075324 +0200
    @@ -193,129 +193,7 @@
     ae31d41d825b392bdd6b2453e05ad02e
     Running test 13 buffered, normal suspend
     Random seed is 13
    -63f0ccfc767186236f887e5b25466e7d
    -63f0ccfc767186236f887e5b25466e7d
    -Running test 13 direct, normal suspend
    -Random seed is 13
    ...
    (Run 'diff -u /root/git/xfstests/tests/generic/311.out
/root/git/xfstests/results//generic/311.out.bad'  to see the entire
diff)
generic/467 3s ... - output mismatch (see
/root/git/xfstests/results//generic/467.out.bad)
    --- tests/generic/467.out    2018-02-01 12:50:11.104151533 +0100
    +++ /root/git/xfstests/results//generic/467.out.bad    2020-09-11
23:07:02.955968876 +0200
    @@ -1,9 +1,39 @@
     QA output created by 467
     test_file_handles TEST_DIR/467-dir -dp
     test_file_handles TEST_DIR/467-dir -rp
    +open_by_handle(/mnt/test/467-dir/file000009) returned stale data ''!
    +open_by_handle(/mnt/test/467-dir/file000009) returned stale data ''!
    +open_by_handle(/mnt/test/467-dir/file000009) returned stale data ''!
    +open_by_handle(/mnt/test/467-dir/file000009) returned stale data ''!
    ...
    (Run 'diff -u /root/git/xfstests/tests/generic/467.out
/root/git/xfstests/results//generic/467.out.bad'  to see the entire
diff)

Andreas
-------------- next part --------------
A non-text attachment was scrubbed...
Name: results.tar.gz
Type: application/gzip
Size: 12200 bytes
Desc: not available
URL: <http://listman.redhat.com/archives/cluster-devel/attachments/20200912/fa197b18/attachment.gz>

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2020-09-11 23:00 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <454279735.16583317.1599751575623.JavaMail.zimbra@redhat.com>
2020-09-10 15:26 ` [Cluster-devel] [GFS2 PATCH] gfs2: truncate glock address space pages during evict Bob Peterson
2020-09-11 12:58   ` Andreas Gruenbacher
2020-09-11 20:08     ` Bob Peterson
2020-09-11 23:00       ` Andreas Gruenbacher

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.