From: Luis Henriques <lhenriques@suse.com>
To: "Yan\, Zheng" <ukernel@gmail.com>
Cc: "Yan\, Zheng" <zyan@redhat.com>, Sage Weil <sage@redhat.com>,
	Ilya Dryomov <idryomov@gmail.com>,
	ceph-devel <ceph-devel@vger.kernel.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	stable@vger.kernel.org
Subject: Re: [PATCH] ceph: Fix a memory leak in ci->i_head_snapc
Date: Mon, 18 Mar 2019 16:22:25 +0000	[thread overview]
Message-ID: <87lg1c17b2.fsf@suse.com> (raw)
In-Reply-To: <CAAM7YAkhA6ddto3KESWm8yw3D6f02ev7oYmEto=nK3mXZeUy4A@mail.gmail.com> (Zheng Yan's message of "Mon, 18 Mar 2019 19:42:51 +0800")

"Yan, Zheng" <ukernel@gmail.com> writes:

> On Mon, Mar 18, 2019 at 6:33 PM Luis Henriques <lhenriques@suse.com> wrote:
>>
>> "Yan, Zheng" <ukernel@gmail.com> writes:
>>
>> > On Fri, Mar 15, 2019 at 7:13 PM Luis Henriques <lhenriques@suse.com> wrote:
>> >>
>> >> I'm occasionally seeing a kmemleak warning in xfstest generic/013:
>> >>
>> >> unreferenced object 0xffff8881fccca940 (size 32):
>> >>   comm "kworker/0:1", pid 12, jiffies 4295005883 (age 130.648s)
>> >>   hex dump (first 32 bytes):
>> >>     01 00 00 00 00 00 00 00 01 00 00 00 00 00 00 00  ................
>> >>     00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
>> >>   backtrace:
>> >>     [<00000000d741a1ea>] build_snap_context+0x5b/0x2a0
>> >>     [<0000000021a00533>] rebuild_snap_realms+0x27/0x90
>> >>     [<00000000ac538600>] rebuild_snap_realms+0x42/0x90
>> >>     [<000000000e955fac>] ceph_update_snap_trace+0x2ee/0x610
>> >>     [<00000000a9550416>] ceph_handle_snap+0x317/0x5f3
>> >>     [<00000000fc287b83>] dispatch+0x362/0x176c
>> >>     [<00000000a312c741>] ceph_con_workfn+0x9ce/0x2cf0
>> >>     [<000000004168e3a9>] process_one_work+0x1d4/0x400
>> >>     [<000000002188e9e7>] worker_thread+0x2d/0x3c0
>> >>     [<00000000b593e4b3>] kthread+0x112/0x130
>> >>     [<00000000a8587dca>] ret_from_fork+0x35/0x40
>> >>     [<00000000ba1c9c1d>] 0xffffffffffffffff
>> >>
>> >> It looks like it is possible that we miss a flush_ack from the MDS when,
>> >> for example, umounting the filesystem.  In that case, we can simply drop
>> >> the reference to the ceph_snap_context obtained in ceph_queue_cap_snap().
>> >>
>> >> Link: https://tracker.ceph.com/issues/38224
>> >> Cc: stable@vger.kernel.org
>> >> Signed-off-by: Luis Henriques <lhenriques@suse.com>
>> >> ---
>> >>  fs/ceph/caps.c | 7 +++++++
>> >>  1 file changed, 7 insertions(+)
>> >>
>> >> diff --git a/fs/ceph/caps.c b/fs/ceph/caps.c
>> >> index 36a8dc699448..208f4dc6f574 100644
>> >> --- a/fs/ceph/caps.c
>> >> +++ b/fs/ceph/caps.c
>> >> @@ -1054,6 +1054,7 @@ int ceph_is_any_caps(struct inode *inode)
>> >>  static void drop_inode_snap_realm(struct ceph_inode_info *ci)
>> >>  {
>> >>         struct ceph_snap_realm *realm = ci->i_snap_realm;
>> >> +
>> >>         spin_lock(&realm->inodes_with_caps_lock);
>> >>         list_del_init(&ci->i_snap_realm_item);
>> >>         ci->i_snap_realm_counter++;
>> >> @@ -1063,6 +1064,12 @@ static void drop_inode_snap_realm(struct ceph_inode_info *ci)
>> >>         spin_unlock(&realm->inodes_with_caps_lock);
>> >>         ceph_put_snap_realm(ceph_sb_to_client(ci->vfs_inode.i_sb)->mdsc,
>> >>                             realm);
>> >> +       /*
>> >> +        * ci->i_head_snapc should be NULL, but we may still be waiting for a
>> >> +        * flush_ack from the MDS.  In that case, we still hold a ref for the
>> >> +        * ceph_snap_context and we need to drop it.
>> >> +        */
>> >> +       ceph_put_snap_context(ci->i_head_snapc);
>> >>  }
>> >>
>> >>  /*
>> >
>> > This does not seem right.  i_head_snapc is cleared when
>> > (ci->i_wrbuffer_ref_head == 0 && ci->i_dirty_caps == 0 &&
>> > ci->i_flushing_caps == 0).  Nothing to do with dropping ci->i_snap_realm.
>> > Did you see 'reconnect denied' during the test? If you did, the fix
>> > should be in iterate_session_caps()
>> >
>>
>> No, I didn't see any 'reconnect denied' in the test.  The test actually
>> seems to execute fine, except for the memory leak.
>>
>> It's very difficult to reproduce this issue, but last time I managed to
>> get this memory leak to trigger I actually had some debugging code in
>> drop_inode_snap_realm, something like:
>>
>>   if (ci->i_head_snapc)
>>         printk("i_head_snapc: 0x%px\n", ci->i_head_snapc);
>
> Please add code that prints i_wrbuffer_ref_head, i_dirty_caps, and
> i_flushing_caps, and try reproducing it again.
>

Ok, it took me a few hours, but I managed to reproduce the bug with
those extra printks.  All those values are 0 when the bug triggers
(and i_head_snapc != NULL).
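
For reference, the extra instrumentation was along these lines (a rough
sketch rather than the exact hunk; it just extends the check already
shown above inside drop_inode_snap_realm()):

  /* debug only -- the exact message wording here is illustrative */
  if (ci->i_head_snapc)
        printk("i_head_snapc: 0x%px wrbuffer_ref_head: %d dirty_caps: %d flushing_caps: %d\n",
               ci->i_head_snapc, ci->i_wrbuffer_ref_head,
               ci->i_dirty_caps, ci->i_flushing_caps);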

Cheers,
-- 
Luis


>
>>
>> This printk was only executed when the bug triggered (during a
>> filesystem umount) and the address shown was the same as in the kmemleak
>> warning.
>>
>> After spending some time looking, I assumed this to be a missing call to
>> handle_cap_flush_ack, which would do the i_head_snapc cleanup.
>>
>> Cheers,
>> --
>> Luis
>


Thread overview: 18+ messages
2019-03-15 11:11 [PATCH] ceph: Fix a memory leak in ci->i_head_snapc Luis Henriques
2019-03-18  3:26 ` Yan, Zheng
2019-03-18 10:33   ` Luis Henriques
2019-03-18 10:33     ` Luis Henriques
2019-03-18 11:42     ` Yan, Zheng
2019-03-18 16:22       ` Luis Henriques [this message]
2019-03-18 16:22         ` Luis Henriques
2019-03-19  3:13         ` Yan, Zheng
2019-03-19  9:39           ` Luis Henriques
2019-03-19  9:39             ` Luis Henriques
2019-03-22 10:04             ` Luis Henriques
2019-03-22 10:04               ` Luis Henriques
2019-04-03  2:54               ` Yan, Zheng
2019-04-03  9:47                 ` Luis Henriques
2019-04-03  9:47                   ` Luis Henriques
2019-04-16 13:30                   ` Luis Henriques
2019-04-16 13:30                     ` Luis Henriques
2019-04-18  3:37                     ` Yan, Zheng
