From: Xiubo Li <xiubli@redhat.com>
To: Jeff Layton <jlayton@kernel.org>
Cc: idryomov@gmail.com, vshankar@redhat.com, ceph-devel@vger.kernel.org
Subject: Re: [PATCH] ceph: do not truncate pagecache if truncate size doesn't change
Date: Tue, 23 Nov 2021 16:06:13 +0800
Message-ID: <3e08e0d6-5ab8-d6b1-7ee8-86b14dec7c89@redhat.com>
In-Reply-To: <bfd6b13b-efdc-6362-de9d-92a243f5b166@redhat.com>


On 11/23/21 9:00 AM, Xiubo Li wrote:
>
> On 11/23/21 3:10 AM, Jeff Layton wrote:
[...]
>> One thing I'm finding today is that this patch reliably makes
>> generic/445 hang at umount time with -o test_dummy_encryption
>> enabled...which is a bit strange as the test doesn't actually run:
>>
>>      [jlayton@client1 xfstests-dev]$ sudo ./tests/generic/445
>>      QA output created by 445
>>      445 not run: xfs_io falloc  failed (old kernel/wrong fs?)
>>      [jlayton@client1 xfstests-dev]$ sudo umount /mnt/test
>>
>> ...and the umount hangs waiting for writeback to complete. When I back
>> this patch out, the problem goes away. Are you able to reproduce this?
>>
>> There are no mds or osd calls in flight, and no caps (according to
>> debugfs). This is using -o test_dummy_encryption to force encryption.
>
> I have hit the same issue without "test_dummy_encryption": the test
> got stuck, but I didn't see any calls to ceph. It was not 445, though,
> and I can't remember which test it was. I thought something was wrong
> with my OS, so I just rebooted my VM.
>
> # ps -aux | grep generic
>
> root      564385  0.0  0.0  11804  4700 pts/1    S+   09:41 0:00 /bin/bash ./tests/generic/318
>
> # cat /proc/564385/stack
>
> [<0>] do_wait+0x2cc/0x4e0
> [<0>] kernel_wait4+0xec/0x1b0
> [<0>] __do_sys_wait4+0xe0/0xf0
> [<0>] do_syscall_64+0x37/0x80
> [<0>] entry_SYSCALL_64_after_hwframe+0x44/0xae
>
I hit this again today. I found that the MDS daemon had crashed, and the
standby MDSes crashed too while replaying the journal log.

I think this is why the tests got stuck. I will look into it.

-- Xiubo


> I ran the ceph.exlude tests for two days and saw this only once.
>
> I have attached the test results. Are they the same as yours? Many
> test cases didn't run.
>
> There are 4 failures. generic/020 is reproducible about 30% of the
> time; the other 3 fail every time, but none of them seems related to
> fscrypt.
>
>
>> I narrowed it down to the call to _require_seek_data_hole. That calls
>> the seek_sanity_test binary and after that point, umounting the fs
>> hangs. I've not yet been successful at reproducing this while running
>> the binary by hand, so there may be some other preliminary ops that are
>> a factor too.
>>
>> In any case, this looks like a regression, so I'm going to drop this
>> patch for now. I'll keep poking at the problem too however.

