All of lore.kernel.org
 help / color / mirror / Atom feed
From: yangerkun <yangerkun@huawei.com>
To: Theodore Ts'o <tytso@mit.edu>
Cc: <jack@suse.cz>, <linux-ext4@vger.kernel.org>, <yukuai3@huawei.com>
Subject: Re: [PATCH] ext4: flush s_error_work before journal destroy in ext4_fill_super
Date: Tue, 24 Aug 2021 18:11:59 +0800	[thread overview]
Message-ID: <3296465e-8da2-99ac-32cf-9cde32f4de32@huawei.com> (raw)
In-Reply-To: <eb962c26-b013-957b-7931-feda7f8bf5b5@huawei.com>

Sorry for that this patch will lead a bug when mount failed on arm64(I 
have verify this patch on x86, and it seems ok...).

I will try to fix the problem with another way....

[2021-08-24 17:43:29 INFO] [   45.808127] Internal error: Oops: 96000004 
[#1] SMP
[2021-08-24 17:43:29 INFO] [   45.812986] Modules linked in: realtek 
hclge hns3 megaraid_sas hisi_sas_v3_hw hibmc_drm hisi_sas_main 
host_edma_drv hnae3
[2021-08-24 17:43:29 INFO] [   45.823896] Process mount (pid: 1187, 
stack limit = 0x00000000e0bd7181)
[2021-08-24 17:43:29 INFO] [   45.830481] CPU: 9 PID: 1187 Comm: mount 
Not tainted 4.19.90_4.19.90-2108.7.0+ #2
[2021-08-24 17:43:29 INFO] [   45.837929] Hardware name: Huawei TaiShan 
2280 V2/BC82AMDC, BIOS 2280-V2 CS V3.26.01 06/14/2019
[2021-08-24 17:43:29 INFO] [   45.846587] pstate: 10400009 (nzcV daif 
+PAN -UAO)
[2021-08-24 17:43:29 INFO] [   45.851359] pc : 
percpu_counter_add_batch+0x5c/0x358
[2021-08-24 17:43:29 INFO] [   45.856303] lr : 
ext4_es_free_extent+0xd4/0x4a8
[2021-08-24 17:43:29 INFO] [   45.860812] sp : ffff80262a60f170
[2021-08-24 17:43:29 INFO] [   45.864112] x29: ffff80262a60f170 x28: 
dfff200000000000
[2021-08-24 17:43:29 INFO] [   45.869399] x27: ffff8026403e65cc x26: 
0000000000000000
[2021-08-24 17:43:29 INFO] [   45.874686] x25: ffff80266c429e20 x24: 
0000000000000100
[2021-08-24 17:43:29 INFO] [   45.879973] x23: 1ffff004cd8853c4 x22: 
ffffffffffffffff
[2021-08-24 17:43:29 INFO] [   45.885260] x21: 0000000000000000 x20: 
00006026b7a31000
[2021-08-24 17:43:29 INFO] [   45.890547] x19: ffff80266c429e00 x18: 
0000000000000000
[2021-08-24 17:43:29 INFO] [   45.895833] x17: 0000000000000000 x16: 
0000000000000000
[2021-08-24 17:43:29 INFO] [   45.901120] x15: 1fffe4000504bd02 x14: 
ffff200023ba1adc
[2021-08-24 17:43:29 INFO] [   45.906406] x13: ffff200024332530 x12: 
ffff20002433240c
[2021-08-24 17:43:29 INFO] [   45.911693] x11: ffff2000243304f4 x10: 
ffff200024328f6c
[2021-08-24 17:43:29 INFO] [   45.916980] x9 : 1ffff004c807ccb9 x8 : 
ffff200023b75950
[2021-08-24 17:43:29 INFO] [   45.922266] x7 : ffff200023ba1fe0 x6 : 
0000000000000004
[2021-08-24 17:43:29 INFO] [   45.927553] x5 : 0000000000000000 x4 : 
0000000000000000
[2021-08-24 17:43:29 INFO] [   45.932839] x3 : 00000c04d6f46200 x2 : 
dfff200000000000
[2021-08-24 17:43:29 INFO] [   45.938126] x1 : 0000000000000003 x0 : 
00006026b7a31000
[2021-08-24 17:43:29 INFO] [   45.943413] Call trace:
[2021-08-24 17:43:29 INFO] [   45.945851] 
percpu_counter_add_batch+0x5c/0x358
[2021-08-24 17:43:29 INFO] [   45.950446]  ext4_es_free_extent+0xd4/0x4a8
[2021-08-24 17:43:29 INFO] [   45.954610]  __es_remove_extent+0x37c/0x598
[2021-08-24 17:43:29 INFO] [   45.958773]  ext4_es_remove_extent+0x6c/0x270
[2021-08-24 17:43:29 INFO] [   45.963110]  ext4_clear_inode+0x50/0x188
[2021-08-24 17:43:29 INFO] [   45.967016]  ext4_evict_inode+0x314/0x1450
[2021-08-24 17:43:29 INFO] [   45.971094]  evict+0x238/0x5a0
[2021-08-24 17:43:29 INFO] [   45.974136]  iput+0x2bc/0x6b8
[2021-08-24 17:43:29 INFO] [   45.977090]  jbd2_journal_destroy+0x408/0x800
[2021-08-24 17:43:29 INFO] [   45.981428]  ext4_fill_super+0x5a04/0x8cc0
[2021-08-24 17:43:29 INFO] [   45.985505]  mount_bdev+0x268/0x328
[2021-08-24 17:43:29 INFO] [   45.988978]  ext4_mount+0x44/0x58
[2021-08-24 17:43:29 INFO] [   45.992277]  mount_fs+0x68/0x390
[2021-08-24 17:43:29 INFO] [   45.995492]  vfs_kern_mount.part.2+0x54/0x388
[2021-08-24 17:43:29 INFO] [   45.999830]  do_mount+0xa5c/0x20d0
[2021-08-24 17:43:29 INFO] [   46.003216]  ksys_mount+0x9c/0x118
[2021-08-24 17:43:29 INFO] [   46.006603]  __arm64_sys_mount+0xa8/0x108
[2021-08-24 17:43:29 INFO] [   46.010595]  el0_svc_common+0x10c/0x4a0
[2021-08-24 17:43:29 INFO] [   46.014415]  el0_svc_handler+0x170/0x248
[2021-08-24 17:43:29 INFO] [   46.018320]  el0_svc+0x10/0x218
[2021-08-24 17:43:29 INFO] [   46.021448] Code: f2fbffe2 d343fc03 
92400801 11000c21 (38e26862)
[2021-08-24 17:43:29 INFO] [   46.027513] ---[ end trace 
316fdb8833588599 ]---
[2021-08-24 17:43:29 INFO] [   46.037646] Kernel panic - not syncing: 
Fatal exception
[2021-08-24 17:43:29 INFO] [   46.042848] SMP: stopping secondary CPUs
[2021-08-24 17:43:29 INFO] [   46.046779] Kernel Offset: 0x1baf0000 from 
0xffff200008000000
[2021-08-24 17:43:29 INFO] [   46.052500] CPU features: 0x12,a2a00a38
[2021-08-24 17:43:29 INFO] [   46.056318] Memory Limit: none
[2021-08-24 17:43:29 INFO] [   46.064887] Rebooting in 1 seconds..
[2021-08-24 17:43:30 INFO] [   47.068489] SMP: stopping secondary CPUs
[2021-08-24 17:43:32 INFO] [   48.158471] SMP: failed to stop secondary 
CPUs 0-127

在 2021/7/23 21:25, yangerkun 写道:
> 
> 
> 在 2021/7/23 20:11, Theodore Ts'o 写道:
>> On Tue, Jul 20, 2021 at 02:24:09PM +0800, yangerkun wrote:
>>> 'commit c92dc856848f ("ext4: defer saving error info from atomic
>>> context")' and '2d01ddc86606 ("ext4: save error info to sb through 
>>> journal
>>> if available")' add s_error_work to fix checksum error problem. But the
>>> error path in ext4_fill_super can lead the follow BUG_ON.
>>
>> Can you share with me your test case?  Your patch will result in the
>> shrinker potentially not getting released in some error paths (which
>> will cause other kernel panics), and in any case, once the journal is
> 
> Hi Ted,
> 
> The only logic we have changed is that we move the flush_work before we 
> call jbd2_journal_destory. I have not seen the problem you describe... 
> Can you help to explain more...
> 
> Thanks,
> Kun.
> 
>> destroyed here:
>>
>>> @@ -5173,15 +5173,15 @@ static int ext4_fill_super(struct super_block 
>>> *sb, void *data, int silent)
>>>       ext4_xattr_destroy_cache(sbi->s_ea_block_cache);
>>>       sbi->s_ea_block_cache = NULL;
>>> +failed_mount3a:
>>> +    ext4_es_unregister_shrinker(sbi);
>>> +failed_mount3:
>>> +    flush_work(&sbi->s_error_work);
>>>       if (sbi->s_journal) {
>>>           jbd2_journal_destroy(sbi->s_journal);
>>>           sbi->s_journal = NULL;
>>>       }
>>> -failed_mount3a:
>>> -    ext4_es_unregister_shrinker(sbi);
>>> -failed_mount3:
>>> -    flush_work(&sbi->s_error_work);
>>
>> sbi->s_journal is set to NULL, which means that in
>> flush_stashed_error_work(), journal will be NULL, which means we won't
>> call start_this_handle and so this change will not make a difference
>> given the kernel stack trace in the commit description.
>>
>> Thanks,
>>
>>                         - Ted
>> .
>>
> .

      reply	other threads:[~2021-08-24 10:12 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-07-20  6:24 [PATCH] ext4: flush s_error_work before journal destroy in ext4_fill_super yangerkun
2021-07-23 12:11 ` Theodore Ts'o
2021-07-23 13:11   ` yangerkun
2021-07-23 19:11     ` Theodore Ts'o
2021-07-26  7:13       ` yangerkun
2021-07-26 13:26         ` Jan Kara
2021-08-03 11:13           ` yangerkun
2021-07-23 13:25   ` yangerkun
2021-08-24 10:11     ` yangerkun [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=3296465e-8da2-99ac-32cf-9cde32f4de32@huawei.com \
    --to=yangerkun@huawei.com \
    --cc=jack@suse.cz \
    --cc=linux-ext4@vger.kernel.org \
    --cc=tytso@mit.edu \
    --cc=yukuai3@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.