linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Dennis Zhou <dennis@kernel.org>
To: Wang Yugui <wangyugui@e16-tech.com>
Cc: Vlastimil Babka <vbabka@suse.cz>,
	linux-mm@kvack.org, linux-btrfs@vger.kernel.org
Subject: Re: unexpected -ENOMEM from percpu_counter_init()
Date: Thu, 8 Apr 2021 13:48:33 +0000	[thread overview]
Message-ID: <YG8JsdAebEPqOhr/@google.com> (raw)
In-Reply-To: <20210408171959.2D72.409509F4@e16-tech.com>

On Thu, Apr 08, 2021 at 05:20:00PM +0800, Wang Yugui wrote:
> Hi,
> 
> > On Thu, Apr 08, 2021 at 07:28:01AM +0800, Wang Yugui wrote:
> > > Hi,
> > > 
> > > > > > > upper caller:
> > > > > > >     nofs_flag = memalloc_nofs_save();
> > > > > > >     ret = btrfs_drew_lock_init(&root->snapshot_lock);
> > > > > > >     memalloc_nofs_restore(nofs_flag);
> > > > 
> > > > The issue is here. nofs is set which means percpu attempts an atomic
> > > > allocation. If it cannot find anything already allocated it isn't happy.
> > > > This was done before memalloc_nofs_{save/restore}() were pervasive.
> > > > 
> > > > Percpu should probably try to allocate some pages if possible even if
> > > > nofs is set.
> > > 
> > > Thanks.
> > > 
> > > I will wait for the patch, and then test it.
> > > 
> > 
> > I'm currently a bit busy with some other things. Adding support I don't
> > think will be much work, just a little bit tricky.
> > 
> > I recommend carrying what you have minus the change to reserved percpu
> > memory for now. If I'm the one to write it, I'll cc you.
> > 
> > Thanks,
> > Dennis
> 
> 
> In the recent test, another problem is triggered too with my extended
> percpu buffer size patch. maybe this info is helpful.
> 
> problem:
> OS/VGA console is freezed , and no call stace is outputed.
> Just some info is outputed to IPMI/dell iDRAC
>    2 | 04/03/2021 | 11:35:01 | OS Critical Stop #0x46 | Run-time critical stop () | Asserted
>    3 | Linux kernel panic: Fatal excep
>    4 | Linux kernel panic: tion
>    5 | 04/05/2021 | 19:09:14 | OS Critical Stop #0x46 | Run-time critical stop () | Asserted
>    6 | Linux kernel panic: Fatal excep
>    7 | Linux kernel panic: tion
>    8 | 04/06/2021 | 13:08:42 | OS Critical Stop #0x46 | Run-time critical stop () | Asserted
>    9 | Linux kernel panic: Fatal excep
>    a | Linux kernel panic: tion
>    b | 04/08/2021 | 02:12:46 | OS Critical Stop #0x46 | Run-time critical stop () | Asserted
>    c | Linux kernel panic: Fatal excep
>    d | Linux kernel panic: tion

Unfortunately non of the above to me is useful.

> kernel: at least 5.10.26/5.10.27/5.10.28
> 
> This problem is triggered by our application, NOT xfstests.
> But our applicaiton have some heavy write load just like xfstest/generic/476.
> Our application use at most 75% of memory, if still not enough, 
> it will write out all buffer info to filesystem.

Do you use cgroups at all? If yes can you describe the workload pattern
a bit.

> This problem is happen in linux kernel 5.10.x, but not happen in linux
> kernel 5.4.x. It have high frequency to repduce too.

Ah. Can you try the following patch?
https://lore.kernel.org/lkml/20210408035736.883861-4-guro@fb.com/

Thanks,
Dennis


  reply	other threads:[~2021-04-08 13:48 UTC|newest]

Thread overview: 27+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-04-01 10:51 Wang Yugui
2021-04-02  1:49 ` Wang Yugui
2021-04-07 12:35 ` Vlastimil Babka
2021-04-07 13:09   ` Wang Yugui
2021-04-07 14:56     ` Dennis Zhou
2021-04-07 23:28       ` Wang Yugui
2021-04-08  2:44         ` Dennis Zhou
2021-04-08  9:20           ` Wang Yugui
2021-04-08 13:48             ` Dennis Zhou [this message]
2021-04-08 14:28               ` Filipe Manana
2021-04-08 15:02                 ` Dennis Zhou
2021-04-09 11:39                   ` Filipe Manana
2021-04-09 13:39                     ` Dennis Zhou
2021-04-09 13:42                       ` Filipe Manana
2021-04-09  0:08               ` Wang Yugui
2021-04-09  2:14                 ` Dennis Zhou
2021-04-09  4:02                   ` Wang Yugui
2021-04-09  7:36                     ` Wang Yugui
2021-04-09  7:48                       ` Wang Yugui
2021-04-09 13:56                       ` Dennis Zhou
2021-04-10 15:29                         ` Wang Yugui
2021-04-10 15:52                           ` Dennis Zhou
2021-04-10 16:08                             ` Wang Yugui
2021-04-11 15:20                               ` Wang Yugui
2021-04-12  4:03                                 ` Dennis Zhou
2021-04-12  5:24                                   ` Wang Yugui
2021-04-09  9:52   ` Wang Yugui

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YG8JsdAebEPqOhr/@google.com \
    --to=dennis@kernel.org \
    --cc=linux-btrfs@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=vbabka@suse.cz \
    --cc=wangyugui@e16-tech.com \
    --subject='Re: unexpected -ENOMEM from percpu_counter_init()' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
on how to clone and mirror all data and code used for this inbox