linux-mm.kvack.org archive mirror
From: Wang Yugui <wangyugui@e16-tech.com>
To: Dennis Zhou <dennis@kernel.org>, Vlastimil Babka <vbabka@suse.cz>,
	linux-mm@kvack.org, linux-btrfs@vger.kernel.org
Subject: Re: unexpected -ENOMEM from percpu_counter_init()
Date: Fri, 09 Apr 2021 15:48:12 +0800
Message-ID: <20210409154811.C541.409509F4@e16-tech.com>
In-Reply-To: <20210409153636.C53D.409509F4@e16-tech.com>

Hi,

Here is the top/free output captured while our application pipeline is running.

> Hi,
> 
> Some questions about the workqueue for percpu.
> 
> > > > 
> > > > And a question about this,
> > > > > > > > upper caller:
> > > > > > > >     nofs_flag = memalloc_nofs_save();
> > > > > > > >     ret = btrfs_drew_lock_init(&root->snapshot_lock);
> > > > > > > >     memalloc_nofs_restore(nofs_flag);
> > > > > 
> > > > > The issue is here. nofs is set which means percpu attempts an atomic
> > > > > allocation. If it cannot find anything already allocated it isn't happy.
> > > > > This was done before memalloc_nofs_{save/restore}() were pervasive.
> > > > > 
> > > > > Percpu should probably try to allocate some pages if possible even if
> > > > > nofs is set.
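To make the point above concrete, here is a minimal userspace sketch, assuming the percpu allocator treats any request that lacks the full GFP_KERNEL flag set as atomic. This is not kernel code: every sk_* name and every bit value is hypothetical and chosen only for illustration.

```c
#include <stdbool.h>

/* Illustrative bit values only; the real kernel flag values differ. */
#define SK_GFP_RECLAIM 0x1u
#define SK_GFP_IO      0x2u
#define SK_GFP_FS      0x4u
#define SK_GFP_KERNEL  (SK_GFP_RECLAIM | SK_GFP_IO | SK_GFP_FS)

/* Hypothetical stand-in for the per-task NOFS scope state. */
struct sk_task {
	bool nofs_scope;
};

/* Models memalloc_nofs_save(): enter the scope, return the old state. */
static bool sk_nofs_save(struct sk_task *t)
{
	bool old = t->nofs_scope;

	t->nofs_scope = true;
	return old;
}

/* Models memalloc_nofs_restore(): put back the saved state. */
static void sk_nofs_restore(struct sk_task *t, bool old)
{
	t->nofs_scope = old;
}

/* Inside a NOFS scope the FS bit is stripped from every request. */
static unsigned int sk_gfp_context(const struct sk_task *t, unsigned int gfp)
{
	return t->nofs_scope ? (gfp & ~SK_GFP_FS) : gfp;
}

/* Assumed rule: anything weaker than full GFP_KERNEL counts as atomic. */
static bool sk_pcpu_is_atomic(const struct sk_task *t, unsigned int gfp)
{
	gfp = sk_gfp_context(t, gfp);
	return (gfp & SK_GFP_KERNEL) != SK_GFP_KERNEL;
}
```

Under this model, a caller that passes SK_GFP_KERNEL between sk_nofs_save() and sk_nofs_restore() is treated as atomic even though it asked for a blocking allocation, which matches the btrfs_drew_lock_init() call pattern quoted above.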
> > > > 
> > > > Should we check and pre-alloc memory inside memalloc_nofs_restore()?
> > > > another memalloc_nofs_save() may come soon.
> > > > 
> > > > something like this in memalloc_nofs_save()?
> > > > 	if (pcpu_nr_empty_pop_pages[type] < PCPU_EMPTY_POP_PAGES_LOW)
> > > >  		pcpu_schedule_balance_work();
> > > > 
> > > 
> > > Percpu does do this via a workqueue item. The issue is in v5.9 we
> > > introduced 2 types of chunks. However, the free float page number was
> > > for the total. So even if 1 chunk type dropped below, the other chunk
> > > type might have enough pages. I'm queuing this for 5.12 and will send it
> > > out assuming it does fix your problem.
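The difference between counting empty populated pages in total and counting them per chunk type can be sketched as follows. This is a simplified userspace model and not the actual patch: the threshold value, the chunk type labels, and all sk_* names are placeholders.

```c
#include <stdbool.h>

/* Two chunk types, as described for v5.9; the labels are placeholders. */
enum sk_chunk_type {
	SK_CHUNK_TYPE_A,
	SK_CHUNK_TYPE_B,
	SK_NR_CHUNK_TYPES
};

/* Assumed low watermark, for illustration only. */
#define SK_EMPTY_POP_PAGES_LOW 2

struct sk_pcpu_state {
	int empty_pop_pages[SK_NR_CHUNK_TYPES];
};

/* One total across all chunk types decides whether to refill. */
static bool sk_needs_balance_total(const struct sk_pcpu_state *s)
{
	int total = 0;

	for (int t = 0; t < SK_NR_CHUNK_TYPES; t++)
		total += s->empty_pop_pages[t];
	return total < SK_EMPTY_POP_PAGES_LOW;
}

/* Each chunk type is checked against the watermark on its own. */
static bool sk_needs_balance_per_type(const struct sk_pcpu_state *s,
				      enum sk_chunk_type type)
{
	return s->empty_pop_pages[type] < SK_EMPTY_POP_PAGES_LOW;
}
```

With 0 empty pages in one type and 8 in the other, the total check reports that no refill is needed, while the per-type check flags the starved type. An atomic allocation against the starved type would then find nothing pre-populated.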
> 
> Maybe the workqueue for percpu is not strong enough (not scheduled?)
> under high CPU load?
> 
> This is our application pipeline.
> 	file_pre_process |
> 	bwa.nipt xx |
> 	samtools.nipt sort xx |
> 	file_post_process
> 
> file_pre_process/file_post_process are fast, so they are often blocked
> on pipe input/output.
> 
> 'bwa.nipt xx' is CPU-heavy; it uses almost all of the CPU cores.
> 
> 'samtools.nipt sort xx' is memory-heavy; it keeps the input in memory.
> If there is not enough memory, it saves the whole buffer to a temp file,
> so it is sometimes IO-heavy too (writing 60G or more to file).
> 
> 
> xfstests(generic/476) is just high-IO-load; its CPU/memory load is NOT
> high, so xfstests(generic/476) may be a lighter load than our
> application pipeline.


# nproc
40
# top
top - 15:43:06 up 10:16,  1 user,  load average: 41.39, 37.90, 35.98
Tasks: 488 total,   3 running, 485 sleeping,   0 stopped,   0 zombie
%Cpu(s): 99.6 us,  0.1 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.3 hi,  0.0 si,  0.0 st
MiB Mem : 58.3/193384.1 [||||||||||||||||||||||||||||||||||||||||||||||||||||||                                       ]
MiB Swap:  0.0/0.0      [                                                                                             ]


# free -h
              total        used        free      shared  buff/cache   available
Mem:          188Gi        98Gi       5.8Gi        17Mi        84Gi        78Gi
Swap:            0B          0B          0B

Memory reclaim from 'buff/cache' can easily happen here.

Best Regards
Wang Yugui (wangyugui@e16-tech.com)
2021/04/09


> Although there is not yet a simple reproducer for the other problem
> that happened here, there is a somewhat high chance that something is
> wrong in btrfs/mm/fs-buffer.
> > but another problem (OS froze without a call trace, PANIC without OOPS?,
> > the reason is not yet known) still happens.
> 
> Best Regards
> Wang Yugui (wangyugui@e16-tech.com)
> 2021/04/09
> 




