linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Michal Hocko <mhocko@suse.com>
To: NeilBrown <neilb@suse.de>
Cc: Tejun Heo <tj@kernel.org>, Lai Jiangshan <jiangshanlai@gmail.com>,
	Ingo Molnar <mingo@redhat.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Juri Lelli <juri.lelli@redhat.com>,
	Vincent Guittot <vincent.guittot@linaro.org>,
	Trond Myklebust <trondmy@hammerspace.com>,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH rfc] workqueue: honour cond_resched() more effectively.
Date: Mon, 9 Nov 2020 08:50:58 +0100	[thread overview]
Message-ID: <20201109075058.GC12240@dhcp22.suse.cz> (raw)
In-Reply-To: <87v9efp7cs.fsf@notabene.neil.brown.name>

On Mon 09-11-20 13:54:59, Neil Brown wrote:
> 
> If a worker task for a normal (bound, non-CPU-intensive) calls
> cond_resched(), this allows other non-workqueue processes to run, but
> does *not* allow other workqueue workers to run.  This is because
> workqueue will only attempt to run one task at a time on each CPU,
> unless the current task actually sleeps.
> 
> This is already a problem for should_reclaim_retry() in mm/page_alloc.c,
> which inserts a small sleep just to convince workqueue to allow other
> workers to run.
> 
> It can also be a problem for NFS when closing a very large file (e.g.
> 100 million pages in memory).  NFS can call the final iput() from a
> workqueue, which can then take long enough to trigger a workqueue-lockup
> warning, and long enough for performance problems to be observed.
> 
> I added a WARN_ON_ONCE() in cond_resched() to report when it is run from
> a workqueue, and got about 20 hits during boot, many of system_wq (aka
> "events") which suggests there is a real chance that worker are being
> delayed unnecessarily be tasks which are written to politely relinquish
> the CPU.
> 
> I think that once a worker calls cond_resched(), it should be treated as
> though it was run from a WQ_CPU_INTENSIVE queue, because only cpu-intensive
> tasks need to call cond_resched().  This would allow other workers to be
> scheduled.
> 
> The following patch achieves this I believe.

I cannot really judge the implementation because my understanding of the
WQ concurrency control is very superficial but I echo that the existing
behavior is really nonintuitive. It certainly burnt me for the oom
situations where the page allocator cannot make much progress to reclaim
memory and it has to retry really hard. Having to handle worker context
explicitly/differently is error prone and as your example of final iput
in NFS shows that the allocator is not the only path affected so having
a general solution is better.

That being said I would really love to see cond_resched to work
transparently.

Thanks!
-- 
Michal Hocko
SUSE Labs

  reply	other threads:[~2020-11-09  7:51 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-11-09  2:54 [PATCH rfc] workqueue: honour cond_resched() more effectively NeilBrown
2020-11-09  7:50 ` Michal Hocko [this message]
2020-11-09  8:00 ` Peter Zijlstra
2020-11-09 13:50   ` Trond Myklebust
2020-11-09 14:01     ` tj
2020-11-09 14:11       ` Trond Myklebust
2020-11-09 16:10         ` tj
2020-11-17 22:16           ` NeilBrown
     [not found]           ` <20201118025820.307-1-hdanton@sina.com>
2020-11-18  5:11             ` NeilBrown
     [not found]             ` <20201118055108.358-1-hdanton@sina.com>
2020-11-19 23:07               ` NeilBrown
2020-12-02 20:20                 ` tj
     [not found]               ` <20201120025953.607-1-hdanton@sina.com>
2020-11-20  4:33                 ` NeilBrown
     [not found]                 ` <20201126100646.1790-1-hdanton@sina.com>
2020-11-26 23:44                   ` NeilBrown
2020-11-19 23:23           ` NeilBrown
2020-11-25 12:36             ` tj
2020-11-26 23:30               ` NeilBrown
2020-11-09 14:20     ` Peter Zijlstra
2020-11-10  2:26       ` NeilBrown

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20201109075058.GC12240@dhcp22.suse.cz \
    --to=mhocko@suse.com \
    --cc=jiangshanlai@gmail.com \
    --cc=juri.lelli@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=neilb@suse.de \
    --cc=peterz@infradead.org \
    --cc=tj@kernel.org \
    --cc=trondmy@hammerspace.com \
    --cc=vincent.guittot@linaro.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).