linux-rdma.vger.kernel.org archive mirror
From: Leon Romanovsky <leon@kernel.org>
To: Yunsheng Lin <linyunsheng@huawei.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>, Lang Cheng <chenglang@huawei.com>,
	dledford@redhat.com, davem@davemloft.net, salil.mehta@huawei.com,
	yisen.zhuang@huawei.com, linuxarm@huawei.com,
	netdev@vger.kernel.org, linux-rdma@vger.kernel.org,
	Saeed Mahameed <saeedm@mellanox.com>
Subject: Re: [RFC rdma-next] RDMA/core: Add attribute WQ_MEM_RECLAIM to workqueue "infiniband"
Date: Wed, 19 Feb 2020 08:45:07 +0200
Message-ID: <20200219064507.GC15239@unreal>
In-Reply-To: <212eda31-cc86-5487-051b-cb51c368b6fe@huawei.com>

On Wed, Feb 19, 2020 at 09:13:23AM +0800, Yunsheng Lin wrote:
> On 2020/2/18 23:31, Jason Gunthorpe wrote:
> > On Tue, Feb 18, 2020 at 11:35:35AM +0800, Lang Cheng wrote:
> >> The hns3 driver sets "hclge_service_task" workqueue with
> >> WQ_MEM_RECLAIM flag in order to guarantee forward progress
> >> under memory pressure.
> >
> > Don't do that. WQ_MEM_RECLAIM is only to be used by things interlinked
> > with reclaimed processing.
> >
> > Work on queues marked with WQ_MEM_RECLAIM can't use GFP_KERNEL
> > allocations, can't do certain kinds of sleeps, can't hold certain
> > kinds of locks, etc.
>
> In the mlx5 driver, there seem to be GFP_KERNEL allocations
> on a wq marked with WQ_MEM_RECLAIM too:
>
> mlx5e_tx_timeout_work() -> mlx5e_safe_reopen_channels() ->
> mlx5e_safe_switch_channels() -> mlx5e_open_channels()
>
> kcalloc() is called with GFP_KERNEL in mlx5e_open_channels(),
> and mlx5e_tx_timeout_work() is queued on priv->wq, which is
> allocated with the WQ_MEM_RECLAIM flag; see:
>
> mlx5e_netdev_init() -> create_singlethread_workqueue()

There are two reasons for that: first, the mlx5 driver was written long
before WQ_MEM_RECLAIM usage was clarified; second, mlx5 has bugs.

>
>
> From the comment in kernel/workqueue.c, work queued on a wq
> with the WQ_MEM_RECLAIM flag set seems to be executed without
> blocking in some rare cases. I still do not quite understand
> the comment, and I cannot find any doc that points out that
> GFP_KERNEL allocations cannot be done in a wq with WQ_MEM_RECLAIM.
> Is there any doc that mentions that GFP_KERNEL allocations
> cannot be done in a wq with WQ_MEM_RECLAIM?

That is the whole purpose of the WQ_MEM_RECLAIM flag: to allow progress
under memory pressure. Allocating memory while we are under memory
pressure is an invitation to disaster.
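
To illustrate the point, here is a minimal userspace model of one way
to keep allocations out of such a work item: reserve the memory once in
the setup path and let the handler touch only that buffer. This is a
sketch under stated assumptions, not code from hns3 or mlx5. All names
(model_dev, model_alloc, model_service_task) are hypothetical, and
malloc() merely stands in for a kmalloc(..., GFP_KERNEL) call made at
probe/init time.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Userspace model only: counts calls so a caller can check that the
 * handler below never reaches the allocator. */
static unsigned long model_alloc_calls;

/* Stand-in for a sleeping kernel allocation done at setup time. */
static void *model_alloc(size_t len)
{
	model_alloc_calls++;
	return malloc(len);
}

/* Hypothetical per-device state. */
struct model_dev {
	unsigned char *scratch;	/* buffer reserved at setup time */
	size_t scratch_len;
	int link_up;
};

/* Setup path: may allocate, and can fail cleanly if memory is short. */
static int model_dev_init(struct model_dev *dev, size_t scratch_len)
{
	dev->scratch = model_alloc(scratch_len);
	if (!dev->scratch)
		return -1;
	dev->scratch_len = scratch_len;
	dev->link_up = 0;
	return 0;
}

/* Models the body of a work item queued on a WQ_MEM_RECLAIM workqueue:
 * it uses only the memory reserved by model_dev_init() and performs no
 * allocation of its own. */
static void model_service_task(struct model_dev *dev)
{
	assert(dev->scratch != NULL);
	memset(dev->scratch, 0, dev->scratch_len);
	dev->link_up = 1;
}

/* Teardown path: releases the reserved buffer. */
static void model_dev_exit(struct model_dev *dev)
{
	free(dev->scratch);
	dev->scratch = NULL;
	dev->scratch_len = 0;
}
```

A caller would run model_dev_init() once, invoke model_service_task()
as often as needed, and can compare model_alloc_calls before and after
the handler to confirm it did not allocate.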

>
>
> /**
>  * rescuer_thread - the rescuer thread function
>  * @__rescuer: self
>  *
>  * Workqueue rescuer thread function.  There's one rescuer for each
>  * workqueue which has WQ_MEM_RECLAIM set.
>  *
>  * Regular work processing on a pool may block trying to create a new
>  * worker which uses GFP_KERNEL allocation which has slight chance of
>  * developing into deadlock if some works currently on the same queue
>  * need to be processed to satisfy the GFP_KERNEL allocation.  This is
>  * the problem rescuer solves.
>  *
>  * When such condition is possible, the pool summons rescuers of all
>  * workqueues which have works queued on the pool and let them process
>  * those works so that forward progress can be guaranteed.
>  *
>  * This should happen rarely.
>  *
>  * Return: 0
>  */
>
>
> Below is the reason we set WQ_MEM_RECLAIM on the "hclge_service_task"
> workqueue, based on analysing why other ethernet drivers have
> allocated a wq with the WQ_MEM_RECLAIM flag; I may be wrong about that:

Many drivers are developed by copy/paste, so it is wrong to assume
that "other ethernet drivers" did the right thing.

>
> The hns3 ethernet driver may be used as the low-level transport of a
> network file system, so the memory reclaim data path may depend on the
> worker in the hns3 driver to bring back the ethernet link so that it
> can flush some cache to the network-based disk.

It is unlikely that this "network file system" dependency on the
ethernet link is correct.

Thanks

>
> >
> > Jason
> >
> >
>
