All of lore.kernel.org
 help / color / mirror / Atom feed
From: <tao.peng@emc.com>
To: <bhalevy@tonian.com>
Cc: <Trond.Myklebust@netapp.com>, <linux-nfs@vger.kernel.org>,
	<honey@citi.umich.edu>, <rees@umich.edu>
Subject: RE: [PATCH 2/3] pnfs: introduce pnfs private workqueue
Date: Wed, 21 Sep 2011 06:23:54 -0400	[thread overview]
Message-ID: <F19688880B763E40B28B2B462677FBF805C4952A24@MX09A.corp.emc.com> (raw)
In-Reply-To: <4E798C93.40409@tonian.com>

Hi, Benny,

> -----Original Message-----
> From: linux-nfs-owner@vger.kernel.org [mailto:linux-nfs-owner@vger.kernel.org]
> On Behalf Of Benny Halevy
> Sent: Wednesday, September 21, 2011 3:05 PM
> To: Peng, Tao
> Cc: Trond.Myklebust@netapp.com; linux-nfs@vger.kernel.org;
> honey@citi.umich.edu; rees@umich.edu
> Subject: Re: [PATCH 2/3] pnfs: introduce pnfs private workqueue
> 
> On 2011-09-21 08:16, tao.peng@emc.com wrote:
> >> -----Original Message-----
> >> From: Myklebust, Trond [mailto:Trond.Myklebust@netapp.com]
> >> Sent: Wednesday, September 21, 2011 12:20 PM
> >> To: Peng, Tao
> >> Cc: bhalevy@tonian.com; linux-nfs@vger.kernel.org; honey@citi.umich.edu;
> >> rees@umich.edu
> >> Subject: RE: [PATCH 2/3] pnfs: introduce pnfs private workqueue
> >>
> >>> -----Original Message-----
> >>> From: tao.peng@emc.com [mailto:tao.peng@emc.com]
> >>> Sent: Tuesday, September 20, 2011 10:44 PM
> >>> To: Myklebust, Trond
> >>> Cc: bhalevy@tonian.com; linux-nfs@vger.kernel.org;
> >> honey@citi.umich.edu;
> >>> rees@umich.edu
> >>> Subject: RE: [PATCH 2/3] pnfs: introduce pnfs private workqueue
> >>>
> >>>
> >>>> -----Original Message-----
> >>>> From: linux-nfs-owner@vger.kernel.org
> >>>> [mailto:linux-nfs-owner@vger.kernel.org]
> >>>> On Behalf Of Jim Rees
> >>>> Sent: Wednesday, September 21, 2011 8:29 AM
> >>>> To: Trond Myklebust
> >>>> Cc: Benny Halevy; linux-nfs@vger.kernel.org; peter honeyman
> >>>> Subject: Re: [PATCH 2/3] pnfs: introduce pnfs private workqueue
> >>>>
> >>>> Trond Myklebust wrote:
> >>>>
> >>>>   On Mon, 2011-09-19 at 23:18 -0400, Jim Rees wrote:
> >>>>   > From: Peng Tao <bergwolf@gmail.com>
> >>>>   >
> >>>>   > For layoutdriver io done functions, default workqueue is not a
> >> good
> >>> place as
> >>>>   > the code is executed in IO path. So add a pnfs private workqueue
> >> to
> >>> handle
> >>>>   > them.
> >>>>   >
> >>>>   > Also change block and object layout code to make use of this
> >> private
> >>>>   > workqueue.
> >>>>   >
> >>>>
> >>>>   Wait, what????
> >>>>
> >>>>   Why is the I/O path (i.e. the nfsiod queue) inappropriate for
> >>>>   layoutdriver io_done functions?
> >>> The current code uses the system_wq to handle io_done functions. If
> >> any
> >>> function allocates memory in system_wq and causes dirty writeback in
> >> nfs
> >>> path, the io_done function can never be called because it is after the
> >> original
> >>> function in the system_wq. And you said that the rpciod/nfsiod is not
> >> the
> >>> ideal place because of memory allocation. So I suggested pnfs private
> >>> workqueue and Benny agreed on it.
> >>
> >> You appear to be optimizing for a corner case. Why?
> > Current code uses system_wq for io_done and it can cause deadlock. So this is
> more than just an optimization.
> >
> >>
> >> IOW: why do we need a whole new workqueue for something which is
> >> supposed to be extremely rare? Why can't we just use standard keventd or
> >> allocate a perfectly normal thread (e.g. the state recovery thread)?
> > Unlike nfsiod and rpciod, keventd (aka, system_wq) is not created with
> WQ_MEM_RECLAIM, indicating that it should not be used for IO path which can be
> invoked during memory reclaim. And I agree that a normal kthread can solve it.
> >
> > However, the blocklayout write end_io function still needs a context where it can
> allocate memory to convert extent state (for every IO that touches new extents). If
> you look into the patch, we are not just using the pnfs private workqueue to handle
> pnfs_ld_read/write_done, but also calling mark_extents_written() inside the
> workqueue.
> 
> Can this memory be preallocated before the I/O for the common case?
It is possible to preallocate memory for the common case. But it doesn't remove the need for workqueue. Without workqueue, we will need to put a bunch of extent manipulation code into bio->end_io, which is executed in bottom half and we always need minimize its execution.

Unless we do following:
1. preallocate memory for extent state convertion
2. use nfsiod/rpciod to handle bl_write_cleanup
3. for pnfs error case, create a kthread to recollapse and resend to MDS

not sure if it worth the complexity though...

Cheers,
Tao

> 
> Benny
> 
> > If we don't create a pnfs private workqueue for pnfs_ld_read/write_done (i.e., by
> just creating a normal kthread in error case), we would still have to create a
> workqueue inside blocklayout driver to handle write end_io. And if blocklayout has a
> private workqueue, it no longer needs the normal kthread for read/write_done error
> handling, which leaves the kthread only specific to object layout (if it doesn't need a
> workqueue like blocklayout does).
> >
> > So I think a generic pnfs workqueue is better because it simplifies things and solve
> both problems.
> >
> > Best,
> > Tao
> >
> >>
> >> Trond
> >
> --
> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


  reply	other threads:[~2011-09-21 10:24 UTC|newest]

Thread overview: 29+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-09-20  3:18 [PATCH 0/3] replacement for "introduce pnfs private workqueue" Jim Rees
2011-09-20  3:18 ` [PATCH 1/3] pnfsblock: add missing rpc_put_mount and path_put Jim Rees
2011-09-20  3:18 ` [PATCH 2/3] pnfs: introduce pnfs private workqueue Jim Rees
2011-09-20 22:41   ` Trond Myklebust
2011-09-21  0:29     ` Jim Rees
2011-09-21  2:44       ` tao.peng
2011-09-21  4:20         ` Myklebust, Trond
2011-09-21  5:16           ` tao.peng
2011-09-21  7:04             ` Benny Halevy
2011-09-21 10:23               ` tao.peng [this message]
2011-09-21 10:38                 ` Boaz Harrosh
2011-09-21 11:04                   ` tao.peng
2011-09-21 10:56                 ` Benny Halevy
2011-09-21 11:10                   ` tao.peng
2011-09-21 11:27                     ` Benny Halevy
2011-09-21 11:42                       ` Boaz Harrosh
2011-09-21 11:50                         ` Benny Halevy
2011-09-21 13:56                           ` Boaz Harrosh
2011-09-21 15:45                             ` Peng Tao
2011-09-21 16:03                               ` Trond Myklebust
2011-09-22  3:30                                 ` tao.peng
2011-09-22  7:17                                 ` Boaz Harrosh
2011-09-21  4:22       ` Myklebust, Trond
2011-09-20  3:18 ` [PATCH 3/3] SQUASHME: pnfs: simplify and clean up pnfsiod workqueue Jim Rees
2011-09-21 11:52 ` [PATCH 0/3] replacement for "introduce pnfs private workqueue" Benny Halevy
2011-09-21 12:32   ` Jim Rees
  -- strict thread matches above, loose matches on Subject: below --
2011-09-10 17:41 [PATCH 0/3] pnfs private workqueue, and two cleanups Jim Rees
2011-09-10 17:41 ` [PATCH 2/3] pNFS: introduce pnfs private workqueue Jim Rees
2011-09-11 14:51   ` Benny Halevy
2011-09-11 15:15     ` Benny Halevy

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=F19688880B763E40B28B2B462677FBF805C4952A24@MX09A.corp.emc.com \
    --to=tao.peng@emc.com \
    --cc=Trond.Myklebust@netapp.com \
    --cc=bhalevy@tonian.com \
    --cc=honey@citi.umich.edu \
    --cc=linux-nfs@vger.kernel.org \
    --cc=rees@umich.edu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.