From: Benny Halevy <bhalevy@panasas.com>
To: Fred Isaman <iisaman@netapp.com>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>,
	andros@netapp.com, linux-nfs@vger.kernel.org,
	Andy Adamson <andros@citi.umich.edu>,
	Dean Hildebrand <dhildeb@us.ibm.com>,
	Boaz Harrosh <bharrosh@panasas.com>,
	Oleg Drokin <green@linuxhacker.ru>, Tao Guo <guotao@nrchpc.ac.cn>
Subject: Re: [PATCH 09/16] pnfs: wave 3: shift pnfs_update_layout locations
Date: Tue, 15 Feb 2011 22:11:40 -0500
Message-ID: <4D5B406C.4080801@panasas.com>
In-Reply-To: <AANLkTinxqbmWvWXeaPnw7oNtL_zfCYOTB=KjZFJQaKFn@mail.gmail.com>

On 2011-02-15 09:41, Fred Isaman wrote:
> On Mon, Feb 14, 2011 at 6:14 PM, Trond Myklebust
> <Trond.Myklebust@netapp.com> wrote:
>> On Mon, 2011-02-14 at 14:18 -0500, andros@netapp.com wrote:
>>> From: Fred Isaman <iisaman@netapp.com>
>>>
>>> Move the pnfs_update_layout call location to nfs_pageio_do_add_request().
>>> Grab the lseg sent in the doio function to nfs_read_rpcsetup and attach
>>> it to each nfs_read_data so it can be sent to the layout driver.
>>>
>>> Signed-off-by: Andy Adamson <andros@netapp.com>
>>> Signed-off-by: Andy Adamson <andros@citi.umich.edu>
>>> Signed-off-by: Dean Hildebrand <dhildeb@us.ibm.com>
>>> Signed-off-by: Fred Isaman <iisaman@citi.umich.edu>
>>> Signed-off-by: Fred Isaman <iisaman@netapp.com>
>>> Signed-off-by: Benny Halevy <bhalevy@panasas.com>
>>> Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
>>> Signed-off-by: Oleg Drokin <green@linuxhacker.ru>
>>> Signed-off-by: Tao Guo <guotao@nrchpc.ac.cn>
>>> ---
>>>  fs/nfs/file.c            |    4 ----
>>>  fs/nfs/pagelist.c        |   15 ++++++++++++---
>>>  fs/nfs/pnfs.c            |    4 ++--
>>>  fs/nfs/pnfs.h            |    1 +
>>>  fs/nfs/read.c            |   28 ++++++++++++++++------------
>>>  fs/nfs/write.c           |    4 ++--
>>>  include/linux/nfs_page.h |    5 +++--
>>>  include/linux/nfs_xdr.h  |    1 +
>>>  8 files changed, 37 insertions(+), 25 deletions(-)
>>>
>>> diff --git a/fs/nfs/file.c b/fs/nfs/file.c
>>> index 7bf029e..d85a534 100644
>>> --- a/fs/nfs/file.c
>>> +++ b/fs/nfs/file.c
>>> @@ -387,10 +387,6 @@ static int nfs_write_begin(struct file *file, struct address_space *mapping,
>>>               file->f_path.dentry->d_name.name,
>>>               mapping->host->i_ino, len, (long long) pos);
>>>
>>> -     pnfs_update_layout(mapping->host,
>>> -                        nfs_file_open_context(file),
>>> -                        IOMODE_RW);
>>> -
>>>  start:
>>>       /*
>>>        * Prevent starvation issues if someone is doing a consistency
>>> diff --git a/fs/nfs/pagelist.c b/fs/nfs/pagelist.c
>>> index e1164e3..e0a0cb4 100644
>>> --- a/fs/nfs/pagelist.c
>>> +++ b/fs/nfs/pagelist.c
>>> @@ -20,6 +20,7 @@
>>>  #include <linux/nfs_mount.h>
>>>
>>>  #include "internal.h"
>>> +#include "pnfs.h"
>>>
>>>  static struct kmem_cache *nfs_page_cachep;
>>>
>>> @@ -213,7 +214,7 @@ nfs_wait_on_request(struct nfs_page *req)
>>>   */
>>>  void nfs_pageio_init(struct nfs_pageio_descriptor *desc,
>>>                    struct inode *inode,
>>> -                  int (*doio)(struct inode *, struct list_head *, unsigned int, size_t, int),
>>> +                  int (*doio)(struct inode *, struct list_head *, unsigned int, size_t, int, struct pnfs_layout_segment *),
>>>                    size_t bsize,
>>>                    int io_flags)
>>>  {
>>> @@ -226,6 +227,7 @@ void nfs_pageio_init(struct nfs_pageio_descriptor *desc,
>>>       desc->pg_doio = doio;
>>>       desc->pg_ioflags = io_flags;
>>>       desc->pg_error = 0;
>>> +     desc->pg_lseg = NULL;
>>>  }
>>>
>>>  /**
>>> @@ -288,8 +290,13 @@ static int nfs_pageio_do_add_request(struct nfs_pageio_descriptor *desc,
>>>               prev = nfs_list_entry(desc->pg_list.prev);
>>>               if (!nfs_can_coalesce_requests(prev, req))
>>>                       return 0;
>>> -     } else
>>> +     } else {
>>> +             put_lseg(desc->pg_lseg);
>>>               desc->pg_base = req->wb_pgbase;
>>> +             desc->pg_lseg = pnfs_update_layout(desc->pg_inode,
>>> +                                                req->wb_context,
>>> +                                                IOMODE_READ);
>>
>> Looking at this afresh after a week of vacation. Isn't it more natural
>> to do this as part of the pg_doio() callback?
>>
>> Your only reason for introducing the ->pg_lseg pointer is to be able to
>> pass it to the ->pg_doio() in the first place. Why not do that by simply
>> passing the 'desc' pointer to ->pg_doio(), and then having it call
>> pnfs_update_layout() instead of 'get_layout()'?
>>
> 
> The problem is that it is not the only reason.  Passing the lseg into
> the nfs_can_coalesce_requests is another.  Calling pnfs_update_layout
> in ->pg_doio would eliminate the opportunity to have a say in
> coalescing based on the layout.
> 
> 

As long as you correctly deal with short I/Os in the doio path (like we did
many moons ago) you should be fine if the layout you got does not cover
the whole coalesced range.
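
To make the short-I/O point concrete, here is a minimal user-space sketch,
not the kernel code. It splits one coalesced byte range into chunks wherever
layout coverage starts or stops, so no single I/O runs past the segment it
was issued against. Every name in it (seg_range, io_stats, chunk_len,
submit_range) is hypothetical, and sending the uncovered bytes to the MDS is
an assumption made for illustration; the patch under review does not define
that policy.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the byte range covered by one layout segment. */
struct seg_range {
	unsigned long long offset;
	unsigned long long length;
};

/* Totals gathered while a coalesced request is split up. */
struct io_stats {
	unsigned int chunks;              /* number of separate I/Os issued */
	unsigned long long layout_bytes;  /* bytes sent through a layout */
	unsigned long long mds_bytes;     /* bytes sent to the MDS (assumed fallback) */
};

/*
 * Return how many bytes starting at pos can go out as one I/O, and report
 * through *via_layout whether a segment covers pos. segs[] is assumed to be
 * sorted by offset and non-overlapping.
 */
static unsigned long long
chunk_len(const struct seg_range *segs, size_t nsegs,
	  unsigned long long pos, unsigned long long count, int *via_layout)
{
	unsigned long long limit = pos + count;
	size_t i;

	*via_layout = 0;
	for (i = 0; i < nsegs; i++) {
		unsigned long long start = segs[i].offset;
		unsigned long long end = start + segs[i].length;

		if (pos >= start && pos < end) {
			/* Covered: stop at the segment end (a short I/O). */
			*via_layout = 1;
			if (end < limit)
				limit = end;
			break;
		}
		if (start > pos) {
			/* Uncovered gap: stop where the next segment begins. */
			if (start < limit)
				limit = start;
			break;
		}
	}
	return limit - pos;
}

/* Walk the coalesced range, issuing one chunk per loop iteration. */
static struct io_stats
submit_range(const struct seg_range *segs, size_t nsegs,
	     unsigned long long pos, unsigned long long count)
{
	struct io_stats st = { 0, 0, 0 };

	while (count > 0) {
		int via_layout;
		unsigned long long n;

		n = chunk_len(segs, nsegs, pos, count, &via_layout);
		assert(n > 0 && n <= count);	/* always makes progress */
		if (via_layout)
			st.layout_bytes += n;
		else
			st.mds_bytes += n;
		st.chunks++;
		pos += n;	/* resubmit the remainder from here */
		count -= n;
	}
	return st;
}
```

With segments covering [0, 4096) and [8192, 12288), a request for 10000
bytes at offset 1024 is split into three chunks: two through the layout and
one for the gap in between.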

>>> +     }
>>>       nfs_list_remove_request(req);
>>>       nfs_list_add_request(req, &desc->pg_list);
>>>       desc->pg_count = newlen;
>>> @@ -307,7 +314,8 @@ static void nfs_pageio_doio(struct nfs_pageio_descriptor *desc)
>>>                                         nfs_page_array_len(desc->pg_base,
>>>                                                            desc->pg_count),
>>>                                         desc->pg_count,
>>> -                                       desc->pg_ioflags);
>>> +                                       desc->pg_ioflags,
>>> +                                       desc->pg_lseg);
>>>               if (error < 0)
>>>                       desc->pg_error = error;
>>>               else
>>> @@ -345,6 +353,7 @@ int nfs_pageio_add_request(struct nfs_pageio_descriptor *desc,
>>>  void nfs_pageio_complete(struct nfs_pageio_descriptor *desc)
>>>  {
>>>       nfs_pageio_doio(desc);
>>> +     put_lseg(desc->pg_lseg);
>>>  }
>>>
>>>  /**
>>> diff --git a/fs/nfs/pnfs.c b/fs/nfs/pnfs.c
>>> index f0a9578..dcd4356 100644
>>> --- a/fs/nfs/pnfs.c
>>> +++ b/fs/nfs/pnfs.c
>>> @@ -264,7 +264,7 @@ put_lseg_locked(struct pnfs_layout_segment *lseg,
>>>       return 0;
>>>  }
>>>
>>> -static void
>>> +void
>>>  put_lseg(struct pnfs_layout_segment *lseg)
>>>  {
>>>       struct inode *ino;
>>> @@ -285,6 +285,7 @@ put_lseg(struct pnfs_layout_segment *lseg)
>>>               pnfs_free_lseg_list(&free_me);
>>>       }
>>>  }
>>> +EXPORT_SYMBOL_GPL(put_lseg);
>>
>> Why is this needed here?
>>
> 
> That looks like an artifact left over from older code.  It is not needed.
> 
>>
>>>  static bool
>>>  should_free_lseg(u32 lseg_iomode, u32 recall_iomode)
>>> @@ -797,7 +798,6 @@ pnfs_update_layout(struct inode *ino,
>>>  out:
>>>       dprintk("%s end, state 0x%lx lseg %p\n", __func__,
>>>               nfsi->layout ? nfsi->layout->plh_flags : -1, lseg);
>>> -     put_lseg(lseg); /* STUB - callers currently ignore return value */
>>>       return lseg;
>>>  out_unlock:
>>>       spin_unlock(&ino->i_lock);
>>> diff --git a/fs/nfs/pnfs.h b/fs/nfs/pnfs.h
>>> index 9a994bc..121d6a3 100644
>>> --- a/fs/nfs/pnfs.h
>>> +++ b/fs/nfs/pnfs.h
>>> @@ -146,6 +146,7 @@ extern int nfs4_proc_layoutget(struct nfs4_layoutget *lgp);
>>>
>>>  /* pnfs.c */
>>>  void get_layout_hdr(struct pnfs_layout_hdr *lo);
>>> +void put_lseg(struct pnfs_layout_segment *lseg);
>>>  struct pnfs_layout_segment *
>>>  pnfs_update_layout(struct inode *ino, struct nfs_open_context *ctx,
>>>                  enum pnfs_iomode access_type);
>>> diff --git a/fs/nfs/read.c b/fs/nfs/read.c
>>> index aedcaa7..c453164 100644
>>> --- a/fs/nfs/read.c
>>> +++ b/fs/nfs/read.c
>>> @@ -20,17 +20,17 @@
>>>  #include <linux/nfs_page.h>
>>>
>>>  #include <asm/system.h>
>>> +#include "pnfs.h"
>>>
>>>  #include "nfs4_fs.h"
>>>  #include "internal.h"
>>>  #include "iostat.h"
>>>  #include "fscache.h"
>>> -#include "pnfs.h"
>>>
>>>  #define NFSDBG_FACILITY              NFSDBG_PAGECACHE
>>>
>>> -static int nfs_pagein_multi(struct inode *, struct list_head *, unsigned int, size_t, int);
>>> -static int nfs_pagein_one(struct inode *, struct list_head *, unsigned int, size_t, int);
>>> +static int nfs_pagein_multi(struct inode *, struct list_head *, unsigned int, size_t, int, struct pnfs_layout_segment *);
>>> +static int nfs_pagein_one(struct inode *, struct list_head *, unsigned int, size_t, int, struct pnfs_layout_segment *);
>>>  static const struct rpc_call_ops nfs_read_partial_ops;
>>>  static const struct rpc_call_ops nfs_read_full_ops;
>>>
>>> @@ -70,6 +70,7 @@ void nfs_readdata_free(struct nfs_read_data *p)
>>>  static void nfs_readdata_release(struct nfs_read_data *rdata)
>>>  {
>>>       put_nfs_open_context(rdata->args.context);
>>> +     put_lseg(rdata->lseg);
>>
>> Shouldn't you be calling put_lseg() _before_ put_nfs_open_context()? You
>> are not guaranteed that the inode still exists after that call.
>>

Good catch.  If we need the layout to outlive the open context then
we should get a reference on the inode using iget and iput the inode
in put_layout_hdr_locked.

Benny
> 
> Yes.
> 
> Fred
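
The ordering concern in nfs_readdata_release() can be shown with a toy
user-space model. It is only a sketch under stated assumptions: the open
context is taken to hold the last reference on the inode, and the lseg only
borrows the inode pointer. All toy_* names and release_read() are
hypothetical and are not the kernel's types or functions. An "alive" flag
stands in for freed memory so that the bad ordering can be observed safely.

```c
#include <assert.h>
#include <stddef.h>

struct toy_inode {
	int refs;	/* references held on the inode */
	int alive;	/* cleared when the last reference is dropped */
	int lsegs;	/* layout segments still attached */
};

struct toy_ctx {
	struct toy_inode *inode;	/* holds one inode reference */
};

struct toy_lseg {
	struct toy_inode *inode;	/* borrowed pointer, no reference */
};

/* Drop one inode reference; the inode is gone once it reaches zero. */
static void toy_iput(struct toy_inode *ino)
{
	assert(ino->refs > 0);
	if (--ino->refs == 0)
		ino->alive = 0;
}

/* Releasing the open context drops the inode reference it holds. */
static void toy_put_ctx(struct toy_ctx *ctx)
{
	toy_iput(ctx->inode);
}

/*
 * Detach the lseg from its inode. Returns -1 if the inode is already
 * gone, which would be a use-after-free in real code.
 */
static int toy_put_lseg(struct toy_lseg *lseg)
{
	if (lseg == NULL)
		return 0;
	if (!lseg->inode->alive)
		return -1;
	lseg->inode->lsegs--;
	return 0;
}

/* Safe order: put the lseg while the context still pins the inode. */
static int release_read(struct toy_ctx *ctx, struct toy_lseg *lseg)
{
	int ret = toy_put_lseg(lseg);

	toy_put_ctx(ctx);
	return ret;
}
```

Calling toy_put_ctx() first and toy_put_lseg() second makes the latter
return -1 in this model. The alternative raised in the reply, having the
layout take its own inode reference, would amount to a second refs
increment that is dropped only when the layout header itself is released.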
