Linux-Fsdevel Archive on lore.kernel.org
From: "J. Bruce Fields" <bfields-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org>
To: Christoph Hellwig <hch-jcswGhMUV9g@public.gmane.org>
Cc: Jeff Layton <jlayton-7I+n7zu2hftEKMMhf/gKZA@public.gmane.org>,
	linux-nfs-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	linux-fsdevel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	xfs-VZNHf3L845pBDgjK7y7TUQ@public.gmane.org
Subject: Re: [PATCH 10/20] nfsd: implement pNFS operations
Date: Thu, 29 Jan 2015 15:33:46 -0500
Message-ID: <20150129203346.GA11064@fieldses.org>
In-Reply-To: <1421925006-24231-11-git-send-email-hch-jcswGhMUV9g@public.gmane.org>

On Thu, Jan 22, 2015 at 12:09:56PM +0100, Christoph Hellwig wrote:
> Add support for the GETDEVICEINFO, LAYOUTGET, LAYOUTCOMMIT and
> LAYOUTRETURN NFSv4.1 operations, as well as backing code to manage
> outstanding layouts and devices.
> 
> Layout management is very straightforward, with a nfs4_layout_stateid
> structure that extends nfs4_stid to manage layout stateids as the
> top-level structure.  It is linked into the nfs4_file and nfs4_client
> structures like the other stateids, and contains a linked list of
> layouts that hang off the stateid.  The actual layout operations are
> implemented in layout drivers that are not part of this commit, but
> will be added later.
> 
> The worst part of this commit is the management of the pNFS device IDs,
> which suffers from a specification that is not sanely implementable due
> to the fact that the device-IDs are global and not bound to an export,
> and have a small enough size so that we can't store the fsid portion of
> a file handle, and must never be reused.  As we still do need to perform all
> export authentication and validation checks on a device ID passed to
> GETDEVICEINFO we are caught between a rock and a hard place.  To work
> around this issue we add a new hash that maps from a 64-bit integer to a
> fsid so that we can look up the export to authenticate against it,
> a 32-bit integer as a generation that we can bump when changing the device,
> and a currently unused 32-bit integer that could be used in the future
> to handle more than a single device per export.  Entries in this hash
> table are never deleted as we can't reuse the ids anyway, and would have
> a severe lifetime problem anyway as Linux export structures are temporary
> structures that can go away under load.

Looks to me like that works.
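
For readers following along, here is a rough sketch of the device ID layout the commit message describes: a 64-bit key into the id-to-fsid hash, a 32-bit generation, and a 32-bit unused field, which together fill the 16 bytes of an NFSv4.1 deviceid4. All names here (sketch_deviceid and its helpers) are made up for illustration and are not taken from the patch, and the packing uses host byte order purely for simplicity.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_DEVICEID_SIZE 16	/* a deviceid4 is 16 opaque bytes */

/* Hypothetical layout; field names are assumptions, not the patch's. */
struct sketch_deviceid {
	uint64_t fsid_idx;	/* key into the id -> fsid hash; never reused */
	uint32_t generation;	/* bumped when the device changes */
	uint32_t pad;		/* unused; could select a device per export */
};

_Static_assert(sizeof(struct sketch_deviceid) == SKETCH_DEVICEID_SIZE,
	       "sketch device ID must fill the opaque device ID exactly");

/* Copy the structured ID into the opaque byte array (host byte order). */
static void sketch_deviceid_pack(unsigned char out[SKETCH_DEVICEID_SIZE],
				 const struct sketch_deviceid *id)
{
	memcpy(out, id, sizeof(*id));
}

/* Recover the structured ID from the opaque byte array. */
static void sketch_deviceid_unpack(struct sketch_deviceid *id,
				   const unsigned char in[SKETCH_DEVICEID_SIZE])
{
	memcpy(id, in, sizeof(*id));
}

void sketch_deviceid_example(void)
{
	struct sketch_deviceid in = { .fsid_idx = 42, .generation = 3, .pad = 0 };
	struct sketch_deviceid out;
	unsigned char wire[SKETCH_DEVICEID_SIZE];

	sketch_deviceid_pack(wire, &in);
	sketch_deviceid_unpack(&out, wire);

	/* The round trip preserves the hash key and the generation. */
	assert(out.fsid_idx == 42);
	assert(out.generation == 3);
}
```

The point of the sketch is only that the fsid itself never travels in the device ID; GETDEVICEINFO would have to go through the hash to find the export to check against.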

...
> diff --git a/fs/nfsd/nfs4layouts.c b/fs/nfsd/nfs4layouts.c
> new file mode 100644
> index 0000000..28c8ff2
> --- /dev/null
> +++ b/fs/nfsd/nfs4layouts.c
> @@ -0,0 +1,488 @@
...
> +__be32
> +nfsd4_preprocess_layout_stateid(struct svc_rqst *rqstp,
> +		struct nfsd4_compound_state *cstate, stateid_t *stateid,
> +		bool create, u32 layout_type, struct nfs4_layout_stateid **lsp)
> +{
> +	struct nfs4_layout_stateid *ls;
> +	struct nfs4_stid *stid;
> +	unsigned char typemask = NFS4_LAYOUT_STID;
> +	__be32 status;
> +
> +	if (create)
> +		typemask |= (NFS4_OPEN_STID | NFS4_LOCK_STID | NFS4_DELEG_STID);
> +
> +	status = nfsd4_lookup_stateid(cstate, stateid, typemask, &stid,
> +			net_generic(SVC_NET(rqstp), nfsd_net_id));
> +	if (status)
> +		goto out;
> +
> +	if (!fh_match(&cstate->current_fh.fh_handle,
> +		      &stid->sc_file->fi_fhandle)) {
> +		status = nfserr_bad_stateid;
> +		goto out_put_stid;
> +	}
> +
> +	if (stid->sc_type != NFS4_LAYOUT_STID) {
> +		ls = nfsd4_alloc_layout_stateid(cstate, stid, layout_type);
> +		nfs4_put_stid(stid);
> +
> +		status = nfserr_jukebox;
> +		if (!ls)
> +			goto out;
> +	} else {
> +		ls = container_of(stid, struct nfs4_layout_stateid, ls_stid);
> +
> +		status = nfserr_bad_stateid;
> +		if (stateid->si_generation > stid->sc_stateid.si_generation)
> +			goto out_put_stid;

Is there no old_stateid case for layout stateids?  And is there any
chance of wraparound?  (I was comparing to check_stateid_generation and
expecting the only difference to be the handling of the generation-zero
case.)
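
To make the wraparound concern concrete, here is a small standalone sketch contrasting the plain ">" comparison from the hunk above with the usual circular comparison for a 32-bit counter. The helper names are hypothetical, and I'm not claiming this is exactly what check_stateid_generation does; it only illustrates the idiom.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: true if generation `a` is newer than `b` when the
 * 32-bit counter is treated as circular.  The subtraction is done in
 * unsigned arithmetic and the result reinterpreted as signed, which
 * behaves as two's complement on the compilers the kernel supports.
 */
static bool generation_after(uint32_t a, uint32_t b)
{
	return (int32_t)(a - b) > 0;
}

/* The plain comparison, as in the quoted hunk, for contrast. */
static bool generation_after_naive(uint32_t a, uint32_t b)
{
	return a > b;
}

void generation_example(void)
{
	/* No wraparound: both forms agree. */
	assert(generation_after(6, 5));
	assert(generation_after_naive(6, 5));

	/* Counter wrapped past 0xffffffff: only the circular form sees 1 as newer. */
	assert(generation_after(1, UINT32_MAX));
	assert(!generation_after_naive(1, UINT32_MAX));
}
```

With the plain comparison, a client-supplied generation from just after a wrap would be accepted as "not newer" than a server generation from just before it.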

> +		if (layout_type != ls->ls_layout_type)
> +			goto out_put_stid;
> +	}
> +
> +	*lsp = ls;
> +	return 0;
> +
> +out_put_stid:
> +	nfs4_put_stid(stid);
> +out:
> +	return status;
> +}
> +
> +static inline u64
> +layout_end(struct nfsd4_layout_seg *seg)
> +{
> +	u64 end = seg->offset + seg->length;
> +	return end >= seg->offset ? seg->length : NFS4_MAX_UINT64;

Shouldn't that be

	return end >= seg->offset ? end : NFS4_MAX_UINT64;

?
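
A quick userspace sketch of the difference, in case it helps. The struct and the SKETCH_MAX_UINT64 constant are stand-ins (I'm assuming NFS4_MAX_UINT64 is the all-ones 64-bit value), and the function names are made up for the comparison.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for NFS4_MAX_UINT64; assumed to be the all-ones 64-bit value. */
#define SKETCH_MAX_UINT64 UINT64_MAX

/* Minimal stand-in for struct nfsd4_layout_seg, only the fields used here. */
struct sketch_layout_seg {
	uint64_t offset;
	uint64_t length;
};

/* As posted: hands back the length, not the end, when nothing overflows. */
static uint64_t layout_end_as_posted(const struct sketch_layout_seg *seg)
{
	uint64_t end = seg->offset + seg->length;
	return end >= seg->offset ? seg->length : SKETCH_MAX_UINT64;
}

/* As suggested: hands back the end offset, saturating on overflow. */
static uint64_t layout_end_suggested(const struct sketch_layout_seg *seg)
{
	uint64_t end = seg->offset + seg->length;
	return end >= seg->offset ? end : SKETCH_MAX_UINT64;
}

void layout_end_example(void)
{
	struct sketch_layout_seg seg = { .offset = 4096, .length = 8192 };
	struct sketch_layout_seg huge = { .offset = 4096, .length = UINT64_MAX };

	assert(layout_end_as_posted(&seg) == 8192);	/* the length */
	assert(layout_end_suggested(&seg) == 12288);	/* offset + length */

	/* offset + length wraps, so the end saturates at the maximum. */
	assert(layout_end_suggested(&huge) == SKETCH_MAX_UINT64);
}
```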

> +}
> +
> +static void
> +layout_update_len(struct nfsd4_layout_seg *lo, u64 end)
> +{
> +	if (end == NFS4_MAX_UINT64)
> +		lo->length = NFS4_MAX_UINT64;

Is this case necessary?
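
For what it's worth, here is what the special case changes, as a standalone sketch with stand-in types and hypothetical function names. The two variants only differ when the end is the maximum value and the offset is nonzero, so the answer presumably depends on whether callers rely on an all-ones length as an "until end of file" marker.

```c
#include <assert.h>
#include <stdint.h>

/* Minimal stand-in for struct nfsd4_layout_seg, only the fields used here. */
struct sketch_seg {
	uint64_t offset;
	uint64_t length;
};

/* Mirrors the quoted function: a maximal end keeps a maximal length. */
static void update_len_with_case(struct sketch_seg *lo, uint64_t end)
{
	if (end == UINT64_MAX)
		lo->length = UINT64_MAX;
	else
		lo->length = end - lo->offset;
}

/* The same function with the special case dropped. */
static void update_len_without_case(struct sketch_seg *lo, uint64_t end)
{
	lo->length = end - lo->offset;
}

void update_len_example(void)
{
	struct sketch_seg a = { .offset = 4096, .length = 0 };
	struct sketch_seg b = { .offset = 4096, .length = 0 };

	update_len_with_case(&a, UINT64_MAX);
	update_len_without_case(&b, UINT64_MAX);

	/* With the case, the all-ones length survives. */
	assert(a.length == UINT64_MAX);
	/* Without it, the length becomes the distance from offset to the maximum. */
	assert(b.length == UINT64_MAX - 4096);

	/* For any other end, the two variants agree. */
	update_len_with_case(&a, 12288);
	update_len_without_case(&b, 12288);
	assert(a.length == 8192 && b.length == 8192);
}
```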

> +	else
> +		lo->length = end - lo->offset;
> +}
...
> diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
> index ac71d13..b813913 100644
> --- a/fs/nfsd/nfs4proc.c
> +++ b/fs/nfsd/nfs4proc.c
...
> @@ -1966,6 +2213,25 @@ static struct nfsd4_operation nfsd4_ops[] = {
>  		.op_get_currentstateid = (stateid_getter)nfsd4_get_freestateid,
>  		.op_rsize_bop = (nfsd4op_rsize)nfsd4_only_status_rsize,
>  	},
> +#ifdef CONFIG_NFSD_PNFS
> +	[OP_GETDEVICEINFO] = {
> +		.op_func = (nfsd4op_func)nfsd4_getdeviceinfo,
> +		.op_flags = ALLOWED_WITHOUT_FH,
> +		.op_name = "OP_GETDEVICEINFO",
> +	},
> +	[OP_LAYOUTGET] = {
> +		.op_func = (nfsd4op_func)nfsd4_layoutget,
> +		.op_name = "OP_LAYOUTGET",
> +	},
> +	[OP_LAYOUTCOMMIT] = {
> +		.op_func = (nfsd4op_func)nfsd4_layoutcommit,
> +		.op_name = "OP_LAYOUTCOMMIT",
> +	},
> +	[OP_LAYOUTRETURN] = {
> +		.op_func = (nfsd4op_func)nfsd4_layoutreturn,
> +		.op_name = "OP_LAYOUTRETURN",
> +	},

Should any of these have OP_MODIFIES_SOMETHING set?  (Basically: would
we be in trouble if we successfully completed one of these operations and
then weren't able to encode the result?)

--b.

