ceph-devel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2] ceph: don't allow access to MDS-private inodes
@ 2021-04-20 14:06 Jeff Layton
  2021-04-21  0:05 ` Xiubo Li
  2021-04-21 16:22 ` Luis Henriques
  0 siblings, 2 replies; 7+ messages in thread
From: Jeff Layton @ 2021-04-20 14:06 UTC (permalink / raw)
  To: ceph-devel; +Cc: xiubli, pdonnell, idryomov

The MDS reserves a set of inodes for its own usage, and these should
never be accessible to clients. Add a new helper to vet a proposed
inode number against that range, and complain loudly and refuse to
create or look it up if it's in it. We do need to carve out an exception
for the root and the lost+found directories.

Also, ensure that the MDS doesn't try to delegate that range to us
either. Print a warning if it does, and don't save the range in the
xarray.

URL: https://tracker.ceph.com/issues/49922
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Xiubo Li<xiubli@redhat.com>
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
---
 fs/ceph/export.c             |  8 ++++++++
 fs/ceph/inode.c              |  3 +++
 fs/ceph/mds_client.c         |  7 +++++++
 fs/ceph/super.h              | 24 ++++++++++++++++++++++++
 include/linux/ceph/ceph_fs.h |  7 ++++---
 5 files changed, 46 insertions(+), 3 deletions(-)

v2: allow lookups of lost+found dir inodes
    flesh out and update the CEPH_INO_* definitions

diff --git a/fs/ceph/export.c b/fs/ceph/export.c
index 17d8c8f4ec89..65540a4429b2 100644
--- a/fs/ceph/export.c
+++ b/fs/ceph/export.c
@@ -129,6 +129,10 @@ static struct inode *__lookup_inode(struct super_block *sb, u64 ino)
 
 	vino.ino = ino;
 	vino.snap = CEPH_NOSNAP;
+
+	if (ceph_vino_is_reserved(vino))
+		return ERR_PTR(-ESTALE);
+
 	inode = ceph_find_inode(sb, vino);
 	if (!inode) {
 		struct ceph_mds_request *req;
@@ -214,6 +218,10 @@ static struct dentry *__snapfh_to_dentry(struct super_block *sb,
 		vino.ino = sfh->ino;
 		vino.snap = sfh->snapid;
 	}
+
+	if (ceph_vino_is_reserved(vino))
+		return ERR_PTR(-ESTALE);
+
 	inode = ceph_find_inode(sb, vino);
 	if (inode)
 		return d_obtain_alias(inode);
diff --git a/fs/ceph/inode.c b/fs/ceph/inode.c
index 14a1f7963625..e1c63adb196d 100644
--- a/fs/ceph/inode.c
+++ b/fs/ceph/inode.c
@@ -56,6 +56,9 @@ struct inode *ceph_get_inode(struct super_block *sb, struct ceph_vino vino)
 {
 	struct inode *inode;
 
+	if (ceph_vino_is_reserved(vino))
+		return ERR_PTR(-EREMOTEIO);
+
 	inode = iget5_locked(sb, (unsigned long)vino.ino, ceph_ino_compare,
 			     ceph_set_ino_cb, &vino);
 	if (!inode)
diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c
index 63b53098360c..e5af591d3bd4 100644
--- a/fs/ceph/mds_client.c
+++ b/fs/ceph/mds_client.c
@@ -440,6 +440,13 @@ static int ceph_parse_deleg_inos(void **p, void *end,
 
 		ceph_decode_64_safe(p, end, start, bad);
 		ceph_decode_64_safe(p, end, len, bad);
+
+		/* Don't accept a delegation of system inodes */
+		if (start < CEPH_INO_SYSTEM_BASE) {
+			pr_warn_ratelimited("ceph: ignoring reserved inode range delegation (start=0x%llx len=0x%llx)\n",
+					start, len);
+			continue;
+		}
 		while (len--) {
 			int err = xa_insert(&s->s_delegated_inos, ino = start++,
 					    DELEGATED_INO_AVAILABLE,
diff --git a/fs/ceph/super.h b/fs/ceph/super.h
index df0851b9240e..f1745403c9b0 100644
--- a/fs/ceph/super.h
+++ b/fs/ceph/super.h
@@ -529,10 +529,34 @@ static inline int ceph_ino_compare(struct inode *inode, void *data)
 		ci->i_vino.snap == pvino->snap;
 }
 
+/*
+ * The MDS reserves a set of inodes for its own usage. These should never
+ * be accessible by clients, and so the MDS has no reason to ever hand these
+ * out.
+ *
+ * These come from src/mds/mdstypes.h in the ceph sources.
+ */
+#define CEPH_MAX_MDS		0x100
+#define CEPH_NUM_STRAY		10
+#define CEPH_INO_SYSTEM_BASE	((6*CEPH_MAX_MDS) + (CEPH_MAX_MDS * CEPH_NUM_STRAY))
+
+static inline bool ceph_vino_is_reserved(const struct ceph_vino vino)
+{
+	if (vino.ino < CEPH_INO_SYSTEM_BASE &&
+	    vino.ino != CEPH_INO_ROOT &&
+	    vino.ino != CEPH_INO_LOST_AND_FOUND) {
+		WARN_RATELIMIT(1, "Attempt to access reserved inode number 0x%llx", vino.ino);
+		return true;
+	}
+	return false;
+}
 
 static inline struct inode *ceph_find_inode(struct super_block *sb,
 					    struct ceph_vino vino)
 {
+	if (ceph_vino_is_reserved(vino))
+		return NULL;
+
 	/*
 	 * NB: The hashval will be run through the fs/inode.c hash function
 	 * anyway, so there is no need to squash the inode number down to
diff --git a/include/linux/ceph/ceph_fs.h b/include/linux/ceph/ceph_fs.h
index e41a811026f6..3c90ae21a7e1 100644
--- a/include/linux/ceph/ceph_fs.h
+++ b/include/linux/ceph/ceph_fs.h
@@ -27,9 +27,10 @@
 #define CEPH_MONC_PROTOCOL   15 /* server/client */
 
 
-#define CEPH_INO_ROOT   1
-#define CEPH_INO_CEPH   2       /* hidden .ceph dir */
-#define CEPH_INO_DOTDOT 3	/* used by ceph fuse for parent (..) */
+#define CEPH_INO_ROOT			1 /* root inode number for all cephfs's */
+#define CEPH_INO_CEPH			2 /* hidden .ceph dir */
+#define CEPH_INO_GLOBAL_SNAPREALM	3 /* includes all snapshots in the fs */
+#define CEPH_INO_LOST_AND_FOUND		4 /* lost+found dir */
 
 /* arbitrary limit on max # of monitors (cluster of 3 is typical) */
 #define CEPH_MAX_MON   31
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH v2] ceph: don't allow access to MDS-private inodes
  2021-04-20 14:06 [PATCH v2] ceph: don't allow access to MDS-private inodes Jeff Layton
@ 2021-04-21  0:05 ` Xiubo Li
  2021-04-21 16:22 ` Luis Henriques
  1 sibling, 0 replies; 7+ messages in thread
From: Xiubo Li @ 2021-04-21  0:05 UTC (permalink / raw)
  To: Jeff Layton, ceph-devel; +Cc: pdonnell, idryomov

On 2021/4/20 22:06, Jeff Layton wrote:
> The MDS reserves a set of inodes for its own usage, and these should
> never be accessible to clients. Add a new helper to vet a proposed
> inode number against that range, and complain loudly and refuse to
> create or look it up if it's in it. We do need to carve out an exception
> for the root and the lost+found directories.
>
> Also, ensure that the MDS doesn't try to delegate that range to us
> either. Print a warning if it does, and don't save the range in the
> xarray.
>
> URL: https://tracker.ceph.com/issues/49922
> Signed-off-by: Jeff Layton <jlayton@kernel.org>
> Signed-off-by: Xiubo Li<xiubli@redhat.com>
> Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
> ---
>   fs/ceph/export.c             |  8 ++++++++
>   fs/ceph/inode.c              |  3 +++
>   fs/ceph/mds_client.c         |  7 +++++++
>   fs/ceph/super.h              | 24 ++++++++++++++++++++++++
>   include/linux/ceph/ceph_fs.h |  7 ++++---
>   5 files changed, 46 insertions(+), 3 deletions(-)
>
> v2: allow lookups of lost+found dir inodes
>      flesh out and update the CEPH_INO_* definitions
>
> diff --git a/fs/ceph/export.c b/fs/ceph/export.c
> index 17d8c8f4ec89..65540a4429b2 100644
> --- a/fs/ceph/export.c
> +++ b/fs/ceph/export.c
> @@ -129,6 +129,10 @@ static struct inode *__lookup_inode(struct super_block *sb, u64 ino)
>   
>   	vino.ino = ino;
>   	vino.snap = CEPH_NOSNAP;
> +
> +	if (ceph_vino_is_reserved(vino))
> +		return ERR_PTR(-ESTALE);
> +
>   	inode = ceph_find_inode(sb, vino);
>   	if (!inode) {
>   		struct ceph_mds_request *req;
> @@ -214,6 +218,10 @@ static struct dentry *__snapfh_to_dentry(struct super_block *sb,
>   		vino.ino = sfh->ino;
>   		vino.snap = sfh->snapid;
>   	}
> +
> +	if (ceph_vino_is_reserved(vino))
> +		return ERR_PTR(-ESTALE);
> +
>   	inode = ceph_find_inode(sb, vino);
>   	if (inode)
>   		return d_obtain_alias(inode);
> diff --git a/fs/ceph/inode.c b/fs/ceph/inode.c
> index 14a1f7963625..e1c63adb196d 100644
> --- a/fs/ceph/inode.c
> +++ b/fs/ceph/inode.c
> @@ -56,6 +56,9 @@ struct inode *ceph_get_inode(struct super_block *sb, struct ceph_vino vino)
>   {
>   	struct inode *inode;
>   
> +	if (ceph_vino_is_reserved(vino))
> +		return ERR_PTR(-EREMOTEIO);
> +
>   	inode = iget5_locked(sb, (unsigned long)vino.ino, ceph_ino_compare,
>   			     ceph_set_ino_cb, &vino);
>   	if (!inode)
> diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c
> index 63b53098360c..e5af591d3bd4 100644
> --- a/fs/ceph/mds_client.c
> +++ b/fs/ceph/mds_client.c
> @@ -440,6 +440,13 @@ static int ceph_parse_deleg_inos(void **p, void *end,
>   
>   		ceph_decode_64_safe(p, end, start, bad);
>   		ceph_decode_64_safe(p, end, len, bad);
> +
> +		/* Don't accept a delegation of system inodes */
> +		if (start < CEPH_INO_SYSTEM_BASE) {
> +			pr_warn_ratelimited("ceph: ignoring reserved inode range delegation (start=0x%llx len=0x%llx)\n",
> +					start, len);
> +			continue;
> +		}
>   		while (len--) {
>   			int err = xa_insert(&s->s_delegated_inos, ino = start++,
>   					    DELEGATED_INO_AVAILABLE,
> diff --git a/fs/ceph/super.h b/fs/ceph/super.h
> index df0851b9240e..f1745403c9b0 100644
> --- a/fs/ceph/super.h
> +++ b/fs/ceph/super.h
> @@ -529,10 +529,34 @@ static inline int ceph_ino_compare(struct inode *inode, void *data)
>   		ci->i_vino.snap == pvino->snap;
>   }
>   
> +/*
> + * The MDS reserves a set of inodes for its own usage. These should never
> + * be accessible by clients, and so the MDS has no reason to ever hand these
> + * out.
> + *
> + * These come from src/mds/mdstypes.h in the ceph sources.
> + */
> +#define CEPH_MAX_MDS		0x100
> +#define CEPH_NUM_STRAY		10
> +#define CEPH_INO_SYSTEM_BASE	((6*CEPH_MAX_MDS) + (CEPH_MAX_MDS * CEPH_NUM_STRAY))
> +
> +static inline bool ceph_vino_is_reserved(const struct ceph_vino vino)
> +{
> +	if (vino.ino < CEPH_INO_SYSTEM_BASE &&
> +	    vino.ino != CEPH_INO_ROOT &&
> +	    vino.ino != CEPH_INO_LOST_AND_FOUND) {
> +		WARN_RATELIMIT(1, "Attempt to access reserved inode number 0x%llx", vino.ino);
> +		return true;
> +	}
> +	return false;
> +}
>   
>   static inline struct inode *ceph_find_inode(struct super_block *sb,
>   					    struct ceph_vino vino)
>   {
> +	if (ceph_vino_is_reserved(vino))
> +		return NULL;
> +
>   	/*
>   	 * NB: The hashval will be run through the fs/inode.c hash function
>   	 * anyway, so there is no need to squash the inode number down to
> diff --git a/include/linux/ceph/ceph_fs.h b/include/linux/ceph/ceph_fs.h
> index e41a811026f6..3c90ae21a7e1 100644
> --- a/include/linux/ceph/ceph_fs.h
> +++ b/include/linux/ceph/ceph_fs.h
> @@ -27,9 +27,10 @@
>   #define CEPH_MONC_PROTOCOL   15 /* server/client */
>   
>   
> -#define CEPH_INO_ROOT   1
> -#define CEPH_INO_CEPH   2       /* hidden .ceph dir */
> -#define CEPH_INO_DOTDOT 3	/* used by ceph fuse for parent (..) */
> +#define CEPH_INO_ROOT			1 /* root inode number for all cephfs's */
> +#define CEPH_INO_CEPH			2 /* hidden .ceph dir */
> +#define CEPH_INO_GLOBAL_SNAPREALM	3 /* includes all snapshots in the fs */
> +#define CEPH_INO_LOST_AND_FOUND		4 /* lost+found dir */
>   
>   /* arbitrary limit on max # of monitors (cluster of 3 is typical) */
>   #define CEPH_MAX_MON   31

LTGM. Thanks.


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v2] ceph: don't allow access to MDS-private inodes
  2021-04-20 14:06 [PATCH v2] ceph: don't allow access to MDS-private inodes Jeff Layton
  2021-04-21  0:05 ` Xiubo Li
@ 2021-04-21 16:22 ` Luis Henriques
  2021-04-21 16:48   ` Jeff Layton
  1 sibling, 1 reply; 7+ messages in thread
From: Luis Henriques @ 2021-04-21 16:22 UTC (permalink / raw)
  To: Jeff Layton; +Cc: ceph-devel, xiubli, pdonnell, idryomov

Jeff Layton <jlayton@kernel.org> writes:

> The MDS reserves a set of inodes for its own usage, and these should
> never be accessible to clients. Add a new helper to vet a proposed
> inode number against that range, and complain loudly and refuse to
> create or look it up if it's in it. We do need to carve out an exception
> for the root and the lost+found directories.
>
> Also, ensure that the MDS doesn't try to delegate that range to us
> either. Print a warning if it does, and don't save the range in the
> xarray.
>
> URL: https://tracker.ceph.com/issues/49922
> Signed-off-by: Jeff Layton <jlayton@kernel.org>
> Signed-off-by: Xiubo Li<xiubli@redhat.com>
> Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
> ---
>  fs/ceph/export.c             |  8 ++++++++
>  fs/ceph/inode.c              |  3 +++
>  fs/ceph/mds_client.c         |  7 +++++++
>  fs/ceph/super.h              | 24 ++++++++++++++++++++++++
>  include/linux/ceph/ceph_fs.h |  7 ++++---
>  5 files changed, 46 insertions(+), 3 deletions(-)
>
> v2: allow lookups of lost+found dir inodes
>     flesh out and update the CEPH_INO_* definitions
>
> diff --git a/fs/ceph/export.c b/fs/ceph/export.c
> index 17d8c8f4ec89..65540a4429b2 100644
> --- a/fs/ceph/export.c
> +++ b/fs/ceph/export.c
> @@ -129,6 +129,10 @@ static struct inode *__lookup_inode(struct super_block *sb, u64 ino)
>  
>  	vino.ino = ino;
>  	vino.snap = CEPH_NOSNAP;
> +
> +	if (ceph_vino_is_reserved(vino))
> +		return ERR_PTR(-ESTALE);
> +
>  	inode = ceph_find_inode(sb, vino);
>  	if (!inode) {
>  		struct ceph_mds_request *req;
> @@ -214,6 +218,10 @@ static struct dentry *__snapfh_to_dentry(struct super_block *sb,
>  		vino.ino = sfh->ino;
>  		vino.snap = sfh->snapid;
>  	}
> +
> +	if (ceph_vino_is_reserved(vino))
> +		return ERR_PTR(-ESTALE);
> +
>  	inode = ceph_find_inode(sb, vino);
>  	if (inode)
>  		return d_obtain_alias(inode);
> diff --git a/fs/ceph/inode.c b/fs/ceph/inode.c
> index 14a1f7963625..e1c63adb196d 100644
> --- a/fs/ceph/inode.c
> +++ b/fs/ceph/inode.c
> @@ -56,6 +56,9 @@ struct inode *ceph_get_inode(struct super_block *sb, struct ceph_vino vino)
>  {
>  	struct inode *inode;
>  
> +	if (ceph_vino_is_reserved(vino))
> +		return ERR_PTR(-EREMOTEIO);
> +
>  	inode = iget5_locked(sb, (unsigned long)vino.ino, ceph_ino_compare,
>  			     ceph_set_ino_cb, &vino);
>  	if (!inode)
> diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c
> index 63b53098360c..e5af591d3bd4 100644
> --- a/fs/ceph/mds_client.c
> +++ b/fs/ceph/mds_client.c
> @@ -440,6 +440,13 @@ static int ceph_parse_deleg_inos(void **p, void *end,
>  
>  		ceph_decode_64_safe(p, end, start, bad);
>  		ceph_decode_64_safe(p, end, len, bad);
> +
> +		/* Don't accept a delegation of system inodes */
> +		if (start < CEPH_INO_SYSTEM_BASE) {
> +			pr_warn_ratelimited("ceph: ignoring reserved inode range delegation (start=0x%llx len=0x%llx)\n",
> +					start, len);
> +			continue;
> +		}
>  		while (len--) {
>  			int err = xa_insert(&s->s_delegated_inos, ino = start++,
>  					    DELEGATED_INO_AVAILABLE,
> diff --git a/fs/ceph/super.h b/fs/ceph/super.h
> index df0851b9240e..f1745403c9b0 100644
> --- a/fs/ceph/super.h
> +++ b/fs/ceph/super.h
> @@ -529,10 +529,34 @@ static inline int ceph_ino_compare(struct inode *inode, void *data)
>  		ci->i_vino.snap == pvino->snap;
>  }
>  
> +/*
> + * The MDS reserves a set of inodes for its own usage. These should never
> + * be accessible by clients, and so the MDS has no reason to ever hand these
> + * out.
> + *
> + * These come from src/mds/mdstypes.h in the ceph sources.
> + */
> +#define CEPH_MAX_MDS		0x100
> +#define CEPH_NUM_STRAY		10
> +#define CEPH_INO_SYSTEM_BASE	((6*CEPH_MAX_MDS) + (CEPH_MAX_MDS * CEPH_NUM_STRAY))
> +
> +static inline bool ceph_vino_is_reserved(const struct ceph_vino vino)
> +{
> +	if (vino.ino < CEPH_INO_SYSTEM_BASE &&
> +	    vino.ino != CEPH_INO_ROOT &&
> +	    vino.ino != CEPH_INO_LOST_AND_FOUND) {
> +		WARN_RATELIMIT(1, "Attempt to access reserved inode number 0x%llx", vino.ino);

This warning is quite easy to hit with inode 0x3, using generic/013.  It
looks like, while looking for the snaprealm to check quotas, we will
eventually get this value from the MDS.  So this function should also
return 'true' if ino is CEPH_INO_GLOBAL_SNAPREALM.

Cheers,
-- 
Luis

> +		return true;
> +	}
> +	return false;
> +}
>  
>  static inline struct inode *ceph_find_inode(struct super_block *sb,
>  					    struct ceph_vino vino)
>  {
> +	if (ceph_vino_is_reserved(vino))
> +		return NULL;
> +
>  	/*
>  	 * NB: The hashval will be run through the fs/inode.c hash function
>  	 * anyway, so there is no need to squash the inode number down to
> diff --git a/include/linux/ceph/ceph_fs.h b/include/linux/ceph/ceph_fs.h
> index e41a811026f6..3c90ae21a7e1 100644
> --- a/include/linux/ceph/ceph_fs.h
> +++ b/include/linux/ceph/ceph_fs.h
> @@ -27,9 +27,10 @@
>  #define CEPH_MONC_PROTOCOL   15 /* server/client */
>  
>  
> -#define CEPH_INO_ROOT   1
> -#define CEPH_INO_CEPH   2       /* hidden .ceph dir */
> -#define CEPH_INO_DOTDOT 3	/* used by ceph fuse for parent (..) */
> +#define CEPH_INO_ROOT			1 /* root inode number for all cephfs's */
> +#define CEPH_INO_CEPH			2 /* hidden .ceph dir */
> +#define CEPH_INO_GLOBAL_SNAPREALM	3 /* includes all snapshots in the fs */
> +#define CEPH_INO_LOST_AND_FOUND		4 /* lost+found dir */
>  
>  /* arbitrary limit on max # of monitors (cluster of 3 is typical) */
>  #define CEPH_MAX_MON   31
> -- 
>
> 2.31.1
>

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v2] ceph: don't allow access to MDS-private inodes
  2021-04-21 16:22 ` Luis Henriques
@ 2021-04-21 16:48   ` Jeff Layton
  2021-04-21 17:46     ` Luis Henriques
  0 siblings, 1 reply; 7+ messages in thread
From: Jeff Layton @ 2021-04-21 16:48 UTC (permalink / raw)
  To: Luis Henriques; +Cc: ceph-devel, xiubli, pdonnell, idryomov

On Wed, 2021-04-21 at 17:22 +0100, Luis Henriques wrote:
> Jeff Layton <jlayton@kernel.org> writes:
> 
> > The MDS reserves a set of inodes for its own usage, and these should
> > never be accessible to clients. Add a new helper to vet a proposed
> > inode number against that range, and complain loudly and refuse to
> > create or look it up if it's in it. We do need to carve out an exception
> > for the root and the lost+found directories.
> > 
> > Also, ensure that the MDS doesn't try to delegate that range to us
> > either. Print a warning if it does, and don't save the range in the
> > xarray.
> > 
> > URL: https://tracker.ceph.com/issues/49922
> > Signed-off-by: Jeff Layton <jlayton@kernel.org>
> > Signed-off-by: Xiubo Li<xiubli@redhat.com>
> > Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
> > ---
> >  fs/ceph/export.c             |  8 ++++++++
> >  fs/ceph/inode.c              |  3 +++
> >  fs/ceph/mds_client.c         |  7 +++++++
> >  fs/ceph/super.h              | 24 ++++++++++++++++++++++++
> >  include/linux/ceph/ceph_fs.h |  7 ++++---
> >  5 files changed, 46 insertions(+), 3 deletions(-)
> > 
> > v2: allow lookups of lost+found dir inodes
> >     flesh out and update the CEPH_INO_* definitions
> > 
> > diff --git a/fs/ceph/export.c b/fs/ceph/export.c
> > index 17d8c8f4ec89..65540a4429b2 100644
> > --- a/fs/ceph/export.c
> > +++ b/fs/ceph/export.c
> > @@ -129,6 +129,10 @@ static struct inode *__lookup_inode(struct super_block *sb, u64 ino)
> >  
> >  	vino.ino = ino;
> >  	vino.snap = CEPH_NOSNAP;
> > +
> > +	if (ceph_vino_is_reserved(vino))
> > +		return ERR_PTR(-ESTALE);
> > +
> >  	inode = ceph_find_inode(sb, vino);
> >  	if (!inode) {
> >  		struct ceph_mds_request *req;
> > @@ -214,6 +218,10 @@ static struct dentry *__snapfh_to_dentry(struct super_block *sb,
> >  		vino.ino = sfh->ino;
> >  		vino.snap = sfh->snapid;
> >  	}
> > +
> > +	if (ceph_vino_is_reserved(vino))
> > +		return ERR_PTR(-ESTALE);
> > +
> >  	inode = ceph_find_inode(sb, vino);
> >  	if (inode)
> >  		return d_obtain_alias(inode);
> > diff --git a/fs/ceph/inode.c b/fs/ceph/inode.c
> > index 14a1f7963625..e1c63adb196d 100644
> > --- a/fs/ceph/inode.c
> > +++ b/fs/ceph/inode.c
> > @@ -56,6 +56,9 @@ struct inode *ceph_get_inode(struct super_block *sb, struct ceph_vino vino)
> >  {
> >  	struct inode *inode;
> >  
> > +	if (ceph_vino_is_reserved(vino))
> > +		return ERR_PTR(-EREMOTEIO);
> > +
> >  	inode = iget5_locked(sb, (unsigned long)vino.ino, ceph_ino_compare,
> >  			     ceph_set_ino_cb, &vino);
> >  	if (!inode)
> > diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c
> > index 63b53098360c..e5af591d3bd4 100644
> > --- a/fs/ceph/mds_client.c
> > +++ b/fs/ceph/mds_client.c
> > @@ -440,6 +440,13 @@ static int ceph_parse_deleg_inos(void **p, void *end,
> >  
> >  		ceph_decode_64_safe(p, end, start, bad);
> >  		ceph_decode_64_safe(p, end, len, bad);
> > +
> > +		/* Don't accept a delegation of system inodes */
> > +		if (start < CEPH_INO_SYSTEM_BASE) {
> > +			pr_warn_ratelimited("ceph: ignoring reserved inode range delegation (start=0x%llx len=0x%llx)\n",
> > +					start, len);
> > +			continue;
> > +		}
> >  		while (len--) {
> >  			int err = xa_insert(&s->s_delegated_inos, ino = start++,
> >  					    DELEGATED_INO_AVAILABLE,
> > diff --git a/fs/ceph/super.h b/fs/ceph/super.h
> > index df0851b9240e..f1745403c9b0 100644
> > --- a/fs/ceph/super.h
> > +++ b/fs/ceph/super.h
> > @@ -529,10 +529,34 @@ static inline int ceph_ino_compare(struct inode *inode, void *data)
> >  		ci->i_vino.snap == pvino->snap;
> >  }
> >  
> > +/*
> > + * The MDS reserves a set of inodes for its own usage. These should never
> > + * be accessible by clients, and so the MDS has no reason to ever hand these
> > + * out.
> > + *
> > + * These come from src/mds/mdstypes.h in the ceph sources.
> > + */
> > +#define CEPH_MAX_MDS		0x100
> > +#define CEPH_NUM_STRAY		10
> > +#define CEPH_INO_SYSTEM_BASE	((6*CEPH_MAX_MDS) + (CEPH_MAX_MDS * CEPH_NUM_STRAY))
> > +
> > +static inline bool ceph_vino_is_reserved(const struct ceph_vino vino)
> > +{
> > +	if (vino.ino < CEPH_INO_SYSTEM_BASE &&
> > +	    vino.ino != CEPH_INO_ROOT &&
> > +	    vino.ino != CEPH_INO_LOST_AND_FOUND) {
> > +		WARN_RATELIMIT(1, "Attempt to access reserved inode number 0x%llx", vino.ino);
> 
> This warning is quite easy to hit with inode 0x3, using generic/013.  It
> looks like, while looking for the snaprealm to check quotas, we will
> eventually get this value from the MDS.  So this function should also
> return 'true' if ino is CEPH_INO_GLOBAL_SNAPREALM.
> 

I wonder why I'm not seeing this. What mount options are you using?

Thanks,
-- 
Jeff Layton <jlayton@kernel.org>


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v2] ceph: don't allow access to MDS-private inodes
  2021-04-21 16:48   ` Jeff Layton
@ 2021-04-21 17:46     ` Luis Henriques
  2021-04-21 18:18       ` Jeff Layton
  2021-04-22  1:48       ` Xiubo Li
  0 siblings, 2 replies; 7+ messages in thread
From: Luis Henriques @ 2021-04-21 17:46 UTC (permalink / raw)
  To: Jeff Layton; +Cc: ceph-devel, xiubli, pdonnell, idryomov

Jeff Layton <jlayton@kernel.org> writes:

> On Wed, 2021-04-21 at 17:22 +0100, Luis Henriques wrote:
>> Jeff Layton <jlayton@kernel.org> writes:
>> 
>> > The MDS reserves a set of inodes for its own usage, and these should
>> > never be accessible to clients. Add a new helper to vet a proposed
>> > inode number against that range, and complain loudly and refuse to
>> > create or look it up if it's in it. We do need to carve out an exception
>> > for the root and the lost+found directories.
>> > 
>> > Also, ensure that the MDS doesn't try to delegate that range to us
>> > either. Print a warning if it does, and don't save the range in the
>> > xarray.
>> > 
>> > URL: https://tracker.ceph.com/issues/49922
>> > Signed-off-by: Jeff Layton <jlayton@kernel.org>
>> > Signed-off-by: Xiubo Li<xiubli@redhat.com>
>> > Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
>> > ---
>> >  fs/ceph/export.c             |  8 ++++++++
>> >  fs/ceph/inode.c              |  3 +++
>> >  fs/ceph/mds_client.c         |  7 +++++++
>> >  fs/ceph/super.h              | 24 ++++++++++++++++++++++++
>> >  include/linux/ceph/ceph_fs.h |  7 ++++---
>> >  5 files changed, 46 insertions(+), 3 deletions(-)
>> > 
>> > v2: allow lookups of lost+found dir inodes
>> >     flesh out and update the CEPH_INO_* definitions
>> > 
>> > diff --git a/fs/ceph/export.c b/fs/ceph/export.c
>> > index 17d8c8f4ec89..65540a4429b2 100644
>> > --- a/fs/ceph/export.c
>> > +++ b/fs/ceph/export.c
>> > @@ -129,6 +129,10 @@ static struct inode *__lookup_inode(struct super_block *sb, u64 ino)
>> >  
>> >  	vino.ino = ino;
>> >  	vino.snap = CEPH_NOSNAP;
>> > +
>> > +	if (ceph_vino_is_reserved(vino))
>> > +		return ERR_PTR(-ESTALE);
>> > +
>> >  	inode = ceph_find_inode(sb, vino);
>> >  	if (!inode) {
>> >  		struct ceph_mds_request *req;
>> > @@ -214,6 +218,10 @@ static struct dentry *__snapfh_to_dentry(struct super_block *sb,
>> >  		vino.ino = sfh->ino;
>> >  		vino.snap = sfh->snapid;
>> >  	}
>> > +
>> > +	if (ceph_vino_is_reserved(vino))
>> > +		return ERR_PTR(-ESTALE);
>> > +
>> >  	inode = ceph_find_inode(sb, vino);
>> >  	if (inode)
>> >  		return d_obtain_alias(inode);
>> > diff --git a/fs/ceph/inode.c b/fs/ceph/inode.c
>> > index 14a1f7963625..e1c63adb196d 100644
>> > --- a/fs/ceph/inode.c
>> > +++ b/fs/ceph/inode.c
>> > @@ -56,6 +56,9 @@ struct inode *ceph_get_inode(struct super_block *sb, struct ceph_vino vino)
>> >  {
>> >  	struct inode *inode;
>> >  
>> > +	if (ceph_vino_is_reserved(vino))
>> > +		return ERR_PTR(-EREMOTEIO);
>> > +
>> >  	inode = iget5_locked(sb, (unsigned long)vino.ino, ceph_ino_compare,
>> >  			     ceph_set_ino_cb, &vino);
>> >  	if (!inode)
>> > diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c
>> > index 63b53098360c..e5af591d3bd4 100644
>> > --- a/fs/ceph/mds_client.c
>> > +++ b/fs/ceph/mds_client.c
>> > @@ -440,6 +440,13 @@ static int ceph_parse_deleg_inos(void **p, void *end,
>> >  
>> >  		ceph_decode_64_safe(p, end, start, bad);
>> >  		ceph_decode_64_safe(p, end, len, bad);
>> > +
>> > +		/* Don't accept a delegation of system inodes */
>> > +		if (start < CEPH_INO_SYSTEM_BASE) {
>> > +			pr_warn_ratelimited("ceph: ignoring reserved inode range delegation (start=0x%llx len=0x%llx)\n",
>> > +					start, len);
>> > +			continue;
>> > +		}
>> >  		while (len--) {
>> >  			int err = xa_insert(&s->s_delegated_inos, ino = start++,
>> >  					    DELEGATED_INO_AVAILABLE,
>> > diff --git a/fs/ceph/super.h b/fs/ceph/super.h
>> > index df0851b9240e..f1745403c9b0 100644
>> > --- a/fs/ceph/super.h
>> > +++ b/fs/ceph/super.h
>> > @@ -529,10 +529,34 @@ static inline int ceph_ino_compare(struct inode *inode, void *data)
>> >  		ci->i_vino.snap == pvino->snap;
>> >  }
>> >  
>> > +/*
>> > + * The MDS reserves a set of inodes for its own usage. These should never
>> > + * be accessible by clients, and so the MDS has no reason to ever hand these
>> > + * out.
>> > + *
>> > + * These come from src/mds/mdstypes.h in the ceph sources.
>> > + */
>> > +#define CEPH_MAX_MDS		0x100
>> > +#define CEPH_NUM_STRAY		10
>> > +#define CEPH_INO_SYSTEM_BASE	((6*CEPH_MAX_MDS) + (CEPH_MAX_MDS * CEPH_NUM_STRAY))
>> > +
>> > +static inline bool ceph_vino_is_reserved(const struct ceph_vino vino)
>> > +{
>> > +	if (vino.ino < CEPH_INO_SYSTEM_BASE &&
>> > +	    vino.ino != CEPH_INO_ROOT &&
>> > +	    vino.ino != CEPH_INO_LOST_AND_FOUND) {
>> > +		WARN_RATELIMIT(1, "Attempt to access reserved inode number 0x%llx", vino.ino);
>> 
>> This warning is quite easy to hit with inode 0x3, using generic/013.  It
>> looks like, while looking for the snaprealm to check quotas, we will
>> eventually get this value from the MDS.  So this function should also
>> return 'true' if ino is CEPH_INO_GLOBAL_SNAPREALM.
>> 
>
> I wonder why I'm not seeing this. What mount options are you using?

Ah, I should have mentioned that.  I'm seeing this when using 'copyfrom'
mount option, but (and more important!) I'm also not mounting the
filesystem on the "real" root, i.e. I'm doing something like:

 mount -t ceph <ip>:<port>:/test /mnt/ ...

Not mounting on the FS root will allow us to get function
ceph_has_realms_with_quotas() to return 'true'.

For reference, I'm including the WARNING bellow.

Cheers,
-- 
Luis

[  715.628284] ------------[ cut here ]------------                                                                                                                  [72/9962]
[  715.630115] Attempt to access reserved inode number 0x3          
[  715.630208] WARNING: CPU: 0 PID: 241 at fs/ceph/super.h:549 __lookup_inode+0x238/0x250 [ceph]
[  715.635142] Modules linked in: ceph libceph       
[  715.636816] CPU: 0 PID: 241 Comm: fsstress Not tainted 5.12.0-rc4+ #141             
[  715.639317] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a-rebuilt.opensuse.org 04/01/2014
[  715.643873] RIP: 0010:__lookup_inode+0x238/0x250 [ceph]
[  715.646703] Code: ff ff be 03 00 00 00 48 89 ef e8 b3 3b 48 e1 eb b8 48 89 ef e8 69 38 02 00 eb ae 48 89 ee 48 c7 c7 80 e5 15 a0 e8 18 8d 92 e1 <0f> 0b eb c8 49 89 c4 e9 0
[  715.656001] RSP: 0018:ffff88812aab7680 EFLAGS: 00010286                             
[  715.658834] RAX: 0000000000000000 RBX: 1ffff11025556ed0 RCX: 0000000000000000
[  715.662373] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffed1025556ec2       
[  715.665312] RBP: 0000000000000003 R08: 0000000000000001 R09: ffffffff8363a357       
[  715.668285] R10: 0000000000000001 R11: 0000000000000001 R12: ffff88812a3ae000
[  715.671073] R13: ffff88812a0c3200 R14: ffff88812a3ac000 R15: ffff88812b94d628
[  715.673815] FS:  00007fea04b02740(0000) GS:ffff8881f7600000(0000) knlGS:0000000000000000
[  715.676672] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033                       
[  715.678782] CR2: 00007fea043ffff0 CR3: 000000012aa9a000 CR4: 00000000000006b0       
[  715.681220] Call Trace:                                                             
[  715.682112]  ? ceph_ino_compare+0x60/0x60 [ceph]                                    
[  715.683829]  ? wait_for_completion+0x170/0x170                                      
[  715.685351]  ceph_lookup_inode+0xc/0x50 [ceph]       
[  715.686997]  lookup_quotarealm_inode+0x28f/0x350 [ceph]                             
[  715.688711]  check_quota_exceeded+0x297/0x350 [ceph]                                                                                                                       
[  715.690379]  ceph_write_iter+0xd95/0x1320 [ceph]                                  
[  715.691851]  ? __lock_acquire+0x863/0x3070                                   
[  715.693100]  ? _raw_spin_unlock_irqrestore+0x2d/0x40                         
[  715.694753]  ? ceph_direct_read_write+0x1090/0x1090 [ceph]                   
[  715.696506]  ? lock_acquire+0x16d/0x4e0                                             
[  715.697748]  ? iter_file_splice_write+0x129/0x650                            
[  715.699379]  ? lock_release+0x410/0x410                                             
[  715.700486]  ? do_splice+0x5e4/0xbc0                                                                                                                                       
[  715.701659]  ? __do_splice+0x102/0x1d0                                                                                                                                     
[  715.703092]  ? __x64_sys_splice+0xcb/0x140
[  715.704638]  ? do_syscall_64+0x33/0x40
[  715.706107]  ? entry_SYSCALL_64_after_hwframe+0x44/0xae
[  715.708067]  ? mark_held_locks+0x65/0x90
[  715.709553]  ? lock_chain_count+0x20/0x20
[  715.711175]  ? do_iter_readv_writev+0x277/0x340
[  715.712986]  ? ceph_direct_read_write+0x1090/0x1090 [ceph]
[  715.715247]  do_iter_readv_writev+0x277/0x340
[  715.716821]  ? new_sync_write+0x370/0x370
[  715.718324]  ? lock_downgrade+0x390/0x390
[  715.719810]  do_iter_write+0xd8/0x300
[  715.720996]  ? splice_from_pipe_next.part.0+0x94/0x260
[  715.722547]  iter_file_splice_write+0x434/0x650
[  715.723737]  ? generic_splice_sendpage+0x100/0x100
[  715.725065]  ? lock_acquire+0x16d/0x4e0
[  715.726218]  ? lock_release+0x410/0x410
[  715.727386]  ? wait_for_completion+0x170/0x170
[  715.728630]  ? do_splice_to+0xcc/0x130
[  715.729777]  ? kill_fasync+0x19/0x230
[  715.730811]  ? lock_is_held_type+0x98/0x110
[  715.731934]  do_splice+0x5e4/0xbc0
[  715.732692]  ? lock_release+0x1ea/0x410
[  715.733821]  ? splice_file_to_pipe+0xd0/0xd0
[  715.734999]  ? _raw_spin_unlock+0x1f/0x30
[  715.735952]  __do_splice+0x102/0x1d0
[  715.736803]  ? do_splice+0xbc0/0xbc0
[  715.737657]  ? do_pipe2+0xe4/0x120
[  715.738313]  ? __do_pipe_flags+0xd0/0xd0
[  715.739087]  __x64_sys_splice+0xcb/0x140
[  715.739810]  do_syscall_64+0x33/0x40
[  715.740463]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[  715.741418] RIP: 0033:0x7fea04c059ba
[  715.742136] Code: d8 64 89 02 48 c7 c0 ff ff ff ff eb b9 0f 1f 00 f3 0f 1e fa 49 89 ca 64 8b 04 25 18 00 00 00 85 c0 75 15 b8 13 01 00 00 0f 05 <48> 3d 00 f0 ff ff 77 76 c
[  715.745836] RSP: 002b:00007ffce6e5f948 EFLAGS: 00000246 ORIG_RAX: 0000000000000113
[  715.747380] RAX: ffffffffffffffda RBX: 000000000000f2b2 RCX: 00007fea04c059ba
[  715.748635] RDX: 0000000000000004 RSI: 0000000000000000 RDI: 0000000000000005
[  715.749845] RBP: 0000000000000004 R08: 000000000000f2b2 R09: 0000000000000000
[  715.751046] R10: 00007ffce6e5f980 R11: 0000000000000246 R12: 000000000000f2b2
[  715.752185] R13: 0000000000000003 R14: 000000000001f91a R15: 0000000000000000
[  715.753344] irq event stamp: 202495
[  715.753945] hardirqs last  enabled at (202505): [<ffffffff8113967d>] console_unlock+0x3ed/0x720
[  715.755355] hardirqs last disabled at (202512): [<ffffffff8113980b>] console_unlock+0x57b/0x720
[  715.756738] softirqs last  enabled at (202342): [<ffffffff810a2b81>] irq_exit_rcu+0x81/0xc0
[  715.758046] softirqs last disabled at (202337): [<ffffffff810a2b81>] irq_exit_rcu+0x81/0xc0
[  715.759330] ---[ end trace 4f15801ab08eb591 ]---

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v2] ceph: don't allow access to MDS-private inodes
  2021-04-21 17:46     ` Luis Henriques
@ 2021-04-21 18:18       ` Jeff Layton
  2021-04-22  1:48       ` Xiubo Li
  1 sibling, 0 replies; 7+ messages in thread
From: Jeff Layton @ 2021-04-21 18:18 UTC (permalink / raw)
  To: Luis Henriques; +Cc: ceph-devel, xiubli, pdonnell, idryomov

On Wed, 2021-04-21 at 18:46 +0100, Luis Henriques wrote:
> Jeff Layton <jlayton@kernel.org> writes:
> 
> > On Wed, 2021-04-21 at 17:22 +0100, Luis Henriques wrote:
> > > Jeff Layton <jlayton@kernel.org> writes:
> > > 
> > > > The MDS reserves a set of inodes for its own usage, and these should
> > > > never be accessible to clients. Add a new helper to vet a proposed
> > > > inode number against that range, and complain loudly and refuse to
> > > > create or look it up if it's in it. We do need to carve out an exception
> > > > for the root and the lost+found directories.
> > > > 
> > > > Also, ensure that the MDS doesn't try to delegate that range to us
> > > > either. Print a warning if it does, and don't save the range in the
> > > > xarray.
> > > > 
> > > > URL: https://tracker.ceph.com/issues/49922
> > > > Signed-off-by: Jeff Layton <jlayton@kernel.org>
> > > > Signed-off-by: Xiubo Li<xiubli@redhat.com>
> > > > Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
> > > > ---
> > > >  fs/ceph/export.c             |  8 ++++++++
> > > >  fs/ceph/inode.c              |  3 +++
> > > >  fs/ceph/mds_client.c         |  7 +++++++
> > > >  fs/ceph/super.h              | 24 ++++++++++++++++++++++++
> > > >  include/linux/ceph/ceph_fs.h |  7 ++++---
> > > >  5 files changed, 46 insertions(+), 3 deletions(-)
> > > > 
> > > > v2: allow lookups of lost+found dir inodes
> > > >     flesh out and update the CEPH_INO_* definitions
> > > > 
> > > > diff --git a/fs/ceph/export.c b/fs/ceph/export.c
> > > > index 17d8c8f4ec89..65540a4429b2 100644
> > > > --- a/fs/ceph/export.c
> > > > +++ b/fs/ceph/export.c
> > > > @@ -129,6 +129,10 @@ static struct inode *__lookup_inode(struct super_block *sb, u64 ino)
> > > >  
> > > > 
> > > > 
> > > > 
> > > >  	vino.ino = ino;
> > > >  	vino.snap = CEPH_NOSNAP;
> > > > +
> > > > +	if (ceph_vino_is_reserved(vino))
> > > > +		return ERR_PTR(-ESTALE);
> > > > +
> > > >  	inode = ceph_find_inode(sb, vino);
> > > >  	if (!inode) {
> > > >  		struct ceph_mds_request *req;
> > > > @@ -214,6 +218,10 @@ static struct dentry *__snapfh_to_dentry(struct super_block *sb,
> > > >  		vino.ino = sfh->ino;
> > > >  		vino.snap = sfh->snapid;
> > > >  	}
> > > > +
> > > > +	if (ceph_vino_is_reserved(vino))
> > > > +		return ERR_PTR(-ESTALE);
> > > > +
> > > >  	inode = ceph_find_inode(sb, vino);
> > > >  	if (inode)
> > > >  		return d_obtain_alias(inode);
> > > > diff --git a/fs/ceph/inode.c b/fs/ceph/inode.c
> > > > index 14a1f7963625..e1c63adb196d 100644
> > > > --- a/fs/ceph/inode.c
> > > > +++ b/fs/ceph/inode.c
> > > > @@ -56,6 +56,9 @@ struct inode *ceph_get_inode(struct super_block *sb, struct ceph_vino vino)
> > > >  {
> > > >  	struct inode *inode;
> > > >  
> > > > 
> > > > 
> > > > 
> > > > +	if (ceph_vino_is_reserved(vino))
> > > > +		return ERR_PTR(-EREMOTEIO);
> > > > +
> > > >  	inode = iget5_locked(sb, (unsigned long)vino.ino, ceph_ino_compare,
> > > >  			     ceph_set_ino_cb, &vino);
> > > >  	if (!inode)
> > > > diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c
> > > > index 63b53098360c..e5af591d3bd4 100644
> > > > --- a/fs/ceph/mds_client.c
> > > > +++ b/fs/ceph/mds_client.c
> > > > @@ -440,6 +440,13 @@ static int ceph_parse_deleg_inos(void **p, void *end,
> > > >  
> > > > 
> > > > 
> > > > 
> > > >  		ceph_decode_64_safe(p, end, start, bad);
> > > >  		ceph_decode_64_safe(p, end, len, bad);
> > > > +
> > > > +		/* Don't accept a delegation of system inodes */
> > > > +		if (start < CEPH_INO_SYSTEM_BASE) {
> > > > +			pr_warn_ratelimited("ceph: ignoring reserved inode range delegation (start=0x%llx len=0x%llx)\n",
> > > > +					start, len);
> > > > +			continue;
> > > > +		}
> > > >  		while (len--) {
> > > >  			int err = xa_insert(&s->s_delegated_inos, ino = start++,
> > > >  					    DELEGATED_INO_AVAILABLE,
> > > > diff --git a/fs/ceph/super.h b/fs/ceph/super.h
> > > > index df0851b9240e..f1745403c9b0 100644
> > > > --- a/fs/ceph/super.h
> > > > +++ b/fs/ceph/super.h
> > > > @@ -529,10 +529,34 @@ static inline int ceph_ino_compare(struct inode *inode, void *data)
> > > >  		ci->i_vino.snap == pvino->snap;
> > > >  }
> > > >  
> > > > 
> > > > 
> > > > 
> > > > +/*
> > > > + * The MDS reserves a set of inodes for its own usage. These should never
> > > > + * be accessible by clients, and so the MDS has no reason to ever hand these
> > > > + * out.
> > > > + *
> > > > + * These come from src/mds/mdstypes.h in the ceph sources.
> > > > + */
> > > > +#define CEPH_MAX_MDS		0x100
> > > > +#define CEPH_NUM_STRAY		10
> > > > +#define CEPH_INO_SYSTEM_BASE	((6*CEPH_MAX_MDS) + (CEPH_MAX_MDS * CEPH_NUM_STRAY))
> > > > +
> > > > +static inline bool ceph_vino_is_reserved(const struct ceph_vino vino)
> > > > +{
> > > > +	if (vino.ino < CEPH_INO_SYSTEM_BASE &&
> > > > +	    vino.ino != CEPH_INO_ROOT &&
> > > > +	    vino.ino != CEPH_INO_LOST_AND_FOUND) {
> > > > +		WARN_RATELIMIT(1, "Attempt to access reserved inode number 0x%llx", vino.ino);
> > > 
> > > This warning is quite easy to hit with inode 0x3, using generic/013.  It
> > > looks like, while looking for the snaprealm to check quotas, we will
> > > eventually get this value from the MDS.  So this function should also
> > > return 'true' if ino is CEPH_INO_GLOBAL_SNAPREALM.
> > > 
> > 
> > I wonder why I'm not seeing this. What mount options are you using?
> 
> Ah, I should have mentioned that.  I'm seeing this when using 'copyfrom'
> mount option, but (and more important!) I'm also not mounting the
> filesystem on the "real" root, i.e. I'm doing something like:
> 
>  mount -t ceph <ip>:<port>:/test /mnt/ ...
> 
> Not mounting on the FS root will allow us to get function
> ceph_has_realms_with_quotas() to return 'true'.
> 
> For reference, I'm including the WARNING bellow.
> 
> Cheers,

Thanks, that makes sense. At this point, I think we ought to probably
just assume that the very low inode numbers (0..0xff) are eligible to
be looked up by the MDS.

I'll send a v3 patch in a bit with what I think we might want.
-- 
Jeff Layton <jlayton@kernel.org>


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v2] ceph: don't allow access to MDS-private inodes
  2021-04-21 17:46     ` Luis Henriques
  2021-04-21 18:18       ` Jeff Layton
@ 2021-04-22  1:48       ` Xiubo Li
  1 sibling, 0 replies; 7+ messages in thread
From: Xiubo Li @ 2021-04-22  1:48 UTC (permalink / raw)
  To: Luis Henriques, Jeff Layton; +Cc: ceph-devel, pdonnell, idryomov

On 2021/4/22 1:46, Luis Henriques wrote:
> Jeff Layton <jlayton@kernel.org> writes:
>
>> On Wed, 2021-04-21 at 17:22 +0100, Luis Henriques wrote:
>>> Jeff Layton <jlayton@kernel.org> writes:
>>>
>>>> The MDS reserves a set of inodes for its own usage, and these should
>>>> never be accessible to clients. Add a new helper to vet a proposed
>>>> inode number against that range, and complain loudly and refuse to
>>>> create or look it up if it's in it. We do need to carve out an exception
>>>> for the root and the lost+found directories.
>>>>
>>>> Also, ensure that the MDS doesn't try to delegate that range to us
>>>> either. Print a warning if it does, and don't save the range in the
>>>> xarray.
>>>>
>>>> URL: https://tracker.ceph.com/issues/49922
>>>> Signed-off-by: Jeff Layton <jlayton@kernel.org>
>>>> Signed-off-by: Xiubo Li<xiubli@redhat.com>
>>>> Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
>>>> ---
>>>>   fs/ceph/export.c             |  8 ++++++++
>>>>   fs/ceph/inode.c              |  3 +++
>>>>   fs/ceph/mds_client.c         |  7 +++++++
>>>>   fs/ceph/super.h              | 24 ++++++++++++++++++++++++
>>>>   include/linux/ceph/ceph_fs.h |  7 ++++---
>>>>   5 files changed, 46 insertions(+), 3 deletions(-)
>>>>
>>>> v2: allow lookups of lost+found dir inodes
>>>>      flesh out and update the CEPH_INO_* definitions
>>>>
>>>> diff --git a/fs/ceph/export.c b/fs/ceph/export.c
>>>> index 17d8c8f4ec89..65540a4429b2 100644
>>>> --- a/fs/ceph/export.c
>>>> +++ b/fs/ceph/export.c
>>>> @@ -129,6 +129,10 @@ static struct inode *__lookup_inode(struct super_block *sb, u64 ino)
>>>>   
>>>>   	vino.ino = ino;
>>>>   	vino.snap = CEPH_NOSNAP;
>>>> +
>>>> +	if (ceph_vino_is_reserved(vino))
>>>> +		return ERR_PTR(-ESTALE);
>>>> +
>>>>   	inode = ceph_find_inode(sb, vino);
>>>>   	if (!inode) {
>>>>   		struct ceph_mds_request *req;
>>>> @@ -214,6 +218,10 @@ static struct dentry *__snapfh_to_dentry(struct super_block *sb,
>>>>   		vino.ino = sfh->ino;
>>>>   		vino.snap = sfh->snapid;
>>>>   	}
>>>> +
>>>> +	if (ceph_vino_is_reserved(vino))
>>>> +		return ERR_PTR(-ESTALE);
>>>> +
>>>>   	inode = ceph_find_inode(sb, vino);
>>>>   	if (inode)
>>>>   		return d_obtain_alias(inode);
>>>> diff --git a/fs/ceph/inode.c b/fs/ceph/inode.c
>>>> index 14a1f7963625..e1c63adb196d 100644
>>>> --- a/fs/ceph/inode.c
>>>> +++ b/fs/ceph/inode.c
>>>> @@ -56,6 +56,9 @@ struct inode *ceph_get_inode(struct super_block *sb, struct ceph_vino vino)
>>>>   {
>>>>   	struct inode *inode;
>>>>   
>>>> +	if (ceph_vino_is_reserved(vino))
>>>> +		return ERR_PTR(-EREMOTEIO);
>>>> +
>>>>   	inode = iget5_locked(sb, (unsigned long)vino.ino, ceph_ino_compare,
>>>>   			     ceph_set_ino_cb, &vino);
>>>>   	if (!inode)
>>>> diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c
>>>> index 63b53098360c..e5af591d3bd4 100644
>>>> --- a/fs/ceph/mds_client.c
>>>> +++ b/fs/ceph/mds_client.c
>>>> @@ -440,6 +440,13 @@ static int ceph_parse_deleg_inos(void **p, void *end,
>>>>   
>>>>   		ceph_decode_64_safe(p, end, start, bad);
>>>>   		ceph_decode_64_safe(p, end, len, bad);
>>>> +
>>>> +		/* Don't accept a delegation of system inodes */
>>>> +		if (start < CEPH_INO_SYSTEM_BASE) {
>>>> +			pr_warn_ratelimited("ceph: ignoring reserved inode range delegation (start=0x%llx len=0x%llx)\n",
>>>> +					start, len);
>>>> +			continue;
>>>> +		}
>>>>   		while (len--) {
>>>>   			int err = xa_insert(&s->s_delegated_inos, ino = start++,
>>>>   					    DELEGATED_INO_AVAILABLE,
>>>> diff --git a/fs/ceph/super.h b/fs/ceph/super.h
>>>> index df0851b9240e..f1745403c9b0 100644
>>>> --- a/fs/ceph/super.h
>>>> +++ b/fs/ceph/super.h
>>>> @@ -529,10 +529,34 @@ static inline int ceph_ino_compare(struct inode *inode, void *data)
>>>>   		ci->i_vino.snap == pvino->snap;
>>>>   }
>>>>   
>>>> +/*
>>>> + * The MDS reserves a set of inodes for its own usage. These should never
>>>> + * be accessible by clients, and so the MDS has no reason to ever hand these
>>>> + * out.
>>>> + *
>>>> + * These come from src/mds/mdstypes.h in the ceph sources.
>>>> + */
>>>> +#define CEPH_MAX_MDS		0x100
>>>> +#define CEPH_NUM_STRAY		10
>>>> +#define CEPH_INO_SYSTEM_BASE	((6*CEPH_MAX_MDS) + (CEPH_MAX_MDS * CEPH_NUM_STRAY))
>>>> +
>>>> +static inline bool ceph_vino_is_reserved(const struct ceph_vino vino)
>>>> +{
>>>> +	if (vino.ino < CEPH_INO_SYSTEM_BASE &&
>>>> +	    vino.ino != CEPH_INO_ROOT &&
>>>> +	    vino.ino != CEPH_INO_LOST_AND_FOUND) {
>>>> +		WARN_RATELIMIT(1, "Attempt to access reserved inode number 0x%llx", vino.ino);
>>> This warning is quite easy to hit with inode 0x3, using generic/013.  It
>>> looks like, while looking for the snaprealm to check quotas, we will
>>> eventually get this value from the MDS.  So this function should also
>>> return 'true' if ino is CEPH_INO_GLOBAL_SNAPREALM.
>>>
>> I wonder why I'm not seeing this. What mount options are you using?
> Ah, I should have mentioned that.  I'm seeing this when using 'copyfrom'
> mount option, but (and more important!) I'm also not mounting the
> filesystem on the "real" root, i.e. I'm doing something like:
>
>   mount -t ceph <ip>:<port>:/test /mnt/ ...
>
> Not mounting on the FS root will allow us to get function
> ceph_has_realms_with_quotas() to return 'true'.
>
> For reference, I'm including the WARNING bellow.

Good catch Luis.

Yeah, when lookup the global snaprealm ino, it will hit this. I will 
update my PR in libcephfs.

Thanks

BRs


> Cheers,



^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2021-04-22  1:48 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-04-20 14:06 [PATCH v2] ceph: don't allow access to MDS-private inodes Jeff Layton
2021-04-21  0:05 ` Xiubo Li
2021-04-21 16:22 ` Luis Henriques
2021-04-21 16:48   ` Jeff Layton
2021-04-21 17:46     ` Luis Henriques
2021-04-21 18:18       ` Jeff Layton
2021-04-22  1:48       ` Xiubo Li

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).