ceph-devel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jeff Layton <jlayton@kernel.org>
To: ceph-devel@vger.kernel.org
Cc: Ulrich.Weigand@de.ibm.com, Tuan.Hoang1@ibm.com, idryomov@gmail.com
Subject: Re: [PATCH] ceph: fix inode number handling on arches with 32-bit ino_t
Date: Thu, 20 Aug 2020 17:30:57 -0400	[thread overview]
Message-ID: <4429ff0fc03ced7ab7ecabeadda913b5a08d1684.camel@kernel.org> (raw)
In-Reply-To: <20200819151645.38951-1-jlayton@kernel.org>

On Wed, 2020-08-19 at 11:16 -0400, Jeff Layton wrote:
> Tuan and Ulrich mentioned that they were hitting a problem on s390x,
> which has a 32-bit ino_t value, even though it's a 64-bit arch (for
> historical reasons).
> 
> I think the current handling of inode numbers in the ceph driver is
> wrong. It tries to use 32-bit inode numbers on 32-bit arches, but that's
> actually not a problem. 32-bit arches can deal with 64-bit inode numbers
> just fine when userland code is compiled with LFS support (the common
> case these days).
> 
> What we really want to do is just use 64-bit numbers everywhere, unless
> someone has mounted with the ino32 mount option. In that case, we want
> to ensure that we hash the inode number down to something that will fit
> in 32 bits before presenting the value to userland.
> 
> Add new helper functions that do this, and only do the conversion before
> presenting these values to userland in getattr and readdir.
> 
> The inode table hashvalue is changed to just cast the inode number to
> unsigned long, as low-order bits are the most likely to vary anyway.
> 
> While it's not strictly required, we do want to put something in
> inode->i_ino. Instead of basing it on BITS_PER_LONG, however, base it on
> the size of the ino_t type.
> 
> Reported-by: Tuan Hoang1 <Tuan.Hoang1@ibm.com>
> Reported-by: Ulrich Weigand <Ulrich.Weigand@de.ibm.com>
> URL: https://tracker.ceph.com/issues/46828
> Signed-off-by: Jeff Layton <jlayton@kernel.org>
> ---
>  fs/ceph/caps.c       | 14 ++++-----
>  fs/ceph/debugfs.c    |  4 +--
>  fs/ceph/dir.c        | 31 ++++++++-----------
>  fs/ceph/file.c       |  4 +--
>  fs/ceph/inode.c      | 19 ++++++------
>  fs/ceph/mds_client.h |  2 +-
>  fs/ceph/quota.c      |  4 +--
>  fs/ceph/super.h      | 73 +++++++++++++++++++++++---------------------
>  8 files changed, 74 insertions(+), 77 deletions(-)
> 
> v4:
> - flesh out comments in super.h
> - merge dout messages in ceph_get_inode
> - rename ceph_vino_to_ino to ceph_vino_to_ino_t
> 
> v3:
> - use ceph_ino instead of ceph_present_ino in most dout() messages
> 
> v2:
> - fix dir_emit inode number for ".."
> - fix ino_t size test
> 

FWIW, I built an i386 VM and ran a kernel with this patch through
xfstests and it seems to be ok.

To be clear though, this _will_ be a user-visible change on 32-bit
arches:

1/ inode numbers will be seen to have changed between kernel versions.
32-bit arches will see large inode numbers now instead of the hashed
ones they saw before.

2/ any really old software not built with LFS support may start failing
stat() calls with -EOVERFLOW on inode numbers >2^32. Nothing much we can
do about these, but hopefully the intersection of people running such
code on ceph will be very small.

The workaround for both problems will be to mount with "-o ino32".
-- 
Jeff Layton <jlayton@kernel.org>


  reply	other threads:[~2020-08-20 21:31 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-08-18 16:23 [PATCH] ceph: fix inode number presentation on 32-bit platforms and s390x Jeff Layton
2020-08-18 16:36 ` Jeff Layton
2020-08-18 19:49 ` [PATCH v2] ceph: fix inode number handling on arches with 32-bit ino_t Jeff Layton
2020-08-19  1:58   ` Yan, Zheng
2020-08-19  7:17   ` Ilya Dryomov
2020-08-19 11:25     ` Jeff Layton
2020-08-19 13:28       ` Ilya Dryomov
2020-08-19 13:57         ` Jeff Layton
2020-08-19 14:25           ` Jeff Layton
2020-08-19 12:35 ` [PATCH v3] " Jeff Layton
2020-08-19 12:43   ` Jeff Layton
2020-08-19 15:16 ` [PATCH] " Jeff Layton
2020-08-20 21:30   ` Jeff Layton [this message]
2020-08-24 13:38   ` Yan, Zheng

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4429ff0fc03ced7ab7ecabeadda913b5a08d1684.camel@kernel.org \
    --to=jlayton@kernel.org \
    --cc=Tuan.Hoang1@ibm.com \
    --cc=Ulrich.Weigand@de.ibm.com \
    --cc=ceph-devel@vger.kernel.org \
    --cc=idryomov@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).