All of lore.kernel.org
 help / color / mirror / Atom feed
From: Matthew Wilcox <willy@infradead.org>
To: Dave Chinner <david@fromorbit.com>
Cc: Andreas Dilger <adilger@dilger.ca>,
	linux-xfs@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	"Darrick J . Wong" <darrick.wong@oracle.com>,
	linux-nvdimm@lists.01.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 9/9] iomap: Change calling convention for zeroing
Date: Tue, 25 Aug 2020 13:40:24 +0100	[thread overview]
Message-ID: <20200825124024.GN17456@casper.infradead.org> (raw)
In-Reply-To: <20200825042711.GL12131@dread.disaster.area>

On Tue, Aug 25, 2020 at 02:27:11PM +1000, Dave Chinner wrote:
> On Mon, Aug 24, 2020 at 09:35:59PM -0600, Andreas Dilger wrote:
> > On Aug 24, 2020, at 9:26 PM, Matthew Wilcox <willy@infradead.org> wrote:
> > > 
> > > On Tue, Aug 25, 2020 at 10:27:35AM +1000, Dave Chinner wrote:
> > >>> 	do {
> > >>> -		unsigned offset, bytes;
> > >>> -
> > >>> -		offset = offset_in_page(pos);
> > >>> -		bytes = min_t(loff_t, PAGE_SIZE - offset, count);
> > >>> +		loff_t bytes;
> > >>> 
> > >>> 		if (IS_DAX(inode))
> > >>> -			status = dax_iomap_zero(pos, offset, bytes, iomap);
> > >>> +			bytes = dax_iomap_zero(pos, length, iomap);
> > >> 
> > >> Hmmm. everything is loff_t here, but the callers are defining length
> > >> as u64, not loff_t. Is there a potential sign conversion problem
> > >> here? (sure 64 bit is way beyond anything we'll pass here, but...)
> > > 
> > > I've gone back and forth on the correct type for 'length' a few times.
> > > size_t is too small (not for zeroing, but for seek()).  An unsigned type
> > > seems right -- a length can't be negative, and we don't want to give
> > > the impression that it can.  But the return value from these functions
> > > definitely needs to be signed so we can represent an error.  So a u64
> > > length with an loff_t return type feels like the best solution.  And
> > > the upper layers have to promise not to pass in a length that's more
> > > than 2^63-1.
> > 
> > The problem with allowing a u64 as the length is that it leads to the
> > possibility of an argument value that cannot be returned.  Checking
> > length < 0 is not worse than checking length > 0x7ffffffffffffff,
> > and has the benefit of consistency with the other argument types and
> > signs...

The callee should just trust that the caller isn't going to do that.
File sizes can't be more than 2^63-1 bytes, so an extent of a file also
can't be more than 2^63-1 bytes.

> I think the problem here is that we have no guaranteed 64 bit size
> type. when that was the case with off_t, we created loff_t to always
> represent a 64 bit offset value. However, we never created one for
> the count/size that is passed alongside loff_t in many places - it
> was said that "syscalls are limited to 32 bit sizes" and
> "size_t is 64 bit on 64 bit platforms" and so on and so we still
> don't have a clean way to pass 64 bit sizes through the IO path.
> 
> We've been living with this shitty situation for a long time now, so
> perhaps it's time for us to define lsize_t for 64 bit lengths and
> start using that everywhere that needs a 64 bit clean path
> through the code, regardless of whether the arch is 32 or 64 bit...
> 
> Thoughts?

I don't think the THP patches should be blocked on this expedition.

We have a guaranteed 64 bit type -- it's u64.  I don't think defining
lsize_t is going to fix anything.  The next big problem to fix will be
supporting storage >16EiB, but I think that's a project that can start
in ten years and still be here before anyone but the TLAs have that much
storage in a single device.

Any objection to leaving this patch as-is with a u64 length?
_______________________________________________
Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org
To unsubscribe send an email to linux-nvdimm-leave@lists.01.org

WARNING: multiple messages have this Message-ID (diff)
From: Matthew Wilcox <willy@infradead.org>
To: Dave Chinner <david@fromorbit.com>
Cc: Andreas Dilger <adilger@dilger.ca>,
	linux-xfs@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	"Darrick J . Wong" <darrick.wong@oracle.com>,
	linux-nvdimm@lists.01.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 9/9] iomap: Change calling convention for zeroing
Date: Tue, 25 Aug 2020 13:40:24 +0100	[thread overview]
Message-ID: <20200825124024.GN17456@casper.infradead.org> (raw)
In-Reply-To: <20200825042711.GL12131@dread.disaster.area>

On Tue, Aug 25, 2020 at 02:27:11PM +1000, Dave Chinner wrote:
> On Mon, Aug 24, 2020 at 09:35:59PM -0600, Andreas Dilger wrote:
> > On Aug 24, 2020, at 9:26 PM, Matthew Wilcox <willy@infradead.org> wrote:
> > > 
> > > On Tue, Aug 25, 2020 at 10:27:35AM +1000, Dave Chinner wrote:
> > >>> 	do {
> > >>> -		unsigned offset, bytes;
> > >>> -
> > >>> -		offset = offset_in_page(pos);
> > >>> -		bytes = min_t(loff_t, PAGE_SIZE - offset, count);
> > >>> +		loff_t bytes;
> > >>> 
> > >>> 		if (IS_DAX(inode))
> > >>> -			status = dax_iomap_zero(pos, offset, bytes, iomap);
> > >>> +			bytes = dax_iomap_zero(pos, length, iomap);
> > >> 
> > >> Hmmm. everything is loff_t here, but the callers are defining length
> > >> as u64, not loff_t. Is there a potential sign conversion problem
> > >> here? (sure 64 bit is way beyond anything we'll pass here, but...)
> > > 
> > > I've gone back and forth on the correct type for 'length' a few times.
> > > size_t is too small (not for zeroing, but for seek()).  An unsigned type
> > > seems right -- a length can't be negative, and we don't want to give
> > > the impression that it can.  But the return value from these functions
> > > definitely needs to be signed so we can represent an error.  So a u64
> > > length with an loff_t return type feels like the best solution.  And
> > > the upper layers have to promise not to pass in a length that's more
> > > than 2^63-1.
> > 
> > The problem with allowing a u64 as the length is that it leads to the
> > possibility of an argument value that cannot be returned.  Checking
> > length < 0 is not worse than checking length > 0x7ffffffffffffff,
> > and has the benefit of consistency with the other argument types and
> > signs...

The callee should just trust that the caller isn't going to do that.
File sizes can't be more than 2^63-1 bytes, so an extent of a file also
can't be more than 2^63-1 bytes.

> I think the problem here is that we have no guaranteed 64 bit size
> type. when that was the case with off_t, we created loff_t to always
> represent a 64 bit offset value. However, we never created one for
> the count/size that is passed alongside loff_t in many places - it
> was said that "syscalls are limited to 32 bit sizes" and
> "size_t is 64 bit on 64 bit platforms" and so on and so we still
> don't have a clean way to pass 64 bit sizes through the IO path.
> 
> We've been living with this shitty situation for a long time now, so
> perhaps it's time for us to define lsize_t for 64 bit lengths and
> start using that everywhere that needs a 64 bit clean path
> through the code, regardless of whether the arch is 32 or 64 bit...
> 
> Thoughts?

I don't think the THP patches should be blocked on this expedition.

We have a guaranteed 64 bit type -- it's u64.  I don't think defining
lsize_t is going to fix anything.  The next big problem to fix will be
supporting storage >16EiB, but I think that's a project that can start
in ten years and still be here before anyone but the TLAs have that much
storage in a single device.

Any objection to leaving this patch as-is with a u64 length?

  reply	other threads:[~2020-08-25 12:40 UTC|newest]

Thread overview: 83+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-08-24 14:55 [PATCH 0/9] THP iomap patches for 5.10 Matthew Wilcox (Oracle)
2020-08-24 14:55 ` Matthew Wilcox (Oracle)
2020-08-24 14:55 ` [PATCH 1/9] iomap: Fix misplaced page flushing Matthew Wilcox (Oracle)
2020-08-24 14:55   ` Matthew Wilcox (Oracle)
2020-08-24 23:51   ` Dave Chinner
2020-08-24 23:51     ` Dave Chinner
2020-08-25 20:47   ` Darrick J. Wong
2020-08-25 20:47     ` Darrick J. Wong
2020-08-27  8:24   ` Christoph Hellwig
2020-08-27  8:24     ` Christoph Hellwig
2020-08-24 14:55 ` [PATCH 2/9] fs: Introduce i_blocks_per_page Matthew Wilcox (Oracle)
2020-08-24 14:55   ` Matthew Wilcox (Oracle)
2020-08-24 23:55   ` Dave Chinner
2020-08-24 23:55     ` Dave Chinner
2020-08-25 20:49   ` Darrick J. Wong
2020-08-25 20:49     ` Darrick J. Wong
2020-08-25 22:21     ` Dave Chinner
2020-08-25 22:21       ` Dave Chinner
2020-08-24 14:55 ` [PATCH 3/9] iomap: Use kzalloc to allocate iomap_page Matthew Wilcox (Oracle)
2020-08-24 14:55   ` Matthew Wilcox (Oracle)
2020-08-24 23:56   ` Dave Chinner
2020-08-24 23:56     ` Dave Chinner
2020-08-25 20:49   ` Darrick J. Wong
2020-08-25 20:49     ` Darrick J. Wong
2020-08-24 14:55 ` [PATCH 4/9] iomap: Use bitmap ops to set uptodate bits Matthew Wilcox (Oracle)
2020-08-24 14:55   ` Matthew Wilcox (Oracle)
2020-08-24 23:56   ` Dave Chinner
2020-08-24 23:56     ` Dave Chinner
2020-08-25 20:50   ` Darrick J. Wong
2020-08-25 20:50     ` Darrick J. Wong
2020-08-24 14:55 ` [PATCH 5/9] iomap: Support arbitrarily many blocks per page Matthew Wilcox (Oracle)
2020-08-24 14:55   ` Matthew Wilcox (Oracle)
2020-08-24 23:59   ` Dave Chinner
2020-08-24 23:59     ` Dave Chinner
2020-08-25  0:22     ` Matthew Wilcox
2020-08-25  0:22       ` Matthew Wilcox
2020-08-25 21:02   ` Darrick J. Wong
2020-08-25 21:02     ` Darrick J. Wong
2020-08-26  2:26     ` Matthew Wilcox
2020-08-26  2:26       ` Matthew Wilcox
2020-08-26  3:32       ` Darrick J. Wong
2020-08-26  3:32         ` Darrick J. Wong
2020-08-27  8:26   ` Christoph Hellwig
2020-08-27  8:26     ` Christoph Hellwig
2020-08-24 14:55 ` [PATCH 6/9] iomap: Convert read_count to byte count Matthew Wilcox (Oracle)
2020-08-24 14:55   ` Matthew Wilcox (Oracle)
2020-08-25  0:09   ` Dave Chinner
2020-08-25  0:09     ` Dave Chinner
2020-08-25 22:14     ` Darrick J. Wong
2020-08-25 22:14       ` Darrick J. Wong
2020-08-27  8:35   ` Christoph Hellwig
2020-08-27  8:35     ` Christoph Hellwig
2020-08-24 14:55 ` [PATCH 7/9] iomap: Convert write_count " Matthew Wilcox (Oracle)
2020-08-24 14:55   ` Matthew Wilcox (Oracle)
2020-08-27  8:36   ` Christoph Hellwig
2020-08-27  8:36     ` Christoph Hellwig
2020-08-24 14:55 ` [PATCH 8/9] iomap: Convert iomap_write_end types Matthew Wilcox (Oracle)
2020-08-24 14:55   ` Matthew Wilcox (Oracle)
2020-08-25  0:12   ` Dave Chinner
2020-08-25  0:12     ` Dave Chinner
2020-08-25  1:06     ` Matthew Wilcox
2020-08-25  1:06       ` Matthew Wilcox
2020-08-25  1:33       ` Dave Chinner
2020-08-25  1:33         ` Dave Chinner
2020-08-27  8:41   ` Christoph Hellwig
2020-08-27  8:41     ` Christoph Hellwig
2020-08-24 14:55 ` [PATCH 9/9] iomap: Change calling convention for zeroing Matthew Wilcox (Oracle)
2020-08-24 14:55   ` Matthew Wilcox (Oracle)
2020-08-25  0:27   ` Dave Chinner
2020-08-25  0:27     ` Dave Chinner
2020-08-25  3:26     ` Matthew Wilcox
2020-08-25  3:26       ` Matthew Wilcox
2020-08-25  3:35       ` Andreas Dilger
2020-08-25  4:27         ` Dave Chinner
2020-08-25  4:27           ` Dave Chinner
2020-08-25 12:40           ` Matthew Wilcox [this message]
2020-08-25 12:40             ` Matthew Wilcox
2020-08-25 22:05             ` Dave Chinner
2020-08-25 22:05               ` Dave Chinner
2020-08-25 22:23   ` Darrick J. Wong
2020-08-25 22:23     ` Darrick J. Wong
2020-08-27  8:39     ` Christoph Hellwig
2020-08-27  8:39       ` Christoph Hellwig

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200825124024.GN17456@casper.infradead.org \
    --to=willy@infradead.org \
    --cc=adilger@dilger.ca \
    --cc=darrick.wong@oracle.com \
    --cc=david@fromorbit.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-nvdimm@lists.01.org \
    --cc=linux-xfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.