All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCHSET 0/2] Turn single segment imports into ITER_UBUF
@ 2023-03-24 20:44 Jens Axboe
  2023-03-24 20:44 ` [PATCH 1/2] iov_iter: convert import_single_range() to ITER_UBUF Jens Axboe
                   ` (3 more replies)
  0 siblings, 4 replies; 12+ messages in thread
From: Jens Axboe @ 2023-03-24 20:44 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: torvalds, brauner

Hi,

We've been doing a few conversions of ITER_IOVEC to ITER_UBUF in select
spots, as the latter is cheaper to iterate and hence saves some cycles.
I recently experimented [1] with io_uring converting single segment READV
and WRITEV into non-vectored variants, as we can save some cycles through
that as well.

But there's really no reason why we can't just do this further down,
enabling it for everyone. It's quite common to use vectored reads or
writes even with a single segment, unfortunately, even for cases where
there's no specific reason to do so. From a bit of non-scientific
testing on a vm on my laptop, I see about 60% of the import_iovec()
calls being for a single segment.

I initially was worried that we'd have callers assuming an ITER_IOVEC
iter after a call import_iovec() or import_single_range(), but an audit
of the kernel code actually looks sane in that regard. Of the ones that
do call it, I ran the ltp test cases and they all still pass.

[1] https://lore.kernel.org/io-uring/43cb1fb7-b30b-8df1-bba6-e50797d680c6@kernel.dk/

-- 
Jens Axboe



^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH 1/2] iov_iter: convert import_single_range() to ITER_UBUF
  2023-03-24 20:44 [PATCHSET 0/2] Turn single segment imports into ITER_UBUF Jens Axboe
@ 2023-03-24 20:44 ` Jens Axboe
  2023-03-24 20:44 ` [PATCH 2/2] iov_iter: import single vector iovecs as ITER_UBUF Jens Axboe
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 12+ messages in thread
From: Jens Axboe @ 2023-03-24 20:44 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: torvalds, brauner, Jens Axboe

Since we're just importing a single vector, we don't have to turn it
into an ITER_IOVEC. Instead turn it into an ITER_UBUF, which is cheaper
to iterate.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 lib/iov_iter.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/lib/iov_iter.c b/lib/iov_iter.c
index 274014e4eafe..fc82cc42ffe6 100644
--- a/lib/iov_iter.c
+++ b/lib/iov_iter.c
@@ -1866,9 +1866,7 @@ int import_single_range(int rw, void __user *buf, size_t len,
 	if (unlikely(!access_ok(buf, len)))
 		return -EFAULT;
 
-	iov->iov_base = buf;
-	iov->iov_len = len;
-	iov_iter_init(i, rw, iov, 1, len);
+	iov_iter_ubuf(i, rw, buf, len);
 	return 0;
 }
 EXPORT_SYMBOL(import_single_range);
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 2/2] iov_iter: import single vector iovecs as ITER_UBUF
  2023-03-24 20:44 [PATCHSET 0/2] Turn single segment imports into ITER_UBUF Jens Axboe
  2023-03-24 20:44 ` [PATCH 1/2] iov_iter: convert import_single_range() to ITER_UBUF Jens Axboe
@ 2023-03-24 20:44 ` Jens Axboe
  2023-03-24 21:14 ` [PATCHSET 0/2] Turn single segment imports into ITER_UBUF Linus Torvalds
  2023-03-25  4:46 ` Al Viro
  3 siblings, 0 replies; 12+ messages in thread
From: Jens Axboe @ 2023-03-24 20:44 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: torvalds, brauner, Jens Axboe

Add a special case to __import_iovec(), which imports a single segment
iovec as an ITER_UBUF rather than an ITER_IOVEC. ITER_UBUF is cheaper
to iterate than ITER_IOVEC, and for a single segment iovec, there's no
point in using a segmented iterator.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 lib/iov_iter.c | 25 +++++++++++++++++++++++++
 1 file changed, 25 insertions(+)

diff --git a/lib/iov_iter.c b/lib/iov_iter.c
index fc82cc42ffe6..7868145fde4f 100644
--- a/lib/iov_iter.c
+++ b/lib/iov_iter.c
@@ -1780,6 +1780,28 @@ struct iovec *iovec_from_user(const struct iovec __user *uvec,
 	return iov;
 }
 
+/*
+ * Single segment iovec supplied by the user, import it as ITER_UBUF.
+ */
+ssize_t __import_iovec_ubuf(int type, const struct iovec __user *uvec,
+			    struct iovec **iovp, struct iov_iter *i,
+			    bool compat)
+{
+	struct iovec *iov = *iovp;
+	ssize_t ret;
+
+	if (compat)
+		ret = copy_compat_iovec_from_user(iov, uvec, 1);
+	else
+		ret = copy_iovec_from_user(iov, uvec, 1);
+	if (unlikely(ret))
+		return ret;
+
+	import_ubuf(type, iov->iov_base, iov->iov_len, i);
+	*iovp = NULL;
+	return i->count;
+}
+
 ssize_t __import_iovec(int type, const struct iovec __user *uvec,
 		 unsigned nr_segs, unsigned fast_segs, struct iovec **iovp,
 		 struct iov_iter *i, bool compat)
@@ -1788,6 +1810,9 @@ ssize_t __import_iovec(int type, const struct iovec __user *uvec,
 	unsigned long seg;
 	struct iovec *iov;
 
+	if (nr_segs == 1)
+		return __import_iovec_ubuf(type, uvec, iovp, i, compat);
+
 	iov = iovec_from_user(uvec, nr_segs, fast_segs, *iovp, compat);
 	if (IS_ERR(iov)) {
 		*iovp = NULL;
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCHSET 0/2] Turn single segment imports into ITER_UBUF
  2023-03-24 20:44 [PATCHSET 0/2] Turn single segment imports into ITER_UBUF Jens Axboe
  2023-03-24 20:44 ` [PATCH 1/2] iov_iter: convert import_single_range() to ITER_UBUF Jens Axboe
  2023-03-24 20:44 ` [PATCH 2/2] iov_iter: import single vector iovecs as ITER_UBUF Jens Axboe
@ 2023-03-24 21:14 ` Linus Torvalds
  2023-03-24 21:52   ` Jens Axboe
  2023-03-25  4:46 ` Al Viro
  3 siblings, 1 reply; 12+ messages in thread
From: Linus Torvalds @ 2023-03-24 21:14 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-fsdevel, brauner, Al Viro

On Fri, Mar 24, 2023 at 1:44 PM Jens Axboe <axboe@kernel.dk> wrote:
>
> We've been doing a few conversions of ITER_IOVEC to ITER_UBUF in select
> spots, as the latter is cheaper to iterate and hence saves some cycles.
> I recently experimented [1] with io_uring converting single segment READV
> and WRITEV into non-vectored variants, as we can save some cycles through
> that as well.
>
> But there's really no reason why we can't just do this further down,
> enabling it for everyone. It's quite common to use vectored reads or
> writes even with a single segment, unfortunately, even for cases where
> there's no specific reason to do so. From a bit of non-scientific
> testing on a vm on my laptop, I see about 60% of the import_iovec()
> calls being for a single segment.

I obviously think this is the RightThing(tm) to do, but it's probably
too late for 6.3 since there is the worry that somebody "knows" that
it's a IOVEC somewhere.

Even if it sounds unlikely, and wrong.

Adding Al, who tends to be the main iovec person.

Al, see

   https://lore.kernel.org/all/20230324204443.45950-1-axboe@kernel.dk/

for the series if you didn't already see it on fsdevel.

                  Linus

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCHSET 0/2] Turn single segment imports into ITER_UBUF
  2023-03-24 21:14 ` [PATCHSET 0/2] Turn single segment imports into ITER_UBUF Linus Torvalds
@ 2023-03-24 21:52   ` Jens Axboe
  0 siblings, 0 replies; 12+ messages in thread
From: Jens Axboe @ 2023-03-24 21:52 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-fsdevel, brauner, Al Viro

On 3/24/23 3:14?PM, Linus Torvalds wrote:
> On Fri, Mar 24, 2023 at 1:44?PM Jens Axboe <axboe@kernel.dk> wrote:
>>
>> We've been doing a few conversions of ITER_IOVEC to ITER_UBUF in select
>> spots, as the latter is cheaper to iterate and hence saves some cycles.
>> I recently experimented [1] with io_uring converting single segment READV
>> and WRITEV into non-vectored variants, as we can save some cycles through
>> that as well.
>>
>> But there's really no reason why we can't just do this further down,
>> enabling it for everyone. It's quite common to use vectored reads or
>> writes even with a single segment, unfortunately, even for cases where
>> there's no specific reason to do so. From a bit of non-scientific
>> testing on a vm on my laptop, I see about 60% of the import_iovec()
>> calls being for a single segment.
> 
> I obviously think this is the RightThing(tm) to do, but it's probably
> too late for 6.3 since there is the worry that somebody "knows" that
> it's a IOVEC somewhere.
> 
> Even if it sounds unlikely, and wrong.

Agree, wasn't really targeting 6.3 though after looking over it, I do
feel better about the whole thing.

I already ran the io_uring test and it showed a nice win, wrote a small
micro benchmark that just does 10M 4k reads from /dev/zero. First
observation from the below numbers is that copying just a single vec is
EXPENSIVE. But I already knew that from the io_uring testing, where
we're spending ~8% just on that alone. Secondly, readv(..., 1) saves
about 3% with the patches in this series.

read-zero takes on argument, which is to do vectored reads or not.

Stock kernel:

axboe@r7525 ~> time taskset -c 0 ./read-zero 0
________________________________________________________
Executed in  859.98 millis    fish           external
   usr time  210.10 millis  291.00 micros  209.81 millis
   sys time  649.42 millis    0.00 micros  649.42 millis

axboe@r7525 ~> time taskset -c 0 ./read-zero 0
________________________________________________________
Executed in  853.82 millis    fish           external
   usr time  228.45 millis  304.00 micros  228.15 millis
   sys time  624.92 millis    0.00 micros  624.92 millis

axboe@r7525 ~> time taskset -c 0 ./read-zero 1
________________________________________________________
Executed in    1.84 secs    fish           external
   usr time    0.21 secs  218.00 micros    0.21 secs
   sys time    1.63 secs  101.00 micros    1.63 secs

axboe@r7525 ~> time taskset -c 0 ./read-zero 1
________________________________________________________
Executed in    1.83 secs    fish           external
   usr time    0.18 secs  594.00 micros    0.18 secs
   sys time    1.64 secs    0.00 micros    1.64 secs

And with the patches:

axboe@r7525 ~> time taskset -c 0 ./read-zero 1
________________________________________________________
Executed in    1.78 secs    fish           external
   usr time    0.22 secs  141.00 micros    0.22 secs
   sys time    1.56 secs  141.00 micros    1.56 secs

axboe@r7525 ~> time taskset -c 0 ./read-zero 1
________________________________________________________
Executed in    1.78 secs    fish           external
   usr time    0.19 secs    0.00 micros    0.19 secs
   sys time    1.59 secs  509.00 micros    1.59 secs

read-zero 0 the same with patches, as expected.

> Adding Al, who tends to be the main iovec person.
> 
> Al, see
> 
>    https://lore.kernel.org/all/20230324204443.45950-1-axboe@kernel.dk/
> 
> for the series if you didn't already see it on fsdevel.

Yep sorry, forgot to add Al.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCHSET 0/2] Turn single segment imports into ITER_UBUF
  2023-03-24 20:44 [PATCHSET 0/2] Turn single segment imports into ITER_UBUF Jens Axboe
                   ` (2 preceding siblings ...)
  2023-03-24 21:14 ` [PATCHSET 0/2] Turn single segment imports into ITER_UBUF Linus Torvalds
@ 2023-03-25  4:46 ` Al Viro
  2023-03-27 18:01   ` Jens Axboe
  3 siblings, 1 reply; 12+ messages in thread
From: Al Viro @ 2023-03-25  4:46 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-fsdevel, torvalds, brauner

On Fri, Mar 24, 2023 at 02:44:41PM -0600, Jens Axboe wrote:
> Hi,
> 
> We've been doing a few conversions of ITER_IOVEC to ITER_UBUF in select
> spots, as the latter is cheaper to iterate and hence saves some cycles.
> I recently experimented [1] with io_uring converting single segment READV
> and WRITEV into non-vectored variants, as we can save some cycles through
> that as well.
> 
> But there's really no reason why we can't just do this further down,
> enabling it for everyone. It's quite common to use vectored reads or
> writes even with a single segment, unfortunately, even for cases where
> there's no specific reason to do so. From a bit of non-scientific
> testing on a vm on my laptop, I see about 60% of the import_iovec()
> calls being for a single segment.
> 
> I initially was worried that we'd have callers assuming an ITER_IOVEC
> iter after a call import_iovec() or import_single_range(), but an audit
> of the kernel code actually looks sane in that regard. Of the ones that
> do call it, I ran the ltp test cases and they all still pass.

Which tree was that audit on?  Mainline?  Some branch in block.git?

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCHSET 0/2] Turn single segment imports into ITER_UBUF
  2023-03-25  4:46 ` Al Viro
@ 2023-03-27 18:01   ` Jens Axboe
  2023-03-27 18:42     ` Al Viro
  0 siblings, 1 reply; 12+ messages in thread
From: Jens Axboe @ 2023-03-27 18:01 UTC (permalink / raw)
  To: Al Viro; +Cc: linux-fsdevel, torvalds, brauner

On 3/24/23 10:46 PM, Al Viro wrote:
> On Fri, Mar 24, 2023 at 02:44:41PM -0600, Jens Axboe wrote:
>> Hi,
>>
>> We've been doing a few conversions of ITER_IOVEC to ITER_UBUF in select
>> spots, as the latter is cheaper to iterate and hence saves some cycles.
>> I recently experimented [1] with io_uring converting single segment READV
>> and WRITEV into non-vectored variants, as we can save some cycles through
>> that as well.
>>
>> But there's really no reason why we can't just do this further down,
>> enabling it for everyone. It's quite common to use vectored reads or
>> writes even with a single segment, unfortunately, even for cases where
>> there's no specific reason to do so. From a bit of non-scientific
>> testing on a vm on my laptop, I see about 60% of the import_iovec()
>> calls being for a single segment.
>>
>> I initially was worried that we'd have callers assuming an ITER_IOVEC
>> iter after a call import_iovec() or import_single_range(), but an audit
>> of the kernel code actually looks sane in that regard. Of the ones that
>> do call it, I ran the ltp test cases and they all still pass.
> 
> Which tree was that audit on?  Mainline?  Some branch in block.git?

It was just master in -git. But looks like I did miss two spots, I've
updated the series here and will send out a v2:

https://git.kernel.dk/cgit/linux-block/log/?h=iter-ubuf

-- 
Jens Axboe



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCHSET 0/2] Turn single segment imports into ITER_UBUF
  2023-03-27 18:01   ` Jens Axboe
@ 2023-03-27 18:42     ` Al Viro
  2023-03-27 18:52       ` Jens Axboe
  0 siblings, 1 reply; 12+ messages in thread
From: Al Viro @ 2023-03-27 18:42 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-fsdevel, torvalds, brauner

On Mon, Mar 27, 2023 at 12:01:08PM -0600, Jens Axboe wrote:
> On 3/24/23 10:46 PM, Al Viro wrote:
> > On Fri, Mar 24, 2023 at 02:44:41PM -0600, Jens Axboe wrote:
> >> Hi,
> >>
> >> We've been doing a few conversions of ITER_IOVEC to ITER_UBUF in select
> >> spots, as the latter is cheaper to iterate and hence saves some cycles.
> >> I recently experimented [1] with io_uring converting single segment READV
> >> and WRITEV into non-vectored variants, as we can save some cycles through
> >> that as well.
> >>
> >> But there's really no reason why we can't just do this further down,
> >> enabling it for everyone. It's quite common to use vectored reads or
> >> writes even with a single segment, unfortunately, even for cases where
> >> there's no specific reason to do so. From a bit of non-scientific
> >> testing on a vm on my laptop, I see about 60% of the import_iovec()
> >> calls being for a single segment.
> >>
> >> I initially was worried that we'd have callers assuming an ITER_IOVEC
> >> iter after a call import_iovec() or import_single_range(), but an audit
> >> of the kernel code actually looks sane in that regard. Of the ones that
> >> do call it, I ran the ltp test cases and they all still pass.
> > 
> > Which tree was that audit on?  Mainline?  Some branch in block.git?
> 
> It was just master in -git. But looks like I did miss two spots, I've
> updated the series here and will send out a v2:
> 
> https://git.kernel.dk/cgit/linux-block/log/?h=iter-ubuf

Just to make sure - head's at 4d0ba2f0250d?

One obvious comment (just about the problems you've dealt with in that branch;
I'll go over that tree and look for other sources of trouble, will post tonight):
all 3 callers of iov_iter_iovec() in there are accompanied by the identical
chunks that deal with ITER_UBUF case; it would make more sense to teach
iov_iter_iovec() to handle that.  loop_rw_iter() would turn into
	if (!iov_iter_is_bvec(iter)) {
		iovec = iov_iter_iovec(iter);
	} else {
		iovec.iov_base = u64_to_user_ptr(rw->addr);
		iovec.iov_len = rw->len;
	}
and process_madvise() and do_loop_readv_writev() patches simply go away.

Again, I'm _not_ saying there's no other problems left, just that these are
better dealt with that way.

Something like

static inline struct iovec iov_iter_iovec(const struct iov_iter *iter)
{
	if (WARN_ON(!iter->user_backed))
		return (struct iovec) {
			.iov_base = NULL,
			.iov_len = 0
		};
	else if (iov_iter_is_ubuf(iter))
		return (struct iovec) {
			.iov_base = iter->ubuf + iter->iov_offset,
			.iov_len = iter->count
		}; 
	else
		return (struct iovec) {
			.iov_base = iter->iov->iov_base + iter->iov_offset,
			.iov_len = min(iter->count,
				       iter->iov->iov_len - iter->iov_offset),
		};
}

and no need to duplicate that logics in all callers.  Or get rid of those
elses, seeing that each alternative is a plain return - matter of taste...

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCHSET 0/2] Turn single segment imports into ITER_UBUF
  2023-03-27 18:42     ` Al Viro
@ 2023-03-27 18:52       ` Jens Axboe
  2023-03-27 18:59         ` Jens Axboe
  0 siblings, 1 reply; 12+ messages in thread
From: Jens Axboe @ 2023-03-27 18:52 UTC (permalink / raw)
  To: Al Viro; +Cc: linux-fsdevel, torvalds, brauner

On 3/27/23 12:42?PM, Al Viro wrote:
> On Mon, Mar 27, 2023 at 12:01:08PM -0600, Jens Axboe wrote:
>> On 3/24/23 10:46?PM, Al Viro wrote:
>>> On Fri, Mar 24, 2023 at 02:44:41PM -0600, Jens Axboe wrote:
>>>> Hi,
>>>>
>>>> We've been doing a few conversions of ITER_IOVEC to ITER_UBUF in select
>>>> spots, as the latter is cheaper to iterate and hence saves some cycles.
>>>> I recently experimented [1] with io_uring converting single segment READV
>>>> and WRITEV into non-vectored variants, as we can save some cycles through
>>>> that as well.
>>>>
>>>> But there's really no reason why we can't just do this further down,
>>>> enabling it for everyone. It's quite common to use vectored reads or
>>>> writes even with a single segment, unfortunately, even for cases where
>>>> there's no specific reason to do so. From a bit of non-scientific
>>>> testing on a vm on my laptop, I see about 60% of the import_iovec()
>>>> calls being for a single segment.
>>>>
>>>> I initially was worried that we'd have callers assuming an ITER_IOVEC
>>>> iter after a call import_iovec() or import_single_range(), but an audit
>>>> of the kernel code actually looks sane in that regard. Of the ones that
>>>> do call it, I ran the ltp test cases and they all still pass.
>>>
>>> Which tree was that audit on?  Mainline?  Some branch in block.git?
>>
>> It was just master in -git. But looks like I did miss two spots, I've
>> updated the series here and will send out a v2:
>>
>> https://git.kernel.dk/cgit/linux-block/log/?h=iter-ubuf
> 
> Just to make sure - head's at 4d0ba2f0250d?

Correct!

> One obvious comment (just about the problems you've dealt with in that
> branch; I'll go over that tree and look for other sources of trouble,
> will post tonight):
>
> all 3 callers of iov_iter_iovec() in there are accompanied by the identical
> chunks that deal with ITER_UBUF case; it would make more sense to teach
> iov_iter_iovec() to handle that.  loop_rw_iter() would turn into
> 	if (!iov_iter_is_bvec(iter)) {
> 		iovec = iov_iter_iovec(iter);
> 	} else {
> 		iovec.iov_base = u64_to_user_ptr(rw->addr);
> 		iovec.iov_len = rw->len;
> 	}
> and process_madvise() and do_loop_readv_writev() patches simply go away.
> 
> Again, I'm _not_ saying there's no other problems left, just that these are
> better dealt with that way.
> 
> Something like
> 
> static inline struct iovec iov_iter_iovec(const struct iov_iter *iter)
> {
> 	if (WARN_ON(!iter->user_backed))
> 		return (struct iovec) {
> 			.iov_base = NULL,
> 			.iov_len = 0
> 		};
> 	else if (iov_iter_is_ubuf(iter))
> 		return (struct iovec) {
> 			.iov_base = iter->ubuf + iter->iov_offset,
> 			.iov_len = iter->count
> 		}; 
> 	else
> 		return (struct iovec) {
> 			.iov_base = iter->iov->iov_base + iter->iov_offset,
> 			.iov_len = min(iter->count,
> 				       iter->iov->iov_len - iter->iov_offset),
> 		};
> }
> 
> and no need to duplicate that logics in all callers.  Or get rid of
> those elses, seeing that each alternative is a plain return - matter
> of taste...

That's a great idea. Two questions - do we want to make that
WARN_ON_ONCE()? And then do we want to include a WARN_ON_ONCE for a
non-supported type? Doesn't seem like high risk as they've all been used
with ITER_IOVEC until now, though.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCHSET 0/2] Turn single segment imports into ITER_UBUF
  2023-03-27 18:52       ` Jens Axboe
@ 2023-03-27 18:59         ` Jens Axboe
  2023-03-27 20:02           ` Al Viro
  0 siblings, 1 reply; 12+ messages in thread
From: Jens Axboe @ 2023-03-27 18:59 UTC (permalink / raw)
  To: Al Viro; +Cc: linux-fsdevel, torvalds, brauner

On 3/27/23 12:52?PM, Jens Axboe wrote:
> On 3/27/23 12:42?PM, Al Viro wrote:
>> On Mon, Mar 27, 2023 at 12:01:08PM -0600, Jens Axboe wrote:
>>> On 3/24/23 10:46?PM, Al Viro wrote:
>>>> On Fri, Mar 24, 2023 at 02:44:41PM -0600, Jens Axboe wrote:
>>>>> Hi,
>>>>>
>>>>> We've been doing a few conversions of ITER_IOVEC to ITER_UBUF in select
>>>>> spots, as the latter is cheaper to iterate and hence saves some cycles.
>>>>> I recently experimented [1] with io_uring converting single segment READV
>>>>> and WRITEV into non-vectored variants, as we can save some cycles through
>>>>> that as well.
>>>>>
>>>>> But there's really no reason why we can't just do this further down,
>>>>> enabling it for everyone. It's quite common to use vectored reads or
>>>>> writes even with a single segment, unfortunately, even for cases where
>>>>> there's no specific reason to do so. From a bit of non-scientific
>>>>> testing on a vm on my laptop, I see about 60% of the import_iovec()
>>>>> calls being for a single segment.
>>>>>
>>>>> I initially was worried that we'd have callers assuming an ITER_IOVEC
>>>>> iter after a call import_iovec() or import_single_range(), but an audit
>>>>> of the kernel code actually looks sane in that regard. Of the ones that
>>>>> do call it, I ran the ltp test cases and they all still pass.
>>>>
>>>> Which tree was that audit on?  Mainline?  Some branch in block.git?
>>>
>>> It was just master in -git. But looks like I did miss two spots, I've
>>> updated the series here and will send out a v2:
>>>
>>> https://git.kernel.dk/cgit/linux-block/log/?h=iter-ubuf
>>
>> Just to make sure - head's at 4d0ba2f0250d?
> 
> Correct!
> 
>> One obvious comment (just about the problems you've dealt with in that
>> branch; I'll go over that tree and look for other sources of trouble,
>> will post tonight):
>>
>> all 3 callers of iov_iter_iovec() in there are accompanied by the identical
>> chunks that deal with ITER_UBUF case; it would make more sense to teach
>> iov_iter_iovec() to handle that.  loop_rw_iter() would turn into
>> 	if (!iov_iter_is_bvec(iter)) {
>> 		iovec = iov_iter_iovec(iter);
>> 	} else {
>> 		iovec.iov_base = u64_to_user_ptr(rw->addr);
>> 		iovec.iov_len = rw->len;
>> 	}
>> and process_madvise() and do_loop_readv_writev() patches simply go away.
>>
>> Again, I'm _not_ saying there's no other problems left, just that these are
>> better dealt with that way.
>>
>> Something like
>>
>> static inline struct iovec iov_iter_iovec(const struct iov_iter *iter)
>> {
>> 	if (WARN_ON(!iter->user_backed))
>> 		return (struct iovec) {
>> 			.iov_base = NULL,
>> 			.iov_len = 0
>> 		};
>> 	else if (iov_iter_is_ubuf(iter))
>> 		return (struct iovec) {
>> 			.iov_base = iter->ubuf + iter->iov_offset,
>> 			.iov_len = iter->count
>> 		}; 
>> 	else
>> 		return (struct iovec) {
>> 			.iov_base = iter->iov->iov_base + iter->iov_offset,
>> 			.iov_len = min(iter->count,
>> 				       iter->iov->iov_len - iter->iov_offset),
>> 		};
>> }
>>
>> and no need to duplicate that logics in all callers.  Or get rid of
>> those elses, seeing that each alternative is a plain return - matter
>> of taste...
> 
> That's a great idea. Two questions - do we want to make that
> WARN_ON_ONCE()? And then do we want to include a WARN_ON_ONCE for a
> non-supported type? Doesn't seem like high risk as they've all been used
> with ITER_IOVEC until now, though.

Scratch that last one, user_backed should double as that as well. At
least currently, where ITER_UBUF and ITER_IOVEC are the only two
iterators that hold user backed memory.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCHSET 0/2] Turn single segment imports into ITER_UBUF
  2023-03-27 18:59         ` Jens Axboe
@ 2023-03-27 20:02           ` Al Viro
  2023-03-27 20:03             ` Jens Axboe
  0 siblings, 1 reply; 12+ messages in thread
From: Al Viro @ 2023-03-27 20:02 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-fsdevel, torvalds, brauner

On Mon, Mar 27, 2023 at 12:59:09PM -0600, Jens Axboe wrote:

> > That's a great idea. Two questions - do we want to make that
> > WARN_ON_ONCE()? And then do we want to include a WARN_ON_ONCE for a
> > non-supported type? Doesn't seem like high risk as they've all been used
> > with ITER_IOVEC until now, though.
> 
> Scratch that last one, user_backed should double as that as well. At
> least currently, where ITER_UBUF and ITER_IOVEC are the only two
> iterators that hold user backed memory.

Quite.  As for the WARN_ON_ONCE vs. WARN_ON...  No preferences, really.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCHSET 0/2] Turn single segment imports into ITER_UBUF
  2023-03-27 20:02           ` Al Viro
@ 2023-03-27 20:03             ` Jens Axboe
  0 siblings, 0 replies; 12+ messages in thread
From: Jens Axboe @ 2023-03-27 20:03 UTC (permalink / raw)
  To: Al Viro; +Cc: linux-fsdevel, torvalds, brauner

On 3/27/23 2:02 PM, Al Viro wrote:
> On Mon, Mar 27, 2023 at 12:59:09PM -0600, Jens Axboe wrote:
> 
>>> That's a great idea. Two questions - do we want to make that
>>> WARN_ON_ONCE()? And then do we want to include a WARN_ON_ONCE for a
>>> non-supported type? Doesn't seem like high risk as they've all been used
>>> with ITER_IOVEC until now, though.
>>
>> Scratch that last one, user_backed should double as that as well. At
>> least currently, where ITER_UBUF and ITER_IOVEC are the only two
>> iterators that hold user backed memory.
> 
> Quite.  As for the WARN_ON_ONCE vs. WARN_ON...  No preferences, really.

OK, I'll stick with WARN_ON_ONCE then. At least that avoids a ton of
dmesg dumping for something buggy, yet still preserving the first
trace.

-- 
Jens Axboe



^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2023-03-27 20:03 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-03-24 20:44 [PATCHSET 0/2] Turn single segment imports into ITER_UBUF Jens Axboe
2023-03-24 20:44 ` [PATCH 1/2] iov_iter: convert import_single_range() to ITER_UBUF Jens Axboe
2023-03-24 20:44 ` [PATCH 2/2] iov_iter: import single vector iovecs as ITER_UBUF Jens Axboe
2023-03-24 21:14 ` [PATCHSET 0/2] Turn single segment imports into ITER_UBUF Linus Torvalds
2023-03-24 21:52   ` Jens Axboe
2023-03-25  4:46 ` Al Viro
2023-03-27 18:01   ` Jens Axboe
2023-03-27 18:42     ` Al Viro
2023-03-27 18:52       ` Jens Axboe
2023-03-27 18:59         ` Jens Axboe
2023-03-27 20:02           ` Al Viro
2023-03-27 20:03             ` Jens Axboe

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.