From: Heinz Mauelshagen <heinzm@redhat.com>
To: Mike Snitzer <snitzer@redhat.com>
Cc: axboe@kernel.dk, caspar@linux.alibaba.com,
	io-uring@vger.kernel.org, linux-block@vger.kernel.org,
	joseph.qi@linux.alibaba.com, dm-devel@redhat.com,
	Mikulas Patocka <mpatocka@redhat.com>,
	JeffleXu <jefflexu@linux.alibaba.com>,
	hch@lst.de
Subject: Re: [dm-devel] [PATCH 4/4] dm: support I/O polling
Date: Fri, 5 Mar 2021 19:19:22 +0100	[thread overview]
Message-ID: <bf889267-35cd-1b65-c411-4a08b6b5720f@redhat.com> (raw)
In-Reply-To: <20210305180938.GA21127@redhat.com>

On 3/5/21 7:09 PM, Mike Snitzer wrote:
> On Fri, Mar 05 2021 at 12:56pm -0500,
> Heinz Mauelshagen <heinzm@redhat.com> wrote:
>
>> On 3/5/21 6:46 PM, Heinz Mauelshagen wrote:
>>> On 3/5/21 10:52 AM, JeffleXu wrote:
>>>> On 3/3/21 6:09 PM, Mikulas Patocka wrote:
>>>>> On Wed, 3 Mar 2021, JeffleXu wrote:
>>>>>
>>>>>> On 3/3/21 3:05 AM, Mikulas Patocka wrote:
>>>>>>
>>>>>>> Support I/O polling if submit_bio_noacct_mq_direct returned a
>>>>>>> non-empty cookie.
>>>>>>>
>>>>>>> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
>>>>>>>
>>>>>>> ---
>>>>>>>    drivers/md/dm.c |    5 +++++
>>>>>>>    1 file changed, 5 insertions(+)
>>>>>>>
>>>>>>> Index: linux-2.6/drivers/md/dm.c
>>>>>>> ===================================================================
>>>>>>> --- linux-2.6.orig/drivers/md/dm.c    2021-03-02 19:26:34.000000000 +0100
>>>>>>> +++ linux-2.6/drivers/md/dm.c    2021-03-02 19:26:34.000000000 +0100
>>>>>>> @@ -1682,6 +1682,11 @@ static void __split_and_process_bio(stru
>>>>>>>          }
>>>>>>>      }
>>>>>>>
>>>>>>> +    if (ci.poll_cookie != BLK_QC_T_NONE) {
>>>>>>> +        while (atomic_read(&ci.io->io_count) > 1 &&
>>>>>>> +               blk_poll(ci.poll_queue, ci.poll_cookie, true)) ;
>>>>>>> +    }
>>>>>>> +
>>>>>>>      /* drop the extra reference count */
>>>>>>>      dec_pending(ci.io, errno_to_blk_status(error));
>>>>>>>  }
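For orientation, here is a minimal annotated restatement of what the hunk
adds (a sketch: the field names follow the patch, the rest of
__split_and_process_bio() and the struct definitions are elided):

```
/* End of __split_and_process_bio() with the patch applied (sketch).
 * ci.poll_cookie is the cookie returned for the last split bio
 * submitted above; io_count counts in-flight split bios plus the
 * extra reference dropped right below this block. */
if (ci.poll_cookie != BLK_QC_T_NONE) {
	/* Spin on the last split bio's hardware queue while other
	 * split bios remain in flight; blk_poll() returning 0
	 * (nothing reaped this pass) also ends the loop. */
	while (atomic_read(&ci.io->io_count) > 1 &&
	       blk_poll(ci.poll_queue, ci.poll_cookie, true))
		;
}
```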
>>>>>> It seems that the general idea of your design is to
>>>>>> 1) submit *one* split bio
>>>>>> 2) blk_poll(), waiting until the previously submitted split bio completes,
>>>>> No, I submit all the bios and poll for the last one.
>>>>>
>>>>>> and then submit the next split bio, repeating the above process.
>>>>>> I'm afraid
>>>>>> performance may be an issue here, since the batch that
>>>>>> blk_poll() reaps each time may shrink.
>>>>> Could you benchmark it?
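To make the batching concern concrete before the numbers below: a
hypothetical per-split-bio variant (not from any posted patch;
clone_next_split() and split_bio_done() are invented helpers used only
to illustrate the reading Jeffle describes) would keep a single bio in
flight, so each blk_poll() call could reap at most one completion:

```
struct bio *clone;

/* Hypothetical submit-one-then-poll pattern -- NOT the posted patch.
 * clone_next_split() and split_bio_done() are made-up helpers. */
while ((clone = clone_next_split(&ci))) {
	blk_qc_t cookie = submit_bio_noacct(clone);

	/* One bio in flight at a time, so blk_poll() batches shrink
	 * to a single completion per call. */
	while (!split_bio_done(clone) &&
	       blk_poll(ci.poll_queue, cookie, true))
		;
}
```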
>>>> I only tested dm-linear.
>>>>
>>>> The configuration (dm table) of dm-linear is:
>>>> 0 1048576 linear /dev/nvme0n1 0
>>>> 1048576 1048576 linear /dev/nvme2n1 0
>>>> 2097152 1048576 linear /dev/nvme5n1 0
>>>>
>>>>
>>>> The fio script used is:
>>>> ```
>>>> $cat fio.conf
>>>> [global]
>>>> name=iouring-sqpoll-iopoll-1
>>>> ioengine=io_uring
>>>> iodepth=128
>>>> numjobs=1
>>>> thread
>>>> rw=randread
>>>> direct=1
>>>> registerfiles=1
>>>> hipri=1
>>>> runtime=10
>>>> time_based
>>>> group_reporting
>>>> randrepeat=0
>>>> filename=/dev/mapper/testdev
>>>> bs=4k
>>>>
>>>> [job-1]
>>>> cpus_allowed=14
>>>> ```
>>>>
>>>> IOPS (IRQ mode) | IOPS (iopoll mode, hipri=1)
>>>> --------------- | ---------------------------
>>>>            213k |                         19k
>>>>
>>>> At least it doesn't work well with the io_uring interface.
>>>>
>>>>
>>>
>>> Jeffle,
>>>
>>> I ran your fio test above on a linear LV split across 3 NVMes to
>>> mirror your split mapping (system: 32-core Intel, 256GiB RAM),
>>> comparing the io engines sync, libaio and io_uring, the latter
>>> with and without hipri (sync+libaio obviously without
>>> registerfiles and hipri), with reasonable results:
>>>
>>> sync  |  libaio  |  IRQ mode (hipri=0) | iopoll (hipri=1)
>>> ------|----------|---------------------|-----------------
>>> 56.3K |    290K  |                329K |             351K
>>>
>>> I can't reproduce your drastic hipri=1 drop here...
>>
>> Sorry, email mess.
>>
>>
>> sync   |  libaio  |  IRQ mode (hipri=0) | iopoll (hipri=1)
>> -------|----------|---------------------|-----------------
>> 56.3K  |    290K  |                329K |             351K
>>
>>
>>
>> I can't reproduce your drastic hipri=1 drop here...
> I think your result is just showcasing your powerful system's ability to
> poll every related HW queue... whereas Jeffle's system is likely somehow
> more constrained (CPU, memory, whatever).

Jeffle,

mind describing your test system configuration in a bit more detail
(cores, RAM, I/O paths to your NVMes),
so that we have a better basis for that assumption?

Thanks,
Heinz

>
> My basis for this is that Mikulas' changes simply always return an
> invalid cookie (BLK_QC_T_NONE) for purposes of intelligent IO polling.
>
> Such an implementation is completely invalid.
>
> I discussed this with Jens and he said:
> "it needs to return something that f_op->iopoll() can make sense of.
> otherwise you have no option but to try everything."
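For context on that remark: in the block layer of this era, a valid
cookie encodes the hardware queue number and tag, which is what an
f_op->iopoll() implementation resolves to decide where to poll;
BLK_QC_T_NONE encodes neither. Roughly (from memory of
include/linux/blk_types.h around v5.11, so treat as a sketch):

```
/* Cookie helpers, approximately as in include/linux/blk_types.h
 * of the v5.11 era (reproduced from memory): */
#define BLK_QC_T_NONE		-1U
#define BLK_QC_T_SHIFT		16
#define BLK_QC_T_INTERNAL	(1U << 31)

static inline bool blk_qc_t_valid(blk_qc_t cookie)
{
	return cookie != BLK_QC_T_NONE;	/* NONE: nothing to poll */
}

static inline unsigned int blk_qc_t_to_queue_num(blk_qc_t cookie)
{
	return (cookie & ~BLK_QC_T_INTERNAL) >> BLK_QC_T_SHIFT;
}

static inline unsigned int blk_qc_t_to_tag(blk_qc_t cookie)
{
	return cookie & ((1u << BLK_QC_T_SHIFT) - 1);
}
```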
>
> Mike
>

Thread overview: 34+ messages

2021-03-02 19:05 [PATCH 4/4] dm: support I/O polling Mikulas Patocka
2021-03-03  2:53 ` JeffleXu
2021-03-03 10:09   ` Mikulas Patocka
2021-03-04  2:57     ` JeffleXu
2021-03-04 10:09       ` Mikulas Patocka
2021-03-05 18:21         ` Jens Axboe
2021-03-04 15:01     ` Jeff Moyer
2021-03-04 15:11       ` Mike Snitzer
2021-03-04 15:12       ` Mikulas Patocka
2021-03-05  9:52     ` JeffleXu
2021-03-05 17:46       ` Heinz Mauelshagen
2021-03-05 17:56         ` Heinz Mauelshagen
2021-03-05 18:09           ` Mike Snitzer
2021-03-05 18:19             ` Heinz Mauelshagen [this message]
2021-03-08  3:54           ` JeffleXu
2021-03-08  3:55             ` Jens Axboe
2021-03-09 11:42             ` Heinz Mauelshagen
