From: Pavel Begunkov <asml.silence@gmail.com>
To: Hao Xu <haoxu@linux.alibaba.com>, Jens Axboe <axboe@kernel.dk>
Cc: io-uring@vger.kernel.org, Joseph Qi <joseph.qi@linux.alibaba.com>
Subject: Re: [PATCH RFC 5.13 1/2] io_uring: add support for ns granularity of io_sq_thread_idle
Date: Thu, 29 Apr 2021 23:15:51 +0100
Message-ID: <96ef70e8-7abf-d820-3cca-0f8aedc969d8@gmail.com>
In-Reply-To: <51308ac4-03b7-0f66-7f26-8678807195ca@linux.alibaba.com>

On 4/29/21 4:28 AM, Hao Xu wrote:
> On 2021/4/28 10:07 PM, Pavel Begunkov wrote:
>> On 4/28/21 2:32 PM, Hao Xu wrote:
>>> Currently the unit of io_sq_thread_idle is the millisecond and the
>>> smallest value is 1ms, which means that for IOPS > 1000 the sqthread
>>> will very likely take 100% CPU. This is unnecessary in some cases:
>>> users may not care much about latency under low IO pressure
>>> (like 1000 < IOPS < 20000), but CPU resources do matter. So we offer
>>> an option of nanosecond granularity for io_sq_thread_idle. Some test
>>> results by fio below:
>>
>> If numbers justify it, I don't see why not do it in ns, but I'd suggest
>> to get rid of all the mess and simply convert to jiffies during ring
>> creation (i.e. nsecs_to_jiffies64()), and leave io_sq_thread() unchanged.
> 1) here I keep millisecond mode for compatibility
> 2) I saw jiffies is calculated by HZ, and HZ could be large enough
> (like HZ = 1000) to make nsecs_to_jiffies64() = 0:
> 
>  u64 nsecs_to_jiffies64(u64 n)
>  {
>  #if (NSEC_PER_SEC % HZ) == 0
>          /* Common case, HZ = 100, 128, 200, 250, 256, 500, 512, 1000 etc. */
>          return div_u64(n, NSEC_PER_SEC / HZ);
>  #elif (HZ % 512) == 0
>          /* overflow after 292 years if HZ = 1024 */
>          return div_u64(n * HZ / 512, NSEC_PER_SEC / 512);
>  #else
>          /*
>           * Generic case - optimized for cases where HZ is a multiple of 3.
>           * overflow after 64.99 years, exact for HZ = 60, 72, 90, 120 etc.
>           */
>          return div_u64(n * 9, (9ull * NSEC_PER_SEC + HZ / 2) / HZ);
>  #endif
>  }
> 
> Say HZ = 1000, then nsecs_to_jiffies64(1us) = 1e3 / (1e9 / 1e3) = 0.
> IOW, nsecs_to_jiffies64() returns 0 for any n < (1e9 / HZ).

Agreed, jiffies precision is apparently only a fraction of a second,
e.g. 0.001s. But I'd much prefer not to duplicate all that. So, jiffies
won't do; ktime() may be ok but a bit heavier than we'd like it to be...
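
To make the truncation concrete, here is a minimal userspace sketch of the
common-case branch only (NSEC_PER_SEC % HZ == 0). The names
sketch_nsecs_to_jiffies64() and sketch_min_nonzero_nsecs() are hypothetical
stand-ins, not kernel functions, and hz is passed at run time here whereas
the kernel uses the compile-time constant HZ.

```c
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/*
 * Hypothetical userspace stand-in for the common-case branch of
 * nsecs_to_jiffies64(). Assumes hz is non-zero and divides
 * NSEC_PER_SEC evenly (100, 250, 1000, ...).
 */
static uint64_t sketch_nsecs_to_jiffies64(uint64_t n, uint64_t hz)
{
	/* Integer division: anything shorter than one tick truncates to 0. */
	return n / (NSEC_PER_SEC / hz);
}

/* Smallest nanosecond value that still converts to at least one jiffy. */
static uint64_t sketch_min_nonzero_nsecs(uint64_t hz)
{
	return NSEC_PER_SEC / hz;
}
```

With hz = 1000 one tick is 1,000,000 ns, so 1us (1000 ns) converts to 0
jiffies and 1ms converts to exactly 1. With hz = 100 even 1ms converts to 0.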

Jens, any chance you remember something in the middle? Like same source
as ktime() but without the heavy correction it does.

-- 
Pavel Begunkov

