All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Michal Koutný" <mkoutny@suse.com>
To: "yukuai (C)" <yukuai3@huawei.com>
Cc: tj@kernel.org, axboe@kernel.dk, ming.lei@redhat.com,
	geert@linux-m68k.org, cgroups@vger.kernel.org,
	linux-block@vger.kernel.org, linux-kernel@vger.kernel.org,
	yi.zhang@huawei.com
Subject: Re: [PATCH -next v3 2/2] blk-throttle: fix io hung due to configuration updates
Date: Thu, 19 May 2022 18:10:26 +0200	[thread overview]
Message-ID: <20220519161026.GG16096@blackbody.suse.cz> (raw)
In-Reply-To: <a8953189-af42-0225-3031-daf61347524a@huawei.com>

On Thu, May 19, 2022 at 08:14:28PM +0800, "yukuai (C)" <yukuai3@huawei.com> wrote:
> tg_with_in_bps_limit:
>  jiffy_elapsed_rnd = jiffies - tg->slice_start[rw];
>  tmp = bps_limit * jiffy_elapsed_rnd;
>  do_div(tmp, HZ);
>  bytes_allowed = tmp; -> how many bytes are allowed in this slice,
> 		         incluing dispatched.
>  if (tg->bytes_disp[rw] + bio_size <= bytes_allowed)
>   *wait = 0 -> no need to wait if this bio is within limit
> 
>  extra_bytes = tg->bytes_disp[rw] + bio_size - bytes_allowed;
>  -> extra_bytes is based on 'bytes_disp'
> 
> For example:
> 
> 1) bps_limit is 2k, we issue two io, (1k and 9k)
> 2) the first io(1k) will be dispatched, bytes_disp = 1k, slice_start = 0
>    the second io(9k) is waiting for (9 - (2 - 1)) / 2 = 4 s

The 2nd io arrived at 1s, the wait time is 4s, i.e. it can be dispatched
at 5s (i.e. 10k/*2kB/s = 5s).

> 3) after 3 s, we update bps_limit to 1k, then new waiting is caculated:
> 
> without this patch:  bytes_disp = 0, slict_start =3:
> bytes_allowed = 1k	                            <--- why 1k and not 0?
> extra_bytes = 9k - 1k = 8k
> wait = 8s

This looks like it was calculated at time 4s (1s after new config was
set).

> 
> whth this patch: bytes_disp = 0.5k, slice_start =  0,
> bytes_allowed = 1k * 3 + 1k = 4k
> extra_bytes =  0.5k + 9k - 4k = 5.5k
> wait = 5.5s

This looks like calculated at 4s, so the IO would be waiting till
4s+5.5s = 9.5s.

As I don't know why using time 4s, I'll shift this calculation to the
time 3s (when the config changes):

bytes_disp = 0.5k, slice_start =  0,
bytes_allowed = 1k * 3  = 3k
extra_bytes =  0.5k + 9k - 3k = 7.5k
wait = 7.5s

In absolute time, the IO would wait till 3s+7.5s = 10.5s

OK, either your 9.5s or my 10.5s looks weird (although earlier than
original 4s+8s=12s).
However, the IO should ideally only wait till

    3s + (9k -   (6k    -    1k)     ) / 1k/s =
         bio - (allowed - dispatched)  / new_limit

   =3s + 4k / 1k/s = 7s

   ('allowed' is based on old limit)

Or in another example, what if you change the config from 2k/s to ∞k/s
(unlimited, let's neglect the arithmetic overflow that you handle
explicitly, imagine a big number but not so big to be greater than
division result).

In such a case, the wait time should be zero, i.e. IO should be
dispatched right at the time of config change.
(With your patch that still calculates >0 wait time (and the original
behavior gives >0 wait too.)

> I hope I can expliain it clearly...

Yes, thanks for pointing me to relevant parts.
I hope I grasped them correctly.

IOW, your patch and formula make the wait time shorter but still IO can
be delayed indefinitely if you pass a sequence of new configs. (AFAIU)

Regards,
Michal

WARNING: multiple messages have this Message-ID (diff)
From: "Michal Koutný" <mkoutny-IBi9RG/b67k@public.gmane.org>
To: "yukuai (C)" <yukuai3-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
Cc: tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org,
	axboe-tSWWG44O7X1aa/9Udqfwiw@public.gmane.org,
	ming.lei-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org,
	geert-Td1EMuHUCqxL1ZNQvxDV9g@public.gmane.org,
	cgroups-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	linux-block-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	yi.zhang-hv44wF8Li93QT0dZR+AlfA@public.gmane.org
Subject: Re: [PATCH -next v3 2/2] blk-throttle: fix io hung due to configuration updates
Date: Thu, 19 May 2022 18:10:26 +0200	[thread overview]
Message-ID: <20220519161026.GG16096@blackbody.suse.cz> (raw)
In-Reply-To: <a8953189-af42-0225-3031-daf61347524a-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>

On Thu, May 19, 2022 at 08:14:28PM +0800, "yukuai (C)" <yukuai3-hv44wF8Li93QT0dZR+AlfA@public.gmane.org> wrote:
> tg_with_in_bps_limit:
>  jiffy_elapsed_rnd = jiffies - tg->slice_start[rw];
>  tmp = bps_limit * jiffy_elapsed_rnd;
>  do_div(tmp, HZ);
>  bytes_allowed = tmp; -> how many bytes are allowed in this slice,
> 		         incluing dispatched.
>  if (tg->bytes_disp[rw] + bio_size <= bytes_allowed)
>   *wait = 0 -> no need to wait if this bio is within limit
> 
>  extra_bytes = tg->bytes_disp[rw] + bio_size - bytes_allowed;
>  -> extra_bytes is based on 'bytes_disp'
> 
> For example:
> 
> 1) bps_limit is 2k, we issue two io, (1k and 9k)
> 2) the first io(1k) will be dispatched, bytes_disp = 1k, slice_start = 0
>    the second io(9k) is waiting for (9 - (2 - 1)) / 2 = 4 s

The 2nd io arrived at 1s, the wait time is 4s, i.e. it can be dispatched
at 5s (i.e. 10k/*2kB/s = 5s).

> 3) after 3 s, we update bps_limit to 1k, then new waiting is caculated:
> 
> without this patch:  bytes_disp = 0, slict_start =3:
> bytes_allowed = 1k	                            <--- why 1k and not 0?
> extra_bytes = 9k - 1k = 8k
> wait = 8s

This looks like it was calculated at time 4s (1s after new config was
set).

> 
> whth this patch: bytes_disp = 0.5k, slice_start =  0,
> bytes_allowed = 1k * 3 + 1k = 4k
> extra_bytes =  0.5k + 9k - 4k = 5.5k
> wait = 5.5s

This looks like calculated at 4s, so the IO would be waiting till
4s+5.5s = 9.5s.

As I don't know why using time 4s, I'll shift this calculation to the
time 3s (when the config changes):

bytes_disp = 0.5k, slice_start =  0,
bytes_allowed = 1k * 3  = 3k
extra_bytes =  0.5k + 9k - 3k = 7.5k
wait = 7.5s

In absolute time, the IO would wait till 3s+7.5s = 10.5s

OK, either your 9.5s or my 10.5s looks weird (although earlier than
original 4s+8s=12s).
However, the IO should ideally only wait till

    3s + (9k -   (6k    -    1k)     ) / 1k/s =
         bio - (allowed - dispatched)  / new_limit

   =3s + 4k / 1k/s = 7s

   ('allowed' is based on old limit)

Or in another example, what if you change the config from 2k/s to ∞k/s
(unlimited, let's neglect the arithmetic overflow that you handle
explicitly, imagine a big number but not so big to be greater than
division result).

In such a case, the wait time should be zero, i.e. IO should be
dispatched right at the time of config change.
(With your patch that still calculates >0 wait time (and the original
behavior gives >0 wait too.)

> I hope I can expliain it clearly...

Yes, thanks for pointing me to relevant parts.
I hope I grasped them correctly.

IOW, your patch and formula make the wait time shorter but still IO can
be delayed indefinitely if you pass a sequence of new configs. (AFAIU)

Regards,
Michal

  reply	other threads:[~2022-05-19 16:10 UTC|newest]

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-05-19  8:58 [PATCH -next v3 0/2] bugfix for blk-throttle Yu Kuai
2022-05-19  8:58 ` Yu Kuai
2022-05-19  8:58 ` [PATCH -next v3 1/2] blk-throttle: fix that io throttle can only work for single bio Yu Kuai
2022-05-19  8:58   ` Yu Kuai
2022-05-19 10:42   ` Ming Lei
2022-05-19  8:58 ` [PATCH -next v3 2/2] blk-throttle: fix io hung due to configuration updates Yu Kuai
2022-05-19  8:58   ` Yu Kuai
2022-05-19  9:58   ` Michal Koutný
2022-05-19 12:14     ` yukuai (C)
2022-05-19 12:14       ` yukuai (C)
2022-05-19 16:10       ` Michal Koutný [this message]
2022-05-19 16:10         ` Michal Koutný
2022-05-20  1:22         ` yukuai (C)
2022-05-20  1:22           ` yukuai (C)
2022-05-20  1:36           ` yukuai (C)
2022-05-20  1:36             ` yukuai (C)
2022-05-20 16:03             ` Michal Koutný
2022-05-20 16:20               ` Tejun Heo
2022-05-21  3:51                 ` yukuai (C)
2022-05-21  3:51                   ` yukuai (C)
2022-05-21  5:00                   ` Tejun Heo
2022-05-21  5:00                     ` Tejun Heo
2022-05-21  3:01               ` yukuai (C)
2022-05-21  3:01                 ` yukuai (C)

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20220519161026.GG16096@blackbody.suse.cz \
    --to=mkoutny@suse.com \
    --cc=axboe@kernel.dk \
    --cc=cgroups@vger.kernel.org \
    --cc=geert@linux-m68k.org \
    --cc=linux-block@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=ming.lei@redhat.com \
    --cc=tj@kernel.org \
    --cc=yi.zhang@huawei.com \
    --cc=yukuai3@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.