* (unknown), 
@ 2017-09-27 17:41 Michael Lyle
  2017-09-27 17:41 ` [PATCH v3 1/5] bcache: don't write back data if reading it failed Michael Lyle
                   ` (4 more replies)
  0 siblings, 5 replies; 9+ messages in thread
From: Michael Lyle @ 2017-09-27 17:41 UTC (permalink / raw)
  To: linux-bcache

Hey everyone---

After the review comments from last night, I'm back to try again :)


Thanks everyone for your help. Notes on what has changed, and on how
patch 4 helps with future work (which is why it is slightly more
complicated), are below.

Mike

Changes from last night:

- Reformatted many comments to match the rest of the bcache style.
- Fixed a bug noticed by Tang Junhui where contiguous I/O would not be
dispatched together.
- Replaced the magic numbers '5' and '5000' with the macros
MAX_WRITEBACKS_IN_PASS and MAX_WRITESIZE_IN_PASS.
- Slight improvements to patch logs.
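
To illustrate the contiguous-dispatch fix and the two new macros, here
is a minimal userspace sketch, not the code from patch 4.  The struct
and the function name are hypothetical, and treating the 5000 cap as a
count of 512-byte sectors is an assumption.  Only the macro names and
the values 5 and 5000 come from the list above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Names and values from the cover letter; the sector unit is assumed. */
#define MAX_WRITEBACKS_IN_PASS 5
#define MAX_WRITESIZE_IN_PASS  5000

/* Hypothetical stand-in for one dirty extent on the backing device. */
struct wb_extent {
	uint64_t start;   /* first sector */
	uint32_t sectors; /* length in sectors */
};

/*
 * Return how many extents from the front of ext[] would be dispatched
 * together.  The first extent is always taken; each later extent is
 * added only if it starts exactly where the previous one ends and
 * neither the count cap nor the size cap would be exceeded.
 */
static size_t wb_batch_len(const struct wb_extent *ext, size_t n)
{
	size_t count;
	uint64_t total;

	assert(n == 0 || ext != NULL);
	if (n == 0)
		return 0;

	count = 1;
	total = ext[0].sectors;
	while (count < n && count < MAX_WRITEBACKS_IN_PASS) {
		const struct wb_extent *prev = &ext[count - 1];
		const struct wb_extent *next = &ext[count];

		if (prev->start + prev->sectors != next->start)
			break;	/* gap: not contiguous */
		if (total + next->sectors > MAX_WRITESIZE_IN_PASS)
			break;	/* pass would grow too large */
		total += next->sectors;
		count++;
	}
	return count;
}
```

Given extents at sectors 0, 8 and 16 (8 sectors each) followed by one
at sector 100, the helper batches the first three and leaves the
fourth for a later pass.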

The net result of all these changes is better IO utilization during
writeback.  More contiguous I/O happens (whether during idle times or
when there is more activity).  Contiguous I/O is sent in proper order
to the backing device.  The control system picks better writeback
rate targets and the system can better hit them.

This is what I plan to work on next, in subsequent patches:

- Add code to skip doing small I/Os when A) there are larger I/Os in
the set, and B) the end of disk wasn't reached when scanning.  In
other words, try writing out the bigger contiguous chunks of writeback
first; give the other blocks time to end up with a larger extent next
to them.  This depends on patch 4, because it understands the true
contiguous backing I/O size and isn't fooled by smaller extents.

- Adjust bch_next_delay to store the reciprocal of what it currently
does, and remove the bounds on maximum-sleep-time.  Instead, enforce
a maximum sleep time at the writeback loop.  This will allow us to go
a long time (hundreds of seconds) without writing to the disk at all,
while still being ready to respond quickly to any increases in requested
writeback rate.  This depends on patch 4, which slightly changes the
formulation of the delay.

- Add a "fast writeback" mode for use when the disk is idle.  If
enabled, and there has been no I/O, it will issue one (contiguous)
write at a time at IOPRIO_CLASS_IDLE, with no delay in between
(bypassing the control system).  Because only one writeback I/O is
outstanding and it runs at the lowest IOPRIO, latency for the first
user I/O request stays good: that request only has to compete with a
single low-priority writeback I/O in the queue.  This depends on
patch 4 in order to correctly merge contiguous requests in this mode.

- Add code to plug the backing device when there are more contiguous
requests coming.  This requires patch 4 (to be able to mark requests
to expect additional contiguous requests after them) and patch 5
(to properly order the I/O for the backing device).  This will help
ensure the scheduler properly merges operations (it usually works
now, but not always).

- Add code to lower writeback IOPRIO when the rate is easily being met,
so that end-user IO requests "win".
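
As a rough illustration of the first planned item (skipping small I/Os),
here is a minimal userspace sketch of the stated rule: defer small runs
only when A) a larger run is present in the set and B) the scan did not
reach the end of the disk.  None of this is from the patches.  The
struct, the function name and the 64-sector threshold are all
hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical cutoff below which a contiguous run counts as "small". */
#define SMALL_IO_SECTORS 64

/* Hypothetical contiguous run of dirty data found by one scan. */
struct wb_run {
	uint64_t start;   /* first sector */
	uint32_t sectors; /* contiguous length in sectors */
	bool skip;        /* set when the run is deferred to a later pass */
};

/*
 * Mark small runs to be skipped and return how many were marked.
 * Nothing is skipped if every run is small, or if the scan reached
 * the end of the disk.
 */
static size_t wb_mark_small_runs(struct wb_run *runs, size_t n,
				 bool reached_end)
{
	bool have_large = false;
	size_t skipped = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		runs[i].skip = false;
		if (runs[i].sectors >= SMALL_IO_SECTORS)
			have_large = true;
	}

	if (!have_large || reached_end)
		return 0;

	for (i = 0; i < n; i++) {
		if (runs[i].sectors < SMALL_IO_SECTORS) {
			runs[i].skip = true;
			skipped++;
		}
	}
	return skipped;
}
```

Deferred runs stay dirty, so a later scan may find them merged with
neighbouring data into a larger extent.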


Thread overview: 9+ messages
2017-09-27 17:41 (unknown), Michael Lyle
2017-09-27 17:41 ` [PATCH v3 1/5] bcache: don't write back data if reading it failed Michael Lyle
2017-09-27 17:41 ` [PATCH v3 2/5] bcache: implement PI controller for writeback rate Michael Lyle
2017-10-08  4:22   ` Coly Li
2017-10-08  4:57     ` Michael Lyle
2017-10-08  5:08       ` Coly Li
2017-09-27 17:41 ` [PATCH v3 3/5] bcache: smooth writeback rate control Michael Lyle
2017-09-27 17:41 ` [PATCH v3 4/5] bcache: writeback: collapse contiguous IO better Michael Lyle
2017-09-27 17:41 ` [PATCH v3 5/5] bcache: writeback: properly order backing device IO Michael Lyle
