linux-fsdevel.vger.kernel.org archive mirror
From: "Theodore Ts'o" <tytso@mit.edu>
To: Josef Bacik <josef@toxicpanda.com>
Cc: Jan Kara <jack@suse.cz>, Paolo Valente <paolo.valente@linaro.org>,
	"Srivatsa S. Bhat" <srivatsa@csail.mit.edu>,
	linux-fsdevel@vger.kernel.org,
	linux-block <linux-block@vger.kernel.org>,
	linux-ext4@vger.kernel.org, cgroups@vger.kernel.org,
	linux-kernel@vger.kernel.org, axboe@kernel.dk, jmoyer@redhat.com,
	amakhalov@vmware.com, anishs@vmware.com, srivatsab@vmware.com
Subject: Re: CFQ idling kills I/O performance on ext4 with blkio cgroup controller
Date: Tue, 21 May 2019 15:10:33 -0400
Message-ID: <20190521191033.GA4855@mit.edu>
In-Reply-To: <20190521181952.4vpruone2mzbczpw@MacBook-Pro-91.local>

On Tue, May 21, 2019 at 02:19:53PM -0400, Josef Bacik wrote:
> Chris is adding a REQ_ROOT (or something) flag that means don't throttle me now,
> but the blkcg attached to the bio is the one that is responsible for this
> IO.  Then for io.latency we'll let the io go through unmolested but it gets
> counted to the right cgroup, and if then we're exceeding latency guarantees we
> have the ability to schedule throttling for that cgroup in a safer place.  This
> would eliminate the data=ordered issue for ext4, you guys keep doing what you
> are doing and we'll handle throttling elsewhere, just so long as the bio's are
> tagged with the correct source then all is well.  Thanks,

Great, it sounds like Chris also came up with the entangled writes
flag idea (although with probably a better name than I did :-).  So
now all we need to do is to plumb a flag through the writeback code so
that file systems (or the VFS layer) implementing syncfs(2) or
fsync(2) can arrange to have that flag set if necessary.
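
As a rough illustration of the behaviour Josef describes, here is a
minimal userspace sketch in C.  Nothing in it is kernel code: the
SKETCH_REQ_ROOT flag, the structs, and sketch_submit() are hypothetical
names, and the simple byte budget only stands in for whatever limit
io.latency would really enforce.  The point it shows is that a tagged
bio is dispatched immediately but still charged to its own cgroup, with
throttling deferred to a later, safer point.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag; the thread only says "REQ_ROOT (or something)". */
#define SKETCH_REQ_ROOT (1u << 0)

/* Hypothetical stand-in for a blkcg; a byte budget replaces real limits. */
struct sketch_cgroup {
	unsigned long bytes_charged;	/* I/O accounted to this cgroup */
	unsigned long budget;		/* bytes allowed before throttling */
	bool throttle_pending;		/* throttle later, in a safer place */
};

struct sketch_bio {
	unsigned int flags;
	unsigned long size;
	struct sketch_cgroup *cg;	/* cgroup responsible for this I/O */
};

/* Returns true if the bio may be dispatched now, false if it must wait. */
static bool sketch_submit(struct sketch_bio *bio)
{
	struct sketch_cgroup *cg = bio->cg;

	assert(cg != NULL);

	bool over = cg->bytes_charged + bio->size > cg->budget;

	if (bio->flags & SKETCH_REQ_ROOT) {
		/* Never block here, but charge the right cgroup. */
		cg->bytes_charged += bio->size;
		if (over)
			cg->throttle_pending = true;
		return true;
	}

	if (over)
		return false;	/* ordinary I/O is throttled at submit time */

	cg->bytes_charged += bio->size;
	return true;
}
```

In this model a journal commit would carry the flag and so never stall
behind a throttled cgroup, while the cgroup that generated the I/O
still pays for it afterwards.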

Speaking of syncfs(2), something which we considered doing at Google
many years ago (but never did) was to implement a hack so that someone
calling syncfs(2) or sync(2) as a non-root user would turn that
system call into a no-op.  The reason for this was that on heavily loaded
machines, an SRE logged in as a non-root user might absent-mindedly type
"sync", and that would cause a storm of I/O traffic that would really
mess up the machine.  The jobs that were in the low latency bucket
would be protected (since we didn't run with journalling), but those
that were in the best-effort bucket would be really unhappy.

If we have a "don't throttle me now" REQ_ROOT flag combined with
journalling, then someone running "sync", even if it's by accident,
could really ruin a low-latency job's day, and in a container
environment, there really is no reason for a non-root user to be
wanting to request a syncfs(2) or sync(2).  So maybe we should have a
way to make it be a no-op (or return an error, but that might surprise
some applications) for non-privileged users.  Maybe as a per-mount
flag/option, or via some other tunable?
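
To make the options concrete, here is a small sketch in C of the
decision such a tunable would have to make.  It is not an existing
kernel interface: the policy enum, the action enum, and
sketch_sync_check() are all hypothetical, and how "privileged" would be
determined (for example by a capability check) is left open.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical per-mount (or global) policy for unprivileged callers. */
enum sketch_sync_policy {
	SKETCH_SYNC_ALLOW,	/* today's behaviour: anyone may sync */
	SKETCH_SYNC_NOOP,	/* silently succeed without doing I/O */
	SKETCH_SYNC_EPERM,	/* fail; may surprise some applications */
};

enum sketch_sync_action {
	SKETCH_DO_SYNC,
	SKETCH_SKIP_SYNC,
};

/*
 * Decide what sync(2)/syncfs(2) should do for this caller.
 * Returns 0 or a negative errno, and stores the action to take.
 */
static int sketch_sync_check(bool privileged,
			     enum sketch_sync_policy policy,
			     enum sketch_sync_action *action)
{
	*action = SKETCH_DO_SYNC;
	if (privileged || policy == SKETCH_SYNC_ALLOW)
		return 0;

	/* Unprivileged caller under a restrictive policy: no writeback. */
	*action = SKETCH_SKIP_SYNC;
	return policy == SKETCH_SYNC_EPERM ? -EPERM : 0;
}
```

The no-op variant returns success without starting writeback, so
existing callers keep working; the error variant makes the restriction
visible at the cost of possibly breaking applications that check the
return value.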

						- Ted

