From: Andrew Perepechko <anserper@yandex.ru>
To: Jan Kara <jack@suse.cz>
Cc: linux-fsdevel@vger.kernel.org
Subject: Re: quota: dqio_mutex design
Date: Wed, 02 Aug 2017 20:52:51 +0300
Message-ID: <1691224.ooLB1CWbbI@panda>
In-Reply-To: <20170802162552.GA30353@quack2.suse.cz>

> On Tue 01-08-17 15:02:42, Jan Kara wrote:
> > Hi Andrew,
> > 
> > On Fri 23-06-17 02:43:44, Andrew Perepechko wrote:
> > > The original workload was 50 threads sequentially creating files,
> > > each thread in its own directory, over a fast RAID array.
> > 
> > OK, I can reproduce this. Actually, I can reproduce it on a normal SATA
> > drive. Originally I tried a ramdisk to simulate a really fast drive, but
> > there the dq_list_lock and dq_data_lock contention is much more visible
> > and the contention on dqio_mutex is minimal (two orders of magnitude
> > smaller). On a SATA drive we spend ~45% of runtime contending on
> > dqio_mutex when creating empty files.
> 
> So this was just me misinterpreting lockstat data (I forgot to divide the
> wait time by the number of processes) - the result would then be that each
> process waits only ~1% of its runtime for dqio_mutex.
> 
> Anyway, my patches show a ~10% improvement in runtime when 50 different
> processes create empty files for 50 different users. As expected, there is
> no measurable benefit when all processes create files for the same user.
> 
> > The problem is that if it is a single user creating all these files, it
> > is not clear how we could do much better - all processes contend to
> > update the same location on disk with quota information for that user,
> > and they have to be synchronized somehow. If there are more users, we
> > could do better by splitting dqio_mutex on a per-dquot basis (I have
> > some preliminary patches for that).
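
(For reference, a sketch of how I understand such a split - my
illustration, not your actual patches; the dq_commit_lock field is
hypothetical:

	/* hypothetical per-dquot lock added to struct dquot */
	struct mutex dq_commit_lock;

	int dquot_commit(struct dquot *dquot)
	{
		struct quota_info *dqopt = sb_dqopt(dquot->dq_sb);
		int ret = 0;

		/* serialize on this dquot only, not the whole filesystem */
		mutex_lock(&dquot->dq_commit_lock);
		if (!clear_dquot_dirty(dquot))
			goto out;	/* someone else already wrote it */
		ret = dqopt->ops[dquot->dq_id.type]->commit_dqblk(dquot);
	out:
		mutex_unlock(&dquot->dq_commit_lock);
		return ret;
	}

so that processes updating different users' dquots no longer serialize on
a single filesystem-wide mutex.)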
> > 
> > One idea I have how we could make things faster is that instead of having
> > dquot dirty flag, we would have a sequence counter. So currently dquot
> > modification looks like:
> > 
> > update counters in dquot
> > dquot_mark_dquot_dirty(dquot);
> > dquot_commit(dquot)
> > 
> >   mutex_lock(dqio_mutex);
> >   if (!clear_dquot_dirty(dquot))
> >     nothing to do -> bail
> >   ->commit_dqblk(dquot)
> >   mutex_unlock(dqio_mutex);
> > 
> > When several processes race updating the same dquot, they very often all
> > end up updating the dquot on disk even though another process has
> > already written the dquot for them while they were waiting for
> > dqio_mutex - in my test above the ratio of commit_dqblk / dquot_commit
> > calls was 59%. What we could do is that dquot_mark_dquot_dirty() would
> > return the "current sequence of the dquot", dquot_commit() would then
> > get the sequence that is required to be written, and if that is already
> > written (we would also store in the dquot the latest written sequence),
> > it would bail out doing nothing. This should cut down dqio_mutex hold
> > times and thus wait times, but I need to experiment and measure that...
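
(If I read the sequence idea right, the commit path would look roughly
like this - my sketch; the dq_dirty_seq and dq_written_seq fields are
hypothetical:

	/* dquot_mark_dquot_dirty() would return ++dquot->dq_dirty_seq */
	u64 seq = dquot_mark_dquot_dirty(dquot);
	...
	mutex_lock(&dqopt->dqio_mutex);
	if (dquot->dq_written_seq >= seq) {
		/* another process already wrote our update for us */
		mutex_unlock(&dqopt->dqio_mutex);
		return 0;
	}
	ret = dqopt->ops[dquot->dq_id.type]->commit_dqblk(dquot);
	if (ret >= 0)
		dquot->dq_written_seq = seq;
	mutex_unlock(&dqopt->dqio_mutex);

so racing committers compare sequence numbers instead of all taking the
write path.)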
> 
> I've been experimenting with this today, but the idea didn't bring any
> benefit in my testing. Was your setup with multiple users or a single
> user? Could you give my patches some testing to see whether they bring
> any benefit for you?
> 
> 								Honza

Hi Jan!

My setup was with a single user. Unfortunately, it may take some time
before I can try a patched kernel other than RHEL6 or RHEL7 with the same
test, since we have a lot of dependencies on those kernels.

The actual test we ran was mdtest.
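
(Roughly this kind of invocation - from memory, the exact flags we used
may have differed:

	# 50 MPI tasks, each creating empty files in its own directory
	mpirun -np 50 mdtest -F -C -n 10000 -u -d /mnt/testfs/mdtest

where -F restricts the run to files, -C runs only the creation phase,
-n is the number of items per task, and -u gives each task a unique
working directory.)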

By the way, we saw a 15+% performance improvement in creates from the
change that was discussed earlier in this thread:

            if (EXT4_SB(dquot->dq_sb)->s_qf_names[USRQUOTA] ||
                EXT4_SB(dquot->dq_sb)->s_qf_names[GRPQUOTA]) {
+              if (test_bit(DQ_MOD_B, &dquot->dq_flags))
+                       return 0;
                dquot_mark_dquot_dirty(dquot);
                return ext4_write_dquot(dquot);

The idea was that if we know that some thread is somewhere between
mark_dirty and clear_dirty, then we can avoid blocking on dqio_mutex,
since that thread will update the on-disk dquot for us.
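
In context, the patched function reads roughly as follows (quoting from
memory, so the context lines may differ between kernels):

	static int ext4_mark_dquot_dirty(struct dquot *dquot)
	{
		/* Are we journaling quotas? */
		if (EXT4_SB(dquot->dq_sb)->s_qf_names[USRQUOTA] ||
		    EXT4_SB(dquot->dq_sb)->s_qf_names[GRPQUOTA]) {
			/*
			 * Another thread is between mark_dirty and
			 * clear_dirty: it will write this dquot for us,
			 * so we can skip taking dqio_mutex.
			 */
			if (test_bit(DQ_MOD_B, &dquot->dq_flags))
				return 0;
			dquot_mark_dquot_dirty(dquot);
			return ext4_write_dquot(dquot);
		} else {
			return dquot_mark_dquot_dirty(dquot);
		}
	}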

I think you also mentioned that some mark_dquot_dirty callers, such
as do_set_dqblk, may not be running with an open transaction handle,
so we cannot assume this optimization is atomic. However, we don't
use do_set_dqblk and seem safe with respect to journalling.

Thank you,
Andrew
