dm-devel.redhat.com archive mirror
 help / color / mirror / Atom feed
From: Jens Axboe <axboe@kernel.dk>
To: Mike Snitzer <snitzer@kernel.org>
Cc: linux-block@vger.kernel.org, keescook@chromium.org,
	heinzm@redhat.com, ejt@redhat.com, nhuck@google.com,
	ebiggers@kernel.org, dm-devel@redhat.com, mpatocka@redhat.com,
	luomeng12@huawei.com
Subject: Re: [dm-devel] [dm-6.4 PATCH v2 3/9] dm bufio: improve concurrent IO performance
Date: Fri, 24 Mar 2023 17:11:12 -0600	[thread overview]
Message-ID: <e552ad80-37fe-247e-aaf1-064572ccc154@kernel.dk> (raw)
In-Reply-To: <ZB4p2NfwIhh9raxa@redhat.com>

On 3/24/23 4:53?PM, Mike Snitzer wrote:
> On Fri, Mar 24 2023 at  3:34P -0400,
> Jens Axboe <axboe@kernel.dk> wrote:
> 
>> Just some random drive-by comments.
>>
>>> diff --git a/drivers/md/dm-bufio.c b/drivers/md/dm-bufio.c
>>> index 1de1bdcda1ce..a58f8ac3ba75 100644
>>> --- a/drivers/md/dm-bufio.c
>>> +++ b/drivers/md/dm-bufio.c
>>> +static void lru_destroy(struct lru *lru)
>>> +{
>>> +	BUG_ON(lru->cursor);
>>> +	BUG_ON(!list_empty(&lru->iterators));
>>> +}
>>
>> Ehm no, WARN_ON_ONCE() for these presumably.
> 
> Yeah, I raised concern about the BUG_ONs with Joe. Will try to rid the
> code of BUG_ONs as follow-on work.

Please do.

>>> @@ -116,9 +366,579 @@ struct dm_buffer {
>>>  #endif
>>>  };
>>>  
>>> +/*--------------------------------------------------------------*/
>>> +
>>> +/*
>>> + * The buffer cache manages buffers, particularly:
>>> + *  - inc/dec of holder count
>>> + *  - setting the last_accessed field
>>> + *  - maintains clean/dirty state along with lru
>>> + *  - selecting buffers that match predicates
>>> + *
>>> + * It does *not* handle:
>>> + *  - allocation/freeing of buffers.
>>> + *  - IO
>>> + *  - Eviction or cache sizing.
>>> + *
>>> + * cache_get() and cache_put() are threadsafe, you do not need to
>>> + * protect these calls with a surrounding mutex.  All the other
>>> + * methods are not threadsafe; they do use locking primitives, but
>>> + * only enough to ensure get/put are threadsafe.
>>> + */
>>> +
>>> +#define NR_LOCKS 64
>>> +#define LOCKS_MASK (NR_LOCKS - 1)
>>> +
>>> +struct tree_lock {
>>> +	struct rw_semaphore lock;
>>> +} ____cacheline_aligned_in_smp;
>>> +
>>> +struct dm_buffer_cache {
>>> +	/*
>>> +	 * We spread entries across multiple trees to reduce contention
>>> +	 * on the locks.
>>> +	 */
>>> +	struct tree_lock locks[NR_LOCKS];
>>> +	struct rb_root roots[NR_LOCKS];
>>> +	struct lru lru[LIST_SIZE];
>>> +};
>>
>> This:
>>
>> struct foo_tree {
>> 	struct rw_semaphore lock;
>> 	struct rb_root root;
>> 	struct lru lru;
>> } ____cacheline_aligned_in_smp;
>>
>> would be a lot better.
>>
>> And where does this NR_LOCKS come from? Don't make up magic values out
>> of thin air. Should this be per-cpu? per-node? N per node? I'll bet you
>> that 64 is way too much for most use cases, and too little for others.
> 
> I cannot speak to the 64 magic value (other than I think it worked
> well for Joe's testbed).  But the point of this exercise is to split
> the lock to avoid contention.  Using 64 accomplishes this. Having
> there be more or less isn't _that_ critical.  The hash to get the
> region index isn't a function of cpu.  We aren't after lockless here.

I don't doubt it worked well in his setup, and it'll probably be fine in
a lot of setups. But the point still stands - it's a magic value, it
should at least be documented. And 64 is likely way too much on a lot of
machines.

> Now that said, will certainly discuss further with Joe and see if we
> can be smarter here.
> 
> Your suggestion to combine members certainly makes a lot of sense.  I
> ran with it relative to the bio-prison-v1 (patch 9) changes which have
> the same layout. Definitely a win on in-core memory as well as
> avoiding cacheline thrashing while accessing the lock and then the
> rb_root members (as is always expected):

Right, this is why I suggested doing it like that. It's not very smart
to split related members like that, wastes both memory and is less
efficient than doing the right thing.

>> I stopped reading here, the patch is just too long. Surely this could be
>> split up?
>>
>>  1 file changed, 1292 insertions(+), 477 deletions(-)
>>
>> That's not a patch, that's a patch series.
> 
> I don't want to upset you or the community but saying this but:
> in this narrow instance where a sizable percentage of the file got
> rewritten: to properly review this work you need to look at the full
> scope of the changes in combination.

That's nonsense. That's like saying "to see what this series does, apply
the whole thing and compare it with the file before". With that logic,
why even split changes ever.

A big patch that could be split is harder to properly review than a lot
of small patches. Nobody ever reviews a 1000+ line patch. But I guess we
can just stop reviewing?

It should be split, it's really not up for debate.

-- 
Jens Axboe

--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel


  reply	other threads:[~2023-03-24 23:11 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-03-24 17:56 [dm-devel] [dm-6.4 PATCH v2 0/9] dm bufio, thin: improve concurrent IO performance Mike Snitzer
2023-03-24 17:56 ` [dm-devel] [dm-6.4 PATCH v2 1/9] dm bufio: remove unused dm_bufio_release_move interface Mike Snitzer
2023-03-24 17:56 ` [dm-devel] [dm-6.4 PATCH v2 2/9] dm bufio: move dm_buffer struct Mike Snitzer
2023-03-24 17:56 ` [dm-devel] [dm-6.4 PATCH v2 3/9] dm bufio: improve concurrent IO performance Mike Snitzer
2023-03-24 19:34   ` Jens Axboe
2023-03-24 22:53     ` Mike Snitzer
2023-03-24 23:11       ` Jens Axboe [this message]
2023-03-25  4:21         ` Mike Snitzer
2023-03-24 17:56 ` [dm-devel] [dm-6.4 PATCH v2 4/9] dm bufio: move dm_bufio_client members to avoid spanning cachelines Mike Snitzer
2023-03-24 17:56 ` [dm-devel] [dm-6.4 PATCH v2 5/9] dm bufio: use waitqueue_active in __free_buffer_wake Mike Snitzer
2023-03-24 17:56 ` [dm-devel] [dm-6.4 PATCH v2 6/9] dm bufio: use multi-page bio vector Mike Snitzer
2023-03-24 17:56 ` [dm-devel] [dm-6.4 PATCH v2 7/9] dm thin: speed up cell_defer_no_holder() Mike Snitzer
2023-03-24 17:56 ` [dm-devel] [dm-6.4 PATCH v2 8/9] dm: split discards further if target sets max_discard_granularity Mike Snitzer
2023-03-24 17:56 ` [dm-devel] [dm-6.4 PATCH v2 9/9] dm bio prison v1: improve concurrent IO performance Mike Snitzer

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=e552ad80-37fe-247e-aaf1-064572ccc154@kernel.dk \
    --to=axboe@kernel.dk \
    --cc=dm-devel@redhat.com \
    --cc=ebiggers@kernel.org \
    --cc=ejt@redhat.com \
    --cc=heinzm@redhat.com \
    --cc=keescook@chromium.org \
    --cc=linux-block@vger.kernel.org \
    --cc=luomeng12@huawei.com \
    --cc=mpatocka@redhat.com \
    --cc=nhuck@google.com \
    --cc=snitzer@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).