From: Waiman Long <waiman.long@hpe.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Christoph Lameter <cl@linux.com>,
	Dave Chinner <david@fromorbit.com>,
	Alexander Viro <viro@zeniv.linux.org.uk>,
	Jan Kara <jack@suse.com>, Jeff Layton <jlayton@poochiereds.net>,
	"J. Bruce Fields" <bfields@fieldses.org>,
	Tejun Heo <tj@kernel.org>, <linux-fsdevel@vger.kernel.org>,
	<linux-kernel@vger.kernel.org>, Ingo Molnar <mingo@redhat.com>,
	Andi Kleen <andi@firstfloor.org>,
	Dave Chinner <dchinner@redhat.com>,
	Scott J Norton <scott.norton@hp.com>,
	Douglas Hatch <doug.hatch@hp.com>
Subject: Re: [RFC PATCH 1/2] lib/percpu-list: Per-cpu list with associated per-cpu locks
Date: Wed, 17 Feb 2016 13:45:35 -0500
Message-ID: <56C4BFCF.30100@hpe.com>
In-Reply-To: <20160217182212.GL6357@twins.programming.kicks-ass.net>

On 02/17/2016 01:22 PM, Peter Zijlstra wrote:
> On Wed, Feb 17, 2016 at 12:41:53PM -0500, Waiman Long wrote:
>> On 02/17/2016 12:18 PM, Peter Zijlstra wrote:
>>> On Wed, Feb 17, 2016 at 12:12:57PM -0500, Waiman Long wrote:
>>>> On 02/17/2016 11:27 AM, Christoph Lameter wrote:
>>>>> On Wed, 17 Feb 2016, Waiman Long wrote:
>>>>>
>>>>>> I know we can use RCU for singly linked list, but I don't think we can use
>>>>>> that for doubly linked list as there is no easy way to make atomic changes to
>>>>>> both prev and next pointers simultaneously unless you are talking about 16b
>>>>>> cmpxchg which is only supported in some architecture.
>>>>> But it's supported in the most important architectures. You can fall back to
>>>>> spinlocks on the ones that do not support it.
>>>>>
>>>> I guess with some limitations on how the lists can be traversed, we may be
>>>> able to do that with RCU without lock. However, that will make the code more
>>>> complex and harder to verify. Given that in both my and Dave's testing that
>>>> contentions with list insertion and deletion are almost gone from the perf
>>>> profile when they used to be a bottleneck, is it really worth the effort to
>>>> do such a conversion?
>>> My initial concern was the preempt disable delay introduced by holding
>>> the spinlock over the entire iteration.
>>>
>>> There is no saying how many elements are on that list and there is no
>>> lock break.
>> But preempt_disable() is called at the beginning of the spin_lock() call. So
>> the additional preempt_disable() in percpu_list_add() is just to cover the
>> this_cpu_ptr() call to make sure that the cpu number doesn't change. So we
>> are talking about a few ns at most here.
>>
> I'm talking about the list iteration, there is no preempt_disable() in
> there, just the spin_lock, which you hold over the entire list, which
> can be many, many elements.

Sorry for the misunderstanding.

The original code has one global lock and one single list that covers
all the inodes in the filesystem. This patch essentially breaks it up
into multiple smaller lists with one lock for each. So the lock hold
time should be greatly reduced unless we are unfortunate enough that
most of the inodes end up in one single list.
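
To make the point concrete, here is a rough userspace sketch of the
idea, not the code from the patch. It uses pthread mutexes in place of
spinlocks, and every name in it (NR_SHARDS, struct shard, sharded_add(),
sharded_walk(), add_value()) is made up for illustration. The caller
passes in a "cpu" index, which stands in for this_cpu_ptr(). The walk
returns the largest number of items visited under a single lock hold,
which is the quantity the lock hold time depends on.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define NR_SHARDS 4	/* hypothetical; stands in for the number of CPUs */

struct item {
	struct item *next;
	int value;
};

/* One list plus one lock per shard, instead of one global pair. */
struct shard {
	pthread_mutex_t lock;
	struct item *first;
};

struct sharded_list {
	struct shard shards[NR_SHARDS];
};

static void sharded_init(struct sharded_list *sl)
{
	for (int i = 0; i < NR_SHARDS; i++) {
		pthread_mutex_init(&sl->shards[i].lock, NULL);
		sl->shards[i].first = NULL;
	}
}

/* Insert at the head of the shard selected by the caller-supplied index. */
static void sharded_add(struct sharded_list *sl, unsigned int cpu,
			struct item *it)
{
	struct shard *s = &sl->shards[cpu % NR_SHARDS];

	pthread_mutex_lock(&s->lock);
	it->next = s->first;
	s->first = it;
	pthread_mutex_unlock(&s->lock);
}

/*
 * Visit every item, taking only one shard lock at a time.
 * Returns the largest number of items visited under one lock hold.
 */
static size_t sharded_walk(struct sharded_list *sl,
			   void (*fn)(struct item *, void *), void *arg)
{
	size_t longest = 0;

	assert(fn != NULL);
	for (int i = 0; i < NR_SHARDS; i++) {
		struct shard *s = &sl->shards[i];
		size_t n = 0;

		pthread_mutex_lock(&s->lock);
		for (struct item *it = s->first; it; it = it->next) {
			fn(it, arg);
			n++;
		}
		pthread_mutex_unlock(&s->lock);
		if (n > longest)
			longest = n;
	}
	return longest;
}

/* Example callback: accumulate item values into an int. */
static void add_value(struct item *it, void *arg)
{
	*(int *)arg += it->value;
}
```

If the items are spread evenly, each lock hold covers roughly
1/NR_SHARDS of them. If every insertion happens to use the same index,
one shard holds everything and the walk is no better than with a single
global list.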

If lock hold time is a concern, I think in some cases we can set an
upper limit on how many inodes we want to process, release the lock,
reacquire it and continue. I am just worried that using RCU and 16b
cmpxchg will introduce too much complexity with no performance gain to show.
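
A lock break of that kind could look roughly like the userspace sketch
below. This is only an illustration under my own assumptions, not code
from the patch: it uses a pthread mutex instead of a spinlock, all the
names (struct locked_list, list_walk_batched(), sum_values()) are
hypothetical, and it keeps its position with a placeholder "cursor"
node linked into the list so that other threads can add or remove
entries while the lock is dropped. Removers are assumed to unlink
nodes under the same lock.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

struct node {
	struct node *prev, *next;
	int is_cursor;		/* 1 for a walker's placeholder node */
	int value;
};

struct locked_list {
	pthread_mutex_t lock;
	struct node head;	/* circular list sentinel */
};

static void list_init(struct locked_list *l)
{
	pthread_mutex_init(&l->lock, NULL);
	l->head.prev = l->head.next = &l->head;
	l->head.is_cursor = 0;
	l->head.value = 0;
}

static void node_insert_after(struct node *pos, struct node *n)
{
	n->prev = pos;
	n->next = pos->next;
	pos->next->prev = n;
	pos->next = n;
}

static void node_unlink(struct node *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
}

/* Append a node at the tail under the list lock. */
static void list_add(struct locked_list *l, struct node *n)
{
	pthread_mutex_lock(&l->lock);
	n->is_cursor = 0;
	node_insert_after(l->head.prev, n);
	pthread_mutex_unlock(&l->lock);
}

/*
 * Call fn on every element, dropping and retaking the lock after each
 * batch of 'batch' elements. Returns the number of lock breaks taken.
 */
static int list_walk_batched(struct locked_list *l, int batch,
			     void (*fn)(struct node *, void *), void *arg)
{
	struct node cursor = { .is_cursor = 1 };
	int done = 0, breaks = 0;

	assert(batch > 0);
	pthread_mutex_lock(&l->lock);
	node_insert_after(&l->head, &cursor);
	while (cursor.next != &l->head) {
		struct node *n = cursor.next;

		/* Step the cursor past n before handling n. */
		node_unlink(&cursor);
		node_insert_after(n, &cursor);
		if (n->is_cursor)
			continue;	/* another walker's placeholder */
		fn(n, arg);
		if (++done % batch == 0) {
			/* Lock break: let waiters in, then resume at cursor. */
			pthread_mutex_unlock(&l->lock);
			breaks++;
			pthread_mutex_lock(&l->lock);
		}
	}
	node_unlink(&cursor);
	pthread_mutex_unlock(&l->lock);
	return breaks;
}

/* Example callback: accumulate node values into an int. */
static void sum_values(struct node *n, void *arg)
{
	*(int *)arg += n->value;
}
```

The lock hold time is then bounded by the batch size rather than by the
list length, at the cost of the walk no longer seeing one consistent
snapshot of the list.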

Cheers,
Longman

Thread overview: 27+ messages
2016-02-17  1:31 [RFC PATCH 0/2] vfs: Use per-cpu list for SB's s_inodes list Waiman Long
2016-02-17  1:31 ` [RFC PATCH 1/2] lib/percpu-list: Per-cpu list with associated per-cpu locks Waiman Long
2016-02-17  9:53   ` Dave Chinner
2016-02-17 11:00     ` Peter Zijlstra
2016-02-17 11:05       ` Peter Zijlstra
2016-02-17 16:16         ` Waiman Long
2016-02-17 16:22           ` Peter Zijlstra
2016-02-17 16:27           ` Christoph Lameter
2016-02-17 17:12             ` Waiman Long
2016-02-17 17:18               ` Peter Zijlstra
2016-02-17 17:41                 ` Waiman Long
2016-02-17 18:22                   ` Peter Zijlstra
2016-02-17 18:45                     ` Waiman Long [this message]
2016-02-17 19:39                       ` Peter Zijlstra
2016-02-17 11:10       ` Dave Chinner
2016-02-17 11:26         ` Peter Zijlstra
2016-02-17 11:36           ` Peter Zijlstra
2016-02-17 15:56     ` Waiman Long
2016-02-17 16:02       ` Peter Zijlstra
2016-02-17 15:13   ` Christoph Lameter
2016-02-17  1:31 ` [RRC PATCH 2/2] vfs: Use per-cpu list for superblock's inode list Waiman Long
2016-02-17  7:16   ` Ingo Molnar
2016-02-17 15:40     ` Waiman Long
2016-02-17 10:37   ` Dave Chinner
2016-02-17 16:08     ` Waiman Long
2016-02-18 23:58 ` [RFC PATCH 0/2] vfs: Use per-cpu list for SB's s_inodes list Dave Chinner
2016-02-19 21:04   ` Long, Wai Man