From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mx2.suse.de ([195.135.220.15]:48464 "EHLO mx1.suse.de" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1727087AbeJQKp0 (ORCPT ); Wed, 17 Oct 2018 06:45:26 -0400 Date: Tue, 16 Oct 2018 19:51:47 -0700 From: Davidlohr Bueso To: Waiman Long Cc: Jan Kara , Alexander Viro , Jan Kara , Jeff Layton , "J. Bruce Fields" , Tejun Heo , Christoph Lameter , linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, Ingo Molnar , Peter Zijlstra , Andi Kleen , Dave Chinner , Boqun Feng , Davidlohr Bueso Subject: Re: [PATCH v9 5/5] lib/dlock-list: Scale dlock_lists_empty() Message-ID: <20181017025147.wfk7cktcn3emlb6b@linux-r8p5> References: <1536780532-4092-1-git-send-email-longman@redhat.com> <1536780532-4092-6-git-send-email-longman@redhat.com> <20181004071600.GC29482@quack2.suse.cz> <5bcdf2a2-6d03-df21-934d-6c989549253b@redhat.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii; format=flowed Content-Disposition: inline In-Reply-To: <5bcdf2a2-6d03-df21-934d-6c989549253b@redhat.com> Sender: linux-fsdevel-owner@vger.kernel.org List-ID: On Thu, 04 Oct 2018, Waiman Long wrote: >On 10/04/2018 03:16 AM, Jan Kara wrote: >> On Wed 12-09-18 15:28:52, Waiman Long wrote: >>> From: Davidlohr Bueso >>> >>> Instead of the current O(N) implementation, at the cost >>> of adding an atomic counter, we can convert the call to >>> an atomic_read(). The counter only serves for accounting >>> empty to non-empty transitions, and vice versa; therefore >>> only modified twice for each of the lists during the >>> lifetime of the dlock (while used). >>> >>> In addition, to be able to unaccount a list_del(), we >>> add a dlist pointer to each head, thus minimizing the >>> overall memory footprint. >>> >>> Signed-off-by: Davidlohr Bueso >>> Acked-by: Waiman Long >> So I was wondering: Is this really worth it? AFAICS we have a single call >> site for dlock_lists_empty() and that happens during umount where we don't >> really care about this optimization. So it seems like unnecessary >> complication to me at this point? If someone comes up with a usecase that >> needs fast dlock_lists_empty(), then sure, we can do this... >> > >Yes, that is true. We can skip this patch for the time being until a use >case comes up which requires dlock_lists_empty() to be used in the fast >path. So fyi I ended up porting the epoll ready-list to this api, where dlock_lists_empty() performance _does_ matter. However, the list iteration is common enough operation to put perform the benefits of the percpu add/delete operations. For example, when sending ready events to userspace (ep_send_events_proc()), each item must drop the iter lock, and also do a delete operation -- similarly for checking for ready events (ep_read_events_proc). This ends hurting more than benefiting workloads. Anyway, so yeah I have no need for this patch, and the added complexity + atomics is unjustified. Thanks, Davidlohr