From mboxrd@z Thu Jan  1 00:00:00 1970
From: "Huang\, Ying"
To: Minchan Kim
Cc: "Huang\, Ying", Tim Chen, Andrew Morton, Hugh Dickins, Shaohua Li,
	Rik van Riel, Andrea Arcangeli, "Kirill A . Shutemov",
	Vladimir Davydov, Johannes Weiner, Michal Hocko, Hillf Danton,
	"Christian Borntraeger", Jonathan Corbet
Subject: Re: [PATCH v4 0/9] mm/swap: Regular page swap optimizations
References: <20161227074503.GA10616@bbox>
	<87d1gc4y3w.fsf@yhuang-dev.intel.com>
	<20161228023739.GA12634@bbox>
Date: Wed, 28 Dec 2016 11:15:57 +0800
In-Reply-To: <20161228023739.GA12634@bbox> (Minchan Kim's message of "Wed, 28 Dec 2016 11:37:39 +0900")
Message-ID: <8760m43frm.fsf@yhuang-dev.intel.com>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/25.1 (gnu/linux)
MIME-Version: 1.0
Content-Type: text/plain; charset=ascii
Sender: linux-kernel-owner@vger.kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org

Minchan Kim writes:

> Hi Huang,
>
> On Wed, Dec 28, 2016 at 09:54:27AM +0800, Huang, Ying wrote:
>
> < snip >
>
>> > The patchset has used several techniques to reduce lock contention, for example,
>> > batching alloc/free, fine-grained lock and cluster distribution to avoid cache
>> > false-sharing. Each item has different complexity and benefits so could you
>> > show the number for each step of patchset? It would be better to include the
>> > number in each description. It helps how the patch is important when we consider
>> > complexity of the patch.
>>
>> One common problem of scalability optimization is that, after you have
>> optimized one lock, the end result may be not very good, because another
>> lock becomes heavily contended.  A similar problem occurs here: there are
>> mainly two locks during swap out/in, one protects swap cache, the other
>> protects swap device.  We can achieve good scalability only after having
>> optimized the two locks.
>
> Yes. You can describe that situation into the description. For example,
> "with this patch, we can watch less swap_lock contention with perf but
> overall performance is not good because swap cache lock is still
> contended heavily like below data so next patch will solve the problem".
>
> It will make patch's justification clear.
>
>>
>> You cannot say that one patch is not important just because the test
>> result for that single patch is not very good.  Because without that,
>> the end result of the whole series will be not very good.
>
> I know that but this patchset lacks too many numbers to justify
> each work. You can show just raw number itself of a technique
> although it is not huge benefit or even worse. You can explain the reason
> why it was not good, which would be enough motivation for next patch.
>
> Number itself wouldn't be important but justification is really crucial
> to review/merge patchset and number will help it a lot in especially
> MM community.
>
>>
>> >>
>> >> Patch 1 is a clean up patch.
>> >
>> > Could it be separated patch?
>> >
>> >> Patch 2 creates a lock per cluster, this gives us a more fine-grained lock
>> >> that can be used for accessing swap_map, and not lock the whole
>> >> swap device
>> >
>> > I hope you make three steps to review easier. You can create some functions like
>> > swap_map_lock and cluster_lock which are wrapper functions just hold swap_lock.
>> > It doesn't change anything performance pov but it clearly shows what kinds of lock
>> > we should use in specific context.
>> >
>> > Then, you can introduce more fine-grained lock in next patch and apply it into
>> > those wrapper functions.
>> >
>> > And last patch, you can adjust cluster distribution to avoid false-sharing.
>> > And the description should include how it's bad in testing so it's worth.
>> >
>> > Frankly speaking, although I'm huge user of bit_spin_lock (zram/zsmalloc
>> > have used it heavily), I don't like swap subsystem uses it.
>> > During zram development, it really hurts debugging due to losing lockdep.
>> > The reason zram have used it is by size concern of embedded world but server
>> > would be not critical so please consider trade-off of spinlock vs. bit_spin_lock.
>>
>> There will be one struct swap_cluster_info for every 1MB swap space.
>> So, for example, for 1TB swap space, the number of struct
>> swap_cluster_info will be one million.  To reduce the RAM usage, we
>> choose to use bit_spin_lock, otherwise, spinlock is better.  The code
>> will be used by embedded, PC and server, so the RAM usage is important.
>
> It seems you already increase swap_cluster_info 4 byte to support
> bit_spin_lock.

The increment only occurs on 64bit platform.  On 32bit platform, the
size is the same as before.

> Compared to that, how much memory does spin_lock increase?

The size of struct swap_cluster_info will increase from 4 bytes to 16
bytes on 64bit platform.  I guess it will increase from 4 bytes to 8
bytes on 32bit platform at least, but I did not test that.

Best Regards,
Huang, Ying
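
For reference, the RAM figures discussed in this thread can be worked
out with a few lines of arithmetic.  The sketch below is only a
back-of-the-envelope calculation, not kernel code.  The function and
constant names are made up for illustration.  The 4-byte and 16-byte
entry sizes and the 1MB cluster size are taken from the messages above.
The 8-byte figure is an assumption inferred from the remark that
bit_spin_lock support adds 4 bytes on 64bit.

```python
# Rough RAM cost of the swap_cluster_info array for a given swap size.
# Names here are illustrative only; they are not kernel identifiers.

MIB = 1 << 20
TIB = 1 << 40

# One struct swap_cluster_info per 1MB of swap space (figure from the thread).
CLUSTER_BYTES = 1 * MIB


def cluster_count(swap_bytes: int) -> int:
    """Number of swap_cluster_info entries needed to cover swap_bytes."""
    return -(-swap_bytes // CLUSTER_BYTES)  # ceiling division


def cluster_info_ram(swap_bytes: int, entry_bytes: int) -> int:
    """Total bytes used by the cluster info array for a given entry size."""
    return cluster_count(swap_bytes) * entry_bytes


if __name__ == "__main__":
    swap = 1 * TIB
    entry_sizes = (
        ("original entry, 4 bytes", 4),
        # Assumption: 4 bytes + 4 bytes for bit_spin_lock support on 64bit.
        ("bit_spin_lock on 64bit, 8 bytes (assumed)", 8),
        ("spinlock on 64bit, 16 bytes", 16),
    )
    print(f"clusters for 1TB swap: {cluster_count(swap)}")
    for label, size in entry_sizes:
        print(f"{label}: {cluster_info_ram(swap, size) // MIB} MiB")
```

For 1TB of swap this gives 1048576 clusters (the "one million" above),
so a 4-byte entry costs 4 MiB in total and a 16-byte entry costs 16 MiB.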