From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1423361AbcBQP4V (ORCPT ); Wed, 17 Feb 2016 10:56:21 -0500 Received: from mail-bl2on0121.outbound.protection.outlook.com ([65.55.169.121]:57323 "EHLO na01-bl2-obe.outbound.protection.outlook.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1423130AbcBQP4T (ORCPT ); Wed, 17 Feb 2016 10:56:19 -0500 Authentication-Results: fromorbit.com; dkim=none (message not signed) header.d=none;fromorbit.com; dmarc=none action=none header.from=hpe.com; Message-ID: <56C4981A.8040705@hpe.com> Date: Wed, 17 Feb 2016 10:56:10 -0500 From: Waiman Long User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:10.0.12) Gecko/20130109 Thunderbird/10.0.12 MIME-Version: 1.0 To: Dave Chinner CC: Alexander Viro , Jan Kara , Jeff Layton , "J. Bruce Fields" , Tejun Heo , Christoph Lameter , , , Ingo Molnar , Peter Zijlstra , Andi Kleen , Dave Chinner , Scott J Norton , Douglas Hatch Subject: Re: [RFC PATCH 1/2] lib/percpu-list: Per-cpu list with associated per-cpu locks References: <1455672680-7153-1-git-send-email-Waiman.Long@hpe.com> <1455672680-7153-2-git-send-email-Waiman.Long@hpe.com> <20160217095318.GO14668@dastard> In-Reply-To: <20160217095318.GO14668@dastard> Content-Type: text/plain; charset="ISO-8859-1"; format=flowed Content-Transfer-Encoding: 7bit X-Originating-IP: [71.168.64.11] X-ClientProxiedBy: BN3PR0401CA0032.namprd04.prod.outlook.com (25.162.159.170) To DF4PR84MB0138.NAMPRD84.PROD.OUTLOOK.COM (25.162.192.24) X-Microsoft-Exchange-Diagnostics: 1;DF4PR84MB0138;2:oGh6ouvTC4x2lZURvArRDknHdjsRxBd5zzPcR7F7gQbYY9tuJD3o4oKbqR3L+q/WjzQWb9K3H9w0TzzCl0kC7NVt9n6Gtwav5OO1mw8Pz9NYLN1f4g/4eTNGOwTlYBUf4mrsomZQwBm4qwd6gm3TcQ==;3:G9f3JYDZYYGYvUutooj9UPHDP3DXC6R6P0cZnG2lLbk+eVOb6BnKOLU7bJbQekgIMOEYMqe+M9eSiNo66GkHApAu24RAr5UkwXd90P5ivkoCT5XSlgRVkncEZKt4e5Te;25:ilIirNYSrVAFY08sgFUxZAr9+81HXx36In/URyDBCcYN9q1R1lGwWj77DYOLTbfq2Qq929XfQ86y09LEm0a0p/87F3rcRIw8p3BzftuBEJd76HYw/I0TmQksGljK1DNcbtvNVO0TTBLrJ1dDGZZVDoFH/ogwJFnHvZpi/DBCnptalFOUhxJqH+acp8L/mtZPPH7B1ilzfIyBHewJTsHY4eVO6gqgBQ8Wet/KxN3jhTFfTIGdFRlf5FOkvl19PjKIZy5qkbTxcX2XQ2IWXHp6AMs3Oh3HQkUwjic3alQ/MJyG6wzZ6MtqIv2Ej94bEERu;20:8ugsAa42jCIdqU8VzCQ7tklzckK1tUmTsjHvd6DpcXbCI/kU4kBreFojTJPvBxpa6VXe2z/lglEcltqcMot/uFX4qFOKWY5cQ+/0ltxN+GKIEcHchVeq8+lGSr0yg+c8EjJKPRChOn5DeTKym/kMUWwY94jgkIiVtoe0/EWbSVJV6TWNXxf+aZQPf1Frwz65MfZj8IzkTTcGWYUGPcG1OuA+KZOYWcvNv+oLvZpOiMdfk7A3W5s6mTtoqmSuRhHa X-Microsoft-Antispam: UriScan:;BCL:0;PCL:0;RULEID:;SRVR:DF4PR84MB0138; X-MS-Office365-Filtering-Correlation-Id: f0e7eea6-df28-4d45-79c7-08d337b2defb X-Microsoft-Antispam-PRVS: X-Exchange-Antispam-Report-Test: UriScan:; X-Exchange-Antispam-Report-CFA-Test: BCL:0;PCL:0;RULEID:(601004)(2401047)(8121501046)(5005006)(3002001)(10201501046);SRVR:DF4PR84MB0138;BCL:0;PCL:0;RULEID:;SRVR:DF4PR84MB0138; X-Microsoft-Exchange-Diagnostics: 1;DF4PR84MB0138;4:HNX2sfQBYHp+BdiwvEyluZ7v+eyF2KhnqL0PYaRdbjwC+UH/DvZ+o7oUNR8E20l77WfzCID0qPQgvGqXr2eVs4MzuK4+tOSp6WgYBB2XCaZGyY2d5iDXFxVuzR8YdYPImpzUl1UuBS+Q9pYadnz2AfmXDcm5uj4SHTve+g9DEAKH3YytI1QT4WLBN5dp6Ser+8i0vfoWsNrtuaGvplEI9jj+yiiNIV9X+YupLY4cqRaLRZMTkaUqKFzwS98cdYg82vnu9WghS0uHqi5cMCmJcxiuFTxrp6HEJBJ8sYBB5fhZpS8dwAdpmsc/nnesdjVhfCPybyJsOkQtvk3PHaM4WULQSVIofUokoBgX1AGSeW2Yna0vLG9D8+fCeRpwqdSt X-Forefront-PRVS: 085551F5A8 X-Forefront-Antispam-Report: SFV:NSPM;SFS:(10019020)(4630300001)(6009001)(6049001)(24454002)(479174004)(377454003)(5001960100002)(230700001)(92566002)(54356999)(50986999)(40100003)(4001350100001)(36756003)(6116002)(4326007)(586003)(76176999)(87266999)(65816999)(2906002)(5008740100001)(110136002)(5004730100002)(3846002)(189998001)(23756003)(65956001)(47776003)(66066001)(80316001)(33656002)(19580395003)(117156001)(77096005)(2950100001)(87976001)(1096002)(83506001)(65806001)(42186005)(86362001)(7059030)(217873001);DIR:OUT;SFP:1102;SCL:1;SRVR:DF4PR84MB0138;H:[192.168.142.188];FPR:;SPF:None;MLV:sfv;LANG:en; X-Microsoft-Exchange-Diagnostics: =?iso-8859-1?Q?1;DF4PR84MB0138;23:qytHs6xC4vS3LmBLcMeE/t6Uv17+nTBTNYBbKx7?= =?iso-8859-1?Q?ksRibqmFDiF4cNz9LKx4+jWCur7N0b4ln9NkK/rL8Uo3pptlFlawIroHXe?= =?iso-8859-1?Q?19VLnawrsW5dOBQ8K5xDVTbH+sZufMwLWooc8FLe4vTnj1O+Oe1rx/rVNW?= =?iso-8859-1?Q?Ih4AHOOlVyFMyipbLTd7X/J9uXlCp7ysYOY7nODb0EbC/+3g+KMkLap3SV?= =?iso-8859-1?Q?mYMuGVAHwIGxQDD6WKwmQ47p+5zXDV2D8x6xjKW2B3j0a4hhcOTiOXQA7A?= =?iso-8859-1?Q?3fkvVtaKZqMCkB+WfFMVYP8damb2A0AANTMuIHdSulGItgWOhdHbRqmtY+?= =?iso-8859-1?Q?77+fgWpC4vZ7GFGxCuN55mBSo98VNrt0MsCwpe3fNVG6sPzp9HAAKo/nCa?= =?iso-8859-1?Q?g9hn3ADVTPmHbvXJS5q4r8dG6m/ugKRPcIDAwnOShMTtaT5BpRxc5qm5UR?= =?iso-8859-1?Q?ELJ6vbavmeF/wGZCQM8HECzbo3neVQb9haS1JDltf2h/WzOmtx0+Bgh/3N?= =?iso-8859-1?Q?YlY2gnbE8T6OYubj0gZPBN8YYrpQRhH0Mrfh1in3k8MAS5diYRE+WWsAlg?= =?iso-8859-1?Q?YBiqgtNtznA8dhwvlDEoWsFHWvyUad+9HPTB9StWc3Rb/WBXkSdB07hztr?= =?iso-8859-1?Q?OBxylVGOHjx3M+WhU7kKFH//7HXi5VLvpRYb3+1CGa5AhTsmCbmcP9103j?= =?iso-8859-1?Q?PE3Wr+a9r7U1feFJnnaTtfckFngKlzGw06F9HbvZKmfZsW6ucIAXKkWXhx?= =?iso-8859-1?Q?ApVXgibZe3tWGlP4UOKse1lVY5HaB5v5y+zufZM79HQZ8TpuiA/JHowu0B?= =?iso-8859-1?Q?dDarFTqoU5RmdE4Z0UDyUVIN0Xg+NxDAu2nRtbMH0FfhGZoEqdrcdNsscG?= =?iso-8859-1?Q?C2rx4ecbh4rx24OkZaftjCesA4eobDDafVcRlGOxUICpPBas7Ca+QtTn8e?= =?iso-8859-1?Q?kWyIK1SjccbExKlhWSuq1X6ZFyIaZFIapSFH9CGNXBD6urFVLgIuJpzIIA?= =?iso-8859-1?Q?z1PjChgEE0iLb4pIfIdiBQX/OHpq+QoUaTw+yGwHbcVGPLO/pxI/jhoDwp?= =?iso-8859-1?Q?AIGzisF9r2GUDbEh3mUtGOcfCaGK3cc56RaTa1Q9FFVS/yUbi2uGZTGidK?= =?iso-8859-1?Q?thLvYuN+44Raz3ySJXjylE/3KQszZ1Blu4JCU2oc4hnSQ1eHo48GkXkFM6?= =?iso-8859-1?Q?tsBhqJCoh//nx+7s1M0fQkMJBlMOVJPXA=3D=3D?= X-Microsoft-Exchange-Diagnostics: 1;DF4PR84MB0138;5:gvXk8S3o3BowAKfbiF5eqzouA8dB4y8HLhw/W6YM+XhErtxX9TpEkGGhnhZRRLezTjaBlYIY2XQ/b2d0Uhii6UsQtGZE8lpRFFoCvzEIyCDFdKJ7++BmpsuTLyLgsFQvDXI7ZpnhLsAywYZM8Kp78A==;24:LBXwZTWACCU5NNzRKkLLTRZbOD8d9Ev6XiLZ5ycgZRu0axMQ8GYonkmHGeTpkixF4zWdZBXiNeximHM1AHSIuOzcTN7lT00YIEddhVUs3Qg= SpamDiagnosticOutput: 1:23 SpamDiagnosticMetadata: NSPM X-OriginatorOrg: hpe.com X-MS-Exchange-CrossTenant-OriginalArrivalTime: 17 Feb 2016 15:56:16.1912 (UTC) X-MS-Exchange-CrossTenant-FromEntityHeader: Hosted X-MS-Exchange-Transport-CrossTenantHeadersStamped: DF4PR84MB0138 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 02/17/2016 04:53 AM, Dave Chinner wrote: > On Tue, Feb 16, 2016 at 08:31:19PM -0500, Waiman Long wrote: >> Linked list is used everywhere in the Linux kernel. However, if many >> threads are trying to add or delete entries into the same linked list, >> it can create a performance bottleneck. >> >> This patch introduces a new per-cpu list subystem with associated >> per-cpu locks for protecting each of the lists individually. This >> allows list entries insertion and deletion operations to happen in >> parallel instead of being serialized with a global list and lock. >> >> List entry insertion is strictly per cpu. List deletion, however, can >> happen in a cpu other than the one that did the insertion. So we still >> need lock to protect the list. Because of that, there may still be >> a small amount of contention when deletion is being done. >> >> A new header file include/linux/percpu-list.h will be added with the >> associated percpu_list structure. The following functions are used >> to manage the per-cpu list: >> >> 1. int init_percpu_list_head(struct percpu_list **pclist_handle) >> 2. void percpu_list_add(struct percpu_list *new, >> struct percpu_list *head) >> 3. void percpu_list_del(struct percpu_list *entry) > A few comments on the code > >> + * A per-cpu list protected by a per-cpu spinlock. >> + * >> + * The list head percpu_list structure contains the spinlock, the other >> + * entries in the list contain the spinlock pointer. >> + */ >> +struct percpu_list { >> + struct list_head list; >> + union { >> + spinlock_t lock; /* For list head */ >> + spinlock_t *lockptr; /* For other entries */ >> + }; >> +}; > This union is bad for kernels running spinlock debugging - the size > of the spinlock can blow out, and that increases the size of any > object that has a percpu_list in it. I've only got basic spinlock > debugging turned on, and the spinlock_t is 24 bytes with that > config. If I turn on lockdep, it gets much larger again.... > > So it might be best to separate the list head and list entry > structures, similar to a hash list? Right. I will split it into 2 separate structure in the next iteration of the patch. >> +static inline void INIT_PERCPU_LIST_HEAD(struct percpu_list *pcpu_list) >> +{ >> + INIT_LIST_HEAD(&pcpu_list->list); >> + pcpu_list->lock = __SPIN_LOCK_UNLOCKED(&pcpu_list->lock); >> +} >> + >> +static inline void INIT_PERCPU_LIST_ENTRY(struct percpu_list *pcpu_list) >> +{ >> + INIT_LIST_HEAD(&pcpu_list->list); >> + pcpu_list->lockptr = NULL; >> +} > These function names don't need to shout. I was just following the convention used in list init functions. I can certainly change them to lowercase. > >> +/** >> + * for_all_percpu_list_entries - iterate over all the per-cpu list with locking >> + * @pos: the type * to use as a loop cursor for the current . >> + * @next: an internal type * variable pointing to the next entry >> + * @pchead: an internal struct list * of percpu list head >> + * @pclock: an internal variable for the current per-cpu spinlock >> + * @head: the head of the per-cpu list >> + * @member: the name of the per-cpu list within the struct >> + */ >> +#define for_all_percpu_list_entries(pos, next, pchead, pclock, head, member)\ >> + { \ >> + int cpu; \ >> + for_each_possible_cpu (cpu) { \ >> + typeof(*pos) *next; \ >> + spinlock_t *pclock = per_cpu_ptr(&(head)->lock, cpu); \ >> + struct list_head *pchead =&per_cpu_ptr(head, cpu)->list;\ >> + spin_lock(pclock); \ >> + list_for_each_entry_safe(pos, next, pchead, member.list) >> + >> +#define end_all_percpu_list_entries(pclock) spin_unlock(pclock); } } > This is a bit of a landmine - the code inside he iteration is under > a spinlock hidden in the macros. People are going to forget about > that, and it's needs documenting about how it needs to be treated > w.r.t. dropping and regaining the lock so sleeping operations can be > performed on objects on the list being iterated. > > Can we also think up some shorter names - names that are 30-40 > characters long are getting out out of hand given we're supposed > tobe sticking to 80 character widths and we lost 8 of them in the > first indent... > > Cheers, > > Dave. I will try to shorten the name and better document the macro. This is probably the most tricky part of the whole part. Cheers, Longman