From: Waiman Long
Date: Wed, 17 Feb 2016 10:40:27 -0500
To: Ingo Molnar
CC: Alexander Viro, Jan Kara, Jeff Layton, "J. Bruce Fields", Tejun Heo, Christoph Lameter, Ingo Molnar, Peter Zijlstra, Andi Kleen, Dave Chinner, Scott J Norton, Douglas Hatch, Linus Torvalds, Andrew Morton, Peter Zijlstra, Thomas Gleixner
Subject: Re: [RRC PATCH 2/2] vfs: Use per-cpu list for superblock's inode list
Message-ID: <56C4946B.10102@hpe.com>
References: <1455672680-7153-1-git-send-email-Waiman.Long@hpe.com> <1455672680-7153-3-git-send-email-Waiman.Long@hpe.com> <20160217071632.GA18403@gmail.com>
In-Reply-To: <20160217071632.GA18403@gmail.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On 02/17/2016 02:16 AM, Ingo Molnar wrote:
> * Waiman Long wrote:
>
>> When many threads are trying to add or delete inode to or from
>> a superblock's s_inodes list, spinlock contention on the list can
>> become a performance bottleneck.
>>
>> This patch changes the s_inodes field to become a per-cpu list with
>> per-cpu spinlocks.
>>
>> With an exit microbenchmark that creates a large number of threads,
>> attaches many inodes to them and then exits. The runtimes of that
>> microbenchmark with 1000 threads before and after the patch on a
>> 4-socket Intel E7-4820 v3 system (40 cores, 80 threads) were as
>> follows:
>>
>>    Kernel             Elapsed Time    System Time
>>    ------             ------------    -----------
>>    Vanilla 4.5-rc4       65.29s          82m14s
>>    Patched 4.5-rc4       22.81s          23m03s
>>
>> Before the patch, spinlock contention at the inode_sb_list_add()
>> function at the startup phase and the inode_sb_list_del() function at
>> the exit phase were about 79% and 93% of total CPU time respectively
>> (as measured by perf).
>> After the patch, the percpu_list_add()
>> function consumed only about 0.04% of CPU time at startup phase. The
>> percpu_list_del() function consumed about 0.4% of CPU time at exit
>> phase. There was still some spinlock contention, but it happened
>> elsewhere.
> Pretty impressive IMHO!
>
> Just for the record, here's your former 'batched list' number inserted into the
> above table:
>
>    Kernel                     Elapsed Time    System Time
>    ------                     ------------    -----------
>    Vanilla      [v4.5-rc4]       65.29s          82m14s
>    batched list [v4.4]           45.69s          49m44s
>    percpu list  [v4.5-rc4]       22.81s          23m03s
>
> i.e. the proper per CPU data structure and the resulting improvement in cache
> locality gave another doubling in performance.
>
> Just out of curiosity, could you post the profile of the latest patches - is there
> any (bigger) SMP overhead left, or is the profile pretty flat now?
>
> Thanks,
>
> 	Ingo

Yes, there was still spinlock contention elsewhere in the exit path.
The bulk of the CPU time is now spent in:

-  79.23%  79.23%  a.out  [kernel.kallsyms]  [k] native_queued_spin
   - native_queued_spin_lock_slowpath
      - 99.99% queued_spin_lock_slowpath
         - 100.00% _raw_spin_lock
            - 99.98% list_lru_del
               - d_lru_del
                  - 100.00% select_collect
                       detach_and_collect
                       d_walk
                       d_invalidate
                       proc_flush_task
                       release_task
                       do_exit
                       do_group_exit
                       get_signal
                       do_signal
                       exit_to_usermode_loop
                       syscall_return_slowpath
                       int_ret_from_sys_call

The contended locks are the nlru->lock instances. list_lru keeps one
such lock per node, so on the 4-node system that I used there are only
four of them.

Cheers,
Longman
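
P.S. For anyone following the thread without the patch at hand, below is
a rough user-space sketch of the idea under discussion: a list split into
several sub-lists, each with its own lock, so that adds and deletes on
different sub-lists do not contend. It is not the code from the patch.
All names (sharded_list, snode, NR_SHARDS) are made up for illustration,
pthread mutexes stand in for kernel spinlocks, and the caller-supplied
"hint" stands in for the current CPU number.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define NR_SHARDS 4	/* stand-in for the CPU count; hypothetical value */

struct snode {
	struct snode *prev, *next;
	unsigned int shard;	/* remembers which shard lock guards this node */
};

struct shard {
	pthread_mutex_t lock;	/* stand-in for a per-cpu spinlock */
	struct snode head;	/* sentinel of a circular doubly linked list */
};

struct sharded_list {
	struct shard shards[NR_SHARDS];
};

static void sharded_list_init(struct sharded_list *l)
{
	for (unsigned int i = 0; i < NR_SHARDS; i++) {
		struct shard *s = &l->shards[i];

		pthread_mutex_init(&s->lock, NULL);
		s->head.prev = s->head.next = &s->head;
		s->head.shard = i;
	}
}

/* Add under one shard's lock only; "hint" plays the role of the CPU id. */
static void sharded_list_add(struct sharded_list *l, struct snode *n,
			     unsigned int hint)
{
	struct shard *s = &l->shards[hint % NR_SHARDS];

	n->shard = hint % NR_SHARDS;
	pthread_mutex_lock(&s->lock);
	n->next = s->head.next;
	n->prev = &s->head;
	s->head.next->prev = n;
	s->head.next = n;
	pthread_mutex_unlock(&s->lock);
}

/* Delete may run on any thread, so use the shard recorded in the node. */
static void sharded_list_del(struct sharded_list *l, struct snode *n)
{
	struct shard *s;

	assert(n->shard < NR_SHARDS);
	s = &l->shards[n->shard];
	pthread_mutex_lock(&s->lock);
	n->prev->next = n->next;
	n->next->prev = n->prev;
	n->prev = n->next = n;
	pthread_mutex_unlock(&s->lock);
}

/* A full walk still has to visit, and lock, every shard in turn. */
static size_t sharded_list_count(struct sharded_list *l)
{
	size_t count = 0;

	for (unsigned int i = 0; i < NR_SHARDS; i++) {
		struct shard *s = &l->shards[i];

		pthread_mutex_lock(&s->lock);
		for (struct snode *n = s->head.next; n != &s->head; n = n->next)
			count++;
		pthread_mutex_unlock(&s->lock);
	}
	return count;
}
```

The trade-off shows up in sharded_list_count(): add and delete touch a
single lock, while anything that iterates the whole list must take every
lock in turn. The profile above has the same shape at a coarser grain,
since four per-node locks are shared by 80 hardware threads.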