Date: Mon, 20 Mar 2017 14:15:36 -0500
From: Alex Thorlton
To: Aaron Lu
Cc: Alex Thorlton, linux-mm@kvack.org, linux-kernel@vger.kernel.org, Dave Hansen, Tim Chen, Andrew Morton, Ying Huang
Subject: Re: [PATCH v2 0/5] mm: support parallel free of memory
Message-ID: <20170320191536.GG196487@stormcage.americas.sgi.com>
References: <1489568404-7817-1-git-send-email-aaron.lu@intel.com> <20170316193844.GA110825@stormcage.americas.sgi.com> <20170317022158.GB18964@aaronlu.sh.intel.com>
In-Reply-To: <20170317022158.GB18964@aaronlu.sh.intel.com>

On Fri, Mar 17, 2017 at 10:21:58AM +0800, Aaron Lu wrote:
> On Thu, Mar 16, 2017 at 02:38:44PM -0500, Alex Thorlton wrote:
> > On Wed, Mar 15, 2017 at 04:59:59PM +0800, Aaron Lu wrote:
> > > v2 changes: Nothing major, only minor ones.
> > >  - rebased on top of v4.11-rc2-mmotm-2017-03-14-15-41;
> > >  - use list_add_tail instead of list_add to add worker to tlb's worker
> > >    list so that when doing flush, the first queued worker gets flushed
> > >    first (based on the assumption that the first queued worker has a
> > >    better chance of finishing its job than those later queued workers);
> > >  - use bool instead of int for variable free_batch_page in function
> > >    tlb_flush_mmu_free_batches;
> > >  - style change according to ./scripts/checkpatch;
> > >  - reword some of the changelogs to make it more readable.
> > >
> > > v1 is here:
> > > https://lkml.org/lkml/2017/2/24/245
> >
> > I tested v1 on a Haswell system with 64 sockets/1024 cores/2048 threads
> > and 8TB of RAM, with a 1TB malloc.  The average free() time for a 1TB
> > malloc on a vanilla kernel was 41.69s, the patched kernel averaged
> > 21.56s for the same test.
>
> Thanks a lot for the test result.
>
> > I am testing v2 now and will report back with results in the next day or
> > so.
>
> Testing plain v2 shouldn't bring any surprise/difference

You're right!  Not much difference here.  v2 averaged a 23.17s free time
for a 1T allocation.

> better set the
> following param before the test (I'm planning to make them default in the
> next version):
> # echo 64 > /sys/devices/virtual/workqueue/batch_free_wq/max_active
> # echo 1030 > /sys/kernel/debug/parallel_free/max_gather_batch_count

10 test runs with these params set averaged 22.22s to free 1T.  So,
we're still seeing a nearly 50% decrease in free time vs. the unpatched
kernel.

- Alex
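
For readers who want to try a similar measurement, here is a minimal
sketch of the kind of test described in the message above.  It is not
the program Alex ran (the thread does not show it).  It assumes a
Linux/Unix host, uses an anonymous private mmap() in place of a large
malloc() (glibc normally serves very large malloc() requests with
mmap() and releases them with munmap()), and uses a deliberately small
64 MiB size rather than 1TB.  The names time_unmap and
percent_reduction are hypothetical helpers made up for this
illustration.

```python
import mmap
import time

PAGE_SIZE = mmap.PAGESIZE


def time_unmap(size_bytes: int) -> float:
    """Map anonymous memory, touch every page, and return the seconds
    spent unmapping it."""
    # Anonymous private mapping (Unix-only flags), standing in for a
    # large malloc().
    buf = mmap.mmap(-1, size_bytes,
                    flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS)
    # Write one byte per page so every page is actually faulted in;
    # untouched pages would leave the kernel nothing to free.
    for offset in range(0, size_bytes, PAGE_SIZE):
        buf[offset] = 1
    start = time.perf_counter()
    buf.close()  # releases the mapping; this is the part being timed
    return time.perf_counter() - start


def percent_reduction(baseline_s: float, patched_s: float) -> float:
    """Percentage decrease of patched_s relative to baseline_s."""
    return 100.0 * (baseline_s - patched_s) / baseline_s


if __name__ == "__main__":
    size = 64 * 1024 * 1024  # assumption: 64 MiB, far below the 1TB in the thread
    runs = [time_unmap(size) for _ in range(3)]
    print(f"average unmap time: {sum(runs) / len(runs):.6f}s")

    # Figures reported in the thread, against the 41.69s vanilla baseline.
    reported = (("v1", 21.56), ("v2", 23.17), ("v2 + params", 22.22))
    for label, seconds in reported:
        print(f"{label}: {percent_reduction(41.69, seconds):.1f}% less free time")
```

Applied to the numbers in the thread, percent_reduction gives roughly
48% for v1, 44% for plain v2, and 47% for v2 with the suggested
parameters, which matches the "nearly 50%" summary.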