From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org
X-Spam-Level:
X-Spam-Status: No, score=-3.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=no autolearn_force=no version=3.4.0
Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 7C28EC433DB for ; Thu, 7 Jan 2021 18:57:00 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 3D11B23435 for ; Thu, 7 Jan 2021 18:57:00 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729239AbhAGS4y (ORCPT ); Thu, 7 Jan 2021 13:56:54 -0500
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:41682 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726326AbhAGS4y (ORCPT ); Thu, 7 Jan 2021 13:56:54 -0500
Received: from ZenIV.linux.org.uk (zeniv.linux.org.uk [IPv6:2002:c35c:fd02::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id E4256C0612F5 for ; Thu, 7 Jan 2021 10:56:13 -0800 (PST)
Received: from viro by ZenIV.linux.org.uk with local (Exim 4.92.3 #3 (Red Hat Linux)) id 1kxaRz-0086Pu-TK; Thu, 07 Jan 2021 18:55:59 +0000
Date: Thu, 7 Jan 2021 18:55:59 +0000
From: Al Viro
To: Linus Torvalds
Cc: kernel test robot, Thomas Gleixner, Ingo Molnar, Borislav Petkov, Peter Zijlstra, LKML, lkp@lists.01.org, kernel test robot, "Huang, Ying", Feng Tang, zhengjun.xing@intel.com
Subject: Re: [x86] d55564cfc2: will-it-scale.per_thread_ops -5.8% regression
Message-ID: <20210107185559.GI3579531@ZenIV.linux.org.uk>
References: <20210107134723.GA28532@xsang-OptiPlex-9020> <20210107183358.GG3579531@ZenIV.linux.org.uk> <20210107184058.GH3579531@ZenIV.linux.org.uk>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To:
 <20210107184058.GH3579531@ZenIV.linux.org.uk>
Sender: Al Viro
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Jan 07, 2021 at 06:40:58PM +0000, Al Viro wrote:

> do_sys_poll(): do the wholesale copyout
>
> Don't bother with patching up just one field - 16 bits out of each 64.
> The amount of memory traffic is not going to be greater (might be
> smaller, actually) and the loop in copy_to_user() is optimized for
> bulk copy.

BTW, considering the access pattern, I would expect it to be considerably
cheaper in a lot of cases; basically, we have a copy of a userland array of
64-bit values, then we do a non-trivial amount of work and modify 16 bits
out of each 64.  Then we want that propagated back to the original array.
I suspect that copying just those 16-bit fields out is going to cost more
than a bulk copy of the entire thing, and not just on s390 and similar
oddball cases.

Comments?
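
For illustration only, here is a userland sketch of the two copy-out strategies being compared. It is not the kernel code: copyout_per_field() and copyout_bulk() are hypothetical names, plain stores and memcpy() stand in for the per-field user store and copy_to_user(), and the 8-byte layout of struct pollfd is assumed (as on Linux). It also assumes that the kernel-side copy leaves fd and events untouched, so both variants leave the user array in the same state.

```c
#include <assert.h>
#include <poll.h>
#include <stddef.h>
#include <string.h>

/* Assumed layout: { int fd; short events; short revents; }, i.e. 64 bits
 * per entry, of which only the 16-bit revents field changes. */
static_assert(sizeof(struct pollfd) == 8, "expected 64-bit pollfd entries");

/* Hypothetical stand-in for the per-field copy-out:
 * one 16-bit store into each 64-bit entry of the user array. */
void copyout_per_field(struct pollfd *user, const struct pollfd *kern, size_t n)
{
	for (size_t i = 0; i < n; i++)
		user[i].revents = kern[i].revents;
}

/* Hypothetical stand-in for the wholesale copy-out:
 * memcpy() plays the role of a single bulk copy_to_user(). */
void copyout_bulk(struct pollfd *user, const struct pollfd *kern, size_t n)
{
	memcpy(user, kern, n * sizeof(*kern));
}
```

The per-field variant touches every entry anyway, just with a strided narrow store, which is why the bulk copy is not expected to generate more memory traffic.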