Date: Tue, 14 Apr 2020 21:40:33 +0200
From: Michal Hocko
To: Prathu Baronia
Cc: alexander.duyck@gmail.com, chintan.pandya@oneplus.com,
	ying.huang@intel.com, akpm@linux-foundation.com, linux-mm@kvack.org,
	gregkh@linuxfoundation.com, gthelen@google.com, jack@suse.cz,
	ken.lin@oneplus.com, gasine.xu@oneplus.com
Subject: Re: [PATCH v2] mm: Optimized hugepage zeroing & copying from user
Message-ID: <20200414194033.GU4629@dhcp22.suse.cz>
References: <20200414153829.GA15230@oneplus.com>
	<20200414170312.GR4629@dhcp22.suse.cz>
	<20200414184743.GB2097@oneplus.com>
In-Reply-To: <20200414184743.GB2097@oneplus.com>

On Wed 15-04-20 00:17:44, Prathu Baronia wrote:
> The 04/14/2020 19:03, Michal Hocko wrote:
> > I still have hard time to see why kmap machinery should introduce any
> > slowdown here. Previous data posted while discussing v1 didn't really
> > show anything outside of the noise.
> >
> You are right, the multiple barriers are not responsible for the slowdown, but
> removal of kmap_atomic() allows us to call memset and memcpy for larger sizes.
> I will re-frame this part of the commit text when we proceed towards v3 to
> present it more cleanly.

While this might be OK for 2MB huge pages, does the same apply to other
larger sizes? E.g. 512MB or 1G or even larger huge pages? You should
consider !PREEMPT kernels.

[...]

> > No. There is absolutely zero reason to add a config option for this. The
> > kernel should have all the information to make an educated guess.
> >
> I will try to incorporate this in v3. But currently I don't have any idea on how
> to go about implementing the guessing logic.
> Would really appreciate if you can
> suggest some way to go about it.

If you cannot guess the proper sizing then how is a poor user who tries
to configure the kernel supposed to do it?

> > Also before going any further. The patch which has introduced the
> > optimization was c79b57e462b5 ("mm: hugetlb: clear target sub-page last
> > when clearing huge page"). It is based on an artificial benchmark which
> > to my knowledge doesn't represent any real workload. Your measurements
> > are based on a different benchmark. Your numbers clearly show that some
> > assumptions used for the optimization are not architecture neutral.
> >
> But oneshot numbers are significantly better on both the archs. I think
> theoretically the oneshot approach should provide better results on all the
> architectures when compared with serial approach. Isn't it a fair assumption to
> go ahead with the oneshot approach?

What is this assumption based on? Also please consider that all these
numbers are based on artificial microbenchmarks. Can you see any
difference on real world huge page users? The same applies to the
regression you can see with the existing code.
-- 
Michal Hocko
SUSE Labs