From: Alexander Duyck
Date: Wed, 6 May 2020 15:36:54 -0700
Subject: Re: [PATCH 6/7] mm: parallelize deferred_init_memmap()
To: Daniel Jordan
Cc: Josh Triplett, Andrew Morton, Herbert Xu, Steffen Klassert, Alex Williamson, Alexander Duyck, Dan Williams, Dave Hansen, David Hildenbrand, Jason Gunthorpe, Jonathan Corbet, Kirill Tkhai, Michal Hocko, Pavel Machek, Pavel Tatashin, Peter Zijlstra, Randy Dunlap, Shile Zhang, Tejun Heo, Zi Yan, linux-crypto@vger.kernel.org, linux-mm, LKML
In-Reply-To: <20200506222127.l3p2a2vjavwz2bdl@ca-dmjordan1.us.oracle.com>

On Wed, May 6, 2020 at 3:21 PM Daniel Jordan wrote:
>
> On Tue, May 05, 2020 at 07:55:43AM -0700, Alexander Duyck wrote:
> > One question about this data. What is the power management
> > configuration on the systems when you are running these tests? I'm
> > just curious if CPU frequency scaling, C states, and turbo are
> > enabled?
>
> Yes, intel_pstate is loaded in active mode without hwp and with turbo enabled
> (those power management docs are great by the way!) and intel_idle is in use
> too.
>
> > I ask because that is what I have seen usually make the
> > difference in these kind of workloads as the throughput starts
> > dropping off as you start seeing the core frequency lower and more
> > cores become active.
>
> If I follow, you're saying there's a chance performance would improve with the
> above disabled, but how often would a system be configured that way? Even if
> it were faster, the machine is configured how it's configured, or am I missing
> your point?

I think you might be missing my point. For performance testing, C states
and P states are sometimes disabled in order to get consistent results
between runs. It sounds like you have them enabled, though; I was just
wondering whether you had disabled them or not.

If they were disabled, you wouldn't get the benefit of turbo, so adding
more cores wouldn't come at a penalty. With turbo enabled, the first few
cores should start to slow down as they fall out of turbo mode once more
cores become active. That may be part of the reason why you are only
hitting about 10x at full core count.

As it stands, I think your code may speed up a bit if you split the work
up based on section instead of max order. That would get rid of any cache
bouncing you may be doing on the pageblock flags, and it would reduce the
overhead of splitting the work into individual pieces, since each piece
will be bigger.
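
To put rough numbers on the section versus max order suggestion, here is a
back-of-envelope sketch. It is not taken from the patch: the constants are
assumed x86_64 defaults (4 KiB pages, MAX_ORDER of 11, 128 MiB sections,
2 MiB pageblocks), the 64 GiB node is made up, and the helper names
chunk_count() and pageblocks_per_chunk() are hypothetical.

```python
# Back-of-envelope comparison of work-chunk granularity for deferred init.
# All constants are ASSUMED x86_64 defaults, not values read from the patch.
PAGE_SHIFT = 12          # 4 KiB base pages (assumed)
MAX_ORDER = 11           # assumed; largest buddy block is order MAX_ORDER - 1
SECTION_SIZE_BITS = 27   # 128 MiB sparsemem sections (assumed)
PAGEBLOCK_ORDER = 9      # 2 MiB pageblocks (assumed)

MAX_ORDER_NR_PAGES = 1 << (MAX_ORDER - 1)
PAGES_PER_SECTION = 1 << (SECTION_SIZE_BITS - PAGE_SHIFT)
PAGEBLOCK_NR_PAGES = 1 << PAGEBLOCK_ORDER


def chunk_count(span_pages, chunk_pages):
    """Hypothetical helper: chunks needed to cover span_pages."""
    return -(-span_pages // chunk_pages)  # ceiling division


def pageblocks_per_chunk(chunk_pages):
    """Hypothetical helper: pageblocks whose flags one chunk touches."""
    return chunk_pages // PAGEBLOCK_NR_PAGES


if __name__ == "__main__":
    # Pages in a made-up 64 GiB node, purely for illustration.
    span = (64 << 30) >> PAGE_SHIFT
    for name, size in (("max order", MAX_ORDER_NR_PAGES),
                       ("section", PAGES_PER_SECTION)):
        print(f"{name:9}: {chunk_count(span, size):6} chunks, "
              f"{pageblocks_per_chunk(size):3} pageblocks each")
```

Under those assumptions, a max order chunk covers only two pageblocks, so
neighbouring chunks handed to different threads would be updating pageblock
flags that sit very close together in memory. A section-sized chunk is 32
times larger, which also means far fewer pieces to hand out (512 instead of
16384 for the made-up 64 GiB node).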