From: Peter Xu
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: Jan Kara, John Hubbard, peterx@redhat.com, Linus Torvalds,
    Michal Hocko, Kirill Tkhai, Kirill Shutemov, Oleg Nesterov,
    Andrew Morton, Jann Horn, Andrea Arcangeli, Jason Gunthorpe,
    Matthew Wilcox, Hugh Dickins
Subject: [PATCH v2 0/3] mm/gup: Fix pin page write cache bouncing on has_pinned
Date: Fri, 7 May 2021 11:05:50 -0400
Message-Id: <20210507150553.208763-1-peterx@redhat.com>

v2:
- patch 1: rename s/threads/nthreads/; assert() on pthread create/destroy [John]
- patch 2:
  - rewrite commit message [John, Linus]
  - use parentheses [Linus]
- patch 3:
  - define mm_set_has_pinned_flag() helper and use it [John, Linus, Matthew]
  - keep has_pinned comment but move to MMF_HAS_PINNED [John]

This series contains 3 patches.  The 1st one enables threading for
gup_benchmark in the kselftest.  The latter two patches are collected from
Andrea's local branch; they fix a write cache bouncing issue with pinning
fast-gup.

To be explicit on the latter two patches:

  - the 2nd patch fixes the perf degradation that came with the
    introduction of has_pinned, then

  - the last patch tries to replace has_pinned with a bit in mm->flags

For patch 3: originally I think we had a plan to turn has_pinned into a
counter very soon; however that has not happened so far, so maybe it shows
that we can remove it until we really want such a counter for whatever
reason.  As the commit message states, it saves 4 bytes for each mm without
observable regressions.

Regarding testing: please refer to the commit message of patch 2 for some
detailed testing with will-it-scale.  Meanwhile I did patch 1 just so that
we can easily verify the patchset using the existing kselftest facilities,
or even regression-test it in the future with the repo if we want.

The numbers below are from extra verification tests that I did, besides the
commit message of patch 2, using the new gup_benchmark and 256 cpus.
The test below was done on a 40-cpu host with Intel(R) Xeon(R) CPU E5-2630
v4 @ 2.20GHz, and I get similar results (of course the write cache bouncing
gets more severe with even more cores).

After patch 1 is applied (test patch only, so still the old kernel):

  $ sudo chrt -f 1 ./gup_test -a -m 512 -j 40
  PIN_FAST_BENCHMARK: Time: get:459632 put:5990 us
  PIN_FAST_BENCHMARK: Time: get:461967 put:5840 us
  PIN_FAST_BENCHMARK: Time: get:464521 put:6140 us
  PIN_FAST_BENCHMARK: Time: get:465176 put:7100 us
  PIN_FAST_BENCHMARK: Time: get:465960 put:6733 us
  PIN_FAST_BENCHMARK: Time: get:465324 put:6781 us
  PIN_FAST_BENCHMARK: Time: get:466018 put:7130 us
  PIN_FAST_BENCHMARK: Time: get:466362 put:7118 us
  PIN_FAST_BENCHMARK: Time: get:465118 put:6975 us
  PIN_FAST_BENCHMARK: Time: get:466422 put:6602 us
  PIN_FAST_BENCHMARK: Time: get:465791 put:6818 us
  PIN_FAST_BENCHMARK: Time: get:467091 put:6298 us
  PIN_FAST_BENCHMARK: Time: get:467694 put:5432 us
  PIN_FAST_BENCHMARK: Time: get:469575 put:5581 us
  PIN_FAST_BENCHMARK: Time: get:468124 put:6055 us
  PIN_FAST_BENCHMARK: Time: get:468877 put:6720 us
  PIN_FAST_BENCHMARK: Time: get:467212 put:4961 us
  PIN_FAST_BENCHMARK: Time: get:467834 put:6697 us
  PIN_FAST_BENCHMARK: Time: get:470778 put:6398 us
  PIN_FAST_BENCHMARK: Time: get:469788 put:6310 us
  PIN_FAST_BENCHMARK: Time: get:488277 put:7113 us
  PIN_FAST_BENCHMARK: Time: get:486613 put:7085 us
  PIN_FAST_BENCHMARK: Time: get:486940 put:7202 us
  PIN_FAST_BENCHMARK: Time: get:488728 put:7101 us
  PIN_FAST_BENCHMARK: Time: get:487570 put:7327 us
  PIN_FAST_BENCHMARK: Time: get:489260 put:7027 us
  PIN_FAST_BENCHMARK: Time: get:488846 put:6866 us
  PIN_FAST_BENCHMARK: Time: get:488521 put:6745 us
  PIN_FAST_BENCHMARK: Time: get:489950 put:6459 us
  PIN_FAST_BENCHMARK: Time: get:489777 put:6617 us
  PIN_FAST_BENCHMARK: Time: get:488224 put:6591 us
  PIN_FAST_BENCHMARK: Time: get:488644 put:6477 us
  PIN_FAST_BENCHMARK: Time: get:488754 put:6711 us
  PIN_FAST_BENCHMARK: Time: get:488875 put:6743 us
  PIN_FAST_BENCHMARK: Time: get:489290 put:6657 us
  PIN_FAST_BENCHMARK: Time: get:490264 put:6684 us
  PIN_FAST_BENCHMARK: Time: get:489631 put:6737 us
  PIN_FAST_BENCHMARK: Time: get:488434 put:6655 us
  PIN_FAST_BENCHMARK: Time: get:492213 put:6297 us
  PIN_FAST_BENCHMARK: Time: get:491124 put:6173 us

After the whole series is applied (new fixed kernel):

  $ sudo chrt -f 1 ./gup_test -a -m 512 -j 40
  PIN_FAST_BENCHMARK: Time: get:82038 put:7041 us
  PIN_FAST_BENCHMARK: Time: get:82144 put:6817 us
  PIN_FAST_BENCHMARK: Time: get:83417 put:6674 us
  PIN_FAST_BENCHMARK: Time: get:82540 put:6594 us
  PIN_FAST_BENCHMARK: Time: get:83214 put:6681 us
  PIN_FAST_BENCHMARK: Time: get:83444 put:6889 us
  PIN_FAST_BENCHMARK: Time: get:83194 put:7499 us
  PIN_FAST_BENCHMARK: Time: get:84876 put:7369 us
  PIN_FAST_BENCHMARK: Time: get:86092 put:10289 us
  PIN_FAST_BENCHMARK: Time: get:86153 put:10415 us
  PIN_FAST_BENCHMARK: Time: get:85026 put:7751 us
  PIN_FAST_BENCHMARK: Time: get:85458 put:7944 us
  PIN_FAST_BENCHMARK: Time: get:85735 put:8154 us
  PIN_FAST_BENCHMARK: Time: get:85851 put:8299 us
  PIN_FAST_BENCHMARK: Time: get:86323 put:9617 us
  PIN_FAST_BENCHMARK: Time: get:86288 put:10496 us
  PIN_FAST_BENCHMARK: Time: get:87697 put:9346 us
  PIN_FAST_BENCHMARK: Time: get:87980 put:8382 us
  PIN_FAST_BENCHMARK: Time: get:88719 put:8400 us
  PIN_FAST_BENCHMARK: Time: get:87616 put:8588 us
  PIN_FAST_BENCHMARK: Time: get:86730 put:9563 us
  PIN_FAST_BENCHMARK: Time: get:88167 put:8673 us
  PIN_FAST_BENCHMARK: Time: get:86844 put:9777 us
  PIN_FAST_BENCHMARK: Time: get:88068 put:11774 us
  PIN_FAST_BENCHMARK: Time: get:86170 put:15676 us
  PIN_FAST_BENCHMARK: Time: get:87967 put:12827 us
  PIN_FAST_BENCHMARK: Time: get:95773 put:7652 us
  PIN_FAST_BENCHMARK: Time: get:87734 put:13650 us
  PIN_FAST_BENCHMARK: Time: get:89833 put:14237 us
  PIN_FAST_BENCHMARK: Time: get:96186 put:8029 us
  PIN_FAST_BENCHMARK: Time: get:95532 put:8886 us
  PIN_FAST_BENCHMARK: Time: get:95351 put:5826 us
  PIN_FAST_BENCHMARK: Time: get:96401 put:8407 us
  PIN_FAST_BENCHMARK: Time: get:96473 put:8287 us
  PIN_FAST_BENCHMARK: Time: get:97177 put:8430 us
  PIN_FAST_BENCHMARK: Time: get:98120 put:5263 us
  PIN_FAST_BENCHMARK: Time: get:96271 put:7757 us
  PIN_FAST_BENCHMARK: Time: get:99628 put:10467 us
  PIN_FAST_BENCHMARK: Time: get:99344 put:10045 us
  PIN_FAST_BENCHMARK: Time: get:94212 put:15485 us

Summary (average "get" time, in us):

  Old kernel: 477729.97 (+-3.79%)
  New kernel:  89144.65 (+-11.76%)

I'm not sure whether I should add a Fixes tag for patch 2.  If so, it'll be:

  Fixes: 008cfe4418b3d ("mm: Introduce mm_struct.has_pinned")

Then cc stable for 5.9+.  However I'll skip adding it if no one asks, as
this is a perf fix, and frequent+concurrent pinning should not really happen
that much.

Please review, thanks.

Andrea Arcangeli (2):
  mm: gup: allow FOLL_PIN to scale in SMP
  mm: gup: pack has_pinned in MMF_HAS_PINNED

Peter Xu (1):
  mm/gup_benchmark: Support threading

 fs/proc/task_mmu.c                    |  2 +-
 include/linux/mm.h                    |  2 +-
 include/linux/mm_types.h              | 10 ---
 include/linux/sched/coredump.h        |  8 +++
 kernel/fork.c                         |  1 -
 mm/gup.c                              | 15 ++++-
 tools/testing/selftests/vm/gup_test.c | 96 ++++++++++++++++++---------
 7 files changed, 88 insertions(+), 46 deletions(-)

-- 
2.31.1
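
For anyone who wants to re-derive the Summary figures from raw gup_test output, here is a minimal sketch.  It assumes the "+-" value is the largest deviation of any thread's "get" time from the mean, as a percentage of the mean; that reading is an inference from recomputing the numbers posted above, not something stated in this cover letter.  The helper names get_times() and summarize() are made up for illustration and are not part of the selftest.

```python
import re

# Matches gup_test output lines of the form shown above, e.g.:
#   PIN_FAST_BENCHMARK: Time: get:82038 put:7041 us
LINE_RE = re.compile(r"PIN_FAST_BENCHMARK: Time: get:(\d+) put:(\d+) us")


def get_times(output):
    """Extract the per-thread "get" times (in us) from gup_test output."""
    return [int(m.group(1)) for m in LINE_RE.finditer(output)]


def summarize(times):
    """Return (mean, spread), where spread is the largest deviation from
    the mean expressed as a percentage of the mean (assumed meaning of the
    "+-" figure in the Summary)."""
    avg = sum(times) / len(times)
    worst = max(abs(t - avg) for t in times)
    return avg, 100.0 * worst / avg


if __name__ == "__main__":
    # Three lines copied from the "new kernel" run, for illustration only.
    sample = (
        "PIN_FAST_BENCHMARK: Time: get:82038 put:7041 us\n"
        "PIN_FAST_BENCHMARK: Time: get:82144 put:6817 us\n"
        "PIN_FAST_BENCHMARK: Time: get:83417 put:6674 us\n"
    )
    avg, spread = summarize(get_times(sample))
    print(f"{avg:.2f} (+-{spread:.2f}%)")
```

Piping the full 40-line output of each run through get_times() and summarize() should give numbers comparable to the Summary, under the assumption stated above.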