Date: Sun, 26 Jul 2020 17:30:46 -0000
From: Heiko Sieger <1856335@bugs.launchpad.net>
To: qemu-devel@nongnu.org
Subject: [Bug 1856335] Re: Cache Layout wrong on many Zen Arch CPUs
References: <157625616239.22064.10423897892496347105.malonedeb@gac.canonical.com>
Message-Id: <159578464674.4721.18077055128270760457.malone@wampee.canonical.com>
Reply-To: Bug 1856335 <1856335@bugs.launchpad.net>

@sanjaybmd I'm glad to read that it worked for you. In fact, since I posted the XML I haven't had time to do any benchmarking, and now my motherboard is dead and I have to wait for a repair or replacement. Do you have any data to quantify the performance gain?

As to the number of cores: you will notice that my 3900X has only 12 physical cores, that is, 24 threads. Yet I assigned 32 vcpus in total, 8 of which are disabled. This aligns the vcpus with the actual CCX topology of 3 cores per CCX. QEMU thinks the number of cores per CCX should be a multiple of 2 (e.g. 2, 4, etc.). So I assign 4 cores = 8 vcpus per CCX and disable 2 of those vcpus to simulate the actual topology.

If your CPU has more cores, you could scale it up. Be aware that the 3950X should not have this issue, as it has 4 cores per CCX, if I remember correctly.

Note: I took this idea from a Reddit post (see link somewhere above).

-- 
You received this bug notification because you are a member of qemu-devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1856335

Title:
Cache Layout wrong on many Zen Arch CPUs

Status in QEMU:
New

Bug description:
AMD CPUs have L3 cache per 2, 3 or 4 cores. Currently, TOPOEXT seems to always map the cache as if it were a 4-core-per-CCX CPU, which is incorrect and costs upwards of 30% performance (more realistically 10%) in applications that are aware of the L3 cache layout.
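
To make the mismatch concrete, here is a minimal sketch that draws coreinfo-style sharing masks for a given number of cores per L3 cache. The function name `l3_masks` is made up for illustration and is not part of QEMU or coreinfo; it only models "consecutive cores share one L3", which is an assumption about how the masks are laid out.

```python
def l3_masks(total_cores: int, cores_per_l3: int) -> list[str]:
    """Return one mask per L3 cache; '*' marks the cores that share it.

    Hypothetical helper: assumes cores are grouped into L3 domains in
    consecutive blocks of `cores_per_l3`.
    """
    masks = []
    for start in range(0, total_cores, cores_per_l3):
        end = min(start + cores_per_l3, total_cores)
        masks.append(
            "-" * start + "*" * (end - start) + "-" * (total_cores - end)
        )
    return masks


# 6 guest cores on a host with 3 cores per CCX: the layout one would expect
print(l3_masks(6, 3))
# The same 6 cores grouped as if every CCX had 4 cores
print(l3_masks(6, 4))
```

The second call groups four cores under the first cache and leaves two for the second, which is the shape of the incorrect report described in this bug.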

Example on a 4-CCX CPU (1950X w/ 8 cores and no SMT):

    EPYC-IBPB
    AMD

In Windows, coreinfo reports correctly:

    ****---- Unified Cache 1, Level 3, 8 MB, Assoc 16, LineSize 64
    ----**** Unified Cache 6, Level 3, 8 MB, Assoc 16, LineSize 64

On a 3-CCX CPU (3960X w/ 6 cores and no SMT):

    EPYC-IBPB
    AMD

In Windows, coreinfo reports incorrectly:

    ****-- Unified Cache 1, Level 3, 8 MB, Assoc 16, LineSize 64
    ----** Unified Cache 6, Level 3, 8 MB, Assoc 16, LineSize 64

Validated against 3.0, 3.1, 4.1 and 4.2 versions of qemu-kvm.

With newer QEMU there is a fix (that does behave correctly): using the dies parameter.

The problem is that the dies are exposed differently from how AMD does it natively. They are exposed to Windows as sockets, which means that if you are not a business user, you can never have a machine with more than two CCX (6 cores), as consumer versions of Windows only support two sockets. (Should this be reported as a separate bug?)

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1856335/+subscriptions
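
The vcpu padding arithmetic from the reply at the top of this message (12 physical cores, 3 per CCX, padded to 4 per CCX) can be sketched as follows. The function `padded_topology` and its dictionary keys are hypothetical names for illustration only. Rounding up to the next even number is an assumption based on the "multiple of 2" remark in the reply, not a documented QEMU rule.

```python
def padded_topology(physical_cores: int, cores_per_ccx: int,
                    threads_per_core: int = 2) -> dict[str, int]:
    """Compute how many vcpus to define and how many to disable.

    Hypothetical helper: pads each CCX up to the next even core count
    and marks the extra cores' threads as disabled.
    """
    if physical_cores % cores_per_ccx != 0:
        raise ValueError("physical_cores must be a multiple of cores_per_ccx")
    ccx_count = physical_cores // cores_per_ccx
    # Assumption: round an odd core count up to the next even number
    padded_cores = cores_per_ccx + (cores_per_ccx % 2)
    total_vcpus = ccx_count * padded_cores * threads_per_core
    disabled = ccx_count * (padded_cores - cores_per_ccx) * threads_per_core
    return {
        "ccx_count": ccx_count,
        "padded_cores_per_ccx": padded_cores,
        "total_vcpus": total_vcpus,
        "disabled_vcpus": disabled,
        "enabled_vcpus": total_vcpus - disabled,
    }


# 3900X as described in the reply: 12 cores, 3 cores per CCX, SMT enabled
print(padded_topology(12, 3))
```

For the 3900X case this yields 32 defined vcpus with 8 disabled, leaving 24 enabled, which matches the numbers given in the reply.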