Message-ID: <52541497-c097-5a51-4718-feed13660255@intel.com>
Date: Wed, 4 May 2022 10:02:20 -0700
Subject: Re: RFC: Memory Tiering Kernel Interfaces
From: Dave Hansen
To: Wei Xu
Cc: Alistair Popple, Davidlohr Bueso, Andrew Morton, Dave Hansen,
 Huang Ying, Dan Williams, Yang Shi, Linux MM, Greg Thelen,
 "Aneesh Kumar K.V", Jagdish Gediya, Linux Kernel Mailing List,
 Michal Hocko, Baolin Wang, Brice Goglin, Feng Tang, Jonathan Cameron
References: <20220501175813.tvytoosygtqlh3nn@offworld>
 <87o80eh65f.fsf@nvdebian.thelocal>
 <87mtfygoxs.fsf@nvdebian.thelocal>
 <9fb22767-54de-d316-7e6b-5aac375c9c49@intel.com>

On 5/3/22 18:31, Wei Xu wrote:
>> Well, x86 CPUs have performance monitoring hardware that can
>> theoretically collect physical access information too. But, this
>> performance monitoring hardware wasn't designed for this specific use
>> case in mind. So, in practice, these events (PEBS) weren't very useful
>> for driving memory tiering.
> The PEBS events without any filtering might not be useful for memory
> tiering, but the PEBS events with hardware-based data source filtering
> can be useful in driving promotions in memory tiering. Certainly,
> because these events are not designed for this specific use case in
> mind, there are inefficiencies using them for memory tiering, e.g.
> instead of just getting a heat counter for each hot page, we can get
> events repeatedly on the hot pages.

Also, I believe the addresses that come out of the PEBS events are
virtual addresses (Data Linear Addresses according to the SDM). If the
events are written from a KVM guest, you get guest linear addresses.

That means a lot of page table and EPT walks to map those linear
addresses back to physical addresses. That adds to the inefficiency.

In the end, you get big PEBS buffers with lots of irrelevant data that
needs significant post-processing before it makes sense. The folks at
Intel who tried this really struggled to take this mess and turn it
into successful hot-page tracking. Maybe someone else will find a
better way to do it, but we tried and gave up.
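To make the post-processing cost a bit more concrete, here is a rough
userspace sketch (Python, not kernel code) of the work described above:
each sample carries a linear address, it has to be translated back to a
physical page, and only then can repeated events on the same page be
folded into a per-page heat count. Everything here is an assumption for
illustration: the names translate() and page_heat(), the (pid, address)
sample format, and the dict standing in for a page table are all made
up, and a guest would need a second translation through the EPT that
this sketch does not model.

```python
from collections import Counter

PAGE_SHIFT = 12                      # assumption: 4 KiB pages only
PAGE_MASK = (1 << PAGE_SHIFT) - 1


def translate(page_table, linear_addr):
    """Hypothetical stand-in for a page table walk.

    page_table maps virtual page number -> physical frame number.
    Returns the physical address, or None if the page is unmapped.
    """
    pfn = page_table.get(linear_addr >> PAGE_SHIFT)
    if pfn is None:
        return None
    return (pfn << PAGE_SHIFT) | (linear_addr & PAGE_MASK)


def page_heat(samples, page_tables):
    """Fold raw (pid, linear_addr) samples into per-frame access counts.

    page_tables maps pid -> that process's (hypothetical) page table.
    Samples that cannot be translated are dropped.
    """
    heat = Counter()
    for pid, linear_addr in samples:
        table = page_tables.get(pid)
        if table is None:
            continue                 # no address space known for this pid
        phys = translate(table, linear_addr)
        if phys is None:
            continue                 # unmapped by the time we looked
        heat[phys >> PAGE_SHIFT] += 1
    return heat
```

Note that every single sample pays for a translation before it can
contribute to a count, which is the overhead a hardware per-page heat
counter would avoid.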