From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from mga02.intel.com (mga02.intel.com [134.134.136.20])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by ml01.01.org (Postfix) with ESMTPS id DCE9C222F4E01
	for ; Fri, 22 Dec 2017 14:27:06 -0800 (PST)
Date: Fri, 22 Dec 2017 15:31:54 -0700
From: Ross Zwisler
Subject: Re: [PATCH v3 0/3] create sysfs representation of ACPI HMAT
Message-ID: <20171222223154.GC25711@linux.intel.com>
References: <20171214021019.13579-1-ross.zwisler@linux.intel.com>
	<2d6420f7-0a95-adfe-7390-a2aea4385ab2@linux.vnet.ibm.com>
MIME-Version: 1.0
Content-Disposition: inline
In-Reply-To: <2d6420f7-0a95-adfe-7390-a2aea4385ab2@linux.vnet.ibm.com>
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Errors-To: linux-nvdimm-bounces@lists.01.org
Sender: "Linux-nvdimm"
To: Anshuman Khandual
Cc: Ross Zwisler, linux-kernel@vger.kernel.org, "Anaczkowski, Lukasz",
	"Box, David E", "Kogut, Jaroslaw", "Koss, Marcin", "Koziej, Artur",
	"Lahtinen, Joonas", "Moore, Robert", "Nachimuthu, Murugasamy",
	"Odzioba, Lukasz", "Rafael J. Wysocki", "Rafael J. Wysocki",
	"Schmauss, Erik", "Verma, Vishal L", "Zheng, Lv", Andrew Morton,
	Balbir Singh, Brice Goglin, Dan Williams, Dave Hansen, Jerome Glisse,
	John Hubbard, Len Brown, Tim Chen, devel@acpica.org,
	linux-acpi@vger.kernel.org, linux-mm@kvack.org,
	linux-nvdimm@lists.01.org

On Fri, Dec 22, 2017 at 08:39:41AM +0530, Anshuman Khandual wrote:
> On 12/14/2017 07:40 AM, Ross Zwisler wrote:
<>
> > We solve this issue by providing userspace with performance information on
> > individual memory ranges.  This performance information is exposed via
> > sysfs:
> >
> >   # grep . mem_tgt2/* mem_tgt2/local_init/* 2>/dev/null
> >   mem_tgt2/firmware_id:1
> >   mem_tgt2/is_cached:0
> >   mem_tgt2/local_init/read_bw_MBps:40960
> >   mem_tgt2/local_init/read_lat_nsec:50
> >   mem_tgt2/local_init/write_bw_MBps:40960
> >   mem_tgt2/local_init/write_lat_nsec:50
<>
> We will enlist properties for all possible "source --> target" on the system?

Nope, just 'local' initiator/target pairs.  I talk about the reasoning for
this in the cover letter for patch 3:

https://lists.01.org/pipermail/linux-nvdimm/2017-December/013574.html

> Right now it shows only bandwidth and latency properties, can it accommodate
> other properties as well in future ?

We also have an 'is_cached' attribute for the memory targets if they are
involved in a caching hierarchy, but right now those are all the things we
expose.  We can potentially expose whatever we want that is present in the
HMAT, but those seemed like a good start.

I noticed that in your presentation you had some other examples of attributes
you cared about:

 * reliability
 * power consumption
 * density

The HMAT doesn't provide this sort of information at present, but we
could/would add them to sysfs if the HMAT ever grew support for them.

> > This allows applications to easily find the memory that they want to use.
> > We expect that the existing NUMA APIs will be enhanced to use this new
> > information so that applications can continue to use them to select their
> > desired memory.
>
> I had presented a proposal for NUMA redesign in the Plumbers Conference this
> year where various memory devices with different kind of memory attributes
> can be represented in the kernel and be used explicitly from the user space.
> Here is the link to the proposal if you feel interested. The proposal is
> very intrusive and also I dont have a RFC for it yet for discussion here.
>
> https://linuxplumbersconf.org/2017/ocw//system/presentations/4656/original/Hierarchical_NUMA_Design_Plumbers_2017.pdf
>
> Problem is, designing the sysfs interface for memory attribute detection
> from user space without first thinking about redesigning the NUMA for
> heterogeneous memory may not be a good idea. Will look into this further.

I took another look at your presentation, and overall I think that if/when a
NUMA redesign like this takes place ACPI systems with HMAT tables will be able
to participate.
But I think we are probably a ways away from that, and like I said in my
previous mail ACPI systems with memory-only NUMA nodes are going to exist and
need to be supported with the current NUMA scheme.  Hence I don't think that
this patch series conflicts with your proposal.
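
As a rough illustration of how userspace might consume the attributes shown in
the `grep .` output quoted in this thread, here is a minimal Python sketch that
parses "path:value" lines into a per-target dictionary. The helper name
`parse_mem_tgt_attrs` and the `SAMPLE` string are assumptions made for this
example; only the attribute names come from the quoted output, and the layout
of any interface that is eventually merged may differ.

```python
def parse_mem_tgt_attrs(grep_output):
    """Parse `grep .` style "path:value" lines into {target: {attr: int}}.

    Hypothetical helper: the paths follow the example quoted in this thread,
    e.g. "mem_tgt2/local_init/read_bw_MBps:40960".
    """
    targets = {}
    for line in grep_output.splitlines():
        line = line.strip()
        if not line or ":" not in line:
            continue  # skip blank or malformed lines
        # The value follows the last colon; everything before it is the path.
        path, _, value = line.rpartition(":")
        # The first path component names the memory target.
        target, _, attr = path.partition("/")
        targets.setdefault(target, {})[attr] = int(value)
    return targets


# Sample text copied from the output quoted in the thread.
SAMPLE = """\
mem_tgt2/firmware_id:1
mem_tgt2/is_cached:0
mem_tgt2/local_init/read_bw_MBps:40960
mem_tgt2/local_init/read_lat_nsec:50
mem_tgt2/local_init/write_bw_MBps:40960
mem_tgt2/local_init/write_lat_nsec:50
"""

attrs = parse_mem_tgt_attrs(SAMPLE)
tgt = attrs["mem_tgt2"]
print("read bandwidth (MB/s):", tgt["local_init/read_bw_MBps"])
print("read latency (ns):", tgt["local_init/read_lat_nsec"])
```

Keeping "local_init/..." as part of the attribute key mirrors the fact that
only 'local' initiator/target pairs are exposed, so one flat dictionary per
target is enough for this sketch.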