linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Liubo(OS Lab)" <liubo95@huawei.com>
To: Ross Zwisler <ross.zwisler@linux.intel.com>,
	Anshuman Khandual <khandual@linux.vnet.ibm.com>
Cc: <linux-kernel@vger.kernel.org>,
	"Anaczkowski, Lukasz" <lukasz.anaczkowski@intel.com>,
	"Box, David E" <david.e.box@intel.com>,
	"Kogut, Jaroslaw" <Jaroslaw.Kogut@intel.com>,
	"Koss, Marcin" <marcin.koss@intel.com>,
	"Koziej, Artur" <artur.koziej@intel.com>,
	"Lahtinen, Joonas" <joonas.lahtinen@intel.com>,
	"Moore, Robert" <robert.moore@intel.com>,
	"Nachimuthu, Murugasamy" <murugasamy.nachimuthu@intel.com>,
	"Odzioba, Lukasz" <lukasz.odzioba@intel.com>,
	"Rafael J. Wysocki" <rafael.j.wysocki@intel.com>,
	"Rafael J. Wysocki" <rjw@rjwysocki.net>,
	"Schmauss, Erik" <erik.schmauss@intel.com>,
	"Verma, Vishal L" <vishal.l.verma@intel.com>,
	"Zheng, Lv" <lv.zheng@intel.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Balbir Singh <bsingharora@gmail.com>,
	Brice Goglin <brice.goglin@gmail.com>,
	Dan Williams <dan.j.williams@intel.com>,
	Dave Hansen <dave.hansen@intel.com>,
	Jerome Glisse <jglisse@redhat.com>,
	John Hubbard <jhubbard@nvidia.com>, "Len Brown" <lenb@kernel.org>,
	Tim Chen <tim.c.chen@linux.intel.com>, <devel@acpica.org>,
	<linux-acpi@vger.kernel.org>, <linux-mm@kvack.org>,
	<linux-nvdimm@lists.01.org>
Subject: Re: [PATCH v3 0/3] create sysfs representation of ACPI HMAT
Date: Mon, 25 Dec 2017 10:05:01 +0800	[thread overview]
Message-ID: <794c531c-eb96-b6ef-97a1-64c81779dc72@huawei.com> (raw)
In-Reply-To: <20171222223154.GC25711@linux.intel.com>

On 2017/12/23 6:31, Ross Zwisler wrote:
> On Fri, Dec 22, 2017 at 08:39:41AM +0530, Anshuman Khandual wrote:
>> On 12/14/2017 07:40 AM, Ross Zwisler wrote:
> <>
>>> We solve this issue by providing userspace with performance information on
>>> individual memory ranges.  This performance information is exposed via
>>> sysfs:
>>>
>>>   # grep . mem_tgt2/* mem_tgt2/local_init/* 2>/dev/null
>>>   mem_tgt2/firmware_id:1
>>>   mem_tgt2/is_cached:0
>>>   mem_tgt2/local_init/read_bw_MBps:40960
>>>   mem_tgt2/local_init/read_lat_nsec:50
>>>   mem_tgt2/local_init/write_bw_MBps:40960
>>>   mem_tgt2/local_init/write_lat_nsec:50
> <>
>> We will enlist properties for all possible "source --> target" on the system?
> 
> Nope, just 'local' initiator/target pairs.  I talk about the reasoning for
> this in the cover letter for patch 3:
> 
> https://lists.01.org/pipermail/linux-nvdimm/2017-December/013574.html
> 
>> Right now it shows only bandwidth and latency properties, can it accommodate
>> other properties as well in future ?
> 
> We also have an 'is_cached' attribute for the memory targets if they are
> involved in a caching hierarchy, but right now those are all the things we
> expose.  We can potentially expose whatever we want that is present in the
> HMAT, but those seemed like a good start.
> 
> I noticed that in your presentation you had some other examples of attributes
> you cared about:
> 
>  * reliability
>  * power consumption
>  * density
> 
> The HMAT doesn't provide this sort of information at present, but we
> could/would add them to sysfs if the HMAT ever grew support for them.
> 
>>> This allows applications to easily find the memory that they want to use.
>>> We expect that the existing NUMA APIs will be enhanced to use this new
>>> information so that applications can continue to use them to select their
>>> desired memory.
>>
>> I had presented a proposal for NUMA redesign in the Plumbers Conference this
>> year where various memory devices with different kind of memory attributes
>> can be represented in the kernel and be used explicitly from the user space.
>> Here is the link to the proposal if you feel interested. The proposal is
>> very intrusive and also I dont have a RFC for it yet for discussion here.
>>
>> https://linuxplumbersconf.org/2017/ocw//system/presentations/4656/original/Hierarchical_NUMA_Design_Plumbers_2017.pdf
>>
>> Problem is, designing the sysfs interface for memory attribute detection
>> from user space without first thinking about redesigning the NUMA for
>> heterogeneous memory may not be a good idea. Will look into this further.
> 
> I took another look at your presentation, and overall I think that if/when a
> NUMA redesign like this takes place ACPI systems with HMAT tables will be able
> to participate.  But I think we are probably a ways away from that, and like I

I'm afraid not, there are cache-coherent bus like CCIX/OpenCAPI come out soon.
No matter to say System-on-Chip already with internal bus linked DDR、HBM、CPU、Accelerator..

> said in my previous mail ACPI systems with memory-only NUMA nodes are going to
> exist and need to be supported with the current NUMA scheme.  Hence I don't

And not only memory-only, but the accelerators can also be a master like CPU.

> think that this patch series conflicts with your proposal.

Didn't see conflict neither, but perhaps we should think for a longer-term solution and cover more
situations/platforms.
Anshuman's proposal is really a good start point to us.

Cheers,
Bob Liu

      reply	other threads:[~2017-12-25  2:05 UTC|newest]

Thread overview: 42+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-12-14  2:10 [PATCH v3 0/3] create sysfs representation of ACPI HMAT Ross Zwisler
2017-12-14  2:10 ` [PATCH v3 1/3] acpi: HMAT support in acpi_parse_entries_array() Ross Zwisler
2017-12-15  0:49   ` Rafael J. Wysocki
2017-12-15  1:10   ` Dan Williams
2017-12-16  1:53     ` Rafael J. Wysocki
2017-12-16  1:57       ` Dan Williams
2017-12-16  2:15         ` Rafael J. Wysocki
2017-12-14  2:10 ` [PATCH v3 2/3] hmat: add heterogeneous memory sysfs support Ross Zwisler
2017-12-15  0:52   ` Rafael J. Wysocki
2017-12-15 20:53     ` Ross Zwisler
2017-12-14  2:10 ` [PATCH v3 3/3] hmat: add performance attributes Ross Zwisler
2017-12-14 13:00 ` [PATCH v3 0/3] create sysfs representation of ACPI HMAT Michal Hocko
2017-12-18 20:35   ` Ross Zwisler
2017-12-20 16:41     ` Ross Zwisler
2017-12-21 13:18       ` Michal Hocko
2017-12-20 18:19     ` Matthew Wilcox
2017-12-20 20:22       ` Dave Hansen
2017-12-20 21:16         ` Matthew Wilcox
2017-12-20 21:24           ` Ross Zwisler
2017-12-20 22:29             ` Dan Williams
2017-12-20 22:41               ` Ross Zwisler
2017-12-21 20:31                 ` Brice Goglin
2017-12-22 22:53                   ` Dan Williams
2017-12-22 23:22                     ` Ross Zwisler
2017-12-22 23:57                       ` Dan Williams
2017-12-23  1:14                         ` Rafael J. Wysocki
2017-12-27  9:10                     ` Brice Goglin
2017-12-30  6:58                       ` Matthew Wilcox
2017-12-30  9:19                         ` Brice Goglin
2017-12-20 21:13       ` Ross Zwisler
2017-12-21  1:41         ` Elliott, Robert (Persistent Memory)
2017-12-22 21:46           ` Ross Zwisler
2017-12-21 12:50       ` Michael Ellerman
2017-12-22  3:09 ` Anshuman Khandual
2017-12-22 10:31   ` Kogut, Jaroslaw
2017-12-22 14:37     ` Anshuman Khandual
2017-12-22 17:13   ` Dave Hansen
2017-12-23  5:14     ` Anshuman Khandual
2017-12-22 22:13   ` Ross Zwisler
2017-12-23  6:56     ` Anshuman Khandual
2017-12-22 22:31   ` Ross Zwisler
2017-12-25  2:05     ` Liubo(OS Lab) [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=794c531c-eb96-b6ef-97a1-64c81779dc72@huawei.com \
    --to=liubo95@huawei.com \
    --cc=Jaroslaw.Kogut@intel.com \
    --cc=akpm@linux-foundation.org \
    --cc=artur.koziej@intel.com \
    --cc=brice.goglin@gmail.com \
    --cc=bsingharora@gmail.com \
    --cc=dan.j.williams@intel.com \
    --cc=dave.hansen@intel.com \
    --cc=david.e.box@intel.com \
    --cc=devel@acpica.org \
    --cc=erik.schmauss@intel.com \
    --cc=jglisse@redhat.com \
    --cc=jhubbard@nvidia.com \
    --cc=joonas.lahtinen@intel.com \
    --cc=khandual@linux.vnet.ibm.com \
    --cc=lenb@kernel.org \
    --cc=linux-acpi@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linux-nvdimm@lists.01.org \
    --cc=lukasz.anaczkowski@intel.com \
    --cc=lukasz.odzioba@intel.com \
    --cc=lv.zheng@intel.com \
    --cc=marcin.koss@intel.com \
    --cc=murugasamy.nachimuthu@intel.com \
    --cc=rafael.j.wysocki@intel.com \
    --cc=rjw@rjwysocki.net \
    --cc=robert.moore@intel.com \
    --cc=ross.zwisler@linux.intel.com \
    --cc=tim.c.chen@linux.intel.com \
    --cc=vishal.l.verma@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).