qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Dan Williams <dan.j.williams@intel.com>
To: Tao Xu <tao3.xu@intel.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>,
	Jingqi Liu <jingqi.liu@intel.com>, "Du, Fan" <fan.du@intel.com>,
	Qemu Developers <qemu-devel@nongnu.org>,
	daniel@linux.ibm.com,
	Jonathan Cameron <jonathan.cameron@huawei.com>,
	Igor Mammedov <imammedo@redhat.com>
Subject: Re: [Qemu-devel] [PATCH v9 05/11] numa: Extend CLI to provide initiator information for numa nodes
Date: Wed, 14 Aug 2019 14:29:31 -0700	[thread overview]
Message-ID: <CAPcyv4jCuy6zvM8NcacXhvpUBUyp_BYMcEtBuA_ck3AhkyGNsQ@mail.gmail.com> (raw)
In-Reply-To: <a73f7c81-0363-c10f-8ae1-9344abc55449@intel.com>

On Tue, Aug 13, 2019 at 10:14 PM Tao Xu <tao3.xu@intel.com> wrote:
>
> On 8/14/2019 10:39 AM, Dan Williams wrote:
> > On Tue, Aug 13, 2019 at 8:00 AM Igor Mammedov <imammedo@redhat.com> wrote:
> >>
> >> On Fri,  9 Aug 2019 14:57:25 +0800
> >> Tao <tao3.xu@intel.com> wrote:
> >>
> >>> From: Tao Xu <tao3.xu@intel.com>
> >>>
> >>> In ACPI 6.3 chapter 5.2.27 Heterogeneous Memory Attribute Table (HMAT),
> >>> The initiator represents processor which access to memory. And in 5.2.27.3
> >>> Memory Proximity Domain Attributes Structure, the attached initiator is
> >>> defined as where the memory controller responsible for a memory proximity
> >>> domain. With attached initiator information, the topology of heterogeneous
> >>> memory can be described.
> >>>
> >>> Extend CLI of "-numa node" option to indicate the initiator numa node-id.
> >>> In the linux kernel, the codes in drivers/acpi/hmat/hmat.c parse and report
> >>> the platform's HMAT tables.
> >>>
> >>> Reviewed-by: Jingqi Liu <Jingqi.liu@intel.com>
> >>> Suggested-by: Dan Williams <dan.j.williams@intel.com>
> >>> Signed-off-by: Tao Xu <tao3.xu@intel.com>
> >>> ---
> >>>
> >>> No changes in v9
> >>> ---
> >>>   hw/core/machine.c     | 24 ++++++++++++++++++++++++
> >>>   hw/core/numa.c        | 13 +++++++++++++
> >>>   include/sysemu/numa.h |  3 +++
> >>>   qapi/machine.json     |  6 +++++-
> >>>   qemu-options.hx       | 27 +++++++++++++++++++++++----
> >>>   5 files changed, 68 insertions(+), 5 deletions(-)
> >>>
> >>> diff --git a/hw/core/machine.c b/hw/core/machine.c
> >>> index 3c55470103..113184a9df 100644
> >>> --- a/hw/core/machine.c
> >>> +++ b/hw/core/machine.c
> >>> @@ -640,6 +640,7 @@ void machine_set_cpu_numa_node(MachineState *machine,
> >>>                                  const CpuInstanceProperties *props, Error **errp)
> >>>   {
> >>>       MachineClass *mc = MACHINE_GET_CLASS(machine);
> >>> +    NodeInfo *numa_info = machine->numa_state->nodes;
> >>>       bool match = false;
> >>>       int i;
> >>>
> >>> @@ -709,6 +710,16 @@ void machine_set_cpu_numa_node(MachineState *machine,
> >>>           match = true;
> >>>           slot->props.node_id = props->node_id;
> >>>           slot->props.has_node_id = props->has_node_id;
> >>> +
> >>> +        if (numa_info[props->node_id].initiator_valid &&
> >>> +            (props->node_id != numa_info[props->node_id].initiator)) {
> >>> +            error_setg(errp, "The initiator of CPU NUMA node %" PRId64
> >>> +                       " should be itself.", props->node_id);
> >>> +            return;
> >>> +        }
> >>> +        numa_info[props->node_id].initiator_valid = true;
> >>> +        numa_info[props->node_id].has_cpu = true;
> >>> +        numa_info[props->node_id].initiator = props->node_id;
> >>>       }
> >>>
> >>>       if (!match) {
> >>> @@ -1050,6 +1061,7 @@ static void machine_numa_finish_cpu_init(MachineState *machine)
> >>>       GString *s = g_string_new(NULL);
> >>>       MachineClass *mc = MACHINE_GET_CLASS(machine);
> >>>       const CPUArchIdList *possible_cpus = mc->possible_cpu_arch_ids(machine);
> >>> +    NodeInfo *numa_info = machine->numa_state->nodes;
> >>>
> >>>       assert(machine->numa_state->num_nodes);
> >>>       for (i = 0; i < possible_cpus->len; i++) {
> >>> @@ -1083,6 +1095,18 @@ static void machine_numa_finish_cpu_init(MachineState *machine)
> >>>               machine_set_cpu_numa_node(machine, &props, &error_fatal);
> >>>           }
> >>>       }
> >>> +
> >>> +    for (i = 0; i < machine->numa_state->num_nodes; i++) {
> >>> +        if (numa_info[i].initiator_valid &&
> >>> +            !numa_info[numa_info[i].initiator].has_cpu) {
> >>                            ^^^^^^^^^^^^^^^^^^^^^^ possible out of bounds read, see bellow
> >>
> >>> +            error_report("The initiator-id %"PRIu16 " of NUMA node %d"
> >>> +                         " does not exist.", numa_info[i].initiator, i);
> >>> +            error_printf("\n");
> >>> +
> >>> +            exit(1);
> >>> +        }
> >> it takes care only about nodes that have cpus or memory-only ones that have
> >> initiator explicitly provided on CLI. And leaves possibility to have
> >> memory-only nodes without initiator mixed with nodes that have initiator.
> >> Is it valid to have mixed configuration?
> >> Should we forbid it?
> >
> > The spec talks about the "Proximity Domain for the Attached Initiator"
> > field only being valid if the memory controller for the memory can be
> > identified by an initiator id in the SRAT. So I expect the only way to
> > define a memory proximity domain without this local initiator is to
> > allow specifying a node-id that does not have an entry in the SRAT.
> >
> Hi Dan,
>
> So there may be a situation for the Attached Initiator field is not
> valid? If true, I would allow user to input Initiator invalid.

Yes it's something the OS needs to consider because the platform may
not be able to meet the constraint that a single initiator is
associated with the memory controller for a given memory target. In
retrospect it would have been nice if the spec reserved 0xffffffff for
this purpose, but it seems "not in SRAT" is the only way to identify
memory that is not attached to any single initiator.

> > That would be a useful feature for testing OS HMAT parsing behavior,
> > and may match platforms that exist in practice.


  reply	other threads:[~2019-08-14 21:31 UTC|newest]

Thread overview: 36+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-08-09  6:57 [Qemu-devel] [PATCH v9 00/11] Build ACPI Heterogeneous Memory Attribute Table (HMAT) Tao
2019-08-09  6:57 ` [Qemu-devel] [PATCH v9 01/11] hw/arm: simplify arm_load_dtb Tao
2019-08-13 21:55   ` Alistair Francis
2019-08-14  1:19     ` Andrew Jeffery
2019-08-13 21:55   ` Eduardo Habkost
2019-08-14 13:08     ` Cédric Le Goater
2019-08-09  6:57 ` [Qemu-devel] [PATCH v9 02/11] numa: move numa global variable nb_numa_nodes into MachineState Tao
2019-08-09  6:57 ` [Qemu-devel] [PATCH v9 03/11] numa: move numa global variable have_numa_distance " Tao
2019-08-09  6:57 ` [Qemu-devel] [PATCH v9 04/11] numa: move numa global variable numa_info " Tao
2019-08-09  6:57 ` [Qemu-devel] [PATCH v9 05/11] numa: Extend CLI to provide initiator information for numa nodes Tao
2019-08-13 15:00   ` Igor Mammedov
2019-08-14  2:24     ` Tao Xu
2019-08-16 14:47       ` Igor Mammedov
2019-08-14  2:39     ` Dan Williams
2019-08-14  5:13       ` Tao Xu
2019-08-14 21:29         ` Dan Williams [this message]
2019-08-15  1:56           ` Tao Xu
2019-08-15  2:31             ` Dan Williams
2019-08-16 14:57               ` Igor Mammedov
2019-08-20  8:34                 ` Tao Xu
2019-08-27 13:12                   ` Igor Mammedov
2019-08-28  1:09                     ` Tao Xu
2019-08-09  6:57 ` [Qemu-devel] [PATCH v9 06/11] hmat acpi: Build Memory Proximity Domain Attributes Structure(s) Tao
2019-08-09  6:57 ` [Qemu-devel] [PATCH v9 07/11] hmat acpi: Build System Locality Latency and Bandwidth Information Structure(s) Tao
2019-08-09  6:57 ` [Qemu-devel] [PATCH v9 08/11] hmat acpi: Build Memory Side Cache " Tao
2019-08-09  6:57 ` [Qemu-devel] [PATCH v9 09/11] numa: Extend the CLI to provide memory latency and bandwidth information Tao
2019-08-12  5:13   ` Daniel Black
2019-08-12  6:11     ` Tao Xu
2019-08-13 15:11   ` Eric Blake
2019-08-14  2:58     ` Tao Xu
2019-08-09  6:57 ` [Qemu-devel] [PATCH v9 10/11] numa: Extend the CLI to provide memory side cache information Tao
2019-08-09  6:57 ` [Qemu-devel] [PATCH v9 11/11] tests/bios-tables-test: add test cases for ACPI HMAT Tao
2019-08-09 11:11 ` [Qemu-devel] [PATCH v9 00/11] Build ACPI Heterogeneous Memory Attribute Table (HMAT) no-reply
2019-08-13  8:53 ` Tao Xu
2019-08-14 20:57   ` Eduardo Habkost
2019-08-15  0:53     ` Tao Xu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAPcyv4jCuy6zvM8NcacXhvpUBUyp_BYMcEtBuA_ck3AhkyGNsQ@mail.gmail.com \
    --to=dan.j.williams@intel.com \
    --cc=daniel@linux.ibm.com \
    --cc=ehabkost@redhat.com \
    --cc=fan.du@intel.com \
    --cc=imammedo@redhat.com \
    --cc=jingqi.liu@intel.com \
    --cc=jonathan.cameron@huawei.com \
    --cc=qemu-devel@nongnu.org \
    --cc=tao3.xu@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).