From: Toshi Kani <toshi.kani@hp.com>
To: Dan Williams <dan.j.williams@intel.com>
Cc: "axboe@kernel.dk" <axboe@kernel.dk>,
	"linux-nvdimm@lists.01.org" <linux-nvdimm@lists.01.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"mingo@kernel.org" <mingo@kernel.org>,
	"linux-acpi@vger.kernel.org" <linux-acpi@vger.kernel.org>,
	"linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>,
	"hch@lst.de" <hch@lst.de>
Subject: Re: [PATCH v2 15/17] libnvdimm: Set numa_node to NVDIMM devices
Date: Thu, 25 Jun 2015 15:51:43 -0600
Message-ID: <1435269103.11808.349.camel@misato.fc.hp.com>
In-Reply-To: <CAPcyv4jhDc1+TGyo4J6vC7bRBaGbfAXyLP2qB9X5DAJRBhCRyQ@mail.gmail.com>

On Thu, 2015-06-25 at 14:31 -0700, Dan Williams wrote:
> On Thu, Jun 25, 2015 at 11:34 AM, Williams, Dan J
> <dan.j.williams@intel.com> wrote:
> > On Thu, 2015-06-25 at 11:45 -0600, Toshi Kani wrote:
> >> On Thu, 2015-06-25 at 05:37 -0400, Dan Williams wrote:
> >> > From: Toshi Kani <toshi.kani@hp.com>
> >> >
> >> > ACPI NFIT table has System Physical Address Range Structure entries that
> >> > describe a proximity ID of each range when ACPI_NFIT_PROXIMITY_VALID is
> >> > set in the flags.
> >> >
> >> > Change acpi_nfit_register_region() to map a proximity ID to its node ID,
> >> > and set it to a new numa_node field of nd_region_desc, which is then
> >> > conveyed to the nd_region device.
> >> >
> >> > The device core arranges for btt and namespace devices to inherit their
> >> > node from their parent region.
> >> >
> >> > Signed-off-by: Toshi Kani <toshi.kani@hp.com>
> >> > [djbw: move set_dev_node() from region 'probe' to 'create']
> >>
> >> Sorry, I failed to mention another issue, which led me to call
> >> set_dev_node() in probe.  nd_async_device_register() calls device_add(),
> >> which does:
> >>
> >>         /* use parent numa_node */
> >>         if (parent)
> >>                 set_dev_node(dev, dev_to_node(parent));
> >>
> >> and overwrites numa_node with -1.  Since the region's parent is ndbusN,
> >> we cannot set numa_node on the parent.  So, I had to set it in probe.
> >
> > In general, I still don't like leaving it up to ->probe() which is
> > within its rights to fail and not set the node.  How about the following
> > that moves it to the bus uevent code?  Should get triggered before probe
> > so the numa_node is valid before userspace is ever notified about the
> > device.
> >
> > device_add() does:
> >
> >         kobject_uevent(&dev->kobj, KOBJ_ADD);
> >         bus_probe_device(dev);
> >
> > ...so I think we're good, agree?  I also added a missing init of
> > ndr_desc.numa_node in arch/x86/kernel/pmem.c, see below.
> 
> This looks good in a quick manual test.  It's interesting/illustrative
> that I inadvertently broke the one bit of the libnvdimm sysfs
> interface that did not have unit test coverage.

Sorry, I was interrupted.  Yes, this works fine for region &
namespace.  I'd like to check with you about btt, since the attach logic
has changed in v2.
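
As a rough illustration of the ordering discussed above, here is a
user-space toy model, not kernel code.  The toy_* names and the
desc_node field are made up for this sketch; only the order of the
three steps (inherit from parent, uevent, probe) comes from the quoted
device_add() snippets.

```c
#include <assert.h>
#include <stddef.h>

#define NUMA_NO_NODE (-1)

/* Toy stand-in for struct device; every field here is hypothetical. */
struct toy_dev {
	struct toy_dev *parent;
	int numa_node;
	int desc_node;			/* node taken from the region descriptor */
	int node_seen_by_uevent;	/* what userspace would see at KOBJ_ADD */
	int (*probe)(struct toy_dev *dev);
};

/* Models a bus uevent hook that applies the descriptor's node. */
static void toy_bus_uevent(struct toy_dev *dev)
{
	if (dev->desc_node != NUMA_NO_NODE)
		dev->numa_node = dev->desc_node;
	dev->node_seen_by_uevent = dev->numa_node;
}

/* Models the three steps of device_add() quoted in this thread. */
static int toy_device_add(struct toy_dev *dev)
{
	/* use parent numa_node: clobbers anything set before this call */
	if (dev->parent)
		dev->numa_node = dev->parent->numa_node;

	toy_bus_uevent(dev);		/* kobject_uevent(KOBJ_ADD) */

	/* bus_probe_device(): may fail, but the node is already set */
	return dev->probe ? dev->probe(dev) : 0;
}

/* A probe that always fails, to show the node does not depend on it. */
static int toy_failing_probe(struct toy_dev *dev)
{
	(void)dev;
	return -1;
}
```

In this model a region whose parent bus has no node still ends up with
its descriptor's node, and that value is what the uevent step observes,
even when the probe callback fails.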

Previously, as described in patch 16/17, bttN bound to pmem had a valid
numa_node value, and seeding btt0 had -1.

  /sys/bus/nd/devices
  |-- btt0/numa_node:-1
  |-- btt1/numa_node:0

In this version, there is an unbound (seed?) bttN for each of the 4
regions (btt0-3), and btt4 & btt5 are bound to pmem0 & pmem3 on my system.

btt0/numa_node:0
btt1/numa_node:0
btt2/numa_node:1
btt3/numa_node:1
btt4/numa_node:0
btt5/numa_node:1

btt0
-> ../../../devices/LNXSYSTM:00/LNXSYBUS:00/ACPI0012:00/ndbus0/region0/btt0
btt1
-> ../../../devices/LNXSYSTM:00/LNXSYBUS:00/ACPI0012:00/ndbus0/region1/btt1
btt2
-> ../../../devices/LNXSYSTM:00/LNXSYBUS:00/ACPI0012:00/ndbus0/region2/btt2
btt3
-> ../../../devices/LNXSYSTM:00/LNXSYBUS:00/ACPI0012:00/ndbus0/region3/btt3
btt4
-> ../../../devices/LNXSYSTM:00/LNXSYBUS:00/ACPI0012:00/ndbus0/region0/btt4
btt5
-> ../../../devices/LNXSYSTM:00/LNXSYBUS:00/ACPI0012:00/ndbus0/region3/btt5

The unbound bttNs also attach to different regions after a reboot:

btt0/numa_node:0
btt1/numa_node:1
btt2/numa_node:1
btt3/numa_node:0
btt4/numa_node:0
btt5/numa_node:1

btt0
-> ../../../devices/LNXSYSTM:00/LNXSYBUS:00/ACPI0012:00/ndbus0/region0/btt0
btt1
-> ../../../devices/LNXSYSTM:00/LNXSYBUS:00/ACPI0012:00/ndbus0/region3/btt1
btt2
-> ../../../devices/LNXSYSTM:00/LNXSYBUS:00/ACPI0012:00/ndbus0/region2/btt2
btt3
-> ../../../devices/LNXSYSTM:00/LNXSYBUS:00/ACPI0012:00/ndbus0/region1/btt3
btt4
-> ../../../devices/LNXSYSTM:00/LNXSYBUS:00/ACPI0012:00/ndbus0/region0/btt4
btt5
-> ../../../devices/LNXSYSTM:00/LNXSYBUS:00/ACPI0012:00/ndbus0/region3/btt5

Is this how you'd expect btt to work in this version?  (I have not
looked at the btt changes yet)
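
To compare listings like the ones above across boots, the parent region
can be pulled out of each link target.  This is a minimal sketch that
assumes the targets look like the paths shown in this mail; region_of()
is a hypothetical helper, not part of any existing tool.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Return N from the last "/regionN" component of a sysfs link target,
 * or -1 if the path has no such component.
 */
static int region_of(const char *target)
{
	const char *p = target;
	const char *last = NULL;
	int n;

	/* Walk forward so that 'last' ends up at the final "/region". */
	while ((p = strstr(p, "/region")) != NULL) {
		last = p;
		p += strlen("/region");
	}

	if (!last || sscanf(last, "/region%d", &n) != 1)
		return -1;
	return n;
}
```

Feeding it the readlink(2) result for each /sys/bus/nd/devices/bttN on
two boots and comparing the returned numbers would show which bttNs
moved to a different region.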

Thanks,
-Toshi

