From: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
To: Mina Almasry <almasrymina@google.com>
Cc: Dan Williams <dan.j.williams@intel.com>,
	Michael Sammler <sammler@google.com>,
	 Vishal Verma <vishal.l.verma@intel.com>,
	Dave Jiang <dave.jiang@intel.com>,
	 Ira Weiny <ira.weiny@intel.com>,
	Pasha Tatashin <pasha.tatashin@soleen.com>,
	nvdimm@lists.linux.dev,  linux-kernel@vger.kernel.org
Subject: Re: [PATCH v1] virtio_pmem: populate numa information
Date: Wed, 16 Nov 2022 15:28:46 +0100
Message-ID: <CAM9Jb+heC0nu6P+Pt-kH46Q0W3YSJvcV8VMgLCDSC7a8h6h7dg@mail.gmail.com>
In-Reply-To: <CAHS8izNgmwjwyTyFzXWKrM==nTO0CJEW3+mUoKmtYjPushL5-g@mail.gmail.com>

> > > > > > > > Compute the numa information for a virtio_pmem device from the memory
> > > > > > > > range of the device. Previously, the target_node was always 0 since
> > > > > > > > the ndr_desc.target_node field was never explicitly set. The code for
> > > > > > > > computing the numa node is taken from cxl_pmem_region_probe in
> > > > > > > > drivers/cxl/pmem.c.
> > > > > > > >
> > > > > > > > Signed-off-by: Michael Sammler <sammler@google.com>
> > >
> > > Tested-by: Mina Almasry <almasrymina@google.com>
> > >
> > > I don't have much expertise on this driver, but with the help of this
> > > patch I was able to get memory tiering [1] emulation going on qemu. As
> > > far as I know there is no alternative to this emulation, and so I
> > > would love to see this or equivalent merged, if possible.
> > >
> > > This is what I have going to get memory tiering emulation:
> > >
> > > In qemu, added these configs:
> > >       -object memory-backend-file,id=m4,share=on,mem-path="$path_to_virtio_pmem_file",size=2G \
> > >       -smp 2,sockets=2,maxcpus=2  \
> > >       -numa node,nodeid=0,memdev=m0 \
> > >       -numa node,nodeid=1,memdev=m1 \
> > >       -numa node,nodeid=2,memdev=m2,initiator=0 \
> > >       -numa node,nodeid=3,initiator=0 \
> > >       -device virtio-pmem-pci,memdev=m4,id=nvdimm1 \
> > >
> > > On boot, ran these commands:
> > >     # Create a devdax namespace, discarding stdout and stderr.
> > >     ndctl_static create-namespace -e namespace0.0 -m devdax -f > /dev/null 2>&1
> > >     # Hand the dax device over from device_dax to the kmem driver.
> > >     echo dax0.0 > /sys/bus/dax/drivers/device_dax/unbind
> > >     echo dax0.0 > /sys/bus/dax/drivers/kmem/new_id
> > >     # Online every offline memory block as movable.
> > >     for i in /sys/devices/system/memory/memory*; do
> > >       state=$(cat "$i/state" 2>/dev/null)
> > >       if [ "$state" = "offline" ]; then
> > >         echo online_movable > "$i/state"
> > >       fi
> > >     done
> >
> > Nice to see a way to handle the virtio-pmem device memory through the kmem
> > driver and online the corresponding memory blocks to 'zone_movable'.
> >
> > This also opens a way to use this memory range directly, irrespective of the
> > attached block device. Of course there won't be any persistent data guarantee,
> > but it is a good way to simulate memory tiering inside the guest, as
> > demonstrated below.
> > >
> > > Without this CL, I see the memory always onlined in node 0, and it is
> > > not a separate memory tier. With this CL and the qemu configs above, the
> > > memory is onlined in node 3 and is set as a separate memory tier, which
> > > enables qemu-based development:
> > >
> > > ==> /sys/devices/virtual/memory_tiering/memory_tier22/nodelist <==
> > > 3
> > > ==> /sys/devices/virtual/memory_tiering/memory_tier4/nodelist <==
> > > 0-2
> > >
> > > AFAIK there is no alternative to enabling memory tiering emulation in
> > > qemu, and would love to see this or equivalent merged, if possible.
> >
> > Just wondering if Qemu vNVDIMM device can also achieve this?
> >
>
> I spent a few minutes on this. Please note I'm really not familiar
> with these drivers, but as far as I can tell the qemu vNVDIMM device
> has the same problem and needs a fix similar to what Michael did
> here. What I did with the vNVDIMM qemu device:
>
> - Added these qemu configs:
>       -object memory-backend-file,id=m4,share=on,mem-path=./hello,size=2G,readonly=off \
>       -device nvdimm,id=nvdimm1,memdev=m4,unarmed=off \
>
> - Ran the same commands as in my previous email (they seem to apply to
> the vNVDIMM device without modification):
>     # Create a devdax namespace, discarding stdout and stderr.
>     ndctl_static create-namespace -e namespace0.0 -m devdax -f > /dev/null 2>&1
>     # Hand the dax device over from device_dax to the kmem driver.
>     echo dax0.0 > /sys/bus/dax/drivers/device_dax/unbind
>     echo dax0.0 > /sys/bus/dax/drivers/kmem/new_id
>     # Online every offline memory block as movable.
>     for i in /sys/devices/system/memory/memory*; do
>       state=$(cat "$i/state" 2>/dev/null)
>       if [ "$state" = "offline" ]; then
>         echo online_movable > "$i/state"
>       fi
>     done
>
> I see the memory from the vNVDIMM device get onlined on node 0, and it is
> not detected as a separate memory tier. I suspect that driver needs a
> similar fix to this one.

Thanks for trying. It seems the vNVDIMM device already has an option to
provide the target node [1].

[1] https://www.mail-archive.com/qemu-devel@nongnu.org/msg827765.html
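
For anyone reproducing the setup quoted above, here is a minimal sketch for
checking which nodes ended up in which memory tier. It assumes the sysfs
layout shown in the quoted output
(/sys/devices/virtual/memory_tiering/memory_tierN/nodelist). The function
name show_memory_tiers and its optional directory argument are illustrative
only and not part of any tool mentioned in this thread.

```shell
#!/bin/sh
# Print "<tier>: <nodelist>" for each memory tier found under a sysfs-style
# directory. The default path is the one quoted in this thread; the optional
# argument is a hypothetical override so the function can be tried against a
# scratch directory.
show_memory_tiers() {
    base="${1:-/sys/devices/virtual/memory_tiering}"
    found=0
    for f in "$base"/memory_tier*/nodelist; do
        # Skip the unexpanded glob pattern and unreadable entries.
        [ -r "$f" ] || continue
        tier=$(basename "$(dirname "$f")")
        printf '%s: %s\n' "$tier" "$(cat "$f")"
        found=1
    done
    if [ "$found" -eq 0 ]; then
        echo "no memory tiers found under $base" >&2
    fi
    return 0
}

show_memory_tiers "$@"
```

With the patch applied and the qemu configuration quoted above, one would
expect the virtio-pmem backed node to be listed in a tier of its own rather
than together with the DRAM nodes.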
