From: Dan Williams <dan.j.williams@intel.com>
To: Richard Palethorpe <rpalethorpe@suse.com>
Cc: linux-nvdimm <linux-nvdimm@lists.01.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Coly Li <colyli@suse.com>
Subject: Re: [PATCH v2] nvdimm: Avoid race between probe and reading device attributes
Date: Mon, 1 Feb 2021 15:19:37 -0800
Message-ID: <CAPcyv4jzfnnOTJTK5WKYpt_qOm1UWv-PZ7ZH3GiXf7x_oz6jQw@mail.gmail.com>
In-Reply-To: <20200615074723.12163-1-rpalethorpe@suse.com>

Yikes, sorry this languished so long, comments below:

On Mon, Jun 15, 2020 at 12:48 AM Richard Palethorpe
<rpalethorpe@suse.com> wrote:
>
> It is possible to cause a division error and use-after-free by querying the
> nmem device before the driver data is fully initialised in nvdimm_probe. E.g
> by doing
>
> (while true; do
>      cat /sys/bus/nd/devices/nmem*/available_slots 2>&1 > /dev/null
>  done) &
>
> while true; do
>      for i in $(seq 0 4); do
>          echo nmem$i > /sys/bus/nd/drivers/nvdimm/bind
>      done
>      for i in $(seq 0 4); do
>          echo nmem$i > /sys/bus/nd/drivers/nvdimm/unbind
>      done
>  done
>
> On 5.7-rc3 this causes:
>
> [   12.711578] divide error: 0000 [#1] SMP KASAN PTI
> [   12.714857] RIP: 0010:nd_label_nfree+0x134/0x1a0 [libnvdimm]
[..]
> [   12.725308] CR2: 00007fd16f1ec000 CR3: 0000000064322006 CR4: 0000000000160ef0
> [   12.726268] Call Trace:
> [   12.726633]  available_slots_show+0x4e/0x120 [libnvdimm]
> [   12.727380]  dev_attr_show+0x42/0x80
> [   12.727891]  ? memset+0x20/0x40
> [   12.728341]  sysfs_kf_seq_show+0x218/0x410
> [   12.728923]  seq_read+0x389/0xe10
> [   12.729415]  vfs_read+0x101/0x2d0
> [   12.729891]  ksys_read+0xf9/0x1d0
> [   12.730361]  ? kernel_write+0x120/0x120
> [   12.730915]  do_syscall_64+0x95/0x4a0
> [   12.731435]  entry_SYSCALL_64_after_hwframe+0x49/0xb3
[..]
> Fixes: 4d88a97aa9e8 ("libnvdimm, nvdimm: dimm driver and base libnvdimm device-driver infrastructure")
> Cc: Dan Williams <dan.j.williams@intel.com>
> Cc: Vishal Verma <vishal.l.verma@intel.com>
> Cc: Dave Jiang <dave.jiang@intel.com>
> Cc: Ira Weiny <ira.weiny@intel.com>
> Cc: linux-nvdimm@lists.01.org
> Cc: linux-kernel@vger.kernel.org
> Cc: Coly Li <colyli@suse.com>
> Signed-off-by: Richard Palethorpe <rpalethorpe@suse.com>
> ---
>
> V2:
> + Reviewed by Coly and removed unecessary lock
>
>  drivers/nvdimm/dimm.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/nvdimm/dimm.c b/drivers/nvdimm/dimm.c
> index 7d4ddc4d9322..3d3988e1d9a0 100644
> --- a/drivers/nvdimm/dimm.c
> +++ b/drivers/nvdimm/dimm.c
> @@ -43,7 +43,6 @@ static int nvdimm_probe(struct device *dev)
>         if (!ndd)
>                 return -ENOMEM;
>
> -       dev_set_drvdata(dev, ndd);
>         ndd->dpa.name = dev_name(dev);
>         ndd->ns_current = -1;
>         ndd->ns_next = -1;
> @@ -106,6 +105,8 @@ static int nvdimm_probe(struct device *dev)
>         if (rc)
>                 goto err;
>
> +       dev_set_drvdata(dev, ndd);
> +

I see why this works, but I think the bug is in available_slots_show().
It is a bug for a sysfs attribute to reference driver-data without
synchronizing against bind. So it should be possible for probe to set
that pointer whenever it wants. In other words, this fix (forgive the
whitespace damage from pasting):

diff --git a/drivers/nvdimm/dimm_devs.c b/drivers/nvdimm/dimm_devs.c
index b59032e0859b..e68b17bc7aab 100644
--- a/drivers/nvdimm/dimm_devs.c
+++ b/drivers/nvdimm/dimm_devs.c
@@ -335,10 +335,8 @@ static ssize_t state_show(struct device *dev, struct device_attribute *attr,
 }
 static DEVICE_ATTR_RO(state);
 
-static ssize_t available_slots_show(struct device *dev,
-		struct device_attribute *attr, char *buf)
+static ssize_t __available_slots_show(struct nvdimm_drvdata *ndd, char *buf)
 {
-	struct nvdimm_drvdata *ndd = dev_get_drvdata(dev);
 	ssize_t rc;
 	u32 nfree;
 
@@ -356,6 +354,18 @@ static ssize_t available_slots_show(struct device *dev,
 	nvdimm_bus_unlock(dev);
 	return rc;
 }
+
+static ssize_t available_slots_show(struct device *dev,
+		struct device_attribute *attr, char *buf)
+{
+	ssize_t rc;
+
+	nd_device_lock(dev);
+	rc = __available_slots_show(dev_get_drvdata(dev), buf);
+	nd_device_unlock(dev);
+
+	return rc;
+}
 static DEVICE_ATTR_RO(available_slots);
 
 __weak ssize_t security_show(struct device *dev,
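
To illustrate the locking argument outside the kernel, here is a minimal
userspace sketch, not the libnvdimm code. It assumes that a pthread mutex
can stand in for the device lock, which the driver core holds across probe
and remove. Every name in it (model_device, model_drvdata, model_probe,
model_remove, model_available_slots) is hypothetical, as is the -ENXIO
return for an unbound device. The point it tries to show is that once the
attribute reader takes the same lock as probe, probe is free to publish the
driver-data pointer before the data is fully initialised.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for struct nvdimm_drvdata. */
struct model_drvdata {
	unsigned int nslots;	/* total label slots, 0 until probe finishes */
	unsigned int used;	/* slots already in use */
};

/* Hypothetical stand-in for struct device. */
struct model_device {
	pthread_mutex_t lock;		/* plays the role of the device lock */
	struct model_drvdata *drvdata;	/* NULL while no driver is bound */
};

/* Caller must hold dev->lock, like __available_slots_show() in the diff. */
static long model_available_slots_locked(const struct model_drvdata *ndd)
{
	if (!ndd)
		return -ENXIO;	/* assumed error code for "not bound" */
	assert(ndd->used <= ndd->nslots);
	return (long)(ndd->nslots - ndd->used);
}

/* Attribute reader: synchronizes against bind by taking the device lock. */
long model_available_slots(struct model_device *dev)
{
	long rc;

	pthread_mutex_lock(&dev->lock);
	rc = model_available_slots_locked(dev->drvdata);
	pthread_mutex_unlock(&dev->lock);

	return rc;
}

/* Probe: the whole body runs under the device lock. */
void model_probe(struct model_device *dev, struct model_drvdata *ndd,
		 unsigned int nslots, unsigned int used)
{
	pthread_mutex_lock(&dev->lock);
	dev->drvdata = ndd;	/* published early, before init completes */
	ndd->nslots = nslots;	/* readers cannot observe the gap */
	ndd->used = used;
	pthread_mutex_unlock(&dev->lock);
}

/* Remove: clears the pointer under the same lock. */
void model_remove(struct model_device *dev)
{
	pthread_mutex_lock(&dev->lock);
	dev->drvdata = NULL;
	pthread_mutex_unlock(&dev->lock);
}
```

With a device initialised as { PTHREAD_MUTEX_INITIALIZER, NULL }, a read
before model_probe() returns -ENXIO, a read after model_probe(&dev, &ndd, 8, 3)
returns 5, and a read after model_remove() returns -ENXIO again. A reader
that skipped the lock could instead see the pointer set while nslots is
still 0, which is the kind of half-initialised state behind the divide
error in the quoted trace.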