From: Dan Williams <dan.j.williams@intel.com>
To: yizhan <yizhan@redhat.com>
Cc: linux-nvdimm <linux-nvdimm@ml01.01.org>, linux-block@vger.kernel.org
Subject: Re: [BUG] kernel NULL pointer dereference observed during pmem btt switch test
Date: Sun, 31 Jul 2016 10:54:25 -0700
Message-ID: <CAPcyv4hxP8aDyzsoeG9XH5ygtNWypU8fUj+qCNKh2wWa+PJh6w@mail.gmail.com>
In-Reply-To: <d83eeac2-49dc-4c45-9aea-7d68da4fbf7d@redhat.com>

On Sun, Jul 31, 2016 at 10:19 AM, yizhan <yizhan@redhat.com> wrote:
> On 07/30/2016 11:52 PM, Dan Williams wrote:
>>
>> On Thu, Jul 28, 2016 at 8:50 AM, Dan Williams <dan.j.williams@intel.com>
>> wrote:
>>>
>>> [ adding linux-block ]
>>>
>>> On Wed, Jul 27, 2016 at 8:20 PM, Yi Zhang <yizhan@redhat.com> wrote:
>>>>
>>>> Hello everyone
>>>>
>>>> Could you help check this issue, thanks.
>>>>
>>>> Steps I used:
>>>> 1. Reserve 4*8G of memory for pmem by adding the kernel parameters
>>>> "memmap=8G!4G memmap=8G!12G memmap=8G!20G memmap=8G!28G"
>>>> 2. Execute the script below:
>>>> #!/bin/bash
>>>> pmem_btt_switch() {
>>>>          sector_size_list="512 520 528 4096 4104 4160 4224"
>>>>          for sector_size in $sector_size_list; do
>>>>                  ndctl create-namespace -f -e namespace${1}.0 \
>>>>                          --mode=sector -l $sector_size
>>>>                  ndctl create-namespace -f -e namespace${1}.0 --mode=raw
>>>>          done
>>>> }
>>>>
>>>> for i in 0 1 2 3; do
>>>>          pmem_btt_switch $i &
>>>> done
>>>
>>> Thanks for the report.  This looks like del_gendisk() frees the
>>> previous usage of the devt before the bdi is unregistered.  This
>>> appears to be a general problem with all block drivers, not just
>>> libnvdimm, since blk_cleanup_queue() is typically called after
>>> del_gendisk().  I.e. it will always be the case that the bdi
>>> registered with the devt allocated at add_disk() will still be alive
>>> when del_gendisk()->disk_release() frees the previous devt number.
>>>
>>> I *think* the path forward is to allow the bdi to hold a reference
>>> against the blk_alloc_devt() allocation until it is done with it.  Any
>>> other ideas on fixing this object lifetime problem?
>>
>> Does the attached patch solve this for you?
>
> Hi Dan
> This patch works; the issue no longer reproduces after several test runs,
> thanks

Thank you!

> One more thing: while verifying the fix I saw the error messages below.
> Could you check whether they are expected?
> [  150.464620] Dev pmem1: unable to read RDB block 0
> [  150.486897]  pmem1: unable to read partition table
> [  150.486901] pmem1: partition table beyond EOD, truncated
> [  151.133287] Buffer I/O error on dev pmem3, logical block 2, async page read
> [  151.164620] Buffer I/O error on dev pmem3, logical block 2, async page read
>

This test is racing block device registration versus teardown.  These
messages are expected and are likely coming from the block queue
percpu ref being marked dead while the partition scan runs.  When this
happens blk_queue_enter() in generic_make_request() returns errors for
every new I/O submission while blk_cleanup_queue() runs.
