All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Ville Syrjälä" <ville.syrjala@linux.intel.com>
To: Tony Camuso <tcamuso@redhat.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>,
	dri-devel@lists.freedesktop.org, linux-kernel@vger.kernel.org,
	airlied@linux.ie, dkwon@redhat.com
Subject: Re: [PATCH] drm: assure aux_dev is nonzero before using it
Date: Wed, 10 Jul 2019 16:56:17 +0300	[thread overview]
Message-ID: <20190710135617.GE5942@intel.com> (raw)
In-Reply-To: <5111581c-9d73-530d-d3ff-4f6950bf3f8c@redhat.com>

On Wed, Jul 10, 2019 at 09:47:11AM -0400, Tony Camuso wrote:
> On 5/24/19 4:36 AM, Jani Nikula wrote:
> > On Thu, 23 May 2019, tcamuso <tcamuso@redhat.com> wrote:
> >>  From Daniel Kwon <dkwon@redhat.com>
> >>
> >> The system was crashed due to invalid memory access while trying to access
> >> auxiliary device.
> >>
> >> crash> bt
> >> PID: 9863   TASK: ffff89d1bdf11040  CPU: 1   COMMAND: "ipmitool"
> >>   #0 [ffff89cedd7f3868] machine_kexec at ffffffffb0663674
> >>   #1 [ffff89cedd7f38c8] __crash_kexec at ffffffffb071cf62
> >>   #2 [ffff89cedd7f3998] crash_kexec at ffffffffb071d050
> >>   #3 [ffff89cedd7f39b0] oops_end at ffffffffb0d6d758
> >>   #4 [ffff89cedd7f39d8] no_context at ffffffffb0d5bcde
> >>   #5 [ffff89cedd7f3a28] __bad_area_nosemaphore at ffffffffb0d5bd75
> >>   #6 [ffff89cedd7f3a78] bad_area at ffffffffb0d5c085
> >>   #7 [ffff89cedd7f3aa0] __do_page_fault at ffffffffb0d7080c
> >>   #8 [ffff89cedd7f3b10] do_page_fault at ffffffffb0d70905
> >>   #9 [ffff89cedd7f3b40] page_fault at ffffffffb0d6c758
> >>      [exception RIP: drm_dp_aux_dev_get_by_minor+0x3d]
> >>      RIP: ffffffffc0a589bd  RSP: ffff89cedd7f3bf0  RFLAGS: 00010246
> >>      RAX: 0000000000000000  RBX: 0000000000000000  RCX: ffff89cedd7f3fd8
> >>      RDX: 0000000000000000  RSI: 0000000000000000  RDI: ffffffffc0a613e0
> >>      RBP: ffff89cedd7f3bf8   R8: ffff89f1bcbabbd0   R9: 0000000000000000
> >>      R10: ffff89f1be7a1cc0  R11: 0000000000000000  R12: 0000000000000000
> >>      R13: ffff89f1b32a2830  R14: ffff89d18fadfa00  R15: 0000000000000000
> >>      ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
> >>      RIP: 00002b45f0d80d30  RSP: 00007ffc416066a0  RFLAGS: 00010246
> >>      RAX: 0000000000000002  RBX: 000056062e212d80  RCX: 00007ffc41606810
> >>      RDX: 0000000000000000  RSI: 0000000000000002  RDI: 00007ffc41606ec0
> >>      RBP: 0000000000000000   R8: 000056062dfed229   R9: 00002b45f0cdf14d
> >>      R10: 0000000000000002  R11: 0000000000000246  R12: 00007ffc41606ec0
> >>      R13: 00007ffc41606ed0  R14: 00007ffc41606ee0  R15: 0000000000000000
> >>      ORIG_RAX: 0000000000000002  CS: 0033  SS: 002b
> >>
> >> ----------------------------------------------------------------------------
> >>
> >> It was trying to open '/dev/ipmi0', but as no entry in aux_dir, it returned
> >> NULL from 'idr_find()'. This drm_dp_aux_dev_get_by_minor() should have done a
> >> check on this, but had failed to do it.
> > 
> > I think the better question is, *why* does the idr_find() return NULL? I
> > don't think it should, under any circumstances. I fear adding the check
> > here papers over some other problem, taking us further away from the
> > root cause.
> > 
> > Also, can you reproduce this on a recent upstream kernel? The aux device
> > nodes were introduced in kernel v4.6. Whatever you reproduced on v3.10
> > is pretty much irrelevant for upstream.
> > 
> > 
> > BR,
> > Jani.
> 
> I have not been able to reproduce this problem.

mknod /dev/foo c <drm_dp_aux major> 255
cat /dev/foo

should do it.

> 
> However, whatever the reason idr_find() returns NULL, isn't it good form to
> check it before using it? What would be the software engineering reason not
> to do this?
> 
> > 
> > 
> > 
> > 
> >>
> >> ----------------------------------------------------------------------------
> >> /usr/src/debug/kernel-3.10.0-957.12.1.el7/linux-3.10.0-957.12.1.el7.x86_64/include/linux/idr.h: 114
> >>       114 	struct idr_layer *hint = rcu_dereference_raw(idr->hint);
> >> 0xffffffffc0a58998 <drm_dp_aux_dev_get_by_minor+0x18>:	mov    0x8a41(%rip),%rax        # 0xffffffffc0a613e0 <aux_idr>
> >> /usr/src/debug/kernel-3.10.0-957.12.1.el7/linux-3.10.0-957.12.1.el7.x86_64/include/linux/idr.h: 116
> >>       116 	if (hint && (id & ~IDR_MASK) == hint->prefix)
> >>       117 		return rcu_dereference_raw(hint->ary[id & IDR_MASK]);
> >> 0xffffffffc0a5899f <drm_dp_aux_dev_get_by_minor+0x1f>:	test   %rax,%rax
> >> 0xffffffffc0a589a2 <drm_dp_aux_dev_get_by_minor+0x22>:	je     0xffffffffc0a589ac <drm_dp_aux_dev_get_by_minor+0x2c>
> >> 0xffffffffc0a589a4 <drm_dp_aux_dev_get_by_minor+0x24>:	mov    %ebx,%edx
> >> 0xffffffffc0a589a6 <drm_dp_aux_dev_get_by_minor+0x26>:	xor    %dl,%dl
> >> 0xffffffffc0a589a8 <drm_dp_aux_dev_get_by_minor+0x28>:	cmp    (%rax),%edx
> >> 0xffffffffc0a589aa <drm_dp_aux_dev_get_by_minor+0x2a>:	je     0xffffffffc0a589f0 <drm_dp_aux_dev_get_by_minor+0x70>
> >> /usr/src/debug/kernel-3.10.0-957.12.1.el7/linux-3.10.0-957.12.1.el7.x86_64/include/linux/idr.h: 119
> >>       119 	return idr_find_slowpath(idr, id);
> >> 0xffffffffc0a589ac <drm_dp_aux_dev_get_by_minor+0x2c>:	mov    %ebx,%esi
> >> 0xffffffffc0a589ae <drm_dp_aux_dev_get_by_minor+0x2e>:	mov    $0xffffffffc0a613e0,%rdi
> >> 0xffffffffc0a589b5 <drm_dp_aux_dev_get_by_minor+0x35>:	callq  0xffffffffb09771b0 <idr_find_slowpath>
> >> 0xffffffffc0a589ba <drm_dp_aux_dev_get_by_minor+0x3a>:	mov    %rax,%rbx
> >> /usr/src/debug/kernel-3.10.0-957.12.1.el7/linux-3.10.0-957.12.1.el7.x86_64/arch/x86/include/asm/atomic.h: 25
> >>        25 	return ACCESS_ONCE((v)->counter);
> >> 0xffffffffc0a589bd <drm_dp_aux_dev_get_by_minor+0x3d>:	mov    0x18(%rbx),%edx
> >>
> >> crash> struct file.f_path 0xffff89d18fadfa00
> >>    f_path = {
> >>      mnt = 0xffff89f23feaa620,
> >>      dentry = 0xffff89f1be7a1cc0
> >>    }
> >> crash> files -d 0xffff89f1be7a1cc0
> >>       DENTRY           INODE           SUPERBLK     TYPE PATH
> >> ffff89f1be7a1cc0 ffff89f1b32a2830 ffff89d293aa8800 CHR  /dev/ipmi0
> >>
> >> crash> struct inode.i_rdev ffff89f1b32a2830
> >>    i_rdev = 0xf200000
> >> crash> eval (0xfffff & 0xf200000)
> >> hexadecimal: 0
> >>      decimal: 0
> >>        octal: 0
> >>       binary: 0000000000000000000000000000000000000000000000000000000000000000
> >> ----------------------------------------------------------------------------
> >>
> >> As the index value was 0 and aux_idr had value 0 for all, it can have value
> >> NULL from idr_find() function, but the below function doesn't check and just
> >> tries to use it.
> >>
> >> ----------------------------------------------------------------------------
> >> crash> aux_idr
> >> aux_idr = $8 = {
> >>    hint = 0x0,
> >>    top = 0x0,
> >>    id_free = 0x0,
> >>    layers = 0x0,
> >>    id_free_cnt = 0x0,
> >>    cur = 0x0,
> >>    lock = {
> >>      {
> >>        rlock = {
> >>          raw_lock = {
> >>            val = {
> >>              counter = 0x0
> >>            }
> >>          }
> >>        }
> >>      }
> >>    }
> >> }
> >>
> >> crash> edis -f drm_dp_aux_dev_get_by_minor
> >> /usr/src/debug/kernel-3.10.0-957.12.1.el7/linux-3.10.0-957.12.1.el7.x86_64/drivers/gpu/drm/drm_dp_aux_dev.c: 57
> >>
> >>        56 static struct drm_dp_aux_dev *drm_dp_aux_dev_get_by_minor(unsigned index)
> >>        57 {
> >>        58 	struct drm_dp_aux_dev *aux_dev = NULL;
> >>        59
> >>        60 	mutex_lock(&aux_idr_mutex);
> >>        61 	aux_dev = idr_find(&aux_idr, index);
> >>        62 	if (!kref_get_unless_zero(&aux_dev->refcount))
> >>        63 		aux_dev = NULL;
> >>        64 	mutex_unlock(&aux_idr_mutex);
> >>        65
> >>        66 	return aux_dev;
> >>        67 }
> >> ----------------------------------------------------------------------------
> >>
> >> To avoid this kinds of situation, we should make a safeguard for the returned
> >> value. Changing the line 62 with the below would do.
> >>
> >>        62 	if (aux_dev && !kref_get_unless_zero(&aux_dev->refcount))
> >>                      ^^^^^^^^^^
> >>  From Tony Camuso <tcamuso@redhat.com>
> >> I built a patched kernel for several architectures.
> >> Booted the kernel, and ran the following for 100 iterations.
> >>     rmmod ipmi kmods to remove /dev/ipmi0.
> >>     Invoked ipmitool
> >>     insmod ipmi kmods
> >> Did not see any crashes or call traces.
> >>
> >> Suggested-by: Daniel Kwon <dkwon@redhat.com>
> >> Signed-off-by: Tony Camuso <tcamuso@redhat.com>
> >> ---
> >>   drivers/gpu/drm/drm_dp_aux_dev.c | 2 +-
> >>   1 file changed, 1 insertion(+), 1 deletion(-)
> >>
> >> diff --git a/drivers/gpu/drm/drm_dp_aux_dev.c b/drivers/gpu/drm/drm_dp_aux_dev.c
> >> index 0e4f25d63fd2d..0b11210c882ee 100644
> >> --- a/drivers/gpu/drm/drm_dp_aux_dev.c
> >> +++ b/drivers/gpu/drm/drm_dp_aux_dev.c
> >> @@ -60,7 +60,7 @@ static struct drm_dp_aux_dev *drm_dp_aux_dev_get_by_minor(unsigned index)
> >>   
> >>   	mutex_lock(&aux_idr_mutex);
> >>   	aux_dev = idr_find(&aux_idr, index);
> >> -	if (!kref_get_unless_zero(&aux_dev->refcount))
> >> +	if (aux_dev && !kref_get_unless_zero(&aux_dev->refcount))
> >>   		aux_dev = NULL;
> >>   	mutex_unlock(&aux_idr_mutex);
> > 
> 
> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel

-- 
Ville Syrjälä
Intel

  reply	other threads:[~2019-07-10 13:56 UTC|newest]

Thread overview: 29+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-05-23 11:09 [PATCH] drm: assure aux_dev is nonzero before using it tcamuso
2019-05-23 11:09 ` tcamuso
2019-05-24  8:36 ` Jani Nikula
2019-05-24  8:36   ` Jani Nikula
2019-05-24 10:48   ` tony camuso
2019-05-24 11:58     ` Ville Syrjälä
2019-07-10 13:47   ` Tony Camuso
2019-07-10 13:56     ` Ville Syrjälä [this message]
2019-07-12 16:07       ` Tony Camuso
2019-07-12 17:06         ` Ville Syrjälä
2019-07-12 17:35           ` Tony Camuso
2019-09-23 15:03           ` Tony Camuso
2019-09-23 15:22             ` Ville Syrjälä
2020-08-10 17:11 Zwane Mwaikambo
2020-08-10 17:11 ` Zwane Mwaikambo
2020-08-11  8:58 ` Daniel Vetter
2020-08-11  8:58   ` Daniel Vetter
2020-08-11 22:16   ` Zwane Mwaikambo
2020-08-11 22:16     ` Zwane Mwaikambo
2020-08-12 14:10     ` Daniel Vetter
2020-08-12 14:10       ` Daniel Vetter
2020-08-12 15:44       ` Lyude Paul
2020-08-12 15:44         ` Lyude Paul
2020-08-12 20:21         ` Zwane Mwaikambo
2020-08-12 20:21           ` Zwane Mwaikambo
2020-08-18 17:58         ` Zwane Mwaikambo
2020-08-18 17:58           ` Zwane Mwaikambo
2020-09-08 18:41           ` Lyude Paul
2020-09-08 18:41             ` Lyude Paul

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190710135617.GE5942@intel.com \
    --to=ville.syrjala@linux.intel.com \
    --cc=airlied@linux.ie \
    --cc=dkwon@redhat.com \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=jani.nikula@linux.intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=tcamuso@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.