All of lore.kernel.org
 help / color / mirror / Atom feed
From: Alex Williamson <alex.williamson@redhat.com>
To: Xiaotian Feng <xtfeng@gmail.com>
Cc: linux-kernel <linux-kernel@vger.kernel.org>, tj <tj@kernel.org>,
	linux-ide@vger.kernel.org,
	Alexander Gordeev <agordeev@redhat.com>
Subject: Re: ahci_host_activate NULL pointer (was Re: Linux 3.11-rc1)
Date: Mon, 15 Jul 2013 20:29:16 -0600	[thread overview]
Message-ID: <1373941756.8441.324.camel@ul30vt.home> (raw)
In-Reply-To: <CAJn8CcHbFtNxDr6N1qjUe_2RUSU=VqN9_XBRNi5rBHnV65bQ4w@mail.gmail.com>

On Mon, 2013-07-15 at 21:02 -0400, Xiaotian Feng wrote:
> On Mon, Jul 15, 2013 at 3:44 PM, Alex Williamson
> <alex.williamson@redhat.com> wrote:
> > On Mon, 2013-07-15 at 13:23 -0600, Alex Williamson wrote:
> >> On Mon, 2013-07-15 at 14:46 -0400, Xiaotian Feng wrote:
> >> > On Tue, Jul 16, 2013 at 1:38 AM, Alex Williamson <alex.williamson@redhat.com
> >> > > wrote:
> >> >
> >> > > On Mon, 2013-07-15 at 10:49 -0600, Alex Williamson wrote:
> >> > > > On Sun, 2013-07-14 at 16:57 -0700, Linus Torvalds wrote:
> >> > > > > It's been two weeks, and the merge window has closed. If I missed
> >> > > > > anything, holler, but I don't have anything pending that I am aware
> >> > > > > of.
> >> > > > >
> >> > > > > This merge window was smaller in terms of number of commits than the
> >> > > > > 3.10 merge window, but we actually have more new lines. Most of that
> >> > > > > seems to be in staging - a full third of all changes by line-count is
> >> > > > > staging, and merging in Lustre is the bulk of that. Let's see how that
> >> > > > > all turns out, I have to say that we don't have a great track record
> >> > > > > on merging filesystems through staging.
> >> > > > >
> >> > > > > Ignoring the lustre merge, I think this really was a somewhat calmer
> >> > > > > merge window. We had a few trees with problems, and we have an
> >> > > > > on-going debate about stable patches that was triggered largely thanks
> >> > > > > to this merge window, so now we'll have something to discuss for the
> >> > > > > kernel summit. But on the whole, I suspect we might be starting to see
> >> > > > > the traditional summer slump (Australia notwithstanding).
> >> > > > >
> >> > > > > Despite being a bit smaller than the last merge window, it's not like
> >> > > > > this was a _tiny_ one, and so as usual I'm only summarizing with the
> >> > > > > normal -rc1 mergelog: and as usual the people credited here are *not*
> >> > > > > the people who actually wrote the code (although in some cases that is
> >> > > > > true), they are the people who I merged the code from.
> >> > > > >
> >> > > > > Hey, let's all start testing,
> >> > > >
> >> > > > Anyone else seeing this:
> >> > > >
> >> > > > [    2.212548] ahci 0000:00:1f.2: AHCI 0001.0200 32 slots 6 ports 3 Gbps
> >> > > 0x29 impl SATA mode
> >> > > > [    2.220732] ahci 0000:00:1f.2: flags: 64bit ncq sntf led clo pmp pio
> >> > > slum part ccc sxs
> >> > > > [    2.228997] BUG: unable to handle kernel NULL pointer dereference at
> >> > > 0000000000000508
> >> > > > [    2.236850] IP: [<ffffffff814084f7>] ahci_host_activate+0x87/0x140
> >> > > > [    2.243047] PGD 0
> >> > > > [    2.245077] Oops: 0000 [#1] SMP
> >> > > > [    2.248335] Modules linked in:
> >> > > > [    2.251405] CPU: 7 PID: 1 Comm: swapper/0 Not tainted 3.11.0-rc1+ #574
> >> > > > [    2.257929] Hardware name: LENOVO 4157CTO/LENOVO, BIOS 60KT41AUS
> >> > > 01/04/2011
> >> > > > [    2.264880] task: ffff880371508000 ti: ffff880371510000 task.ti:
> >> > > ffff880371510000
> >> > > > [    2.272353] RIP: 0010:[<ffffffff814084f7>]  [<ffffffff814084f7>]
> >> > > ahci_host_activate+0x87/0x140
> >> > > > [    2.280969] RSP: 0018:ffff880371511b38  EFLAGS: 00010293
> >> > > > [    2.286273] RAX: ffff88036e724000 RBX: ffff88036e71c028 RCX:
> >> > > ffffffff8140bce0
> >> > > > [    2.293397] RDX: 0000000000000000 RSI: 000000000000002f RDI:
> >> > > ffff88037122f098
> >> > > > [    2.300521] RBP: ffff880371511b68 R08: 0000000000000080 R09:
> >> > > 0000000000000001
> >> > > > [    2.307645] R10: 0000000000000000 R11: 0000000000000000 R12:
> >> > > 0000000000000001
> >> > > > [    2.314772] R13: 000000000000002e R14: 0000000000000000 R15:
> >> > > ffff88037122f000
> >> > > > [    2.321896] FS:  0000000000000000(0000) GS:ffff88037fdc0000(0000)
> >> > > knlGS:0000000000000000
> >> > > > [    2.329973] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> >> > > > [    2.335711] CR2: 0000000000000508 CR3: 0000000001c0b000 CR4:
> >> > > 00000000000007e0
> >> > > > [    2.342835] Stack:
> >> > > > [    2.344848]  ffff88036e720000 ffff88037122f000 0000000000000005
> >> > > ffff88037122f098
> >> > > > [    2.352301]  ffff88036e734000 ffff88036e71c028 ffff880371511c38
> >> > > ffffffff81408eae
> >> > > > [    2.359756]  ffff880300000000 ffff88037122f098 ffff8803710be7a8
> >> > > ffff880300000010
> >> > > > [    2.367210] Call Trace:
> >> > > > [    2.369655]  [<ffffffff81408eae>] ahci_init_one+0x8fe/0xaa0
> >> > > > [    2.375221]  [<ffffffff816141cf>] ?
> >> > > _raw_spin_unlock_irqrestore+0x3f/0x70
> >> > > > [    2.382006]  [<ffffffff8132529b>] local_pci_probe+0x4b/0x80
> >> > > > [    2.387571]  [<ffffffff81325501>] pci_device_probe+0x111/0x120
> >> > > > [    2.393405]  [<ffffffff813bf43b>] driver_probe_device+0x8b/0x390
> >> > > > [    2.399411]  [<ffffffff813bf7eb>] __driver_attach+0xab/0xb0
> >> > > > [    2.404984]  [<ffffffff813bf740>] ? driver_probe_device+0x390/0x390
> >> > > > [    2.411241]  [<ffffffff813bd2cd>] bus_for_each_dev+0x5d/0xa0
> >> > > > [    2.416893]  [<ffffffff813bed7e>] driver_attach+0x1e/0x20
> >> > > > [    2.422284]  [<ffffffff813be917>] bus_add_driver+0x117/0x290
> >> > > > [    2.427937]  [<ffffffff81d54755>] ? libata_transport_init+0x5e/0x5e
> >> > > > [    2.434201]  [<ffffffff813bfc8a>] driver_register+0x7a/0x170
> >> > > > [    2.439853]  [<ffffffff81d54755>] ? libata_transport_init+0x5e/0x5e
> >> > > > [    2.446110]  [<ffffffff81324ce4>] __pci_register_driver+0x64/0x70
> >> > > > [    2.452196]  [<ffffffff81d5476e>] ahci_pci_driver_init+0x19/0x1b
> >> > > > [    2.458196]  [<ffffffff810002fa>] do_one_initcall+0xfa/0x1b0
> >> > > > [    2.463853]  [<ffffffff8107b100>] ? parse_args+0x1f0/0x450
> >> > > > [    2.469332]  [<ffffffff81d13ff8>] kernel_init_freeable+0x154/0x1e3
> >> > > > [    2.475510]  [<ffffffff81d1383f>] ? do_early_param+0x8c/0x8c
> >> > > > [    2.481163]  [<ffffffff81602140>] ? rest_init+0xe0/0xe0
> >> > > > [    2.486390]  [<ffffffff8160214e>] kernel_init+0xe/0xf0
> >> > > > [    2.491530]  [<ffffffff8161cf5c>] ret_from_fork+0x7c/0xb0
> >> > > > [    2.496928]  [<ffffffff81602140>] ? rest_init+0xe0/0xe0
> >> > > > [    2.502144] Code: 88 00 00 00 49 63 c4 48 8b 7b 38 43 8d 34 2c 48 8b
> >> > > 84 c3 00 01 00 00 41 b8 80 00 00 00 48 c7 c1 e0 bc 40 81 48 8b 90 78 37 00
> >> > > 00 <4c> 8b 8a 08 05 00 00 48 89 04 24 48 c7 c2 90 be 40 81 e8 f2 b7
> >> > > > [    2.522097] RIP  [<ffffffff814084f7>] ahci_host_activate+0x87/0x140
> >> > > > [    2.528372]  RSP <ffff880371511b38>
> >> > > > [    2.531858] CR2: 0000000000000508
> >> > > > [    2.535184] ---[ end trace 66267c9b7b73f56b ]---
> >> > > > [    2.539808] Kernel panic - not syncing: Attempted to kill init!
> >> > > exitcode=0x00000009
> >> > > >
> >> > >
> >> > > Bisected to:
> >> > >
> >> > > commit b29900e62598cecd519c9ab2b8e4d03f8ebf702d
> >> > > Author: Alexander Gordeev <agordeev@redhat.com>
> >> > > Date:   Wed May 22 08:53:48 2013 +0900
> >> > >
> >> > >     AHCI: Make distinct names for ports in /proc/interrupts
> >> > >
> >> > >     Currently all interrupts assigned to AHCI ports show up in
> >> > >     '/proc/interrupts' as 'ahci'. This fix adds port numbers as
> >> > >     suffixes and hence makes the descriptions distinct.
> >> > >
> >> > >     Reported-by: Jan Beulich <JBeulich@suse.com>
> >> > >     Signed-off-by: Alexander Gordeev <agordeev@redhat.com>
> >> > >     Signed-off-by: Tejun Heo <tj@kernel.org>
> >> > >
> >> > >
> >> > Could you please try this patch?
> >> >
> >> > diff --git a/drivers/ata/libahci.c b/drivers/ata/libahci.c
> >> > index acfd0f7..e4b7176 100644
> >> > --- a/drivers/ata/libahci.c
> >> > +++ b/drivers/ata/libahci.c
> >> > @@ -2234,7 +2234,7 @@ static int ahci_port_start(struct ata_port *ap)
> >> >         if (!pp)
> >> >                 return -ENOMEM;
> >> >
> >> > -       if (ap->host->n_ports > 1) {
> >> > +       if (ap->host->n_ports > 0) {
> >> >                 pp->irq_desc = devm_kzalloc(dev, 8, GFP_KERNEL);
> >> >                 if (!pp->irq_desc) {
> >> >                         devm_kfree(dev, pp);
> >> >
> >>
> >> It does not help.  Thanks,
> >
> > Some further debugging, nr_ports is 6.  ahci_port_start gets called for
> > ap->port_no 0, 3 and 5.  The loop in ahci_host_activate dies on i = 1
> > because ->private_data is null.  Thanks,
> >
> 
> I think following patch can fix this panic. For ahci ports, when the
> port is dummy port, its private_data will be NULL, as dummy_port_ops
> doesn't support ->port_start. Could you please try this?
> 
> drivers/ata/ahci.c | 8 +++++++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/ata/ahci.c b/drivers/ata/ahci.c
> index 5064f3e..26e07ba 100644
> --- a/drivers/ata/ahci.c
> +++ b/drivers/ata/ahci.c
> @@ -1147,10 +1147,16 @@ int ahci_host_activate(struct ata_host *host,
> int irq, unsigned int n_msis)
> 
>         for (i = 0; i < host->n_ports; i++) {
>                 struct ahci_port_priv *pp = host->ports[i]->private_data;
> +               const char *desc;
> +
> +               if (ata_port_is_dummy(host->ports[i]))
> +                       desc = dev_driver_string(host->dev);
> +               else
> +                       desc = pp->irq_desc;
> 
>                 rc = devm_request_threaded_irq(host->dev,
>                         irq + i, ahci_hw_interrupt, ahci_thread_fn, IRQF_SHARED,
> -                       pp->irq_desc, host->ports[i]);
> +                       desc, host->ports[i]);
>                 if (rc)
>                         goto out_free_irqs;
>         }

This works, so the ports must be using ata_dummy_port_ops.  Thanks,

Alex


      reply	other threads:[~2013-07-16  2:29 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-07-14 23:57 Linux 3.11-rc1 Linus Torvalds
2013-07-15  2:03 ` Linus-next stats (Was: Linux 3.11-rc1) Stephen Rothwell
2013-07-15 16:49 ` ahci_host_activate NULL pointer (was " Alex Williamson
2013-07-15 17:38   ` Alex Williamson
     [not found]     ` <CAJn8CcGWzvL67sBeZJVCM-xfibSTyAPPD2+0hf=qGqoz0CW-Kw@mail.gmail.com>
2013-07-15 19:23       ` Alex Williamson
2013-07-15 19:44         ` Alex Williamson
2013-07-15 21:24           ` Xiaotian Feng
2013-07-16  1:02           ` Xiaotian Feng
2013-07-16  2:29             ` Alex Williamson [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1373941756.8441.324.camel@ul30vt.home \
    --to=alex.williamson@redhat.com \
    --cc=agordeev@redhat.com \
    --cc=linux-ide@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=tj@kernel.org \
    --cc=xtfeng@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.