linux-kernel.vger.kernel.org archive mirror
* Re: Oops in megaraid , kernel 2.6.14, 2.6.15 on x86_64
       [not found] <OFA777C944.9337E52B-ONC12570C1.0039A0E1-C12570C9.004D661A@avm.de>
@ 2006-01-17 22:44 ` Adrian Bunk
  2006-01-17 23:06   ` bug tracking Andrew Morton
  0 siblings, 1 reply; 5+ messages in thread
From: Adrian Bunk @ 2006-01-17 22:44 UTC
  To: m.gerth
  Cc: Neela.Kolli, linux-scsi, rtjohnso, akpm, James.Bottomley, ak,
	linux-kernel

Hi Mike,

is your problem still present in kernel 2.6.16-rc1?

cu
Adrian


On Wed, Nov 30, 2005 at 03:05:18PM +0100, m.gerth@avm.de wrote:
> Hello
> this is a repost since I have some new information for the Oops described 
> below.
> 
> System:
>         4P Opteron x86_64
>         24GB RAM
>         LSI SCSI-megaraid controller 320-2X
>         Vanilla kernel 2.6.14 oops
> 
> Tested so far: (driver version is same in all kernels)
> 2.6.13 is ok. No Problems.
> 2.6.14 /w boot option mem=4G is also ok!
> 2.6.15-rc2 similar to 2.6.14?
> 
> -----------------------------------------------------------------------------------------------------------------------------------
> Hello,
> 
> sorry for using this big list of recipients, but I'm not sure, who of you 
> might be the one, who is most interested in the problem.
> 
> I just booted vanilla 2.6.14 on a 4way Opteron (HP DL585) with 24GB RAM. 
> Kernel was compiled with gcc 3.3.5.
> MegaRaid is compiled into kernel.
> While booting and initializing megaraid I get the Oops described in the 
> following picture: 
> 
>         megaraid cmm: 2.20.2.6 (Release Date: Mon Mar 7 00:01:03 EST 2005)
>         megaraid: 2.20.4.6 (Release Date: Mon Mar 07 12:27:22 EST 2005)
>         megaraid: probe new device 0x1000:0x0407:0x1000:0x0532: bus 6:slot 0:func 0
>         GSI 17 sharing vector 0xB1 and IRQ 17
>         ACPI: PCI Interrupt 0000:06:00.0[A] -> GSI 32 (level, low) -> IRQ 177
>         megaraid: fw version:[414C] bios version:[H429]
>         scsi0 : LSI Logic MegaRAID driver
>         scsi[0]: scanning scsi channel 0 [Phy 0] for non-raid devices
>         Unable to handle kernel NULL pointer dereference at 00000000000001cd RIP:
>         PGD 0
>         Oops: 0000 [1] SMP
>         CPU 3
>         Modules linked in:
>         Pid: 0, comm: swapper  Not tainted 2.6.14 #13
>         .....
> 
> 
> Tested so far: (driver version is same in all kernels)
> 2.6.13 is ok. No Problems.
> 2.6.14 /w boot option mem=4G is also ok!
> 2.6.15-rc2 similar to 2.6.14?
> 
> Output 2.6.15-rc2:::
> megaraid cmm: 2.20.2.6 (Release Date: Mon Mar 7 00:01:03 EST 2005)
> megaraid: 2.20.4.6 (Release Date: Mon Mar 07 12:27:22 EST 2005)
> megaraid: probe new device 0x1000:0x0407:0x1000:0x0532: bus 6:slot 0:func 0
> GSI 17 sharing vector 0xB1 and IRQ 17
> ACPI: PCI Interrupt 0000:06:00.0[A] -> GSI 32 (level, low) -> IRQ 177
> megaraid: fw version:[414C] bios version:[H429]
> scsi0 : LSI Logic MegaRAID driver
> scsi[0]: scanning scsi channel 0 [Phy 0] for non-raid devices
>   Vendor: COMPAQ    Model: PROLIANT 4L2I DT  Rev: 1.86
>   Type:   Processor                          ANSI SCSI revision: 02
> scsi[0]: scanning scsi channel 1 [Phy 1] for non-raid devices
> scheduling while atomic: swapper/0x00000100/0
> 
> Call Trace: <IRQ> <ffffffff8039153a>{schedule+122} <ffffffff80193e0f>{__d_lookup+159}
>        <ffffffff80392e98>{__down+152} <ffffffff8012dee0>{default_wake_function+0}
>        <ffffffff80392c8a>{__down_failed+53} <ffffffff802224d0>{kobject_release+0}
>        <ffffffff802c8260>{.text.lock.main+25} <ffffffff802c2f0d>{device_del+93}
>        <ffffffff802f82ce>{scsi_target_reap+142} <ffffffff802f9aa5>{scsi_device_dev_release+261}
>        <ffffffff802c297c>{device_release+28} <ffffffff80222474>{kobject_cleanup+100}
>        <ffffffff802224d0>{kobject_release+0} <ffffffff80222eb1>{kref_put+129}
>        <ffffffff802f648f>{scsi_end_request+207} <ffffffff802f69ef>{scsi_io_completion+1039}
>        <ffffffff802f1ba6>{scsi_finish_command+150} <ffffffff802f1abd>{scsi_softirq+333}
>        <ffffffff801385eb>{__do_softirq+107} <ffffffff8010ef2b>{call_softirq+31}
>        <ffffffff801107c1>{do_softirq+49} <ffffffff80110784>{do_IRQ+52}
>        <ffffffff8010e14c>{ret_from_intr+0}  <EOI> <ffffffff80391b60>{thread_return+0}
>        <ffffffff8010bb1a>{default_idle+58} <ffffffff8010bd71>{cpu_idle+97}
> 
>   Vendor: COMPAQ    Model: PROLIANT 4L2I DB  Rev: 1.86
>   Type:   Processor                          ANSI SCSI revision: 02
> scsi[0]: scanning scsi channel 2 [virtual] for logical drives
>   Vendor: MegaRAID  Model: LD 0 RAID1  280G  Rev: 414C
>   Type:   Direct-Access                      ANSI SCSI revision: 02
>   Vendor: MegaRAID  Model: LD 1 RAID1  280G  Rev: 414C
>   Type:   Direct-Access                      ANSI SCSI revision: 02
> libata version 1.20 loaded.
> SCSI device sda: 573440000 512-byte hdwr sectors (293601 MB)
> sda: asking for cache data failed
> sda: assuming drive cache: write through
> SCSI device sda: 573440000 512-byte hdwr sectors (293601 MB)
> sda: asking for cache data failed
> sda: assuming drive cache: write through
>  sda: sda1 sda2 sda3 < sda5 sda6 sda7 sda8 sda9 sda10 sda11 >
> sd 0:2:0:0: Attached scsi disk sda
> SCSI device sdb: 573440000 512-byte hdwr sectors (293601 MB)
> sdb: asking for cache data failed
> sdb: assuming drive cache: write through
> SCSI device sdb: 573440000 512-byte hdwr sectors (293601 MB)
> sdb: asking for cache data failed
> sdb: assuming drive cache: write through
>  sdb: sdb1 sdb2 sdb3
> sd 0:2:1:0: Attached scsi disk sdb
> 
> Thank you for your help in advance,
> Regards,
> Mike Gerth


* bug tracking
  2006-01-17 22:44 ` Oops in megaraid , kernel 2.6.14, 2.6.15 on x86_64 Adrian Bunk
@ 2006-01-17 23:06   ` Andrew Morton
  2006-01-17 23:11     ` Adrian Bunk
  0 siblings, 1 reply; 5+ messages in thread
From: Andrew Morton @ 2006-01-17 23:06 UTC
  To: Adrian Bunk; +Cc: linux-kernel

Adrian Bunk <bunk@stusta.de> wrote:
>
> is your problem still present in kernel 2.6.16-rc1?
>

What I've been doing for email bug reports is

a) Respond if I can, forward to maintainer, save them away

b) Wait a while (weeks to months)

c) Privately email the reporter, saying "if this bug is still present in
   2.6.15, please raise a report at bugzilla.kernel.org"


I've sent 100-200 of these emails in the past few days.  Of the people
who've responded, the great majority of these bugs were actually fixed, which
is nice.

My overall intent here is: if the bug isn't quickly resolved, get it moved
from email into bugzilla, where we can sanely keep track of its status.

For long-term bug tracking, I want to only track bugzilla-based bugs.  It
just gets too insane trying to follow the status of email-based reports.


What I'm doing locally is tracking all the bugzilla bugs which I think need
addressing, categorised by subsystem.  Go through them periodically, toss
out the ones which have been fixed.  I have a few hundred reports to go
through and I plan on getting nicely-collated per-subsystem reports out to
the mailing lists soon - probably next week.

I'm not tracking acpi at all - the acpi guys are doing that well and there
are too damn many of them ;)

What I'm not bothering to do is to close off or to reject fixed or
uninteresting bug reports.  I just silently ignore them.  Which means that
a bugzilla-based query will toss up a lot of noise, which is sad.  And I'm
not ensuring that all bugs are categorised as well as they could be - I do
that locally.  It'd be nice to do these things, but it's dull and
time-consuming.

I do expect and hope that subsystem maintainers and developers will put
bugs into appropriate states as they work on them - most people are good
about this, but it varies a lot.




* Re: bug tracking
  2006-01-17 23:06   ` bug tracking Andrew Morton
@ 2006-01-17 23:11     ` Adrian Bunk
  2006-01-17 23:50       ` Andrew Morton
  0 siblings, 1 reply; 5+ messages in thread
From: Adrian Bunk @ 2006-01-17 23:11 UTC
  To: Andrew Morton; +Cc: linux-kernel

On Tue, Jan 17, 2006 at 03:06:12PM -0800, Andrew Morton wrote:
>...
> c) Privately email the reporter, saying "if this bug is still present in
>    2.6.15, please raise a report at bugzilla.kernel.org"
> 
> I've sent 100-200 of these emails in the past few days.  Of the people
> who've responded, the great majority of these bugs were actually fixed, which
> is nice.
>...

Private emails have the disadvantage that no one else sees them.

Does this imply that it can be assumed that you are tracking all 
unresolved bug reports sent to linux-kernel until they are either 
resolved or in Bugzilla?

cu
Adrian

-- 

       "Is there not promise of rain?" Ling Tan asked suddenly out
        of the darkness. There had been need of rain for many days.
       "Only a promise," Lao Er said.
                                       Pearl S. Buck - Dragon Seed



* Re: bug tracking
  2006-01-17 23:11     ` Adrian Bunk
@ 2006-01-17 23:50       ` Andrew Morton
  2006-01-18  8:07         ` Adrian Bunk
  0 siblings, 1 reply; 5+ messages in thread
From: Andrew Morton @ 2006-01-17 23:50 UTC
  To: Adrian Bunk; +Cc: linux-kernel

Adrian Bunk <bunk@stusta.de> wrote:
>
> On Tue, Jan 17, 2006 at 03:06:12PM -0800, Andrew Morton wrote:
> >...
> > c) Privately email the reporter, saying "if this bug is still present in
> >    2.6.15, please raise a report at bugzilla.kernel.org"
> > 
> > I've sent 100-200 of these emails in the past few days.  Of the people
> > who've responded, the great majority of these bugs were actually fixed, which
> > is nice.
> >...
> 
> Private emails have the disadvantage that no one else sees them.

Sure.  But given that I've given the reporter only two options:

a) Tell me it's fixed or

b) take it to bugzilla

I don't think there's much of interest to anyone else.  If the reporter
chooses to be awkward and starts going into details then yeah, cc's need to
be re-added.

> Does this imply that it can be assumed that you are tracking all 
> unresolved bug reports sent to linux-kernel until they are either 
> resolved or in Bugzilla?

No, I can miss stuff.  And there are lots more mailing lists.  Many of
them for drivers, which is where most of the bugs are.  

So please don't let me discourage you from doing this - if a reporter gets
multiple emails regarding a bug, he's unlikely to be offended - it's heaps
better than zero emails!  I'll cc you in future if you like so we can avoid
duplication.


* Re: bug tracking
  2006-01-17 23:50       ` Andrew Morton
@ 2006-01-18  8:07         ` Adrian Bunk
  0 siblings, 0 replies; 5+ messages in thread
From: Adrian Bunk @ 2006-01-18  8:07 UTC
  To: Andrew Morton; +Cc: linux-kernel

On Tue, Jan 17, 2006 at 03:50:13PM -0800, Andrew Morton wrote:
>...
> So please don't let me discourage you from doing this - if a reporter gets
> multiple emails regarding a bug, he's unlikely to be offended - it's heaps
> better than zero emails!  I'll cc you in future if you like so we can avoid
> duplication.

You can Cc me, but there might be other people with the same problem.

But thinking about it, the real issue seems to be that the Linux kernel 
is the only open source project of this size I know about where bug 
reporters are not encouraged to enter all bugs in a bug tracking system 
resulting in manual work to track them.

There are classes of bugs where Bugzilla doesn't bring much ("your patch 
xyz.patch in the latest -mm breaks compilation on i386"), but for most 
bug reports it would be nice. The problem is that while some subsystem 
maintainers use the kernel Bugzilla, others don't want to use it and 
want bug reports sent to mailing lists instead.

I know that some kernel developers do not like Bugzilla, but the kernel
Bugzilla has email interfaces in both directions and it should therefore 
be possible to integrate into existing workflows without too much
overhead.

This way, the bookkeeping of bugs is done automatically, and although 
this does not automatically fix bugs I've e.g. had some cases where I 
told maintainers "please review your 19 open bugs in the kernel 
Bugzilla" resulting in maintainers actually reviewing and fixing bugs. 
This included cases where the old maintainer was inactive and a new 
maintainer has taken over maintainership.

How can we get to the state that there's a kernel bug tracking system 
every maintainer is using?

cu
Adrian

-- 

       "Is there not promise of rain?" Ling Tan asked suddenly out
        of the darkness. There had been need of rain for many days.
       "Only a promise," Lao Er said.
                                       Pearl S. Buck - Dragon Seed

