* LSI MegaRAID not recognised correctly in 64-bit.  2.6.29.6
@ 2009-09-16 16:18 Eddie
  2009-09-16 16:31 ` James Bottomley
  0 siblings, 1 reply; 18+ messages in thread
From: Eddie @ 2009-09-16 16:18 UTC (permalink / raw)
  To: linux-scsi

I've done a quick search of the archives, and didn't find anything 
relevant, or I missed it.  :(

I've just installed the latest Slackware release, 13.0, in 64-bit mode, 
that uses kernel 2.6.29.6.  Previously, I was running 12.2, in 32-bit, 
which used kernel 2.6.27.31.  After booting, I noticed that the drives 
on the raid had not been mounted.  Checking through dmesg, I noticed 
that the "drive" attached to the card had not been correctly 
recognised.  Here's the relevant messages, about the card, and the 
drive, from dmesg:

megaraid cmm: 2.20.2.7 (Release Date: Sun Jul 16 00:01:03 EST 2006)

megaraid: 2.20.5.1 (Release Date: Thu Nov 16 15:32:35 EST 2006)
megaraid: probe new device 0x101e:0x1960:0x101e:0x0511: bus 1:slot 4:func 0

megaraid 0000:01:04.0: PCI INT A -> Link[LNK1] -> GSI 19 (level, high) -> IRQ 19
megaraid: fw version:[N661] bios version:[1.01]
scsi4 : LSI Logic MegaRAID driver
scsi[4]: scanning scsi channel 0 [Phy 0] for non-raid devices

scsi[4]: scanning scsi channel 1 [Phy 1] for non-raid devices
scsi[4]: scanning scsi channel 2 [Phy 2] for non-raid devices
scsi[4]: scanning scsi channel 3 [Phy 3] for non-raid devices
scsi[4]: scanning scsi channel 4 [virtual] for logical drives
scsi scan: INQUIRY result too short (5), using 36
scsi 4:4:0:0: Direct-Access                                    PQ: 0 ANSI: 0
sd 4:4:0:0: [sda] Sector size 0 reported, assuming 512.
sd 4:4:0:0: [sda] 1 512-byte hardware sectors: (512 B/512 B)
sd 4:4:0:0: [sda] Write Protect is off
sd 4:4:0:0: [sda] Mode Sense: 00 00 00 00
sd 4:4:0:0: [sda] Asking for cache data failed
sd 4:4:0:0: [sda] Assuming drive cache: write through
sd 4:4:0:0: [sda] Sector size 0 reported, assuming 512.
sd 4:4:0:0: [sda] 1 512-byte hardware sectors: (512 B/512 B)
sd 4:4:0:0: [sda] Write Protect is off
sd 4:4:0:0: [sda] Mode Sense: 00 00 00 00
sd 4:4:0:0: [sda] Asking for cache data failed
sd 4:4:0:0: [sda] Assuming drive cache: write through
 sda: unknown partition table
sd 4:4:0:0: [sda] Attached SCSI disk
sd 4:4:0:0: Attached scsi generic sg1 type 0


Obviously, it's this "scsi scan: INQUIRY result too short (5), using 
36", causing the drive not to be correctly recognised.  Normally, I'd 
expect to get:

scsi 4:4:0:0: Direct-Access     MegaRAID LD 0 RAID5 1430G N661 PQ: 0 ANSI: 2
sd 4:4:0:0: [sda] 2930307072 512-byte hardware sectors (1500317 MB)
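
As a sanity check on those two figures, the sector count and the size in MB can be related with a few lines of user-space C. This is only an illustrative sketch: the helper names are made up for the example and are not kernel functions. The fallback to 512 bytes mirrors the "Sector size 0 reported, assuming 512" message in the failing log above.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative helpers (hypothetical names, not kernel code). */

/* Mirror the log message "Sector size 0 reported, assuming 512". */
static uint32_t effective_sector_size(uint32_t reported)
{
	return reported ? reported : 512;
}

/* Total capacity in bytes for `sectors` blocks of `sector_size` bytes. */
static uint64_t capacity_bytes(uint64_t sectors, uint32_t sector_size)
{
	uint32_t size = effective_sector_size(sector_size);

	/* Guard against 64-bit overflow in the multiplication. */
	assert(sectors <= UINT64_MAX / size);
	return sectors * size;
}

/* Capacity in whole decimal megabytes (10^6 bytes), truncated. */
static uint64_t capacity_mb(uint64_t sectors, uint32_t sector_size)
{
	return capacity_bytes(sectors, sector_size) / 1000000u;
}
```

With the working values, capacity_mb(2930307072, 512) gives 1500317, which matches the expected line. With the failing values (1 sector, sector size reported as 0), capacity_bytes() gives 512, which matches the "(512 B/512 B)" in the broken output.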

In order to eliminate either the change in kernel version, or the change 
in architecture, as the culprit, I booted the machine, from both the 
32-bit, and 64-bit, DVDs from Slack.  With the 32-bit kernel, the drive 
was correctly recognised, but not with the 64-bit.  So, it's definitely 
the change in architecture causing this, not the change in kernel version.

The full dmesg can be found here:  ftp.BogoLinux.net/pub together with 
the dmesg from the 32-bit system, with kernel 2.6.27.31.

Cheers,
Eddie

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: LSI MegaRAID not recognised correctly in 64-bit.  2.6.29.6
  2009-09-16 16:18 LSI MegaRAID not recognised correctly in 64-bit. 2.6.29.6 Eddie
@ 2009-09-16 16:31 ` James Bottomley
  2009-09-16 16:54   ` Eddie
  0 siblings, 1 reply; 18+ messages in thread
From: James Bottomley @ 2009-09-16 16:31 UTC (permalink / raw)
  To: stunnel; +Cc: linux-scsi

On Wed, 2009-09-16 at 09:18 -0700, Eddie wrote:
> I've done a quick search of the archives, and didn't find anything 
> relevant, or I missed it.  :(
> 
> I've just installed the latest Slackware release, 13.0, in 64-bit mode, 
> that uses kernel 2.6.29.6.  Previously, I was running 12.2, in 32-bit, 
> which used kernel 2.6.27.31.  After booting, I noticed that the drives 
> on the raid had not been mounted.  Checking through dmesg, I noticed 
> that the "drive" attached to the card had not been correctly 
> recognised.  Here's the relevant messages, about the card, and the 
> drive, from dmesg:
> 
> megaraid cmm: 2.20.2.7 (Release Date: Sun Jul 16 00:01:03 EST 2006)
> 
> megaraid: 2.20.5.1 (Release Date: Thu Nov 16 15:32:35 EST 2006)
> megaraid: probe new device 0x101e:0x1960:0x101e:0x0511: bus 1:slot 4:func 0
> 
> megaraid 0000:01:04.0: PCI INT A -> Link[LNK1] -> GSI 19 (level, high) -> IRQ 19
> megaraid: fw version:[N661] bios version:[1.01]
> scsi4 : LSI Logic MegaRAID driver
> scsi[4]: scanning scsi channel 0 [Phy 0] for non-raid devices
> 
> scsi[4]: scanning scsi channel 1 [Phy 1] for non-raid devices
> scsi[4]: scanning scsi channel 2 [Phy 2] for non-raid devices
> scsi[4]: scanning scsi channel 3 [Phy 3] for non-raid devices
> scsi[4]: scanning scsi channel 4 [virtual] for logical drives
> scsi scan: INQUIRY result too short (5), using 36
> scsi 4:4:0:0: Direct-Access                                    PQ: 0 ANSI: 0
> sd 4:4:0:0: [sda] Sector size 0 reported, assuming 512.
> sd 4:4:0:0: [sda] 1 512-byte hardware sectors: (512 B/512 B)
> sd 4:4:0:0: [sda] Write Protect is off
> sd 4:4:0:0: [sda] Mode Sense: 00 00 00 00
> sd 4:4:0:0: [sda] Asking for cache data failed
> sd 4:4:0:0: [sda] Assuming drive cache: write through
> sd 4:4:0:0: [sda] Sector size 0 reported, assuming 512.
> sd 4:4:0:0: [sda] 1 512-byte hardware sectors: (512 B/512 B)
> sd 4:4:0:0: [sda] Write Protect is off
> sd 4:4:0:0: [sda] Mode Sense: 00 00 00 00
> sd 4:4:0:0: [sda] Asking for cache data failed
> sd 4:4:0:0: [sda] Assuming drive cache: write through
>  sda: unknown partition table
> sd 4:4:0:0: [sda] Attached SCSI disk
> sd 4:4:0:0: Attached scsi generic sg1 type 0
> 
> 
> Obviously, it's this "scsi scan: INQUIRY result too short (5), using 
> 36", causing the drive not to be correctly recognised.  Normally, I'd 
> expect to get:
> 
> scsi 4:4:0:0: Direct-Access     MegaRAID LD 0 RAID5 1430G N661 PQ: 0 ANSI: 2
> sd 4:4:0:0: [sda] 2930307072 512-byte hardware sectors (1500317 MB)
> 
> In order to eliminate either the change in kernel version, or the change 
> in architecture, as the culprit, I booted the machine, from both the 
> 32-bit, and 64-bit, DVDs from Slack.  With the 32-bit kernel, the drive 
> was correctly recognised, but not with the 64-bit.  So, it's definitely 
> the change in architecture causing this, not the change in kernel version.
> 
> The full dmesg can be found here:  ftp.BogoLinux.net/pub together with 
> the dmesg from the 32-bit system, with kernel 2.6.27.31.

How much memory does your system have?

Best guess in the 64 bit case is that the physical memory the kernel is
doing DMA to isn't within the range of the card.  You might be able to
test this by booting with the max_addr=4G parameter in the 64 bit case.

If it is, we'll have to get the DMA mask for this thing set up
correctly.
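
To make "within the range of the card" concrete, here is a minimal user-space sketch of the test a DMA mask implies. It is not kernel code: dma_bit_mask() only imitates the idea behind the kernel's DMA_BIT_MASK() macro, and buffer_reachable() is a hypothetical helper name. The addresses in the usage note are made-up examples.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Mask with the low `bits` bits set (hypothetical helper). */
static uint64_t dma_bit_mask(unsigned int bits)
{
	assert(bits >= 1 && bits <= 64);
	return bits == 64 ? UINT64_MAX : (UINT64_C(1) << bits) - 1;
}

/*
 * True if every byte of the buffer [addr, addr + len) can be addressed
 * by a device limited to `mask`. The last byte is what matters, so a
 * buffer that starts below the limit but crosses it is rejected too.
 */
static bool buffer_reachable(uint64_t addr, uint64_t len, uint64_t mask)
{
	uint64_t last = addr + (len ? len - 1 : 0);

	if (last < addr)	/* address arithmetic wrapped around */
		return false;
	return last <= mask;
}
```

Under a 32-bit mask, a 4096-byte buffer at 0xfffff000 is reachable, the same buffer at 0x180000000 (above 4 GiB) is not, and an 8192-byte buffer at 0xfffff000 is not either because it crosses the 4 GiB boundary. Limiting the memory the kernel uses to 4 GiB keeps every buffer in the first category, which is why it works as a test.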

James




* Re: LSI MegaRAID not recognised correctly in 64-bit.  2.6.29.6
  2009-09-16 16:31 ` James Bottomley
@ 2009-09-16 16:54   ` Eddie
  2009-09-17  2:58     ` Eddie
  2009-09-17  2:59     ` Eddie
  0 siblings, 2 replies; 18+ messages in thread
From: Eddie @ 2009-09-16 16:54 UTC (permalink / raw)
  To: James Bottomley; +Cc: linux-scsi

James Bottomley wrote:
> On Wed, 2009-09-16 at 09:18 -0700, Eddie wrote:
>   
>> I've done a quick search of the archives, and didn't find anything 
>> relevant, or I missed it.  :(
>>
>> I've just installed the latest Slackware release, 13.0, in 64-bit mode, 
>> that uses kernel 2.6.29.6.  Previously, I was running 12.2, in 32-bit, 
>> which used kernel 2.6.27.31.  After booting, I noticed that the drives 
>> on the raid had not been mounted.  Checking through dmesg, I noticed 
>> that the "drive" attached to the card had not been correctly 
>> recognised.  Here's the relevant messages, about the card, and the 
>> drive, from dmesg:
>>
>> megaraid cmm: 2.20.2.7 (Release Date: Sun Jul 16 00:01:03 EST 2006)
>>
>> megaraid: 2.20.5.1 (Release Date: Thu Nov 16 15:32:35 EST 2006)
>> megaraid: probe new device 0x101e:0x1960:0x101e:0x0511: bus 1:slot 4:func 0
>>
>> megaraid 0000:01:04.0: PCI INT A -> Link[LNK1] -> GSI 19 (level, high) -> IRQ 19
>> megaraid: fw version:[N661] bios version:[1.01]
>> scsi4 : LSI Logic MegaRAID driver
>> scsi[4]: scanning scsi channel 0 [Phy 0] for non-raid devices
>>
>> scsi[4]: scanning scsi channel 1 [Phy 1] for non-raid devices
>> scsi[4]: scanning scsi channel 2 [Phy 2] for non-raid devices
>> scsi[4]: scanning scsi channel 3 [Phy 3] for non-raid devices
>> scsi[4]: scanning scsi channel 4 [virtual] for logical drives
>> scsi scan: INQUIRY result too short (5), using 36
>> scsi 4:4:0:0: Direct-Access                                    PQ: 0 ANSI: 0
>> sd 4:4:0:0: [sda] Sector size 0 reported, assuming 512.
>> sd 4:4:0:0: [sda] 1 512-byte hardware sectors: (512 B/512 B)
>> sd 4:4:0:0: [sda] Write Protect is off
>> sd 4:4:0:0: [sda] Mode Sense: 00 00 00 00
>> sd 4:4:0:0: [sda] Asking for cache data failed
>> sd 4:4:0:0: [sda] Assuming drive cache: write through
>> sd 4:4:0:0: [sda] Sector size 0 reported, assuming 512.
>> sd 4:4:0:0: [sda] 1 512-byte hardware sectors: (512 B/512 B)
>> sd 4:4:0:0: [sda] Write Protect is off
>> sd 4:4:0:0: [sda] Mode Sense: 00 00 00 00
>> sd 4:4:0:0: [sda] Asking for cache data failed
>> sd 4:4:0:0: [sda] Assuming drive cache: write through
>>  sda: unknown partition table
>> sd 4:4:0:0: [sda] Attached SCSI disk
>> sd 4:4:0:0: Attached scsi generic sg1 type 0
>>
>>
>> Obviously, it's this "scsi scan: INQUIRY result too short (5), using 
>> 36", causing the drive not to be correctly recognised.  Normally, I'd 
>> expect to get:
>>
>> scsi 4:4:0:0: Direct-Access     MegaRAID LD 0 RAID5 1430G N661 PQ: 0 ANSI: 2
>> sd 4:4:0:0: [sda] 2930307072 512-byte hardware sectors (1500317 MB)
>>
>> In order to eliminate either the change in kernel version, or the change 
>> in architecture, as the culprit, I booted the machine, from both the 
>> 32-bit, and 64-bit, DVDs from Slack.  With the 32-bit kernel, the drive 
>> was correctly recognised, but not with the 64-bit.  So, it's definitely 
>> the change in architecture causing this, not the change in kernel version.
>>
>> The full dmesg can be found here:  ftp.BogoLinux.net/pub together with 
>> the dmesg from the 32-bit system, with kernel 2.6.27.31.
>>     
>
> How much memory does your system have?
>
> Best guess in the 64 bit case is that the physical memory the kernel is
> doing DMA to isn't within the range of the card.  You might be able to
> test this by booting with the max_addr=4G parameter in the 64 bit case.
>
> If it is, we'll have to get the DMA mask for this thing set up
> correctly.
>
> James
>   

James,

It's got 8Gig.

I'll try your suggestion tonight, when I get home.

Cheers,
Eddie


* Re: LSI MegaRAID not recognised correctly in 64-bit.  2.6.29.6
  2009-09-16 16:54   ` Eddie
  2009-09-17  2:58     ` Eddie
@ 2009-09-17  2:59     ` Eddie
  2009-09-17 15:59       ` James Bottomley
  1 sibling, 1 reply; 18+ messages in thread
From: Eddie @ 2009-09-17  2:59 UTC (permalink / raw)
  Cc: James Bottomley, linux-scsi

>
>> How much memory does your system have?
>>
>> Best guess in the 64 bit case is that the physical memory the kernel is
>> doing DMA to isn't within the range of the card.  You might be able to
>> test this by booting with the max_addr=4G parameter in the 64 bit case.
>>
>> If it is, we'll have to get the DMA mask for this thing set up
>> correctly.
>>
>> James
>>   
>
> James,
>
> It's got 8Gig.
>
> I'll try your suggestion tonight, when I get home.
>
> Cheers,
> Eddie
>
OK, adding addappend = " max_addr=4G" to my lilo.conf made no 
difference.  It still booted with all 8G.  :(

But, changing it to addappend = " mem=4G" seemed to do the trick.

And, your guess might be correct.  I now see the correct messages for 
the MegaRAID:

scsi 4:4:0:0: Direct-Access     MegaRAID LD 0 RAID5 1430G N661 PQ: 0 ANSI: 2
sd 4:4:0:0: [sda] 2930307072 512-byte hardware sectors: (1.50 TB/1.36 TiB)
sd 4:4:0:0: [sda] Write Protect is off
sd 4:4:0:0: [sda] Mode Sense: 00 00 00 00
sd 4:4:0:0: [sda] Asking for cache data failed
sd 4:4:0:0: [sda] Assuming drive cache: write through
sd 4:4:0:0: [sda] 2930307072 512-byte hardware sectors: (1.50 TB/1.36 TiB)
sd 4:4:0:0: [sda] Write Protect is off
sd 4:4:0:0: [sda] Mode Sense: 00 00 00 00
sd 4:4:0:0: [sda] Asking for cache data failed
sd 4:4:0:0: [sda] Assuming drive cache: write through
 sda: sda1
sd 4:4:0:0: [sda] Attached SCSI disk
sd 4:4:0:0: Attached scsi generic sg1 type 0

Cheers,
Eddie


* Re: LSI MegaRAID not recognised correctly in 64-bit.  2.6.29.6
  2009-09-17  2:59     ` Eddie
@ 2009-09-17 15:59       ` James Bottomley
  2009-09-17 16:09         ` Eddie
  0 siblings, 1 reply; 18+ messages in thread
From: James Bottomley @ 2009-09-17 15:59 UTC (permalink / raw)
  To: stunnel; +Cc: linux-scsi, linux-box

On Wed, 2009-09-16 at 19:59 -0700, Eddie wrote:
> >
> >> How much memory does your system have?
> >>
> >> Best guess in the 64 bit case is that the physical memory the kernel is
> >> doing DMA to isn't within the range of the card.  You might be able to
> >> test this by booting with the max_addr=4G parameter in the 64 bit case.
> >>
> >> If it is, we'll have to get the DMA mask for this thing set up
> >> correctly.
> >>
> >> James
> >>   
> >
> > James,
> >
> > It's got 8Gig.
> >
> > I'll try your suggestion tonight, when I get home.
> >
> > Cheers,
> > Eddie
> >
> OK, adding addappend = " max_addr=4G" to my lilo.conf made no 
> difference.  It still booted with all 8G.  :(
> 
> But, changing it to addappend = " mem=4G" seemed to do the trick.
> 
> And, your guess might be correct.  I now see the correct messages for 
> the MegaRAID:
> 
> scsi 4:4:0:0: Direct-Access     MegaRAID LD 0 RAID5 1430G N661 PQ: 0 ANSI: 2
> sd 4:4:0:0: [sda] 2930307072 512-byte hardware sectors: (1.50 TB/1.36 TiB)
> sd 4:4:0:0: [sda] Write Protect is off
> sd 4:4:0:0: [sda] Mode Sense: 00 00 00 00
> sd 4:4:0:0: [sda] Asking for cache data failed
> sd 4:4:0:0: [sda] Assuming drive cache: write through
> sd 4:4:0:0: [sda] 2930307072 512-byte hardware sectors: (1.50 TB/1.36 TiB)
> sd 4:4:0:0: [sda] Write Protect is off
> sd 4:4:0:0: [sda] Mode Sense: 00 00 00 00
> sd 4:4:0:0: [sda] Asking for cache data failed
> sd 4:4:0:0: [sda] Assuming drive cache: write through
>  sda: sda1
> sd 4:4:0:0: [sda] Attached SCSI disk
> sd 4:4:0:0: Attached scsi generic sg1 type 0

Hmm, so the driver looks to do this correctly.  By default it sets a 32
bit DMA mask but it raises it to 64 bits for certain boards which can
support that (based on the PCI ids).  Can you do an lspci -n -v and send
the output?  That will tell me whether the board got a 64 bit mask.
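
The selection logic described here (32-bit by default, 64-bit only for boards on a known list) can be sketched in user-space C as follows. This is an assumption-laden illustration, not the megaraid driver's code: the struct, the function name and above all the table contents are hypothetical, and the single table entry uses the invalid vendor ID 0xffff as a placeholder rather than any real board.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* The four IDs printed by lspci -n -v and the driver's probe message. */
struct pci_id {
	uint16_t vendor, device, subvendor, subdevice;
};

/*
 * HYPOTHETICAL list of boards allowed a 64-bit DMA mask.
 * The entry below is a placeholder, not a real device.
 */
static const struct pci_id wide_dma_boards[] = {
	{ 0xffff, 0x0001, 0xffff, 0x0001 },
};

/* Return the DMA mask width in bits for a board: 64 if listed, else 32. */
static unsigned int pick_dma_bits(const struct pci_id *id)
{
	size_t n = sizeof(wide_dma_boards) / sizeof(wide_dma_boards[0]);

	assert(id != NULL);
	for (size_t i = 0; i < n; i++) {
		const struct pci_id *w = &wide_dma_boards[i];

		if (w->vendor == id->vendor && w->device == id->device &&
		    w->subvendor == id->subvendor &&
		    w->subdevice == id->subdevice)
			return 64;
	}
	return 32;	/* default: assume the board can only reach 4 GiB */
}
```

Looking up the IDs from the probe message in this thread (0x101e:0x1960:0x101e:0x0511) returns 32 here simply because the placeholder table does not list them. Which mask the real driver picks depends on its own ID list, which is why the lspci output is needed.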

Thanks,

James




* Re: LSI MegaRAID not recognised correctly in 64-bit.  2.6.29.6
  2009-09-17 15:59       ` James Bottomley
@ 2009-09-17 16:09         ` Eddie
  2009-09-17 18:39           ` James Bottomley
  0 siblings, 1 reply; 18+ messages in thread
From: Eddie @ 2009-09-17 16:09 UTC (permalink / raw)
  To: James Bottomley; +Cc: linux-scsi, linux-box

James Bottomley wrote:
> On Wed, 2009-09-16 at 19:59 -0700, Eddie wrote:
>   
>>>> How much memory does your system have?
>>>>
>>>> Best guess in the 64 bit case is that the physical memory the kernel is
>>>> doing DMA to isn't within the range of the card.  You might be able to
>>>> test this by booting with the max_addr=4G parameter in the 64 bit case.
>>>>
>>>> If it is, we'll have to get the DMA mask for this thing set up
>>>> correctly.
>>>>
>>>> James
>>>>   
>>>>         
>>> James,
>>>
>>> It's got 8Gig.
>>>
>>> I'll try your suggestion tonight, when I get home.
>>>
>>> Cheers,
>>> Eddie
>>>
>>>       
>> OK, adding addappend = " max_addr=4G" to my lilo.conf made no 
>> difference.  It still booted with all 8G.  :(
>>
>> But, changing it to addappend = " mem=4G" seemed to do the trick.
>>
>> And, your guess might be correct.  I now see the correct messages for 
>> the MegaRAID:
>>
>> scsi 4:4:0:0: Direct-Access     MegaRAID LD 0 RAID5 1430G N661 PQ: 0 ANSI: 2
>> sd 4:4:0:0: [sda] 2930307072 512-byte hardware sectors: (1.50 TB/1.36 TiB)
>> sd 4:4:0:0: [sda] Write Protect is off
>> sd 4:4:0:0: [sda] Mode Sense: 00 00 00 00
>> sd 4:4:0:0: [sda] Asking for cache data failed
>> sd 4:4:0:0: [sda] Assuming drive cache: write through
>> sd 4:4:0:0: [sda] 2930307072 512-byte hardware sectors: (1.50 TB/1.36 TiB)
>> sd 4:4:0:0: [sda] Write Protect is off
>> sd 4:4:0:0: [sda] Mode Sense: 00 00 00 00
>> sd 4:4:0:0: [sda] Asking for cache data failed
>> sd 4:4:0:0: [sda] Assuming drive cache: write through
>>  sda: sda1
>> sd 4:4:0:0: [sda] Attached SCSI disk
>> sd 4:4:0:0: Attached scsi generic sg1 type 0
>>     
>
> Hmm, so the driver looks to do this correctly.  By default it sets a 32
> bit DMA mask but it raises it to 64 bits for certain boards which can
> support that (based on the PCI ids).  Can you do an lspci -n -v and send
> the output?  That will tell me whether the board got a 64 bit mask.
>
> Thanks,
>
> James
>
>   
James,

This is when booted with the "mem=4G" override still in place.  If you 
need it without that, when it fails to "see" the device, let me know, 
and I'll re-boot tonight to gather it:

01:04.0 0104: 101e:1960 (rev 02)
        Subsystem: 101e:0511
        Flags: bus master, fast Back2Back, medium devsel, latency 64, IRQ 19
        Memory at b8000000 (32-bit, prefetchable) [size=64K]
        [virtual] Expansion ROM at b8010000 [disabled] [size=32K]
        Capabilities: [80] Power Management version 2
        Kernel driver in use: megaraid
        Kernel modules: megaraid_mbox

Cheers,
Eddie



* Re: LSI MegaRAID not recognised correctly in 64-bit.  2.6.29.6
  2009-09-17 16:09         ` Eddie
@ 2009-09-17 18:39           ` James Bottomley
  2009-09-17 20:28             ` James Bottomley
  0 siblings, 1 reply; 18+ messages in thread
From: James Bottomley @ 2009-09-17 18:39 UTC (permalink / raw)
  To: stunnel; +Cc: linux-scsi, linux-box

On Thu, 2009-09-17 at 09:09 -0700, Eddie wrote:
> James Bottomley wrote:
> > On Wed, 2009-09-16 at 19:59 -0700, Eddie wrote:
> >   
> >>>> How much memory does your system have?
> >>>>
> >>>> Best guess in the 64 bit case is that the physical memory the kernel is
> >>>> doing DMA to isn't within the range of the card.  You might be able to
> >>>> test this by booting with the max_addr=4G parameter in the 64 bit case.
> >>>>
> >>>> If it is, we'll have to get the DMA mask for this thing set up
> >>>> correctly.
> >>>>
> >>>> James
> >>>>   
> >>>>         
> >>> James,
> >>>
> >>> It's got 8Gig.
> >>>
> >>> I'll try your suggestion tonight, when I get home.
> >>>
> >>> Cheers,
> >>> Eddie
> >>>
> >>>       
> >> OK, adding addappend = " max_addr=4G" to my lilo.conf made no 
> >> difference.  It still booted with all 8G.  :(
> >>
> >> But, changing it to addappend = " mem=4G" seemed to do the trick.
> >>
> >> And, your guess might be correct.  I now see the correct messages for 
> >> the MegaRAID:
> >>
> >> scsi 4:4:0:0: Direct-Access     MegaRAID LD 0 RAID5 1430G N661 PQ: 0 ANSI: 2
> >> sd 4:4:0:0: [sda] 2930307072 512-byte hardware sectors: (1.50 TB/1.36 TiB)
> >> sd 4:4:0:0: [sda] Write Protect is off
> >> sd 4:4:0:0: [sda] Mode Sense: 00 00 00 00
> >> sd 4:4:0:0: [sda] Asking for cache data failed
> >> sd 4:4:0:0: [sda] Assuming drive cache: write through
> >> sd 4:4:0:0: [sda] 2930307072 512-byte hardware sectors: (1.50 TB/1.36 TiB)
> >> sd 4:4:0:0: [sda] Write Protect is off
> >> sd 4:4:0:0: [sda] Mode Sense: 00 00 00 00
> >> sd 4:4:0:0: [sda] Asking for cache data failed
> >> sd 4:4:0:0: [sda] Assuming drive cache: write through
> >>  sda: sda1
> >> sd 4:4:0:0: [sda] Attached SCSI disk
> >> sd 4:4:0:0: Attached scsi generic sg1 type 0
> >>     
> >
> > Hmm, so the driver looks to do this correctly.  By default it sets a 32
> > bit DMA mask but it raises it to 64 bits for certain boards which can
> > support that (based on the PCI ids).  Can you do an lspci -n -v and send
> > the output?  That will tell me whether the board got a 64 bit mask.
> >
> > Thanks,
> >
> > James
> >
> >   
> James,
> 
> This is when booted with the "mem=4G" override still in place.  If you 
> need it without that, when it fails to "see" the device, let me know, 
> and I'll re-boot tonight to gather it:
> 
> 01:04.0 0104: 101e:1960 (rev 02)
>         Subsystem: 101e:0511

This is sufficient.  That's an AMI Megaraid3.  They're not 64 bit
capable and they should only have a 32 bit DMA mask.  The block layer
should be doing the right thing, so there must be something from a >4GB
pool leaking into the driver somewhere: probably a stray kmalloc of a
DMA buffer without the right flags ... I'll run over the driver and see
if I can spot it.

James




* Re: LSI MegaRAID not recognised correctly in 64-bit.  2.6.29.6
  2009-09-17 18:39           ` James Bottomley
@ 2009-09-17 20:28             ` James Bottomley
  2009-09-18 13:00               ` FUJITA Tomonori
  0 siblings, 1 reply; 18+ messages in thread
From: James Bottomley @ 2009-09-17 20:28 UTC (permalink / raw)
  To: stunnel; +Cc: linux-scsi, linux-box, FUJITA Tomonori

On Thu, 2009-09-17 at 18:39 +0000, James Bottomley wrote:
> On Thu, 2009-09-17 at 09:09 -0700, Eddie wrote:
> > James Bottomley wrote:
> > > On Wed, 2009-09-16 at 19:59 -0700, Eddie wrote:
> > >   
> > >>>> How much memory does your system have?
> > >>>>
> > >>>> Best guess in the 64 bit case is that the physical memory the kernel is
> > >>>> doing DMA to isn't within the range of the card.  You might be able to
> > >>>> test this by booting with the max_addr=4G parameter in the 64 bit case.
> > >>>>
> > >>>> If it is, we'll have to get the DMA mask for this thing set up
> > >>>> correctly.
> > >>>>
> > >>>> James
> > >>>>   
> > >>>>         
> > >>> James,
> > >>>
> > >>> It's got 8Gig.
> > >>>
> > >>> I'll try your suggestion tonight, when I get home.
> > >>>
> > >>> Cheers,
> > >>> Eddie
> > >>>
> > >>>       
> > >> OK, adding addappend = " max_addr=4G" to my lilo.conf made no 
> > >> difference.  It still booted with all 8G.  :(
> > >>
> > >> But, changing it to addappend = " mem=4G" seemed to do the trick.
> > >>
> > >> And, your guess might be correct.  I now see the correct messages for 
> > >> the MegaRAID:
> > >>
> > >> scsi 4:4:0:0: Direct-Access     MegaRAID LD 0 RAID5 1430G N661 PQ: 0 ANSI: 2
> > >> sd 4:4:0:0: [sda] 2930307072 512-byte hardware sectors: (1.50 TB/1.36 TiB)
> > >> sd 4:4:0:0: [sda] Write Protect is off
> > >> sd 4:4:0:0: [sda] Mode Sense: 00 00 00 00
> > >> sd 4:4:0:0: [sda] Asking for cache data failed
> > >> sd 4:4:0:0: [sda] Assuming drive cache: write through
> > >> sd 4:4:0:0: [sda] 2930307072 512-byte hardware sectors: (1.50 TB/1.36 TiB)
> > >> sd 4:4:0:0: [sda] Write Protect is off
> > >> sd 4:4:0:0: [sda] Mode Sense: 00 00 00 00
> > >> sd 4:4:0:0: [sda] Asking for cache data failed
> > >> sd 4:4:0:0: [sda] Assuming drive cache: write through
> > >>  sda: sda1
> > >> sd 4:4:0:0: [sda] Attached SCSI disk
> > >> sd 4:4:0:0: Attached scsi generic sg1 type 0
> > >>     
> > >
> > > Hmm, so the driver looks to do this correctly.  By default it sets a 32
> > > bit DMA mask but it raises it to 64 bits for certain boards which can
> > > support that (based on the PCI ids).  Can you do an lspci -n -v and send
> > > the output?  That will tell me whether the board got a 64 bit mask.
> > >
> > > Thanks,
> > >
> > > James
> > >
> > >   
> > James,
> > 
> > This is when booted with the "mem=4G" override still in place.  If you 
> > need it without that, when it fails to "see" the device, let me know, 
> > and I'll re-boot tonight to gather it:
> > 
> > 01:04.0 0104: 101e:1960 (rev 02)
> >         Subsystem: 101e:0511
> 
> This is sufficient.  That's an AMI Megaraid3.  They're not 64 bit
> capable and they should only have a 32 bit DMA mask.  The block layer
> should be doing the right thing, so there must be something from a >4GB
> pool leaking into the driver somewhere: probably a stray kmalloc of a
> DMA buffer without the right flags ... I'll run over the driver and see
> if I can spot it.

OK, I analysed the code paths; I'm nearly certain the dma_map_sg() is
returning addresses greater than the 32 bits allowable, which would
point to some type of pci gart DMA failure (cc'ing Tomo for input).

James




* Re: LSI MegaRAID not recognised correctly in 64-bit.  2.6.29.6
  2009-09-17 20:28             ` James Bottomley
@ 2009-09-18 13:00               ` FUJITA Tomonori
  2009-09-18 14:11                 ` Matthew Wilcox
  0 siblings, 1 reply; 18+ messages in thread
From: FUJITA Tomonori @ 2009-09-18 13:00 UTC (permalink / raw)
  To: James.Bottomley; +Cc: stunnel, linux-scsi, bo.yang, fujita.tomonori

On Thu, 17 Sep 2009 14:28:11 -0600
James Bottomley <James.Bottomley@suse.de> wrote:

> On Thu, 2009-09-17 at 18:39 +0000, James Bottomley wrote:
> > On Thu, 2009-09-17 at 09:09 -0700, Eddie wrote:
> > > James Bottomley wrote:
> > > > On Wed, 2009-09-16 at 19:59 -0700, Eddie wrote:
> > > >   
> > > >>>> How much memory does your system have?
> > > >>>>
> > > >>>> Best guess in the 64 bit case is that the physical memory the kernel is
> > > >>>> doing DMA to isn't within the range of the card.  You might be able to
> > > >>>> test this by booting with the max_addr=4G parameter in the 64 bit case.
> > > >>>>
> > > >>>> If it is, we'll have to get the DMA mask for this thing set up
> > > >>>> correctly.
> > > >>>>
> > > >>>> James
> > > >>>>   
> > > >>>>         
> > > >>> James,
> > > >>>
> > > >>> It's got 8Gig.
> > > >>>
> > > >>> I'll try your suggestion tonight, when I get home.
> > > >>>
> > > >>> Cheers,
> > > >>> Eddie
> > > >>>
> > > >>>       
> > > >> OK, adding addappend = " max_addr=4G" to my lilo.conf made no 
> > > >> difference.  It still booted with all 8G.  :(
> > > >>
> > > >> But, changing it to addappend = " mem=4G" seemed to do the trick.
> > > >>
> > > >> And, your guess might be correct.  I now see the correct messages for 
> > > >> the MegaRAID:
> > > >>
> > > >> scsi 4:4:0:0: Direct-Access     MegaRAID LD 0 RAID5 1430G N661 PQ: 0 ANSI: 2
> > > >> sd 4:4:0:0: [sda] 2930307072 512-byte hardware sectors: (1.50 TB/1.36 TiB)
> > > >> sd 4:4:0:0: [sda] Write Protect is off
> > > >> sd 4:4:0:0: [sda] Mode Sense: 00 00 00 00
> > > >> sd 4:4:0:0: [sda] Asking for cache data failed
> > > >> sd 4:4:0:0: [sda] Assuming drive cache: write through
> > > >> sd 4:4:0:0: [sda] 2930307072 512-byte hardware sectors: (1.50 TB/1.36 TiB)
> > > >> sd 4:4:0:0: [sda] Write Protect is off
> > > >> sd 4:4:0:0: [sda] Mode Sense: 00 00 00 00
> > > >> sd 4:4:0:0: [sda] Asking for cache data failed
> > > >> sd 4:4:0:0: [sda] Assuming drive cache: write through
> > > >>  sda: sda1
> > > >> sd 4:4:0:0: [sda] Attached SCSI disk
> > > >> sd 4:4:0:0: Attached scsi generic sg1 type 0
> > > >>     
> > > >
> > > > Hmm, so the driver looks to do this correctly.  By default it sets a 32
> > > > bit DMA mask but it raises it to 64 bits for certain boards which can
> > > > support that (based on the PCI ids).  Can you do an lspci -n -v and send
> > > > the output?  That will tell me whether the board got a 64 bit mask.
> > > >
> > > > Thanks,
> > > >
> > > > James
> > > >
> > > >   
> > > James,
> > > 
> > > This is when booted with the "mem=4G" override still in place.  If you 
> > > need it without that, when it fails to "see" the device, let me know, 
> > > and I'll re-boot tonight to gather it:
> > > 
> > > 01:04.0 0104: 101e:1960 (rev 02)
> > >         Subsystem: 101e:0511
> > 
> > This is sufficient.  That's an AMI Megaraid3.  They're not 64 bit
> > capable and they should only have a 32 bit DMA mask.  The block layer
> > should be doing the right thing, so there must be something from a >4GB
> > pool leaking into the driver somewhere: probably a stray kmalloc of a
> > DMA buffer without the right flags ... I'll run over the driver and see
> > if I can spot it.
> 
> OK, I analysed the code paths; I'm nearly certain the dma_map_sg() is
> returning addresses greater than the 32 bits allowable, which would
> point to some type of pci gart DMA failure (cc'ing Tomo for input).

From a quick look, the GART IOMMU seems to handle dma_mask properly.

Let's confirm that dma_map_sg returns an invalid address.

Eddie, can you try a kernel with the following patch and send kernel
messages?

Note that the patch doesn't fix the problem; it just prints debug
information.


diff --git a/drivers/scsi/megaraid/megaraid_mbox.c b/drivers/scsi/megaraid/megaraid_mbox.c
index 234f0b7..0d0a02f 100644
--- a/drivers/scsi/megaraid/megaraid_mbox.c
+++ b/drivers/scsi/megaraid/megaraid_mbox.c
@@ -1391,7 +1391,12 @@ megaraid_mbox_mksgl(adapter_t *adapter, scb_t *scb)
 
 	scb->dma_type = MRAID_DMA_WSG;
 
+	printk("%x %lx\n", scp->cmnd[0],
+	       (unsigned long)*(scp->device->host->shost_gendev.parent->dma_mask));
+
 	scsi_for_each_sg(scp, sgl, sgcnt, i) {
+		printk("%lx %u\n", (unsigned long)sg_dma_address(sgl),
+		       sg_dma_len(sgl));
 		ccb->sgl64[i].address	= sg_dma_address(sgl);
 		ccb->sgl64[i].length	= sg_dma_len(sgl);
 	}



* Re: LSI MegaRAID not recognised correctly in 64-bit.  2.6.29.6
  2009-09-18 13:00               ` FUJITA Tomonori
@ 2009-09-18 14:11                 ` Matthew Wilcox
  2009-09-18 14:18                   ` FUJITA Tomonori
  0 siblings, 1 reply; 18+ messages in thread
From: Matthew Wilcox @ 2009-09-18 14:11 UTC (permalink / raw)
  To: FUJITA Tomonori; +Cc: James.Bottomley, stunnel, linux-scsi, bo.yang

On Fri, Sep 18, 2009 at 10:00:53PM +0900, FUJITA Tomonori wrote:
> Let's confirm that dma_map_sg returns an invalid address.
> 
> Eddie, can you try a kernel with the following patch and send kernel
> messages?
> 
> Note that the patch doesn't fix the problem; just print debug
> information.

This is going to be quite noisy, and not give us all that much information.
Why not try this patch instead?

diff --git a/drivers/scsi/scsi_lib_dma.c b/drivers/scsi/scsi_lib_dma.c
index ac6855c..a179ff8 100644
--- a/drivers/scsi/scsi_lib_dma.c
+++ b/drivers/scsi/scsi_lib_dma.c
@@ -20,7 +20,8 @@
  */
 int scsi_dma_map(struct scsi_cmnd *cmd)
 {
-	int nseg = 0;
+	int i, nseg = 0;
+	struct scatterlist *sgl;
 
 	if (scsi_sg_count(cmd)) {
 		struct device *dev = cmd->device->host->shost_gendev.parent;
@@ -29,6 +30,11 @@ int scsi_dma_map(struct scsi_cmnd *cmd)
 				  cmd->sc_data_direction);
 		if (unlikely(!nseg))
 			return -ENOMEM;
+		scsi_for_each_sg(cmd, sgl, nseg, i) {
+			if (sg_dma_address(sgl) > *dev->dma_mask) {
+				printk("IOMMU bug, command %d, dma mask %llx, but returned %lx %u for phys %lx\n", cmd->cmnd[0], *dev->dma_mask, (unsigned long)sg_dma_address(sgl), sg_dma_len(sgl), (unsigned long)sg_phys(sgl));
+			}
+		}
 	}
 	return nseg;
 }
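
For illustration, the check this patch adds can be modeled outside the kernel. The following is a minimal userspace sketch, not kernel code: `struct seg` and `first_seg_above_mask()` are hypothetical names standing in for the mapped scatterlist entries and the scsi_for_each_sg() loop, and the mask is passed in directly rather than read from `*dev->dma_mask`.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one mapped scatterlist entry. */
struct seg {
	uint64_t dma_addr;	/* what sg_dma_address() would return */
	uint32_t dma_len;	/* what sg_dma_len() would return */
};

/*
 * Return the index of the first segment whose start address lies above
 * dma_mask, or -1 if every segment starts within the mask. This mirrors
 * the start-only comparison in the patch above.
 */
static int first_seg_above_mask(const struct seg *segs, size_t nseg,
				uint64_t dma_mask)
{
	size_t i;

	for (i = 0; i < nseg; i++) {
		if (segs[i].dma_addr > dma_mask)
			return (int)i;
	}
	return -1;
}
```

With a 32-bit mask (0xffffffff), a segment mapped at 0x100000000 is reported, while the same list checked against a 64-bit mask is not. That is the situation the printk is meant to expose.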


> diff --git a/drivers/scsi/megaraid/megaraid_mbox.c b/drivers/scsi/megaraid/megaraid_mbox.c
> index 234f0b7..0d0a02f 100644
> --- a/drivers/scsi/megaraid/megaraid_mbox.c
> +++ b/drivers/scsi/megaraid/megaraid_mbox.c
> @@ -1391,7 +1391,12 @@ megaraid_mbox_mksgl(adapter_t *adapter, scb_t *scb)
>  
>  	scb->dma_type = MRAID_DMA_WSG;
>  
> +	printk("%x %lx\n", scp->cmnd[0],
> +	       (unsigned long)*(scp->device->host->shost_gendev.parent->dma_mask));
> +
>  	scsi_for_each_sg(scp, sgl, sgcnt, i) {
> +		printk("%lx %u\n", (unsigned long)sg_dma_address(sgl),
> +		       sg_dma_len(sgl));
>  		ccb->sgl64[i].address	= sg_dma_address(sgl);
>  		ccb->sgl64[i].length	= sg_dma_len(sgl);
>  	}
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

-- 
Matthew Wilcox				Intel Open Source Technology Centre
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours.  We can't possibly take such
a retrograde step."

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: LSI MegaRAID not recognised correctly in 64-bit.  2.6.29.6
  2009-09-18 14:11                 ` Matthew Wilcox
@ 2009-09-18 14:18                   ` FUJITA Tomonori
  2009-09-19  7:03                     ` Eddie
  0 siblings, 1 reply; 18+ messages in thread
From: FUJITA Tomonori @ 2009-09-18 14:18 UTC (permalink / raw)
  To: matthew; +Cc: fujita.tomonori, James.Bottomley, stunnel, linux-scsi, bo.yang

On Fri, 18 Sep 2009 08:11:07 -0600
Matthew Wilcox <matthew@wil.cx> wrote:

> On Fri, Sep 18, 2009 at 10:00:53PM +0900, FUJITA Tomonori wrote:
> > Let's confirm that dma_map_sg returns an invalid address.
> > 
> > Eddie, can you try a kernel with the following patch and send kernel
> > messages?
> > 
> > Note that the patch doesn't fix the problem; just print debug
> > information.
> 
> This is going to be quite noisy,

Seems that READ_CAPACITY always fails in his environment so it's not
so noisy. :)


> and not give us all that much information.
> Why not try this patch instead?
> 
> diff --git a/drivers/scsi/scsi_lib_dma.c b/drivers/scsi/scsi_lib_dma.c
> index ac6855c..a179ff8 100644
> --- a/drivers/scsi/scsi_lib_dma.c
> +++ b/drivers/scsi/scsi_lib_dma.c
> @@ -20,7 +20,8 @@
>   */
>  int scsi_dma_map(struct scsi_cmnd *cmd)
>  {
> -	int nseg = 0;
> +	int i, nseg = 0;
> +	struct scatterlist *sgl;
>  
>  	if (scsi_sg_count(cmd)) {
>  		struct device *dev = cmd->device->host->shost_gendev.parent;
> @@ -29,6 +30,11 @@ int scsi_dma_map(struct scsi_cmnd *cmd)
>  				  cmd->sc_data_direction);
>  		if (unlikely(!nseg))
>  			return -ENOMEM;
> +		scsi_for_each_sg(cmd, sgl, nseg, i) {
> +			if (sg_dma_address(sgl) > *dev->dma_mask) {

I think that it should be

if (sg_dma_address(sgl) + sg_dma_len(sgl) > *dev->dma_mask)
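
The two conditions differ when a segment starts below the mask but runs past it. A minimal userspace sketch of the comparison follows; the helper names are hypothetical, and the third helper (comparing the last byte actually transferred) is an assumption of this sketch, not something proposed in the thread. It also assumes `len > 0` and that `addr + len` does not wrap around 64 bits.

```c
#include <stdbool.h>
#include <stdint.h>

/* Start-only check, as in the quoted patch. */
static bool start_above_mask(uint64_t addr, uint64_t mask)
{
	return addr > mask;
}

/* Check suggested above: compares one past the last byte with the mask. */
static bool end_above_mask(uint64_t addr, uint32_t len, uint64_t mask)
{
	return addr + len > mask;
}

/*
 * Assumed stricter variant (not from the thread): compares the address of
 * the last byte transferred with the mask. Requires len > 0.
 */
static bool last_byte_above_mask(uint64_t addr, uint32_t len, uint64_t mask)
{
	return addr + len - 1 > mask;
}
```

With a 32-bit mask, a 0x20-byte segment at 0xfffffff0 passes the start-only check but is caught by the other two. A segment that ends exactly at 0xffffffff is also reported by the `addr + len` form, which probably matters little for a debug message.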

> +				printk("IOMMU bug, command %d, dma mask %llx, but returned %lx %u for phys %lx\n", cmd->cmnd[0], *dev->dma_mask, (unsigned long)sg_dma_address(sgl), sg_dma_len(sgl), (unsigned long)sg_phys(sgl));
> +			}
> +		}
>  	}
>  	return nseg;
>  }
> 
> 
> > diff --git a/drivers/scsi/megaraid/megaraid_mbox.c b/drivers/scsi/megaraid/megaraid_mbox.c
> > index 234f0b7..0d0a02f 100644
> > --- a/drivers/scsi/megaraid/megaraid_mbox.c
> > +++ b/drivers/scsi/megaraid/megaraid_mbox.c
> > @@ -1391,7 +1391,12 @@ megaraid_mbox_mksgl(adapter_t *adapter, scb_t *scb)
> >  
> >  	scb->dma_type = MRAID_DMA_WSG;
> >  
> > +	printk("%x %lx\n", scp->cmnd[0],
> > +	       (unsigned long)*(scp->device->host->shost_gendev.parent->dma_mask));
> > +
> >  	scsi_for_each_sg(scp, sgl, sgcnt, i) {
> > +		printk("%lx %u\n", (unsigned long)sg_dma_address(sgl),
> > +		       sg_dma_len(sgl));
> >  		ccb->sgl64[i].address	= sg_dma_address(sgl);
> >  		ccb->sgl64[i].length	= sg_dma_len(sgl);
> >  	}
> > 
> 
> -- 
> Matthew Wilcox				Intel Open Source Technology Centre
> "Bill, look, we understand that you're interested in selling us this
> operating system, but compare it to ours.  We can't possibly take such
> a retrograde step."

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: LSI MegaRAID not recognised correctly in 64-bit.  2.6.29.6
  2009-09-18 14:18                   ` FUJITA Tomonori
@ 2009-09-19  7:03                     ` Eddie
  2009-09-19  7:55                       ` FUJITA Tomonori
  0 siblings, 1 reply; 18+ messages in thread
From: Eddie @ 2009-09-19  7:03 UTC (permalink / raw)
  To: FUJITA Tomonori; +Cc: matthew, James.Bottomley, linux-scsi, bo.yang

FUJITA Tomonori wrote:
> On Fri, 18 Sep 2009 08:11:07 -0600
> Matthew Wilcox <matthew@wil.cx> wrote:
>
>   
>> On Fri, Sep 18, 2009 at 10:00:53PM +0900, FUJITA Tomonori wrote:
>>     
>>> Let's confirm that dma_map_sg returns an invalid address.
>>>
>>> Eddie, can you try a kernel with the following patch and send kernel
>>> messages?
>>>
>>> Note that the patch doesn't fix the problem; just print debug
>>> information.
>>>       
>> This is going to be quite noisy,
>>     
>
> Seems that READ_CAPACITY always fails in his environment so it's not
> so noisy. :)
>
>
>   
>> and not give us all that much information.
>> Why not try this patch instead?
>>
>> diff --git a/drivers/scsi/scsi_lib_dma.c b/drivers/scsi/scsi_lib_dma.c
>> index ac6855c..a179ff8 100644
>> --- a/drivers/scsi/scsi_lib_dma.c
>> +++ b/drivers/scsi/scsi_lib_dma.c
>> @@ -20,7 +20,8 @@
>>   */
>>  int scsi_dma_map(struct scsi_cmnd *cmd)
>>  {
>> -	int nseg = 0;
>> +	int i, nseg = 0;
>> +	struct scatterlist *sgl;
>>  
>>  	if (scsi_sg_count(cmd)) {
>>  		struct device *dev = cmd->device->host->shost_gendev.parent;
>> @@ -29,6 +30,11 @@ int scsi_dma_map(struct scsi_cmnd *cmd)
>>  				  cmd->sc_data_direction);
>>  		if (unlikely(!nseg))
>>  			return -ENOMEM;
>> +		scsi_for_each_sg(cmd, sgl, nseg, i) {
>> +			if (sg_dma_address(sgl) > *dev->dma_mask) {
>>     
>
> I think that it should be
>
> if (sg_dma_address(sgl) + sg_dma_len(sgl) > *dev->dma_mask)
>
>   
>> +				printk("IOMMU bug, command %d, dma mask %llx, but returned %lx %u for phys %lx\n", cmd->cmnd[0], *dev->dma_mask, (unsigned long)sg_dma_address(sgl), sg_dma_len(sgl), (unsigned long)sg_phys(sgl));
>> +			}
>> +		}
>>  	}
>>  	return nseg;
>>  }
>>
>>
>>     
>>> diff --git a/drivers/scsi/megaraid/megaraid_mbox.c b/drivers/scsi/megaraid/megaraid_mbox.c
>>> index 234f0b7..0d0a02f 100644
>>> --- a/drivers/scsi/megaraid/megaraid_mbox.c
>>> +++ b/drivers/scsi/megaraid/megaraid_mbox.c
>>> @@ -1391,7 +1391,12 @@ megaraid_mbox_mksgl(adapter_t *adapter, scb_t *scb)
>>>  
>>>  	scb->dma_type = MRAID_DMA_WSG;
>>>  
>>> +	printk("%x %lx\n", scp->cmnd[0],
>>> +	       (unsigned long)*(scp->device->host->shost_gendev.parent->dma_mask));
>>> +
>>>  	scsi_for_each_sg(scp, sgl, sgcnt, i) {
>>> +		printk("%lx %u\n", (unsigned long)sg_dma_address(sgl),
>>> +		       sg_dma_len(sgl));
>>>  		ccb->sgl64[i].address	= sg_dma_address(sgl);
>>>  		ccb->sgl64[i].length	= sg_dma_len(sgl);
>>>  	}
>>>
>>>       
OK, there are 2 dmesg outputs stored at ftp://ftp.BogoLinux.net/pub

dmesg.2a contains Fujita's patch and Matthew's

dmesg.2b contains Fujita's patch and his update to Matthew's.

Cheers,
Eddie


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: LSI MegaRAID not recognised correctly in 64-bit.  2.6.29.6
  2009-09-19  7:03                     ` Eddie
@ 2009-09-19  7:55                       ` FUJITA Tomonori
  2009-09-19 15:24                         ` Eddie
  0 siblings, 1 reply; 18+ messages in thread
From: FUJITA Tomonori @ 2009-09-19  7:55 UTC (permalink / raw)
  To: stunnel; +Cc: fujita.tomonori, matthew, James.Bottomley, linux-scsi, bo.yang

On Sat, 19 Sep 2009 00:03:48 -0700
Eddie <stunnel@attglobal.net> wrote:

> FUJITA Tomonori wrote:
> > On Fri, 18 Sep 2009 08:11:07 -0600
> > Matthew Wilcox <matthew@wil.cx> wrote:
> >
> >   
> >> On Fri, Sep 18, 2009 at 10:00:53PM +0900, FUJITA Tomonori wrote:
> >>     
> >>> Let's confirm that dma_map_sg returns an invalid address.
> >>>
> >>> Eddie, can you try a kernel with the following patch and send kernel
> >>> messages?
> >>>
> >>> Note that the patch doesn't fix the problem; just print debug
> >>> information.
> >>>       
> >> This is going to be quite noisy,
> >>     
> >
> > Seems that READ_CAPACITY always fails in his environment so it's not
> > so noisy. :)
> >
> >
> >   
> >> and not give us all that much information.
> >> Why not try this patch instead?
> >>
> >> diff --git a/drivers/scsi/scsi_lib_dma.c b/drivers/scsi/scsi_lib_dma.c
> >> index ac6855c..a179ff8 100644
> >> --- a/drivers/scsi/scsi_lib_dma.c
> >> +++ b/drivers/scsi/scsi_lib_dma.c
> >> @@ -20,7 +20,8 @@
> >>   */
> >>  int scsi_dma_map(struct scsi_cmnd *cmd)
> >>  {
> >> -	int nseg = 0;
> >> +	int i, nseg = 0;
> >> +	struct scatterlist *sgl;
> >>  
> >>  	if (scsi_sg_count(cmd)) {
> >>  		struct device *dev = cmd->device->host->shost_gendev.parent;
> >> @@ -29,6 +30,11 @@ int scsi_dma_map(struct scsi_cmnd *cmd)
> >>  				  cmd->sc_data_direction);
> >>  		if (unlikely(!nseg))
> >>  			return -ENOMEM;
> >> +		scsi_for_each_sg(cmd, sgl, nseg, i) {
> >> +			if (sg_dma_address(sgl) > *dev->dma_mask) {
> >>     
> >
> > I think that it should be
> >
> > if (sg_dma_address(sgl) + sg_dma_len(sgl) > *dev->dma_mask)
> >
> >   
> >> +				printk("IOMMU bug, command %d, dma mask %llx, but returned %lx %u for phys %lx\n", cmd->cmnd[0], *dev->dma_mask, (unsigned long)sg_dma_address(sgl), sg_dma_len(sgl), (unsigned long)sg_phys(sgl));
> >> +			}
> >> +		}
> >>  	}
> >>  	return nseg;
> >>  }
> >>
> >>
> >>     
> >>> diff --git a/drivers/scsi/megaraid/megaraid_mbox.c b/drivers/scsi/megaraid/megaraid_mbox.c
> >>> index 234f0b7..0d0a02f 100644
> >>> --- a/drivers/scsi/megaraid/megaraid_mbox.c
> >>> +++ b/drivers/scsi/megaraid/megaraid_mbox.c
> >>> @@ -1391,7 +1391,12 @@ megaraid_mbox_mksgl(adapter_t *adapter, scb_t *scb)
> >>>  
> >>>  	scb->dma_type = MRAID_DMA_WSG;
> >>>  
> >>> +	printk("%x %lx\n", scp->cmnd[0],
> >>> +	       (unsigned long)*(scp->device->host->shost_gendev.parent->dma_mask));
> >>> +
> >>>  	scsi_for_each_sg(scp, sgl, sgcnt, i) {
> >>> +		printk("%lx %u\n", (unsigned long)sg_dma_address(sgl),
> >>> +		       sg_dma_len(sgl));
> >>>  		ccb->sgl64[i].address	= sg_dma_address(sgl);
> >>>  		ccb->sgl64[i].length	= sg_dma_len(sgl);
> >>>  	}
> >>>
> >>>       
> OK, there are 2 dmesg outputs stored at ftp://ftp.BogoLinux.net/pub
> 
> dmesg.2a contains Fujita's patch and Matthew's
> 
> dmesg.2b contains Fujita's patch and his update to Matthew's.

Thanks, now the problem is clear: the driver sets a 64-bit dma_mask, but
according to James the hardware is not capable of 64-bit DMA. It seems
that the hardware lies about its DMA capability.

Can you try the following patch?


diff --git a/drivers/scsi/megaraid/megaraid_mbox.c b/drivers/scsi/megaraid/megaraid_mbox.c
index 234f0b7..9ead856 100644
--- a/drivers/scsi/megaraid/megaraid_mbox.c
+++ b/drivers/scsi/megaraid/megaraid_mbox.c
@@ -888,6 +888,8 @@ megaraid_init_mbox(adapter_t *adapter)
 	if (((magic64 == HBA_SIGNATURE_64_BIT) &&
 		((adapter->pdev->subsystem_device !=
 		PCI_SUBSYS_ID_MEGARAID_SATA_150_6) &&
+		(adapter->pdev->device !=
+		PCI_DEVICE_ID_AMI_MEGARAID3) &&
 		(adapter->pdev->subsystem_device !=
 		PCI_SUBSYS_ID_MEGARAID_SATA_150_4))) ||
 		(adapter->pdev->vendor == PCI_VENDOR_ID_LSI_LOGIC &&
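
The effect of the added exception can be sketched in userspace. This is a rough model under stated assumptions, not the driver code: `struct fake_pdev`, `wants_64bit_dma()` and `pick_dma_mask()` are hypothetical, the numeric values of the constants are assumed for illustration (only the 0x1960 device ID and 0x0511 subsystem ID appear in the dmesg output earlier in the thread), and only the first clause of the condition is modeled because the diff context stops before the rest.

```c
#include <stdbool.h>
#include <stdint.h>

/* Values assumed for illustration; check the kernel headers before use. */
#define HBA_SIGNATURE_64_BIT			0x299
#define PCI_DEVICE_ID_AMI_MEGARAID3		0x1960
#define PCI_SUBSYS_ID_MEGARAID_SATA_150_4	0x4523
#define PCI_SUBSYS_ID_MEGARAID_SATA_150_6	0x0523

/* Hypothetical subset of the fields the driver reads from struct pci_dev. */
struct fake_pdev {
	uint16_t device;
	uint16_t subsystem_device;
};

/*
 * Models only the first clause of the patched condition: the firmware
 * advertises 64-bit support and the board is not on the exception list.
 */
static bool wants_64bit_dma(uint32_t magic64, const struct fake_pdev *pdev)
{
	return magic64 == HBA_SIGNATURE_64_BIT &&
	       pdev->subsystem_device != PCI_SUBSYS_ID_MEGARAID_SATA_150_6 &&
	       pdev->device != PCI_DEVICE_ID_AMI_MEGARAID3 &&
	       pdev->subsystem_device != PCI_SUBSYS_ID_MEGARAID_SATA_150_4;
}

/* 64-bit mask when allowed, otherwise fall back to a 32-bit mask. */
static uint64_t pick_dma_mask(uint32_t magic64, const struct fake_pdev *pdev)
{
	return wants_64bit_dma(magic64, pdev) ? UINT64_MAX : 0xffffffffULL;
}
```

In this model a board with device ID 0x1960 gets the 32-bit mask even when the firmware signature claims 64-bit support, which is the behaviour the patch aims for.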


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: LSI MegaRAID not recognised correctly in 64-bit.  2.6.29.6
  2009-09-19  7:55                       ` FUJITA Tomonori
@ 2009-09-19 15:24                         ` Eddie
  2009-09-21 13:51                           ` FUJITA Tomonori
  0 siblings, 1 reply; 18+ messages in thread
From: Eddie @ 2009-09-19 15:24 UTC (permalink / raw)
  To: FUJITA Tomonori; +Cc: matthew, James.Bottomley, linux-scsi, bo.yang

FUJITA Tomonori wrote:
> Thanks, now the problem is clear; the driver uses 64bit dma_mask but
> the hardware is not capable of 64bit dma according to James. Seems
> that hardware lies about the dma capability.
>
> Can you try the following patch?
>
>
> diff --git a/drivers/scsi/megaraid/megaraid_mbox.c b/drivers/scsi/megaraid/megaraid_mbox.c
> index 234f0b7..9ead856 100644
> --- a/drivers/scsi/megaraid/megaraid_mbox.c
> +++ b/drivers/scsi/megaraid/megaraid_mbox.c
> @@ -888,6 +888,8 @@ megaraid_init_mbox(adapter_t *adapter)
>  	if (((magic64 == HBA_SIGNATURE_64_BIT) &&
>  		((adapter->pdev->subsystem_device !=
>  		PCI_SUBSYS_ID_MEGARAID_SATA_150_6) &&
> +		(adapter->pdev->device !=
> +		PCI_DEVICE_ID_AMI_MEGARAID3) &&
>  		(adapter->pdev->subsystem_device !=
>  		PCI_SUBSYS_ID_MEGARAID_SATA_150_4))) ||
>  		(adapter->pdev->vendor == PCI_VENDOR_ID_LSI_LOGIC &&
>
>   
OK, that appears to have done the trick.

Thanks to all of you.

Cheers,
Eddie


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: LSI MegaRAID not recognised correctly in 64-bit.  2.6.29.6
  2009-09-19 15:24                         ` Eddie
@ 2009-09-21 13:51                           ` FUJITA Tomonori
  2009-09-21 15:34                             ` James Bottomley
  0 siblings, 1 reply; 18+ messages in thread
From: FUJITA Tomonori @ 2009-09-21 13:51 UTC (permalink / raw)
  To: stunnel, James.Bottomley; +Cc: fujita.tomonori, matthew, linux-scsi, bo.yang

On Sat, 19 Sep 2009 08:24:02 -0700
Eddie <stunnel@attglobal.net> wrote:

> FUJITA Tomonori wrote:
> > Thanks, now the problem is clear; the driver uses 64bit dma_mask but
> > the hardware is not capable of 64bit dma according to James. Seems
> > that hardware lies about the dma capability.
> >
> > Can you try the following patch?
> >
> >
> > diff --git a/drivers/scsi/megaraid/megaraid_mbox.c b/drivers/scsi/megaraid/megaraid_mbox.c
> > index 234f0b7..9ead856 100644
> > --- a/drivers/scsi/megaraid/megaraid_mbox.c
> > +++ b/drivers/scsi/megaraid/megaraid_mbox.c
> > @@ -888,6 +888,8 @@ megaraid_init_mbox(adapter_t *adapter)
> >  	if (((magic64 == HBA_SIGNATURE_64_BIT) &&
> >  		((adapter->pdev->subsystem_device !=
> >  		PCI_SUBSYS_ID_MEGARAID_SATA_150_6) &&
> > +		(adapter->pdev->device !=
> > +		PCI_DEVICE_ID_AMI_MEGARAID3) &&
> >  		(adapter->pdev->subsystem_device !=
> >  		PCI_SUBSYS_ID_MEGARAID_SATA_150_4))) ||
> >  		(adapter->pdev->vendor == PCI_VENDOR_ID_LSI_LOGIC &&
> >
> >   
> OK, that appears to have done the trick.

Thanks for the confirmation.

James, here's a patch in the proper format.

=
From: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Subject: [PATCH] megaraid: use 32bit DMA mask for AMI Megaraid3

The hardware says that it supports 64bit DMA however it doesn't. This
patch always uses 32bit DMA mask.

Reported-by: Eddie <stunnel@attglobal.net>
Tested-by: Eddie <stunnel@attglobal.net>
Cc: stable@kernel.org
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
---
 drivers/scsi/megaraid/megaraid_mbox.c |    2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/drivers/scsi/megaraid/megaraid_mbox.c b/drivers/scsi/megaraid/megaraid_mbox.c
index 234f0b7..9ead856 100644
--- a/drivers/scsi/megaraid/megaraid_mbox.c
+++ b/drivers/scsi/megaraid/megaraid_mbox.c
@@ -888,6 +888,8 @@ megaraid_init_mbox(adapter_t *adapter)
 	if (((magic64 == HBA_SIGNATURE_64_BIT) &&
 		((adapter->pdev->subsystem_device !=
 		PCI_SUBSYS_ID_MEGARAID_SATA_150_6) &&
+		(adapter->pdev->device !=
+		PCI_DEVICE_ID_AMI_MEGARAID3) &&
 		(adapter->pdev->subsystem_device !=
 		PCI_SUBSYS_ID_MEGARAID_SATA_150_4))) ||
 		(adapter->pdev->vendor == PCI_VENDOR_ID_LSI_LOGIC &&
-- 
1.6.0.6


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: LSI MegaRAID not recognised correctly in 64-bit.  2.6.29.6
  2009-09-21 13:51                           ` FUJITA Tomonori
@ 2009-09-21 15:34                             ` James Bottomley
  2009-09-21 15:37                               ` Yang, Bo
  0 siblings, 1 reply; 18+ messages in thread
From: James Bottomley @ 2009-09-21 15:34 UTC (permalink / raw)
  To: FUJITA Tomonori; +Cc: stunnel, matthew, linux-scsi, bo.yang

On Mon, 2009-09-21 at 22:51 +0900, FUJITA Tomonori wrote:
> On Sat, 19 Sep 2009 08:24:02 -0700
> Eddie <stunnel@attglobal.net> wrote:
> 
> > FUJITA Tomonori wrote:
> > > Thanks, now the problem is clear; the driver uses 64bit dma_mask but
> > > the hardware is not capable of 64bit dma according to James. Seems
> > > that hardware lies about the dma capability.
> > >
> > > Can you try the following patch?
> > >
> > >
> > > diff --git a/drivers/scsi/megaraid/megaraid_mbox.c b/drivers/scsi/megaraid/megaraid_mbox.c
> > > index 234f0b7..9ead856 100644
> > > --- a/drivers/scsi/megaraid/megaraid_mbox.c
> > > +++ b/drivers/scsi/megaraid/megaraid_mbox.c
> > > @@ -888,6 +888,8 @@ megaraid_init_mbox(adapter_t *adapter)
> > >  	if (((magic64 == HBA_SIGNATURE_64_BIT) &&
> > >  		((adapter->pdev->subsystem_device !=
> > >  		PCI_SUBSYS_ID_MEGARAID_SATA_150_6) &&
> > > +		(adapter->pdev->device !=
> > > +		PCI_DEVICE_ID_AMI_MEGARAID3) &&
> > >  		(adapter->pdev->subsystem_device !=
> > >  		PCI_SUBSYS_ID_MEGARAID_SATA_150_4))) ||
> > >  		(adapter->pdev->vendor == PCI_VENDOR_ID_LSI_LOGIC &&
> > >
> > >   
> > OK, that appears to have done the trick.
> 
> Thanks for the confirmation.
> 
> James, here's a patch in the proper format.

Got it ... I'd like to wait on confirmation from LSI on this ... I
actually said the Megaraid3 wasn't 64 bit based on the 64 bit exception
table, but if it really has the firmware marker that says it's 64 bit,
we should make sure the manufacturers agree that it needs to be added to
the exception table.

James



^ permalink raw reply	[flat|nested] 18+ messages in thread

* RE: LSI MegaRAID not recognised correctly in 64-bit.  2.6.29.6
  2009-09-21 15:34                             ` James Bottomley
@ 2009-09-21 15:37                               ` Yang, Bo
  0 siblings, 0 replies; 18+ messages in thread
From: Yang, Bo @ 2009-09-21 15:37 UTC (permalink / raw)
  To: James Bottomley, FUJITA Tomonori; +Cc: stunnel, matthew, linux-scsi

I am checking with our FW team...

-----Original Message-----
From: James Bottomley [mailto:James.Bottomley@suse.de] 
Sent: Monday, September 21, 2009 11:34 AM
To: FUJITA Tomonori
Cc: stunnel@attglobal.net; matthew@wil.cx; linux-scsi@vger.kernel.org; Yang, Bo
Subject: Re: LSI MegaRAID not recognised correctly in 64-bit. 2.6.29.6

On Mon, 2009-09-21 at 22:51 +0900, FUJITA Tomonori wrote:
> On Sat, 19 Sep 2009 08:24:02 -0700
> Eddie <stunnel@attglobal.net> wrote:
> 
> > FUJITA Tomonori wrote:
> > > Thanks, now the problem is clear; the driver uses 64bit dma_mask but
> > > the hardware is not capable of 64bit dma according to James. Seems
> > > that hardware lies about the dma capability.
> > >
> > > Can you try the following patch?
> > >
> > >
> > > diff --git a/drivers/scsi/megaraid/megaraid_mbox.c b/drivers/scsi/megaraid/megaraid_mbox.c
> > > index 234f0b7..9ead856 100644
> > > --- a/drivers/scsi/megaraid/megaraid_mbox.c
> > > +++ b/drivers/scsi/megaraid/megaraid_mbox.c
> > > @@ -888,6 +888,8 @@ megaraid_init_mbox(adapter_t *adapter)
> > >  	if (((magic64 == HBA_SIGNATURE_64_BIT) &&
> > >  		((adapter->pdev->subsystem_device !=
> > >  		PCI_SUBSYS_ID_MEGARAID_SATA_150_6) &&
> > > +		(adapter->pdev->device !=
> > > +		PCI_DEVICE_ID_AMI_MEGARAID3) &&
> > >  		(adapter->pdev->subsystem_device !=
> > >  		PCI_SUBSYS_ID_MEGARAID_SATA_150_4))) ||
> > >  		(adapter->pdev->vendor == PCI_VENDOR_ID_LSI_LOGIC &&
> > >
> > >   
> > OK, that appears to have done the trick.
> 
> Thanks for the confirmation.
> 
> James, here's a patch in the proper format.

Got it ... I'd like to wait on confirmation from LSI on this ... I
actually said the Megaraid3 wasn't 64 bit based on the 64 bit exception
table, but if it really has the firmware marker that says it's 64 bit,
we should make sure the manufacturers agree that it needs to be added to
the exception table.

James



^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2009-09-21 15:37 UTC | newest]

Thread overview: 18+ messages
2009-09-16 16:18 LSI MegaRAID not recognised correctly in 64-bit. 2.6.29.6 Eddie
2009-09-16 16:31 ` James Bottomley
2009-09-16 16:54   ` Eddie
2009-09-17  2:58     ` Eddie
2009-09-17  2:59     ` Eddie
2009-09-17 15:59       ` James Bottomley
2009-09-17 16:09         ` Eddie
2009-09-17 18:39           ` James Bottomley
2009-09-17 20:28             ` James Bottomley
2009-09-18 13:00               ` FUJITA Tomonori
2009-09-18 14:11                 ` Matthew Wilcox
2009-09-18 14:18                   ` FUJITA Tomonori
2009-09-19  7:03                     ` Eddie
2009-09-19  7:55                       ` FUJITA Tomonori
2009-09-19 15:24                         ` Eddie
2009-09-21 13:51                           ` FUJITA Tomonori
2009-09-21 15:34                             ` James Bottomley
2009-09-21 15:37                               ` Yang, Bo
