* Determining which spindle is out of order
@ 2010-11-03 14:13 Nat Makarevitch
  2010-11-03 14:38 ` Roman Mamedov
                   ` (4 more replies)
  0 siblings, 5 replies; 39+ messages in thread
From: Nat Makarevitch @ 2010-11-03 14:13 UTC (permalink / raw)
  To: linux-raid

Hi,

After a spindle (physical hard disk, a "drive") failure in a "md" RAID array,
how can we know which spindle must be replaced?

We want to avoid extracting a working spindle by mistakenly thinking it is the
faulty one...

To solve this problem we put on each spindle a physical label (a tag, not a
partition/disk label) showing its device name (sda, sdb...).

To do so, on an otherwise unused system with a sane RAID array, we ran "dd" on
each spindle (individual device) in turn to read from it, lighting up its LED.
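
The dd read described above can be wrapped in a small helper. This is only a minimal sketch: blink_drive is a made-up name, the 1024 MiB default is arbitrary, and it assumes a dd that accepts bs=1M (GNU dd does). Data that is already in the page cache is not read from the disk again, so a second run over the same range may not light the LED.

```shell
# Read the first N MiB of a device (or file) and throw the data away,
# so that the drive's activity LED lights up while the read is running.
# Nothing is written to the device.
blink_drive () {
        local dev="$1" mib="${2:-1024}"
        dd if="$dev" of=/dev/null bs=1M count="$mib" 2>/dev/null
}
```

It would be run as root against one drive at a time, for example blink_drive /dev/sdc 4096, while somebody watches the bays.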

Then we simulated a crash by physically removing two spindles. Upon reboot the
device names had changed (?!) and our labels were no longer right, although we
are pretty sure they were correct before.

Context: raid10, 10 spindles (8 active + 2 spare), layout: near=1, offset=3. On
the integrated controller plus an on-board LSI MPT controller.


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: Determining which spindle is out of order
  2010-11-03 14:13 Determining which spindle is out of order Nat Makarevitch
@ 2010-11-03 14:38 ` Roman Mamedov
  2010-11-03 15:17   ` Graham Mitchell
  2010-11-03 14:43 ` John Robinson
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 39+ messages in thread
From: Roman Mamedov @ 2010-11-03 14:38 UTC (permalink / raw)
  To: linux-raid

On Wed, 3 Nov 2010 14:13:25 +0000 (UTC)
Nat Makarevitch <Shelso@makarevitch.org> wrote:

> To solve this problem we put on each spindle a physical label (a tag, not a
> partition/disk label) showing its device name (sda, sdb...).  

That's totally stupid, use disk serial numbers for labels instead.

-- 
With respect,
Roman


* Re: Determining which spindle is out of order
  2010-11-03 14:13 Determining which spindle is out of order Nat Makarevitch
  2010-11-03 14:38 ` Roman Mamedov
@ 2010-11-03 14:43 ` John Robinson
  2010-11-03 14:45 ` Tim Small
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 39+ messages in thread
From: John Robinson @ 2010-11-03 14:43 UTC (permalink / raw)
  To: Nat Makarevitch; +Cc: linux-raid

On 03/11/2010 14:13, Nat Makarevitch wrote:
> Hi,
>
> After a spindle (physical hard disk, a "drive") failure in a "md" RAID array,
> how can we know which spindle must be replaced?
>
> We want to avoid extracting a working spindle by mistakenly thinking it is the
> faulty one...
>
> To solve this problem we put on each spindle a physical label (a tag, not a
> partition/disk label) showing its device name (sda, sdb...).
>
> To do so: on an otherwise unused system and sane RAID array we ran "dd" for each
> each spindle (individual device) in order to read on it, lighting up its LED.
>
> Then we simulated a crash by physically removing two spindles. Upon reboot the
> devices names changed (?!) and our labels weren't right anymore albeit we are
> pretty sure they were.
>
> Context: raid10, 10 spindles (8 active + 2 spare), layout : near=1, offset=3. On
> the integrated controller + a LSI MPT on-board controller.

That's right, drive letters sda sdb etc are allocated in the order 
they're discovered at boot time, so if you remove what used to be sdc 
and reboot, sdc will now refer to what used to be sdd, sdd to the old 
sde etc.

Have a look at /dev/disk/by-path. In there you'll find symlinks which 
keep the same names as long as a drive stays on the same controller port 
(or are missing if the drive is gone), and which point to sda, sdb etc. 
In my case I have SATA discs on an ICH10R controller; the controller is 
pci-0000:00:1f.2, it appears as up to 6 SCSI hosts numbered 0-5, and each 
disc appears as channel 0, target 0, LUN 0 on its own host, along with 
its partitions, as follows:

pci-0000:00:1f.2-scsi-0:0:0:0 -> ../../sda
pci-0000:00:1f.2-scsi-0:0:0:0-part1 -> ../../sda1
pci-0000:00:1f.2-scsi-0:0:0:0-part2 -> ../../sda2
pci-0000:00:1f.2-scsi-1:0:0:0 -> ../../sdb
pci-0000:00:1f.2-scsi-1:0:0:0-part1 -> ../../sdb1
pci-0000:00:1f.2-scsi-1:0:0:0-part2 -> ../../sdb2
pci-0000:00:1f.2-scsi-2:0:0:0 -> ../../sdc
pci-0000:00:1f.2-scsi-2:0:0:0-part1 -> ../../sdc1
pci-0000:00:1f.2-scsi-2:0:0:0-part2 -> ../../sdc2
pci-0000:03:00.0-scsi-0:0:0:0 -> ../../scd0

Oh and the last one is a CD ROM drive on a pata_marvell IDE controller.

You might want to relabel your drives according to what you find in 
/dev/disk/by-path.
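
To get a list suitable for writing on labels, the whole-disk entries can be pulled out of that directory. A minimal sketch, assuming the -partN suffix shown in the listing above; list_disk_links is a hypothetical helper name, not an existing tool:

```shell
# Print "persistent-name -> kernel-name" for every whole-disk symlink in a
# udev directory such as /dev/disk/by-path, skipping the -partN entries.
list_disk_links () {
        local dir="${1:-/dev/disk/by-path}" link
        for link in "$dir"/* ; do
                [ -L "$link" ] || continue      # also skips an unmatched glob
                case $link in *-part[0-9]*) continue ;; esac
                printf '%s -> %s\n' "${link##*/}" "$(basename "$(readlink "$link")")"
        done
}
```

Called without arguments it reads /dev/disk/by-path; the directory is a parameter only so that the function can be tried against a scratch directory.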

Cheers,

John.



* Re: Determining which spindle is out of order
  2010-11-03 14:13 Determining which spindle is out of order Nat Makarevitch
  2010-11-03 14:38 ` Roman Mamedov
  2010-11-03 14:43 ` John Robinson
@ 2010-11-03 14:45 ` Tim Small
  2010-11-03 15:59   ` Jon Hardcastle
  2010-11-03 15:29 ` Mikael Abrahamsson
  2010-11-03 21:54 ` Phil Turmel
  4 siblings, 1 reply; 39+ messages in thread
From: Tim Small @ 2010-11-03 14:45 UTC (permalink / raw)
  To: Nat Makarevitch; +Cc: linux-raid

On 03/11/10 14:13, Nat Makarevitch wrote:

> To solve this problem we put on each spindle a physical label (a tag, not a
> partition/disk label) showing its device name (sda, sdb...).
>   

As you discovered - that's not a good idea, the device names are
dynamically assigned.

> Then we simulated a crash by physically removing two spindles. Upon reboot the
> devices names changed (?!) and our labels weren't right anymore albeit we are
> pretty sure they were.
>   

Best to use the drive serial number, maybe (from hdparm -I or smartctl -a, or
the links under /dev/disk/by-id; it is also printed on the top of the
drive etc.)?

You can also use the physical slot if that info is available and static
(this varies from controller to controller) and is available under
/dev/disk/by-path/
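
Once md reports a failed member by kernel name, the matching persistent names can be looked up in the other direction. A minimal sketch, assuming the usual udev layout where each link under /dev/disk/by-id points at ../../sdX; ids_for_device is a hypothetical helper name:

```shell
# Print every name in a udev directory (default /dev/disk/by-id) whose
# symlink currently points at the given kernel device name, e.g. "sdc".
ids_for_device () {
        local want="$1" dir="${2:-/dev/disk/by-id}" link
        for link in "$dir"/* ; do
                [ -L "$link" ] || continue
                if [ "$(basename "$(readlink "$link")")" = "$want" ] ; then
                        printf '%s\n' "${link##*/}"
                fi
        done
}
```

For example, ids_for_device sdc prints the by-id names (which include the serial number) of whatever is sdc right now; passing /dev/disk/by-path as the second argument gives the port instead.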

Cheers,

Tim.

-- 
South East Open Source Solutions Limited
Registered in England and Wales with company number 06134732.  
Registered Office: 2 Powell Gardens, Redhill, Surrey, RH1 1TQ
VAT number: 900 6633 53  http://seoss.co.uk/ +44-(0)1273-808309



* RE: Determining which spindle is out of order
  2010-11-03 14:38 ` Roman Mamedov
@ 2010-11-03 15:17   ` Graham Mitchell
  2010-11-03 16:05     ` Roman Mamedov
  0 siblings, 1 reply; 39+ messages in thread
From: Graham Mitchell @ 2010-11-03 15:17 UTC (permalink / raw)
  To: 'Roman Mamedov', linux-raid

> 
> That's totally stupid <snip>
> 
> --
> With respect,
> Roman

Err....



G



* Re: Determining which spindle is out of order
  2010-11-03 14:13 Determining which spindle is out of order Nat Makarevitch
                   ` (2 preceding siblings ...)
  2010-11-03 14:45 ` Tim Small
@ 2010-11-03 15:29 ` Mikael Abrahamsson
  2010-11-03 21:54 ` Phil Turmel
  4 siblings, 0 replies; 39+ messages in thread
From: Mikael Abrahamsson @ 2010-11-03 15:29 UTC (permalink / raw)
  To: Nat Makarevitch; +Cc: linux-raid

On Wed, 3 Nov 2010, Nat Makarevitch wrote:

> We want to avoid extracting a working spindle by mistakenly thinking it 
> is the faulty one...

I tend to look at the blinking lights, it's usually quite obvious which 
drive isn't in the array anymore by the fact that the activity light is 
not blinking. Apart from that, as someone else said, use serial numbers.
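
Before looking at lights or serial numbers, the kernel's own view of which member failed can be read from /proc/mdstat, where faulty members carry an (F) flag. A minimal sketch that prints them; failed_members is a hypothetical helper name, and its output still has to be mapped to a physical drive by serial number or port:

```shell
# Print "array member" for every md member flagged (F) in an mdstat-style
# file (default /proc/mdstat).
failed_members () {
        awk '$1 ~ /^md/ && $2 == ":" {
                for (i = 4; i <= NF; i++)
                        if ($i ~ /\(F\)/) {
                                name = $i
                                sub(/\[.*/, "", name)   # strip the "[2](F)" suffix
                                print $1, name
                        }
        }' "${1:-/proc/mdstat}"
}
```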

-- 
Mikael Abrahamsson    email: swmike@swm.pp.se


* Re: Determining which spindle is out of order
  2010-11-03 14:45 ` Tim Small
@ 2010-11-03 15:59   ` Jon Hardcastle
  2010-11-03 17:17     ` Bill Davidsen
  0 siblings, 1 reply; 39+ messages in thread
From: Jon Hardcastle @ 2010-11-03 15:59 UTC (permalink / raw)
  To: Tim Small; +Cc: Nat Makarevitch, linux-raid

I use something like this: when I commission a drive I allocate it a
unique number like J70, and I then have a spreadsheet that marries this
up with the serial number of the device.

hdparm etc. gives me the serial number, and the lookup identifies the device.

Oh, and I write the J70 on little green sticky labels that I then stick
on all 4 edges of the drive...
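
The spreadsheet lookup can also be done on the machine itself if the sheet is exported as text. A minimal sketch, assuming a two-column "label,serial" CSV file with no quoting; the file layout and the name label_for_serial are made up for illustration:

```shell
# Print the sticker label recorded for a serial number in a
# "label,serial" CSV file; exit status is non-zero if it is not listed.
label_for_serial () {
        local serial="$1" csv="$2"
        awk -F, -v s="$serial" '
                $2 == s { print $1; found = 1 }
                END     { exit !found }
        ' "$csv"
}
```

A non-zero exit status is a useful warning that a drive was commissioned without being entered in the sheet.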


* Re: Determining which spindle is out of order
  2010-11-03 15:17   ` Graham Mitchell
@ 2010-11-03 16:05     ` Roman Mamedov
  2010-11-03 19:00       ` Jon Hardcastle
  0 siblings, 1 reply; 39+ messages in thread
From: Roman Mamedov @ 2010-11-03 16:05 UTC (permalink / raw)
  To: Graham Mitchell; +Cc: linux-raid

On Wed, 3 Nov 2010 11:17:24 -0400
"Graham Mitchell" <gmitch@woodlea.com> wrote:

> > 
> > That's totally stupid <snip>
> > 
> > --
> > With respect,
> > Roman
> 
> Err....

Yeah, sorry about that -- and to make my reply a bit less useless, I should
elaborate that my suggestion is to not base anything on the sdX or their order
at all. For example even though I want to have a specific list of looked-at
devices in mdadm.conf, I point it directly to specific drives -- not
to /dev/sdX, but to udev symlinks in /dev/disk/by-id/ instead:

# 2TBs
DEVICE /dev/disk/by-id/ata-WDC_WD20EADS-00S2B0_WD-xxxxxxxxxxxx-part*
DEVICE /dev/disk/by-id/ata-WDC_WD20EADS-00S2B0_WD-xxxxxxxxxxxx-part*
DEVICE /dev/disk/by-id/ata-Hitachi_HDS722020ALA330_xxxxxxxxxxxxxx-part*
# 1.5 TBs
DEVICE /dev/disk/by-id/ata-WDC_WD15EADS-00S2B0_WD-xxxxxxxxxxxx-part*
# 1TBs
DEVICE /dev/disk/by-id/ata-Hitachi_HDT721010SLA360_xxxxxxxxxxxxxx-part*
DEVICE /dev/disk/by-id/ata-ST31000333AS_xxxxxxxx-part*

Listed like this it will also make mdadm not consider whole block devices for
addition into RAIDs, only the partitions (which is exactly what I want).
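
Lines of this form do not have to be typed by hand. A minimal sketch that prints one DEVICE line per whole disk, assuming the ata-* and -partN naming seen above; device_lines is a hypothetical helper name, and the output is meant to be reviewed before it goes anywhere near mdadm.conf:

```shell
# Emit one mdadm.conf DEVICE line per whole disk found in a by-id
# directory, each line matching only that disk's partitions.
device_lines () {
        local dir="${1:-/dev/disk/by-id}" link
        for link in "$dir"/ata-* ; do
                [ -L "$link" ] || continue      # also skips an unmatched glob
                case $link in *-part[0-9]*) continue ;; esac
                printf 'DEVICE %s-part*\n' "$link"
        done
}
```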

-- 
With respect,
Roman


* Re: Determining which spindle is out of order
  2010-11-03 15:59   ` Jon Hardcastle
@ 2010-11-03 17:17     ` Bill Davidsen
  2010-11-03 20:03       ` Tim Small
  0 siblings, 1 reply; 39+ messages in thread
From: Bill Davidsen @ 2010-11-03 17:17 UTC (permalink / raw)
  To: Jon; +Cc: Jon Hardcastle, Tim Small, Nat Makarevitch, linux-raid

Jon Hardcastle wrote:
> I use something like this.. when i commission a drive I allocate it a
> unique number like J70 and I then have a spreadsheet that marrys this
> up with the serial number of the device.
>
> hdparm etc gives me the serial number, and the lookup identifies the device.
>
> Oh and i write the J70 on little green sticky labels that I then stick
> on all 4 edges of the drive...
>    

When I ran servers for an ISP we had a utility which blinked the light 
on the drive (assumes it hasn't gone utterly belly up, of course).

-- 
Bill Davidsen<davidsen@tmr.com>
   "We can't solve today's problems by using the same thinking we
    used in creating them." - Einstein



* Re: Determining which spindle is out of order
  2010-11-03 16:05     ` Roman Mamedov
@ 2010-11-03 19:00       ` Jon Hardcastle
  0 siblings, 0 replies; 39+ messages in thread
From: Jon Hardcastle @ 2010-11-03 19:00 UTC (permalink / raw)
  To: Roman Mamedov; +Cc: Graham Mitchell, linux-raid

On 3 November 2010 16:05, Roman Mamedov <roman@rm.pp.ru> wrote:
> On Wed, 3 Nov 2010 11:17:24 -0400
> "Graham Mitchell" <gmitch@woodlea.com> wrote:
>
>> >
>> > That's totally stupid <snip>
>> >
>> > --
>> > With respect,
>> > Roman
>>
>> Err....
>
> Yeah, sorry about that -- and to make my reply a bit less useless, I should
> elaborate that my suggestion is to not base anything on the sdX or their order
> at all. For example even though I want to have a specific list of looked-at
> devices in mdadm.conf, I point it directly to specific drives -- not
> to /dev/sdX, but to udev symlinks in /dev/disk/by-id/ instead:
>
> # 2TBs
> DEVICE /dev/disk/by-id/ata-WDC_WD20EADS-00S2B0_WD-xxxxxxxxxxxx-part*
> DEVICE /dev/disk/by-id/ata-WDC_WD20EADS-00S2B0_WD-xxxxxxxxxxxx-part*
> DEVICE /dev/disk/by-id/ata-Hitachi_HDS722020ALA330_xxxxxxxxxxxxxx-part*
> # 1.5 TBs
> DEVICE /dev/disk/by-id/ata-WDC_WD15EADS-00S2B0_WD-xxxxxxxxxxxx-part*
> # 1TBs
> DEVICE /dev/disk/by-id/ata-Hitachi_HDT721010SLA360_xxxxxxxxxxxxxx-part*
> DEVICE /dev/disk/by-id/ata-ST31000333AS_xxxxxxxx-part*
>
> Listed like this it will also make mdadm not consider whole block devices for
> addition into RAIDs, only the partitions (which is exactly what I want).
>
> --
> With respect,
> Roman
>

See, occasionally little gems of information like that come up... and
they change my world.


* Re: Determining which spindle is out of order
  2010-11-03 17:17     ` Bill Davidsen
@ 2010-11-03 20:03       ` Tim Small
  0 siblings, 0 replies; 39+ messages in thread
From: Tim Small @ 2010-11-03 20:03 UTC (permalink / raw)
  To: Bill Davidsen; +Cc: Jon, Jon Hardcastle, Nat Makarevitch, linux-raid

On 03/11/10 17:17, Bill Davidsen wrote:
> When I ran servers for an ISP we had a utility which blinked the light
> on the drive (assumes it hasn't gone utterly belly up, of course).
>

I believe that this is part of the SCSI standard. Certainly a brief look
around a few months ago didn't turn up a way of doing this for SATA
drives (i.e. there is nothing in the relevant standards, though particular
drive vendors may have their own non-standard commands to do this, or
enclosure manufacturers may have out-of-band methods for it).

If anyone knows better, please let me know!


Cheers,

Tim.

-- 
South East Open Source Solutions Limited
Registered in England and Wales with company number 06134732.  
Registered Office: 2 Powell Gardens, Redhill, Surrey, RH1 1TQ
VAT number: 900 6633 53  http://seoss.co.uk/ +44-(0)1273-808309



* Re: Determining which spindle is out of order
  2010-11-03 14:13 Determining which spindle is out of order Nat Makarevitch
                   ` (3 preceding siblings ...)
  2010-11-03 15:29 ` Mikael Abrahamsson
@ 2010-11-03 21:54 ` Phil Turmel
  2010-11-03 22:26   ` Roman Mamedov
                     ` (2 more replies)
  4 siblings, 3 replies; 39+ messages in thread
From: Phil Turmel @ 2010-11-03 21:54 UTC (permalink / raw)
  To: Nat Makarevitch; +Cc: linux-raid

On 11/3/2010 2:13 PM, Nat Makarevitch wrote:
> Hi,
> 
> After a spindle (physical hard disk, a "drive") failure in a "md" RAID array,
> how can we know which spindle must be replaced?
> 
> We want to avoid extracting a working spindle by mistakenly thinking it is the
> faulty one...

I wrote a little script that would tell me device name and serial number for each host port on my motherboard, along with anything else that lists a scsi host in sysfs.  Output like so:

Controller device @ pci0000:00/0000:00:1c.1/0000:06:00.0 [ahci]
  RAID bus controller: Marvell Technology Group Ltd. 88SE6145 SATA II PCI-E controller (rev a1)
    host4: [Empty]
    host5: /dev/sdd ATA WDC WD5000AAKS-7 {SN: WD-WMAWF1370668}
    host6: [Empty]
    host7: [Empty]
    host8: [Empty]
Controller device @ pci0000:00/0000:00:1f.1 [ata_piix]
  IDE interface: Intel Corporation 82801G (ICH7 Family) IDE Controller (rev 01)
    host9: [Empty]
    host10: [Empty]
Controller device @ pci0000:00/0000:00:1f.2 [ahci]
  SATA controller: Intel Corporation 82801GR/GH (ICH7 Family) SATA AHCI Controller (rev 01)
    host0: /dev/sda ATA ST31000333AS {SN: 9TE1LTW0}
    host1: /dev/sdb ATA ST31000333AS {SN: 9TE1MAJT}
    host2: /dev/sdc ATA ST31000333AS {SN: 9TE1MV1R}
    host3: /dev/sr0 HL-DT-ST BD-RE GBW-H20L

Shows me my empty ports, too.  As long as I keep my cables straight to my hot-swap bays, getting the right drive is a snap.

HTH,

Phil

I hereby release the following script into the public domain:

#! /bin/bash
#
# Examine specific system host devices to identify the drives attached
#

function describe_controller () {
        unset SUBSYSTEM PCI_ID DEVICE PCI_SLOT_NAME Manufacturer Product Serial
        eval `udevadm info --query=all --path="$1" |grep '^E: ' |cut -c 4-`
        echo "Controller device @ ${1##/sys/devices/} [$DRIVER]"
        if [[ -n "$PCI_SLOT_NAME" ]] ; then
                echo -e "  `lspci -s $PCI_SLOT_NAME |cut -d\  -f2-`"
        fi
        if [[ "${MODALIAS:0:4}" == "usb:" ]] ; then
                eval `lsusb -D ${DEVICE/proc/dev/} |sed -r -n -e 's% *i(Manufacturer|Product|Serial) +[0-9]+ +(.+) *$%\1="\2"%;tFND;b;:FND;p'`
                echo -e "  [$Manufacturer] $Product {SN: $Serial}"
        fi
}

function describe_device () {
        targ=${1%/block/*}
        vnd="`cat $targ/vendor`"
        mdl=`cat $targ/model`
        rdev=`readlink -f "$1"`
        if [[ -d $rdev ]] ; then
                bdev="`basename $rdev`"
                sn="`sginfo -s /dev/$bdev |sed -r -n -e \"/Serial Number/{s%^.+' *(.+) *'.*\\\$%\\\\1%;p;q}\"`" &>/dev/null
                if [[ -n "$sn" ]] ; then
                        echo -e "    $bhost: `echo /dev/$bdev $vnd $mdl {SN: $sn}`"
                else
                        echo -e "    $bhost: `echo /dev/$bdev $vnd $mdl`"
                fi
        else
                echo -e "    $bhost: Unknown $rdev"
        fi
}

function check_host () {
        local found=0
        local pController=
        while read shost ; do
                host=`dirname "$shost"`
                controller=`dirname "$host"`
                bhost=`basename "$host"`
                if [[ "$controller" != "$pController" ]] ; then
                        pController="$controller"
                        describe_controller "$controller"
                fi
                for dev in $host/target*/*/block/* ; do
                        if [[ "${dev: -1}" == '*' ]] ; then
                                echo -e "    $bhost: [Empty]"
                        else
                                describe_device "$dev"
                        fi
                done
        done
}

find /sys/devices/ -name scsi_host |check_host



* Re: Determining which spindle is out of order
  2010-11-03 21:54 ` Phil Turmel
@ 2010-11-03 22:26   ` Roman Mamedov
  2010-11-04  9:29   ` Tom Carlson
  2010-11-06 10:22   ` Leslie Rhorer
  2 siblings, 0 replies; 39+ messages in thread
From: Roman Mamedov @ 2010-11-03 22:26 UTC (permalink / raw)
  To: Phil Turmel; +Cc: Nat Makarevitch, linux-raid

On Wed, 03 Nov 2010 21:54:54 +0000
Phil Turmel <philip@turmel.org> wrote:

> I wrote a little script that would tell me device name and serial number for
> each host port on my motherboard, along with anything else that lists a scsi
> host in sysfs.

Thanks for the great script. However it was partially failing for me at first,
with error messages like "Corporation: command not found" or "Technology:
command not found", failing to get most of the data properly. Turns out my
"udevadm info" output for my controllers contains non-quoted values with
spaces, which do not "eval" correctly:

E: ID_VENDOR_FROM_DATABASE=nVidia Corporation
E: ID_VENDOR_FROM_DATABASE=JMicron Technology Corp.

As you do not use any of those further in the function anyway, I just added |
'grep -v " "' after "cut", and it worked fine afterwards.
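
Another way around the problem is not to eval the output at all and to pick out only the properties that are needed. A minimal sketch, assuming the "E: KEY=value" line format shown above; udev_prop is a hypothetical helper name and is not part of the script:

```shell
# Print the value of one property from "udevadm info --query=all" style
# output on stdin, without passing anything through eval, so values with
# spaces, quotes or dollar signs are returned untouched.
udev_prop () {
        local key="$1" line
        while IFS= read -r line ; do
                case $line in
                        "E: $key="*)
                                printf '%s\n' "${line#"E: $key="}"
                                return 0 ;;
                esac
        done
        return 1
}
```

Usage would look like DRIVER=$(udevadm info --query=all --path="$1" | udev_prop DRIVER), at the cost of one udevadm call per property.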

-- 
With respect,
Roman


* Re: Determining which spindle is out of order
  2010-11-03 21:54 ` Phil Turmel
  2010-11-03 22:26   ` Roman Mamedov
@ 2010-11-04  9:29   ` Tom Carlson
  2010-11-06 10:22   ` Leslie Rhorer
  2 siblings, 0 replies; 39+ messages in thread
From: Tom Carlson @ 2010-11-04  9:29 UTC (permalink / raw)
  To: Phil Turmel; +Cc: Nat Makarevitch, linux-raid

Really great little script there -- thank you! I have saved for use in
my (Currently offline) fileserver :-)

Thanks,

T



* RE: Determining which spindle is out of order
  2010-11-03 21:54 ` Phil Turmel
  2010-11-03 22:26   ` Roman Mamedov
  2010-11-04  9:29   ` Tom Carlson
@ 2010-11-06 10:22   ` Leslie Rhorer
  2010-11-06 15:12     ` Phil Turmel
  2 siblings, 1 reply; 39+ messages in thread
From: Leslie Rhorer @ 2010-11-06 10:22 UTC (permalink / raw)
  To: 'Phil Turmel'; +Cc: linux-raid



	I haven't had a chance to dig into the script, but it doesn't
produce any output when I run it on one of my servers, and on the other one
it produces errors on line 7, but otherwise seems to work.



* Re: Determining which spindle is out of order
  2010-11-06 10:22   ` Leslie Rhorer
@ 2010-11-06 15:12     ` Phil Turmel
       [not found]       ` <4CD57867.4010207@anonymous.org.uk>
                         ` (2 more replies)
  0 siblings, 3 replies; 39+ messages in thread
From: Phil Turmel @ 2010-11-06 15:12 UTC (permalink / raw)
  To: Leslie Rhorer; +Cc: linux-raid

On 11/06/2010 06:22 AM, Leslie Rhorer wrote:
> 	I haven't had a chance to dig into the script, but it doesn't
> produce any output when I run it on one of my servers, and on the other one
> it produces errors on line 7, but otherwise seems to work.
> 

Thanks for the feedback.  The script only looks in sysfs for controllers
implementing the scsi_host interface.  So it won't pick up anything using
the legacy IDE interface.  If that's not the case on the first server, I'd
like to see lspci -vvv for the controller in question.

As for the errors on line #7, that's likely to be the whitespace problem that
Roman pointed out.  Based on his comment, I've adjusted the script to be more
robust (a little faster, too).  Please give it a try:

#! /bin/bash
#
# Examine specific system host devices to identify the drives attached
#

function describe_controller () {
	unset SUBSYSTEM PCI_ID DEVICE PCI_SLOT_NAME Manufacturer Product Serial
	eval `udevadm info --query=all --path="$1" | \
		sed -rn -e 's/^E: (\w+)=(.+)$/\1="\2"/;T;p'`
	echo "Controller device @ ${1##/sys/devices/} [$DRIVER]"
	if [[ -n "$PCI_SLOT_NAME" ]] ; then
		echo -e "  `lspci -s $PCI_SLOT_NAME |cut -d\  -f2-`"
		return
	fi
	if [[ "${MODALIAS:0:4}" == "usb:" ]] ; then
		eval `lsusb -D ${DEVICE/proc/dev/} | \
			sed -rn -e 's% *i(Manufacturer|Product|Serial) +[0-9]+ +(.+) *$%\1="\2"%;T;p'`
		echo -e "  [$Manufacturer] $Product {SN: $Serial}"
		return
	fi
	echo -e "  $SUBSYSTEM $MODALIAS"
}

function describe_device () {
	targ=${1%/block/*}
	vnd="`cat $targ/vendor`"
	mdl=`cat $targ/model`
	rdev=`readlink -f "$1"`
	if [[ -d $rdev ]] ; then
		bdev="`basename $rdev`"
		sn="`sginfo -s /dev/$bdev | \
			sed -rn -e \"/Serial Number/{s%^.+' *(.+) *'.*\\\$%\\\\1%;p;q}\"`" &>/dev/null
		if [[ -n "$sn" ]] ; then
			echo -e "    $bhost: `echo /dev/$bdev $vnd $mdl {SN: $sn}`"
		else
			echo -e "    $bhost: `echo /dev/$bdev $vnd $mdl`"
		fi
	else
		echo -e "    $bhost: Unknown $rdev"
	fi
}

function check_host () {
	local found=0
	local pController=
	while read shost ; do
		host=`dirname "$shost"`
		controller=`dirname "$host"`
		bhost=`basename "$host"`
		if [[ "$controller" != "$pController" ]] ; then
			pController="$controller"
			describe_controller "$controller"
		fi
		for dev in $host/target*/*/block/* ; do
			if [[ "${dev: -1}" == '*' ]] ; then
				echo -e "    $bhost: [Empty]"
			else
				describe_device "$dev"
			fi
		done
	done
}

find /sys/devices/ -name scsi_host |check_host


* Re: Determining which spindle is out of order
       [not found]       ` <4CD57867.4010207@anonymous.org.uk>
@ 2010-11-06 16:02         ` Phil Turmel
  2010-11-06 16:11           ` Mathias Burén
                             ` (2 more replies)
  0 siblings, 3 replies; 39+ messages in thread
From: Phil Turmel @ 2010-11-06 16:02 UTC (permalink / raw)
  To: John Robinson; +Cc: linux-raid

[added linux-raid CC: back in]

On 11/06/2010 11:46 AM, John Robinson wrote:
> On 06/11/2010 15:12, Phil Turmel wrote:
> [...]
>> Thanks for the feedback.  The script only looks in sysfs for controllers
>> implementing the scsi_host interface.  So it won't pick up anything using
>> the legacy IDE interface.  If that's not the case on the first server, I'd
>> like to see lspci -vvv for the controller in question.
> 
> I get no output on my CentOS 5, kernel-xen-2.6.18-194.8.1.el5.centos.plus, much the same as the CentOS/RHEL kernel-xen-2.6.18-194.8.1.el5.
> 
> Here's my lspci -vvv for my storage/SCSI devices:

[snip /]

> [...]
>> find /sys/devices/ -name scsi_host |check_host
> 
> This may be the culprit, this find command finds nothing, but I think my devices still support the sysfs scsi_host interface:
> 
> [root@beast ~]# find /sys/devices/ -name scsi_host
> [root@beast ~]# find /sys/devices/ -name *scsi_host*
> /sys/devices/pci0000:00/0000:00:1f.2/host7/scsi_host:host7
> /sys/devices/pci0000:00/0000:00:1f.2/host6/scsi_host:host6
> /sys/devices/pci0000:00/0000:00:1f.2/host5/scsi_host:host5
> /sys/devices/pci0000:00/0000:00:1f.2/host4/scsi_host:host4
> /sys/devices/pci0000:00/0000:00:1f.2/host3/scsi_host:host3
> /sys/devices/pci0000:00/0000:00:1f.2/host2/scsi_host:host2
> /sys/devices/pci0000:00/0000:00:1e.0/0000:05:01.1/host9/scsi_host:host9
> /sys/devices/pci0000:00/0000:00:1e.0/0000:05:01.0/host8/scsi_host:host8
> /sys/devices/pci0000:00/0000:00:1c.4/0000:03:00.0/host1/scsi_host:host1
> /sys/devices/pci0000:00/0000:00:1c.4/0000:03:00.0/host0/scsi_host:host0
> [root@beast ~]#
> 
> When I change the script to use my find command, I get:
> 
> [root@beast ~]# ~john/projects/describe_scsi/describe_scsi
> /home/john/projects/describe_scsi/describe_scsi: line 8: udevadm: command not found
> Controller device @ pci0000:00/0000:00:1f.2 []
>     host7: [Empty]
>     host6: [Empty]
>     host5: [Empty]
>     host4: [Empty]
>     host3: [Empty]
>     host2: [Empty]
> /home/john/projects/describe_scsi/describe_scsi: line 8: udevadm: command not found
> Controller device @ pci0000:00/0000:00:1e.0/0000:05:01.1 []
>     host9: [Empty]
> /home/john/projects/describe_scsi/describe_scsi: line 8: udevadm: command not found
> Controller device @ pci0000:00/0000:00:1e.0/0000:05:01.0 []
>     host8: [Empty]
> /home/john/projects/describe_scsi/describe_scsi: line 8: udevadm: command not found
> Controller device @ pci0000:00/0000:00:1c.4/0000:03:00.0 []
>     host1: [Empty]
>     host0: [Empty]
> [root@beast ~]#

Indeed.  The sysfs layout changed since kernel 2.6.18.  I'm guessing the use of
CONFIG_SYSFS_DEPRECATED and/or CONFIG_SYSFS_DEPRECATED_V2 will interfere with my
script in current kernels.

I'll poke around in one of my VMs when I get a chance.
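
A rough sketch of how the lookup might tolerate both layouts, going by John's
find output above.  The function names are made up for illustration, and the
"block:sdX" link name on older kernels is an assumption that has not been
checked against 2.6.18:

```shell
# List SCSI host directories under a sysfs root, accepting both the newer
# "hostN/scsi_host" entries and the older "hostN/scsi_host:hostN" ones.
# The root is a parameter so the function can be tried on a scratch tree.
list_scsi_hosts () {
	find "${1:-/sys/devices}" \( -name scsi_host -o -name 'scsi_host:host*' \) 2>/dev/null |
		while read -r entry ; do
			dirname "$entry"
		done | sort -u
}

# Print the block device names found below one host directory, accepting
# ".../block/sdX" (newer layout) and ".../block:sdX" (older, assumed).
list_host_disks () {
	for dev in "$1"/target*/*/block/* "$1"/target*/*/block:* ; do
		# An unmatched glob stays literal and does not exist: skip it.
		[ -e "$dev" ] || [ -L "$dev" ] || continue
		dev="${dev##*/}"		# "sda" or "block:sda"
		echo "${dev#block:}"
	done
}
```

Something like "list_scsi_hosts | while read -r h ; do list_host_disks "$h" ; done"
would then walk every host, whichever layout the kernel exposes.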

> Now I need to find udevadm I guess. It must have been introduced since the udev version that comes with RHEL/CentOS 5, which is udev-095-14.21.el5_5.1. rpmfind.net suggests it's only been in since version 118 or so. Never mind :-)

Heh.  Anyone know the equivalent command in earlier versions of udev?
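
Older udev releases shipped a standalone "udevinfo" tool, which looks like the
closest equivalent.  A minimal sketch, assuming its short -q/-p options behave
like those of "udevadm info" and that it accepts a full /sys path (neither has
been checked against udev 095); udev_query_all is a made-up name:

```shell
# Query all udev properties for a sysfs path with whichever tool is present:
# "udevadm info" on newer udev, or the standalone "udevinfo" on older ones.
udev_query_all () {
	if command -v udevadm >/dev/null 2>&1 ; then
		udevadm info -q all -p "$1"
	elif command -v udevinfo >/dev/null 2>&1 ; then
		udevinfo -q all -p "$1"
	else
		echo "no udev query tool found" >&2
		return 1
	fi
}
```

The script's eval line could then call udev_query_all "$1" in place of the
udevadm command, provided the older tool prints the same "E: KEY=value" lines.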

> 
> Cheers,
> 
> John.
> 

Regards,

Phil


* Re: Determining which spindle is out of order
  2010-11-06 16:02         ` Phil Turmel
@ 2010-11-06 16:11           ` Mathias Burén
  2010-11-06 16:45           ` Jan Ceuleers
  2010-11-07 12:53           ` John Robinson
  2 siblings, 0 replies; 39+ messages in thread
From: Mathias Burén @ 2010-11-06 16:11 UTC (permalink / raw)
  To: Phil Turmel; +Cc: John Robinson, linux-raid

On 6 November 2010 16:02, Phil Turmel <philip@turmel.org> wrote:
> [added linux-raid CC: back in]
>
> On 11/06/2010 11:46 AM, John Robinson wrote:
>> On 06/11/2010 15:12, Phil Turmel wrote:
>> [...]
>>> Thanks for the feedback.  The script only looks in sysfs for controllers
>>> implementing the scsi_host interface.  So it won't pick up anything using
>>> the legacy IDE interface.  If that's not the case on the first server, I'd
>>> like to see lspci -vvv for the controller in question.
>>
>> I get no output on my CentOS 5, kernel-xen-2.6.18-194.8.1.el5.centos.plus, much the same as the CentOS/RHEL kernel-xen-2.6.18-194.8.1.el5.
>>
>> Here's my lspci -vvv for my storage/SCSI devices:
>
> [snip /]
>
>> [...]
>>> find /sys/devices/ -name scsi_host |check_host
>>
>> This may be the culprit, this find command finds nothing, but I think my devices still support the sysfs scsi_host interface:
>>
>> [root@beast ~]# find /sys/devices/ -name scsi_host
>> [root@beast ~]# find /sys/devices/ -name *scsi_host*
>> /sys/devices/pci0000:00/0000:00:1f.2/host7/scsi_host:host7
>> /sys/devices/pci0000:00/0000:00:1f.2/host6/scsi_host:host6
>> /sys/devices/pci0000:00/0000:00:1f.2/host5/scsi_host:host5
>> /sys/devices/pci0000:00/0000:00:1f.2/host4/scsi_host:host4
>> /sys/devices/pci0000:00/0000:00:1f.2/host3/scsi_host:host3
>> /sys/devices/pci0000:00/0000:00:1f.2/host2/scsi_host:host2
>> /sys/devices/pci0000:00/0000:00:1e.0/0000:05:01.1/host9/scsi_host:host9
>> /sys/devices/pci0000:00/0000:00:1e.0/0000:05:01.0/host8/scsi_host:host8
>> /sys/devices/pci0000:00/0000:00:1c.4/0000:03:00.0/host1/scsi_host:host1
>> /sys/devices/pci0000:00/0000:00:1c.4/0000:03:00.0/host0/scsi_host:host0
>> [root@beast ~]#
>>
>> When I change the script to use my find command, I get:
>>
>> [root@beast ~]# ~john/projects/describe_scsi/describe_scsi
>> /home/john/projects/describe_scsi/describe_scsi: line 8: udevadm: command not found
>> Controller device @ pci0000:00/0000:00:1f.2 []
>>     host7: [Empty]
>>     host6: [Empty]
>>     host5: [Empty]
>>     host4: [Empty]
>>     host3: [Empty]
>>     host2: [Empty]
>> /home/john/projects/describe_scsi/describe_scsi: line 8: udevadm: command not found
>> Controller device @ pci0000:00/0000:00:1e.0/0000:05:01.1 []
>>     host9: [Empty]
>> /home/john/projects/describe_scsi/describe_scsi: line 8: udevadm: command not found
>> Controller device @ pci0000:00/0000:00:1e.0/0000:05:01.0 []
>>     host8: [Empty]
>> /home/john/projects/describe_scsi/describe_scsi: line 8: udevadm: command not found
>> Controller device @ pci0000:00/0000:00:1c.4/0000:03:00.0 []
>>     host1: [Empty]
>>     host0: [Empty]
>> [root@beast ~]#
>
> Indeed.  The sysfs layout changed since kernel 2.6.18.  I'm guessing the use of
> CONFIG_SYSFS_DEPRECATED and/or CONFIG_SYSFS_DEPRECATED_V2 will interfere with my
> script in current kernels.
>
> I'll poke around in one of my VMs when I get a chance.
>
>> Now I need to find udevadm I guess. It must have been introduced since the udev version that comes with RHEL/CentOS 5, which is udev-095-14.21.el5_5.1. rpmfind.net suggests it's only been in since version 118 or so. Never mind :-)
>
> Heh.  Anyone know the equivalent command in earlier versions of udev?
>
>>
>> Cheers,
>>
>> John.
>>
>
> Regards,
>
> Phil
> --
> To unsubscribe from this list: send the line "unsubscribe linux-raid" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>

Hi, just tested the script (nice!) and it seems to work fine on 2.6.36
(Archlinux) 64-bit:


[fackamato@ion bin]$ sudo ./drivescan.sh
Controller device @ pci0000:00/0000:00:0b.0 [ahci]
  SATA controller: nVidia Corporation MCP79 AHCI Controller (rev b1)
    host0: /dev/sda ATA Corsair CSSD-F60
    host1: /dev/sdb ATA WDC WD20EARS-00M
    host2: /dev/sdc ATA WDC WD20EARS-00M
    host3: /dev/sdd ATA WDC WD20EARS-00M
    host4: [Empty]
    host5: [Empty]
Controller device @ pci0000:00/0000:00:16.0/0000:05:00.0 [sata_mv]
  SCSI storage controller: HighPoint Technologies, Inc. RocketRAID 230x 4 Port SATA-II Controller (rev 02)
    host6: [Empty]
    host7: /dev/sde ATA SAMSUNG HD204UI
    host8: /dev/sdf ATA WDC WD20EARS-00M
    host9: /dev/sdg ATA SAMSUNG HD204UI


* Re: Determining which spindle is out of order
  2010-11-06 16:02         ` Phil Turmel
  2010-11-06 16:11           ` Mathias Burén
@ 2010-11-06 16:45           ` Jan Ceuleers
  2010-11-06 19:39             ` Phil Turmel
  2010-11-07 12:53           ` John Robinson
  2 siblings, 1 reply; 39+ messages in thread
From: Jan Ceuleers @ 2010-11-06 16:45 UTC (permalink / raw)
  To: linux-raid

On 06/11/10 17:02, Phil Turmel wrote:
> Indeed.  The sysfs layout changed since kernel 2.6.18.  I'm guessing the use of
> CONFIG_SYSFS_DEPRECATED and/or CONFIG_SYSFS_DEPRECATED_V2 will interfere with my
> script in current kernels.

Nice.

Output on one of my machines is however not as expected:

Controller device @ pci0000:00/0000:00:1c.1/0000:02:00.0 [ahci]
   SATA controller: JMicron Technology Corp. JMB360 AHCI Controller (rev 02)
     host4: [Empty]
Controller device @ pci0000:00/0000:00:1f.2 [ata_piix]
   IDE interface: Intel Corporation 5 Series/3400 Series Chipset 4 port SATA IDE Controller (rev 06)
     host0: /dev/sda ATA WDC WD20EADS-00R {SN: WD-WCAVY4080404}
     host1: /dev/sdb ATA ST3500418AS {SN: 9VMK33L9}
     host1: /dev/sdc ATA ST3500418AS {SN: 9VMM6EY4}
Controller device @ pci0000:00/0000:00:1f.5 [ata_piix]
   IDE interface: Intel Corporation 5 Series/3400 Series Chipset 2 port SATA IDE Controller (rev 06)
     host2: [Empty]
     host3: [Empty]

This machine has seven SATA ports: one provided by the JMicron chip, the 
other six by the Intel H55 south bridge. Only three ports are currently 
used, but I had expected another [Empty] entry.

Here's what's in /sys/devices:

root@zotac:~# find /sys/devices/ -name scsi_host
/sys/devices/pci0000:00/0000:00:1c.1/0000:02:00.0/host4/scsi_host
/sys/devices/pci0000:00/0000:00:1f.2/host0/scsi_host
/sys/devices/pci0000:00/0000:00:1f.2/host1/scsi_host
/sys/devices/pci0000:00/0000:00:1f.5/host2/scsi_host
/sys/devices/pci0000:00/0000:00:1f.5/host3/scsi_host

Not sure what to make of that...

Jan


* Re: Determining which spindle is out of order
  2010-11-06 16:45           ` Jan Ceuleers
@ 2010-11-06 19:39             ` Phil Turmel
  2010-11-06 20:16               ` Leslie Rhorer
                                 ` (2 more replies)
  0 siblings, 3 replies; 39+ messages in thread
From: Phil Turmel @ 2010-11-06 19:39 UTC (permalink / raw)
  To: Jan Ceuleers; +Cc: linux-raid

On 11/06/2010 12:45 PM, Jan Ceuleers wrote:
> On 06/11/10 17:02, Phil Turmel wrote:
>> Indeed.  The sysfs layout changed since kernel 2.6.18.  I'm guessing the use of
>> CONFIG_SYSFS_DEPRECATED and/or CONFIG_SYSFS_DEPRECATED_V2 will interfere with my
>> script in current kernels.
> 
> Nice.
> 
> Output on one of my machines is however not as expected:
> 
> Controller device @ pci0000:00/0000:00:1c.1/0000:02:00.0 [ahci]
>   SATA controller: JMicron Technology Corp. JMB360 AHCI Controller (rev 02)
>     host4: [Empty]
> Controller device @ pci0000:00/0000:00:1f.2 [ata_piix]
>   IDE interface: Intel Corporation 5 Series/3400 Series Chipset 4 port SATA IDE Controller (rev 06)
>     host0: /dev/sda ATA WDC WD20EADS-00R {SN: WD-WCAVY4080404}
>     host1: /dev/sdb ATA ST3500418AS {SN: 9VMK33L9}
>     host1: /dev/sdc ATA ST3500418AS {SN: 9VMM6EY4}
> Controller device @ pci0000:00/0000:00:1f.5 [ata_piix]
>   IDE interface: Intel Corporation 5 Series/3400 Series Chipset 2 port SATA IDE Controller (rev 06)
>     host2: [Empty]
>     host3: [Empty]
> 
> This machine has seven SATA ports: one provided by the JMicron chip, the other six by the Intel H55 south bridge. Only three ports are currently used, but I had expected another [Empty] entry.
> 
> Here's what's in /sys/devices:
> 
> root@zotac:~# find /sys/devices/ -name scsi_host
> /sys/devices/pci0000:00/0000:00:1c.1/0000:02:00.0/host4/scsi_host
> /sys/devices/pci0000:00/0000:00:1f.2/host0/scsi_host
> /sys/devices/pci0000:00/0000:00:1f.2/host1/scsi_host
> /sys/devices/pci0000:00/0000:00:1f.5/host2/scsi_host
> /sys/devices/pci0000:00/0000:00:1f.5/host3/scsi_host
> 
> Not sure what to make of that...

I'm guessing it's an artifact of IDE compatibility mode.  You can see host1 reports two drives, and my script is only expecting one.  Master vs. slave emulation, perhaps?  Can you check your BIOS for legacy IDE vs. AHCI mode setting?

I suspect my script will have similar problems with port multipliers.
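
One way the output could tell such drives apart is to print the SCSI address
(host:channel:target:lun) taken from the sysfs path next to the host name.  A
small sketch with a made-up helper name, describe_slot; the example path below
is assumed from the usual sysfs layout rather than copied from Jan's machine:

```shell
# Given a sysfs block path such as
#   .../host1/target1:0:1/1:0:1:0/block/sdc
# print the host, the full SCSI address and the device node.
describe_slot () {
	hctl="${1%/block/*}"	# strip the trailing "/block/sdX"
	hctl="${hctl##*/}"	# keep only "H:C:T:L"
	echo "host${hctl%%:*} [$hctl] /dev/${1##*/}"
}
```

Two drives behind the same host then show up with different addresses, for
instance "host1 [1:0:0:0]" and "host1 [1:0:1:0]", instead of two identical
"host1:" lines.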

Phil


* RE: Determining which spindle is out of order
  2010-11-06 15:12     ` Phil Turmel
       [not found]       ` <4CD57867.4010207@anonymous.org.uk>
@ 2010-11-06 19:58       ` Leslie Rhorer
  2010-11-06 21:17       ` John Robinson
  2 siblings, 0 replies; 39+ messages in thread
From: Leslie Rhorer @ 2010-11-06 19:58 UTC (permalink / raw)
  To: 'Phil Turmel'; +Cc: linux-raid



> -----Original Message-----
> From: linux-raid-owner@vger.kernel.org [mailto:linux-raid-
> owner@vger.kernel.org] On Behalf Of Phil Turmel
> Sent: Saturday, November 06, 2010 10:12 AM
> To: Leslie Rhorer
> Cc: linux-raid@vger.kernel.org
> Subject: Re: Determining which spindle is out of order
> 
> On 11/06/2010 06:22 AM, Leslie Rhorer wrote:
> >
> >
> >> -----Original Message-----
> >> From: linux-raid-owner@vger.kernel.org [mailto:linux-raid-
> >> owner@vger.kernel.org] On Behalf Of Phil Turmel
> >> Sent: Wednesday, November 03, 2010 4:55 PM
> >> To: Nat Makarevitch
> >> Cc: linux-raid@vger.kernel.org
> >> Subject: Re: Determining which spindle is out of order
> >>
> >> On 11/3/2010 2:13 PM, Nat Makarevitch wrote:
> >>> Hi,
> >>>
> >>> After a spindle (physical hard disk, a "drive") failure in a "md" RAID
> >> array,
> >>> how can we know which spindle must be replaced?
> >>>
> >>> We want to avoid extracting a working spindle by mistakenly thinking
> it
> >> is the
> >>> faulty one...
> >>
> >> I wrote a little script that would tell me device name and serial
> number
> >> for each host port on my motherboard, along with anything else that
> lists
> >> a scsi host in sysfs.  Output like so:
> >>
> >> Controller device @ pci0000:00/0000:00:1c.1/0000:06:00.0 [ahci]
> >>   RAID bus controller: Marvell Technology Group Ltd. 88SE6145 SATA II
> PCI-
> >> E controller (rev a1)
> >>     host4: [Empty]
> >>     host5: /dev/sdd ATA WDC WD5000AAKS-7 {SN: WD-WMAWF1370668}
> >>     host6: [Empty]
> >>     host7: [Empty]
> >>     host8: [Empty]
> >> Controller device @ pci0000:00/0000:00:1f.1 [ata_piix]
> >>   IDE interface: Intel Corporation 82801G (ICH7 Family) IDE Controller
> >> (rev 01)
> >>     host9: [Empty]
> >>     host10: [Empty]
> >> Controller device @ pci0000:00/0000:00:1f.2 [ahci]
> >>   SATA controller: Intel Corporation 82801GR/GH (ICH7 Family) SATA AHCI
> >> Controller (rev 01)
> >>     host0: /dev/sda ATA ST31000333AS {SN: 9TE1LTW0}
> >>     host1: /dev/sdb ATA ST31000333AS {SN: 9TE1MAJT}
> >>     host2: /dev/sdc ATA ST31000333AS {SN: 9TE1MV1R}
> >>     host3: /dev/sr0 HL-DT-ST BD-RE GBW-H20L
> >>
> >> Shows me my empty ports, too.  As long as I keep my cables straight to
> my
> >> hot-swap bays, getting the right drive is a snap.
> >
> > 	I haven't had a chance to dig into the script, but it doesn't
> > produce any output when I run it on one of my servers, and on the other
> one
> > it produces errors on line 7, but otherwise seems to work.
> >
> 
> Thanks for the feedback.  The script only looks in sysfs for controllers
> implementing the scsi_host interface.  So it won't pick up anything using
> the legacy IDE interface.  If that's not the case on the first server, I'd
> like to see lspci -vvv for the controller in question.

	It's the same controller on both servers.  The motherboards are
different, although both are Asus motherboards hosting AMD Athlon 64 x 2
CPUs.  At the moment they are running different kernels.  The one that is
working is running 2.6.32-3-amd64, and the one that is not is running
2.6.26-2-amd64.  Below is the result from lspci on the failing system.

06:00.0 RAID bus controller: Silicon Image, Inc. SiI 3124 PCI-X Serial ATA Controller (rev 02)
        Subsystem: Silicon Image, Inc. Device 7124
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping+ SERR- FastB2B- DisINTx-
        Status: Cap+ 66MHz+ UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 32
        Interrupt: pin A routed to IRQ 16
        Region 0: Memory at fdcff000 (64-bit, non-prefetchable) [size=128]
        Region 2: Memory at fdcf0000 (64-bit, non-prefetchable) [size=32K]
        Region 4: I/O ports at ac00 [size=16]
        Expansion ROM at fdc00000 [disabled] [size=512K]
        Capabilities: [64] Power Management version 2
                Flags: PMEClk- DSI+ D1+ D2+ AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
                Status: D0 PME-Enable- DSel=0 DScale=1 PME-
        Capabilities: [40] PCI-X non-bridge device
                Command: DPERE- ERO+ RBC=512 OST=12
                Status: Dev=06:00.0 64bit+ 133MHz+ SCD- USC- DC=simple DMMRBC=2048 DMOST=12 DMCRS=128 RSCEM- 266MHz- 533MHz-
        Capabilities: [54] Message Signalled Interrupts: Mask- 64bit+ Queue=0/0 Enable-
                Address: 0000000000000000  Data: 0000
        Kernel driver in use: sata_sil24
        Kernel modules: sata_sil24



* RE: Determining which spindle is out of order
  2010-11-06 19:39             ` Phil Turmel
@ 2010-11-06 20:16               ` Leslie Rhorer
  2010-11-06 20:23               ` Mr. James W. Laferriere
  2010-11-07  7:51               ` Jan Ceuleers
  2 siblings, 0 replies; 39+ messages in thread
From: Leslie Rhorer @ 2010-11-06 20:16 UTC (permalink / raw)
  To: 'Phil Turmel'; +Cc: linux-raid



> -----Original Message-----
> From: linux-raid-owner@vger.kernel.org [mailto:linux-raid-
> owner@vger.kernel.org] On Behalf Of Phil Turmel
> Sent: Saturday, November 06, 2010 2:39 PM
> To: Jan Ceuleers
> Cc: linux-raid@vger.kernel.org
> Subject: Re: Determining which spindle is out of order
> 
> On 11/06/2010 12:45 PM, Jan Ceuleers wrote:
> > On 06/11/10 17:02, Phil Turmel wrote:
> >> Indeed.  The sysfs layout changed since kernel 2.6.18.  I'm guessing
> the use of
> >> CONFIG_SYSFS_DEPRECATED and/or CONFIG_SYSFS_DEPRECATED_V2 will
> interfere with my
> >> script in current kernels.
> >
> > Nice.
> >
> > Output on one of my machines is however not as expected:
> >
> > Controller device @ pci0000:00/0000:00:1c.1/0000:02:00.0 [ahci]
> >   SATA controller: JMicron Technology Corp. JMB360 AHCI Controller (rev
> 02)
> >     host4: [Empty]
> > Controller device @ pci0000:00/0000:00:1f.2 [ata_piix]
> >   IDE interface: Intel Corporation 5 Series/3400 Series Chipset 4 port
> SATA IDE Controller (rev 06)
> >     host0: /dev/sda ATA WDC WD20EADS-00R {SN: WD-WCAVY4080404}
> >     host1: /dev/sdb ATA ST3500418AS {SN: 9VMK33L9}
> >     host1: /dev/sdc ATA ST3500418AS {SN: 9VMM6EY4}
> > Controller device @ pci0000:00/0000:00:1f.5 [ata_piix]
> >   IDE interface: Intel Corporation 5 Series/3400 Series Chipset 2 port
> SATA IDE Controller (rev 06)
> >     host2: [Empty]
> >     host3: [Empty]
> >
> > This machine has seven SATA ports: one provided by the JMicron chip, the
> other six by the Intel H55 south bridge. Only three ports are currently
> used, but I had expected another [Empty] entry.
> >
> > Here's what's in /sys/devices:
> >
> > root@zotac:~# find /sys/devices/ -name scsi_host
> > /sys/devices/pci0000:00/0000:00:1c.1/0000:02:00.0/host4/scsi_host
> > /sys/devices/pci0000:00/0000:00:1f.2/host0/scsi_host
> > /sys/devices/pci0000:00/0000:00:1f.2/host1/scsi_host
> > /sys/devices/pci0000:00/0000:00:1f.5/host2/scsi_host
> > /sys/devices/pci0000:00/0000:00:1f.5/host3/scsi_host
> >
> > Not sure what to make of that...
> 
> I'm guessing it's an artifact of IDE compatibility mode.  You can see
> host1 reports two drives, and my script is only expecting one.  Master vs.
> slave emulation, perhaps?  Can you check your BIOS for legacy IDE vs. AHCI
> mode setting?
> 
> I suspect my script will have similar problems with port multipliers.

	Both of my servers employ PMs, and it's working on one of them.
Actually, it's working on the PMs, but not the boot drives, which are hosted
off the motherboard.  They are IDE, so that's not too surprising.

Controller device @ pci0000:00/0000:00:02.0/0000:02:00.0/0000:03:00.0 [sata_sil24]
  Mass storage controller: Silicon Image, Inc. SiI 3124 PCI-X Serial ATA Controller (rev 02)
    host5: /dev/sda ATA WDC WD15EADS-00R {SN: WD-WCAVY0111608}
    host5: /dev/sdb ATA WDC WD15EADS-00R {SN: WD-WCAVY0123958}
    host5: /dev/sdc ATA WDC WD15EADS-00R {SN: WD-WCAUP0013905}
    host5: /dev/sdd ATA WDC WD15EADS-00P {SN: WD-WMAVU1944127}
    host7: /dev/sde ATA WDC WD15EARS-00Z {SN: WD-WMAVU2315499}
    host7: /dev/sdf ATA WDC WD15EARS-00Z {SN: WD-WMAVU1169621}
    host7: /dev/sdg ATA WDC WD15EARS-00Z {SN: WD-WMAVU1313973}
    host7: /dev/sdh ATA WDC WD15EADS-00P {SN: WD-WCAVU0303989}
    host9: /dev/sdi ATA WDC WD15EADS-00P {SN: WD-WMAVU1822645}
    host9: /dev/sdj ATA WDC WD15EARS-00Z {SN: WD-WMAVU2904844}
    host10: [Empty]
Controller device @ pci0000:00/0000:00:11.0 [ahci]
  SATA controller: ATI Technologies Inc SB700/SB800 SATA Controller [IDE mode]
    host0: [Empty]
    host1: [Empty]
    host2: [Empty]
    host3: [Empty]
Controller device @ pci0000:00/0000:00:14.4/0000:06:06.0 [sata_via]
  RAID bus controller: VIA Technologies, Inc. VT6421 IDE RAID Controller (rev 50)
    host4: [Empty]
    host6: [Empty]
    host8: /dev/sr0 LITE-ON DVD SOHD-167T



* Re: Determining which spindle is out of order
  2010-11-06 19:39             ` Phil Turmel
  2010-11-06 20:16               ` Leslie Rhorer
@ 2010-11-06 20:23               ` Mr. James W. Laferriere
  2010-11-07  7:51               ` Jan Ceuleers
  2 siblings, 0 replies; 39+ messages in thread
From: Mr. James W. Laferriere @ 2010-11-06 20:23 UTC (permalink / raw)
  To: Phil Turmel; +Cc: Jan Ceuleers, linux-raid

 	Hello Phil ,

On Sat, 6 Nov 2010, Phil Turmel wrote:
> On 11/06/2010 12:45 PM, Jan Ceuleers wrote:
>> On 06/11/10 17:02, Phil Turmel wrote:
>>> Indeed.  The sysfs layout changed since kernel 2.6.18.  I'm guessing the use of
>>> CONFIG_SYSFS_DEPRECATED and/or CONFIG_SYSFS_DEPRECATED_V2 will interfere with my
>>> script in current kernels.
>>
>> Nice.
>>
>> Output on one of my machines is however not as expected:
>>
>> Controller device @ pci0000:00/0000:00:1c.1/0000:02:00.0 [ahci]
>>   SATA controller: JMicron Technology Corp. JMB360 AHCI Controller (rev 02)
>>     host4: [Empty]
>> Controller device @ pci0000:00/0000:00:1f.2 [ata_piix]
>>   IDE interface: Intel Corporation 5 Series/3400 Series Chipset 4 port SATA IDE Controller (rev 06)
>>     host0: /dev/sda ATA WDC WD20EADS-00R {SN: WD-WCAVY4080404}
>>     host1: /dev/sdb ATA ST3500418AS {SN: 9VMK33L9}
>>     host1: /dev/sdc ATA ST3500418AS {SN: 9VMM6EY4}
>> Controller device @ pci0000:00/0000:00:1f.5 [ata_piix]
>>   IDE interface: Intel Corporation 5 Series/3400 Series Chipset 2 port SATA IDE Controller (rev 06)
>>     host2: [Empty]
>>     host3: [Empty]
>>
>> This machine has seven SATA ports: one provided by the JMicron chip, the other six by the Intel H55 south bridge. Only three ports are currently used, but I had expected another [Empty] entry.
>>
>> Here's what's in /sys/devices:
>>
>> root@zotac:~# find /sys/devices/ -name scsi_host
>> /sys/devices/pci0000:00/0000:00:1c.1/0000:02:00.0/host4/scsi_host
>> /sys/devices/pci0000:00/0000:00:1f.2/host0/scsi_host
>> /sys/devices/pci0000:00/0000:00:1f.2/host1/scsi_host
>> /sys/devices/pci0000:00/0000:00:1f.5/host2/scsi_host
>> /sys/devices/pci0000:00/0000:00:1f.5/host3/scsi_host
>>
>> Not sure what to make of that...
>
> I'm guessing it's an artifact of IDE compatibility mode.  You can see host1 reports two drives, and my script is only expecting one.  Master vs. slave emulation, perhaps?  Can you check your BIOS for legacy IDE vs. AHCI mode setting?
>
> I suspect my script will have similar problems with port multipliers.
>
> Phil
 	Can someone running a kernel close to Linux 2.6.30.6, or slightly older,
post (privately) their .config file so I can see which option I am missing?
I definitely have SCSI devices & hosts, but there are no 'scsi_host' entries
in the /sys file system.

For example:

  root@filesrv1:~ # for XXX in `lspci | grep -i scsi | awk '{print $1}'` ; do lspci -v -v -v -v -v -s ${XXX} ; done
00:01.0 SCSI storage controller: LSI Logic / Symbios Logic 53c1010 66MHz  Ultra3 SCSI Adapter (rev 01)
         Subsystem: LSI Logic / Symbios Logic LSI53C1000/1000R/1010R/1010-66 PCI to Ultra160 SCSI Controller
         Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV+ VGASnoop- ParErr- Stepping- SERR+ FastB2B-
         Status: Cap+ 66Mhz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
         Latency: 72 (4250ns min, 4500ns max), cache line size 08
         Interrupt: pin A routed to IRQ 29
         Region 0: I/O ports at c400 [size=256]
         Region 1: Memory at fe9ff800 (64-bit, non-prefetchable) [size=1K]
         Region 3: Memory at fe9f6000 (64-bit, non-prefetchable) [size=8K]
         Expansion ROM at fe9f0000 [disabled] [size=16K]
         Capabilities: [40] Power Management version 2
                 Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
                 Status: D0 PME-Enable- DSel=0 DScale=0 PME-

00:01.1 SCSI storage controller: LSI Logic / Symbios Logic 53c1010 66MHz  Ultra3 SCSI Adapter (rev 01)
         Subsystem: LSI Logic / Symbios Logic LSI53C1000/1000R/1010R/1010-66 PCI to Ultra160 SCSI Controller
         Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV+ VGASnoop- ParErr- Stepping- SERR+ FastB2B-
         Status: Cap+ 66Mhz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
         Latency: 72 (4250ns min, 4500ns max), cache line size 08
         Interrupt: pin B routed to IRQ 28
         Region 0: I/O ports at c800 [size=256]
         Region 1: Memory at fe9ffc00 (64-bit, non-prefetchable) [size=1K]
         Region 3: Memory at fe9fc000 (64-bit, non-prefetchable) [size=8K]
         Expansion ROM at fe9f8000 [disabled] [size=16K]
         Capabilities: [40] Power Management version 2
                 Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
                 Status: D0 PME-Enable- DSel=0 DScale=0 PME-

  root@filesrv1:~ # find /sys/devices/ -name scsi_host
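
Following Phil's guess earlier in the thread, one thing to look at is how the
two SYSFS_DEPRECATED options are set in the kernel configuration.  A minimal
sketch; sysfs_deprecated_settings is a made-up name, and whether these options
explain the missing entries is only a guess:

```shell
# Print the lines of a kernel .config that set (or explicitly leave unset)
# CONFIG_SYSFS_DEPRECATED and CONFIG_SYSFS_DEPRECATED_V2.
sysfs_deprecated_settings () {
	grep -E '^(# )?CONFIG_SYSFS_DEPRECATED(_V2)?[= ]' "$1"
}
```

Run it against the .config the kernel was built from, for example
"sysfs_deprecated_settings /usr/src/linux/.config" (that path is an
assumption).  Listing /sys/class/scsi_host is another quick way to see
whether the hosts are registered at all.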

 		Tia ,  JimL
-- 
+------------------------------------------------------------------+
| James   W.   Laferriere | System    Techniques | Give me VMS     |
| Network&System Engineer | 3237     Holden Road |  Give me Linux  |
| babydr@baby-dragons.com | Fairbanks, AK. 99709 |   only  on  AXP |
+------------------------------------------------------------------+


* Re: Determining which spindle is out of order
  2010-11-06 15:12     ` Phil Turmel
       [not found]       ` <4CD57867.4010207@anonymous.org.uk>
  2010-11-06 19:58       ` Leslie Rhorer
@ 2010-11-06 21:17       ` John Robinson
  2 siblings, 0 replies; 39+ messages in thread
From: John Robinson @ 2010-11-06 21:17 UTC (permalink / raw)
  To: Linux RAID

(Resending because I forgot to cc the list originally)

On 06/11/2010 15:12, Phil Turmel wrote:
[...]
> Thanks for the feedback.  The script only looks in sysfs for controllers
> implementing the scsi_host interface.  So it won't pick up anything using
> the legacy IDE interface.  If that's not the case on the first server, I'd
> like to see lspci -vvv for the controller in question.

I get no output on my CentOS 5, 
kernel-xen-2.6.18-194.8.1.el5.centos.plus, much the same as the 
CentOS/RHEL kernel-xen-2.6.18-194.8.1.el5.

Here's my lspci -vvv for my storage/SCSI devices:

00:1f.2 SATA controller: Intel Corporation 82801JI (ICH10 Family) SATA 
AHCI Controller (prog-if 01 [AHCI 1.0])
         Subsystem: ASUSTeK Computer Inc. P5Q Deluxe Motherboard
         Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- 
ParErr- Stepping- SERR- FastB2B-
         Status: Cap+ 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=medium 
 >TAbort- <TAbort- <MAbort- >SERR- <PERR-
         Latency: 0
         Interrupt: pin B routed to IRQ 251
         Region 0: I/O ports at 9c00 [size=8]
         Region 1: I/O ports at 9880 [size=4]
         Region 2: I/O ports at 9800 [size=8]
         Region 3: I/O ports at 9480 [size=4]
         Region 4: I/O ports at 9400 [size=32]
         Region 5: Memory at f5ffe800 (32-bit, non-prefetchable) [size=2K]
         Capabilities: [80] Message Signalled Interrupts: 64bit- 
Queue=0/4 Enable+
                 Address: fee0100c  Data: 4129
         Capabilities: [70] Power Management version 3
                 Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA 
PME(D0-,D1-,D2-,D3hot+,D3cold-)
                 Status: D0 PME-Enable- DSel=0 DScale=0 PME-
         Capabilities: [a8] #12 [0010]
         Capabilities: [b0] Vendor Specific Information

00:1f.3 SMBus: Intel Corporation 82801JI (ICH10 Family) SMBus Controller
         Subsystem: ASUSTeK Computer Inc. Unknown device 82d4
         Control: I/O+ Mem+ BusMaster- SpecCycle- MemWINV- VGASnoop- 
ParErr- Stepping- SERR- FastB2B-
         Status: Cap- 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium 
 >TAbort- <TAbort- <MAbort- >SERR- <PERR-
         Interrupt: pin C routed to IRQ 18
         Region 0: Memory at f5fff400 (64-bit, non-prefetchable) [size=256]
         Region 4: I/O ports at 0400 [size=32]

03:00.0 IDE interface: Marvell Technology Group Ltd. 88SE6121 SATA II 
Controller (rev b2) (prog-if 8f [Master SecP SecO PriP PriO])
         Subsystem: ASUSTeK Computer Inc. Unknown device 82e0
         Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- 
ParErr- Stepping- SERR- FastB2B-
         Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- 
<TAbort- <MAbort- >SERR- <PERR-
         Latency: 0, Cache Line Size: 32 bytes
         Interrupt: pin A routed to IRQ 16
         Region 0: I/O ports at dc00 [size=8]
         Region 1: I/O ports at d880 [size=4]
         Region 2: I/O ports at d800 [size=8]
         Region 3: I/O ports at d480 [size=4]
         Region 4: I/O ports at d400 [size=16]
         Region 5: Memory at fafffc00 (32-bit, non-prefetchable) [size=1K]
         Capabilities: [48] Power Management version 2
                 Flags: PMEClk- DSI- D1+ D2- AuxCurrent=0mA 
PME(D0+,D1+,D2-,D3hot+,D3cold-)
                 Status: D0 PME-Enable- DSel=0 DScale=1 PME-
         Capabilities: [50] Message Signalled Interrupts: 64bit- 
Queue=0/0 Enable-
                 Address: 00000000  Data: 0000
         Capabilities: [e0] Express Legacy Endpoint IRQ 0
                 Device: Supported: MaxPayload 128 bytes, PhantFunc 0, 
ExtTag-
                 Device: Latency L0s unlimited, L1 unlimited
                 Device: AtnBtn- AtnInd- PwrInd-
                 Device: Errors: Correctable- Non-Fatal- Fatal- Unsupported-
                 Device: RlxdOrd- ExtTag- PhantFunc- AuxPwr+ NoSnoop-
                 Device: MaxPayload 128 bytes, MaxReadReq 512 bytes
                 Link: Supported Speed 2.5Gb/s, Width x1, ASPM L0s, Port 0
                 Link: Latency L0s <256ns, L1 unlimited
                 Link: ASPM Disabled RCB 64 bytes CommClk- ExtSynch-
                 Link: Speed 2.5Gb/s, Width x1

05:01.0 SCSI storage controller: Adaptec AHA-3960D / AIC-7899A U160/m (rev 01)
         Subsystem: Compaq Computer Corporation Compaq 64-Bit/66MHz Dual Channel Wide Ultra3 SCSI Adapter
         Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV+ VGASnoop- ParErr- Stepping- SERR- FastB2B-
         Status: Cap+ 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
         Latency: 64 (10000ns min, 6250ns max), Cache Line Size: 32 bytes
         Interrupt: pin A routed to IRQ 17
         BIST result: 00
         Region 0: I/O ports at e400 [disabled] [size=256]
         Region 1: Memory at febfe000 (64-bit, non-prefetchable) [size=4K]
         Expansion ROM at e0000000 [disabled] [size=128K]
         Capabilities: [dc] Power Management version 2
                 Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
                 Status: D0 PME-Enable- DSel=0 DScale=0 PME-

05:01.1 SCSI storage controller: Adaptec AHA-3960D / AIC-7899A U160/m (rev 01)
         Subsystem: Compaq Computer Corporation Compaq 64-Bit/66MHz Dual Channel Wide Ultra3 SCSI Adapter
         Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV+ VGASnoop- ParErr- Stepping- SERR- FastB2B-
         Status: Cap+ 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
         Latency: 64 (10000ns min, 6250ns max), Cache Line Size: 32 bytes
         Interrupt: pin B routed to IRQ 18
         BIST result: 00
         Region 0: I/O ports at e800 [disabled] [size=256]
         Region 1: Memory at febff000 (64-bit, non-prefetchable) [size=4K]
         Expansion ROM at e0020000 [disabled] [size=128K]
         Capabilities: [dc] Power Management version 2
                 Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
                 Status: D0 PME-Enable- DSel=0 DScale=0 PME-


[...]
> find /sys/devices/ -name scsi_host |check_host

This may be the culprit, this find command finds nothing, but I think my 
devices still support the sysfs scsi_host interface:

[root@beast ~]# find /sys/devices/ -name scsi_host
[root@beast ~]# find /sys/devices/ -name *scsi_host*
/sys/devices/pci0000:00/0000:00:1f.2/host7/scsi_host:host7
/sys/devices/pci0000:00/0000:00:1f.2/host6/scsi_host:host6
/sys/devices/pci0000:00/0000:00:1f.2/host5/scsi_host:host5
/sys/devices/pci0000:00/0000:00:1f.2/host4/scsi_host:host4
/sys/devices/pci0000:00/0000:00:1f.2/host3/scsi_host:host3
/sys/devices/pci0000:00/0000:00:1f.2/host2/scsi_host:host2
/sys/devices/pci0000:00/0000:00:1e.0/0000:05:01.1/host9/scsi_host:host9
/sys/devices/pci0000:00/0000:00:1e.0/0000:05:01.0/host8/scsi_host:host8
/sys/devices/pci0000:00/0000:00:1c.4/0000:03:00.0/host1/scsi_host:host1
/sys/devices/pci0000:00/0000:00:1c.4/0000:03:00.0/host0/scsi_host:host0
[root@beast ~]#
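
As an aside, the unquoted *scsi_host* above only works as long as nothing in the current directory matches the pattern; otherwise the shell expands it before find ever sees it. A minimal sketch of a lookup that quotes the pattern and accepts both spellings (plain scsi_host, as the script expects, and scsi_host:hostN, as on this box). The name find_scsi_hosts is made up for illustration:

```shell
#!/bin/bash
# Hypothetical helper: list scsi_host entries below a sysfs root.
# Matches both a plain "scsi_host" entry and the older "scsi_host:hostN"
# names. The patterns are quoted so the shell hands them to find untouched.
find_scsi_hosts () {
	find "$1" \( -name 'scsi_host' -o -name 'scsi_host:*' \)
}

# Usage: find_scsi_hosts /sys/devices/
```

The output could then be piped into check_host in place of the original find call, though the rest of the script may still need adjusting for the older sysfs layout.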

When I change the script to use my find command, I get:

[root@beast ~]# ~john/projects/describe_scsi/describe_scsi
/home/john/projects/describe_scsi/describe_scsi: line 8: udevadm: command not found
Controller device @ pci0000:00/0000:00:1f.2 []
     host7: [Empty]
     host6: [Empty]
     host5: [Empty]
     host4: [Empty]
     host3: [Empty]
     host2: [Empty]
/home/john/projects/describe_scsi/describe_scsi: line 8: udevadm: command not found
Controller device @ pci0000:00/0000:00:1e.0/0000:05:01.1 []
     host9: [Empty]
/home/john/projects/describe_scsi/describe_scsi: line 8: udevadm: command not found
Controller device @ pci0000:00/0000:00:1e.0/0000:05:01.0 []
     host8: [Empty]
/home/john/projects/describe_scsi/describe_scsi: line 8: udevadm: command not found
Controller device @ pci0000:00/0000:00:1c.4/0000:03:00.0 []
     host1: [Empty]
     host0: [Empty]
[root@beast ~]#

Now I need to find udevadm I guess. It must have been introduced since 
the udev version that comes with RHEL/CentOS 5, which is 
udev-095-14.21.el5_5.1. rpmfind.net suggests it's only been in since 
version 118 or so. Never mind :-)

Cheers,

John.


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: Determining which spindle is out of order
  2010-11-06 19:39             ` Phil Turmel
  2010-11-06 20:16               ` Leslie Rhorer
  2010-11-06 20:23               ` Mr. James W. Laferriere
@ 2010-11-07  7:51               ` Jan Ceuleers
  2 siblings, 0 replies; 39+ messages in thread
From: Jan Ceuleers @ 2010-11-07  7:51 UTC (permalink / raw)
  To: Phil Turmel; +Cc: linux-raid

On 06/11/10 20:39, Phil Turmel wrote:
> On 11/06/2010 12:45 PM, Jan Ceuleers wrote:
>> Controller device @ pci0000:00/0000:00:1c.1/0000:02:00.0 [ahci]
>>    SATA controller: JMicron Technology Corp. JMB360 AHCI Controller (rev 02)
>>      host4: [Empty]
>> Controller device @ pci0000:00/0000:00:1f.2 [ata_piix]
>>    IDE interface: Intel Corporation 5 Series/3400 Series Chipset 4 port SATA IDE Controller (rev 06)
>>      host0: /dev/sda ATA WDC WD20EADS-00R {SN: WD-WCAVY4080404}
>>      host1: /dev/sdb ATA ST3500418AS {SN: 9VMK33L9}
>>      host1: /dev/sdc ATA ST3500418AS {SN: 9VMM6EY4}
>> Controller device @ pci0000:00/0000:00:1f.5 [ata_piix]
>>    IDE interface: Intel Corporation 5 Series/3400 Series Chipset 2 port SATA IDE Controller (rev 06)
>>      host2: [Empty]
>>      host3: [Empty]
>>
[...]
> I'm guessing it's an artifact of IDE compatibility mode.
[...]

Spot on. After switching the BIOS to AHCI mode, output now looks like this:

Controller device @ pci0000:00/0000:00:1c.1/0000:02:00.0 [ahci]
   SATA controller: JMicron Technology Corp. JMB360 AHCI Controller (rev 02)
     host6: [Empty]
Controller device @ pci0000:00/0000:00:1f.2 [ahci]
   SATA controller: Intel Corporation 5 Series/3400 Series Chipset 6 port SATA AHCI Controller (rev 06)
     host0: /dev/sda ATA WDC WD20EADS-00R {SN: WD-WCAVY4080404}
     host1: /dev/sdb ATA ST3500418AS {SN: 9VMK33L9}
     host2: [Empty]
     host3: /dev/sdc ATA ST3500418AS {SN: 9VMM6EY4}
     host4: [Empty]
     host5: [Empty]

So the south bridge is now seen as the single chip that it is, instead 
of two emulated ones. It is now also controlled by the ahci driver, and 
all of the unused ports are there.

Great stuff, thanks.

Jan

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: Determining which spindle is out of order
  2010-11-06 16:02         ` Phil Turmel
  2010-11-06 16:11           ` Mathias Burén
  2010-11-06 16:45           ` Jan Ceuleers
@ 2010-11-07 12:53           ` John Robinson
  2010-11-07 13:21             ` Phil Turmel
  2 siblings, 1 reply; 39+ messages in thread
From: John Robinson @ 2010-11-07 12:53 UTC (permalink / raw)
  To: Phil Turmel; +Cc: linux-raid

On 06/11/2010 16:02, Phil Turmel wrote:
> On 11/06/2010 11:46 AM, John Robinson wrote:
[...]
>> Now I need to find udevadm I guess. It must have been introduced since the udev version that comes with RHEL/CentOS 5, which is udev-095-14.21.el5_5.1. rpmfind.net suggests it's only been in since version 118 or so. Never mind :-)
>
> Heh.  Anyone know the equivalent command in earlier versions of udev?

I think it's `udevinfo` instead of `udevadm info` - the comment in the 
ChangeLog for udev-117 is "udevadm: merge all udev tools into a single 
binary". But it doesn't work terribly well:

[root@beast describe_scsi]# udevinfo -q all -p /devices/pci0000\:00/0000\:00\:1f.2/
no record for '/devices/pci0000:00/0000:00:1f.2/' in database

That's unfortunate. But it does know about that device if asked differently:

[root@beast describe_scsi]# udevinfo -a -p /devices/pci0000\:00/0000\:00\:1f.2/

Udevinfo starts with the device specified by the devpath and then
walks up the chain of parent devices. It prints for every device
found, all possible attributes in the udev rules key format.
A rule to match, can be composed by the attributes of the device
and the attributes from one single parent device.

   looking at device '/devices/pci0000:00/0000:00:1f.2':
     KERNEL=="0000:00:1f.2"
     SUBSYSTEM=="pci"
     SYSFS{broken_parity_status}=="0"
     SYSFS{enable}=="1"
     SYSFS{modalias}=="pci:v00008086d00003A22sv00001043sd000082D4bc01sc06i01"
     SYSFS{local_cpus}=="7fffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff"
     SYSFS{irq}=="251"
     SYSFS{class}=="0x010601"
     SYSFS{subsystem_device}=="0x82d4"
     SYSFS{subsystem_vendor}=="0x1043"
     SYSFS{device}=="0x3a22"
     SYSFS{vendor}=="0x8086"

   looking at parent device '/devices/pci0000:00':
     ID=="pci0000:00"
     BUS==""
     DRIVER==""

[root@beast describe_scsi]#

And that output is obviously not what the rest of your script wants. I 
can ask about individual drives though:

[root@beast describe_scsi]# udevinfo -q all -p /block/sda
P: /block/sda
N: sda
S: disk/by-id/scsi-SATA_Hitachi_HDS7210_JP2921HQ0J0PZA
S: disk/by-path/pci-0000:00:1f.2-scsi-0:0:0:0
E: ID_VENDOR=ATA
E: ID_MODEL=Hitachi_HDS72101
E: ID_REVISION=JP4O
E: ID_SERIAL=SATA_Hitachi_HDS7210_JP2921HQ0J0PZA
E: ID_TYPE=disk
E: ID_BUS=scsi
E: ID_PATH=pci-0000:00:1f.2-scsi-0:0:0:0
[root@beast describe_scsi]#
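
Since the per-drive query does return "E: KEY=value" lines, a script could pick a single property out of that output. A minimal sketch, assuming the output format shown above; udev_property is a made-up name, not part of udev:

```shell
#!/bin/bash
# Hypothetical helper: print the value of one "E: KEY=value" property
# from "udevinfo -q all" style output read on stdin.
udev_property () {
	# Strip the "E: KEY=" prefix and print only the lines where it matched.
	sed -n "s/^E: $1=//p"
}

# Usage on this udev version:
#   udevinfo -q all -p /block/sda | udev_property ID_SERIAL
```

Note that on this box ID_SERIAL carries a bus and model prefix in front of the drive's own serial number, so it is not a drop-in replacement for a bare serial.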

I suspect the udev version in EL5 just isn't going to give up the info 
you need, even if you did rewrite for the different sysfs paths :-(

Thanks for your efforts though!

Cheers,

John.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: Determining which spindle is out of order
  2010-11-07 12:53           ` John Robinson
@ 2010-11-07 13:21             ` Phil Turmel
  2010-11-07 13:43               ` John Robinson
  0 siblings, 1 reply; 39+ messages in thread
From: Phil Turmel @ 2010-11-07 13:21 UTC (permalink / raw)
  To: John Robinson; +Cc: linux-raid

On 11/07/2010 07:53 AM, John Robinson wrote:
> On 06/11/2010 16:02, Phil Turmel wrote:
>> On 11/06/2010 11:46 AM, John Robinson wrote:
> [...]
>>> Now I need to find udevadm I guess. It must have been introduced since the udev version that comes with RHEL/CentOS 5, which is udev-095-14.21.el5_5.1. rpmfind.net suggests it's only been in since version 118 or so. Never mind :-)
>>
>> Heh.  Anyone know the equivalent command in earlier versions of udev?
> 
> I think it's `udevinfo` instead of `udevadm info` - the comment in the ChangeLog for udev-117 is "udevadm: merge all udev tools into a single binary". But it doesn't work terribly well:
> 
> [root@beast describe_scsi]# udevinfo -q all -p /devices/pci0000\:00/0000\:00\:1f.2/
> no record for '/devices/pci0000:00/0000:00:1f.2/' in database
> 
> That's unfortunate. But it does know about that device if asked differently:
> 
> [root@beast describe_scsi]# udevinfo -a -p /devices/pci0000\:00/0000\:00\:1f.2/

Hmmm.  Can you try both of the above without the trailing slash?

[snip /]

> I suspect the udev version in EL5 just isn't going to give up the info you need, even if you did rewrite for the different sysfs paths :-(

The information is there, though.  I'll poke at it in the near future.

> Thanks for your efforts though!

You're welcome.

> Cheers,
> 
> John.

Phil

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: Determining which spindle is out of order
  2010-11-07 13:21             ` Phil Turmel
@ 2010-11-07 13:43               ` John Robinson
  2010-11-07 14:43                 ` Phil Turmel
  2010-11-07 20:52                 ` Roman Mamedov
  0 siblings, 2 replies; 39+ messages in thread
From: John Robinson @ 2010-11-07 13:43 UTC (permalink / raw)
  To: Phil Turmel; +Cc: linux-raid

On 07/11/2010 13:21, Phil Turmel wrote:
> On 11/07/2010 07:53 AM, John Robinson wrote:
>> On 06/11/2010 16:02, Phil Turmel wrote:
>>> On 11/06/2010 11:46 AM, John Robinson wrote:
>> [...]
>>>> Now I need to find udevadm I guess. It must have been introduced since the udev version that comes with RHEL/CentOS 5, which is udev-095-14.21.el5_5.1. rpmfind.net suggests it's only been in since version 118 or so. Never mind :-)
>>>
>>> Heh.  Anyone know the equivalent command in earlier versions of udev?
>>
>> I think it's `udevinfo` instead of `udevadm info` - the comment in the ChangeLog for udev-117 is "udevadm: merge all udev tools into a single binary". But it doesn't work terribly well:
>>
>> [root@beast describe_scsi]# udevinfo -q all -p /devices/pci0000\:00/0000\:00\:1f.2/
>> no record for '/devices/pci0000:00/0000:00:1f.2/' in database
>>
>> That's unfortunate. But it does know about that device if asked differently:
>>
>> [root@beast describe_scsi]# udevinfo -a -p /devices/pci0000\:00/0000\:00\:1f.2/
>
> Hmmm.  Can you try both of the above without the trailing slash?

Just the same output, however I ask the question:

[root@beast ~]# udevinfo -q all -p /devices/pci0000\:00/0000\:00\:1f.2
no record for '/devices/pci0000:00/0000:00:1f.2' in database
[root@beast ~]# udevinfo -q all -p /devices/pci0000:00/0000:00:1f.2
no record for '/devices/pci0000:00/0000:00:1f.2' in database
[root@beast ~]# udevinfo -q all -p /sys/devices/pci0000:00/0000:00:1f.2
no record for '/devices/pci0000:00/0000:00:1f.2' in database
[root@beast ~]#

And all with "-a" instead of "-q all" produce the output I posted before.

> [snip /]
>
>> I suspect the udev version in EL5 just isn't going to give up the info you need, even if you did rewrite for the different sysfs paths :-(
>
> The information is there, though.  I'll poke at it in the near future.

Please don't feel you have to turn this into a project, though.

Cheers,

John.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: Determining which spindle is out of order
  2010-11-07 13:43               ` John Robinson
@ 2010-11-07 14:43                 ` Phil Turmel
  2010-11-07 15:04                   ` Mathias Burén
                                     ` (2 more replies)
  2010-11-07 20:52                 ` Roman Mamedov
  1 sibling, 3 replies; 39+ messages in thread
From: Phil Turmel @ 2010-11-07 14:43 UTC (permalink / raw)
  To: John Robinson; +Cc: linux-raid

On 11/07/2010 08:43 AM, John Robinson wrote:
> On 07/11/2010 13:21, Phil Turmel wrote:
>> On 11/07/2010 07:53 AM, John Robinson wrote:
>>> On 06/11/2010 16:02, Phil Turmel wrote:
>>>> On 11/06/2010 11:46 AM, John Robinson wrote:
>>> [...]
>>>>> Now I need to find udevadm I guess. It must have been introduced since the udev version that comes with RHEL/CentOS 5, which is udev-095-14.21.el5_5.1. rpmfind.net suggests it's only been in since version 118 or so. Never mind :-)
>>>>
>>>> Heh.  Anyone know the equivalent command in earlier versions of udev?
>>>
>>> I think it's `udevinfo` instead of `udevadm info` - the comment in the ChangeLog for udev-117 is "udevadm: merge all udev tools into a single binary". But it doesn't work terribly well:
>>>
>>> [root@beast describe_scsi]# udevinfo -q all -p /devices/pci0000\:00/0000\:00\:1f.2/
>>> no record for '/devices/pci0000:00/0000:00:1f.2/' in database
>>>
>>> That's unfortunate. But it does know about that device if asked differently:
>>>
>>> [root@beast describe_scsi]# udevinfo -a -p /devices/pci0000\:00/0000\:00\:1f.2/
>>
>> Hmmm.  Can you try both of the above without the trailing slash?
> 
> Just the same output, however I ask the question:
> 
> [root@beast ~]# udevinfo -q all -p /devices/pci0000\:00/0000\:00\:1f.2
> no record for '/devices/pci0000:00/0000:00:1f.2' in database
> [root@beast ~]# udevinfo -q all -p /devices/pci0000:00/0000:00:1f.2
> no record for '/devices/pci0000:00/0000:00:1f.2' in database
> [root@beast ~]# udevinfo -q all -p /sys/devices/pci0000:00/0000:00:1f.2
> no record for '/devices/pci0000:00/0000:00:1f.2' in database
> [root@beast ~]#
> 
> And all with "-a" instead of "-q all" produce the output I posted before.

The modern udevadm gives me that with --attribute-walk.  Its purpose is to report
the conditions one might want to use in a udev rule.  It doesn't provide the human
descriptions I'm looking for.

> Please don't feel you have to turn this into a project, though.

Too late.  Here's a version that doesn't use udevadm at all...

#! /bin/bash
#
# Examine specific system host devices to identify the drives attached
#

function describe_controller () {
	local device driver modprefix serial slotname
	driver="`readlink -f \"$1/driver\"`"
	driver="`basename $driver`"
	modprefix="`cut -d: -f1 <\"$1/modalias\"`"
	echo "Controller device @ ${1##/sys/devices/} [$driver]"
	if [[ "$modprefix" == "pci" ]] ; then
		slotname="`basename \"$1\"`"
		echo -n "  `lspci -s $slotname |cut -d\  -f2-`"
		return
	fi
	if [[ "$modprefix" == "usb" ]] ; then
		if [[ -f "$1/busnum" ]] ; then
			device="`cat \"$1/busnum\"`:`cat \"$1/devnum\"`"
			serial="`cat \"$1/serial\"`"
		else
			device="`cat \"$1/../busnum\"`:`cat \"$1/../devnum\"`"
			serial="`cat \"$1/../serial\"`"
		fi
		echo "  `lsusb -s $device` {SN: $serial}"
		return
	fi
	echo -e "  `cat \"$1/modalias\"`"
}

function describe_device () {
	targ=${1%/block/*}
	vnd="`cat $targ/vendor`"
	mdl=`cat $targ/model`
	rdev=`readlink -f "$1"`
	if [[ -d $rdev ]] ; then
		bdev="`basename $rdev`"
		sn="`sginfo -s /dev/$bdev | \
			sed -rn -e \"/Serial Number/{s%^.+' *(.+) *'.*\\\$%\\\\1%;p;q}\"`" &>/dev/null
		if [[ -n "$sn" ]] ; then
			echo -e "    $bhost: `echo /dev/$bdev $vnd $mdl {SN: $sn}`"
		else
			echo -e "    $bhost: `echo /dev/$bdev $vnd $mdl`"
		fi
	else
		echo -e "    $bhost: Unknown $rdev"
	fi
}

function check_host () {
	local found=0
	local pController=
	while read shost ; do
		host=`dirname "$shost"`
		controller=`dirname "$host"`
		bhost=`basename "$host"`
		if [[ "$controller" != "$pController" ]] ; then
			pController="$controller"
			describe_controller "$controller"
		fi
		for dev in $host/target*/*/block/* ; do
			if [[ "${dev: -1}" == '*' ]] ; then
				echo -e "    $bhost: [Empty]"
			else
				describe_device "$dev"
			fi
		done
	done
}

find /sys/devices/ -name scsi_host |check_host
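
One cosmetic wrinkle in the serial-number extraction: the "(.+) *" in the sed expression is greedy, so any padding inside the quotes ends up in the reported serial. A sketch of a variant that trims it, assuming sginfo prints the serial as Serial Number '...' (as the pattern above implies). The name extract_serial is made up for illustration:

```shell
#!/bin/bash
# Hypothetical helper: pull the serial out of "sginfo -s" style output on
# stdin, assuming a line of the form:  Serial Number '<serial, maybe padded>'
extract_serial () {
	# Capture the text between the quotes up to its last non-blank
	# character, print that line, and stop reading.
	sed -E -n "/Serial Number/{s/^[^']*' *([^']*[^' ]) *'.*\$/\\1/;p;q;}"
}

# Usage: sn=$(sginfo -s /dev/sda 2>/dev/null | extract_serial)
```

Putting the 2>/dev/null inside the command substitution, as in the usage line, is what keeps sginfo's own error messages off the terminal.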

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: Determining which spindle is out of order
  2010-11-07 14:43                 ` Phil Turmel
@ 2010-11-07 15:04                   ` Mathias Burén
  2010-11-07 15:19                   ` John Robinson
  2010-11-08 21:05                   ` Mr. James W. Laferriere
  2 siblings, 0 replies; 39+ messages in thread
From: Mathias Burén @ 2010-11-07 15:04 UTC (permalink / raw)
  To: Phil Turmel; +Cc: John Robinson, linux-raid

On 7 November 2010 14:43, Phil Turmel <philip@turmel.org> wrote:
> On 11/07/2010 08:43 AM, John Robinson wrote:
>> On 07/11/2010 13:21, Phil Turmel wrote:
>>> On 11/07/2010 07:53 AM, John Robinson wrote:
>>>> On 06/11/2010 16:02, Phil Turmel wrote:
>>>>> On 11/06/2010 11:46 AM, John Robinson wrote:
>>>> [...]
>>>>>> Now I need to find udevadm I guess. It must have been introduced since the udev version that comes with RHEL/CentOS 5, which is udev-095-14.21.el5_5.1. rpmfind.net suggests it's only been in since version 118 or so. Never mind :-)
>>>>>
>>>>> Heh.  Anyone know the equivalent command in earlier versions of udev?
>>>>
>>>> I think it's `udevinfo` instead of `udevadm info` - the comment in the ChangeLog for udev-117 is "udevadm: merge all udev tools into a single binary". But it doesn't work terribly well:
>>>>
>>>> [root@beast describe_scsi]# udevinfo -q all -p /devices/pci0000\:00/0000\:00\:1f.2/
>>>> no record for '/devices/pci0000:00/0000:00:1f.2/' in database
>>>>
>>>> That's unfortunate. But it does know about that device if asked differently:
>>>>
>>>> [root@beast describe_scsi]# udevinfo -a -p /devices/pci0000\:00/0000\:00\:1f.2/
>>>
>>> Hmmm.  Can you try both of the above without the trailing slash?
>>
>> Just the same output, however I ask the question:
>>
>> [root@beast ~]# udevinfo -q all -p /devices/pci0000\:00/0000\:00\:1f.2
>> no record for '/devices/pci0000:00/0000:00:1f.2' in database
>> [root@beast ~]# udevinfo -q all -p /devices/pci0000:00/0000:00:1f.2
>> no record for '/devices/pci0000:00/0000:00:1f.2' in database
>> [root@beast ~]# udevinfo -q all -p /sys/devices/pci0000:00/0000:00:1f.2
>> no record for '/devices/pci0000:00/0000:00:1f.2' in database
>> [root@beast ~]#
>>
>> And all with "-a" instead of "-q all" produce the output I posted before.
>
> The modern udevadm gives me that with --attribute-walk.  Its purpose is to report
> the conditions one might want to use in a udev rule.  It doesn't provide the human
> descriptions I'm looking for.
>
>> Please don't feel you have to turn this into a project, though.
>
> Too late.  Here's a version that doesn't use udevadm at all...
>
> #! /bin/bash
> #
> # Examine specific system host devices to identify the drives attached
> #
>
> function describe_controller () {
>        local device driver modprefix serial slotname
>        driver="`readlink -f \"$1/driver\"`"
>        driver="`basename $driver`"
>        modprefix="`cut -d: -f1 <\"$1/modalias\"`"
>        echo "Controller device @ ${1##/sys/devices/} [$driver]"
>        if [[ "$modprefix" == "pci" ]] ; then
>                slotname="`basename \"$1\"`"
>                echo -n "  `lspci -s $slotname |cut -d\  -f2-`"
>                return
>        fi
>        if [[ "$modprefix" == "usb" ]] ; then
>                if [[ -f "$1/busnum" ]] ; then
>                        device="`cat \"$1/busnum\"`:`cat \"$1/devnum\"`"
>                        serial="`cat \"$1/serial\"`"
>                else
>                        device="`cat \"$1/../busnum\"`:`cat \"$1/../devnum\"`"
>                        serial="`cat \"$1/../serial\"`"
>                fi
>                echo "  `lsusb -s $device` {SN: $serial}"
>                return
>        fi
>        echo -e "  `cat \"$1/modalias\"`"
> }
>
> function describe_device () {
>        targ=${1%/block/*}
>        vnd="`cat $targ/vendor`"
>        mdl=`cat $targ/model`
>        rdev=`readlink -f "$1"`
>        if [[ -d $rdev ]] ; then
>                bdev="`basename $rdev`"
>                sn="`sginfo -s /dev/$bdev | \
>                        sed -rn -e \"/Serial Number/{s%^.+' *(.+) *'.*\\\$%\\\\1%;p;q}\"`" &>/dev/null
>                if [[ -n "$sn" ]] ; then
>                        echo -e "    $bhost: `echo /dev/$bdev $vnd $mdl {SN: $sn}`"
>                else
>                        echo -e "    $bhost: `echo /dev/$bdev $vnd $mdl`"
>                fi
>        else
>                echo -e "    $bhost: Unknown $rdev"
>        fi
> }
>
> function check_host () {
>        local found=0
>        local pController=
>        while read shost ; do
>                host=`dirname "$shost"`
>                controller=`dirname "$host"`
>                bhost=`basename "$host"`
>                if [[ "$controller" != "$pController" ]] ; then
>                        pController="$controller"
>                        describe_controller "$controller"
>                fi
>                for dev in $host/target*/*/block/* ; do
>                        if [[ "${dev: -1}" == '*' ]] ; then
>                                echo -e "    $bhost: [Empty]"
>                        else
>                                describe_device "$dev"
>                        fi
>                done
>        done
> }
>
> find /sys/devices/ -name scsi_host |check_host
> --
> To unsubscribe from this list: send the line "unsubscribe linux-raid" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>

Nice, now it shows the S/N of the devices in my system:


Controller device @ pci0000:00/0000:00:0b.0 [ahci]
  SATA controller: nVidia Corporation MCP79 AHCI Controller (rev b1)    host0: /dev/sda ATA Corsair CSSD-F60 {SN: 10326505580009990027}
    host1: /dev/sdb ATA WDC WD20EARS-00M {SN: WD-WCAZA1022443}
    host2: /dev/sdc ATA WDC WD20EARS-00M {SN: WD-WMAZ20152590}
    host3: /dev/sdd ATA WDC WD20EARS-00M {SN: WD-WMAZ20188479}
    host4: [Empty]
    host5: [Empty]
Controller device @ pci0000:00/0000:00:16.0/0000:05:00.0 [sata_mv]
  SCSI storage controller: HighPoint Technologies, Inc. RocketRAID 230x 4 Port SATA-II Controller (rev 02)    host6: [Empty]
    host7: /dev/sde ATA SAMSUNG HD204UI {SN: S2HGJ1RZ800964 }
    host8: /dev/sdf ATA WDC WD20EARS-00M {SN: WD-WCAZA1000331}
    host9: /dev/sdg ATA SAMSUNG HD204UI {SN: S2HGJ1RZ800850 }

// Mathias

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: Determining which spindle is out of order
  2010-11-07 14:43                 ` Phil Turmel
  2010-11-07 15:04                   ` Mathias Burén
@ 2010-11-07 15:19                   ` John Robinson
  2010-11-07 18:39                     ` Phil Turmel
  2010-11-08 21:05                   ` Mr. James W. Laferriere
  2 siblings, 1 reply; 39+ messages in thread
From: John Robinson @ 2010-11-07 15:19 UTC (permalink / raw)
  To: Phil Turmel; +Cc: linux-raid

[-- Attachment #1: Type: text/plain, Size: 8611 bytes --]

On 07/11/2010 14:43, Phil Turmel wrote:
> On 11/07/2010 08:43 AM, John Robinson wrote:
[...]
>> Please don't feel you have to turn this into a project, though.
>
> Too late.  Here's a version that doesn't use udevadm at all...

OK, it's an improvement: after I changed the find command to look for 
'*scsi_host*', it lists my controllers, but reports them all as empty. I 
noticed that the script was looking for subdirectories called block, but 
mine have names like block:sda, so I changed the script again to refer to 
'block*' both in the loop in check_host and in the substitution at the 
top of describe_device. Something is still not quite right when reading 
the CentOS/RHEL 5 (kernel 2.6.18) sysfs, because this was the output I 
got:

[root@beast ~]# ~john/projects/describe_scsi/describe_scsi_2
Controller device @ pci0000:00/0000:00:1f.2 [ahci]
   SATA controller: Intel Corporation 82801JI (ICH10 Family) SATA AHCI Controller    host7: [Empty]
     host6: [Empty]
     host5: [Empty]
     host4: Unknown /sys/block/sdc/dev
sginfo(open): No such file or directory
file=/dev/4:0:0:0, or no corresponding sg device found
Is sg driver loaded?
     host4: /dev/4:0:0:0 ATA ST31000528AS
sginfo(open): No such file or directory
file=/dev/holders, or no corresponding sg device found
Is sg driver loaded?
     host4: /dev/holders ATA ST31000528AS
sginfo(open): No such file or directory
file=/dev/queue, or no corresponding sg device found
Is sg driver loaded?
     host4: /dev/queue ATA ST31000528AS
     host4: Unknown /sys/block/sdc/range
     host4: Unknown /sys/block/sdc/removable
     host4: /dev/sdc1 ATA ST31000528AS {SN: 9VP4XCQP}
     host4: /dev/sdc2 ATA ST31000528AS {SN: 9VP4XCQP}
     host4: Unknown /sys/block/sdc/size
sginfo(open): No such file or directory
file=/dev/slaves, or no corresponding sg device found
Is sg driver loaded?
     host4: /dev/slaves ATA ST31000528AS
     host4: Unknown /sys/block/sdc/stat
sginfo(open): No such file or directory
file=/dev/block, or no corresponding sg device found
Is sg driver loaded?
     host4: /dev/block ATA ST31000528AS
     host4: Unknown /sys/block/sdc/uevent
     host3: Unknown /sys/block/sdb/dev
sginfo(open): No such file or directory
file=/dev/3:0:0:0, or no corresponding sg device found
Is sg driver loaded?
     host3: /dev/3:0:0:0 ATA SAMSUNG HD103UJ
sginfo(open): No such file or directory
file=/dev/holders, or no corresponding sg device found
Is sg driver loaded?
     host3: /dev/holders ATA SAMSUNG HD103UJ
sginfo(open): No such file or directory
file=/dev/queue, or no corresponding sg device found
Is sg driver loaded?
     host3: /dev/queue ATA SAMSUNG HD103UJ
     host3: Unknown /sys/block/sdb/range
     host3: Unknown /sys/block/sdb/removable
     host3: /dev/sdb1 ATA SAMSUNG HD103UJ {SN: S1PVJ1CQ602162 }
     host3: /dev/sdb2 ATA SAMSUNG HD103UJ {SN: S1PVJ1CQ602162 }
     host3: Unknown /sys/block/sdb/size
sginfo(open): No such file or directory
file=/dev/slaves, or no corresponding sg device found
Is sg driver loaded?
     host3: /dev/slaves ATA SAMSUNG HD103UJ
     host3: Unknown /sys/block/sdb/stat
sginfo(open): No such file or directory
file=/dev/block, or no corresponding sg device found
Is sg driver loaded?
     host3: /dev/block ATA SAMSUNG HD103UJ
     host3: Unknown /sys/block/sdb/uevent
     host2: Unknown /sys/block/sda/dev
sginfo(open): No such file or directory
file=/dev/2:0:0:0, or no corresponding sg device found
Is sg driver loaded?
     host2: /dev/2:0:0:0 ATA Hitachi HDS72101
sginfo(open): No such file or directory
file=/dev/holders, or no corresponding sg device found
Is sg driver loaded?
     host2: /dev/holders ATA Hitachi HDS72101
sginfo(open): No such file or directory
file=/dev/queue, or no corresponding sg device found
Is sg driver loaded?
     host2: /dev/queue ATA Hitachi HDS72101
     host2: Unknown /sys/block/sda/range
     host2: Unknown /sys/block/sda/removable
     host2: /dev/sda1 ATA Hitachi HDS72101 {SN: JP2921HQ0J0PZA}
     host2: /dev/sda2 ATA Hitachi HDS72101 {SN: JP2921HQ0J0PZA}
     host2: Unknown /sys/block/sda/size
sginfo(open): No such file or directory
file=/dev/slaves, or no corresponding sg device found
Is sg driver loaded?
     host2: /dev/slaves ATA Hitachi HDS72101
     host2: Unknown /sys/block/sda/stat
sginfo(open): No such file or directory
file=/dev/block, or no corresponding sg device found
Is sg driver loaded?
     host2: /dev/block ATA Hitachi HDS72101
     host2: Unknown /sys/block/sda/uevent
Controller device @ pci0000:00/0000:00:1e.0/0000:05:01.1 [aic7xxx]
   SCSI storage controller: Adaptec AHA-3960D / AIC-7899A U160/m (rev 01)    host9: [Empty]
Controller device @ pci0000:00/0000:00:1e.0/0000:05:01.0 [aic7xxx]
   SCSI storage controller: Adaptec AHA-3960D / AIC-7899A U160/m (rev 01)    host8: [Empty]
Controller device @ pci0000:00/0000:00:1c.4/0000:03:00.0 [pata_marvell]
   IDE interface: Marvell Technology Group Ltd. 88SE6121 SATA II Controller (rev b2)    host1: [Empty]
     host0: Unknown /sys/block/sr0/dev
sginfo(open): No such file or directory
file=/dev/0:0:0:0, or no corresponding sg device found
Is sg driver loaded?
     host0: /dev/0:0:0:0 HL-DT-ST DVD-RAM GH22NP20
sginfo(open): No such file or directory
file=/dev/holders, or no corresponding sg device found
Is sg driver loaded?
     host0: /dev/holders HL-DT-ST DVD-RAM GH22NP20
sginfo(open): No such file or directory
file=/dev/queue, or no corresponding sg device found
Is sg driver loaded?
     host0: /dev/queue HL-DT-ST DVD-RAM GH22NP20
     host0: Unknown /sys/block/sr0/range
     host0: Unknown /sys/block/sr0/removable
     host0: Unknown /sys/block/sr0/size
sginfo(open): No such file or directory
file=/dev/slaves, or no corresponding sg device found
Is sg driver loaded?
     host0: /dev/slaves HL-DT-ST DVD-RAM GH22NP20
     host0: Unknown /sys/block/sr0/stat
sginfo(open): No such file or directory
file=/dev/block, or no corresponding sg device found
Is sg driver loaded?
     host0: /dev/block HL-DT-ST DVD-RAM GH22NP20
     host0: Unknown /sys/block/sr0/uevent
[root@beast ~]#

So it's finding my drives but also trying to describe_device lots of 
wrong things - maybe need a different expansion for the loop in check_host?
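
One possible expansion, as a sketch only: loop over the block entry itself rather than over its children, and accept both spellings. It assumes the old layout exposes one block:NAME entry per device and newer layouts a block/NAME directory. The name list_block_names is made up for illustration:

```shell
#!/bin/bash
# Hypothetical helper: print the block device names found under one hostN
# directory, accepting both ".../block/sda" and ".../block:sda" layouts.
# The body runs in a subshell so its variables stay private.
list_block_names () (
	host=$1
	for node in "$host"/target*/*/block/* "$host"/target*/*/block:* ; do
		# An unmatched glob is left as literal text; skip it.
		[ -e "$node" ] || [ -L "$node" ] || continue
		node=${node##*/}       # "sda" or "block:sda"
		echo "${node#block:}"  # strip the old-style prefix if present
	done
)

# Usage: list_block_names /sys/devices/pci0000:00/0000:00:1f.2/host2
```

An empty result would then stand for an empty host, and each name printed can be passed on as /dev/NAME without ever descending into queue, holders, or the partition directories.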

Hopefully the below will give you a better idea of what the sysfs layout 
is like on CentOS/RHEL 5:

[root@beast 0000:00:1f.2]# pwd
/sys/devices/pci0000:00/0000:00:1f.2
[root@beast 0000:00:1f.2]# ls -F
broken_parity_status  enable  host7/      resource0  subsystem@
bus@                  host2/  irq         resource1  subsystem_device
class                 host3/  local_cpus  resource2  subsystem_vendor
config                host4/  modalias    resource3  uevent
device                host5/  power/      resource4  vendor
driver@               host6/  resource    resource5
[root@beast 0000:00:1f.2]# ls -FR host2
host2:
power/  scsi_host:host2@  target2:0:0/  uevent

host2/power:
state  wakeup

host2/target2:0:0:
2:0:0:0/  power/  uevent

host2/target2:0:0/2:0:0:0:
block:sda@      iocounterbits  queue_type            state
bus@            iodone_cnt     rescan                subsystem@
delete          ioerr_cnt      rev                   sw_activity
device_blocked  iorequest_cnt  scsi_device:2:0:0:0@  timeout
dh_state        model          scsi_disk:2:0:0:0@    type
driver@         power/         scsi_generic:sg1@     uevent
generic@        queue_depth    scsi_level            vendor

host2/target2:0:0/2:0:0:0/power:
state  wakeup

host2/target2:0:0/power:
state  wakeup
[root@beast 0000:00:1f.2]# ls -FR host2/target2\:0\:0/2\:0\:0\:0/block\:sda/
host2/target2:0:0/2:0:0:0/block:sda/:
dev      holders/  range      sda1/  size     stat        uevent
device@  queue/    removable  sda2/  slaves/  subsystem@

host2/target2:0:0/2:0:0:0/block:sda/holders:

host2/target2:0:0/2:0:0:0/block:sda/queue:
iosched/  max_hw_sectors_kb  nr_requests    scheduler
iostats   max_sectors_kb     read_ahead_kb

host2/target2:0:0/2:0:0:0/block:sda/queue/iosched:
back_seek_max      fifo_expire_sync  slice_async     slice_sync
back_seek_penalty  quantum           slice_async_rq
fifo_expire_async  queued            slice_idle

host2/target2:0:0/2:0:0:0/block:sda/sda1:
dev  holders/  size  start  stat  subsystem@  uevent

host2/target2:0:0/2:0:0:0/block:sda/sda1/holders:
md0@

host2/target2:0:0/2:0:0:0/block:sda/sda2:
dev  holders/  size  start  stat  subsystem@  uevent

host2/target2:0:0/2:0:0:0/block:sda/sda2/holders:
md1@

host2/target2:0:0/2:0:0:0/block:sda/slaves:
[root@beast 0000:00:1f.2]#


Hope this helps. I've also attached my edited version of the script.

Many thanks,

John.

[-- Attachment #2: describe_scsi_2 --]
[-- Type: text/plain, Size: 1835 bytes --]

#! /bin/bash
#
# Examine specific system host devices to identify the drives attached
#

function describe_controller () {
	local device driver modprefix serial slotname
	driver="`readlink -f \"$1/driver\"`"
	driver="`basename $driver`"
	modprefix="`cut -d: -f1 <\"$1/modalias\"`"
	echo "Controller device @ ${1##/sys/devices/} [$driver]"
	if [[ "$modprefix" == "pci" ]] ; then
		slotname="`basename \"$1\"`"
		echo -n "  `lspci -s $slotname |cut -d\  -f2-`"
		return
	fi
	if [[ "$modprefix" == "usb" ]] ; then
		if [[ -f "$1/busnum" ]] ; then
			device="`cat \"$1/busnum\"`:`cat \"$1/devnum\"`"
			serial="`cat \"$1/serial\"`"
		else
			device="`cat \"$1/../busnum\"`:`cat \"$1/../devnum\"`"
			serial="`cat \"$1/../serial\"`"
		fi
		echo "  `lsusb -s $device` {SN: $serial}"
		return
	fi
	echo -e "  `cat \"$1/modalias\"`"
}

function describe_device () {
	targ=${1%/block*/*}
	vnd="`cat $targ/vendor`"
	mdl=`cat $targ/model`
	rdev=`readlink -f "$1"`
	if [[ -d $rdev ]] ; then
		bdev="`basename $rdev`"
		sn="`sginfo -s /dev/$bdev | \
			sed -rn -e \"/Serial Number/{s%^.+' *(.+) *'.*\\\$%\\\\1%;p;q}\"`" &>/dev/null
		if [[ -n "$sn" ]] ; then
			echo -e "    $bhost: `echo /dev/$bdev $vnd $mdl {SN: $sn}`"
		else
			echo -e "    $bhost: `echo /dev/$bdev $vnd $mdl`"
		fi
	else
		echo -e "    $bhost: Unknown $rdev"
	fi
}

function check_host () {
	local found=0
	local pController=
	while read shost ; do
		host=`dirname "$shost"`
		controller=`dirname "$host"`
		bhost=`basename "$host"`
		if [[ "$controller" != "$pController" ]] ; then
			pController="$controller"
			describe_controller "$controller"
		fi
		for dev in $host/target*/*/block*/* ; do
			if [[ "${dev: -1}" == '*' ]] ; then
				echo -e "    $bhost: [Empty]"
			else
				describe_device "$dev"
			fi
		done
	done
}

find /sys/devices/ -name '*scsi_host*' |check_host


* Re: Determining which spindle is out of order
  2010-11-07 15:19                   ` John Robinson
@ 2010-11-07 18:39                     ` Phil Turmel
  2010-11-07 20:46                       ` Leslie Rhorer
  2010-11-07 21:24                       ` Andreas Dröscher
  0 siblings, 2 replies; 39+ messages in thread
From: Phil Turmel @ 2010-11-07 18:39 UTC (permalink / raw)
  To: John Robinson; +Cc: linux-raid

[-- Attachment #1: Type: text/plain, Size: 1051 bytes --]

On 11/07/2010 10:19 AM, John Robinson wrote:
> On 07/11/2010 14:43, Phil Turmel wrote:
>> On 11/07/2010 08:43 AM, John Robinson wrote:
> [...]
>>> Please don't feel you have to turn this into a project, though.
>>
>> Too late.  Here's a version that doesn't use udevadm at all...
> 
> OK, it's an improvement because after I've changed the find command to find '*scsi_host*', it lists my controllers, but finds them all empty. I noted that the script was looking for subdirectories called block but mine have names like block:sda so I changed the script again to refer to 'block*' both in the loop in check_host and in the substitution at the top of describe_device. There's still something not quite right with trying to read CentOS/RHEL 5 / kernel 2.6.18 sysfs, because this was the output I got:

I think I understand the older sysfs directory format now.

> Hope this helps. I've also attached my edited version of the script.

I did another version, with regular expressions to accommodate the variations.  Please give it a shot.
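
As a minimal sketch of the idea (not the attached script itself), one bash regular expression can split both the newer ".../block/sda" and the older ".../block:sda" path forms. The function name split_block_path and the sample paths are made up for illustration.

```shell
#!/bin/bash
# Hypothetical helper: split a sysfs block path into its target directory
# and block device name, accepting both "block/sda" and "block:sda".
split_block_path () {
	local re='^(.+)/block[/:]([^/]+)$'
	if [[ "$1" =~ $re ]] ; then
		echo "${BASH_REMATCH[1]} ${BASH_REMATCH[2]}"
	else
		return 1
	fi
}

# Older (2.6.18-style) and newer layouts; both sample paths are illustrative.
split_block_path "host2/target2:0:0/2:0:0:0/block:sda"
split_block_path "host2/target2:0:0/2:0:0:0/block/sda"
```

Keeping the pattern in a variable avoids the quoting differences between bash versions when it is used with =~.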

Regards,

Phil

[-- Attachment #2: lsdrv --]
[-- Type: text/plain, Size: 1891 bytes --]

#! /bin/bash
#
# Examine specific system host devices to identify the drives attached
#

function describe_controller () {
	local device driver modprefix serial slotname
	driver="`readlink -f \"$1/driver\"`"
	driver="`basename $driver`"
	modprefix="`cut -d: -f1 <\"$1/modalias\"`"
	echo "Controller device @ ${1##/sys/devices/} [$driver]"
	if [[ "$modprefix" == "pci" ]] ; then
		slotname="`basename \"$1\"`"
		echo "  `lspci -s $slotname |cut -d\  -f2-`"
		return
	fi
	if [[ "$modprefix" == "usb" ]] ; then
		if [[ -f "$1/busnum" ]] ; then
			device="`cat \"$1/busnum\"`:`cat \"$1/devnum\"`"
			serial="`cat \"$1/serial\"`"
		else
			device="`cat \"$1/../busnum\"`:`cat \"$1/../devnum\"`"
			serial="`cat \"$1/../serial\"`"
		fi
		echo "  `lsusb -s $device` {SN: $serial}"
		return
	fi
	echo -e "  `cat \"$1/modalias\"`"
}

function describe_device () {
	local empty=1
	while read device ; do
		empty=0
		if [[ "$device" =~ ^(.+)/block[/:](.+)$ ]] ; then
			targ="${BASH_REMATCH[1]}"
			bdev="${BASH_REMATCH[2]}"
			vnd="$(< $targ/vendor)"
			mdl="$(< $targ/model)"
			sn="`sginfo -s /dev/$bdev | \
				sed -rn -e \"/Serial Number/{s%^.+' *(.+) *'.*\\\$%\\\\1%;p;q}\"`" &>/dev/null
			if [[ -n "$sn" ]] ; then
				echo -e "    $1: `echo /dev/$bdev $vnd $mdl {SN: $sn}`"
			else
				echo -e "    $1: `echo /dev/$bdev $vnd $mdl`"
			fi
		else
			echo -e "    $1: Unknown $device"
		fi
	done
	[[ $empty -eq 1 ]] && echo -e "    $1: [Empty]"
}

function check_host () {
	local found=0
	local pController=
	while read shost ; do
		host=`dirname "$shost"`
		controller=`dirname "$host"`
		bhost=`basename "$host"`
		if [[ "$controller" != "$pController" ]] ; then
			pController="$controller"
			describe_controller "$controller"
		fi
		find $host -regex '.+/target[0-9:]+/[0-9:]+/block[:/][^/]+' |describe_device "$bhost"
	done
}

find /sys/devices/ -regex '.+/scsi_host\(:block\)?' |check_host


* RE: Determining which spindle is out of order
  2010-11-07 18:39                     ` Phil Turmel
@ 2010-11-07 20:46                       ` Leslie Rhorer
  2010-11-07 21:22                         ` John Robinson
  2010-11-07 21:24                       ` Andreas Dröscher
  1 sibling, 1 reply; 39+ messages in thread
From: Leslie Rhorer @ 2010-11-07 20:46 UTC (permalink / raw)
  To: 'Phil Turmel'; +Cc: linux-raid



> -----Original Message-----
> From: linux-raid-owner@vger.kernel.org [mailto:linux-raid-
> owner@vger.kernel.org] On Behalf Of Phil Turmel
> Sent: Sunday, November 07, 2010 12:39 PM
> To: John Robinson
> Cc: linux-raid@vger.kernel.org
> Subject: Re: Determining which spindle is out of order
> 
> On 11/07/2010 10:19 AM, John Robinson wrote:
> > On 07/11/2010 14:43, Phil Turmel wrote:
> >> On 11/07/2010 08:43 AM, John Robinson wrote:
> > [...]
> >>> Please don't feel you have to turn this into a project, though.
> >>
> >> Too late.  Here's a version that doesn't use udevadm at all...
> >
> > OK, it's an improvement because after I've changed the find command to
> find '*scsi_host*', it lists my controllers, but finds them all empty. I
> noted that the script was looking for subdirectories called block but mine
> have names like block:sda so I changed the script again to refer to
> 'block*' both in the loop in check_host and in the substitution at the top
> of describe_device. There's still something not quite right with trying to
> read CentOS/RHEL 5 / kernel 2.6.18 sysfs, because this was the output I
> got:
> 
> I think I understand the older sysfs directory format now.
> 
> > Hope this helps. I've also attached my edited version of the script.
> 
> I did another version, with regular expressions to accommodate the
> variations.  Please give it a shot.

	The regular expression doesn't work, here, but the rest of the
script now works on the older server.  I replaced the find statement with

find /sys/devices/  -name "scsi_host*" |check_host

	and it works.
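
	For reference, a small sketch of why the quotes matter: the quoted pattern reaches find unexpanded, so it matches the older "scsi_host:hostN" entries wherever the script is run from. The temporary tree below only imitates the layout John listed; the directory and file names in it are assumptions for the demonstration.

```shell
#!/bin/bash
# Build a throwaway tree that imitates the older sysfs naming (assumed layout).
tmp=$(mktemp -d)
mkdir -p "$tmp/pci0/host2" "$tmp/pci0/host3"
: > "$tmp/pci0/host2/scsi_host:host2"
: > "$tmp/pci0/host3/scsi_host:host3"

# Quoted, so the shell passes the pattern to find instead of expanding it
# against the current directory.
found=$(find "$tmp" -name 'scsi_host*' | sort)
echo "$found"

rm -rf "$tmp"
```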



* Re: Determining which spindle is out of order
  2010-11-07 13:43               ` John Robinson
  2010-11-07 14:43                 ` Phil Turmel
@ 2010-11-07 20:52                 ` Roman Mamedov
  2010-11-09 14:40                   ` Phil Turmel
  1 sibling, 1 reply; 39+ messages in thread
From: Roman Mamedov @ 2010-11-07 20:52 UTC (permalink / raw)
  To: John Robinson; +Cc: Phil Turmel, linux-raid

[-- Attachment #1: Type: text/plain, Size: 539 bytes --]

On Sun, 07 Nov 2010 13:43:09 +0000
John Robinson <john.robinson@anonymous.org.uk> wrote:

> Please don't feel you have to turn this into a project, though.

Well, on the contrary, why not? There is clearly a lack of such a tool in
GNU/Linux, so having a designated name for the script, and a small website
with version history and a download link would be very nice, rather than
trying to fish out the latest version from the mailing list...

So I suggest that you *do* turn this into a project. :)

-- 
With respect,
Roman

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 198 bytes --]


* Re: Determining which spindle is out of order
  2010-11-07 20:46                       ` Leslie Rhorer
@ 2010-11-07 21:22                         ` John Robinson
  2010-11-08 18:59                           ` John Robinson
  0 siblings, 1 reply; 39+ messages in thread
From: John Robinson @ 2010-11-07 21:22 UTC (permalink / raw)
  To: Leslie Rhorer; +Cc: 'Phil Turmel', linux-raid

On 07/11/2010 20:46, Leslie Rhorer wrote:
[...]
>> I did another version, with regular expressions to accommodate the
>> variations.  Please give it a shot.
>
> 	The regular expression doesn't work, here, but the rest of the
> script now works on the older server.  I replaced the find statement with
>
> find /sys/devices/  -name "scsi_host*" |check_host
>
> 	and it works.

I also changed the find command again and it also works, with one nit, 
which is probably just CentOS/RHEL being odd. See the following output:

# ~john/projects/lsdrv/lsdrv
Controller device @ pci0000:00/0000:00:1f.2 [ahci]
   SATA controller: Intel Corporation 82801JI (ICH10 Family) SATA AHCI Controller
     host7: [Empty]
     host6: [Empty]
     host5: [Empty]
     host4: /dev/sdc ATA ST31000528AS {SN: 9VP4XCQP}
     host3: /dev/sdb ATA SAMSUNG HD103UJ {SN: S1PVJ1CQ602162 }
     host2: /dev/sda ATA Hitachi HDS72101 {SN: JP2921HQ0J0PZA}
Controller device @ pci0000:00/0000:00:1e.0/0000:05:01.1 [aic7xxx]
   SCSI storage controller: Adaptec AHA-3960D / AIC-7899A U160/m (rev 01)
     host9: [Empty]
Controller device @ pci0000:00/0000:00:1e.0/0000:05:01.0 [aic7xxx]
   SCSI storage controller: Adaptec AHA-3960D / AIC-7899A U160/m (rev 01)
     host8: [Empty]
Controller device @ pci0000:00/0000:00:1c.4/0000:03:00.0 [pata_marvell]
   IDE interface: Marvell Technology Group Ltd. 88SE6121 SATA II Controller (rev b2)
     host1: [Empty]
sginfo(open): No such file or directory
file=/dev/sr0, or no corresponding sg device found
Is sg driver loaded?
     host0: /dev/sr0 HL-DT-ST DVD-RAM GH22NP20

The various links in sysfs refer to sr0 but I don't have a /dev/sr0, I 
have /dev/scd0. I guess that's the udev rules Red Hat chose. Anyway, my 
fix was to add 2>/dev/null into the sginfo command, which makes the 
warning go away - sginfo -s /dev/scd0 gives me no serial anyway.
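
A minimal sketch of why the redirect has to go inside the command substitution, as observed with bash: a redirect placed after the assignment does not reach the substituted command. The function noisy is a made-up stand-in for sginfo failing on a missing device node.

```shell
#!/bin/bash
# Hypothetical stand-in for a command that only prints a warning on stderr.
noisy () { echo "warning: no such device" >&2 ; }

# Redirect placed after the assignment: in bash the warning still appears,
# because the substitution is not covered by it.
sn="$(noisy)" &>/dev/null

# Redirect placed inside the substitution: the warning is discarded.
sn="$(noisy 2>/dev/null)"
```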

The only other improvement would be better formatting for controllers 
with multiple devices, e.g. IDE interfaces, port multipliers and real 
SCSI cards, which could have 2, 5 or 15 devices attached so displaying 
the SCSI device ID could be helpful in those cases. It doesn't apply to 
me but I thought of it when someone had both primary and slave devices 
on their IDE controller, and someone else mentioned port multipliers.
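
One possible way to get at that ID, as a rough sketch only: the host:channel:target:lun address is already a component of the sysfs path, so it can be cut out with a regular expression. The function name scsi_address is hypothetical and is not part of the script.

```shell
#!/bin/bash
# Hypothetical helper: pull the host:channel:target:lun address out of a
# sysfs path such as .../target2:0:0/2:0:0:0/block:sda
scsi_address () {
	local re='/([0-9]+:[0-9]+:[0-9]+:[0-9]+)/block'
	if [[ "$1" =~ $re ]] ; then
		echo "${BASH_REMATCH[1]}"
	fi
}

scsi_address "host2/target2:0:0/2:0:0:0/block:sda"
```

The result could then be printed next to the device name when a host has more than one device.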

Cheers,

John.



* Re: Determining which spindle is out of order
  2010-11-07 18:39                     ` Phil Turmel
  2010-11-07 20:46                       ` Leslie Rhorer
@ 2010-11-07 21:24                       ` Andreas Dröscher
  1 sibling, 0 replies; 39+ messages in thread
From: Andreas Dröscher @ 2010-11-07 21:24 UTC (permalink / raw)
  To: linux-raid

Am 07.11.2010 19:39, schrieb Phil Turmel:
> On 11/07/2010 10:19 AM, John Robinson wrote:
> I did another version, with regular expressions to accommodate the variations.
> Please give it a shot.

As I don't have sginfo on my system installed, I replaced the sn= line with

sn=`smartctl -i /dev/$bdev | awk '/Serial Number/{print $3}'`

Now the script works like a charm
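
A slightly more defensive variant, offered only as a sketch: it takes everything after the colon on the "Serial Number:" line and trims blanks, so a padded serial comes out clean. The function name parse_serial is made up, and the sample input is typed by hand in the usual smartctl layout rather than captured from a real drive.

```shell
#!/bin/bash
# Hypothetical helper: read "smartctl -i" style text on stdin and print the
# serial number with surrounding blanks removed.
parse_serial () {
	awk -F: '/^Serial Number:/ {
		sub(/^[ \t]+/, "", $2) ; sub(/[ \t]+$/, "", $2)
		print $2 ; exit
	}'
}

# Hand-written sample input, not real smartctl output.
printf 'Device Model:     ST31000528AS\nSerial Number:    9VP4XCQP\n' | parse_serial
```

In the script this would be used as sn=`smartctl -i /dev/$bdev | parse_serial`.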

Best Wishes
Andreas


* Re: Determining which spindle is out of order
  2010-11-07 21:22                         ` John Robinson
@ 2010-11-08 18:59                           ` John Robinson
  0 siblings, 0 replies; 39+ messages in thread
From: John Robinson @ 2010-11-08 18:59 UTC (permalink / raw)
  To: 'Phil Turmel'; +Cc: linux-raid

On 07/11/2010 21:22, John Robinson wrote:
[...]
> I also changed the find command again and it also works, with one nit,
> which is probably just CentOS/RHEL being odd.

Found another nitlet. I plugged in a USB hard drive caddy, with a drive 
in it, and it appears as follows:

Controller device @ pci0000:00/0000:00:1a.7/usb1/1-3/1-3:1.0 [usb-storage]
cat: /sys/devices/pci0000:00/0000:00:1a.7/usb1/1-3/1-3:1.0/../busnum: No such file or directory
   Bus 001 Device 002: ID 152d:2339 JMicron Technology Corp. / JMicron USA Technology Corp.  {SN: 021F807334FF}
     host10: /dev/sdd SAMSUNG HD400LJ
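
One possible workaround, as a sketch and on the assumption that this sysfs simply has no busnum attribute: the bus number is also the N in the "usbN" path component, so it could be taken from the path instead. The function name usb_bus_from_path is hypothetical.

```shell
#!/bin/bash
# Hypothetical fallback: take the bus number from the "usbN" path component
# when no busnum attribute can be read.
usb_bus_from_path () {
	local re='/usb([0-9]+)(/|$)'
	if [[ "$1" =~ $re ]] ; then
		echo "${BASH_REMATCH[1]}"
	fi
}

usb_bus_from_path "/sys/devices/pci0000:00/0000:00:1a.7/usb1/1-3/1-3:1.0"
```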

If there's anything I can help with, just let me know.

Cheers,

John.



* Re: Determining which spindle is out of order
  2010-11-07 14:43                 ` Phil Turmel
  2010-11-07 15:04                   ` Mathias Burén
  2010-11-07 15:19                   ` John Robinson
@ 2010-11-08 21:05                   ` Mr. James W. Laferriere
  2 siblings, 0 replies; 39+ messages in thread
From: Mr. James W. Laferriere @ 2010-11-08 21:05 UTC (permalink / raw)
  To: Phil Turmel; +Cc: John Robinson, linux-raid

 	Hello Phil ,

On Sun, 7 Nov 2010, Phil Turmel wrote:

> On 11/07/2010 08:43 AM, John Robinson wrote:
>> On 07/11/2010 13:21, Phil Turmel wrote:
>>> On 11/07/2010 07:53 AM, John Robinson wrote:
>>>> On 06/11/2010 16:02, Phil Turmel wrote:
>>>>> On 11/06/2010 11:46 AM, John Robinson wrote:
>>>> [...]
>>>>>> Now I need to find udevadm I guess. It must have been introduced since the udev version that comes with RHEL/CentOS 5, which is udev-095-14.21.el5_5.1. rpmfind.net suggests it's only been in since version 118 or so. Never mind :-)
>>>>>
>>>>> Heh.  Anyone know the equivalent command in earlier versions of udev?
>>>>
>>>> I think it's `udevinfo` instead of `udevadm info` - the comment in the ChangeLog for udev-117 is "udevadm: merge all udev tools into a single binary". But it doesn't work terribly well:
>>>>
>>>> [root@beast describe_scsi]# udevinfo -q all -p /devices/pci0000\:00/0000\:00\:1f.2/
>>>> no record for '/devices/pci0000:00/0000:00:1f.2/' in database
>>>>
>>>> That's unfortunate. But it does know about that device if asked differently:
>>>>
>>>> [root@beast describe_scsi]# udevinfo -a -p /devices/pci0000\:00/0000\:00\:1f.2/
>>>
>>> Hmmm.  Can you try both of the above without the trailing slash?
>>
>> Just the same output, however I ask the question:
>>
>> [root@beast ~]# udevinfo -q all -p /devices/pci0000\:00/0000\:00\:1f.2
>> no record for '/devices/pci0000:00/0000:00:1f.2' in database
>> [root@beast ~]# udevinfo -q all -p /devices/pci0000:00/0000:00:1f.2
>> no record for '/devices/pci0000:00/0000:00:1f.2' in database
>> [root@beast ~]# udevinfo -q all -p /sys/devices/pci0000:00/0000:00:1f.2
>> no record for '/devices/pci0000:00/0000:00:1f.2' in database
>> [root@beast ~]#
>>
>> And all with "-a" instead of "-q all" produce the output I posted before.
>
> The modern udevadm gives me that with --attribute-walk.  Its purpose is to report
> the conditions one might want to use in an udev rule.  It doesn't provide the human
> descriptions I'm looking for.
>
>> Please don't feel you have to turn this into a project, though.
>
> Too late.  Here's a version that doesn't use udevadm at all...
>
> #! /bin/bash
> #
> # Examine specific system host devices to identify the drives attached
> #
>
> function describe_controller () {
> 	local device driver modprefix serial slotname
> 	driver="`readlink -f \"$1/driver\"`"
> 	driver="`basename $driver`"
> 	modprefix="`cut -d: -f1 <\"$1/modalias\"`"
> 	echo "Controller device @ ${1##/sys/devices/} [$driver]"
> 	if [[ "$modprefix" == "pci" ]] ; then
> 		slotname="`basename \"$1\"`"
> 		echo -n "  `lspci -s $slotname |cut -d\  -f2-`"
> 		return
> 	fi
> 	if [[ "$modprefix" == "usb" ]] ; then
> 		if [[ -f "$1/busnum" ]] ; then
> 			device="`cat \"$1/busnum\"`:`cat \"$1/devnum\"`"
> 			serial="`cat \"$1/serial\"`"
> 		else
> 			device="`cat \"$1/../busnum\"`:`cat \"$1/../devnum\"`"
> 			serial="`cat \"$1/../serial\"`"
> 		fi
> 		echo "  `lsusb -s $device` {SN: $serial}"
> 		return
> 	fi
> 	echo -e "  `cat \"$1/modalias\"`"
> }
>
> function describe_device () {
> 	targ=${1%/block/*}
> 	vnd="`cat $targ/vendor`"
> 	mdl=`cat $targ/model`
> 	rdev=`readlink -f "$1"`
> 	if [[ -d $rdev ]] ; then
> 		bdev="`basename $rdev`"
> 		sn="`sginfo -s /dev/$bdev | \
> 			sed -rn -e \"/Serial Number/{s%^.+' *(.+) *'.*\\\$%\\\\1%;p;q}\"`" &>/dev/null
> 		if [[ -n "$sn" ]] ; then
> 			echo -e "    $bhost: `echo /dev/$bdev $vnd $mdl {SN: $sn}`"
> 		else
> 			echo -e "    $bhost: `echo /dev/$bdev $vnd $mdl`"
> 		fi
> 	else
> 		echo -e "    $bhost: Unknown $rdev"
> 	fi
> }
>
> function check_host () {
> 	local found=0
> 	local pController=
> 	while read shost ; do
> 		host=`dirname "$shost"`
> 		controller=`dirname "$host"`
> 		bhost=`basename "$host"`
> 		if [[ "$controller" != "$pController" ]] ; then
> 			pController="$controller"
> 			describe_controller "$controller"
> 		fi
> 		for dev in $host/target*/*/block/* ; do
> 			if [[ "${dev: -1}" == '*' ]] ; then
> 				echo -e "    $bhost: [Empty]"
> 			else
> 				describe_device "$dev"
> 			fi
> 		done
> 	done
> }
>
> find /sys/devices/ -name scsi_host |check_host


 	I get the following on my (Ancient) Slackware 10.2.0 server .


# linuxraid-check_host-20101108.sh
Controller device @ pci0000:00/0000:00:01.0 [sym53c8xx]
lspci: -f: Invalid slot number
       host0: [Empty]
Controller device @ pci0000:00/0000:00:01.1 [sym53c8xx]
lspci: -f: Invalid slot number
       host1: [Empty]
basename: too few arguments
Try `basename --help' for more information.
Controller device @ pci0000:02/0000:02:02.0 []
lspci: -f: Invalid slot number
       host2: [Empty]


# lspci --version
lspci version 2.1.11

basename --version
basename (GNU coreutils) 5.2.1
...snipped...


# for XXX in `lspci | grep -i scsi | awk '{print $1}'` ; do lspci -v -v -v -v -v -s ${XXX} ; done

00:01.0 SCSI storage controller: LSI Logic / Symbios Logic 53c1010 66MHz  Ultra3 SCSI Adapter (rev 01)
         Subsystem: LSI Logic / Symbios Logic LSI53C1000/1000R/1010R/1010-66 PCI to Ultra160 SCSI Controller
         Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV+ VGASnoop- ParErr- Stepping- SERR+ FastB2B-
         Status: Cap+ 66Mhz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
         Latency: 72 (4250ns min, 4500ns max), cache line size 08
         Interrupt: pin A routed to IRQ 29
         Region 0: I/O ports at c400 [size=256]
         Region 1: Memory at fe9ff800 (64-bit, non-prefetchable) [size=1K]
         Region 3: Memory at fe9f6000 (64-bit, non-prefetchable) [size=8K]
         Expansion ROM at fe9f0000 [disabled] [size=16K]
         Capabilities: [40] Power Management version 2
                 Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
                 Status: D0 PME-Enable- DSel=0 DScale=0 PME-

00:01.1 SCSI storage controller: LSI Logic / Symbios Logic 53c1010 66MHz  Ultra3 SCSI Adapter (rev 01)
         Subsystem: LSI Logic / Symbios Logic LSI53C1000/1000R/1010R/1010-66 PCI to Ultra160 SCSI Controller
         Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV+ VGASnoop- ParErr- Stepping- SERR+ FastB2B-
         Status: Cap+ 66Mhz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
         Latency: 72 (4250ns min, 4500ns max), cache line size 08
         Interrupt: pin B routed to IRQ 28
         Region 0: I/O ports at c800 [size=256]
         Region 1: Memory at fe9ffc00 (64-bit, non-prefetchable) [size=1K]
         Region 3: Memory at fe9fc000 (64-bit, non-prefetchable) [size=8K]
         Expansion ROM at fe9f8000 [disabled] [size=16K]
         Capabilities: [40] Power Management version 2
                 Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
                 Status: D0 PME-Enable- DSel=0 DScale=0 PME-

# fdisk -l | grep '^Disk'
Disk /dev/sdb: 734.0 GB, 734076600320 bytes
Disk /dev/sda: 18.2 GB, 18210037760 byte

 	Tia ,  JimL
-- 
+------------------------------------------------------------------+
| James   W.   Laferriere | System    Techniques | Give me VMS     |
| Network&System Engineer | 3237     Holden Road |  Give me Linux  |
| babydr@baby-dragons.com | Fairbanks, AK. 99709 |   only  on  AXP |
+------------------------------------------------------------------+


* Re: Determining which spindle is out of order
  2010-11-07 20:52                 ` Roman Mamedov
@ 2010-11-09 14:40                   ` Phil Turmel
  0 siblings, 0 replies; 39+ messages in thread
From: Phil Turmel @ 2010-11-09 14:40 UTC (permalink / raw)
  To: Roman Mamedov; +Cc: John Robinson, linux-raid

On 11/07/2010 03:52 PM, Roman Mamedov wrote:
> On Sun, 07 Nov 2010 13:43:09 +0000
> John Robinson <john.robinson@anonymous.org.uk> wrote:
> 
>> Please don't feel you have to turn this into a project, though.
> 
> Well, on the contrary, why not? There is clearly a lack of such a tool in
> GNU/Linux, so having a designated name for the script, and a small website
> with version history and a download link would be very nice, rather than
> trying to fish out the latest version from the mailing list...
> 
> So I suggest that you *do* turn this into a project. :)

Well, I threw it out as public domain specifically because I don't have enough time available to manage a formal project.  I don't think the script itself is professional enough to want my copyright attached.  I'm happy to kick simple adjustments around on a mailing list, as time permits.

If *you* have the time, go for it.  :)

Phil


end of thread, other threads:[~2010-11-09 14:40 UTC | newest]

Thread overview: 39+ messages
2010-11-03 14:13 Determining which spindle is out of order Nat Makarevitch
2010-11-03 14:38 ` Roman Mamedov
2010-11-03 15:17   ` Graham Mitchell
2010-11-03 16:05     ` Roman Mamedov
2010-11-03 19:00       ` Jon Hardcastle
2010-11-03 14:43 ` John Robinson
2010-11-03 14:45 ` Tim Small
2010-11-03 15:59   ` Jon Hardcastle
2010-11-03 17:17     ` Bill Davidsen
2010-11-03 20:03       ` Tim Small
2010-11-03 15:29 ` Mikael Abrahamsson
2010-11-03 21:54 ` Phil Turmel
2010-11-03 22:26   ` Roman Mamedov
2010-11-04  9:29   ` Tom Carlson
2010-11-06 10:22   ` Leslie Rhorer
2010-11-06 15:12     ` Phil Turmel
     [not found]       ` <4CD57867.4010207@anonymous.org.uk>
2010-11-06 16:02         ` Phil Turmel
2010-11-06 16:11           ` Mathias Burén
2010-11-06 16:45           ` Jan Ceuleers
2010-11-06 19:39             ` Phil Turmel
2010-11-06 20:16               ` Leslie Rhorer
2010-11-06 20:23               ` Mr. James W. Laferriere
2010-11-07  7:51               ` Jan Ceuleers
2010-11-07 12:53           ` John Robinson
2010-11-07 13:21             ` Phil Turmel
2010-11-07 13:43               ` John Robinson
2010-11-07 14:43                 ` Phil Turmel
2010-11-07 15:04                   ` Mathias Burén
2010-11-07 15:19                   ` John Robinson
2010-11-07 18:39                     ` Phil Turmel
2010-11-07 20:46                       ` Leslie Rhorer
2010-11-07 21:22                         ` John Robinson
2010-11-08 18:59                           ` John Robinson
2010-11-07 21:24                       ` Andreas Dröscher
2010-11-08 21:05                   ` Mr. James W. Laferriere
2010-11-07 20:52                 ` Roman Mamedov
2010-11-09 14:40                   ` Phil Turmel
2010-11-06 19:58       ` Leslie Rhorer
2010-11-06 21:17       ` John Robinson
