* smartctl causing HSM violation on sata_nv, 2.6.18
@ 2006-09-27 18:33 Jim Paris
  2006-09-28  6:54 ` Tejun Heo
  0 siblings, 1 reply; 5+ messages in thread
From: Jim Paris @ 2006-09-27 18:33 UTC (permalink / raw)
  To: Tejun Heo; +Cc: linux-ide

Hi Tejun,

My NVIDIA SATA controller is having some problems with smartctl on
2.6.18 (+ the previously mentioned sata_nv patch).  If I try to enable
Attribute Autosave (smartctl -S on) or Automatic Offline (smartctl -o
on), the controller craps out (but recovers).  Executing the same
command on an identical disk connected to a SiI3132 works fine.  Other
SMART stuff (reading attributes, running self-tests) seems to be
behaving just fine.  

-jim


### sata_nv controller (CK804):

# smartctl -data -S on /dev/disk/by-path/pci-0000:00:07.0-scsi-0:0:0:0
smartctl version 5.36 [x86_64-unknown-linux-gnu] Copyright (C) 2002-6 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF ENABLE/DISABLE COMMANDS SECTION ===
Error SMART Enable Auto-save failed: Input/output error
Smartctl: SMART Enable Attribute Autosave Failed.

A mandatory SMART command failed: exiting. To continue, add one or more '-T permissive' options.


### sata_sil24 controller (SiI3132):

# smartctl -data -S on /dev/disk/by-path/pci-0000:04:00.0-scsi-0:0:0:0
smartctl version 5.36 [x86_64-unknown-linux-gnu] Copyright (C) 2002-6 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF ENABLE/DISABLE COMMANDS SECTION ===
SMART Attribute Autosave Enabled.


### Kernel log for NVIDIA case:

[36911.153208] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
[36911.153245] ata1.00: tag 0 cmd 0xb0 Emask 0x2 stat 0x50 err 0x0 (HSM violation)
[36911.462381] ata1: soft resetting port
[36911.618322] ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
[36911.620269] ata1.00: configured for UDMA/133
[36911.620277] ata1: EH complete
[36911.620410] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
[36911.620442] ata1.00: tag 0 cmd 0xb0 Emask 0x2 stat 0x50 err 0x0 (HSM violation)
[36911.930163] ata1: soft resetting port
[36912.086097] ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
[36912.087984] ata1.00: configured for UDMA/133
[36912.087996] ata1: EH complete
[36912.088126] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
[36912.088158] ata1.00: tag 0 cmd 0xb0 Emask 0x2 stat 0x50 err 0x0 (HSM violation)
[36912.397930] ata1: soft resetting port
[36912.553871] ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
[36912.555790] ata1.00: configured for UDMA/133
[36912.555801] ata1: EH complete
[36912.555931] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
[36912.555963] ata1.00: tag 0 cmd 0xb0 Emask 0x2 stat 0x50 err 0x0 (HSM violation)
[36912.865705] ata1: soft resetting port
[36913.021646] ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
[36913.023482] ata1.00: configured for UDMA/133
[36913.023488] ata1: EH complete
[36913.023621] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
[36913.023653] ata1.00: tag 0 cmd 0xb0 Emask 0x2 stat 0x50 err 0x0 (HSM violation)
[36913.333482] ata1: soft resetting port
[36913.489422] ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
[36913.491320] ata1.00: configured for UDMA/133
[36913.491327] ata1: EH complete
[36913.491461] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
[36913.491493] ata1.00: tag 0 cmd 0xb0 Emask 0x2 stat 0x50 err 0x0 (HSM violation)
[36913.801255] ata1: soft resetting port
[36913.957198] ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
[36913.959100] ata1.00: configured for UDMA/133
[36913.959110] ata1: EH complete
[36913.959384] SCSI device sda: 625142448 512-byte hdwr sectors (320073 MB)
[36913.959530] sda: Write Protect is off
[36913.959534] sda: Mode Sense: 00 3a 00 00
[36913.959801] SCSI device sda: drive cache: write back
[36913.960018] SCSI device sda: 625142448 512-byte hdwr sectors (320073 MB)
[36913.960352] sda: Write Protect is off
[36913.960357] sda: Mode Sense: 00 3a 00 00
[36913.960544] SCSI device sda: drive cache: write back
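
For readers decoding the log above, here is a minimal sketch, assuming the standard ATA status register bit layout, of what `stat 0x50` says. The `SKETCH_` constants and function names are hypothetical and exist only for this illustration; they are not taken from libata or smartmontools.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* ATA status register bits; the SKETCH_ names are local to this example. */
enum {
    SKETCH_ST_ERR  = 0x01, /* error */
    SKETCH_ST_DRQ  = 0x08, /* data request: device wants to transfer data */
    SKETCH_ST_DSC  = 0x10, /* device seek complete */
    SKETCH_ST_DF   = 0x20, /* device fault */
    SKETCH_ST_DRDY = 0x40, /* device ready */
    SKETCH_ST_BSY  = 0x80  /* busy */
};

/* Print the names of the bits set in stat; returns how many were set. */
int sketch_print_status(unsigned char stat)
{
    static const struct {
        unsigned char bit;
        const char *name;
    } bits[] = {
        { SKETCH_ST_BSY, "BSY" }, { SKETCH_ST_DRDY, "DRDY" },
        { SKETCH_ST_DF, "DF" },   { SKETCH_ST_DSC, "DSC" },
        { SKETCH_ST_DRQ, "DRQ" }, { SKETCH_ST_ERR, "ERR" },
    };
    int count = 0;
    size_t i;

    printf("stat 0x%02x:", stat);
    for (i = 0; i < sizeof(bits) / sizeof(bits[0]); i++) {
        if (stat & bits[i].bit) {
            printf(" %s", bits[i].name);
            count++;
        }
    }
    printf("\n");
    return count;
}

/* The value from the log: ready and seek complete, no DRQ, no error. */
void sketch_check_logged_status(void)
{
    const unsigned char stat = 0x50;

    assert(stat == (SKETCH_ST_DRDY | SKETCH_ST_DSC));
    assert(!(stat & SKETCH_ST_DRQ));
    assert(!(stat & SKETCH_ST_ERR));
}
```

In other words, the device reports an idle, error-free state with DRQ clear, which together with `err 0x0` suggests the command itself was not rejected by the drive.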


* Re: smartctl causing HSM violation on sata_nv, 2.6.18
  2006-09-27 18:33 smartctl causing HSM violation on sata_nv, 2.6.18 Jim Paris
@ 2006-09-28  6:54 ` Tejun Heo
  2006-09-28  8:09   ` Jim Paris
  0 siblings, 1 reply; 5+ messages in thread
From: Tejun Heo @ 2006-09-28  6:54 UTC (permalink / raw)
  To: Jim Paris; +Cc: linux-ide, smartmontools-support

Hello, Jim Paris, Bruce Allen.

On Wed, Sep 27, 2006 at 02:33:39PM -0400, Jim Paris wrote:
> Hi Tejun,
> 
> My NVIDIA SATA controller is having some problems with smartctl on
> 2.6.18 (+ the previously mentioned sata_nv patch).  If I try to enable
> Attribute Autosave (smartctl -S on) or Automatic Offline (smartctl -o
> on), the controller craps out (but recovers).  Executing the same
> command on an identical disk connected to a SiI3132 works fine.  Other
> SMART stuff (reading attributes, running self-tests) seems to be
> behaving just fine.  

This is because smartctl issues AUTOSAVE and AUTO_OFFLINE with
HDIO_DRIVE_CMD.  Both SMART subcommands are non-data but still use a
non-zero NSECT field, and HDIO_DRIVE_CMD assumes the data-in protocol
when NSECT is non-zero.  The libata HSM implementation is stricter
than ide's and declares an HSM violation when the device reports
command completion while the host is expecting DRQ.
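
A minimal sketch of the difference, assuming the byte layouts given in the kernel's HDIO ioctl documentation. The helper names and `SKETCH_` constants are hypothetical and exist only for illustration. The code fills the argument buffers the two ioctls would receive for "enable autosave" without touching a device.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* ATA SMART values; the SKETCH_ names are local to this example. */
enum {
    SKETCH_SMART_CMD      = 0xb0, /* SMART opcode, the "cmd 0xb0" in the log */
    SKETCH_SMART_AUTOSAVE = 0xd2, /* ENABLE/DISABLE ATTRIBUTE AUTOSAVE */
    SKETCH_AUTOSAVE_ON    = 0xf1, /* sector count value that means "enable" */
    SKETCH_SMART_LCYL     = 0x4f, /* SMART signature, cylinder low */
    SKETCH_SMART_HCYL     = 0xc2  /* SMART signature, cylinder high */
};

/* HDIO_DRIVE_CMD header: command, sector number, feature, sector count. */
void sketch_fill_drive_cmd(unsigned char buf[4], unsigned char feature,
                           unsigned char nsect)
{
    assert(buf != NULL);
    memset(buf, 0, 4);
    buf[0] = SKETCH_SMART_CMD;
    buf[2] = feature;
    buf[3] = nsect;
}

/* HDIO_DRIVE_TASK registers: command, feature, sector count, sector number,
 * cylinder low, cylinder high, device/head. */
void sketch_fill_drive_task(unsigned char buf[7], unsigned char feature,
                            unsigned char nsect)
{
    assert(buf != NULL);
    memset(buf, 0, 7);
    buf[0] = SKETCH_SMART_CMD;
    buf[1] = feature;
    buf[2] = nsect;
    buf[4] = SKETCH_SMART_LCYL;
    buf[5] = SKETCH_SMART_HCYL;
}

/* The rule described above: HDIO_DRIVE_CMD treats a non-zero sector count
 * as a request for data-in, so the host then waits for DRQ. */
int sketch_drive_cmd_expects_data(const unsigned char buf[4])
{
    assert(buf != NULL);
    return buf[3] != 0;
}
```

With the HDIO_DRIVE_CMD layout, enabling autosave leaves 0xf1 in the sector count byte, so `sketch_drive_cmd_expects_data()` returns non-zero even though the subcommand transfers nothing.  The HDIO_DRIVE_TASK layout carries the same value with no implied data phase.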

> ### sata_nv controller (CK804):
> 
> # smartctl -data -S on /dev/disk/by-path/pci-0000:00:07.0-scsi-0:0:0:0
> smartctl version 5.36 [x86_64-unknown-linux-gnu] Copyright (C) 2002-6 Bruce Allen
> Home page is http://smartmontools.sourceforge.net/
> 
> === START OF ENABLE/DISABLE COMMANDS SECTION ===
> Error SMART Enable Auto-save failed: Input/output error
> Smartctl: SMART Enable Attribute Autosave Failed.
> 
> A mandatory SMART command failed: exiting. To continue, add one or more '-T permissive' options.
> 
> 
> ### sata_sil24 controller (SiI3132):
> 
> # smartctl -data -S on /dev/disk/by-path/pci-0000:04:00.0-scsi-0:0:0:0
> smartctl version 5.36 [x86_64-unknown-linux-gnu] Copyright (C) 2002-6 Bruce Allen
> Home page is http://smartmontools.sourceforge.net/
> 
> === START OF ENABLE/DISABLE COMMANDS SECTION ===
> SMART Attribute Autosave Enabled.

sata_sil24 works because the controller hardware snoops the command
and determines protocol by itself.  So, regardless of what the ioctl
says, it executes the command with non-data protocol.

The following patch against smartmontools-5.36 converts it to use the
HDIO_DRIVE_TASK ioctl for AUTOSAVE and AUTO_OFFLINE, which doesn't have
the above issue.

Thanks.

diff -uNr smartmontools-5.36/os_linux.c smartmontools-5.36-fixed/os_linux.c
--- smartmontools-5.36/os_linux.c	2006-04-13 02:02:19.000000000 +0900
+++ smartmontools-5.36-fixed/os_linux.c	2006-09-28 15:41:06.000000000 +0900
@@ -383,14 +383,10 @@
 //   1 if the command succeeded and disk SMART status is "FAILING"
 
 
-// huge value of buffer size needed because HDIO_DRIVE_CMD assumes
-// that buff[3] is the data size.  Since the ATA_SMART_AUTOSAVE and
-// ATA_SMART_AUTO_OFFLINE use values of 0xf1 and 0xf8 we need the space.
-// Otherwise a 4+512 byte buffer would be enough.
-#define STRANGE_BUFFER_LENGTH (4+512*0xf8)
+#define BUFFER_LEN (4+512)
 
 int ata_command_interface(int device, smart_command_set command, int select, char *data){
-  unsigned char buff[STRANGE_BUFFER_LENGTH];
+  unsigned char buff[BUFFER_LEN];
   // positive: bytes to write to caller.  negative: bytes to READ from
   // caller. zero: non-data command
   int copydata=0;
@@ -407,7 +403,7 @@
   // buff[2] contains the ATA SECTOR COUNT REGISTER
   
   // clear out buff.  Large enough for HDIO_DRIVE_CMD (4+512 bytes)
-  memset(buff, 0, STRANGE_BUFFER_LENGTH);
+  memset(buff, 0, BUFFER_LEN);
 
   buff[0]=ATA_SMART_CMD;
   switch (command){
@@ -457,12 +453,14 @@
     buff[2]=ATA_SMART_STATUS;
     break;
   case AUTO_OFFLINE:
-    buff[2]=ATA_SMART_AUTO_OFFLINE;
-    buff[3]=select;   // YET NOTE - THIS IS A NON-DATA COMMAND!!
+    // NSECT is 248 for enable but no data transfer.  Use TASK ioctl.
+    buff[1]=ATA_SMART_AUTO_OFFLINE;
+    buff[2]=select;
     break;
   case AUTOSAVE:
-    buff[2]=ATA_SMART_AUTOSAVE;
-    buff[3]=select;   // YET NOTE - THIS IS A NON-DATA COMMAND!!
+    // NSECT is 241 for enable but no data transfer.  Use TASK ioctl.
+    buff[1]=ATA_SMART_AUTOSAVE;
+    buff[2]=select;
     break;
   case IMMEDIATE_OFFLINE:
     buff[2]=ATA_SMART_IMMEDIATE_OFFLINE;
@@ -517,7 +515,7 @@
     
   // There are two different types of ioctls().  The HDIO_DRIVE_TASK
   // one is this:
-  if (command==STATUS_CHECK){
+  if (command==AUTO_OFFLINE || command==AUTOSAVE || command==STATUS_CHECK){
     int retval;
 
     // NOT DOCUMENTED in /usr/src/linux/include/linux/hdreg.h. You



* Re: smartctl causing HSM violation on sata_nv, 2.6.18
  2006-09-28  6:54 ` Tejun Heo
@ 2006-09-28  8:09   ` Jim Paris
  2006-10-02 18:33     ` Tejun Heo
  0 siblings, 1 reply; 5+ messages in thread
From: Jim Paris @ 2006-09-28  8:09 UTC (permalink / raw)
  To: Tejun Heo; +Cc: linux-ide, smartmontools-support

Hi Tejun,

Tejun Heo wrote:
> The following patch against smartmontools-5.36 converts it to use the
> HDIO_DRIVE_TASK ioctl for AUTOSAVE and AUTO_OFFLINE, which doesn't have
> the above issue.

This patch works like a charm.  Now "smartctl -S on" and "smartctl -o on"
work as expected on all of my SATA and IDE controllers.  Thank you!

-jim


* Re: smartctl causing HSM violation on sata_nv, 2.6.18
  2006-09-28  8:09   ` Jim Paris
@ 2006-10-02 18:33     ` Tejun Heo
  2006-10-02 23:24       ` Doug Maxey
  0 siblings, 1 reply; 5+ messages in thread
From: Tejun Heo @ 2006-10-02 18:33 UTC (permalink / raw)
  To: Jim Paris; +Cc: linux-ide, smartmontools-support

Jim Paris wrote:
> Hi Tejun,
> 
> Tejun Heo wrote:
>> The following patch against smartmontools-5.36 converts it to use the
>> HDIO_DRIVE_TASK ioctl for AUTOSAVE and AUTO_OFFLINE, which doesn't have
>> the above issue.
> 
> This patch works like a charm.  Now "smartctl -S on" and "smartctl -o on"
> work as expected on all of my SATA and IDE controllers.  Thank you!

I got a delivery failure notice for the smartmontools-support mail
address.  Does anyone know how to contact the smartmontools author?

Thanks.

-- 
tejun


* Re: smartctl causing HSM violation on sata_nv, 2.6.18
  2006-10-02 18:33     ` Tejun Heo
@ 2006-10-02 23:24       ` Doug Maxey
  0 siblings, 0 replies; 5+ messages in thread
From: Doug Maxey @ 2006-10-02 23:24 UTC (permalink / raw)
  To: Tejun Heo; +Cc: Jim Paris, linux-ide, smartmontools-support


On Tue, 03 Oct 2006 03:33:42 +0900, Tejun Heo wrote:
> 
> I got a delivery failure notice for the smartmontools-support mail
> address.  Does anyone know how to contact the smartmontools author?
> 

Bruce Allen <ballen@gravity.phys.uwm.edu> is the maintainer.

++doug


