From: Bernd Schubert <bernd.schubert@fastmail.fm>
To: dgilbert@interlog.com, "Martin K. Petersen" <martin.petersen@oracle.com>
Cc: Mike Snitzer <snitzer@redhat.com>, Hannes Reinecke <hare@suse.de>,
	emilne@redhat.com,
	device-mapper development <dm-devel@redhat.com>,
	linux-scsi@vger.kernel.org
Subject: Re: SCSI's heuristics for enabling WRITE SAME still need work [was: dm mpath: disable WRITE SAME if it fails]
Date: Thu, 26 Sep 2013 15:41:38 +0200
Message-ID: <52443992.4040806@fastmail.fm>
In-Reply-To: <5243C880.6050609@interlog.com>

On 09/26/2013 07:39 AM, Douglas Gilbert wrote:
> On 13-09-25 08:44 PM, Martin K. Petersen wrote:
>>>>>>> "Bernd" == Bernd Schubert <bernd.schubert@fastmail.fm> writes:
>>
>> Hey Bernd,
>>
>> Bernd> I'm afraid we have another problem. I'm currently working on
>> Bernd> getting discard working for our LSI2008 HBAs with attached SATA SSDs,
>> Bernd> and the heuristic in sd_read_write_same based on VPD page
>> Bernd> 0x89 is not correct for this HBA - its SATL supports write-same
>>
>> This has nothing to do with the WRITE SAME heuristics.
>>
>> It's true that, depending on wind and weather, we might issue WRITE
>> SAME(10) or (16) with the UNMAP bit set to perform discard operations on
>> the low level device. But we use a set of different (and somewhat more
>> reliable) heuristics to decide which command to send down for that
>> purpose.
>>
>> For discards to a SATA device to work you need a recent phase LSI
>> firmware. And you need the target mode firmware (IT). There is no
>> UNMAP->DSM TRIM translation in the RAID (IR) firmware.
>>
>> If your SATA SSDs report DSM TRIM support, the LSI firmware will set
>> LBPME=1 in READ CAPACITY(16) and the LOGICAL BLOCK PROVISIONING VPD page
>> will indicate a preference for the UNMAP command (LBPU=1).
>>
>> Also, LSI firmware is well-behaved in general and will report ILLEGAL
>> REQUEST when you send down a command that can't be handled.
>
> An example with a LSI 9212-4i4e running the latest firmware
> (P17) connected to a SATA SSD (via an expander):
>
> # sg_vpd /dev/sg1 -p sinq
> standard INQUIRY:
>    PQual=0  Device_type=0  RMB=0  version=0x06  [SPC-4]
>    [AERC=0]  [TrmTsk=0]  NormACA=0  HiSUP=1  Resp_data_format=2
>    SCCS=0  ACC=0  TPGS=0  3PC=0  Protect=0  [BQue=0]
>    EncServ=0  MultiP=0  [MChngr=0]  [ACKREQQ=0]  Addr16=0
>    [RelAdr=0]  WBus16=0  Sync=0  Linked=0  [TranDis=0]  CmdQue=1
>    Vendor_identification: ATA
>    Product_identification: INTEL SSDSA2M080
>    Product_revision_level: 02M3
>
> # sg_vpd /dev/sg1 -p bl
> Block limits VPD page (SBC):
>    Write same no zero (WSNZ): 0
>    Maximum compare and write length: 0 blocks
>    Optimal transfer length granularity: 0 blocks
>    Maximum transfer length: 0 blocks
>    Optimal transfer length: 0 blocks
>    Maximum prefetch length: 0 blocks
>    Maximum unmap LBA count: 4194303
>    Maximum unmap block descriptor count: 32
>    Optimal unmap granularity: 1
>    Unmap granularity alignment valid: 0
>    Unmap granularity alignment: 0
>    Maximum write same length: 0x0 blocks
>
> # sg_vpd /dev/sg1 -p lbpv
> Logical block provisioning VPD page (SBC):
>    Unmap command supported (LBPU): 1
>    Write same (16) with unmap bit supported (LBWS): 1
>    Write same (10) with unmap bit supported (LBWS10): 0
>    Logical block provisioning read zeros (LBPRZ): 0
>    Anchored LBAs supported (ANC_SUP): 1
>    Threshold exponent: 0
>    Descriptor present (DP): 0
>    Provisioning type: 0
>
> # sg_opcodes -n /dev/sg1
> Report supported operation codes: operation not supported
>
>
> Room for improvement there. It also supports a useful set
> of mode pages (including some changeable fields) and two
> log pages.
>

On both types of systems we have in-house, neither the Block Limits VPD
page nor READ CAPACITY(16) returns anything that would indicate discard is
supported. But UNMAP and WRITE SAME with the UNMAP bit (*) work just fine.
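
For illustration, here is a minimal Python sketch of the kind of decision
described earlier in this thread: discard is only considered when
READ CAPACITY(16) reports LBPME=1, and the command is then chosen from the
Logical Block Provisioning and Block Limits VPD data. This is a simplified
model, not the kernel's code; the names LbpInfo and pick_discard_mode and
the returned strings are made up for this example.

```python
from dataclasses import dataclass


@dataclass
class LbpInfo:
    """Provisioning hints a device reports (hypothetical container)."""
    lbpme: bool               # READ CAPACITY(16): provisioning management enabled
    lbpu: bool                # LBP VPD page: UNMAP supported
    lbpws: bool               # LBP VPD page: WRITE SAME(16) with UNMAP bit
    lbpws10: bool             # LBP VPD page: WRITE SAME(10) with UNMAP bit
    max_unmap_lba_count: int  # Block Limits VPD page


def pick_discard_mode(info: LbpInfo) -> str:
    """Pick a discard command from the reported hints (simplified model)."""
    if not info.lbpme:
        # Without LBPME the device claims no provisioning support at all,
        # so nothing is enabled no matter what the VPD pages say.
        return "disabled"
    if info.lbpu and info.max_unmap_lba_count > 0:
        return "unmap"
    if info.lbpws:
        return "writesame_16"
    if info.lbpws10:
        return "writesame_10"
    return "disabled"
```

With lbpme=0, as on the systems described here, this sketch returns
"disabled" regardless of the other bits, even though the commands
themselves work.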

I certainly don't want to cause any more write-same trouble, but as all
layers initially have to assume write same is supported anyway and need
to dynamically disable it if it fails, can't we also enable discard by
default using WRITE SAME(16) with the UNMAP bit?
I'm going to send a PoC patch later on.
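
The idea, roughly sketched in Python (this is not the PoC patch, just an
illustration): start with discard enabled, and turn it off the first time
the device rejects the command itself. The class DiscardState and the
send_ws16_unmap callable are hypothetical; the callable is assumed to
return None on success or a (sense key, ASC) tuple on failure. The sketch
only disables discard for "invalid command operation code" (ASC 0x20) and
"invalid field in CDB" (ASC 0x24), so that an out-of-range LBA (ASC 0x21)
does not switch discard off for the whole device.

```python
ILLEGAL_REQUEST = 0x05           # SCSI sense key
ASC_INVALID_OPCODE = 0x20        # invalid command operation code
ASC_INVALID_FIELD_IN_CDB = 0x24  # invalid field in CDB


class DiscardState:
    """Tracks whether WRITE SAME(16)+UNMAP is still believed to work."""

    def __init__(self):
        # Optimistic default: assume the command works until proven otherwise.
        self.enabled = True

    def discard(self, send_ws16_unmap, lba, num_blocks):
        """Try to discard a range; return True only if the device accepted it."""
        if not self.enabled:
            return False
        result = send_ws16_unmap(lba, num_blocks)
        if result is None:
            return True
        sense_key, asc = result
        if sense_key == ILLEGAL_REQUEST and asc in (
            ASC_INVALID_OPCODE,
            ASC_INVALID_FIELD_IN_CDB,
        ):
            # The device does not understand the command: stop sending it.
            self.enabled = False
        return False
```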

The older system I can play with for a few days has an Intel 510
(SSDSC2MH25) connected to an LSI SAS9211-8i via a SAS enclosure.
Ignoring the identification string and revision level, the sg_vpd output
is almost the same here, with the exception of:

(wheezy)node02:~# sg_vpd /dev/sdb -p bl
Block limits VPD page (SBC):
[...]
   Maximum unmap LBA count: 0
   Maximum unmap block descriptor count: 0
   Optimal unmap granularity: 0
[...]


I think READ CAPACITY(16) is also interesting for discard:

> (wheezy)node02:~# sg_readcap --16 /dev/sda
> Read Capacity results:
>    Protection: prot_en=0, p_type=0, p_i_exponent=0
>    Logical block provisioning: lbpme=0, lbprz=0
>    Last logical block address=490350671 (0x1d3a284f), Number of logical blocks=490350672
>    Logical block length=512 bytes
>    Logical blocks per physical block exponent=0
>    Lowest aligned logical block address=0
> Hence:
>    Device size: 251059544064 bytes, 239429.0 MiB, 251.06 GB
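
For reference, a small Python sketch of how these fields can be decoded
from the first 16 bytes of the READ CAPACITY(16) parameter data, assuming
the SBC-3 layout as I understand it (LBPME and LBPRZ are the two top bits
of byte 14). The function name parse_read_capacity_16 and the dictionary
keys are made up for this example.

```python
import struct


def parse_read_capacity_16(data: bytes) -> dict:
    """Decode the first 16 bytes of READ CAPACITY(16) parameter data."""
    if len(data) < 16:
        raise ValueError("need at least 16 bytes of parameter data")
    # Bytes 0-7: last LBA, bytes 8-11: logical block length (big-endian).
    last_lba, block_len = struct.unpack_from(">QI", data, 0)
    byte12, byte13 = data[12], data[13]
    # Bytes 14-15: LBPME (bit 15), LBPRZ (bit 14), lowest aligned LBA (bits 13-0).
    word14 = struct.unpack_from(">H", data, 14)[0]
    num_blocks = last_lba + 1
    return {
        "last_lba": last_lba,
        "num_blocks": num_blocks,
        "block_len": block_len,
        "prot_en": byte12 & 0x01,
        "p_type": (byte12 >> 1) & 0x07,
        "p_i_exponent": byte13 >> 4,
        "lb_per_pb_exponent": byte13 & 0x0F,
        "lbpme": (word14 >> 15) & 1,
        "lbprz": (word14 >> 14) & 1,
        "lowest_aligned_lba": word14 & 0x3FFF,
        "size_bytes": num_blocks * block_len,
    }


# Parameter data rebuilt from the values printed above (not a raw capture).
sample = struct.pack(">QIBBH", 490350671, 512, 0, 0, 0)
info = parse_read_capacity_16(sample)
print(info["lbpme"], info["lbprz"], info["size_bytes"])
```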


So again no indication of discard support.


We also flashed the IT firmware long ago and just recently updated to
firmware version 17:
> mpt2sas0: LSISAS2008: FWVersion(17.00.01.00), ChipRevision(0x03), BiosVersion(07.33.00.00)
> mpt2sas0: Protocol=(Initiator,Target), Capabilities=(TLR,EEDP,Snapshot Buffer,Diag Trace Buffer,Task Set Full,NCQ)


Thanks,
Bernd


PS: LSI SATL with FWv17 seems to have an unmap bug - I cannot unmap the 
last sector:

> (wheezy)node02:~# cat /sys/block/sdb/size
> 488397168

> (wheezy)node02:~# sg_write_same --16 --unmap --verbose --lba=488397167 --num=1 /dev/sdb
> Default data-out buffer set to 512 zeros

So write same works. But then unmap fails:

> (wheezy)node02:~# sg_unmap --verbose --lba=488397167 --num=1 /dev/sdb
>     unmap cdb: 42 00 00 00 00 00 00 00 18 00
> unmap:  Fixed format, current;  Sense key: Illegal Request
>  Additional sense: Logical block address out of range
>   Info fld=0x1d1c596f [488397167]
> bad field in UNMAP cdb
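
To show that the rejected request is within range, here is a Python sketch
that builds an UNMAP CDB and parameter list for a single block descriptor,
plus the bounds check one would expect a target to apply. The layout
follows my reading of SBC-3; the helper names build_unmap and in_range are
made up for this example.

```python
import struct


def build_unmap(lba: int, num_blocks: int):
    """Return (cdb, parameter_list) for an UNMAP with one block descriptor."""
    # Block descriptor: 8-byte LBA, 4-byte block count, 4 reserved bytes.
    descriptor = struct.pack(">QI4x", lba, num_blocks)
    # Header: UNMAP data length (bytes following that field), block
    # descriptor data length, 4 reserved bytes.
    header = struct.pack(">HH4x", len(descriptor) + 6, len(descriptor))
    params = header + descriptor
    # CDB: opcode 0x42, bytes 1-5 zero, group number, parameter list
    # length (bytes 7-8), control.
    cdb = struct.pack(">B5xBHB", 0x42, 0, len(params), 0)
    return cdb, params


def in_range(lba: int, num_blocks: int, capacity_blocks: int) -> bool:
    """True if [lba, lba + num_blocks) lies within the device."""
    return lba + num_blocks <= capacity_blocks


cdb, params = build_unmap(488397167, 1)
print(cdb.hex(" "))                         # same bytes as the cdb printed above
print(in_range(488397167, 1, 488397168))    # last sector of /dev/sdb
```

The LBA and block count travel in the parameter list rather than in the
CDB, which is why sg_unmap prints the same cdb bytes for different ranges.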

All sectors before that work fine:

> (wheezy)node02:~# sg_unmap --verbose --lba=0 --num=488397167 /dev/sdb
>     unmap cdb: 42 00 00 00 00 00 00 00 18 00
> (







