From: Roberto Spadim <roberto@spadim.com.br>
To: doug@easyco.com
Cc: Chris Worley <worleys@gmail.com>,
	"Scott E. Armitage" <launchpad@scott.armitage.name>,
	David Brown <david@westcontrol.com>,
	linux-raid@vger.kernel.org
Subject: Re: SSD - TRIM command
Date: Wed, 9 Feb 2011 17:22:19 -0200
Message-ID: <AANLkTimDWW-n7a9VR-GnCbb4g+QXy_ZmZtoRoWWGCGr9@mail.gmail.com>
In-Reply-To: <AANLkTi=+w-q8s-y-K-24ZwSLsLrJ44Oh9LiF3ReeyYdP@mail.gmail.com>

I agree with the ppps.
That's why ECC, checksums and parity are useful (raid5,6), and raid1
too if you read from all mirrors, check for differences and select the
'right disk'.
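
To illustrate the raid1 idea, here is a minimal sketch of picking the
'right' copy by majority vote when a block is read from every mirror.
The function name pick_majority and the sample data are made up for the
example, and this is not a description of how md behaves today. Note
that with only two mirrors a mismatch can be detected but not resolved,
so you would need a third copy or a checksum.

```python
from collections import Counter

def pick_majority(copies):
    """Return (block, suspect_indexes) by majority vote across mirror copies.

    `copies` is a list of bytes objects, one per mirror (hypothetical input).
    """
    winner, votes = Counter(copies).most_common(1)[0]
    if votes <= len(copies) // 2:
        # No strict majority: we know the mirrors differ, not which is right.
        raise ValueError("no majority: cannot tell which mirror is right")
    suspects = [i for i, c in enumerate(copies) if c != winner]
    return winner, suspects

good = b"\x11" * 16
bad = b"\x11" * 15 + b"\x10"   # one silently corrupted byte
block, suspects = pick_majority([good, bad, good])
print(suspects)
```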

2011/2/9 Doug Dumitru <doug@easyco.com>:
> I work with SSD arrays all the time, so I have a couple of thoughts
> about trim and md.
>
> 'trim' is still necessary.  SandForce controllers are "better" at
> this, but still need free space to do their work.  I had a set of SF
> drives drop to 22 MB/sec writes because they were full and scrambled.
> It takes a lot of effort to get them that messed up, but it can still
> happen.  Trim brings them back.
>
> The bottom line is that SSDs do block re-organization on the fly and
> free space makes the re-org more efficient.  More efficient means
> faster and, just as importantly, less wear amplification.
>
> Most SSDs (and I think the latest trim spec) are deterministic on
> trim'd sectors.  If you trim a sector, they read that sector as zeros.
>  This makes raid much "safer".
>
> raid/0,1,10 should be fine to echo discard commands down to the
> downstream drives in the bio request.  It is then up to the physical
> device driver to turn the discard bio request into an ATA (or SCSI)
> trim.  Most block devices don't seem to understand discard requests
> yet, but this will get better over time.
>
> raid/4,5,6 is a lot more complicated.  With raid/4,5 with an even
> number of drives, you can trim whole stripes safely.  Pieces of
> stripes get interesting because you have to treat a trim as a write of
> zeros and re-calc parity.  raid/6 will always have parity issues
> regardless of how many drives there are.  Even worse is that
> raid/4,5,6 parity read/modify/write operations tend to chatter the FTL
> (Flash Translation Layer) logic and make matters worse (often much
> worse).  If you are not streaming long linear writes, raid/4,5,6 in a
> heavy write environment is probably a very bad idea for most SSDs.
>
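The "treat a trim as a write of zeros and re-calc parity" point can be
sketched with plain XOR parity. This is only an illustration: it
assumes trimmed sectors read back as zeros, and the chunk size, data
values and the helper xor_blocks are invented for the example. It is
not md code.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks column by column (RAID-4/5 style parity)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

CHUNK = 4  # hypothetical chunk size in bytes, tiny for readability

# One stripe on a 4-drive raid5: three data chunks plus one parity chunk.
data = [b"\x0f" * CHUNK, b"\xf0" * CHUNK, b"\x3c" * CHUNK]
parity = xor_blocks(data)

# Whole-stripe trim: every chunk, parity included, now reads as zeros.
# The XOR of all-zero data is all zeros, so the stripe stays consistent.
zeros = bytes(CHUNK)
whole_stripe_ok = xor_blocks([zeros] * len(data)) == zeros

# Partial trim: only data chunk 1 is discarded and now reads as zeros.
trimmed = [data[0], zeros, data[2]]
stale = xor_blocks(trimmed) != parity   # the old parity no longer matches

# Read/modify/write update: new_parity = old_parity ^ old_data ^ new_data.
new_parity = xor_blocks([parity, data[1], zeros])
repaired = new_parity == xor_blocks(trimmed)

print(whole_stripe_ok, stale, repaired)
```

The partial case is the expensive one: the parity chunk has to be
rewritten even though the filesystem only asked for a discard.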
> Another issue with trim is how "async" it behaves.  You can trim a lot
> of data to a drive, but it is hard to tell when the drive actually is
> ready afterwards.  Some drives also choke on trim requests that come
> at them too fast or requests that are too long.  The behavior can be
> quite random.  So then comes the issue of how many "user knobs" to
> supply to tune what trims where.  Again, raid/0,1,10 are pretty easy.
> Raid/4,5,6 really requires that you know the precise geometry and
> control the IO.  Way beyond what ext4 understands at this point.
>
> Trim can also be "faked" with some drives.  Again, looking at the
> SandForce based drives, these drives internally de-dupe so you can fake
> write data and help the drives get free space.  Do this by filling the
> drive with zeros (ie, dd if=/dev/zero of=big.file bs=1M), do a sync,
> and then delete the big.file.  This works through md, across SANs,
> from XEN virtuals, or wherever.  With SandForce drives, this is not as
> effective as a trim, but better than nothing.  Unfortunately, only
> SandForce drives and Flash Supercharger understand zeros this way.  A
> filesystem option that "zeros discarded sectors" would actually make
> as much sense in some deployment settings as the discard option (not
> sure, but ext# might already have this).  NTFS has actually supported
> this since XP as a security enhancement.
>
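The zero-fill trick above (dd, sync, delete) could be scripted roughly
like this. It is a sketch, not a tested tool: the function name
zero_fill_free_space and the max_bytes cap are my own additions, and
the demo at the end only writes 4 MiB into a temporary directory. Pass
max_bytes=None and a path on the target filesystem to really fill it.

```python
import errno
import os
import tempfile

def zero_fill_free_space(path, chunk_size=1 << 20, max_bytes=None):
    """Write zeros to `path` until the filesystem is full or `max_bytes`
    is reached, fsync, then delete the file. Returns the bytes written."""
    zeros = bytes(chunk_size)
    written = 0
    try:
        # buffering=0 so that "disk full" surfaces on write(), not on close()
        with open(path, "wb", buffering=0) as f:
            try:
                while max_bytes is None or written < max_bytes:
                    want = chunk_size
                    if max_bytes is not None:
                        want = min(chunk_size, max_bytes - written)
                    written += f.write(zeros[:want])
            except OSError as e:
                if e.errno != errno.ENOSPC:
                    raise              # a full filesystem is the expected end
            os.fsync(f.fileno())       # same role as the 'sync' step
    finally:
        if os.path.exists(path):
            os.remove(path)            # same role as deleting big.file
    return written

# Safe demo: capped at 4 MiB inside a throwaway directory.
with tempfile.TemporaryDirectory() as d:
    demo_path = os.path.join(d, "big.file")
    demo_written = zero_fill_free_space(demo_path, max_bytes=4 << 20)
    demo_file_left = os.path.exists(demo_path)
print(demo_written, demo_file_left)
```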
> Doug Dumitru
> EasyCo LLC
>
> ps:  My background with this has been the development of Flash
> SuperCharger.  I am not trying to run an advert here, but the care and
> feeding of SSDs can be interesting.  Flash SuperCharger breaks most of
> these rules, but it does know the exact geometry of what it is driving
> and plays excessive games to drive SSDs at their exact "sweet spot".
> One of our licensees just sent me some benchmarks at > 500,000 4K
> random writes/sec for a moderate sized array running raid/5.
>
> pps:  Failures of SSDs are different than HDDs.  SSDs can and do fail
> and need raid for many applications.  If you need high write IOPS, it
> pretty much has to be raid/1,10 (unless you run our Flash SuperCharger
> layer).
>
> ppps:  I have seen SSDs silently return corrupted data.  Disks do this
> as well.  A paper from 2 years ago quoted disk silent error rates as
> high as 1 bad block every 73TB read.  Very scary stuff, but probably
> beyond the scope of what md can address.
> --
> To unsubscribe from this list: send the line "unsubscribe linux-raid" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>



-- 
Roberto Spadim
Spadim Technology / SPAEmpresarial
