From: Chris Worley <worleys@gmail.com>
To: "Martin K. Petersen" <martin.petersen@oracle.com>
Cc: "Majed B." <majedb@gmail.com>, Linux RAID <linux-raid@vger.kernel.org>
Subject: Re: Intel Updates SSDs, Supports TRIM, Faster Writes
Date: Tue, 10 Nov 2009 10:22:13 -0700
Message-ID: <f3177b9e0911100922u435993behd0889ae284cf0ec4@mail.gmail.com>
In-Reply-To: <yq1ljienxql.fsf@sermon.lab.mkp.net>

On Tue, Nov 10, 2009 at 9:36 AM, Martin K. Petersen
<martin.petersen@oracle.com> wrote:
>>>>>> "Chris" == Chris Worley <worleys@gmail.com> writes:
>
> Chris> The only problem is SSD's put Solid State Storage (SSS) behind
> Chris> SATA/SAS controllers... while compatible w/ old disk technology,
> Chris> it severely limits performance (i.e. none of these SSD drives do
> Chris> even 300MB/s... while SSS drives do 800MB/s).
>
> You are arguing that the SATA/SCSI protocols are inhibiting factors on
> the grounds that PCIe solid state devices are faster.
>
> Performance inside a flash device is gated by the number of channels you
> run in parallel.  There is not much point in increasing the number of
> channels if your physical interconnect (3Gbps SATA, say) can't handle
> the traffic.  Hence the drive towards 6Gbps interconnects and beyond for
> both SATA and SAS.

Absolutely agreed: the SSD manufacturers size their NAND back-ends to
the limits of the controller front-end (a 3Gbps SATA link carries at
most ~300MB/s of payload after 8b/10b encoding, which is why none of
these drives exceed that figure).  And since the management layer runs
on an on-board ASIC, the design limits performance even further.

>
> Also, not all SSS boards present a memory-style device to the host.
> Several shipping SSS boards use a regular SAS HBA backed by multiple
> SATA/SAS targets which again comprise of multiple flash channels.  And
> the performance of these devices is absolutely on par with the
> memory-based devices.  Without requiring proprietary drivers, and
> without reinventing filesystems and I/O stack.

I'm not talking about memory-based or memory-like devices.  A block
device is all you need, and you don't have to rewrite filesystems to
put one atop a block device.
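
To make that concrete, here's a minimal sketch -- the /dev/sss0 node is
hypothetical -- of an application sitting directly on the plain block
interface with nothing but open() and pread():

  #define _GNU_SOURCE             /* O_DIRECT */
  #include <fcntl.h>
  #include <stdio.h>
  #include <stdlib.h>
  #include <unistd.h>

  int main(void)
  {
          /* O_DIRECT bypasses the page cache; buffers must be aligned. */
          int fd = open("/dev/sss0", O_RDONLY | O_DIRECT);
          if (fd < 0) { perror("open"); return 1; }

          void *buf;
          if (posix_memalign(&buf, 4096, 4096)) { close(fd); return 1; }

          /* Read one 4KiB block at offset 0, exactly as from any disk. */
          if (pread(fd, buf, 4096, 0) != 4096)
                  perror("pread");

          free(buf);
          close(fd);
          return 0;
  }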

Those using legacy controller technology can overcome the issue by
using multiple devices, but we've been talking about single-device
performance.  I can get 6GB/s using 8 SSS drives (8 x ~800MB/s,
consistent with the single-drive figure above).  Scalability is much
easier when you start with really fast individual components.  So
legacy controllers are still a bad design.
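
Striping at that level needs nothing exotic.  Here's a rough sketch of
the access pattern -- one streaming reader per device, device names
hypothetical, error handling minimal (build with -pthread):

  #define _GNU_SOURCE             /* O_DIRECT */
  #include <fcntl.h>
  #include <pthread.h>
  #include <stdio.h>
  #include <stdlib.h>
  #include <unistd.h>

  #define NDEV  8
  #define CHUNK (1 << 20)         /* 1MiB per read */

  static void *reader(void *arg)
  {
          const char *dev = arg;
          int fd = open(dev, O_RDONLY | O_DIRECT);
          if (fd < 0) { perror(dev); return NULL; }

          void *buf;
          if (posix_memalign(&buf, 4096, CHUNK)) { close(fd); return NULL; }

          /* Stream 1GiB sequentially; each device runs flat out. */
          for (off_t off = 0; off < ((off_t)1 << 30); off += CHUNK)
                  if (pread(fd, buf, CHUNK, off) <= 0)
                          break;

          free(buf);
          close(fd);
          return NULL;
  }

  int main(void)
  {
          pthread_t tid[NDEV];
          char dev[NDEV][32];
          int i;

          for (i = 0; i < NDEV; i++) {
                  snprintf(dev[i], sizeof(dev[i]), "/dev/sss%d", i);
                  pthread_create(&tid[i], NULL, reader, dev[i]);
          }
          for (i = 0; i < NDEV; i++)
                  pthread_join(tid[i], NULL);
          return 0;
  }

With each reader pinned to its own device, aggregate throughput adds up
roughly linearly until something upstream (PCIe lanes, memory bus)
becomes the bottleneck.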

>
> We have been pushing tens of gigabytes per second through the storage
> stack for years when connected to arrays which - given their large
> non-volatile caches - are virtually indistinguishable from SSDs.  We're
> constantly tweaking and tuning.  Jens has done a lot of work to bring
> down command latency, I have worked on storage topology which allows us
> to uniquely identify the characteristics of the physical storage device
> so we can issue I/O in an optimal fashion.

And I do appreciate all your work.  I fear that, in this case, discard
will be optimized for the slower technology... we won't get everything
it has to offer.
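
For reference, this is all a discard amounts to from userspace (device
name hypothetical); the block layer maps the range to ATA TRIM or SCSI
UNMAP/WRITE SAME as the transport requires:

  #include <fcntl.h>
  #include <linux/fs.h>           /* BLKDISCARD */
  #include <stdint.h>
  #include <stdio.h>
  #include <sys/ioctl.h>
  #include <unistd.h>

  int main(void)
  {
          int fd = open("/dev/sss0", O_WRONLY);
          if (fd < 0) { perror("open"); return 1; }

          /* Tell the device the first 1MiB is no longer in use:
           * {start, length} in bytes. */
          uint64_t range[2] = { 0, 1 << 20 };
          if (ioctl(fd, BLKDISCARD, range) < 0)
                  perror("BLKDISCARD");

          close(fd);
          return 0;
  }
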
>
> Note that I don't think that memory-based SSS devices are without merit.

Let's call it CPU-based.  "Memory-based" sounds like RAM-based
storage... we're not talking about that.

> But it's baloney to claim that a storage-flavored interface inherently
> means bad performance.

You need an epiphany here.  Between the SAS/SATA controllers and the
on-board drive logic, SSDs are a bad design when it comes to
performance.  They are dwarfed by CPU-based controllers: the host CPU
has far more horsepower for the management NAND requires, and there
are plenty of cores sitting idle these days.

SSDs do win the "compatibility" argument.  It's too bad we didn't
invent thumb drives that were floppy-compatible ;)

>
>
> Chris> So it looks like "design by committee" Linux is well behind
> Chris> Windows 7, while Linux contemplates slowing new technology down
> Chris> to optimize for ill-designed SSD's.
>
> We're not slowing anything, nor are we optimizing for ill-designed SSDs.
>
> Because initial TRIM performance was absolutely appalling

Only on SSDs behind legacy controllers.  It worked great as-is with SSS.

> there was a
> lot of discussion about the merits of doing weekly scrubs instead of
> issuing TRIM on the fly.  However, Windows 7 shipped issuing TRIM in
> realtime which means that all the early SSDs with lame duck DSM
> performance are headed straight for the garbage bin.

Too bad the legacy design doesn't go with them ;)

Chris
>
> Furthermore, unlike Windows 7 we can't pretend everything is desktop
> class ATA.  We've spent a lot of time making sure that our block layer
> discard support works equally well for both ATA DSM (TRIM) as well as
> SCSI WRITE SAME and UNMAP used by high-end arrays.  All three commands
> have been moving targets and none of them are technically set in stone
> in their respective standards bodies yet.
>
> So I think it would be a stretch to claim that TRIM is well tested and
> stable in the industry.  Intel just pulled their latest X25-M firmware
> because of problems with Windows 7...
>
> --
> Martin K. Petersen      Oracle Linux Engineering
>
