From: Piergiorgio Sartor <piergiorgio.sartor@nexgo.de>
To: Roberto Spadim <roberto@spadim.com.br>
Cc: Piergiorgio Sartor <piergiorgio.sartor@nexgo.de>,
	"Eric D. Mudama" <edmudama@bounceswoosh.org>,
	"Scott E. Armitage" <launchpad@scott.armitage.name>,
	David Brown <david@westcontrol.com>,
	linux-raid@vger.kernel.org
Subject: Re: SSD - TRIM command
Date: Wed, 9 Feb 2011 20:21:02 +0100
Message-ID: <20110209192101.GA20745@lazy.lzy>
In-Reply-To: <AANLkTina-4yvjFgR4dxvvYvNZ56gfZ4S324OWh-zBjji@mail.gmail.com>

> yeah =)
> a question...
> if i send a TRIM to a sector
> if i read from it
> what i have?
> 0x00000000000000000000000000000000000 ?
> if yes, we could translate TRIM to WRITE on devices without TRIM (hard disks)
> just to have the same READ information

Returning 0x0 does not seem to be a standard. The values
read back after a TRIM appear to be undefined, even if
0x0 *might* be common.
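
To make the consequence for md concrete, here is a minimal sketch
(a toy model, not md code) of XOR parity over one RAID-5 stripe.
The helper name xor_parity, the 4-byte chunks and the 0xa5 "garbage"
pattern are assumptions chosen only for illustration.

```python
from functools import reduce

def xor_parity(chunks):
    """XOR equal-length byte strings column by column, as RAID-5 parity does."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*chunks))

CHUNK = 4  # toy chunk size in bytes (assumption)
data = [b"\x11" * CHUNK, b"\x22" * CHUNK, b"\x44" * CHUNK]
parity = xor_parity(data)  # parity as written before any TRIM

# Case 1: only chunk 0 is trimmed and the device happens to read back zeros.
# The old parity no longer matches, so md would have to rewrite it.
zero_read = [bytes(CHUNK), data[1], data[2]]
stale_after_zero = xor_parity(zero_read) != parity

# Case 2: same TRIM, but the device reads back arbitrary content.
# The parity mismatches as well, and the result is not even predictable.
garbage_read = [b"\xa5" * CHUNK, data[1], data[2]]
stale_after_garbage = xor_parity(garbage_read) != parity

# Case 3: the whole stripe (data and parity) is trimmed and every device
# reads back zeros. XOR of zeros is zero, so the stripe stays consistent.
full_trim = [bytes(CHUNK)] * 3
consistent_full_zero = xor_parity(full_trim) == bytes(CHUNK)
```

Only the third case is consistent without extra work, and only under
the assumption that every device returns zeros.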

Second, why do you want to emulate the 0x0 thing?

I do not see the point of writing zeros to a device
which does not support TRIM. Doing nothing seems a
better choice, even in a mixed environment.
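
As a rough sketch of what "do nothing" could look like in a mixed
setup: forward the hint to devices that accept it and skip the rest,
with no zero-writes issued. The Device class, the discard() function
and the device names are hypothetical and stand in for whatever layer
would dispatch the command.

```python
class Device:
    """Toy block device that only records what it was asked to do."""

    def __init__(self, name, supports_trim):
        self.name = name
        self.supports_trim = supports_trim
        self.trimmed = []  # (start, length) ranges declared unused
        self.writes = 0    # number of write commands issued

    def trim(self, start, length):
        self.trimmed.append((start, length))


def discard(devices, start, length):
    """Forward a discard hint to capable devices and skip the others."""
    forwarded = []
    for dev in devices:
        if dev.supports_trim:
            dev.trim(start, length)
            forwarded.append(dev.name)
        # Devices without TRIM are intentionally left alone:
        # no zero-fill write is emulated for them.
    return forwarded


ssd = Device("ssd0", supports_trim=True)
hdd = Device("hdd0", supports_trim=False)
result = discard([ssd, hdd], start=2048, length=8)
```

In this model the hard disk keeps its old content, so a later read of
the discarded range may differ between the two devices; since the
read-back value after TRIM is not defined anyway, nothing should rely
on it.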

bye,

pg
 
> 2011/2/9 Piergiorgio Sartor <piergiorgio.sartor@nexgo.de>:
> >> it's just a discussion, right? no implementation yet, right?
> >
> > Of course...
> >
> >> what i think....
> >> if device accept TRIM, we can use TRIM.
> >> if not, we must translate TRIM to something similar (maybe many WRITES
> >> ?), and when we READ from disk we get the same information
> >
> > TRIM is not about writing at all. TRIM tells the
> > device that the addressed block is no longer in use,
> > so it (the SSD) can do whatever it wants with it.
> >
> > The only software layer having the same "knowledge"
> > is the filesystem; the other layers do not have
> > any say in block allocation.
> > Except for metadata, of course.
> >
> > So, IMHO, a software TRIM can only be in the FS.
> >
> > bye,
> >
> > pg
> >
> >> the translation could be done by kernel (not md) maybe options on
> >> libata, nbd device....
> >> other option is do it with md, internal (md) TRIM translate function
> >>
> >> who send trim?
> >> internal md information: md can generate it (if necessary, maybe it's
> >> not...) for parity disks (not data disks)
> >> filesystem/or another upper layer program (database with direct device
> >> access), we could accept TRIM from filesystem/database, and send it to
> >> disks/mirrors, when necessary translate it (internal or kernel
> >> translate function)
> >>
> >>
> >> 2011/2/9 Piergiorgio Sartor <piergiorgio.sartor@nexgo.de>:
> >> > On Wed, Feb 09, 2011 at 04:30:15PM -0200, Roberto Spadim wrote:
> >> >> nice =)
> >> >> but check that parity block is a raid information, not a filesystem information
> >> >> for raid we could implement trim when possible (like swap)
> >> >> and implement a trim that we receive from filesystem, and send to all
> >> >> disks (if it's a raid1 with mirrors, we should send to all mirrors)
> >> >
> >> > To all disk also in case of RAID-5?
> >> >
> >> > What if the TRIM belongs only to a single SSD block
> >> > belonging to a single chunk of a stripe?
> >> > That is a *single* SSD of the RAID-5.
> >> >
> >> > Should md re-read the block and re-write (not TRIM)
> >> > the parity?
> >> >
> >> > I think anything that has to do with checking &
> >> > repairing must be carefully considered...
> >> >
> >> > bye,
> >> >
> >> > pg
> >> >
> >> >> i don't know very well what trim does, but i think it's a very big write
> >> >> with only some bits for example:
> >> >> set sector1='00000000000000000000000000000000000000000000000000'
> >> >> could be replaced by:
> >> >> trim sector1
> >> >> it's faster for sata communication, and it's a good information for
> >> >> hard disk (it can put a single '0' at the start of the sector and know
> >> >> that all sector is 0, if it try to read any information it can use
> >> >> internal memory (don't read hard disk), if a write is done it should
> >> >> write 0000 to bits, and after the write operation, but it's
> >> >> internal function of hard disk/ssd, not a problem of md raid... md
> >> >> raid should need know how to optimize and use it =] )
> >> >>
> >> >> 2011/2/9 Piergiorgio Sartor <piergiorgio.sartor@nexgo.de>:
> >> >> >> ext4 send trim commands to device (disk/md raid/nbd)
> >> >> >> kernel swap send this commands (when possible) to device too
> >> >> >> for internal raid5 parity disk this could be done by md, for data
> >> >> >> disks this should be done by ext4
> >> >> >
> >> >> > That's an interesting point.
> >> >> >
> >> >> > On which basis should a parity "block" get a TRIM?
> >> >> >
> >> >> > If you ask me, I think the complete TRIM story is, at
> >> >> > best, a temporary patch.
> >> >> >
> >> >> > IMHO the wear levelling should be handled by the filesystem
> >> >> > and, with awareness of this, by the underlying device drivers.
> >> >> > Reason is that the FS knows better what's going on with the
> >> >> > blocks and what will happen.
> >> >> >
> >> >> > bye,
> >> >> >
> >> >> > pg
> >> >> >
> >> >> >>
> >> >> >> the other question... about resync with only write what is different
> >> >> >> this is very good since write and read speed can be different for ssd
> >> >> >> (hd don't have this 'problem')
> >> >> >> but i'm sure that just write what is diff is better than write all
> >> >> >> (ssd life will be bigger, hd maybe... i think that will be bigger too)
> >> >> >>
> >> >> >>
> >> >> >> 2011/2/9 Eric D. Mudama <edmudama@bounceswoosh.org>:
> >> >> >> > On Wed, Feb  9 at 11:28, Scott E. Armitage wrote:
> >> >> >> >>
> >> >> >> >> Who sends this command? If md can assume that determinate mode is
> >> >> >> >> always set, then RAID 1 at least would remain consistent. For RAID 5,
> >> >> >> >> consistency of the parity information depends on the determinate
> >> >> >> >> pattern used and the number of disks. If you used determinate
> >> >> >> >> all-zero, then parity information would always be consistent, but this
> >> >> >> >> is probably not preferable since every TRIM command would incur an
> >> >> >> >> extra write for each bit in each page of the block.
> >> >> >> >
> >> >> >> > True, and there are several solutions.  Maybe track space used via
> >> >> >> > some mechanism, such that when you trim you're only trimming the
> >> >> >> > entire stripe width so no parity is required for the trimmed regions.
> >> >> >> > Or, trust the drive's wear leveling and endurance rating, combined
> >> >> >> > with SMART data, to indicate when you need to replace the device
> >> >> >> > preemptive to eventual failure.
> >> >> >> >
> >> >> >> > It's not an unsolvable issue.  If the RAID5 used distributed parity,
> >> >> >> > you could expect wear leveling to wear all the devices evenly, since
> >> >> >> > on average, the # of writes to all devices will be the same.  Only a
> >> >> >> > RAID4 setup would see a lopsided amount of writes to a single device.
> >> >> >> >
> >> >> >> > --eric
> >> >> >> >
> >> >> >> > --
> >> >> >> > Eric D. Mudama
> >> >> >> > edmudama@bounceswoosh.org
> >> >> >> >
> >> >> >> > --
> >> >> >> > To unsubscribe from this list: send the line "unsubscribe linux-raid" in
> >> >> >> > the body of a message to majordomo@vger.kernel.org
> >> >> >> > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> >> >> >> >
> >> >> >>
> >> >> >>
> >> >> >>
> >> >> >> --
> >> >> >> Roberto Spadim
> >> >> >> Spadim Technology / SPAEmpresarial
> >> >> >
> >> >> > --
> >> >> >
> >> >> > piergiorgio
> >> >> >
> >> >>
> >> >>
> >> >>
> >> >> --
> >> >> Roberto Spadim
> >> >> Spadim Technology / SPAEmpresarial
> >> >
> >> > --
> >> >
> >> > piergiorgio
> >> >
> >>
> >>
> >>
> >> --
> >> Roberto Spadim
> >> Spadim Technology / SPAEmpresarial
> >
> > --
> >
> > piergiorgio
> >
> 
> 
> 
> -- 
> Roberto Spadim
> Spadim Technology / SPAEmpresarial

-- 

piergiorgio
