From: Greg Freemyer <greg.freemyer@gmail.com>
To: Mark Lord <kernel@teksavvy.com>
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>,
	James Bottomley <James.Bottomley@suse.de>,
	Jeff Moyer <jmoyer@redhat.com>,
	Christoph Hellwig <hch@infradead.org>,
	Matthew Wilcox <matthew@wil.cx>, Josef Bacik <josef@redhat.com>,
	Lukas Czerner <lczerner@redhat.com>,
	tytso@mit.edu, linux-ext4@vger.kernel.org,
	linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	sandeen@redhat.com
Subject: Re: [PATCH 1/2] fs: Do not dispatch FITRIM through separate super_operation
Date: Thu, 18 Nov 2010 17:16:36 -0800
Message-ID: <AANLkTikgEeUL2NVLLdoT3RX0r=v0Pzfp9-==F39DVRgN@mail.gmail.com>
In-Reply-To: <4CE5C616.7070706@teksavvy.com>

On Thu, Nov 18, 2010 at 4:34 PM, Mark Lord <kernel@teksavvy.com> wrote:
> On 10-11-18 06:52 PM, Martin K. Petersen wrote:
>>>>>>>
>>>>>>> "Mark" == Mark Lord<kernel@teksavvy.com>  writes:
>>
>> Mark>  If FITRIM is still issuing single-range-at-a-time TRIMs, then I'd
>> Mark>  call that a BUG that needs fixing.  Doing TRIM like that causes
>> Mark>  tons of unnecessary ERASE cycles, shortening the SSD lifetime.  It
>> Mark>  really needs to batch them into groups of (up to) 64 ranges at a
>> Mark>  time (64 ranges fits into a single 512-byte parameter block).
>>
>> We don't support coalescing discontiguous requests into one command. But
>> we will issue contiguous TRIM requests as big as the payload can
>> handle. That's just short of two gigs per command given a 512-byte
>> block.
>>
>> I spent quite a bit of time trying to make coalescing work in the
>> spring. It got very big and unwieldy. When we discussed it at the
>> filesystem summit the consensus was that it was too intrusive to the I/O
>> stack, elevators, etc.
>
> Surely if a userspace tool and shell-script can accomplish this,
> totally lacking real filesystem knowledge, then we should be able
> to approximate it in kernel space?
>
> This is FITRIM we're talking about, not the on-the-fly automatic TRIM.
>
> FITRIM could perhaps use a similar approach to what wiper.sh does:
> reserve a large number of free blocks, and issue coalesced TRIM(s) on them.
>
> The difference being, it could walk through the filesystem,
> trimming in sections, rather than trying to reserve/trim the entire
> freespace all in one go.
>
> Over-thinking it???

Martin,

I agree with Mark.  When you say "make coalescing work" it sounds like
major overkill.

FITRIM should be able to lock a group of non-contiguous free ranges,
send them down to the block layer as a single pre-coalesced set, and
the block layer just needs to pass it on in a synchronous way.  Then
when that group of ranges is discarded, FITRIM releases the locks.
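
As a rough illustration of the "single pre-coalesced set" idea, here is a
minimal Python sketch of packing a list of free extents into 512-byte TRIM
payloads. It assumes the ATA DATA SET MANAGEMENT range-entry layout (8 bytes
per entry: 48-bit LBA plus 16-bit sector count) and 512-byte logical sectors.
The names split_range and build_payloads are hypothetical, and this is not
how the kernel's FITRIM path is implemented.

```python
import struct

SECTOR_BYTES = 512                       # assumed logical sector size
ENTRY_BYTES = 8                          # 48-bit LBA + 16-bit sector count
ENTRIES_PER_BLOCK = 512 // ENTRY_BYTES   # 64 range entries per 512-byte block
MAX_SECTORS_PER_ENTRY = 0xFFFF           # largest count a 16-bit field holds
MAX_LBA = (1 << 48) - 1


def split_range(lba, nsectors):
    """Split one free extent into entries of at most 65535 sectors each."""
    while nsectors > 0:
        chunk = min(nsectors, MAX_SECTORS_PER_ENTRY)
        yield lba, chunk
        lba += chunk
        nsectors -= chunk


def build_payloads(ranges):
    """Pack (lba, nsectors) extents into zero-padded 512-byte payload blocks."""
    entries = [e for lba, n in ranges for e in split_range(lba, n)]
    payloads = []
    for i in range(0, len(entries), ENTRIES_PER_BLOCK):
        raw = b""
        for lba, count in entries[i:i + ENTRIES_PER_BLOCK]:
            if lba > MAX_LBA:
                raise ValueError("LBA does not fit in 48 bits")
            # Little-endian 64-bit word: count in bits 63:48, LBA in bits 47:0.
            raw += struct.pack("<Q", (count << 48) | lba)
        payloads.append(raw.ljust(512, b"\0"))  # unused entries stay zero
    return payloads


# Upper bound on what one 512-byte block can describe:
# 64 * 65535 * 512 = 2,147,450,880 bytes, 32 KiB short of 2 GiB.
MAX_BYTES_PER_BLOCK = ENTRIES_PER_BLOCK * MAX_SECTORS_PER_ENTRY * SECTOR_BYTES
```

Under these assumptions, up to 64 discontiguous free extents travel in one
command, which is the batching Mark describes, and a fully used block covers
just short of two gigabytes, which matches Martin's figure.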

Every TRIM causes a cache flush anyway on the SSD, so the synchronous
aspect of the process should not be a problem.  (Maybe SCSI with thin
provisioning would see a performance hit?)

To my small brain, the only really complex part is sending something
like that into MDraid, because that one set of ranges might explode
into thousands of ranges and then have to be coalesced back down to a
more manageable number of ranges.

I.e., with a simple RAID 0, each range will need to be broken into a
bunch of stride-sized ranges, and then the contiguous strides on each
spindle coalesced back into larger ranges.
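
A minimal Python sketch of that split-then-coalesce step, assuming a plain
round-robin RAID 0 layout over equal-sized members, with "stride" taken to
mean the per-disk chunk size in sectors. The names raid0_split and coalesce
are hypothetical; this is not MD's actual code.

```python
from collections import defaultdict


def coalesce(pieces):
    """Merge touching or overlapping (start, length) extents."""
    merged = []
    for start, length in sorted(pieces):
        if merged and merged[-1][0] + merged[-1][1] >= start:
            prev_start, prev_len = merged[-1]
            new_len = max(prev_len, start + length - prev_start)
            merged[-1] = (prev_start, new_len)
        else:
            merged.append((start, length))
    return merged


def raid0_split(ranges, chunk_sectors, ndisks):
    """Map array-level (start, length) extents to merged per-disk extents."""
    per_disk = defaultdict(list)
    for start, length in ranges:
        end = start + length
        while start < end:
            chunk_index = start // chunk_sectors
            offset = start % chunk_sectors
            # Never cross a chunk boundary in a single piece.
            n = min(chunk_sectors - offset, end - start)
            disk = chunk_index % ndisks
            disk_start = (chunk_index // ndisks) * chunk_sectors + offset
            per_disk[disk].append((disk_start, n))
            start += n
    # Consecutive chunks on the same disk are adjacent there, so they merge.
    return {disk: coalesce(p) for disk, p in per_disk.items()}
```

For example, raid0_split([(0, 32)], chunk_sectors=8, ndisks=2) breaks the
range into four 8-sector pieces and merges them back into one 16-sector
extent per disk, so a large array-level range does not have to reach each
member as thousands of small discards.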

But if MDraid can handle discards now with one range, it should not be
that hard to teach it to handle a group of ranges.

Greg


-- 
Greg Freemyer
Head of EDD Tape Extraction and Processing team
Litigation Triage Solutions Specialist
http://www.linkedin.com/in/gregfreemyer
CNN/TruTV Aired Forensic Imaging Demo -
   http://insession.blogs.cnn.com/2010/03/23/how-computer-evidence-gets-retrieved/

The Norcross Group
The Intersection of Evidence & Technology
http://www.norcrossgroup.com


