* fstrim has no effect on a just-mounted filesystem
@ 2014-03-11 21:39 Richard W.M. Jones
  2014-03-11 21:47 ` Eric Sandeen
  0 siblings, 1 reply; 13+ messages in thread
From: Richard W.M. Jones @ 2014-03-11 21:39 UTC (permalink / raw)
  To: linux-ext4


Here's a problem I can't work out:

I have a filesystem (in a VM) that I know has at least 100MB of
deleted files on it.  Doing this in a script:

  mount -o discard /dev/sda1 /mnt
  fstrim /mnt

... does nothing.  Also the fstrim is almost instantaneous -- there's
no way it could be scanning the disk.

However, if I start with the same filesystem, mounted with -o discard,
and create and rm large files, while observing the size of the
underlying virtual disk, then discard is obviously working fine.  'rm'
of large files makes the underlying disk shrink.

Any ideas here?

Rich.

kernel: 3.12.5-302.fc20.x86_64
qemu: 1.7.0
virtio-scsi with discard=unmap

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
virt-p2v converts physical machines to virtual machines.  Boot with a
live CD or over the network (PXE) and turn machines into KVM guests.
http://libguestfs.org/virt-v2v


* Re: fstrim has no effect on a just-mounted filesystem
  2014-03-11 21:39 fstrim has no effect on a just-mounted filesystem Richard W.M. Jones
@ 2014-03-11 21:47 ` Eric Sandeen
  2014-03-11 22:00   ` Richard W.M. Jones
  0 siblings, 1 reply; 13+ messages in thread
From: Eric Sandeen @ 2014-03-11 21:47 UTC (permalink / raw)
  To: Richard W.M. Jones, linux-ext4

On 3/11/14, 4:39 PM, Richard W.M. Jones wrote:
> 
> Here's a problem I can't work out:
> 
> I have a filesystem (in a VM) that I know has at least 100MB of
> deleted files on it.

Was it mounted with -o discard at the time the files were deleted?
If so, then the trim is already done during the unlink process,
and there's no more work to do.

So that's my first thought, but ...

>  Doing this in a script:
> 
>   mount -o discard /dev/sda1 /mnt
>   fstrim /mnt
> 
> ... does nothing.  Also the fstrim is almost instantaneous -- there's
> no way it could be scanning the disk.

blktrace would be a better tool to find out whether or not discards
are actually getting issued to storage...

And if you strace it what does the ioctl return?

Enabling the trace_ext4_trim_all_free tracepoint might be interesting too.

> However, if I start with the same filesystem, mounted with -o discard,
> and create and rm large files, while observing the size of the
> underlying virtual disk, then discard is obviously working fine.  'rm'
> of large files makes the underlying disk shrink.
> 
> Any ideas here?

first of all, I should point out that "-o discard" is not necessary for
fstrim / FITRIM ioctl to work.  The former tries to trim as soon
as files are unlinked; FITRIM goes looking for free blocks to trim.

If you're mounting with -o discard, then fstrim should never find any
work to do.

-Eric


> Rich.
> 
> kernel: 3.12.5-302.fc20.x86_64
> qemu: 1.7.0
> virtio-scsi with discard=unmap
> 



* Re: fstrim has no effect on a just-mounted filesystem
  2014-03-11 21:47 ` Eric Sandeen
@ 2014-03-11 22:00   ` Richard W.M. Jones
  2014-03-11 22:04     ` Richard W.M. Jones
  2014-03-11 22:08     ` Eric Sandeen
  0 siblings, 2 replies; 13+ messages in thread
From: Richard W.M. Jones @ 2014-03-11 22:00 UTC (permalink / raw)
  To: Eric Sandeen; +Cc: linux-ext4

[The context of this is trying to get virt-sparsify to work in place
on disks.]

On Tue, Mar 11, 2014 at 04:47:02PM -0500, Eric Sandeen wrote:
> On 3/11/14, 4:39 PM, Richard W.M. Jones wrote:
> > 
> > Here's a problem I can't work out:
> > 
> > I have a filesystem (in a VM) that I know has at least 100MB of
> > deleted files on it.
> 
> Was it mounted with -o discard at the time the files were deleted?

No, it was not.

I know that the original 'rm' command didn't recover any space because
the disk image grew by ~100 MB.

> If so, then the trim is already done during the unlink process,
> and there's no more work to do.
> 
> So that's my first thought, but ...
> 
> >  Doing this in a script:
> > 
> >   mount -o discard /dev/sda1 /mnt
> >   fstrim /mnt
> > 
> > ... does nothing.  Also the fstrim is almost instantaneous -- there's
> > no way it could be scanning the disk.
> 
> blktrace would be a better tool to find out whether or not discards
> are actually getting issued to storage...
> 
> And if you strace it what does the ioctl return?

I'll try that in a few minutes.

In the meantime I captured the fstrim -v output:

  fstrim -v /
  /: 124 MiB (130039808 bytes) trimmed

124 MB is (within 25%) the amount of data I would expect needs to be
trimmed.
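
For scripted checks, the byte count in the fstrim -v line can be pulled
out and compared with the amount expected.  A minimal sketch, assuming
the output format shown above; trimmed_bytes is a hypothetical helper
name, not part of util-linux:

```shell
# Hypothetical helper: extract the byte count from "fstrim -v" output on stdin.
trimmed_bytes() {
    sed -n 's/.*(\([0-9][0-9]*\) bytes) trimmed.*/\1/p'
}

# Sample line copied from the message above.
echo '/: 124 MiB (130039808 bytes) trimmed' | trimmed_bytes
# prints 130039808
```

On a live system the same helper would read from a pipe, for example
'fstrim -v /mnt | trimmed_bytes', which needs root.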

> Enabling the trace_ext4_trim_all_free tracepoint might be interesting too.

That a systemtap thing?  It's tricky to get systemtap working in a
virtual machine, but I guess I can try if nothing else works.

> > However, if I start with the same filesystem, mounted with -o discard,
> > and create and rm large files, while observing the size of the
> > underlying virtual disk, then discard is obviously working fine.  'rm'
> > of large files makes the underlying disk shrink.
> > 
> > Any ideas here?
> 
> first of all, I should point out that "-o discard" is not necessary for
> fstrim / FITRIM ioctl to work.  The former tries to trim as soon
> as files are unlinked; FITRIM goes looking for free blocks to trim.
>
> If you're mounting with -o discard, then fstrim should never find any
> work to do.

Useful to know.

I thought I had to use -o discard in order for the ext4 module to send
discard commands at all to the block layer.

Thanks,

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
virt-df lists disk usage of guests without needing to install any
software inside the virtual machine.  Supports Linux and Windows.
http://people.redhat.com/~rjones/virt-df/


* Re: fstrim has no effect on a just-mounted filesystem
  2014-03-11 22:00   ` Richard W.M. Jones
@ 2014-03-11 22:04     ` Richard W.M. Jones
  2014-03-11 22:08     ` Eric Sandeen
  1 sibling, 0 replies; 13+ messages in thread
From: Richard W.M. Jones @ 2014-03-11 22:04 UTC (permalink / raw)
  To: Eric Sandeen; +Cc: linux-ext4

> > And if you strace it what does the ioctl return?

It seems OK:

stat("/", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
open("/", O_RDONLY)                     = 0
ioctl(0, FITRIM, 0x7fffbfb2be60)        = 0

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
libguestfs lets you edit virtual machines.  Supports shell scripting,
bindings from many languages.  http://libguestfs.org


* Re: fstrim has no effect on a just-mounted filesystem
  2014-03-11 22:00   ` Richard W.M. Jones
  2014-03-11 22:04     ` Richard W.M. Jones
@ 2014-03-11 22:08     ` Eric Sandeen
  2014-03-11 22:59       ` Richard W.M. Jones
  1 sibling, 1 reply; 13+ messages in thread
From: Eric Sandeen @ 2014-03-11 22:08 UTC (permalink / raw)
  To: Richard W.M. Jones; +Cc: linux-ext4

On 3/11/14, 5:00 PM, Richard W.M. Jones wrote:
> [The context of this is trying to get virt-sparsify to work in place
> on disks.]
> 
> On Tue, Mar 11, 2014 at 04:47:02PM -0500, Eric Sandeen wrote:
>> On 3/11/14, 4:39 PM, Richard W.M. Jones wrote:
>>>
>>> Here's a problem I can't work out:
>>>
>>> I have a filesystem (in a VM) that I know has at least 100MB of
>>> deleted files on it.
>>
>> Was it mounted with -o discard at the time the files were deleted?
> 
> No, it was not.

Ok, worth asking.  :)

> I know that the original 'rm' command didn't recover any space because
> the disk image grew by ~100 MB.

>> If so, then the trim is already done during the unlink process,
>> and there's no more work to do.
>>
>> So that's my first thought, but ...
>>
>>>  Doing this in a script:
>>>
>>>   mount -o discard /dev/sda1 /mnt
>>>   fstrim /mnt
>>>
>>> ... does nothing.  Also the fstrim is almost instantaneous -- there's
>>> no way it could be scanning the disk.
>>
>> blktrace would be a better tool to find out whether or not discards
>> are actually getting issued to storage...

blktrace is probably the place to start.  Do you see discard
requests?  then ext4 is doing its job.  If not, we can trace
ext4 to see why it's not issuing them, assuming there really
is work to do.
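
Before tracing, it can be worth confirming that the guest's block device
advertises discard at all.  A minimal sketch, assuming the standard sysfs
attribute queue/discard_max_bytes (zero means discard is unsupported);
supports_discard is a hypothetical helper and the device name sda is only
an example:

```shell
# Hypothetical helper: succeed if block device $1 (e.g. "sda") advertises discard.
# $2 optionally overrides the sysfs root, which defaults to /sys.
supports_discard() {
    f="${2:-/sys}/block/$1/queue/discard_max_bytes"
    [ -r "$f" ] && [ "$(cat "$f")" -gt 0 ]
}

if supports_discard sda; then
    echo "sda accepts discard requests"
else
    echo "sda does not advertise discard (or does not exist)"
fi
```

If the device does advertise discard, a command along the lines of
'blktrace -d /dev/sda -a discard -o - | blkparse -i -', run as root while
fstrim executes, should show whether discard requests actually reach it.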

>> And if you strace it what does the ioctl return?
> 
> I'll try that in a few minutes.
> 
> In the meantime I captured the fstrim -v output:
> 
>   fstrim -v /
>   /: 124 MiB (130039808 bytes) trimmed
> 
> 124 MB is (within 25%) the amount of data I would expect needs to be
> trimmed.

Ok, so it says that it did do what you expected...

>> Enabling the trace_ext4_trim_all_free tracepoint might be interesting too.
> 
> That a systemtap thing?  It's tricky to get systemtap working in a
> virtual machine, but I guess I can try if nothing else works.

# trace-cmd record -e &
# <run test>
# fg
<ctrl-c>
# trace-cmd report > trace_report.txt

should do it.

>>> However, if I start with the same filesystem, mounted with -o discard,
>>> and create and rm large files, while observing the size of the
>>> underlying virtual disk, then discard is obviously working fine.  'rm'
>>> of large files makes the underlying disk shrink.

(backing up, that's the "-o discard" option at work)

>>> Any ideas here?
>>
>> first of all, I should point out that "-o discard" is not necessary for
>> fstrim / FITRIM ioctl to work.  The former tries to trim as soon
>> as files are unlinked; FITRIM goes looking for free blocks to trim.
>>
>> If you're mounting with -o discard, then fstrim should never find any
>> work to do.
> 
> Useful to know.
> 
> I thought I had to use -o discard in order for the ext4 module to send
> discard commands at all to the block layer.

nope.  That just makes it do it every time a block is freed, instead
of in batches via fstrim:

discard                 Controls whether ext4 should issue discard/TRIM
nodiscard(*)            commands to the underlying block device when
                        blocks are freed.

the FITRIM ioctl works fine w/o the mount option, and in fact as I said,
should have no work to do if the mount option is there - every freed block
should get discarded  (well, maybe modulo some size thresholds, I don't
remember for sure).
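
To see which of the two modes a mounted filesystem is using, the mount
options can be inspected.  A minimal sketch, assuming that ext4 lists
"discard" among the options in /proc/mounts when online discard is
enabled; has_discard is a hypothetical helper name:

```shell
# Hypothetical helper: succeed if the comma-separated option list in $1
# contains the exact option "discard" (so "nodiscard" does not match).
has_discard() {
    case ",$1," in
        *,discard,*) return 0 ;;
        *)           return 1 ;;
    esac
}

# Report the trimming mode of every mounted ext4 filesystem.
if [ -r /proc/mounts ]; then
    while read -r dev mp fstype opts _; do
        [ "$fstype" = ext4 ] || continue
        if has_discard "$opts"; then
            echo "$mp: online discard (-o discard)"
        else
            echo "$mp: no online discard; fstrim does batched trimming"
        fi
    done < /proc/mounts
fi
```

Either way, fstrim itself can be run on the mount point; the option only
decides whether blocks are also discarded at the moment they are freed.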

-Eric

> Thanks,
> 
> Rich.
> 



* Re: fstrim has no effect on a just-mounted filesystem
  2014-03-11 22:08     ` Eric Sandeen
@ 2014-03-11 22:59       ` Richard W.M. Jones
  2014-03-11 23:07         ` Richard W.M. Jones
  0 siblings, 1 reply; 13+ messages in thread
From: Richard W.M. Jones @ 2014-03-11 22:59 UTC (permalink / raw)
  To: Eric Sandeen; +Cc: linux-ext4

On Tue, Mar 11, 2014 at 05:08:19PM -0500, Eric Sandeen wrote:
> blktrace is probably the place to start.  Do you see discard
> requests?  then ext4 is doing its job.  If not, we can trace
> ext4 to see why it's not issuing them, assuming there really
> is work to do.

At the moment I can't get this to work.  The script I'm using is:

----------------------------------------------------------------------
set -e
set -x
trace-cmd record -e all -o /tmp/trace &
pid=$!
fstrim /sysroot
kill $pid; sleep 2
trace-cmd report -i /tmp/trace
----------------------------------------------------------------------

However the last trace-cmd gives an error:

trace-cmd: No such file or directory
  opening '/tmp/trace'

I'll try again tomorrow morning.

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
virt-df lists disk usage of guests without needing to install any
software inside the virtual machine.  Supports Linux and Windows.
http://people.redhat.com/~rjones/virt-df/


* Re: fstrim has no effect on a just-mounted filesystem
  2014-03-11 22:59       ` Richard W.M. Jones
@ 2014-03-11 23:07         ` Richard W.M. Jones
  2014-03-11 23:09           ` Eric Sandeen
  0 siblings, 1 reply; 13+ messages in thread
From: Richard W.M. Jones @ 2014-03-11 23:07 UTC (permalink / raw)
  To: Eric Sandeen; +Cc: linux-ext4

On Tue, Mar 11, 2014 at 10:59:32PM +0000, Richard W.M. Jones wrote:
> On Tue, Mar 11, 2014 at 05:08:19PM -0500, Eric Sandeen wrote:
> > blktrace is probably the place to start.  Do you see discard
> > requests?  then ext4 is doing its job.  If not, we can trace
> > ext4 to see why it's not issuing them, assuming there really
> > is work to do.
> 
> At the moment I can't get this to work.  The script I'm using is:
> 
> ----------------------------------------------------------------------
> set -e
> set -x
> trace-cmd record -e all -o /tmp/trace &
> pid=$!
> fstrim /sysroot
> kill $pid; sleep 2
> trace-cmd report -i /tmp/trace
> ----------------------------------------------------------------------

I got it to work by using:

----------------------------------------------------------------------
set -e
set -x
trace-cmd record -e all fstrim /sysroot
trace-cmd report
----------------------------------------------------------------------

The output is absolutely huge and I didn't capture it.

However just the act of doing the tracing *caused* the trim to happen
properly in the underlying disk.

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
virt-top is 'top' for virtual machines.  Tiny program with many
powerful monitoring features, net stats, disk stats, logging, etc.
http://people.redhat.com/~rjones/virt-top


* Re: fstrim has no effect on a just-mounted filesystem
  2014-03-11 23:07         ` Richard W.M. Jones
@ 2014-03-11 23:09           ` Eric Sandeen
  2014-03-11 23:30             ` Richard W.M. Jones
  0 siblings, 1 reply; 13+ messages in thread
From: Eric Sandeen @ 2014-03-11 23:09 UTC (permalink / raw)
  To: Richard W.M. Jones; +Cc: linux-ext4

On 3/11/14, 6:07 PM, Richard W.M. Jones wrote:

> I got it to work by using:
> 
> ----------------------------------------------------------------------
> set -e
> set -x
> trace-cmd record -e all fstrim /sysroot
> trace-cmd report
> ----------------------------------------------------------------------
> 
> The output is absolutely huge and I didn't capture it.

that's why I suggested a single tracepoint, rather than every tracepoint
in the kernel... ;)

oh wait, I didn't.  :/  argh sorry.

# trace-cmd record -e ext4_trim\* 

should do it.

> However just the act of doing the tracing *caused* the trim to happen
> properly in the underlying disk.

that sounds very strange...

-Eric
 
> Rich.
> 



* Re: fstrim has no effect on a just-mounted filesystem
  2014-03-11 23:09           ` Eric Sandeen
@ 2014-03-11 23:30             ` Richard W.M. Jones
  2014-03-12 10:17               ` Richard W.M. Jones
  0 siblings, 1 reply; 13+ messages in thread
From: Richard W.M. Jones @ 2014-03-11 23:30 UTC (permalink / raw)
  To: Eric Sandeen; +Cc: linux-ext4

On Tue, Mar 11, 2014 at 06:09:28PM -0500, Eric Sandeen wrote:
> On 3/11/14, 6:07 PM, Richard W.M. Jones wrote:
> 
> > I got it to work by using:
> > 
> > ----------------------------------------------------------------------
> > set -e
> > set -x
> > trace-cmd record -e all fstrim /sysroot
> > trace-cmd report
> > ----------------------------------------------------------------------
> > 
> > The output is absolutely huge and I didn't capture it.
> 
> that's why I suggested a single tracepoint, rather than every tracepoint
> in the kernel... ;)
> 
> oh wait, I didn't.  :/  argh sorry.
> 
> # trace-cmd record -e ext4_trim\* 
> 
> should do it.
> 
> > However just the act of doing the tracing *caused* the trim to happen
> > properly in the underlying disk.
> 
> that sounds very strange...

Thanks Eric.

FYI the libguestfs / virt-sparsify patch series that motivates this is
here:

https://www.redhat.com/archives/libguestfs/2014-March/thread.html#00091

Even with the greatly reduced set of traces (see attached), just the
act of tracing seems to have made trimming work properly.  The output
file has been trimmed properly from 926 MB to 819 MB:

819M	fedora-20.img
926M	fedora-20.img.ORIG

However I don't think making tracing the default on all fstrim
operations is going to be a good solution :-(
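
The two sizes quoted above look like du output, which reports allocated
space rather than apparent file size; that is an assumption, since the
command itself is not shown.  A minimal sketch of the difference, using a
throwaway sparse file instead of a real disk image and assuming GNU
coreutils (truncate, stat -c):

```shell
# Create a 10 MiB sparse file: large apparent size, little or no allocation.
img=$(mktemp)
truncate -s 10M "$img"

apparent=$(stat -c %s "$img")                                  # size in bytes
allocated=$(( $(stat -c %b "$img") * $(stat -c %B "$img") ))   # blocks * block unit

echo "apparent=$apparent allocated=$allocated"
rm -f "$img"
```

A successful discard shows up as a drop in the allocated figure of the
host-side image while the apparent size stays the same.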

Rich.

----------------------------------------------------------------------

+ trace-cmd record -e 'ext4_trim*' fstrim /sysroot
/sys/kernel/debug/tracing/events/ext4_trim*/filter
/sys/kernel/debug/tracing/events/*/ext4_trim*/filter
CPU0 data recorded at offset=0x2d0000
    4096 bytes in size
/sys/kernel/debug/tracing/events/ext4_trim*/filter
/sys/kernel/debug/tracing/events/*/ext4_trim*/filter
Kernel buffer statistics:
  Note: "entries" are the entries left in the kernel ring buffer and are not
        recorded in the trace data. They should all be zero.

CPU: 0
entries: 0
overrun: 0
commit overrun: 0
bytes: 3368
oldest event ts:     2.475090
now ts:     2.500652
dropped events: 0
read events: 105

+ trace-cmd report
version = 6
cpus=1
          fstrim-189   [000]     2.475090: ext4_trim_all_free:   dev 8,3 group 0, start 0, len 32767
          fstrim-189   [000]     2.475092: ext4_trim_extent:     dev 8,3 group 0, start 10061, len 22707
          fstrim-189   [000]     2.475300: ext4_trim_all_free:   dev 8,3 group 1, start 0, len 32767
          fstrim-189   [000]     2.475301: ext4_trim_extent:     dev 8,3 group 1, start 316, len 58
          fstrim-189   [000]     2.475413: ext4_trim_extent:     dev 8,3 group 1, start 384, len 35
          fstrim-189   [000]     2.475501: ext4_trim_extent:     dev 8,3 group 1, start 453, len 1
          fstrim-189   [000]     2.475569: ext4_trim_extent:     dev 8,3 group 1, start 512, len 22
          fstrim-189   [000]     2.475648: ext4_trim_extent:     dev 8,3 group 1, start 1735, len 473
          fstrim-189   [000]     2.475937: ext4_trim_extent:     dev 8,3 group 1, start 9225, len 1
          fstrim-189   [000]     2.476008: ext4_trim_extent:     dev 8,3 group 1, start 9227, len 1
          fstrim-189   [000]     2.476076: ext4_trim_extent:     dev 8,3 group 1, start 9232, len 496
          fstrim-189   [000]     2.476250: ext4_trim_all_free:   dev 8,3 group 3, start 0, len 32767
          fstrim-189   [000]     2.476251: ext4_trim_extent:     dev 8,3 group 3, start 5384, len 8
          fstrim-189   [000]     2.476324: ext4_trim_extent:     dev 8,3 group 3, start 17748, len 684
          fstrim-189   [000]     2.476396: ext4_trim_extent:     dev 8,3 group 3, start 30731, len 7
          fstrim-189   [000]     2.476471: ext4_trim_extent:     dev 8,3 group 3, start 30803, len 1
          fstrim-189   [000]     2.476595: ext4_trim_all_free:   dev 8,3 group 4, start 0, len 32767
          fstrim-189   [000]     2.476596: ext4_trim_extent:     dev 8,3 group 4, start 1607, len 57
          fstrim-189   [000]     2.476665: ext4_trim_extent:     dev 8,3 group 4, start 1798, len 250
          fstrim-189   [000]     2.476736: ext4_trim_extent:     dev 8,3 group 4, start 17810, len 14
          fstrim-189   [000]     2.476809: ext4_trim_extent:     dev 8,3 group 4, start 17862, len 14906
          fstrim-189   [000]     2.485681: ext4_trim_all_free:   dev 8,3 group 5, start 0, len 32767
          fstrim-189   [000]     2.485683: ext4_trim_extent:     dev 8,3 group 5, start 316, len 32452
          fstrim-189   [000]     2.492399: ext4_trim_all_free:   dev 8,3 group 6, start 0, len 32767
          fstrim-189   [000]     2.492400: ext4_trim_extent:     dev 8,3 group 6, start 0, len 32768
          fstrim-189   [000]     2.492546: ext4_trim_all_free:   dev 8,3 group 7, start 0, len 32767
          fstrim-189   [000]     2.492547: ext4_trim_extent:     dev 8,3 group 7, start 316, len 32452
          fstrim-189   [000]     2.492665: ext4_trim_all_free:   dev 8,3 group 8, start 0, len 32767
          fstrim-189   [000]     2.492666: ext4_trim_extent:     dev 8,3 group 8, start 0, len 32768
          fstrim-189   [000]     2.492783: ext4_trim_all_free:   dev 8,3 group 9, start 0, len 32767
          fstrim-189   [000]     2.492784: ext4_trim_extent:     dev 8,3 group 9, start 316, len 32452
          fstrim-189   [000]     2.492897: ext4_trim_all_free:   dev 8,3 group 10, start 0, len 32767
          fstrim-189   [000]     2.492898: ext4_trim_extent:     dev 8,3 group 10, start 0, len 32768
          fstrim-189   [000]     2.493018: ext4_trim_all_free:   dev 8,3 group 11, start 0, len 32767
          fstrim-189   [000]     2.493019: ext4_trim_extent:     dev 8,3 group 11, start 0, len 32768
          fstrim-189   [000]     2.493132: ext4_trim_all_free:   dev 8,3 group 12, start 0, len 32767
          fstrim-189   [000]     2.493133: ext4_trim_extent:     dev 8,3 group 12, start 0, len 32768
          fstrim-189   [000]     2.493245: ext4_trim_all_free:   dev 8,3 group 13, start 0, len 32767
          fstrim-189   [000]     2.493246: ext4_trim_extent:     dev 8,3 group 13, start 0, len 32768
          fstrim-189   [000]     2.493359: ext4_trim_all_free:   dev 8,3 group 14, start 0, len 32767
          fstrim-189   [000]     2.493360: ext4_trim_extent:     dev 8,3 group 14, start 0, len 32768
          fstrim-189   [000]     2.493473: ext4_trim_all_free:   dev 8,3 group 15, start 0, len 32767
          fstrim-189   [000]     2.493473: ext4_trim_extent:     dev 8,3 group 15, start 0, len 32768
          fstrim-189   [000]     2.493586: ext4_trim_all_free:   dev 8,3 group 16, start 0, len 32767
          fstrim-189   [000]     2.493587: ext4_trim_extent:     dev 8,3 group 16, start 9816, len 1
          fstrim-189   [000]     2.493656: ext4_trim_extent:     dev 8,3 group 16, start 9818, len 22950
          fstrim-189   [000]     2.493892: ext4_trim_all_free:   dev 8,3 group 18, start 0, len 32767
          fstrim-189   [000]     2.493892: ext4_trim_extent:     dev 8,3 group 18, start 761, len 1
          fstrim-189   [000]     2.493972: ext4_trim_extent:     dev 8,3 group 18, start 793, len 231
          fstrim-189   [000]     2.494085: ext4_trim_extent:     dev 8,3 group 18, start 6767, len 1
          fstrim-189   [000]     2.494170: ext4_trim_extent:     dev 8,3 group 18, start 7694, len 1
          fstrim-189   [000]     2.494259: ext4_trim_extent:     dev 8,3 group 18, start 7700, len 1
          fstrim-189   [000]     2.494347: ext4_trim_extent:     dev 8,3 group 18, start 12252, len 1
          fstrim-189   [000]     2.494437: ext4_trim_extent:     dev 8,3 group 18, start 24218, len 1
          fstrim-189   [000]     2.494620: ext4_trim_all_free:   dev 8,3 group 19, start 0, len 32767
          fstrim-189   [000]     2.494621: ext4_trim_extent:     dev 8,3 group 19, start 7693, len 1
          fstrim-189   [000]     2.494702: ext4_trim_extent:     dev 8,3 group 19, start 7715, len 1
          fstrim-189   [000]     2.494791: ext4_trim_extent:     dev 8,3 group 19, start 8980, len 108
          fstrim-189   [000]     2.494948: ext4_trim_extent:     dev 8,3 group 19, start 9147, len 37
          fstrim-189   [000]     2.495070: ext4_trim_extent:     dev 8,3 group 19, start 9190, len 4
          fstrim-189   [000]     2.495157: ext4_trim_extent:     dev 8,3 group 19, start 9893, len 349
          fstrim-189   [000]     2.495453: ext4_trim_extent:     dev 8,3 group 19, start 10245, len 507
          fstrim-189   [000]     2.495556: ext4_trim_extent:     dev 8,3 group 19, start 10753, len 22015
          fstrim-189   [000]     2.495739: ext4_trim_all_free:   dev 8,3 group 20, start 0, len 32767
          fstrim-189   [000]     2.495739: ext4_trim_extent:     dev 8,3 group 20, start 799, len 31969
          fstrim-189   [000]     2.495915: ext4_trim_all_free:   dev 8,3 group 21, start 0, len 32767
          fstrim-189   [000]     2.495916: ext4_trim_extent:     dev 8,3 group 21, start 0, len 32768
          fstrim-189   [000]     2.496098: ext4_trim_all_free:   dev 8,3 group 22, start 0, len 32767
          fstrim-189   [000]     2.496099: ext4_trim_extent:     dev 8,3 group 22, start 0, len 32768
          fstrim-189   [000]     2.496270: ext4_trim_all_free:   dev 8,3 group 23, start 0, len 32767
          fstrim-189   [000]     2.496271: ext4_trim_extent:     dev 8,3 group 23, start 0, len 32768
          fstrim-189   [000]     2.496449: ext4_trim_all_free:   dev 8,3 group 24, start 0, len 32767
          fstrim-189   [000]     2.496450: ext4_trim_extent:     dev 8,3 group 24, start 0, len 32768
          fstrim-189   [000]     2.496613: ext4_trim_all_free:   dev 8,3 group 25, start 0, len 32767
          fstrim-189   [000]     2.496614: ext4_trim_extent:     dev 8,3 group 25, start 316, len 32452
          fstrim-189   [000]     2.496780: ext4_trim_all_free:   dev 8,3 group 26, start 0, len 32767
          fstrim-189   [000]     2.496781: ext4_trim_extent:     dev 8,3 group 26, start 0, len 32768
          fstrim-189   [000]     2.496952: ext4_trim_all_free:   dev 8,3 group 27, start 0, len 32767
          fstrim-189   [000]     2.496953: ext4_trim_extent:     dev 8,3 group 27, start 316, len 32452
          fstrim-189   [000]     2.497138: ext4_trim_all_free:   dev 8,3 group 28, start 0, len 32767
          fstrim-189   [000]     2.497139: ext4_trim_extent:     dev 8,3 group 28, start 0, len 32768
          fstrim-189   [000]     2.497341: ext4_trim_all_free:   dev 8,3 group 29, start 0, len 32767
          fstrim-189   [000]     2.497342: ext4_trim_extent:     dev 8,3 group 29, start 0, len 32768
          fstrim-189   [000]     2.497530: ext4_trim_all_free:   dev 8,3 group 30, start 0, len 32767
          fstrim-189   [000]     2.497531: ext4_trim_extent:     dev 8,3 group 30, start 0, len 32768
          fstrim-189   [000]     2.497711: ext4_trim_all_free:   dev 8,3 group 31, start 0, len 32767
          fstrim-189   [000]     2.497712: ext4_trim_extent:     dev 8,3 group 31, start 0, len 32768
          fstrim-189   [000]     2.497873: ext4_trim_all_free:   dev 8,3 group 32, start 0, len 32767
          fstrim-189   [000]     2.497874: ext4_trim_extent:     dev 8,3 group 32, start 8, len 8
          fstrim-189   [000]     2.497960: ext4_trim_extent:     dev 8,3 group 32, start 24, len 8
          fstrim-189   [000]     2.498037: ext4_trim_extent:     dev 8,3 group 32, start 4056, len 28712
          fstrim-189   [000]     2.498235: ext4_trim_all_free:   dev 8,3 group 33, start 0, len 32767
          fstrim-189   [000]     2.498237: ext4_trim_extent:     dev 8,3 group 33, start 0, len 32768
          fstrim-189   [000]     2.498433: ext4_trim_all_free:   dev 8,3 group 34, start 0, len 32767
          fstrim-189   [000]     2.498434: ext4_trim_extent:     dev 8,3 group 34, start 0, len 32768
          fstrim-189   [000]     2.498612: ext4_trim_all_free:   dev 8,3 group 35, start 0, len 32767
          fstrim-189   [000]     2.498613: ext4_trim_extent:     dev 8,3 group 35, start 0, len 32768
          fstrim-189   [000]     2.498780: ext4_trim_all_free:   dev 8,3 group 36, start 0, len 32767
          fstrim-189   [000]     2.498781: ext4_trim_extent:     dev 8,3 group 36, start 0, len 32768
          fstrim-189   [000]     2.498972: ext4_trim_all_free:   dev 8,3 group 37, start 0, len 32767
          fstrim-189   [000]     2.498973: ext4_trim_extent:     dev 8,3 group 37, start 0, len 32768
          fstrim-189   [000]     2.499219: ext4_trim_all_free:   dev 8,3 group 38, start 0, len 32767
          fstrim-189   [000]     2.499221: ext4_trim_extent:     dev 8,3 group 38, start 0, len 32768
          fstrim-189   [000]     2.499397: ext4_trim_all_free:   dev 8,3 group 39, start 0, len 9471
          fstrim-189   [000]     2.499398: ext4_trim_extent:     dev 8,3 group 39, start 0, len 9472
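
A report like the one above can be totalled to see how much the
filesystem asked the block layer to discard.  A minimal sketch, assuming
the line layout shown above and a 4096-byte block size (consistent with
32768 blocks per group, but not stated in the thread); sum_trim_extents
is a hypothetical helper name:

```shell
# Hypothetical helper: sum the "len" of every ext4_trim_extent event on stdin.
# $1 is the filesystem block size in bytes (assumed 4096 if omitted).
sum_trim_extents() {
    awk -v bs="${1:-4096}" '
        $4 == "ext4_trim_extent:" { blocks += $NF; n++ }
        END { printf "%d extents, %d blocks, %.0f bytes\n", n, blocks, blocks * bs }
    '
}

# Three sample lines copied from the report above.
sum_trim_extents <<'EOF'
          fstrim-189   [000]     2.475090: ext4_trim_all_free:   dev 8,3 group 0, start 0, len 32767
          fstrim-189   [000]     2.475092: ext4_trim_extent:     dev 8,3 group 0, start 10061, len 22707
          fstrim-189   [000]     2.475301: ext4_trim_extent:     dev 8,3 group 1, start 316, len 58
EOF
# prints: 2 extents, 22765 blocks, 93245440 bytes
```

With a full capture the input would come from 'trace-cmd report'
instead of the here-document.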



-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming blog: http://rwmj.wordpress.com
Fedora now supports 80 OCaml packages (the OPEN alternative to F#)


* Re: fstrim has no effect on a just-mounted filesystem
  2014-03-11 23:30             ` Richard W.M. Jones
@ 2014-03-12 10:17               ` Richard W.M. Jones
  2014-03-12 13:42                 ` Richard W.M. Jones
  2014-03-12 18:10                 ` Paolo Bonzini
  0 siblings, 2 replies; 13+ messages in thread
From: Richard W.M. Jones @ 2014-03-12 10:17 UTC (permalink / raw)
  To: Eric Sandeen, pbonzini; +Cc: linux-ext4

[-- Attachment #1: Type: text/plain, Size: 1885 bytes --]

On Tue, Mar 11, 2014 at 11:30:47PM +0000, Richard W.M. Jones wrote:
> On Tue, Mar 11, 2014 at 06:09:28PM -0500, Eric Sandeen wrote:
> > On Tue, Mar 11, 2014, Richard W.M. Jones wrote:
> > > However just the act of doing the tracing *caused* the trim to happen
> > > properly in the underlying disk.
> > 
> > that sounds very strange...
> 
> Thanks Eric.
> 
> FYI the libguestfs / virt-sparsify patch series that motivates this is
> here:
> 
> https://www.redhat.com/archives/libguestfs/2014-March/thread.html#00091
> 
> Even with the greatly reduced set of traces (see attached), just the
> act of tracing seems to have made trimming work properly.  The output
> file has been trimmed properly from 926 MB to 819 MB:

I did a bit more testing on this.

It appears we are sure that the ext4 ioctl FITRIM is sending discard
requests.

However fstrim doesn't happen reliably.

  fstrim + blktrace            works reliably
  fstrim + fsync               unreliable, usually fails to trim
  fstrim + sync                unreliable, usually fails to trim
  fstrim + umount              unreliable, usually fails to trim
  fstrim + sleep 10            unreliable, usually fails to trim
  ( fstrim + sleep 10 ) x 3    unreliable, usually fails to trim
  fstrim on its own            unreliable, usually fails to trim

Somewhere, the discard requests are disappearing in the stack (or more
likely, being delayed).  blktrace/trace-cmd somehow forces them out.
But fsync/sync/umount/sleep does not.  They might be stuck in qemu too ...

Is there any further test I can try here?

Is there a way to force out discard requests?

qemu cache mode is set to writeback.

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
virt-top is 'top' for virtual machines.  Tiny program with many
powerful monitoring features, net stats, disk stats, logging, etc.
http://people.redhat.com/~rjones/virt-top

[-- Attachment #2: test-fstrim.pl --]
[-- Type: application/x-perl, Size: 3806 bytes --]


* Re: fstrim has no effect on a just-mounted filesystem
  2014-03-12 10:17               ` Richard W.M. Jones
@ 2014-03-12 13:42                 ` Richard W.M. Jones
  2014-03-12 18:10                 ` Paolo Bonzini
  1 sibling, 0 replies; 13+ messages in thread
From: Richard W.M. Jones @ 2014-03-12 13:42 UTC (permalink / raw)
  To: Eric Sandeen, pbonzini; +Cc: linux-ext4

Well, it turns out that the bug is mine.

I had libguestfs trimming the wrong filesystem :-(

Anyway I can report that fstrim works reliably, virt-sparsify now
supports an --in-place option, the world is good.
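
A guard of the following kind can catch this class of mistake before any
trimming happens.  It is a sketch only: mount_source is a hypothetical
helper, the device and mount point are made-up examples, and nothing here
reflects how libguestfs actually resolves filesystems.

```shell
# Hypothetical helper: print the device mounted at mount point $1.
# $2 optionally names a mounts table to read instead of /proc/mounts.
# If several entries share the mount point, the last (topmost) one wins.
mount_source() {
    awk -v mp="$1" '$2 == mp { src = $1 } END { if (src != "") print src }' \
        "${2:-/proc/mounts}"
}

# Refuse to trim unless the mount point is backed by the expected device.
# Both values below are illustrative assumptions.
expected=/dev/sda3
target=/sysroot
if [ -r /proc/mounts ] && [ "$(mount_source "$target")" = "$expected" ]; then
    echo "would run: fstrim -v $target"
else
    echo "skipping: $target is not backed by $expected"
fi
```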

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
virt-top is 'top' for virtual machines.  Tiny program with many
powerful monitoring features, net stats, disk stats, logging, etc.
http://people.redhat.com/~rjones/virt-top


* Re: fstrim has no effect on a just-mounted filesystem
  2014-03-12 10:17               ` Richard W.M. Jones
  2014-03-12 13:42                 ` Richard W.M. Jones
@ 2014-03-12 18:10                 ` Paolo Bonzini
  2014-03-12 18:24                   ` Richard W.M. Jones
  1 sibling, 1 reply; 13+ messages in thread
From: Paolo Bonzini @ 2014-03-12 18:10 UTC (permalink / raw)
  To: Richard W.M. Jones, Eric Sandeen; +Cc: linux-ext4

Il 12/03/2014 11:17, Richard W.M. Jones ha scritto:
> Somewhere, the discard requests are disappearing in the stack (or more
> likely, being delayed).  blktrace/trace-cmd somehow forces them out.
> But fsync/sync/umount/sleep does not.  They might be stuck in qemu too ...

No, this I can be quite sure about.  QEMU sends them as soon as they are 
received in the SCSI layer.  If they were ill-formed, QEMU would fail 
them.  If they got stuck, sooner or later you'd not be able to do I/O 
anymore (there is a queue depth limit) and the guest would start getting 
timeouts.

Also, certainly blktrace would have no effect on QEMU.

Paolo


* Re: fstrim has no effect on a just-mounted filesystem
  2014-03-12 18:10                 ` Paolo Bonzini
@ 2014-03-12 18:24                   ` Richard W.M. Jones
  0 siblings, 0 replies; 13+ messages in thread
From: Richard W.M. Jones @ 2014-03-12 18:24 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: Eric Sandeen, linux-ext4

On Wed, Mar 12, 2014 at 07:10:42PM +0100, Paolo Bonzini wrote:
> Il 12/03/2014 11:17, Richard W.M. Jones ha scritto:
> >Somewhere, the discard requests are disappearing in the stack (or more
> >likely, being delayed).  blktrace/trace-cmd somehow forces them out.
> >But fsync/sync/umount/sleep does not.  They might be stuck in qemu too ...
> 
> No, this I can be quite sure about.  QEMU sends them as soon as they
> are received in the SCSI layer.  If they were ill-formed, QEMU would
> fail them.  If they got stuck, sooner or later you'd not be able to
> do I/O anymore (there is a queue depth limit) and the guest would
> start getting timeouts.
> 
> Also, certainly blktrace would have no effect on QEMU.

Yup, it was completely a bug at my end.  Now that is fixed,
fstrim works perfectly.

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming blog: http://rwmj.wordpress.com
Fedora now supports 80 OCaml packages (the OPEN alternative to F#)
