* compression btrfs
From: lonat_front @ 2013-03-26  4:03 UTC (permalink / raw)
  To: linux-btrfs

Hi everyone,

  I use btrfs for a work partition mounted with compress=zlib, but I am not satisfied with the compression ratio.

   I traced my workloads in btrfs. The zlib module (zlib.c) seems to work well: the data of each write operation in the writepage function compresses to about 20% of its original size.

  I suspect that the workload affects btrfs behavior. My workloads include a very large number of overwrite operations.

   I briefly reviewed the space-reclaim code in btrfs and found that btrfs kicks off a defrag when the overwritten range is smaller than 16KB, and that this is the only method of reclaiming freed extents with compression. Am I right?

   So my question is whether btrfs can successfully reclaim the overwritten space when the cleaner thread cannot be started, for example when each overwrite operation is larger than 16KB?

* Re: compression btrfs
From: Josef Bacik @ 2013-03-26 13:14 UTC (permalink / raw)
  To: lonat_front; +Cc: linux-btrfs

On Mon, Mar 25, 2013 at 10:03:20PM -0600, lonat_front@163.com wrote:
> Hi everyone,
> 
>   I use btrfs for a work partition mounted with compress=zlib, but I am not satisfied with the compression ratio.
> 

So you probably want compress-force=zlib.  With just compress we bail out of
compression if the compressed pages are larger than the original size, which
means that if you wrote a particular file and then compressed it with gzip
you'd possibly see different results.  With compress-force=zlib you'll see
behavior more like gzip.
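
A minimal Python sketch of the bail-out described above. It is a toy model
under stated assumptions, not the kernel's code: store_chunk and its force
flag are hypothetical names, and the only rule modeled is "keep the
compressed form unless it is not smaller than the original".

```python
import random
import zlib


def store_chunk(data, force):
    """Return (encoding, stored_size) for one chunk of file data.

    force=False models plain 'compress': fall back to raw data when zlib
    does not make the chunk smaller.
    force=True models 'compress-force': always keep the zlib output.
    """
    compressed = zlib.compress(data)
    if force or len(compressed) < len(data):
        return ("zlib", len(compressed))
    return ("raw", len(data))


# Highly compressible data versus pseudo-random (incompressible) data.
text = b"a" * 4096
rng = random.Random(0)
noise = bytes(rng.getrandbits(8) for _ in range(4096))

print(store_chunk(text, force=False))
print(store_chunk(noise, force=False))
print(store_chunk(noise, force=True))
```

With the incompressible chunk, the force=False call keeps the raw bytes,
while force=True stores a zlib stream that is slightly larger than the input.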

>    I traced my workloads in btrfs. The zlib module (zlib.c) seems to work well: the data of each write operation in the writepage function compresses to about 20% of its original size.
> 
>   I suspect that the workload affects btrfs behavior. My workloads include a very large number of overwrite operations.
> 
>    I briefly reviewed the space-reclaim code in btrfs and found that btrfs kicks off a defrag when the overwritten range is smaller than 16KB, and that this is the only method of reclaiming freed extents with compression. Am I right?

It's 64k, and what do you mean by reclaiming freed extents?  The freed extents
will be reclaimed once they are completely overwritten.

>    
>    So my question is whether btrfs can successfully reclaim the overwritten space when the cleaner thread cannot be started, for example when each overwrite operation is larger than 16KB?

Not sure what you mean by reclaim.  They won't be defragged if the overwrite is
above 64k, but if any write is less than 64k then it will defrag the whole file.
Thanks,

Josef
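
The 64k rule from this reply can be written down as a small predicate. This
is only a sketch: DEFRAG_THRESHOLD and file_gets_defragged are hypothetical
names, and treating a write of exactly 64k as "no defrag" is an assumption,
since the message only covers "less than" and "above".

```python
# Threshold quoted in the reply above (64k), expressed in bytes.
DEFRAG_THRESHOLD = 64 * 1024


def file_gets_defragged(write_sizes):
    """True if any write is small enough to queue the whole file for defrag."""
    return any(size < DEFRAG_THRESHOLD for size in write_sizes)


# One 240KB overwrite: too large to trigger the automatic defrag.
print(file_gets_defragged([240 * 1024]))
# A single 16KB write among larger ones is enough to trigger it.
print(file_gets_defragged([240 * 1024, 16 * 1024]))
```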

* Re: Re: compression btrfs
From: Josef Bacik @ 2013-03-26 18:03 UTC (permalink / raw)
  To: yiletian; +Cc: Josef Bacik, linux-btrfs

On Tue, Mar 26, 2013 at 10:27:34AM -0600, yiletian wrote:
> Yes, I use compress-force=zlib for my partition.
> 
> Consider this scenario.
> 
> We first write a file of 256KB. Assume all the data compresses to 128KB.
> Btrfs creates an extent item in the extent tree to record the 128KB disk range (named E),
> and it also creates a single file extent that records the disk range of E.
> 
> Then we overwrite from 16KB to the end of the file, 240KB in total.
> Btrfs will create a new file extent for the overwritten range.
> That is, the file now has two file extents: the first records the first 16KB and the second records the remaining 240KB.
> 
> Then we are in a dilemma:
> 1. The first one only occupies a disk range of 16KB, but the entire E is reserved for it. This is because the __btrfs_drop_extents function does not decrease the number of back refs of E.
> 2. Because the overwritten range is large enough, compress_file_range does not call btrfs_add_inode_defrag to kick off a defrag for the file automatically.
> 
> With this dilemma, how can btrfs reclaim the (at least) 112KB disk range recorded in E?
> 

Oh yeah welcome to btrfs, you must be new here ;).  So yeah this is the way it
works, until we overwrite the entire extent we don't reclaim any of the space.
This includes the "prealloc an 8 gig vm image and then random write inside of
it" workload, you could end up using up to 16gb in the worst case scenario.  The
thing we could do to fix this would be, instead of splitting the file extents
and then inc'ing the ref of the original extent, to split the extent ref as
well, so we can reclaim this space.  It's on my list of things to do down the
road, but it keeps getting supplanted by other priorities.  Thanks,

Josef
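
The scenario discussed in this message can be traced with a toy model in
Python. Everything here is an illustrative assumption rather than btrfs code:
Extent, FileExtent and ToyFile are hypothetical names, and the on-disk sizes
of the second and third extents (120KB and 8KB) are made up. The one rule
taken from the thread is that an on-disk extent stays fully allocated while
any file extent still points into it.

```python
from dataclasses import dataclass

KiB = 1024


@dataclass
class Extent:
    name: str
    disk_bytes: int  # space the extent occupies on disk


@dataclass
class FileExtent:
    start: int       # logical offset in the file
    length: int      # logical length
    extent: Extent   # on-disk extent this range points into


class ToyFile:
    def __init__(self):
        self.file_extents = []

    def write(self, start, length, extent):
        """Overwrite [start, start+length) and point it at a new extent."""
        end = start + length
        kept = []
        for fe in self.file_extents:
            fe_end = fe.start + fe.length
            if fe.start < start:   # part before the write survives
                kept.append(FileExtent(fe.start, min(fe_end, start) - fe.start, fe.extent))
            if fe_end > end:       # part after the write survives
                s = max(fe.start, end)
                kept.append(FileExtent(s, fe_end - s, fe.extent))
        kept.append(FileExtent(start, length, extent))
        self.file_extents = sorted(kept, key=lambda fe: fe.start)

    def disk_usage(self):
        """An extent counts in full while any file extent references it."""
        live = {id(fe.extent): fe.extent for fe in self.file_extents}
        return sum(e.disk_bytes for e in live.values())


f = ToyFile()
f.write(0, 256 * KiB, Extent("E", 128 * KiB))
print(f.disk_usage() // KiB)   # 128: only E is allocated
f.write(16 * KiB, 240 * KiB, Extent("E2", 120 * KiB))
print(f.disk_usage() // KiB)   # 248: E is still pinned by the first 16KB
f.write(0, 16 * KiB, Extent("E3", 8 * KiB))
print(f.disk_usage() // KiB)   # 128: E lost its last reference and is freed
```

In this model the whole 128KB of E stays allocated after the 240KB overwrite,
and it is released only once the remaining 16KB range is overwritten too.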

* Re:Re: Re: compression btrfs
From: yiletian @ 2013-03-26 18:18 UTC (permalink / raw)
  To: Josef Bacik; +Cc: linux-btrfs

I think the biggest problem is how to reclaim the space when the extent is a compressed one.
In this case, we may need to read and decompress the data in the extent, and then recompress the valid range to generate a new extent.
Would this process be a performance killer?
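
The read, decompress and recompress steps described above can be sketched in
Python with the standard zlib module. This is an illustration of the idea
only, not btrfs code: rewrite_valid_range is a hypothetical helper, and it
assumes the old extent is a single zlib stream, so the stream has to be
decompressed before the still-valid range can be cut out.

```python
import zlib

KiB = 1024


def rewrite_valid_range(old_extent, start, length):
    """Build a new compressed extent holding only the still-valid range."""
    plain = zlib.decompress(old_extent)    # read + decompress the old extent
    valid = plain[start:start + length]    # keep the range still referenced
    return zlib.compress(valid)            # recompress into a new extent


# 256KB of moderately compressible sample data.
data = b"".join(b"btrfs extent data %d\n" % i for i in range(20000))[:256 * KiB]
old = zlib.compress(data)

# Only the first 16KB is still referenced after the overwrite.
new = rewrite_valid_range(old, 0, 16 * KiB)
print(len(old), len(new))
```

The cost question raised above shows up directly in the sketch: the work
depends on the size of the old extent, not just on the 16KB that is kept.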

