From: liubo <liubo2009@cn.fujitsu.com>
To: Chris Mason <chris.mason@oracle.com>
Cc: Li Zefan <lizf@cn.fujitsu.com>,
	Linux Btrfs <linux-btrfs@vger.kernel.org>,
	Josef Bacik <josef@redhat.com>
Subject: Re: [RFC PATCH] Btrfs: do not flush csum items of unchanged file data during treelog
Date: Mon, 25 Apr 2011 17:58:05 +0800
Message-ID: <4DB545AD.5050908@cn.fujitsu.com>
In-Reply-To: <1303435579-sup-6101@think>

On 04/22/2011 09:28 AM, Chris Mason wrote:
> Excerpts from Li Zefan's message of 2011-04-21 20:55:40 -0400:
>> Chris Mason wrote:
>>> Excerpts from liubo's message of 2011-04-21 03:58:21 -0400:
>>>> The current code relogs the entire inode every time during fsync log,
>>>> and it is much better suited to small files rather than large ones.
>>>>
>>>> During my performance test, the fsync performance of large files sucks,
>>>> and we can ascribe this to the tremendous amount of csum infos of the
>>>> large ones, because we have to flush all of these csum infos into log trees
>>>> even when there is only _one_ change in the whole file data.  Apparently,
>>>> to optimize fsync, we need to create a filter to skip the unnecessary csum
>>>> ones, that is, those whose corresponding file data has not changed since the last fsync.
>>>>
>>>> Here I have some test results to show, I use sysbench to do "random write + fsync".
>>>>
>>>> Sysbench args:
>>>>   - Number of threads: 1
>>>>   - Extra file open flags: 0
>>>>   - 2 files, 4Gb each
>>>>   - Block size 4Kb
>>>>   - Number of random requests for random IO: 10000
>>>>   - Read/Write ratio for combined random IO test: 1.50
>>>>   - Periodic FSYNC enabled, calling fsync() each 100 requests.
>>>>   - Calling fsync() at the end of test, Enabled.
>>>>   - Using synchronous I/O mode
>>>>   - Doing random write test
>>>>
>>>> Sysbench results:
>>>> ===
>>>>    Operations performed:  0 Read, 10000 Write, 200 Other = 10200 Total
>>>>    Read 0b  Written 39.062Mb  Total transferred 39.062Mb
>>>> ===
>>>> a) without patch:  (*SPEED* : 451.01Kb/sec)
>>>>    112.75 Requests/sec executed
>>>>
>>>> b) with patch:     (*SPEED* : 5.1537Mb/sec)
>>>>    1319.34 Requests/sec executed
>>> Really nice results! Especially considering the small size of the patch.
>>>
>>> But, I'd really like to look at using sub transaction ids for this, and
>>> then logging just the part of the inode that had changed since the last
>>> log commit.  It's more complex, but will also help reduce tree searches
>>> for the file items.
>>>
>> And this patch forgot to mention that it has a compatibility issue.
> 
> Right, at the very least we want to just use one bit of that field
> instead of all 8.  But keeping a sub-transid and putting that in the
> generation field of the file extent instead can get us the same benefits
> without stealing the bits.
> 

Nice.  This is the first step of my plan.

> As we push the sub transid into the btree blocks as well, we'll get much
> faster tree walks too.  The penalty is in complexity in the logging
> code, since it will have to deal with finding extents in the log tree
> and merging in the new extents from the file.

I've been thinking of this extent buffer with sub transid stuff for a while,
and will give it a try. :)
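
To make the idea concrete, here is a rough sketch in C of the kind of filter
being discussed. It is not the actual patch or the btrfs on-disk format: the
struct and function names (extent_sketch, extent_needs_log,
count_extents_to_log) are made up for illustration, and it assumes each file
extent records the sub transid at which it was last modified.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a file extent item. */
struct extent_sketch {
	uint64_t file_offset;	/* byte offset of the extent in the file */
	uint64_t num_bytes;	/* length of the extent */
	uint64_t sub_transid;	/* assumed: sub transid of last modification */
};

/*
 * An extent (and its csum items) needs to go into the log tree only if
 * it was modified after the last log commit.
 */
static int extent_needs_log(const struct extent_sketch *e,
			    uint64_t last_logged_transid)
{
	return e->sub_transid > last_logged_transid;
}

/* Count how many extents would have their csums flushed on this fsync. */
static size_t count_extents_to_log(const struct extent_sketch *ext, size_t n,
				   uint64_t last_logged_transid)
{
	size_t i, count = 0;

	assert(ext != NULL || n == 0);
	for (i = 0; i < n; i++)
		if (extent_needs_log(&ext[i], last_logged_transid))
			count++;
	return count;
}
```

With three 4Kb extents stamped with sub transids 10, 12 and 10 and a last
log commit at 11, only the middle extent would be relogged; the other two,
and their csums, are skipped.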

thanks,
liubo.

> 
> -chris
> 

