From: Steven Pratt
Subject: Re: Updated performance results
Date: Thu, 23 Jul 2009 17:04:49 -0500
To: Chris Mason, Steven Pratt, linux-btrfs

Chris Mason wrote:
> On Thu, Jul 23, 2009 at 01:35:21PM -0500, Steven Pratt wrote:
>
>> I have re-run the raid tests, re-creating the fileset between each of
>> the random write workloads, and performance now matches the previous
>> newformat results. The bad news is that the huge gain I had attributed
>> to the newformat release does not really exist. All of the previous
>> results (except for the newformat run) were not re-creating the
>> fileset, so the gain in performance was due only to having a fresh set
>> of files, not to any code changes.
>>
>
> Thanks for doing all of these runs. This is still a little different
> from what I have here: my initial runs are very, very fast, and after
> 10 or so they level out to relatively low random write performance.
> With nodatacow, it stays even.
>

Right, I do not see this problem with nodatacow.

>> So, I have done 2 new sets of runs to look into this further. One is a
>> 3 hour run of single-threaded random writes to the RAID system, which
>> I have compared to ext3. Performance results are here:
>> http://btrfs.boxacle.net/repository/raid/longwrite/longwrite/Longrandomwrite.html
>>
>> and graphs of all the iostat data can be found here:
>>
>> http://btrfs.boxacle.net/repository/raid/longwrite/summary.html
>>
>> The iostat graphs for btrfs are interesting for a number of reasons.
>> First, it takes about 3000 seconds (or 50 minutes) for btrfs to reach
>> steady state. Second, if you compare write throughput from the device
>> view vs. the btrfs/application view, an application throughput of
>> 21.5MB/sec requires 63MB/sec of actual disk writes. That is an
>> overhead of roughly 3 to 1, vs. an overhead of ~0 for ext3. Also,
>> looking at the change in iops vs. MB/sec, we see that while btrfs
>> starts out with reasonably sized IOs, it quickly deteriorates to an
>> average IO size of only 13KB. Remember, the starting file set is only
>> 100GB on a 2.1TB filesystem, all data is overwritten in place, and
>> this is single threaded, so there is no reason this should fragment.
>> It seems like the allocator is having a problem doing sequential
>> allocations.
>>
>
> There are two things happening. First, the default allocation scheme
> isn't very well suited to this; mount -o ssd will perform better. But
> over the long term, random overwrites to the file cause a lot of
> writes to the extent allocation tree. That's really what -o nodatacow
> is saving us from. There are optimizations we can do, but we're
> holding off on them in favor of enospc and other pressing things.
>

Well, I have -o ssd data that I can upload, but it was worse than
without it. I do understand about timing and priorities.

> But, with all of that said, Josef has some really important allocator
> improvements. I've put them out along with our pending patches into
> the experimental branch of the btrfs-unstable tree. Could you please
> give this branch a try both with and without the ssd mount option?
>

Sure, will try to get to it tomorrow.

Steve

> -chris
>
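
A minimal sketch of the arithmetic behind the figures above (the roughly
3 to 1 write overhead and the 13KB average IO size). Only the 21.5MB/sec
and 63MB/sec throughputs and the 13KB average come from the runs; the
IOPS value is just what those numbers imply, not a measured result:

# Back-of-the-envelope arithmetic for the btrfs long random-write run.
app_write_mb_s = 21.5          # throughput seen by the benchmark (application view)
dev_write_mb_s = 63.0          # actual disk writes reported by iostat (device view)

# Write overhead: device bytes written per application byte written.
overhead = dev_write_mb_s / app_write_mb_s
print(f"write overhead: {overhead:.1f} : 1")   # ~2.9 : 1, i.e. roughly 3 to 1

# Average IO size is device write bandwidth divided by write IOPS
# (e.g. wkB/s divided by w/s from an iostat -x sample), so a 13KB
# average at 63MB/sec implies roughly this many writes per second:
avg_io_kb = 13.0
implied_write_iops = dev_write_mb_s * 1024 / avg_io_kb
print(f"implied write IOPS at a {avg_io_kb:.0f}KB average IO: {implied_write_iops:.0f}")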
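
For reference, a rough sketch of a single-threaded random-overwrite
workload of the sort described above. This is not the actual benchmark
configuration; the path, IO size, and operation count are placeholders.
The point is only that every write lands inside an existing file, so the
fileset never grows and every IO is an overwrite:

import os
import random

# Illustrative single-threaded random-overwrite load: rewrite random
# aligned blocks of an existing file in place. Path, IO size, and the
# number of operations are placeholders, not the real benchmark settings.
PATH = "/mnt/btrfs/testfile"   # hypothetical pre-created file on the test filesystem
IO_SIZE = 4096                 # bytes per write
NUM_OPS = 100_000

fd = os.open(PATH, os.O_WRONLY)
try:
    blocks = os.fstat(fd).st_size // IO_SIZE
    buf = os.urandom(IO_SIZE)
    for _ in range(NUM_OPS):
        # Every write lands inside the existing file, so nothing is appended;
        # on a COW filesystem each overwrite still allocates new space.
        os.pwrite(fd, buf, random.randrange(blocks) * IO_SIZE)
    os.fsync(fd)
finally:
    os.close(fd)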