From mboxrd@z Thu Jan  1 00:00:00 1970
From: Daniel Phillips <daniel@phunq.net>
Subject: Re: xfs: does mkfs.xfs require fancy switches to get decent
 performance? (was Tux3 Report: How fast can we fsync?)
Date: Wed, 13 May 2015 05:41:44 -0700
Message-ID: <55534688.5010608@phunq.net>
References: <20150430002008.GY15810@dastard>
 <b1c4315e-f7d5-4081-957b-e58feff4a64b@phunq.net>
 <1430395641.3180.94.camel@gmail.com>
 <eb31c569-259c-4814-9fee-69a36fc518dc@phunq.net>
 <1430401693.3180.131.camel@gmail.com> <55423732.2070509@phunq.net>
 <55423C05.1000506@symas.com> <554246D7.40105@phunq.net>
 <20150511221223.GD4434@amd> <555140E6.1070409@phunq.net>
 <20150513072539.GC19700@amd> <555335FA.3060005@phunq.net>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Cc: Theodore Ts'o <tytso@mit.edu>, Howard Chu <hyc@symas.com>,
 Dave Chinner <david@fromorbit.com>, linux-kernel@vger.kernel.org,
 Mike Galbraith <umgwanakikbuti@gmail.com>, tux3@tux3.org,
 linux-fsdevel@vger.kernel.org, OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
To: Pavel Machek <pavel@ucw.cz>
Return-path: <tux3-bounces@phunq.net>
In-Reply-To: <555335FA.3060005@phunq.net>
List-Unsubscribe: <http://phunq.net/mailman/options/tux3>,
 <mailto:tux3-request@phunq.net?subject=unsubscribe>
List-Archive: <http://phunq.net/pipermail/tux3/>
List-Post: <mailto:tux3@phunq.net>
List-Help: <mailto:tux3-request@phunq.net?subject=help>
List-Subscribe: <http://phunq.net/mailman/listinfo/tux3>,
 <mailto:tux3-request@phunq.net?subject=subscribe>
Errors-To: tux3-bounces@phunq.net
Sender: "Tux3" <tux3-bounces@phunq.net>
List-Id: linux-fsdevel.vger.kernel.org

On 05/13/2015 04:31 AM, Daniel Phillips wrote:
Let me be the first to catch that arithmetic error....

> Let's say our delta size is 400MB (typical under load) and we leave
> a "nice big gap" of 112 MB after flushing each one. Let's say we do
> one thousand of those before deciding that we have enough information
> available to switch to some smarter strategy. We used one GB of
> a 4TB disk, say. The media transfer rate decreased by a factor of:
> 
>     (1 - 2/1000) = .2%.

Ahem, no, we used 1/8th of the disk (about 512 GB of 4 TB). The
time/data rate increased from unity to 1.125, for an average of
1.0625 across the region. If we only use 1/10th of the disk
instead, by not leaving gaps, then the average time/data across
the region is 1.05. The difference is 1.0625 - 1.05 = 0.0125, so
the gap strategy increases media transfer time by 1.25%, which is
not significant compared to the performance deficit in question
of 400%. So, same argument: the change in media transfer rate is
just a distraction from the original question.
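
To make that arithmetic checkable, here is a minimal Python sketch.
It assumes, as the figures above imply, that time per unit of data
rises linearly with the fraction of the disk used (time/data =
1 + fraction). The function name is made up for this example and
none of it is Tux3 code.

```python
def avg_time_per_data(fraction_used):
    """Average relative time/data over the first `fraction_used` of the disk.

    Assumption: time/data at position f is 1 + f, so the average
    over [0, fraction_used] is 1 + fraction_used / 2.
    """
    return 1.0 + fraction_used / 2.0


with_gaps = avg_time_per_data(1 / 8)      # region used when gaps are left
without_gaps = avg_time_per_data(1 / 10)  # region used with no gaps
penalty = with_gaps - without_gaps        # extra media transfer time

print(f"with gaps:    {with_gaps:.4f}")
print(f"without gaps: {without_gaps:.4f}")
print(f"penalty:      {penalty:.2%}")
```

The fractions 1/8 and 1/10 are the rounded values used in the text,
not exact disk fractions.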

In any case, we probably want to start using a smarter strategy
sooner than 1000 commits, maybe after ten or a hundred commits,
which would make the change in media transfer rate even less
relevant.
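
As a rough illustration of how the gap penalty shrinks with fewer
commits, the sketch below reuses the 400 MB delta, 112 MB gap and
4 TB disk from the quoted example. It assumes decimal units and the
same linear model (time/data = 1 + fraction of disk used);
gap_penalty is a hypothetical helper, not anything in Tux3.

```python
DELTA_MB = 400         # delta size from the quoted example
GAP_MB = 112           # gap left after each delta
DISK_MB = 4_000_000    # 4 TB in decimal megabytes (assumption)


def gap_penalty(commits):
    """Extra average time/data caused by leaving gaps.

    Assumption: time/data at disk fraction f is 1 + f, so the
    average over the used region is 1 + fraction_used / 2.
    """
    used_with_gaps = commits * (DELTA_MB + GAP_MB) / DISK_MB
    used_without_gaps = commits * DELTA_MB / DISK_MB
    return (1 + used_with_gaps / 2) - (1 + used_without_gaps / 2)


for n in (10, 100, 1000):
    print(f"{n:5d} commits: {gap_penalty(n):.3%}")
```

With unrounded fractions the 1000 commit case comes out near 1.4%
rather than 1.25%, and the penalty scales down linearly with the
number of commits.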

The thing is, when data first starts landing on media, we do not
have much information about what the long term load will be. So
just analyze the clues we have in the early commits and put those
early deltas onto disk in the most efficient format, which for
Tux3 seems to be linear per delta. There would be exceptions, but
that is the common case.

Then get smarter later. The intent is to get the best of both:
early efficiency, and long term nice aging behavior. I do not
accept the proposition that one must be sacrificed for the
other; I find that reasoning faulty.

> The performance deficit in question and the difference in media rate are
> three orders of magnitude apart, does that justify the term "similar or
> identical?".

Regards,

Daniel