From: Nick Piggin
Subject: Re: [patch 6/6] mm: fsync livelock avoidance
Date: Thu, 11 Dec 2008 23:59:06 +0100
Message-ID: <20081211225906.GF8294@wotan.suse.de>
References: <20081210072454.GB27096@wotan.suse.de>
 <20081210074209.GG27096@wotan.suse.de>
 <20081211135111.cada5b8b.akpm@linux-foundation.org>
 <20081211223213.GC8294@wotan.suse.de>
 <20081211144502.28ab9036.akpm@linux-foundation.org>
In-Reply-To: <20081211144502.28ab9036.akpm@linux-foundation.org>
To: Andrew Morton
Cc: linux-fsdevel@vger.kernel.org, mpatocka@redhat.com

On Thu, Dec 11, 2008 at 02:45:02PM -0800, Andrew Morton wrote:
> On Thu, 11 Dec 2008 23:32:13 +0100
> Nick Piggin wrote:
>
> > The livelock behaviour? (or the error propagation).
> >
> > I first heard about it from Mikulas, where some dm tool locks up because
> > it does direct IO on the block device of mounted filesystem (or something
> > like that).
>
> Does it actually lock up? Or does it just take a loooong time?

Just takes a looong time.

> Presumably it can be worked around in userspace.

Depends. It's very easy to get stuck behind a dirtier, but OTOH, do many
applications do such a thing? I simply don't know.

Livelock due to sys_sync would be easier (because then you have lots of
other apps all doing their own thing which you aren't in control of, but
I don't want to optimise for apps calling sys_sync, and the sync command
basically should only be for testing).

Point is, I don't know :) Probably not a bad idea to leave it out and then
think about it again if anybody cares.

> > That case is actually mostly solved by my first patch in the
> > series.
>
> mm-direct-io-starvation-improvement.patch? I guess that would help
> a lot. I can't imagine why we didn't do that years ago???
>
> Can we please determine whether that optimisation was sufficient
> for Mikulas's example?

Yes, that patch. I was able to reproduce the problem here, and that did
solve it for me, but yes confirming it with the actual dm tools would be
nice.

> > > Why fix it?
> >
> > Good question. My earlier patches already in your tree removed some starvation
> > avoidance code because they were breaking data integrity semantics. So in
> > theory, your tree today is more susceptible to this sync/fsync starvation
> > than mainline. I care most about the correctness, and it would be great if
> > nobody cares about this starvation problem so we don't need the extra
> > complexity.
>
> Yes, it does add quite a bit of complexity and more code. It'd be good
> if we could find some way of avoiding merging it.

Well, don't merge it yet then. Oh, but we still have this fsync error may
not get propagated problem... I guess that should be solved in a broken out
patch anyway.
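
For anyone who wants to poke at the "stuck behind a dirtier" case from
userspace, here is a rough sketch. It is only an illustration: it is not the
reproducer mentioned above and not the dm tools case, and the function name,
sizes and timings are all made up. One thread keeps redirtying a file while
the caller times a single fsync() on it.

```python
import os
import tempfile
import threading
import time


def time_fsync_behind_dirtier(path, run_seconds=0.5, chunk=1 << 16, span=1 << 24):
    """Time one fsync() on `path` while another thread keeps dirtying it.

    Hypothetical helper: every name and default here is an assumption
    chosen for illustration, not taken from the patch series.
    """
    stop = threading.Event()
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)

    def dirtier():
        # Keep rewriting the same region so new dirty pages keep appearing.
        buf = b"x" * chunk
        offset = 0
        while not stop.is_set():
            os.pwrite(fd, buf, offset)
            offset = (offset + chunk) % span

    worker = threading.Thread(target=dirtier)
    worker.start()
    try:
        time.sleep(run_seconds)       # let some dirty pages build up
        start = time.monotonic()
        os.fsync(fd)                  # the call that may wait behind the dirtier
        elapsed = time.monotonic() - start
    finally:
        stop.set()
        worker.join()
        os.close(fd)
    return elapsed


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        seconds = time_fsync_behind_dirtier(os.path.join(d, "dirty.dat"))
        print("fsync took %.3f s" % seconds)
```

How long the fsync takes will depend on the kernel, the filesystem and the
storage underneath, so treat any number it prints as a data point rather
than a verdict.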