From mboxrd@z Thu Jan 1 00:00:00 1970
From: Eric Boxer
Subject: Re: [PATCH 10/16] fuse: Implement writepages callback
Date: Tue, 6 Aug 2013 11:26:40 -0500
Message-ID: <088AF050-3C88-4FBD-9004-33C7AFFC1517@gmail.com>
References: <20130629172211.20175.70154.stgit@maximpc.sw.ru>
 <20130629174525.20175.18987.stgit@maximpc.sw.ru>
 <20130719165037.GA18358@tucsk.piliscsaba.szeredi.hu>
 <51FBD2DF.50506@parallels.com>
Cc: James Bottomley, devel, Kirill Korotaev, Brian Foster, linux-mm,
 Kernel Mailing List, Linux-Fsdevel, fuse-devel, riel, Pavel Emelianov,
 Al Viro, Maxim Patlasov, Andrew Morton, fengguang wu, Mel Gorman
To: Miklos Szeredi
Return-path:
In-Reply-To:
Sender: owner-linux-mm@kvack.org
List-Id: linux-fsdevel.vger.kernel.org

Ok
On Tue, Aug 06, 2013 at 11:25 AM, Miklos Szeredi wrote:
On Fri, Aug 2, 2013 at 5:40 PM, Maxim Patlasov wrote:
> 07/19/2013 08:50 PM, Miklos Szeredi wrote:
>
>> On Sat, Jun 29, 2013 at 09:45:29PM +0400, Maxim Patlasov wrote:
>>>
>>> From: Pavel Emelyanov
>>>
>>> The .writepages one is required to make each writeback request carry more
>>> than one page on it. The patch enables optimized behaviour unconditionally,
>>> i.e. mmap-ed writes will benefit from the patch even if
>>> fc->writeback_cache=0.
>>
>> I rewrote this a bit, so we won't have to do the thing in two passes, which
>> makes it simpler and more robust. Waiting for page writeback here is wrong
>> anyway, see comment above fuse_page_mkwrite(). BTW we had a race there
>> because fuse_page_mkwrite() didn't take the page lock. I've also fixed that
>> up and pushed a series containing these patches up to implementing
>> ->writepages() to
>>
>> git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse.git writepages
>>
>> Passed some trivial testing but more is needed.
>
>
> Thanks a lot for efforts. The approach you implemented looks promising, but
> it introduces the following assumption: a page cannot become dirty before we
> have a chance to wait on fuse writeback holding the page locked. This is
> already true for mmap-ed writes (due to your fixes) and it seems doable for
> cached writes as well (like we do in fuse_perform_write). But the assumption
> seems to be broken in case of direct read from local fs (e.g. ext4) to a
> memory region mmap-ed to a file on fuse fs. See how dio_bio_submit() marks
> pages dirty by bio_set_pages_dirty(). I can't see any solution for this
> use-case. Do you?

Hmm. Direct IO on an mmaped file will do get_user_pages() which will
do the necessary page fault magic and ->page_mkwrite() will be called.
At least AFAICS.

The page cannot become dirty through a memory mapping without first
switching the pte from read-only to read-write. Page accounting
logic relies on this too. The other way the page can become dirty is
through write(2) on the fs. But we do get notified about that too.
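
As a rough user-space illustration of the two dirtying paths named above (a store through a shared mapping, and write(2)), here is a minimal Python sketch. It assumes Linux and an ordinary temporary file rather than a FUSE mount; the helper names are made up for this example, and the kernel-side steps (the write fault and ->page_mkwrite()) are described in comments only, since they cannot be observed from this code.

```python
import mmap
import os
import tempfile

PAGE = mmap.PAGESIZE


def dirty_via_mapping(path: str, payload: bytes) -> None:
    """Dirty the first page of `path` through a shared memory mapping."""
    fd = os.open(path, os.O_RDWR)
    try:
        with mmap.mmap(fd, PAGE, mmap.MAP_SHARED,
                       mmap.PROT_READ | mmap.PROT_WRITE) as m:
            # The first store takes a write fault. Per the explanation above,
            # the pte only becomes writable after the kernel has handled that
            # fault (calling ->page_mkwrite() where the fs provides one).
            m[:len(payload)] = payload
            m.flush()  # msync(): write the dirty page back to the file
    finally:
        os.close(fd)


def dirty_via_write(path: str, payload: bytes, offset: int) -> None:
    """Dirty the page cache through the write(2) family (pwrite here)."""
    fd = os.open(path, os.O_RDWR)
    try:
        os.pwrite(fd, payload, offset)
    finally:
        os.close(fd)


def demo() -> bytes:
    """Exercise both paths on a scratch file and return its first 32 bytes."""
    with tempfile.NamedTemporaryFile(delete=False) as f:
        f.write(b"\0" * PAGE)
        path = f.name
    try:
        dirty_via_mapping(path, b"mmap")
        dirty_via_write(path, b"write", 16)
        with open(path, "rb") as f:
            return f.read(32)
    finally:
        os.unlink(path)
```

Calling demo() shows that data stored through either path ends up in the file; it does not exercise the direct-IO case discussed earlier in the thread.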

Thanks,
Miklos
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: email@kvack.org