From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1756063Ab3HFQZh (ORCPT ); Tue, 6 Aug 2013 12:25:37 -0400
Received: from mail-qe0-f52.google.com ([209.85.128.52]:53427
        "EHLO mail-qe0-f52.google.com" rhost-flags-OK-OK-OK-OK)
        by vger.kernel.org with ESMTP id S1755730Ab3HFQZY
        convert rfc822-to-8bit (ORCPT ); Tue, 6 Aug 2013 12:25:24 -0400
MIME-Version: 1.0
X-Originating-IP: [86.59.245.170]
In-Reply-To: <51FBD2DF.50506@parallels.com>
References: <20130629172211.20175.70154.stgit@maximpc.sw.ru>
        <20130629174525.20175.18987.stgit@maximpc.sw.ru>
        <20130719165037.GA18358@tucsk.piliscsaba.szeredi.hu>
        <51FBD2DF.50506@parallels.com>
Date: Tue, 6 Aug 2013 18:25:22 +0200
Message-ID:
Subject: Re: [PATCH 10/16] fuse: Implement writepages callback
From: Miklos Szeredi
To: Maxim Patlasov
Cc: riel@redhat.com, Kirill Korotaev, Pavel Emelianov, fuse-devel,
        Brian Foster, Kernel Mailing List, James Bottomley,
        linux-mm@kvack.org, Al Viro, Linux-Fsdevel, Andrew Morton,
        fengguang.wu@intel.com, devel@openvz.org, Mel Gorman
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8BIT
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, Aug 2, 2013 at 5:40 PM, Maxim Patlasov wrote:
> 07/19/2013 08:50 PM, Miklos Szeredi wrote:
>
>> On Sat, Jun 29, 2013 at 09:45:29PM +0400, Maxim Patlasov wrote:
>>>
>>> From: Pavel Emelyanov
>>>
>>> The .writepages one is required to make each writeback request carry
>>> more than one page on it. The patch enables optimized behaviour
>>> unconditionally, i.e. mmap-ed writes will benefit from the patch even
>>> if fc->writeback_cache=0.
>>
>> I rewrote this a bit, so we won't have to do the thing in two passes,
>> which makes it simpler and more robust. Waiting for page writeback here
>> is wrong anyway, see comment above fuse_page_mkwrite(). BTW we had a
>> race there because fuse_page_mkwrite() didn't take the page lock. I've
>> also fixed that up and pushed a series containing these patches up to
>> implementing ->writepages() to
>>
>> git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse.git writepages
>>
>> Passed some trivial testing but more is needed.
>
>
> Thanks a lot for efforts. The approach you implemented looks promising,
> but it introduces the following assumption: a page cannot become dirty
> before we have a chance to wait on fuse writeback holding the page
> locked. This is already true for mmap-ed writes (due to your fixes) and
> it seems doable for cached writes as well (like we do in
> fuse_perform_write). But the assumption seems to be broken in case of
> direct read from local fs (e.g. ext4) to a memory region mmap-ed to a
> file on fuse fs. See how dio_bio_submit() marks pages dirty by
> bio_set_pages_dirty(). I can't see any solution for this use-case.
> Do you?

Hmm. Direct IO on an mmaped file will do get_user_pages() which will
do the necessary page fault magic and ->page_mkwrite() will be called.
At least AFAICS.

The page cannot become dirty through a memory mapping without first
switching the pte from read-only to read-write. Page accounting logic
relies on this too. The other way the page can become dirty is through
write(2) on the fs. But we do get notified about that too.
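
To make the invariant above concrete, here is a toy userspace model in C. It is only a sketch under stated assumptions: struct toy_page and the toy_* functions are hypothetical names invented for illustration, not kernel APIs, and "waiting for writeback" is reduced to clearing a flag. It models one claim only: a store through a mapping whose pte is read-only has to go through a ->page_mkwrite()-style notification before the page can become dirty.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of one page-cache page mapped into user space.
 * All names here are hypothetical; none of this is kernel code. */
struct toy_page {
    bool pte_writable;    /* the mapping currently permits stores */
    bool dirty;           /* page holds data not yet written back */
    bool under_writeback; /* a writeback request is in flight */
    int  mkwrite_calls;   /* how many times the fs was notified */
};

/* Stand-in for the filesystem's ->page_mkwrite() hook. The fs is
 * notified before the page may be dirtied. A real filesystem would
 * hold the page lock and sleep until writeback ends; this model
 * simply marks the writeback as finished. */
void toy_page_mkwrite(struct toy_page *p)
{
    p->mkwrite_calls++;
    p->under_writeback = false;
}

/* A store through the memory mapping. With a read-only pte the store
 * faults first, the hook runs, and only then does the pte become
 * writable and the page dirty. */
void toy_store(struct toy_page *p)
{
    if (!p->pte_writable) {
        toy_page_mkwrite(p);
        p->pte_writable = true;
    }
    p->dirty = true;
}

/* Start of writeback in the model: the dirty bit is cleared and the
 * pte is write-protected again, so the next store must fault. */
void toy_start_writeback(struct toy_page *p)
{
    p->dirty = false;
    p->pte_writable = false;
    p->under_writeback = true;
}
```

In this model a second store to an already writable page does not call the hook again, while a store issued after toy_start_writeback() does, which is the point at which a filesystem gets its chance to wait on the in-flight request.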

Thanks,
Miklos