From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from e38.co.us.ibm.com ([32.97.110.159]:45648 "EHLO
	e38.co.us.ibm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1750958Ab3LMWJ3 (ORCPT );
	Fri, 13 Dec 2013 17:09:29 -0500
Received: from /spool/local
	by e38.co.us.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted
	for from ;
	Fri, 13 Dec 2013 15:09:29 -0700
Received: from b03cxnp07028.gho.boulder.ibm.com (b03cxnp07028.gho.boulder.ibm.com [9.17.130.15])
	by d03dlp02.boulder.ibm.com (Postfix) with ESMTP id DFB083E40040
	for ; Fri, 13 Dec 2013 15:09:27 -0700 (MST)
Received: from d03av03.boulder.ibm.com (d03av03.boulder.ibm.com [9.17.195.169])
	by b03cxnp07028.gho.boulder.ibm.com (8.13.8/8.13.8/NCO v10.0) with ESMTP id rBDM9RsV8126950
	for ; Fri, 13 Dec 2013 23:09:27 +0100
Received: from d03av03.boulder.ibm.com (localhost [127.0.0.1])
	by d03av03.boulder.ibm.com (8.14.4/8.14.4/NCO v10.0 AVout) with ESMTP id rBDM9Rhb010988
	for ; Fri, 13 Dec 2013 15:09:27 -0700
Subject: Re: [PATCH 0/7] Patches to support subpagesize blocksize
From: Chandra Seetharaman
Reply-To: sekharan@us.ibm.com
To: Josef Bacik
Cc: linux-btrfs@vger.kernel.org
In-Reply-To: <52AB5470.3090108@fb.com>
References: <1386805122-23972-1-git-send-email-sekharan@us.ibm.com>
	 <52AB5470.3090108@fb.com>
Content-Type: text/plain; charset="UTF-8"
Date: Fri, 13 Dec 2013 16:09:26 -0600
Message-ID: <1386972566.4241.203.camel@chandra-dt.ibm.com>
Mime-Version: 1.0
Sender: linux-btrfs-owner@vger.kernel.org
List-ID:

On Fri, 2013-12-13 at 13:39 -0500, Josef Bacik wrote:
> On 12/11/2013 06:38 PM, Chandra Seetharaman wrote:
> > In btrfs, blocksize, the basic IO size of the filesystem, has been
> > more than PAGE_SIZE.
> >
> > But, some 64 bit architectures, like PPC64 and ARM64 have the default
> > PAGE_SIZE as 64K, which means the filesystems handled in these
> > architectures are with a blocksize of 64K.
> >
> > This works fine as long as you create and use the filesystems within
> > these systems.
> >
> > In other words, one cannot create a filesystem in some other architecture
> > and use that filesystem in PPC64 or ARM64, and vice versa.
> >
> > Another restriction is that we cannot use ext? filesystems in these
> > architectures as btrfs filesystems, since ext? filesystems have a blocksize
> > of 4K.
> >
> > Sometime last year, Wade Cline posted a patch (http://lwn.net/Articles/529682/).
> > I started testing it, and found many locking/race issues. So, I changed the
> > logic and created an extent_buffer_head that holds an array of extent buffers that
> > belong to a page.
> >
> > There are a few wrinkles in this patchset, like some xfstests are failing, which
> > could be due to me doing something incorrectly w.r.t. how the blocksize and
> > PAGE_SIZE are used in these patches.
> >
> > Would like to get some feedback, review comments.
> >
>
> Ok so the more we talked about it on IRC and talking with Chris I think
> we have a way forward here.
>
> 1) Add an extent_buffer_head that embeds an extent_buffer, and in the
> extent_buffer_head track the state of the whole page. So this is where
> we have a linked list of all the extent_buffers on the page, we can keep
> track of the number of extent_buffers that are dirty/not so we can be
> sure to set the page state and everything right.

Let me see if I understand you correctly:

In my patch I have,
-----------
extent_buffer {
	// buffer specific data
};

extent_buffer_head {
	// page wide data
	extent_buffer *extent_buf[];
};
--------------

You are suggesting to make it
------------
extent_buffer {
	// buffer specific data
	extent_buffer *ebuf_next;
};

extent_buffer_head {
	// page wide data
	extent_buffer ebuf_first;
	extent_buffer *ebuf_next;
};
-----------

Correct?

If yes, then IMO the code might look more convoluted, as we would have to
take care of two different situations, wouldn't we?
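
To make the comparison concrete, here is a rough user-space sketch of the two
layouts and of a "how many buffers on this page are dirty" walk over each.
Everything in it is an assumption for illustration only and is not taken from
either patch: the eb_a/eb_b names, the dirty flag, the head_a_set() and
count_dirty_*() helpers, and the fixed EBS_PER_PAGE of 16 (64K page, 4K
blocks). Layout B is also simplified so that the chain hangs off
ebuf_first.ebuf_next instead of a separate pointer in the head.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed for illustration: 64K page with 4K blocks gives 16 buffers. */
#define EBS_PER_PAGE 16

/* Layout A: the head holds an array of pointers, one slot per block. */
struct eb_a {
	bool dirty;	/* stand-in for the buffer specific data */
};

struct eb_head_a {
	/* page wide data would go here */
	struct eb_a *extent_buf[EBS_PER_PAGE];
};

/* Layout B: the head embeds the first buffer, the rest are chained. */
struct eb_b {
	bool dirty;	/* stand-in for the buffer specific data */
	struct eb_b *ebuf_next;	/* NULL at the end of the chain */
};

struct eb_head_b {
	/* page wide data would go here */
	struct eb_b ebuf_first;
};

/* Hypothetical helper: attach a buffer to slot idx of a layout A head. */
static void head_a_set(struct eb_head_a *head, size_t idx, struct eb_a *eb)
{
	assert(idx < EBS_PER_PAGE);
	head->extent_buf[idx] = eb;
}

/* Layout A: one uniform loop over the slots, empty slots are skipped. */
static unsigned int count_dirty_a(const struct eb_head_a *head)
{
	unsigned int n = 0;

	for (size_t i = 0; i < EBS_PER_PAGE; i++)
		if (head->extent_buf[i] && head->extent_buf[i]->dirty)
			n++;
	return n;
}

/* Layout B: start at the embedded buffer, then follow the chain. */
static unsigned int count_dirty_b(const struct eb_head_b *head)
{
	unsigned int n = 0;

	for (const struct eb_b *eb = &head->ebuf_first; eb; eb = eb->ebuf_next)
		if (eb->dirty)
			n++;
	return n;
}
```

In this simplified form the walk itself is uniform in both layouts; the extra
case in layout B would show up mostly in allocation and freeing, where the
embedded first buffer cannot be released separately from the head.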
>
> 2) Set page->private to the first extent_buffer like we currently do.
> Then we just have checks in the endio stuff to see if the eb we found is
> the one for our current range (ie bv_offset == 0) and if not do a
> linear search through the extent_buffers on the extent_buffer_head part
> to get the right one.
>
> We have to do this because we need to be able to track IO for each of
> the extent_buffer's independently of each other in case a page spans a
> block_group.
>
> Hopefully that makes sense, this way you don't have to futz with any of
> my crazier long term goals of no longer using pagecache or any of that
> mess. Thanks,

Yeah, that would be good :)

>
> Josef
>
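
Going back to point 2, the endio-side lookup could look roughly like the
sketch below: take the buffer that page->private would point to, use it
directly if it covers the offset, and otherwise search the chain linearly.
This is a user-space sketch under assumptions, not kernel code: struct eb,
its page_offset and next fields, and find_eb_for_offset() are hypothetical
names, and bv_offset is passed in as a plain integer rather than read from
a bio_vec.

```c
#include <stddef.h>

/* Hypothetical stand-in for an extent buffer that lives on a shared page. */
struct eb {
	unsigned int page_offset;	/* assumed: byte offset within the page */
	struct eb *next;		/* next buffer on the same page, or NULL */
};

/*
 * first plays the role of the buffer stored in page->private.
 * Returns the buffer that starts at bv_offset, or NULL if there is none.
 */
static struct eb *find_eb_for_offset(struct eb *first, unsigned int bv_offset)
{
	struct eb *eb;

	if (!first)
		return NULL;

	/* Fast path: the first buffer is the one for this range. */
	if (first->page_offset == bv_offset)
		return first;

	/* Otherwise do a linear search through the rest of the chain. */
	for (eb = first->next; eb; eb = eb->next)
		if (eb->page_offset == bv_offset)
			return eb;

	return NULL;
}
```

With a 64K page and 4K blocks the chain is at most 16 entries long, so the
linear search stays short.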