From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1757508Ab1GKNgl (ORCPT ); Mon, 11 Jul 2011 09:36:41 -0400
Received: from mail-pz0-f46.google.com ([209.85.210.46]:35758 "EHLO
        mail-pz0-f46.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S1757413Ab1GKNgj convert rfc822-to-8bit (ORCPT );
        Mon, 11 Jul 2011 09:36:39 -0400
MIME-Version: 1.0
In-Reply-To: <1310385651.18678.59.camel@twins>
References: <1310305703.13309.7.camel@twins> <4E0AF2BA.2040706@gmail.com>
        <1302756608.2854.10.camel@perseus.themaw.net> <4DA4B6A8.7030804@gmail.com>
        <4DA5DCB8.3040101@gmail.com> <4DA5F569.9020309@gmail.com>
        <24792.1302808448@redhat.com> <2477.1309342656@redhat.com>
        <4E1962BE.8010204@redhat.com> <1408.1310382069@redhat.com>
        <1310385651.18678.59.camel@twins>
From: Michal Suchanek
Date: Mon, 11 Jul 2011 15:36:19 +0200
X-Google-Sender-Auth: LkY8_qhqUiHcmI3eNJUL4f2hAU0
Message-ID:
Subject: Re: Union mount and lockdep design issues
To: Peter Zijlstra
Cc: David Howells, Ric Wheeler, Alexander Viro, Christoph Hellwig,
        Ingo Molnar, Ian Kent, linux-fsdevel@vger.kernel.org,
        linux-kernel@vger.kernel.org, Jeff Moyer, miklos@szeredi.hu
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8BIT
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On 11 July 2011 14:00, Peter Zijlstra wrote:
> On Mon, 2011-07-11 at 12:01 +0100, David Howells wrote:
>> Peter Zijlstra wrote:
>>
>> > Also, why would you want to have a class per sb-instance? From last
>> > talking to David, he said there could only ever be 2 filesystems
>> > involved in this, the top and bottom, and it is determined on (union)
>> > mount time which is which.
>>
>> There can be more than 2 - one upperfs (the actual union) and many lowerfs -
>> though I think only one lowerfs is accessed at a time.
>
> Right, however I understood from our earlier discussion that the vfs
> would only ever try to lock 2 filesystems at a time, the top and one
> lower.

This is true from a local point of view.

However, it is technically possible to use overlayfs as the upper layer
of another overlayfs, which allows layering multiple read-only
"branches" into a single overlay.

Since the vfs will lock the "union" and one (or possibly both) of its
branches, and one of the branches may itself be a union, you can get
arbitrary depth (currently limited by a constant in the code to bound
recursion depth and stack usage).

>
>> However, I was wondering that if in the future it could be possible to make it
>> possible to union over a union. I think that conceptually it shouldn't be that
>> hard, but definitely lockdep presents a barrier unless the top union goes
>> behind the scenes of the lower union and interacts with its lowerfs's directly.
>
> Aside from lockdep, how many fs locks will you nest and how will you
> enforce the filesystem relations remain a DAG? But yeah, that'll be a
> tad harder to do. One of the ways we could tackle that is create a lock
> class per depth, and statically create say 16 of those, allowing for a
> DAG with span of 16.

This would be consistent with the limit on nesting imposed by stack
size, but there should probably be some mechanism to derive one of the
numbers from the other.

>
>> > I'm also assuming that once a filesystem is part of a union mount, it
>> > cannot be accessed from outside of said union (can it? can the bottom
>> > itself be the top layer of another union?)
>>
>> Not at the moment; the hard read-only requirements on the lowerfs versus the
>> writeability requirements of the upperfs (you can't enter a directory that you
>> can't mirror up) prevent it.
>>
>> However, at some point I'd be interested in trying to make it possible to union
>> over a writeable filesystem. This is pretty much a requirement for unioning
>> over NFS (as you can't tell the server to make the volume you're mounting hard
>> read-only).

I don't think that there is a hard read-only requirement. As far as I
understand, the current status is that "The filesystem should not be
modified directly" and "doing so will lead to undefined behaviour but
no crash or lockup". Unless there are bugs, obviously.

>> > Also, in what state are the filesystems on construction of the union? Are
>> > they already fully formed and populated (do inodes already exist?)
>>
>> The lower filesystems must be fully formed and, at present, may not be modified
>> whilst in the union.
>>
>> The upper filesystem can be empty or filled by a previous union. In fact,
>> there's nothing stopping the upper fs being an ordinary fs that's then used as
>> the upper layer in a union, but I'm not sure you can then access the lower
>> echelons as the directories don't contain fallthru entries.

As overlayfs does not have explicit fallthru entries, layering any two
fully formed filesystems gives a union of the two. You will only lose
access to entries that were previously deleted in a union and have a
whiteout entry in the upper layer.

Unionmount makes any directories which were touched in an upper union
layer opaque and requires explicit fallthru entries to access the lower
layer. A normal filesystem does not have opaque directories and allows
access to the lower layer when it is used as the top layer for the
first time. Traversing the union will make it opaque, though.
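
To make the difference between the two lookup behaviours concrete, here
is a minimal Python sketch. It is only a toy model: plain dicts stand in
for directories, and the names WHITEOUT, overlay_lookup and
unionmount_lookup are hypothetical, not taken from either patch set.

```python
# Toy model only: dicts stand in for directories, nothing here is kernel code.
WHITEOUT = object()  # hypothetical marker for a name deleted in the upper layer


def overlay_lookup(upper, lower, name):
    """overlayfs-style: upper wins, a whiteout hides, otherwise fall through."""
    if name in upper:
        entry = upper[name]
        return None if entry is WHITEOUT else entry
    # No fallthru entry is needed: anything absent from upper is taken from lower.
    return lower.get(name)


def unionmount_lookup(upper, lower, name, opaque, fallthru):
    """unionmount-style: an opaque upper directory hides lower entries
    unless the name has an explicit fallthru entry."""
    if name in upper:
        entry = upper[name]
        return None if entry is WHITEOUT else entry
    if opaque and name not in fallthru:
        return None
    return lower.get(name)


upper = {"a": "upper-a", "gone": WHITEOUT}
lower = {"a": "lower-a", "b": "lower-b", "gone": "lower-gone"}

in_overlay = overlay_lookup(upper, lower, "b")
in_fresh_upper = unionmount_lookup(upper, lower, "b", opaque=False, fallthru=set())
in_opaque_dir = unionmount_lookup(upper, lower, "b", opaque=True, fallthru=set())
```

In this model "b" stays visible through the overlay and through a
not-yet-opaque upper directory, but disappears once the directory is
opaque and carries no fallthru entry for it.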
Thanks

Michal
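
A rough illustration of the "lock class per depth" idea from the thread,
with the nesting limit and the number of lock classes derived from one
constant so the two cannot drift apart. This is a Python toy model under
stated assumptions, not kernel code: MAX_NESTING, LOCK_CLASSES and Layer
are hypothetical names, and the value 16 is just the "say 16" quoted
above.

```python
MAX_NESTING = 16  # assumption: the "say 16" figure from the thread

# One lock class per depth, generated from the same constant as the limit.
LOCK_CLASSES = tuple(f"union-depth-{d}" for d in range(MAX_NESTING))


class Layer:
    """A filesystem, or a union stacked on already-built branches."""

    def __init__(self, name, branches=()):
        self.name = name
        self.branches = tuple(branches)
        # A plain filesystem has depth 0; a union is one above its deepest branch.
        self.depth = 1 + max(b.depth for b in self.branches) if self.branches else 0
        if self.depth >= MAX_NESTING:
            raise ValueError(f"{name}: nesting depth {self.depth} exceeds limit")
        self.lock_class = LOCK_CLASSES[self.depth]


ro = Layer("ro-branch")
inner = Layer("inner-union", [Layer("rw-1"), ro])
outer = Layer("outer-union", [Layer("rw-2"), inner])
```

Because a union can only be built from branches that already exist, the
relation is a DAG by construction in this model, and locks are always
taken from a higher depth to a strictly lower one. Whether the real code
can enforce the same ordering is exactly the open question in the
thread.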