All of lore.kernel.org
 help / color / mirror / Atom feed
* [Ocfs2-devel] Large (> 16TiB) volumes revisited
@ 2010-06-23  0:11 Patrick J. LoPresti
  2010-06-23  0:49 ` Joel Becker
  0 siblings, 1 reply; 20+ messages in thread
From: Patrick J. LoPresti @ 2010-06-23  0:11 UTC (permalink / raw)
  To: ocfs2-devel

I have just submitted the following bug report:

http://oss.oracle.com/bugzilla/show_bug.cgi?id=1266

This is formally reporting the issue originally identified (and fixed)
by Robert Smith back in December:

http://www.mail-archive.com/ocfs2-devel at oss.oracle.com/msg04728.html

Specifically, even the latest OCFS2 produces an error when you attempt
to mount a volume larger than 16 TiB:

"ocfs2_initialize_super:2157 ERROR: Volume might try to write to
blocks beyond what jbd can address in 32 bits."

I would like to use large volumes in production later this year or
early next, so I am interested in seeing this issue resolved so I can
begin testing.  I believe this check in fs/ocfs2/super.c is the only
known issue standing in the way of large volume support for OCFS2.  I
want to submit a patch to fix it.

The simplest approach is just to delete the check, like so:

diff --git a/fs/ocfs2/super.c b/fs/ocfs2/super.c
index 0eaa929..0ba41f3 100644
--- a/fs/ocfs2/super.c
+++ b/fs/ocfs2/super.c
@@ -2215,14 +2215,6 @@ static int ocfs2_initialize_super(struct super_block *sb,
                goto bail;
        }

-       if (ocfs2_clusters_to_blocks(osb->sb, le32_to_cpu(di->i_clusters) - 1)
-           > (u32)~0UL) {
-               mlog(ML_ERROR, "Volume might try to write to blocks beyond "
-                    "what jbd can address in 32 bits.\n");
-               status = -EINVAL;
-               goto bail;
-       }
-
        if (ocfs2_setup_osb_uuid(osb, di->id2.i_super.s_uuid,
                                 sizeof(di->id2.i_super.s_uuid))) {
                mlog(ML_ERROR, "Out of memory trying to setup our uuid.\n");


Questions for the list:

1) Is this patch sufficient?  Or should I try to modify the check to
take into account the cluster size?  Anything else I need to check
here (e.g. inode64 mount option)?

2) Should mkfs.ocfs2 contain a similar check?  (It may already; I have
not looked yet...)

 - Pat

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [Ocfs2-devel] Large (> 16TiB) volumes revisited
  2010-06-23  0:11 [Ocfs2-devel] Large (> 16TiB) volumes revisited Patrick J. LoPresti
@ 2010-06-23  0:49 ` Joel Becker
  2010-06-23  1:12   ` Patrick J. LoPresti
  0 siblings, 1 reply; 20+ messages in thread
From: Joel Becker @ 2010-06-23  0:49 UTC (permalink / raw)
  To: ocfs2-devel

On Tue, Jun 22, 2010 at 05:11:50PM -0700, Patrick J. LoPresti wrote:
> The simplest approach is just to delete the check, like so:
> 
> diff --git a/fs/ocfs2/super.c b/fs/ocfs2/super.c
> index 0eaa929..0ba41f3 100644
> --- a/fs/ocfs2/super.c
> +++ b/fs/ocfs2/super.c
> @@ -2215,14 +2215,6 @@ static int ocfs2_initialize_super(struct super_block *sb,
>                 goto bail;
>         }
> 
> -       if (ocfs2_clusters_to_blocks(osb->sb, le32_to_cpu(di->i_clusters) - 1)
> -           > (u32)~0UL) {
> -               mlog(ML_ERROR, "Volume might try to write to blocks beyond "
> -                    "what jbd can address in 32 bits.\n");
> -               status = -EINVAL;
> -               goto bail;
> -       }
> -
>         if (ocfs2_setup_osb_uuid(osb, di->id2.i_super.s_uuid,
>                                  sizeof(di->id2.i_super.s_uuid))) {
>                 mlog(ML_ERROR, "Out of memory trying to setup our uuid.\n");
> 
> 
> Questions for the list:
> 
> 1) Is this patch sufficient?  Or should I try to modify the check to
> take into account the cluster size?  Anything else I need to check
> here (e.g. inode64 mount option)?

	This patch is not sufficient.  The journal options need to be
checked, as does the block addressing space of the kernel.

Joel

-- 

"Gone to plant a weeping willow
 On the bank's green edge it will roll, roll, roll.
 Sing a lulaby beside the waters.
 Lovers come and go, the river roll, roll, rolls."

Joel Becker
Principal Software Developer
Oracle
E-mail: joel.becker at oracle.com
Phone: (650) 506-8127

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [Ocfs2-devel] Large (> 16TiB) volumes revisited
  2010-06-23  0:49 ` Joel Becker
@ 2010-06-23  1:12   ` Patrick J. LoPresti
  2010-06-23  1:36     ` Joel Becker
  0 siblings, 1 reply; 20+ messages in thread
From: Patrick J. LoPresti @ 2010-06-23  1:12 UTC (permalink / raw)
  To: ocfs2-devel

On Tue, Jun 22, 2010 at 5:49 PM, Joel Becker <Joel.Becker@oracle.com> wrote:
> On Tue, Jun 22, 2010 at 05:11:50PM -0700, Patrick J. LoPresti wrote:
>
> ? ? ? ?This patch is not sufficient. ?The journal options need to be
> checked, as does the block addressing space of the kernel.

OK.  But I am new to this code base and would appreciate some pointers...

Specifically:

Which journal options?  Are they visible at this point in super.c?

By "block addressing space of the kernel", are we just talking
sizeof(some_kernel_type) or something more involved?

Thanks!

 - Pat

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [Ocfs2-devel] Large (> 16TiB) volumes revisited
  2010-06-23  1:12   ` Patrick J. LoPresti
@ 2010-06-23  1:36     ` Joel Becker
  2010-06-24  0:04       ` Patrick J. LoPresti
  0 siblings, 1 reply; 20+ messages in thread
From: Joel Becker @ 2010-06-23  1:36 UTC (permalink / raw)
  To: ocfs2-devel

On Tue, Jun 22, 2010 at 06:12:01PM -0700, Patrick J. LoPresti wrote:
> On Tue, Jun 22, 2010 at 5:49 PM, Joel Becker <Joel.Becker@oracle.com> wrote:
> > On Tue, Jun 22, 2010 at 05:11:50PM -0700, Patrick J. LoPresti wrote:
> >
> > ? ? ? ?This patch is not sufficient. ?The journal options need to be
> > checked, as does the block addressing space of the kernel.
> 
> OK.  But I am new to this code base and would appreciate some pointers...
> 
> Specifically:
> 
> Which journal options?  Are they visible at this point in super.c?

	JBD2_FEATURE_INCOMPAT_64BIT.  You'll need to call
jbd2_journal_check_used_features() to see if the journal has it enabled.
Before you can do that, you have to ensure that the filesystem has the
OCFS2_FEATURE_COMPAT_JBD2_SB feature enabled.

> By "block addressing space of the kernel", are we just talking
> sizeof(some_kernel_type) or something more involved?

	sector_t.  You can copy what ext4 did here.

2709         if ((ext4_blocks_count(es) >
2710              (sector_t)(~0ULL) >> (sb->s_blocksize_bits - 9)) ||
2711             (ext4_blocks_count(es) >
2712              (pgoff_t)(~0ULL) >> (PAGE_CACHE_SHIFT - sb->s_blocksize_bits))     ) {
2713                 ext4_msg(sb, KERN_ERR, "filesystem"
2714                          " too large to mount safely on this system");

	blocks_count is, of course, the calculation we already perform
to turn di->i_clusters into how many blocks we have.
	We must always fail if we can't address our disk in sector_t.
Once we've gotten past that check, we then need to figure out if our
journal can address our entire disk.  So if our current test for ">
(u32)~0UL" returns true, one must then check for the 64BIT feature.

Joel

-- 

"You don't make the poor richer by making the rich poorer."
	- Sir Winston Churchill

Joel Becker
Principal Software Developer
Oracle
E-mail: joel.becker at oracle.com
Phone: (650) 506-8127

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [Ocfs2-devel] Large (> 16TiB) volumes revisited
  2010-06-23  1:36     ` Joel Becker
@ 2010-06-24  0:04       ` Patrick J. LoPresti
  2010-06-24  0:14         ` Joel Becker
  0 siblings, 1 reply; 20+ messages in thread
From: Patrick J. LoPresti @ 2010-06-24  0:04 UTC (permalink / raw)
  To: ocfs2-devel

On Tue, Jun 22, 2010 at 6:36 PM, Joel Becker <Joel.Becker@oracle.com> wrote:
>
> ? ? ? ?JBD2_FEATURE_INCOMPAT_64BIT. ?You'll need to call
> jbd2_journal_check_used_features() to see if the journal has it enabled.
> Before you can do that, you have to ensure that the filesystem has the
> OCFS2_FEATURE_COMPAT_JBD2_SB feature enabled.
>
> ? ? ? ?sector_t. ?You can copy what ext4 did here.
>
> 2709 ? ? ? ? if ((ext4_blocks_count(es) >
> 2710 ? ? ? ? ? ? ?(sector_t)(~0ULL) >> (sb->s_blocksize_bits - 9)) ||
> 2711 ? ? ? ? ? ? (ext4_blocks_count(es) >
> 2712 ? ? ? ? ? ? ?(pgoff_t)(~0ULL) >> (PAGE_CACHE_SHIFT - sb->s_blocksize_bits)) ? ? ) {
> 2713 ? ? ? ? ? ? ? ? ext4_msg(sb, KERN_ERR, "filesystem"
> 2714 ? ? ? ? ? ? ? ? ? ? ? ? ?" too large to mount safely on this system");
>
> ? ? ? ?blocks_count is, of course, the calculation we already perform
> to turn di->i_clusters into how many blocks we have.
> ? ? ? ?We must always fail if we can't address our disk in sector_t.
> Once we've gotten past that check, we then need to figure out if our
> journal can address our entire disk. ?So if our current test for ">
> (u32)~0UL" returns true, one must then check for the 64BIT feature.

Thank you for the pointers, Joel.

Below is my first proposal for this patch.  It compiles and passes
checkpatch.pl (except for the missing Signed-off-by line).  *** I have
not tested this yet. ***  I just want to get some early feedback.

Note that I believe the original code had a bug: A "-1" was being
applied to the cluster count, when it should instead be applied to the
block count.  (Please correct me if I am wrong.)

Trying to do all this inline got messy, so I factored out the tests
into a new static function.

I am hoping to find time to test within the next few days.

Thanks!

 - Pat


diff --git a/fs/ocfs2/super.c b/fs/ocfs2/super.c
index 0eaa929..e9d9ca3 100644
--- a/fs/ocfs2/super.c
+++ b/fs/ocfs2/super.c
@@ -1991,6 +1991,43 @@ static int ocfs2_setup_osb_uuid(struct
ocfs2_super *osb, const unsigned char *uu
 	return 0;
 }

+/* Check to make sure entire volume is addressable on this system. */
+static int check_addressable(struct ocfs2_super *osb, struct ocfs2_dinode *di)
+{
+	int status = 0;
+	const u32 clusters = le32_to_cpu(di->i_clusters);
+	const u64 max_block = ocfs2_clusters_to_blocks(osb->sb, clusters) - 1;
+
+	/* Absolute addressability check (borrowed from ext4/super.c) */
+	if ((max_block >
+	     (sector_t)(~0LL) >> (osb->sb->s_blocksize_bits - 9)) ||
+	    max_block > (pgoff_t)(~0LL) >> (PAGE_CACHE_SHIFT -
+					    osb->sb->s_blocksize_bits)) {
+		mlog(ML_ERROR, "Volume too large "
+		     "to mount safely on this system");
+		status = -EFBIG;
+		goto out;
+	}
+
+	if (max_block <= (u32)~0UL)
+		goto out;
+
+	/* Volume is "huge", so see if our journal is new enough to
+	   have support for it. */
+	if (!(OCFS2_HAS_COMPAT_FEATURE(osb->sb,
+				       OCFS2_FEATURE_COMPAT_JBD2_SB) &&
+	      jbd2_journal_check_used_features(osb->journal->j_journal, 0, 0,
+					       JBD2_FEATURE_INCOMPAT_64BIT))) {
+		mlog(ML_ERROR, "Volume might try to write to blocks beyond "
+		     "what jbd can address in 32 bits.\n");
+		status = -EFBIG;
+		goto out;
+	}
+
+ out:
+	return status;
+}
+
 static int ocfs2_initialize_super(struct super_block *sb,
 				  struct buffer_head *bh,
 				  int sector_size,
@@ -2215,13 +2252,9 @@ static int ocfs2_initialize_super(struct super_block *sb,
 		goto bail;
 	}

-	if (ocfs2_clusters_to_blocks(osb->sb, le32_to_cpu(di->i_clusters) - 1)
-	    > (u32)~0UL) {
-		mlog(ML_ERROR, "Volume might try to write to blocks beyond "
-		     "what jbd can address in 32 bits.\n");
-		status = -EINVAL;
+	status = check_addressable(osb, di);
+	if (status)
 		goto bail;
-	}

 	if (ocfs2_setup_osb_uuid(osb, di->id2.i_super.s_uuid,
 				 sizeof(di->id2.i_super.s_uuid))) {

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [Ocfs2-devel] Large (> 16TiB) volumes revisited
  2010-06-24  0:04       ` Patrick J. LoPresti
@ 2010-06-24  0:14         ` Joel Becker
  2010-06-24  0:45           ` Patrick J. LoPresti
  0 siblings, 1 reply; 20+ messages in thread
From: Joel Becker @ 2010-06-24  0:14 UTC (permalink / raw)
  To: ocfs2-devel

On Wed, Jun 23, 2010 at 05:04:49PM -0700, Patrick J. LoPresti wrote:
> Note that I believe the original code had a bug: A "-1" was being
> applied to the cluster count, when it should instead be applied to the
> block count.  (Please correct me if I am wrong.)

	Good catch.
	I
> Trying to do all this inline got messy, so I factored out the tests
> into a new static function.

	Works for me.

> diff --git a/fs/ocfs2/super.c b/fs/ocfs2/super.c
> index 0eaa929..e9d9ca3 100644
> --- a/fs/ocfs2/super.c
> +++ b/fs/ocfs2/super.c
> @@ -1991,6 +1991,43 @@ static int ocfs2_setup_osb_uuid(struct
> ocfs2_super *osb, const unsigned char *uu
>  	return 0;
>  }
> 
> +/* Check to make sure entire volume is addressable on this system. */
> +static int check_addressable(struct ocfs2_super *osb, struct ocfs2_dinode *di)

	Prefix the function properly: ocfs2_check_addressable().

> +{
> +	int status = 0;
> +	const u32 clusters = le32_to_cpu(di->i_clusters);
> +	const u64 max_block = ocfs2_clusters_to_blocks(osb->sb, clusters) - 1;
> +
> +	/* Absolute addressability check (borrowed from ext4/super.c) */
> +	if ((max_block >
> +	     (sector_t)(~0LL) >> (osb->sb->s_blocksize_bits - 9)) ||
> +	    max_block > (pgoff_t)(~0LL) >> (PAGE_CACHE_SHIFT -
> +					    osb->sb->s_blocksize_bits)) {

	Please wrap the second side of the || in parenthesis.
Precedence is nice, but readability is better.

> +		mlog(ML_ERROR, "Volume too large "
> +		     "to mount safely on this system");
> +		status = -EFBIG;
> +		goto out;
> +	}
> +
> +	if (max_block <= (u32)~0UL)
> +		goto out;
> +
> +	/* Volume is "huge", so see if our journal is new enough to
> +	   have support for it. */
> +	if (!(OCFS2_HAS_COMPAT_FEATURE(osb->sb,
> +				       OCFS2_FEATURE_COMPAT_JBD2_SB) &&
> +	      jbd2_journal_check_used_features(osb->journal->j_journal, 0, 0,
> +					       JBD2_FEATURE_INCOMPAT_64BIT))) {
> +		mlog(ML_ERROR, "Volume might try to write to blocks beyond "
> +		     "what jbd can address in 32 bits.\n");

	"The journal cannot address the entire volume.  Enable the
'block64' journal option with tunefs.ocfs2."

Joel

-- 

Bram's Law:
	The easier a piece of software is to write, the worse it's
	implemented in practice.

Joel Becker
Principal Software Developer
Oracle
E-mail: joel.becker at oracle.com
Phone: (650) 506-8127

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [Ocfs2-devel] Large (> 16TiB) volumes revisited
  2010-06-24  0:14         ` Joel Becker
@ 2010-06-24  0:45           ` Patrick J. LoPresti
  2010-06-24  0:55             ` Joel Becker
  0 siblings, 1 reply; 20+ messages in thread
From: Patrick J. LoPresti @ 2010-06-24  0:45 UTC (permalink / raw)
  To: ocfs2-devel

Joel --

I have applied the changes you suggested.  Revised patch is below.

Oh, one trivial thing I forgot to mention:  I changed the errno for
the failure cases from EINVAL to EFBIG since that is what ext4 uses.

My next step will be to test this...  But what is next after that?  Do
I send this as a "[PATCH] ..." to ocfs2-devel and linux-kernel, or do
I just send to ocfs2-devel and one of you passes it on?

 - Pat


diff --git a/fs/ocfs2/super.c b/fs/ocfs2/super.c
index 0eaa929..084d28d 100644
--- a/fs/ocfs2/super.c
+++ b/fs/ocfs2/super.c
@@ -1991,6 +1991,45 @@ static int ocfs2_setup_osb_uuid(struct
ocfs2_super *osb, const unsigned char *uu
 	return 0;
 }

+/* Check to make sure entire volume is addressable on this system. */
+static int ocfs2_check_addressable(struct ocfs2_super *osb,
+				   struct ocfs2_dinode *di)
+{
+	int status = 0;
+	const u32 clusters = le32_to_cpu(di->i_clusters);
+	const u64 max_block = ocfs2_clusters_to_blocks(osb->sb, clusters) - 1;
+
+	/* Absolute addressability check (borrowed from ext4/super.c) */
+	if ((max_block >
+	     (sector_t)(~0LL) >> (osb->sb->s_blocksize_bits - 9)) ||
+	    (max_block > (pgoff_t)(~0LL) >> (PAGE_CACHE_SHIFT -
+					     osb->sb->s_blocksize_bits))) {
+		mlog(ML_ERROR, "Volume too large "
+		     "to mount safely on this system");
+		status = -EFBIG;
+		goto out;
+	}
+
+	/* 32-bit block number is always OK. */
+	if (max_block <= (u32)~0UL)
+		goto out;
+
+	/* Volume is "huge", so see if our journal is new enough to
+	   support it. */
+	if (!(OCFS2_HAS_COMPAT_FEATURE(osb->sb,
+				       OCFS2_FEATURE_COMPAT_JBD2_SB) &&
+	      jbd2_journal_check_used_features(osb->journal->j_journal, 0, 0,
+					       JBD2_FEATURE_INCOMPAT_64BIT))) {
+		mlog(ML_ERROR, "The journal cannot address the entire volume. "
+		     "Enable the 'block64' journal option with tunefs.ocfs2");
+		status = -EFBIG;
+		goto out;
+	}
+
+ out:
+	return status;
+}
+
 static int ocfs2_initialize_super(struct super_block *sb,
 				  struct buffer_head *bh,
 				  int sector_size,
@@ -2215,13 +2254,9 @@ static int ocfs2_initialize_super(struct super_block *sb,
 		goto bail;
 	}

-	if (ocfs2_clusters_to_blocks(osb->sb, le32_to_cpu(di->i_clusters) - 1)
-	    > (u32)~0UL) {
-		mlog(ML_ERROR, "Volume might try to write to blocks beyond "
-		     "what jbd can address in 32 bits.\n");
-		status = -EINVAL;
+	status = ocfs2_check_addressable(osb, di);
+	if (status)
 		goto bail;
-	}

 	if (ocfs2_setup_osb_uuid(osb, di->id2.i_super.s_uuid,
 				 sizeof(di->id2.i_super.s_uuid))) {

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [Ocfs2-devel] Large (> 16TiB) volumes revisited
  2010-06-24  0:45           ` Patrick J. LoPresti
@ 2010-06-24  0:55             ` Joel Becker
  2010-06-24 16:53               ` Patrick J. LoPresti
  0 siblings, 1 reply; 20+ messages in thread
From: Joel Becker @ 2010-06-24  0:55 UTC (permalink / raw)
  To: ocfs2-devel

On Wed, Jun 23, 2010 at 05:45:13PM -0700, Patrick J. LoPresti wrote:
> Oh, one trivial thing I forgot to mention:  I changed the errno for
> the failure cases from EINVAL to EFBIG since that is what ext4 uses.

	That's fine.

> My next step will be to test this...  But what is next after that?  Do
> I send this as a "[PATCH] ..." to ocfs2-devel and linux-kernel, or do
> I just send to ocfs2-devel and one of you passes it on?

	Yes, you send it as a [PATCH] to ocfs2-devel and linux-kernel.
It will still go through me, but we want the process to be followed.

> +/* Check to make sure entire volume is addressable on this system. */
> +static int ocfs2_check_addressable(struct ocfs2_super *osb,
> +				   struct ocfs2_dinode *di)
> +{
> +	int status = 0;
> +	const u32 clusters = le32_to_cpu(di->i_clusters);
> +	const u64 max_block = ocfs2_clusters_to_blocks(osb->sb, clusters) - 1;

	These don't need to be const.  Sure, they don't change, but
you're not signifying anything special with them (like passing them to a
subfunction or something).  And you don't really need the clusters
temporary.

	u64 max_block =
		ocfs2_clusters_to_blocks(osb->sb,
					 le32_to_cpu(di->i_clusters)) - 1;

Joel

-- 

 The herd instinct among economists makes sheep look like
 independent thinkers.

Joel Becker
Principal Software Developer
Oracle
E-mail: joel.becker at oracle.com
Phone: (650) 506-8127

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [Ocfs2-devel] Large (> 16TiB) volumes revisited
  2010-06-24  0:55             ` Joel Becker
@ 2010-06-24 16:53               ` Patrick J. LoPresti
  2010-06-24 23:53                 ` Patrick J. LoPresti
  0 siblings, 1 reply; 20+ messages in thread
From: Patrick J. LoPresti @ 2010-06-24 16:53 UTC (permalink / raw)
  To: ocfs2-devel

On Wed, Jun 23, 2010 at 5:55 PM, Joel Becker <Joel.Becker@oracle.com> wrote:
> ? ? ? ?These don't need to be const. ?Sure, they don't change, but
> you're not signifying anything special with them (like passing them to a
> subfunction or something).

Well, I am signifying that I do not intend to change them...  "const"
is just a compile-time assertion, and like all compile-time
assertions, it has no effect on the generated code.  So the only
issues are readability and maintainability.  In my experience, using
"const" consistently helps with both.  But I know it is not really the
Linux Way.

> And you don't really need the clusters temporary.

Here again, the generated code is identical, so the question is human
consumption.  I find that factoring out sub-expressions and giving
them meaningful names makes code easier to read and to understand.
(Not to mention taking two lines instead of three.)  For example, I
believe my version is less likely to wind up with the "-1" in the
wrong place.  :-)

But your house = your rules.  Updated patch follows.  I will try to
find time to test today or tomorrow.

Thanks!

 - Pat


diff --git a/fs/ocfs2/super.c b/fs/ocfs2/super.c
index 0eaa929..d32c800 100644
--- a/fs/ocfs2/super.c
+++ b/fs/ocfs2/super.c
@@ -1991,6 +1991,46 @@ static int ocfs2_setup_osb_uuid(struct
ocfs2_super *osb, const unsigned char *uu
 	return 0;
 }

+/* Check to make sure entire volume is addressable on this system. */
+static int ocfs2_check_addressable(struct ocfs2_super *osb,
+				   struct ocfs2_dinode *di)
+{
+	int status = 0;
+	u64 max_block =
+		ocfs2_clusters_to_blocks(osb->sb,
+					 le32_to_cpu(di->i_clusters)) - 1;
+
+	/* Absolute addressability check (borrowed from ext4/super.c) */
+	if ((max_block >
+	     (sector_t)(~0LL) >> (osb->sb->s_blocksize_bits - 9)) ||
+	    (max_block > (pgoff_t)(~0LL) >> (PAGE_CACHE_SHIFT -
+					     osb->sb->s_blocksize_bits))) {
+		mlog(ML_ERROR, "Volume too large "
+		     "to mount safely on this system");
+		status = -EFBIG;
+		goto out;
+	}
+
+	/* 32-bit block number is always OK. */
+	if (max_block <= (u32)~0UL)
+		goto out;
+
+	/* Volume is "huge", so see if our journal is new enough to
+	   support it. */
+	if (!(OCFS2_HAS_COMPAT_FEATURE(osb->sb,
+				       OCFS2_FEATURE_COMPAT_JBD2_SB) &&
+	      jbd2_journal_check_used_features(osb->journal->j_journal, 0, 0,
+					       JBD2_FEATURE_INCOMPAT_64BIT))) {
+		mlog(ML_ERROR, "The journal cannot address the entire volume. "
+		     "Enable the 'block64' journal option with tunefs.ocfs2");
+		status = -EFBIG;
+		goto out;
+	}
+
+ out:
+	return status;
+}
+
 static int ocfs2_initialize_super(struct super_block *sb,
 				  struct buffer_head *bh,
 				  int sector_size,
@@ -2215,13 +2255,9 @@ static int ocfs2_initialize_super(struct super_block *sb,
 		goto bail;
 	}

-	if (ocfs2_clusters_to_blocks(osb->sb, le32_to_cpu(di->i_clusters) - 1)
-	    > (u32)~0UL) {
-		mlog(ML_ERROR, "Volume might try to write to blocks beyond "
-		     "what jbd can address in 32 bits.\n");
-		status = -EINVAL;
+	status = ocfs2_check_addressable(osb, di);
+	if (status)
 		goto bail;
-	}

 	if (ocfs2_setup_osb_uuid(osb, di->id2.i_super.s_uuid,
 				 sizeof(di->id2.i_super.s_uuid))) {

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [Ocfs2-devel] Large (> 16TiB) volumes revisited
  2010-06-24 16:53               ` Patrick J. LoPresti
@ 2010-06-24 23:53                 ` Patrick J. LoPresti
  2010-06-26 16:49                   ` Patrick J. LoPresti
  0 siblings, 1 reply; 20+ messages in thread
From: Patrick J. LoPresti @ 2010-06-24 23:53 UTC (permalink / raw)
  To: ocfs2-devel

Well, the good news is I am testing my patch.

The bad news is that it crashes in jbd2_journal_check_used_features().

I think the problem is that ocfs2_initialize_super() is called
relatively early in ocfs2_fill_super().  Only later does it call
ocfs2_mount_volume(), which calls ocfs2_check_volume(), which finally
calls ocfs2_journal_init()...  Which sets up the data structure that I
need to check.

I can move ocfs2_check_addressable() to ocfs2_check_volume() after it
calls ocfs2_journal_init(); arguably, this even makes sense ("check
volume").  But then I am not sure how to get hold of the ocfs2_dinode
pointer that is used to compute the cluster and block count.

Suggestions?

 - Pat


On Thu, Jun 24, 2010 at 9:53 AM, Patrick J. LoPresti <lopresti@gmail.com> wrote:
> On Wed, Jun 23, 2010 at 5:55 PM, Joel Becker <Joel.Becker@oracle.com> wrote:
>> ? ? ? ?These don't need to be const. ?Sure, they don't change, but
>> you're not signifying anything special with them (like passing them to a
>> subfunction or something).
>
> Well, I am signifying that I do not intend to change them... ?"const"
> is just a compile-time assertion, and like all compile-time
> assertions, it has no effect on the generated code. ?So the only
> issues are readability and maintainability. ?In my experience, using
> "const" consistently helps with both. ?But I know it is not really the
> Linux Way.
>
>> And you don't really need the clusters temporary.
>
> Here again, the generated code is identical, so the question is human
> consumption. ?I find that factoring out sub-expressions and giving
> them meaningful names makes code easier to read and to understand.
> (Not to mention taking two lines instead of three.) ?For example, I
> believe my version is less likely to wind up with the "-1" in the
> wrong place. ?:-)
>
> But your house = your rules. ?Updated patch follows. ?I will try to
> find time to test today or tomorrow.
>
> Thanks!
>
> ?- Pat
>
>
> diff --git a/fs/ocfs2/super.c b/fs/ocfs2/super.c
> index 0eaa929..d32c800 100644
> --- a/fs/ocfs2/super.c
> +++ b/fs/ocfs2/super.c
> @@ -1991,6 +1991,46 @@ static int ocfs2_setup_osb_uuid(struct
> ocfs2_super *osb, const unsigned char *uu
> ? ? ? ?return 0;
> ?}
>
> +/* Check to make sure entire volume is addressable on this system. */
> +static int ocfs2_check_addressable(struct ocfs2_super *osb,
> + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?struct ocfs2_dinode *di)
> +{
> + ? ? ? int status = 0;
> + ? ? ? u64 max_block =
> + ? ? ? ? ? ? ? ocfs2_clusters_to_blocks(osb->sb,
> + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?le32_to_cpu(di->i_clusters)) - 1;
> +
> + ? ? ? /* Absolute addressability check (borrowed from ext4/super.c) */
> + ? ? ? if ((max_block >
> + ? ? ? ? ? ?(sector_t)(~0LL) >> (osb->sb->s_blocksize_bits - 9)) ||
> + ? ? ? ? ? (max_block > (pgoff_t)(~0LL) >> (PAGE_CACHE_SHIFT -
> + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?osb->sb->s_blocksize_bits))) {
> + ? ? ? ? ? ? ? mlog(ML_ERROR, "Volume too large "
> + ? ? ? ? ? ? ? ? ? ?"to mount safely on this system");
> + ? ? ? ? ? ? ? status = -EFBIG;
> + ? ? ? ? ? ? ? goto out;
> + ? ? ? }
> +
> + ? ? ? /* 32-bit block number is always OK. */
> + ? ? ? if (max_block <= (u32)~0UL)
> + ? ? ? ? ? ? ? goto out;
> +
> + ? ? ? /* Volume is "huge", so see if our journal is new enough to
> + ? ? ? ? ?support it. */
> + ? ? ? if (!(OCFS2_HAS_COMPAT_FEATURE(osb->sb,
> + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?OCFS2_FEATURE_COMPAT_JBD2_SB) &&
> + ? ? ? ? ? ? jbd2_journal_check_used_features(osb->journal->j_journal, 0, 0,
> + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?JBD2_FEATURE_INCOMPAT_64BIT))) {
> + ? ? ? ? ? ? ? mlog(ML_ERROR, "The journal cannot address the entire volume. "
> + ? ? ? ? ? ? ? ? ? ?"Enable the 'block64' journal option with tunefs.ocfs2");
> + ? ? ? ? ? ? ? status = -EFBIG;
> + ? ? ? ? ? ? ? goto out;
> + ? ? ? }
> +
> + out:
> + ? ? ? return status;
> +}
> +
> ?static int ocfs2_initialize_super(struct super_block *sb,
> ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?struct buffer_head *bh,
> ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?int sector_size,
> @@ -2215,13 +2255,9 @@ static int ocfs2_initialize_super(struct super_block *sb,
> ? ? ? ? ? ? ? ?goto bail;
> ? ? ? ?}
>
> - ? ? ? if (ocfs2_clusters_to_blocks(osb->sb, le32_to_cpu(di->i_clusters) - 1)
> - ? ? ? ? ? > (u32)~0UL) {
> - ? ? ? ? ? ? ? mlog(ML_ERROR, "Volume might try to write to blocks beyond "
> - ? ? ? ? ? ? ? ? ? ?"what jbd can address in 32 bits.\n");
> - ? ? ? ? ? ? ? status = -EINVAL;
> + ? ? ? status = ocfs2_check_addressable(osb, di);
> + ? ? ? if (status)
> ? ? ? ? ? ? ? ?goto bail;
> - ? ? ? }
>
> ? ? ? ?if (ocfs2_setup_osb_uuid(osb, di->id2.i_super.s_uuid,
> ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? sizeof(di->id2.i_super.s_uuid))) {
>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [Ocfs2-devel] Large (> 16TiB) volumes revisited
  2010-06-24 23:53                 ` Patrick J. LoPresti
@ 2010-06-26 16:49                   ` Patrick J. LoPresti
  2010-06-27 19:02                     ` Patrick J. LoPresti
  0 siblings, 1 reply; 20+ messages in thread
From: Patrick J. LoPresti @ 2010-06-26 16:49 UTC (permalink / raw)
  To: ocfs2-devel

Moving the addressability check to ocfs2_check_volume() -- after the
journal structure has been initialized -- seems to be working.  I
still have more testing to do, but at least it no longer crashes.

Current draft of patch follows.

 - Pat


diff --git a/fs/ocfs2/super.c b/fs/ocfs2/super.c
index 0eaa929..d92123b 100644
--- a/fs/ocfs2/super.c
+++ b/fs/ocfs2/super.c
@@ -1991,6 +1991,43 @@ static int ocfs2_setup_osb_uuid(struct
ocfs2_super *osb, const unsigned char *uu
 	return 0;
 }

+/* Check to make sure entire volume is addressable on this system. */
+static int ocfs2_check_addressable(struct ocfs2_super *osb, u32 clusters)
+{
+	int status = 0;
+	u64 max_block = ocfs2_clusters_to_blocks(osb->sb, clusters) - 1;
+
+	/* Absolute addressability check (borrowed from ext4/super.c) */
+	if ((max_block >
+	     (sector_t)(~0LL) >> (osb->sb->s_blocksize_bits - 9)) ||
+	    (max_block > (pgoff_t)(~0LL) >> (PAGE_CACHE_SHIFT -
+					     osb->sb->s_blocksize_bits))) {
+		mlog(ML_ERROR, "Volume too large "
+		     "to mount safely on this system");
+		status = -EFBIG;
+		goto out;
+	}
+
+	/* 32-bit block number is always OK. */
+	if (max_block <= (u32)~0UL)
+		goto out;
+
+	/* Volume is "huge", so see if our journal is new enough to
+	   support it. */
+	if (!(OCFS2_HAS_COMPAT_FEATURE(osb->sb,
+				       OCFS2_FEATURE_COMPAT_JBD2_SB) &&
+	      jbd2_journal_check_used_features(osb->journal->j_journal, 0, 0,
+					       JBD2_FEATURE_INCOMPAT_64BIT))) {
+		mlog(ML_ERROR, "The journal cannot address the entire volume. "
+		     "Enable the 'block64' journal option with tunefs.ocfs2");
+		status = -EFBIG;
+		goto out;
+	}
+
+ out:
+	return status;
+}
+
 static int ocfs2_initialize_super(struct super_block *sb,
 				  struct buffer_head *bh,
 				  int sector_size,
@@ -2215,14 +2252,6 @@ static int ocfs2_initialize_super(struct super_block *sb,
 		goto bail;
 	}

-	if (ocfs2_clusters_to_blocks(osb->sb, le32_to_cpu(di->i_clusters) - 1)
-	    > (u32)~0UL) {
-		mlog(ML_ERROR, "Volume might try to write to blocks beyond "
-		     "what jbd can address in 32 bits.\n");
-		status = -EINVAL;
-		goto bail;
-	}
-
 	if (ocfs2_setup_osb_uuid(osb, di->id2.i_super.s_uuid,
 				 sizeof(di->id2.i_super.s_uuid))) {
 		mlog(ML_ERROR, "Out of memory trying to setup our uuid.\n");
@@ -2381,6 +2410,12 @@ static int ocfs2_check_volume(struct ocfs2_super *osb)
 		goto finally;
 	}

+	/* Now that journal has been initialized, check to make sure
+	   entire volume is addressable. */
+	status = ocfs2_check_addressable(osb, osb->osb_clusters_at_boot);
+	if (status)
+		goto finally;
+
 	/* If the journal was unmounted cleanly then we don't want to
 	 * recover anything. Otherwise, journal_load will do that
 	 * dirty work for us :) */

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [Ocfs2-devel] Large (> 16TiB) volumes revisited
  2010-06-26 16:49                   ` Patrick J. LoPresti
@ 2010-06-27 19:02                     ` Patrick J. LoPresti
  2010-06-28 22:32                       ` Patrick J. LoPresti
  0 siblings, 1 reply; 20+ messages in thread
From: Patrick J. LoPresti @ 2010-06-27 19:02 UTC (permalink / raw)
  To: ocfs2-devel

OK, this whole approach has a problem.

I do not think the journal flag I need to check
(JBD2_FEATURE_INCOMPAT_64BIT) is even read from disk until
jbd2_journal_load() is called.

Right now, jbd2_journal_load() is called by ocfs2_journal_load(),
which not only loads the journal, but also replays it.

I can move my check until after ocfs2_journal_load() is called, but
that seems a little late...  What if the journal is dirty and the disk
is mounted on a 32-bit system?

I am inclined to move the call to jbd2_journal_load() earlier,
specifically to ocfs2_journal_init().  Once the journal structure is
initialized from the journal's inode on disk, I can check the flags to
make sure the entire volume is addressable by the journal.

Thoughts?

 - Pat

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [Ocfs2-devel] Large (> 16TiB) volumes revisited
  2010-06-27 19:02                     ` Patrick J. LoPresti
@ 2010-06-28 22:32                       ` Patrick J. LoPresti
  2010-06-29  0:47                         ` Sunil Mushran
  0 siblings, 1 reply; 20+ messages in thread
From: Patrick J. LoPresti @ 2010-06-28 22:32 UTC (permalink / raw)
  To: ocfs2-devel

Current version of my proposed patch is appended.

The good news is that I have tested it, and it "works for me".

The bad news is that it performs the check fairly late (in
ocfs2_check_volume(), just after the call to ocfs2_journal_load()),
because this is the earliest I have the journal flags available to
test.

We can:

(a) Run with this patch pretty much as-is; or
(b) Move the jbd2_journal_load() call out of ocfs2_journal_load() so
that the flags will be in memory sooner; or
(c) Do something else (ideas welcome)

Let me know what you think.  Thanks!

 - Pat


diff --git a/fs/ocfs2/super.c b/fs/ocfs2/super.c
index 0eaa929..3db233d 100644
--- a/fs/ocfs2/super.c
+++ b/fs/ocfs2/super.c
@@ -1991,6 +1991,47 @@ static int ocfs2_setup_osb_uuid(struct
ocfs2_super *osb, const unsigned char *uu
 	return 0;
 }

+/* Check to make sure entire volume is addressable on this system.
+   Requires osb_clusters_at_boot to be valid and for the journal to
+   have been read by jbd2_journal_load(). */
+static int ocfs2_check_addressable(struct ocfs2_super *osb)
+{
+	int status = 0;
+	u64 max_block =
+		ocfs2_clusters_to_blocks(osb->sb,
+					 osb->osb_clusters_at_boot) - 1;
+
+	/* Absolute addressability check (borrowed from ext4/super.c) */
+	if ((max_block >
+	     (sector_t)(~0LL) >> (osb->sb->s_blocksize_bits - 9)) ||
+	    (max_block > (pgoff_t)(~0LL) >> (PAGE_CACHE_SHIFT -
+					     osb->sb->s_blocksize_bits))) {
+		mlog(ML_ERROR, "Volume too large "
+		     "to mount safely on this system");
+		status = -EFBIG;
+		goto out;
+	}
+
+	/* 32-bit block number is always OK. */
+	if (max_block <= (u32)~0UL)
+		goto out;
+
+	/* Volume is "huge", so see if our journal is new enough to
+	   support it. */
+	if (!(OCFS2_HAS_COMPAT_FEATURE(osb->sb,
+				       OCFS2_FEATURE_COMPAT_JBD2_SB) &&
+	      jbd2_journal_check_used_features(osb->journal->j_journal, 0, 0,
+					       JBD2_FEATURE_INCOMPAT_64BIT))) {
+		mlog(ML_ERROR, "The journal cannot address the entire volume. "
+		     "Enable the 'block64' journal option with tunefs.ocfs2");
+		status = -EFBIG;
+		goto out;
+	}
+
+ out:
+	return status;
+}
+
 static int ocfs2_initialize_super(struct super_block *sb,
 				  struct buffer_head *bh,
 				  int sector_size,
@@ -2215,14 +2256,6 @@ static int ocfs2_initialize_super(struct super_block *sb,
 		goto bail;
 	}

-	if (ocfs2_clusters_to_blocks(osb->sb, le32_to_cpu(di->i_clusters) - 1)
-	    > (u32)~0UL) {
-		mlog(ML_ERROR, "Volume might try to write to blocks beyond "
-		     "what jbd can address in 32 bits.\n");
-		status = -EINVAL;
-		goto bail;
-	}
-
 	if (ocfs2_setup_osb_uuid(osb, di->id2.i_super.s_uuid,
 				 sizeof(di->id2.i_super.s_uuid))) {
 		mlog(ML_ERROR, "Out of memory trying to setup our uuid.\n");
@@ -2404,6 +2437,12 @@ static int ocfs2_check_volume(struct ocfs2_super *osb)
 		goto finally;
 	}

+	/* Now that journal has been loaded, check to make sure entire
+	   volume is addressable. */
+	status = ocfs2_check_addressable(osb);
+	if (status)
+		goto finally;
+
 	if (dirty) {
 		/* recover my local alloc if we didn't unmount cleanly. */
 		status = ocfs2_begin_local_alloc_recovery(osb,

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [Ocfs2-devel] Large (> 16TiB) volumes revisited
  2010-06-28 22:32                       ` Patrick J. LoPresti
@ 2010-06-29  0:47                         ` Sunil Mushran
  2010-06-29  1:15                           ` Patrick J. LoPresti
  0 siblings, 1 reply; 20+ messages in thread
From: Sunil Mushran @ 2010-06-29  0:47 UTC (permalink / raw)
  To: ocfs2-devel

On 06/28/2010 03:32 PM, Patrick J. LoPresti wrote:
> Current version of my proposed patch is appended.
>
> The good news is that I have tested it, and it "works for me".
>
> The bad news is that it performs the check fairly late (in
> ocfs2_check_volume(), just after the call to ocfs2_journal_load()),
> because this is the earliest I have the journal flags available to
> test.
>
>    

That should be ok. Afterall we have to handle journal_load()
failing for other reasons too. BTW, have you tested with cleared
INCOMPAT_64BIT journal flag?

Hopefully, we'll never trigger this condition... that is if we are
careful when making the same change in mkfs/tunefs.

> We can:
>
> (a) Run with this patch pretty much as-is; or
> (b) Move the jbd2_journal_load() call out of ocfs2_journal_load() so
> that the flags will be in memory sooner; or
> (c) Do something else (ideas welcome)
>
> Let me know what you think.  Thanks!
>
>   - Pat
>
>
> diff --git a/fs/ocfs2/super.c b/fs/ocfs2/super.c
> index 0eaa929..3db233d 100644
> --- a/fs/ocfs2/super.c
> +++ b/fs/ocfs2/super.c
> @@ -1991,6 +1991,47 @@ static int ocfs2_setup_osb_uuid(struct
> ocfs2_super *osb, const unsigned char *uu
>   	return 0;
>   }
>
> +/* Check to make sure entire volume is addressable on this system.
> +   Requires osb_clusters_at_boot to be valid and for the journal to
> +   have been read by jbd2_journal_load(). */
> +static int ocfs2_check_addressable(struct ocfs2_super *osb)
> +{
> +	int status = 0;
> +	u64 max_block =
> +		ocfs2_clusters_to_blocks(osb->sb,
> +					 osb->osb_clusters_at_boot) - 1;
> +
> +	/* Absolute addressability check (borrowed from ext4/super.c) */
> +	if ((max_block>
> +	     (sector_t)(~0LL)>>  (osb->sb->s_blocksize_bits - 9)) ||
> +	    (max_block>  (pgoff_t)(~0LL)>>  (PAGE_CACHE_SHIFT -
> +					     osb->sb->s_blocksize_bits))) {
> +		mlog(ML_ERROR, "Volume too large "
> +		     "to mount safely on this system");
> +		status = -EFBIG;
> +		goto out;
> +	}
> +
> +	/* 32-bit block number is always OK. */
> +	if (max_block<= (u32)~0UL)
> +		goto out;
> +
> +	/* Volume is "huge", so see if our journal is new enough to
> +	   support it. */
> +	if (!(OCFS2_HAS_COMPAT_FEATURE(osb->sb,
> +				       OCFS2_FEATURE_COMPAT_JBD2_SB)&&
> +	      jbd2_journal_check_used_features(osb->journal->j_journal, 0, 0,
> +					       JBD2_FEATURE_INCOMPAT_64BIT))) {
> +		mlog(ML_ERROR, "The journal cannot address the entire volume. "
> +		     "Enable the 'block64' journal option with tunefs.ocfs2");
> +		status = -EFBIG;
> +		goto out;
> +	}
> +
> + out:
> +	return status;
> +}
> +
>   static int ocfs2_initialize_super(struct super_block *sb,
>   				  struct buffer_head *bh,
>   				  int sector_size,
> @@ -2215,14 +2256,6 @@ static int ocfs2_initialize_super(struct super_block *sb,
>   		goto bail;
>   	}
>
> -	if (ocfs2_clusters_to_blocks(osb->sb, le32_to_cpu(di->i_clusters) - 1)
> -	>  (u32)~0UL) {
> -		mlog(ML_ERROR, "Volume might try to write to blocks beyond "
> -		     "what jbd can address in 32 bits.\n");
> -		status = -EINVAL;
> -		goto bail;
> -	}
> -
>   	if (ocfs2_setup_osb_uuid(osb, di->id2.i_super.s_uuid,
>   				 sizeof(di->id2.i_super.s_uuid))) {
>   		mlog(ML_ERROR, "Out of memory trying to setup our uuid.\n");
> @@ -2404,6 +2437,12 @@ static int ocfs2_check_volume(struct ocfs2_super *osb)
>   		goto finally;
>   	}
>
> +	/* Now that journal has been loaded, check to make sure entire
> +	   volume is addressable. */
> +	status = ocfs2_check_addressable(osb);
> +	if (status)
> +		goto finally;
> +
>   	if (dirty) {
>   		/* recover my local alloc if we didn't unmount cleanly. */
>   		status = ocfs2_begin_local_alloc_recovery(osb,
>
> _______________________________________________
> Ocfs2-devel mailing list
> Ocfs2-devel at oss.oracle.com
> http://oss.oracle.com/mailman/listinfo/ocfs2-devel
>    

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [Ocfs2-devel] Large (> 16TiB) volumes revisited
  2010-06-29  0:47                         ` Sunil Mushran
@ 2010-06-29  1:15                           ` Patrick J. LoPresti
  2010-06-29  1:54                             ` Sunil Mushran
  0 siblings, 1 reply; 20+ messages in thread
From: Patrick J. LoPresti @ 2010-06-29  1:15 UTC (permalink / raw)
  To: ocfs2-devel

On Mon, Jun 28, 2010 at 5:47 PM, Sunil Mushran <sunil.mushran@oracle.com> wrote:
> BTW, have you tested with cleared INCOMPAT_64BIT journal flag?

Hm, easier said than done. "tunefs.ocfs2 -J noblock64 ..." bombs out with:

tunefs.ocfs2: Unknown journal option: "block64"
Valid journal options are:
        size=<journal-size>
Usage: tunefs.ocfs2 [options] <device> [new-size]
[etc.]

By the way, it complains similarly about "tunefs.ocfs2 -J block64",
which puts the lie to the failure message in my patch...

When I try to re-format the partition without "-J block64", I get:

ERROR: jbd can only store block numbers in 32 bits. /dev/md0 can hold
5082795264 blocks which overflows this limit. If you have a new enough
Ocfs2 with JBD2 support, you can try formatting with the "-Jblock64"
option to turn on support for this size block device.
Otherwise, consider increasing the block size or decreasing the device size.


 - Pat

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [Ocfs2-devel] Large (> 16TiB) volumes revisited
  2010-06-29  1:15                           ` Patrick J. LoPresti
@ 2010-06-29  1:54                             ` Sunil Mushran
  2010-06-29  2:50                               ` Patrick J. LoPresti
  0 siblings, 1 reply; 20+ messages in thread
From: Sunil Mushran @ 2010-06-29  1:54 UTC (permalink / raw)
  To: ocfs2-devel

On 06/28/2010 06:15 PM, Patrick J. LoPresti wrote:
> On Mon, Jun 28, 2010 at 5:47 PM, Sunil Mushran<sunil.mushran@oracle.com>  wrote:
>    
>> BTW, have you tested with cleared INCOMPAT_64BIT journal flag?
>>      
> Hm, easier said than done. "tunefs.ocfs2 -J noblock64 ..." bombs out with:
>
> tunefs.ocfs2: Unknown journal option: "block64"
> Valid journal options are:
>          size=<journal-size>
> Usage: tunefs.ocfs2 [options]<device>  [new-size]
> [etc.]
>
> By the way, it complains similarly about "tunefs.ocfs2 -J block64",
> which puts the lie to the failure message in my patch...
>
> When I try to re-format the partition without "-J block64", I get:
>
> ERROR: jbd can only store block numbers in 32 bits. /dev/md0 can hold
> 5082795264 blocks which overflows this limit. If you have a new enough
> Ocfs2 with JBD2 support, you can try formatting with the "-Jblock64"
> option to turn on support for this size block device.
> Otherwise, consider increasing the block size or decreasing the device size.
>    

Which version of tools are you running? block64 was added after 
ocfs2-tools 1.4.2.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [Ocfs2-devel] Large (> 16TiB) volumes revisited
  2010-06-29  1:54                             ` Sunil Mushran
@ 2010-06-29  2:50                               ` Patrick J. LoPresti
  2010-06-29  4:21                                 ` Sunil Mushran
  0 siblings, 1 reply; 20+ messages in thread
From: Patrick J. LoPresti @ 2010-06-29  2:50 UTC (permalink / raw)
  To: ocfs2-devel

On Mon, Jun 28, 2010 at 6:54 PM, Sunil Mushran <sunil.mushran@oracle.com> wrote:
>
> Which version of tools are you running? block64 was added after ocfs2-tools
> 1.4.2.

1.4.4 (from Debian experimental).

"-J block64" It works fine for mkfs.ocfs2; just not for tunefs.ocfs2...

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [Ocfs2-devel] Large (> 16TiB) volumes revisited
  2010-06-29  2:50                               ` Patrick J. LoPresti
@ 2010-06-29  4:21                                 ` Sunil Mushran
  2010-06-29 19:43                                   ` Patrick J. LoPresti
  0 siblings, 1 reply; 20+ messages in thread
From: Sunil Mushran @ 2010-06-29  4:21 UTC (permalink / raw)
  To: ocfs2-devel

Yes, that needs to be fixed. Meanwhile you could hand edit the jsb to  
test your patch.

On Jun 28, 2010, at 7:50 PM, "Patrick J. LoPresti"  
<lopresti@gmail.com> wrote:

> On Mon, Jun 28, 2010 at 6:54 PM, Sunil Mushran <sunil.mushran@oracle.com 
> > wrote:
>>
>> Which version of tools are you running? block64 was added after  
>> ocfs2-tools
>> 1.4.2.
>
> 1.4.4 (from Debian experimental).
>
> "-J block64" It works fine for mkfs.ocfs2; just not for  
> tunefs.ocfs2...

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [Ocfs2-devel] Large (> 16TiB) volumes revisited
  2010-06-29  4:21                                 ` Sunil Mushran
@ 2010-06-29 19:43                                   ` Patrick J. LoPresti
  2010-06-29 20:22                                     ` Sunil Mushran
  0 siblings, 1 reply; 20+ messages in thread
From: Patrick J. LoPresti @ 2010-06-29 19:43 UTC (permalink / raw)
  To: ocfs2-devel

On Mon, Jun 28, 2010 at 9:21 PM, Sunil Mushran <sunil.mushran@oracle.com> wrote:
> Yes, that needs to be fixed. Meanwhile you could hand edit the jsb to test
> your patch.

Well, I do not understand the bits well enough to do that...  But I
did download the source for mkfs.ocfs2 and eliminated its check.

I created a large volume without "-J block64".  With my patch applies,
the kernel refuses to mount it and generates the appropriate error
message.

OK to submit my patch to linux-kernel?

 - Pat

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [Ocfs2-devel] Large (> 16TiB) volumes revisited
  2010-06-29 19:43                                   ` Patrick J. LoPresti
@ 2010-06-29 20:22                                     ` Sunil Mushran
  0 siblings, 0 replies; 20+ messages in thread
From: Sunil Mushran @ 2010-06-29 20:22 UTC (permalink / raw)
  To: ocfs2-devel

On 06/29/2010 12:43 PM, Patrick J. LoPresti wrote:
> On Mon, Jun 28, 2010 at 9:21 PM, Sunil Mushran<sunil.mushran@oracle.com>  wrote:
>    
>> Yes, that needs to be fixed. Meanwhile you could hand edit the jsb to test
>> your patch.
>>      
> Well, I do not understand the bits well enough to do that...  But I
> did download the source for mkfs.ocfs2 and eliminated its check.
>
> I created a large volume without "-J block64".  With my patch applies,
> the kernel refuses to mount it and generates the appropriate error
> message.
>
> OK to submit my patch to linux-kernel?
>    

Thanks. Mention that in the submission. Submit it to ocfs2-devel.
Joel rounds up all the patches and sends them to Linus during
the merge window.

^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2010-06-29 20:22 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-06-23  0:11 [Ocfs2-devel] Large (> 16TiB) volumes revisited Patrick J. LoPresti
2010-06-23  0:49 ` Joel Becker
2010-06-23  1:12   ` Patrick J. LoPresti
2010-06-23  1:36     ` Joel Becker
2010-06-24  0:04       ` Patrick J. LoPresti
2010-06-24  0:14         ` Joel Becker
2010-06-24  0:45           ` Patrick J. LoPresti
2010-06-24  0:55             ` Joel Becker
2010-06-24 16:53               ` Patrick J. LoPresti
2010-06-24 23:53                 ` Patrick J. LoPresti
2010-06-26 16:49                   ` Patrick J. LoPresti
2010-06-27 19:02                     ` Patrick J. LoPresti
2010-06-28 22:32                       ` Patrick J. LoPresti
2010-06-29  0:47                         ` Sunil Mushran
2010-06-29  1:15                           ` Patrick J. LoPresti
2010-06-29  1:54                             ` Sunil Mushran
2010-06-29  2:50                               ` Patrick J. LoPresti
2010-06-29  4:21                                 ` Sunil Mushran
2010-06-29 19:43                                   ` Patrick J. LoPresti
2010-06-29 20:22                                     ` Sunil Mushran

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.