* Re: xfs_growfs doesn't resize
@ 2011-06-30 23:30 kkeller
  2011-07-01 10:46 ` Dave Chinner
  0 siblings, 1 reply; 15+ messages in thread
From: kkeller @ 2011-06-30 23:30 UTC (permalink / raw)
  To: xfs

Hello again all,

I apologize for following up my own post, but I found some new information.

On Thu 30/06/11  2:42 PM , kkeller@sonic.net wrote:

> http://oss.sgi.com/archives/xfs/2008-01/msg00085.html

I found a newer thread in the archives which might be more relevant to my issue:

http://oss.sgi.com/archives/xfs/2009-09/msg00206.html

But I haven't yet done a umount, and don't really wish to.  So, my followup questions are:

==Is there a simple way to figure out what xfs_growfs did, and whether it caused any problems?
==Will I be able to fix these problems, if any, without needing a umount?
==Assuming my filesystem is healthy, will a simple kernel update (and reboot of course!) allow me to resize the filesystem in one step, instead of 2TB increments?

Again, many thanks!

--keith

-- 
kkeller@sonic.net


* Re: xfs_growfs doesn't resize
  2011-06-30 23:30 xfs_growfs doesn't resize kkeller
@ 2011-07-01 10:46 ` Dave Chinner
  0 siblings, 0 replies; 15+ messages in thread
From: Dave Chinner @ 2011-07-01 10:46 UTC (permalink / raw)
  To: kkeller; +Cc: xfs

On Thu, Jun 30, 2011 at 04:30:20PM -0700, kkeller@sonic.net wrote:
> Hello again all,
> 
> I apologize for following up my own post, but I found some new information.
> 
> On Thu 30/06/11  2:42 PM , kkeller@sonic.net wrote:
> 
> > http://oss.sgi.com/archives/xfs/2008-01/msg00085.html
> 
> I found a newer thread in the archives which might be more relevant to my issue:
> 
> http://oss.sgi.com/archives/xfs/2009-09/msg00206.html
> 
> But I haven't yet done a umount, and don't really wish to.  So, my followup questions are:
> 
> ==Is there a simple way to figure out what xfs_growfs did, and whether it caused any problems?

Apart from looking at what is on disk with xfs_db in the manner that
is done in the first thread you quoted, no.
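
Something like the following is safe even on the mounted filesystem
(read-only mode; substitute your device):

# xfs_db -r -c 'sb 0' -c 'print' <device>
# xfs_db -r -c 'sb 1' -c 'print' <device>

and compare fields such as dblocks and agcount between the superblock
copies.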


> ==Will I be able to fix these problems, if any, without needing a umount?

If you need to modify anything with xfs_db, then you have to unmount
the filesystem first.  And realistically, you need to unmount the
filesystem to make sure what xfs_db is reporting is not being
modified by the active filesystem.

So either way, you will have to unmount the filesystem.

> ==Assuming my filesystem is healthy, will a simple kernel update
> (and reboot of course!) allow me to resize the filesystem in one
> step, instead of 2TB increments?

I'd upgrade both kernel and userspace.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com


* Re: xfs_growfs doesn't resize
  2011-07-07 22:23     ` Keith Keller
@ 2011-07-07 22:30       ` Eric Sandeen
  0 siblings, 0 replies; 15+ messages in thread
From: Eric Sandeen @ 2011-07-07 22:30 UTC (permalink / raw)
  To: Keith Keller; +Cc: xfs

On 7/7/11 5:23 PM, Keith Keller wrote:
> On Thu, Jul 07, 2011 at 02:34:12PM -0500, Eric Sandeen wrote:

...

>> If it were me, if possible, I'd make backups of the fs as it's mounted
>> now, then umount it and make an xfs_metadump of it, restore that metadump
>> to a new sparse file, and point xfs_repair at that metadata image file,
>> to see what repair might do with it.
>>
>> If repair eats it alive, then we can look into more xfs_db type surgery
>> to fix things up more nicely...
> 
> This sounds like a reasonable plan.  It looks like xfs_metadump needs an
> unmounted or read-only filesystem in order to work properly; is there any
> way to estimate how long such a dump would take, and how large it would
> be from an almost-full 11TB fs with however many inodes it has (~19 million
> IIRC)?  I want to plan downtime and usable disk space accordingly.

well, I'm looking at an image of a 4T fs right now, with 208k inodes,
and the image itself took up 800M (a 4T sparse file when restored,
of course, but only using 800M)

> Would xfs_metadump create the same dump from a filesystem remounted ro
> as from a filesystem not mounted?  I think you suggested this idea in

yes, looks like it works, with recent tools anyway.

> an earlier post.  In a very optimistic scenario, I could imagine
> remounting the original filesystem ro, taking the metadump, then being
> able to remount rw so that I could put it back into service while I
> work with the metadump.  Then, once I knew more about the metadump, I

I think that should work.

> could do an actual umount and fix the filesystem using the information
> gathered from the metadump testing.  If they will produce the same
> metadump, then it could be a win-win if it's able to successfully
> remount rw afterwards; and if it can't, it wasn't any additional effort
> or risk to try.

agreed.

> Will xfsprogs 3.1.5 work with the older kernel, and will it make a
> better dump than 2.9.4?  I have built xfsprogs from source, but if it

2.9.4 won't have xfs_metadump ... and no problems with newer tools on
older kernels.  It's just reading the block device, in any case.
No unique kernel interaction.

> might have problems working with the kmod-xfs kernel module I can use
> the 2.9.4 tools instead.  (Again, in keeping with the hope-for-the-best
> scenario above, if avoiding a reboot won't be too harmful it'd be
> convenient.)

I think you can.

> I think you also mentioned that xfs_metadump can not dump frozen
> filesystems, but the man page for 3.1.5 says it can.  FWIW xfs_metadump
> refused to work on a frozen filesystem on my test machine, which has
> xfsprogs 2.9.4 (though from an older CentOS base).  (xfs_freeze does look
> like a nice tool though!)

it should(?) but:

# xfs_metadump /dev/loop0 blah
xfs_metadump: /dev/loop0 contains a mounted and writable filesystem

fatal error -- couldn't initialize XFS library

# xfs_freeze -f mnt/
# xfs_metadump /dev/loop0 blah
xfs_metadump: /dev/loop0 contains a mounted and writable filesystem

fatal error -- couldn't initialize XFS library

# xfs_freeze -u mnt
# mount -o remount,ro mnt
# xfs_metadump /dev/loop0 blah

<proceeds w/o problems>

I think we should make the tools work with freeze, but remount,ro works fine too.

-Eric

> --keith
> 
> 


* Re: xfs_growfs doesn't resize
  2011-07-07 19:34   ` Eric Sandeen
@ 2011-07-07 22:23     ` Keith Keller
  2011-07-07 22:30       ` Eric Sandeen
  0 siblings, 1 reply; 15+ messages in thread
From: Keith Keller @ 2011-07-07 22:23 UTC (permalink / raw)
  To: Eric Sandeen; +Cc: xfs

On Thu, Jul 07, 2011 at 02:34:12PM -0500, Eric Sandeen wrote:
> 
> It seems like you need an "exit strategy" - you probably cannot leave
> your fs mounted -forever- ...

Yes, of course.  I didn't mean to imply that I'd leave it like this
indefinitely.  :)

I will be on vacation next week, and it's not really reschedulable.  So
my plan was to ride the filesystem through next week, hope for the best,
then fix it when I return, rather than attempt to start a fix now and
risk ending up with a broken filesystem.  Ideally I would preemptively
switch to a warm backup before I leave, but that won't be ready in time.
(I currently do have all the important data backed up, but it is to
various different spaces where I had free disk space.  The warm backup
should be ready early next week if the filesystem does go belly-up
before I return.)

> If it were me, if possible, I'd make backups of the fs as it's mounted
> now, then umount it and make an xfs_metadump of it, restore that metadump
> to a new sparse file, and point xfs_repair at that metadata image file,
> to see what repair might do with it.
>
> If repair eats it alive, then we can look into more xfs_db type surgery
> to fix things up more nicely...

This sounds like a reasonable plan.  It looks like xfs_metadump needs an
unmounted or read-only filesystem in order to work properly; is there any
way to estimate how long such a dump would take, and how large it would
be from an almost-full 11TB fs with however many inodes it has (~19 million
IIRC)?  I want to plan downtime and usable disk space accordingly.

Would xfs_metadump create the same dump from a filesystem remounted ro
as from a filesystem not mounted?  I think you suggested this idea in
an earlier post.  In a very optimistic scenario, I could imagine
remounting the original filesystem ro, taking the metadump, then being
able to remount rw so that I could put it back into service while I
work with the metadump.  Then, once I knew more about the metadump, I
could do an actual umount and fix the filesystem using the information
gathered from the metadump testing.  If they will produce the same
metadump, then it could be a win-win if it's able to successfully
remount rw afterwards; and if it can't, it wasn't any additional effort
or risk to try.
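
Concretely, the optimistic path I have in mind is roughly (untested, and
the dump path is made up):

# mount -o remount,ro /export
# xfs_metadump /dev/mapper/saharaVG-saharaLV /elsewhere/sahara.metadump
# mount -o remount,rw /export

and then xfs_mdrestore the image to a sparse file somewhere else and run
xfs_repair against that copy offline.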

Will xfsprogs 3.1.5 work with the older kernel, and will it make a
better dump than 2.9.4?  I have built xfsprogs from source, but if it
might have problems working with the kmod-xfs kernel module I can use
the 2.9.4 tools instead.  (Again, in keeping with the hope-for-the-best
scenario above, if avoiding a reboot won't be too harmful it'd be
convenient.)

I think you also mentioned that xfs_metadump can not dump frozen
filesystems, but the man page for 3.1.5 says it can.  FWIW xfs_metadump
refused to work on a frozen filesystem on my test machine, which has
xfsprogs 2.9.4 (though from an older CentOS base).  (xfs_freeze does look
like a nice tool though!)

--keith


-- 
kkeller@sonic.net


* Re: xfs_growfs doesn't resize
  2011-07-07 18:25 ` Keith Keller
@ 2011-07-07 19:34   ` Eric Sandeen
  2011-07-07 22:23     ` Keith Keller
  0 siblings, 1 reply; 15+ messages in thread
From: Eric Sandeen @ 2011-07-07 19:34 UTC (permalink / raw)
  To: Keith Keller; +Cc: xfs

On 7/7/11 1:25 PM, Keith Keller wrote:
> Hi all,
> 
> First, I hope that this message fixes my mail client breaking threading.
> 
> I am sorry for following up my own post (again), but I realized this
> morning that there may be another possible risk I had not considered:
> 
> On Wed, Jul 06, 2011 at 03:51:32PM -0700, kkeller@sonic.net wrote:
>>
>> So, here is my xfs_db output.  This is still on a mounted filesystem.
> 
> How safe/risky is it to leave this filesystem mounted and in use?
> I'm not too concerned about new data, since it won't be a huge amount,
> but I am wondering if data that's already been written may be at risk.
> Or, is it a reasonable guess that the kernel is still working completely
> with the old filesystem geometry, and so won't write anything beyond the
> old limits while it's still mounted?  df certainly seems to use the old
> fs size, not the new one.

I don't remember all the implications of this very old bug...

It seems like you need an "exit strategy" - you probably cannot leave
your fs mounted -forever- ...

If it were me, if possible, I'd make backups of the fs as it's mounted
now, then umount it and make an xfs_metadump of it, restore that metadump
to a new sparse file, and point xfs_repair at that metadata image file,
to see what repair might do with it.

If repair eats it alive, then we can look into more xfs_db type surgery
to fix things up more nicely...
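
Roughly, and with placeholder paths:

# umount /export
# xfs_metadump /dev/mapper/saharaVG-saharaLV /backup/sahara.metadump
# xfs_mdrestore /backup/sahara.metadump /backup/sahara.img
# xfs_repair -f /backup/sahara.img

-f tells repair it is working on an image file rather than a block
device, and since the image is only a copy of the metadata you can let
repair do whatever it wants with it.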

-Eric

> Thanks again,
> 
> 
> --keith
> 
> 


* Re: xfs_growfs doesn't resize
  2011-07-06 22:51 kkeller
@ 2011-07-07 18:25 ` Keith Keller
  2011-07-07 19:34   ` Eric Sandeen
  0 siblings, 1 reply; 15+ messages in thread
From: Keith Keller @ 2011-07-07 18:25 UTC (permalink / raw)
  To: xfs

Hi all,

First, I hope that this message fixes my mail client breaking threading.

I am sorry for following up my own post (again), but I realized this
morning that there may be another possible risk I had not considered:

On Wed, Jul 06, 2011 at 03:51:32PM -0700, kkeller@sonic.net wrote:
> 
> So, here is my xfs_db output.  This is still on a mounted filesystem.

How safe/risky is it to leave this filesystem mounted and in use?
I'm not too concerned about new data, since it won't be a huge amount,
but I am wondering if data that's already been written may be at risk.
Or, is it a reasonable guess that the kernel is still working completely
with the old filesystem geometry, and so won't write anything beyond the
old limits while it's still mounted?  df certainly seems to use the old
fs size, not the new one.

Thanks again,


--keith


-- 
kkeller@sonic.net


* Re: xfs_growfs doesn't resize
@ 2011-07-06 22:51 kkeller
  2011-07-07 18:25 ` Keith Keller
  0 siblings, 1 reply; 15+ messages in thread
From: kkeller @ 2011-07-06 22:51 UTC (permalink / raw)
  To: Eric Sandeen; +Cc: xfs

Hello again XFS folks,

I have finally made the time to revisit this, after copying most of my data
elsewhere.

On Sun 03/07/11  9:41 PM , Eric Sandeen <sandeen@sandeen.net> wrote:
> On 7/3/11 11:34 PM, kkeller@sonic.net wrote:

> > How safe is running xfs_db with -r on my mounted filesystem? I
> 
> it's safe. At worst it might read inconsistent data, but it's
> perfectly safe.

So, here is my xfs_db output.  This is still on a mounted filesystem.

# xfs_db -r -c 'sb 0' -c 'print' /dev/mapper/saharaVG-saharaLV
magicnum = 0x58465342
blocksize = 4096
dblocks = 5371061248
rblocks = 0
rextents = 0
uuid = 1bffcb88-0d9d-4228-93af-83ec9e208e88
logstart = 2147483652
rootino = 128
rbmino = 129
rsumino = 130
rextsize = 1
agblocks = 91552192
agcount = 59
rbmblocks = 0
logblocks = 32768
versionnum = 0x30e4
sectsize = 512
inodesize = 256
inopblock = 16
fname = "\000\000\000\000\000\000\000\000\000\000\000\000"
blocklog = 12
sectlog = 9
inodelog = 8
inopblog = 4
agblklog = 27
rextslog = 0
inprogress = 0
imax_pct = 25
icount = 19556544
ifree = 1036
fdblocks = 2634477046
frextents = 0
uquotino = 131
gquotino = 132
qflags = 0x7
flags = 0
shared_vn = 0
inoalignmt = 2
unit = 0
width = 0
dirblklog = 0
logsectlog = 0
logsectsize = 0
logsunit = 0
features2 = 0


#  xfs_db -r -c 'sb 1' -c 'print' /dev/mapper/saharaVG-saharaLV
magicnum = 0x58465342
blocksize = 4096
dblocks = 2929670144
rblocks = 0
rextents = 0
uuid = 1bffcb88-0d9d-4228-93af-83ec9e208e88
logstart = 2147483652
rootino = 128
rbmino = 129
rsumino = 130
rextsize = 1
agblocks = 91552192
agcount = 32
rbmblocks = 0
logblocks = 32768
versionnum = 0x30e4
sectsize = 512
inodesize = 256
inopblock = 16
fname = "\000\000\000\000\000\000\000\000\000\000\000\000"
blocklog = 12
sectlog = 9
inodelog = 8
inopblog = 4
agblklog = 27
rextslog = 0
inprogress = 0
imax_pct = 25
icount = 19528640
ifree = 15932
fdblocks = 170285408
frextents = 0
uquotino = 131
gquotino = 132
qflags = 0x7
flags = 0
shared_vn = 0
inoalignmt = 2
unit = 0
width = 0
dirblklog = 0
logsectlog = 0
logsectsize = 0
logsunit = 0
features2 = 0


I can immediately see with a diff that dblocks and agcount are
different.  Some other fields also differ, namely icount, ifree, and
fdblocks, which I am not sure how to interpret.  But judging from the
other threads I quoted, it seems that in sb 0 dblocks and agcount hold
values for a 20TB filesystem, and that therefore, after an unmount, the
filesystem will become (at least temporarily) unmountable.

I've seen two different routes for trying to correct this issue--either
use xfs_db to manipulate the values directly, or run xfs_repair against a
dump taken with xfs_metadump from a frozen or ro-mounted filesystem.  My
worry about the latter is twofold--will I even be able to do a remount?
And will I have space for an xfs_metadump image of an 11TB filesystem?  I
have also seen advice in some of the other threads that xfs_repair can
actually make the damage worse (though presumably xfs_repair -n should be
safe).

If xfs_db is a better way to go, and if the values xfs_db reports don't
change after an unmount, would I simply do this?

# xfs_db -x /dev/mapper/saharaVG-saharaLV
sb 0 w dblocks = 2929670144 w agcount = 32

and then do an xfs_repair -n?
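
(If that does turn out to be the right edit, I gather the non-interactive
form would be something like the following -- untested, and I believe the
write command takes field and value with no equals sign:

# xfs_db -x -c 'sb 0' -c 'write dblocks 2929670144' \
         -c 'write agcount 32' /dev/mapper/saharaVG-saharaLV

run only with the filesystem unmounted.)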

A route I used ages ago, on ext2 filesystems, was to specify an
alternate superblock when running e2fsck.  Can xfs_repair do this?

> Get a recent xfsprogs too, if you haven't already, it scales better
> than the really old versions.

I think I may have asked this in another post, but would you suggest
compiling 3.0 from source?  The version that CentOS distributes is marked
as 2.9.4, but I don't know what patches they've applied (if any).  Would 3.0
be more likely to help recover the fs?

Thanks all for your patience!

--keith

-- 
kkeller@sonic.net



* Re: xfs_growfs doesn't resize
  2011-07-04  4:34 kkeller
@ 2011-07-04  4:41 ` Eric Sandeen
  0 siblings, 0 replies; 15+ messages in thread
From: Eric Sandeen @ 2011-07-04  4:41 UTC (permalink / raw)
  To: kkeller; +Cc: xfs

On 7/3/11 11:34 PM, kkeller@sonic.net wrote:
> 
> 
> On Sun 03/07/11  3:14 PM , Eric Sandeen <sandeen@sandeen.net> wrote:
> 
> [some rearranging]
> 
>> You're welcome but here's the obligatory plug in return - running
>> RHEL5 proper would have gotten you up to date, fully supported xfs,
>> and you wouldn't have run into this mess. Just sayin' ... ;)
> 
> Yep, that's definitely a lesson learned.  Though I don't think I can
> blame CentOS either--from what I can tell the fix has been available
> from yum for some time now.  So it's pretty much entirely my own
> fault.  :(

well it's unfortunate that that kmod persists.  I'll admit to providing
it, years and years ago... Centos should find a way to deprecate it...

> I also am sorry for not preserving threading--for some reason, the
> SGI mailserver rejected mail from my normal host (which is odd, as
> it's not in any blacklists I know of), so I am using an unfamiliar
> mail client.

sgi email ... sucks ;)

>> You probably hit this bug: 
>> http://oss.sgi.com/archives/xfs/2007-01/msg00053.html [1]
>> 
>> See also: http://oss.sgi.com/archives/xfs/2009-07/msg00087.html
>> [2]
>> 
>> I can't remember how much damage the original bug did ...
> 
> If any?  I'm a bit amazed that, if there was damage, the
> filesystem is still usable.  Perhaps if I were to fill it, it would
> show signs of inconsistency?  Or remounting would read the
> now-incorrect values from the superblock 0?
> 
>> is it still mounted I guess?
> 
> Yes, it's still mounted, and as far as I can tell perfectly fine.
> But I won't really know till I can throw xfs_repair -n and/or xfs_db
> and/or remount it; I'm choosing to get as much data off as I can
> before I try these things, just in case.
> 
> How safe is running xfs_db with -r on my mounted filesystem?  I

it's safe.  At worst it might read inconsistent data, but it's
perfectly safe.

> understand that results might not be consistent, but on the off
> chance that they are I am hoping that it might be at least a little
> helpful.
> 
> I was re-reading some of the threads I posted in my original
> messages, in particular these posts:
> 
> http://oss.sgi.com/archives/xfs/2009-09/msg00210.html 
> http://oss.sgi.com/archives/xfs/2009-09/msg00211.html
> 
> If I am reading those, plus the xfs_db man page, correctly, it seems
> like what Russell suggested was to look at superblock 1 (or some
> other one?) and use those values to correct superblock 0.  At what

don't worry about correcting anything until you know there is a problem :)

> points (if any) are the other superblocks updated?  I was testing on
> another machine, on a filesystem that I had successfully grown using
> xfs_growfs, and of the two values Russell suggested the OP to change,
> dblocks is different between sb 0 and sb 1, but agcount is not.
> Could that just be that I did not grow the filesystem too much, so
> that agcount didn't need to change?  That seems a bit
> counterintuitive, but (as should be obvious) I don't know XFS all

if you grew it 9T, you would have almost certainly gotten more AGs.
If you did a smaller test then you might see that.  To be honest
I don't remember when the backup superblocks get updated.

> that well.  I am hoping to know because, in re-reading those
> messages, I got a better idea of what those particular xfs_db
> commands do, so that if I did run into problems remounting, I might
> be able to determine the appropriate new values myself and reduce my
> downtime.  But I want to understand more what I'm doing before I try
> that!

I think finding a way to do a dry-run xfs_repair would be the best
place to start ...

Get a recent xfsprogs too, if you haven't already, it scales better
than the really old versions.

-Eric
 
> --keith
> 


* Re: xfs_growfs doesn't resize
@ 2011-07-04  4:34 kkeller
  2011-07-04  4:41 ` Eric Sandeen
  0 siblings, 1 reply; 15+ messages in thread
From: kkeller @ 2011-07-04  4:34 UTC (permalink / raw)
  To: Eric Sandeen; +Cc: xfs



On Sun 03/07/11  3:14 PM , Eric Sandeen <sandeen@sandeen.net> wrote:

[some rearranging]

> You're welcome but here's the obligatory plug in return - running RHEL5
> proper would have gotten you up to date, fully supported xfs, and you
> wouldn't have run into this mess. Just sayin' ... ;)

Yep, that's definitely a lesson learned.  Though I don't think I can blame CentOS either--from what I can tell the fix has been available from yum for some time now.  So it's pretty much entirely my own fault.  :(

I also am sorry for not preserving threading--for some reason, the SGI mailserver rejected mail from my normal host (which is odd, as it's not in any blacklists I know of), so I am using an unfamiliar mail client.

> You probably hit this bug:
> http://oss.sgi.com/archives/xfs/2007-01/msg00053.html [1]
> 
> See also:
> http://oss.sgi.com/archives/xfs/2009-07/msg00087.html [2]
> 
> I can't remember how much damage the original bug did ...

If any?  I'm a bit amazed that, if there was damage, the filesystem is still usable.  Perhaps if I were to fill it, it would show signs of inconsistency?  Or remounting would read the now-incorrect values from the superblock 0?

> is it still mounted I guess?

Yes, it's still mounted, and as far as I can tell perfectly fine.  But I won't really know till I can throw xfs_repair -n and/or xfs_db and/or remount it; I'm choosing to get as much data off as I can before I try these things, just in case.

How safe is running xfs_db with -r on my mounted filesystem?  I understand that results might not be consistent, but on the off chance that they are I am hoping that it might be at least a little helpful.

I was re-reading some of the threads I posted in my original messages, in particular these posts:

http://oss.sgi.com/archives/xfs/2009-09/msg00210.html
http://oss.sgi.com/archives/xfs/2009-09/msg00211.html

If I am reading those, plus the xfs_db man page, correctly, it seems like what Russell suggested was to look at superblock 1 (or some other one?) and use those values to correct superblock 0.  At what points (if any) are the other superblocks updated?  I was testing on another machine, on a filesystem that I had successfully grown using xfs_growfs, and of the two values Russell suggested the OP to change, dblocks is different between sb 0 and sb 1, but agcount is not.  Could that just be that I did not grow the filesystem too much, so that agcount didn't need to change?  That seems a bit counterintuitive, but (as should be obvious) I don't know XFS all that well.  I am hoping to know because, in re-reading those messages, I got a better idea of what those particular xfs_db commands do, so that if I did run into problems remounting, I might be able to determine the appropriate new values myself and reduce my downtime.  But I want to understand more what I'm doing before I try that!

--keith

-- 
kkeller@sonic.net


* Re: xfs_growfs doesn't resize
       [not found]   ` <20110703193822.GA28632@wombat.san-francisco.ca.us>
@ 2011-07-03 22:14     ` Eric Sandeen
  0 siblings, 0 replies; 15+ messages in thread
From: Eric Sandeen @ 2011-07-03 22:14 UTC (permalink / raw)
  To: Keith Keller; +Cc: xfs

On 7/3/11 2:38 PM, Keith Keller wrote:
> On Sun, Jul 03, 2011 at 10:59:03AM -0500, Eric Sandeen wrote:
>> On 6/30/11 4:42 PM, kkeller@sonic.net wrote:
>>> # uname -a
>>> Linux sahara.xxx 2.6.18-128.1.6.el5 #1 SMP Wed Apr 1 09:10:25 EDT 2009 x86_64 x86_64 x86_64 GNU/Linux
>>>
>>> Yes, it's not a completely current kernel.  This box is running CentOS 5
>>> with some yum updates.
>>
>> try
>>
>> # rpm -qa | grep xfs
>>
>> If you see anything with "kmod" you're running an exceptionally old xfs codebase.
> 
> Yes, I do have a kmod-xfs package, so clearly a kernel update is in
> order.  So my goals are twofold: 1) verify the current filesystem's
> state--is it healthy, or does it need xfs_db voodoo?  2) once it's
> determined healthy, again attempt to grow the filesystem.  Here is
> my current plan for reaching these goals:

> 0) get a nearer-term backup, just in case :)  The filesystem still seems
> perfectly normal, but without knowing what my first xfs_growfs did I
> don't know if or how long this state will last.

good idea.
 
> 1) umount the fs to run xfs_db
> 
> 2) attempt a remount--is this safe, or is there risk of damaging the
> filesystem?

I'm not sure.

You probably hit this bug:
http://oss.sgi.com/archives/xfs/2007-01/msg00053.html

See also:
http://oss.sgi.com/archives/xfs/2009-07/msg00087.html

I can't remember how much damage the original bug did ...

> 3) If a remount succeeds, then update the kernel and xfsprogs.  If
> a remount doesn't work, then revert to the near-term backup I took
> in 0) and attempt to fix the issue (with the help of the list, I hope).

One thing you might be able to do, though I don't remember for sure
if this works, is to freeze the fs and create an xfs_metadump
image of it.  You can then point xfs_repair at that image, and see
what it finds.  But I'm not sure if metadump will work on a frozen
fs... hm no.  Only if it's mounted ro.

Otherwise -maybe- xfs_repair -n -d might work after a mount -o remount,ro.

(-n -d means operate in no-modify mode on an ro-mounted fs)


So you'd need to mount readonly before you could either do xfs_repair -nd
or xfs_metadump followed by repair of that image.  Either one would give
you an idea of the health of the fs.
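
i.e. something along the lines of:

# mount -o remount,ro /export
# xfs_repair -n -d /dev/mapper/saharaVG-saharaLV

(using the device and mount point from your original post).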

> 4) In either case, post my xfs_db output to the list and get your
> opinions on the health of the fs.

repair probably will tell you more as an initial step.

> 5) If the fs seems correct, attempt xfs_growfs again.
> 
> Do all these steps seem reasonable?  I am most concerned about step 2--
> I really do want to be able to remount as quickly as possible, but I
> do not know how to tell whether it's okay from xfs_db's output.  So if a
> remount attempt is reasonably nondestructive (i.e., it won't make worse
> an already unhealthy XFS fs) then I can try it and hope for the best.
> (From the other threads I've seen it seems like it's not a good idea to
> run xfs_repair.)

you can run it with -n to do no-modify.  If it's clean, you're good;
if it's a mess, you won't hurt anything, other than making you sad.  :)

> Would it make more sense to update the kernel and xfsprogs before
> attempting a remount?  If a remount fails under the original kernel,

is it still mounted I guess?

A newer up to date kernel certainly won't make anything -worse-

You should uninstall that kmod rpm though so it doesn't get priority
over the xfs.ko in the new kernel.  If you need to revert to the old
kernel, you could always reinstall it.
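
Something like:

# rpm -qa | grep -i xfs     (note the exact kmod package name)
# rpm -e kmod-xfs           (or xfs-kmod, whatever it is actually called)
# yum update kernel

then reboot into the updated kernel so the in-kernel xfs module gets used.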

> what do people think the odds are that a new kernel would be able to
> mount the original fs, or is that really unwise?

I don't think a newer kernel would do any further harm.

> Again, many thanks for all your help.

You're welcome but here's the obligatory plug in return - running RHEL5
proper would have gotten you up to date, fully supported xfs, and you wouldn't
have run into this mess.  Just sayin' ... ;)

-Eric
 
> --keith
> 
> 


* Re: xfs_growfs doesn't resize
@ 2011-07-03 19:42 kkeller
  0 siblings, 0 replies; 15+ messages in thread
From: kkeller @ 2011-07-03 19:42 UTC (permalink / raw)
  To: xfs; +Cc: Eric Sandeen

On Sun, Jul 03, 2011 at 10:59:03AM -0500, Eric Sandeen wrote:
> On 6/30/11 4:42 PM, kkeller@sonic.net wrote:
> > # uname -a
> > Linux sahara.xxx 2.6.18-128.1.6.el5 #1 SMP Wed Apr 1 09:10:25 EDT 2009 x86_64 x86_64 x86_64 GNU/Linux
> > 
> > Yes, it's not a completely current kernel. This box is running CentOS 5
> > with some yum updates.
> 
> try
> 
> # rpm -qa | grep xfs
> 
> If you see anything with "kmod" you're running an exceptionally old xfs codebase.


Yes, I do have a kmod-xfs package, so clearly a kernel update is in
order. So my goals are twofold: 1) verify the current filesystem's
state--is it healthy, or does it need xfs_db voodoo? 2) once it's
determined healthy, again attempt to grow the filesystem. Here is
my current plan for reaching these goals:

0) get a nearer-term backup, just in case :) The filesystem still seems
perfectly normal, but without knowing what my first xfs_growfs did I
don't know if or how long this state will last.

1) umount the fs to run xfs_db

2) attempt a remount--is this safe, or is there risk of damaging the filesystem?

3) If a remount succeeds, then update the kernel and xfsprogs. If a remount
doesn't work, then revert to the near-term backup I took in 0) and attempt
to fix the issue (with the help of the list, I hope).

4) In either case, post my xfs_db output to the list and get your
opinions on the health of the fs.

5) If the fs seems correct, attempt xfs_growfs again.

Do all these steps seem reasonable? I am most concerned about step 2--
I really do want to be able to remount as quickly as possible, but I
do not know how to tell whether it's okay from xfs_db's output. So if a
remount attempt is reasonably nondestructive (i.e., it won't make worse
an already unhealthy XFS fs) then I can try it and hope for the best.
(From the other threads I've seen it seems like it's not a good idea to
run xfs_repair.)

Would it make more sense to update the kernel and xfsprogs before
attempting a remount? If a remount fails under the original kernel,
what do people think the odds are that a new kernel would be able to
mount the original fs, or is that really unwise?

Again, many thanks for all your help.

--keith

-- 
kkeller@sonic.net


* Re: xfs_growfs doesn't resize
  2011-07-03 15:59 ` Eric Sandeen
@ 2011-07-03 16:01   ` Eric Sandeen
       [not found]   ` <20110703193822.GA28632@wombat.san-francisco.ca.us>
  1 sibling, 0 replies; 15+ messages in thread
From: Eric Sandeen @ 2011-07-03 16:01 UTC (permalink / raw)
  To: kkeller; +Cc: xfs

On 7/3/11 10:59 AM, Eric Sandeen wrote:
> On 6/30/11 4:42 PM, kkeller@sonic.net wrote:
>> # uname -a
>> Linux sahara.xxx 2.6.18-128.1.6.el5 #1 SMP Wed Apr 1 09:10:25 EDT 2009 x86_64 x86_64 x86_64 GNU/Linux
>>
>> Yes, it's not a completely current kernel.  This box is running CentOS 5
>> with some yum updates.
> 
> try
> 
> # rpm -qa | grep xfs
> 
> If you see anything with "kmod" you're running an exceptionally old xfs codebase.
> 
> 2.6.18-138 and beyond should have a newer xfs backport built into the kernel
> rpm itself, as shipped from Red Hat.  But the ancient xfs-kmod (or similar)
> provided xfs.ko will take precedence even if you update that kernel.

... unless you uninstall the xfs-kmod package.

(i'm not sure how to set precedence of found kernel modules, I guess)

-Eric

> * Fri Apr 03 2009 Don Zickus <dzickus@redhat.com> [2.6.18-138.el5]
> ...
> - [fs] xfs:  update to 2.6.28.6 codebase (Eric Sandeen ) [470845]
> 
> If at all possible I'd try an updated kernel, especially if your xfs.ko
> is provided by the very, very, very old centos xfs-kmod rpm.
> 
> -Eric


* Re: xfs_growfs doesn't resize
  2011-06-30 21:42 kkeller
@ 2011-07-03 15:59 ` Eric Sandeen
  2011-07-03 16:01   ` Eric Sandeen
       [not found]   ` <20110703193822.GA28632@wombat.san-francisco.ca.us>
  0 siblings, 2 replies; 15+ messages in thread
From: Eric Sandeen @ 2011-07-03 15:59 UTC (permalink / raw)
  To: kkeller; +Cc: xfs

On 6/30/11 4:42 PM, kkeller@sonic.net wrote:
> # uname -a
> Linux sahara.xxx 2.6.18-128.1.6.el5 #1 SMP Wed Apr 1 09:10:25 EDT 2009 x86_64 x86_64 x86_64 GNU/Linux
> 
> Yes, it's not a completely current kernel.  This box is running CentOS 5
> with some yum updates.

try

# rpm -qa | grep xfs

If you see anything with "kmod" you're running an exceptionally old xfs codebase.

2.6.18-138 and beyond should have a newer xfs backport built into the kernel
rpm itself, as shipped from Red Hat.  But the ancient xfs-kmod (or similar)
provided xfs.ko will take precedence even if you update that kernel.

* Fri Apr 03 2009 Don Zickus <dzickus@redhat.com> [2.6.18-138.el5]
...
- [fs] xfs:  update to 2.6.28.6 codebase (Eric Sandeen ) [470845]

If at all possible I'd try an updated kernel, especially if your xfs.ko
is provided by the very, very, very old centos xfs-kmod rpm.

-Eric


* Re: xfs_growfs doesn't resize
@ 2011-07-01 16:44 kkeller
  0 siblings, 0 replies; 15+ messages in thread
From: kkeller @ 2011-07-01 16:44 UTC (permalink / raw)
  To: xfs

Thanks for the response, Dave!  I have some additional questions inline.


On Fri 01/07/11  3:46 AM , Dave Chinner <david@fromorbit.com> wrote:

> So either way, you will have to unmount the filesystem.

Yikes!  I am guessing that may put the filesystem at risk of not being able to re-mount without xfs_db commands, as happened to the other posters I cited.  If unmounting does leave the fs unmountable, is there a way to minimize downtime by looking at the xfs_db output after I umount and calculating any new parameters myself?  Or is that considered generally unwise, and does xfs_db output need an expert to interpret it?  I want to minimize downtime, but I also want to minimize the risk of data loss, so I wouldn't want to derive my own xfs_db commands unless it was very safe.  (Even with backups available, it's more work to switch over or restore if I do lose the filesystem; we're a small group so we don't have an automatic failover server.)

Are there any other docs concerning using xfs_db?  I saw a post from last year that said that there weren't, but I'm wondering if that's changed since then.  There is of course the man page, but that doesn't describe how to interpret what's going on from its output (or what the correct steps to take are if there's a problem).

> > ==Assuming my filesystem is healthy, will a simple kernel update
> > (and reboot of course!) allow me to resize the filesystem in one
> > step, instead of 2TB increments?
> 
> I'd upgrade both kernel and userspace.

Would you recommend upgrading userspace from source?  CentOS 5 still calls the version available (from their centosplus repo) 2.9.4, but I haven't investigated what sort of patches they may have applied.


--keith


-- 
kkeller@sonic.net



* xfs_growfs doesn't resize
@ 2011-06-30 21:42 kkeller
  2011-07-03 15:59 ` Eric Sandeen
  0 siblings, 1 reply; 15+ messages in thread
From: kkeller @ 2011-06-30 21:42 UTC (permalink / raw)
  To: xfs

Hello kind XFS folks,

I am having a strange issue with xfs_growfs, and before I attempt to do
something potentially unsafe, I thought I would check in with the list
for advice.

Our fileserver had an ~11TB xfs filesystem hosted under linux lvm.  I
recently added more disks to create a new 9TB container, and used the
lvm tools to add the container to the existing volume group.  When I
went to xfs_growfs the filesystem, I had the first issue that this user
had, where the metadata was reported, but there was no message about
the new number of blocks:

http://oss.sgi.com/archives/xfs/2008-01/msg00085.html
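
For reference, the grow sequence I ran was essentially the usual one
(from memory, so the exact lvextend arguments may have differed):

# pvcreate /dev/sdd1
# vgextend saharaVG /dev/sdd1
# lvextend -l +100%FREE /dev/saharaVG/saharaLV
# xfs_growfs /export

xfs_growfs printed the filesystem geometry but never reported a new
block count.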

Fortunately, I have not yet seen the other symptoms that the OP saw: I
can still read from and write to the original filesystem.  But the
filesystem size hasn't changed, and I'm not experienced enough to
interpret the xfs_info output properly.

I read through that thread (and others), but none seemed specific to my
issue.  Plus, since my filesystem still seems healthy, I'm hoping that
there's a graceful way to resolve the issue and add the new disk space.

Here's some of the information I've seen asked for in the past.  I
apologize for it being fairly long.

/proc/partitions:

major minor  #blocks  name

   8     0  244129792 sda
   8     1     104391 sda1
   8     2    8385930 sda2
   8     3   21205800 sda3
   8     4          1 sda4
   8     5   30876898 sda5
   8     6   51761398 sda6
   8     7   20555136 sda7
   8     8    8233281 sda8
   8     9   20603331 sda9
   8    16 11718684672 sdb
   8    17 11718684638 sdb1
 253     1 21484244992 dm-1
   8    48 9765570560 sdd
   8    49 9765568085 sdd1


sdb1 is the original member of the volume group.  sdd1 is the new PV.  I
believe dm-1 is the logical volume built on that volume group (and all the
LVM tools report a 20TB logical volume).


# lvdisplay 
  --- Logical volume ---
  LV Name                /dev/saharaVG/saharaLV
  VG Name                saharaVG
  LV UUID                DjacPa-p9mk-mBmv-69c2-dmXF-LfxQ-wsRUOD
  LV Write Access        read/write
  LV Status              available
  # open                 1
  LV Size                20.01 TB
  Current LE             5245177
  Segments               2
  Allocation             inherit
  Read ahead sectors     0
  Block device           253:1

# uname -a
Linux sahara.xxx 2.6.18-128.1.6.el5 #1 SMP Wed Apr 1 09:10:25 EDT 2009 x86_64 x86_64 x86_64 GNU/Linux

Yes, it's not a completely current kernel.  This box is running CentOS 5
with some yum updates.

# xfs_growfs -V
xfs_growfs version 2.9.4

This xfs_info is from after the xfs_growfs attempt.  I regret that I
don't have one from before; I was actually thinking of it, but the resize
went so smoothly on my test machine (and went fine in the past as well on
other platforms) that I didn't give it much thought till it was too late.

# xfs_info /export/
meta-data=/dev/mapper/saharaVG-saharaLV isize=256    agcount=32, agsize=91552192 blks
         =                       sectsz=512   attr=0
data     =                       bsize=4096   blocks=2929670144, imaxpct=25
         =                       sunit=0      swidth=0 blks, unwritten=1
naming   =version 2              bsize=4096
log      =internal               bsize=4096   blocks=32768, version=1
         =                       sectsz=512   sunit=0 blks, lazy-count=0
realtime =none                   extsz=4096   blocks=0, rtextents=0

I saw requests to run xfs_db, but I don't want to mess up the syntax, even if -r should be safe.
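
(From what I gather, the read-only form would be something like

# xfs_db -r -c 'sb 0' -c 'print' /dev/mapper/saharaVG-saharaLV

but I'd appreciate confirmation that it is safe as written before running
even that.)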

Thanks for any help you can provide!

--keith


-- 
kkeller@sonic.net


