* Cannot delete some empty dirs and weird sizes
@ 2012-01-31 12:00 Amon Ott
  2012-01-31 18:40 ` Gregory Farnum
  0 siblings, 1 reply; 5+ messages in thread
From: Amon Ott @ 2012-01-31 12:00 UTC (permalink / raw)
  To: ceph-devel

Hi again!

We are running Ceph 0.41 and kernel 3.2.2 with current for-linus code (commit  
3d882ce47de80e0294a536bec771b5651885b4d3) now.

After some heavy workloads we see quite a few directories that cannot be 
deleted, although ls and find show that they are empty. rmdir says they are 
not empty.

Additionally, ceph reports various weird size values for some, but not all of 
them:
ls -la .tmp/tiny61/.mozilla/firefox/default.yat/
insgesamt 0
drwxr-xr-x 1 tiny61 users 18446744073705748665 25. Jan 10:02 .
drwxr-xr-x 1 tiny61 users 18446744073705748665 25. Jan 10:02 ..

Is this a known or a new bug? Can it be related to .snap pseudo dirs? The 
problem appeared without ever using snapshots, though.
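
The symptom described above (an empty listing, yet rmdir reports "not empty") can be checked per directory with a small script. This is only a sketch: the function name rmdir_status is made up for illustration, and the script really does remove directories that are genuinely empty, so it should only be pointed at directories that are meant to be deleted.

```python
import errno
import os


def rmdir_status(path):
    """Try to remove a directory and classify the outcome.

    Returns "removed", "not-empty" (entries are visible in the listing),
    or "inconsistent" (listing is empty but rmdir reports it is not).
    """
    entries = os.listdir(path)  # never includes "." and ".."
    try:
        os.rmdir(path)
    except OSError as exc:
        # POSIX allows either ENOTEMPTY or EEXIST for a non-empty directory.
        if exc.errno not in (errno.ENOTEMPTY, errno.EEXIST):
            raise
        return "not-empty" if entries else "inconsistent"
    return "removed"
```

On a healthy file system the "inconsistent" result should never occur; the directories reported here would be expected to return it.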

Amon Ott
-- 
Dr. Amon Ott
m-privacy GmbH           Tel: +49 30 24342334
Am Köllnischen Park 1    Fax: +49 30 24342336
10179 Berlin             http://www.m-privacy.de

Amtsgericht Charlottenburg, HRB 84946

Geschäftsführer:
 Dipl.-Kfm. Holger Maczkowsky,
 Roman Maczkowsky

GnuPG-Key-ID: 0x2DD3A649
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


* Re: Cannot delete some empty dirs and weird sizes
  2012-01-31 12:00 Cannot delete some empty dirs and weird sizes Amon Ott
@ 2012-01-31 18:40 ` Gregory Farnum
  2012-02-01 17:02   ` Amon Ott
  0 siblings, 1 reply; 5+ messages in thread
From: Gregory Farnum @ 2012-01-31 18:40 UTC (permalink / raw)
  To: Amon Ott; +Cc: ceph-devel, Sage Weil

On Tue, Jan 31, 2012 at 4:00 AM, Amon Ott <a.ott@m-privacy.de> wrote:
> Hi again!
>
> We are running Ceph 0.41 and kernel 3.2.2 with current for-linus code (commit
> 3d882ce47de80e0294a536bec771b5651885b4d3) now.
>
> After some heavy workloads we see quite a few directories that cannot be
> deleted, although ls and find show that they are empty. rmdir says they are
> not empty.
>
> Additionally, ceph reports various weird size values for some, but not all of
> them:
> ls -la .tmp/tiny61/.mozilla/firefox/default.yat/
> insgesamt 0
> drwxr-xr-x 1 tiny61 users 18446744073705748665 25. Jan 10:02 .
> drwxr-xr-x 1 tiny61 users 18446744073705748665 25. Jan 10:02 ..
>
> Is this a known or a new bug? Can it be related to .snap pseudo dirs? The
> problem appeared without ever using snapshots, though.

I believe this is new. Based on the odd sizes (that's a 64-bit -1
interpreted as unsigned, fyi), my guess is that the "recursive
accounting" statistics are off and that's leading the MDS to believe
the directory is not empty even though it is. It's unlikely to be
directly related to snapshots, though it's not impossible.
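
For reference, the signed reading of the reported size can be worked out directly. A minimal sketch (the helper name as_signed64 is made up for illustration) that reinterprets the value from the ls output as a two's-complement 64-bit integer:

```python
def as_signed64(value):
    """Reinterpret an unsigned 64-bit integer as two's-complement signed."""
    value &= (1 << 64) - 1
    return value - (1 << 64) if value >= (1 << 63) else value


reported = 18446744073705748665  # directory size from the ls output above
print(as_signed64(reported))     # -3802951
print(as_signed64(2**64 - 1))    # -1
```

So the size shown is a small negative byte count (-3802951) rather than exactly -1, which would still fit the idea of an accounting counter that was decremented below zero.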

Have you seen this on more than one MDS? If it's reproducible we could
more easily figure out the cause; otherwise the best we can do is to
maybe fix up the specific instance of it.
-Greg


* Re: Cannot delete some empty dirs and weird sizes
  2012-01-31 18:40 ` Gregory Farnum
@ 2012-02-01 17:02   ` Amon Ott
  2012-02-02 21:09     ` Gregory Farnum
  0 siblings, 1 reply; 5+ messages in thread
From: Amon Ott @ 2012-02-01 17:02 UTC (permalink / raw)
  To: Gregory Farnum; +Cc: ceph-devel, Sage Weil

On Tuesday 31 January 2012, Gregory Farnum wrote:
> On Tue, Jan 31, 2012 at 4:00 AM, Amon Ott <a.ott@m-privacy.de> wrote:
> > Hi again!
> >
> > We are running Ceph 0.41 and kernel 3.2.2 with current for-linus code
> > (commit 3d882ce47de80e0294a536bec771b5651885b4d3) now.
> >
> > After some heavy workloads we see quite a few directories that cannot be
> > deleted, although ls and find show that they are empty. rmdir says they
> > are not empty.
> >
> > Additionally, ceph reports various weird size values for some, but not
> > all of them:
> > ls -la .tmp/tiny61/.mozilla/firefox/default.yat/
> > insgesamt 0
> > drwxr-xr-x 1 tiny61 users 18446744073705748665 25. Jan 10:02 .
> > drwxr-xr-x 1 tiny61 users 18446744073705748665 25. Jan 10:02 ..
> >
> > Is this a known or a new bug? Can it be related to .snap pseudo dirs? The
> > problem appeared without ever using snapshots, though.
>
> I believe this is new. Based on the odd sizes (that's a 64-bit -1
> interpreted as unsigned, fyi), my guess is that the "recursive
> accounting" statistics are off and that's leading the MDS to believe
> the directory is not empty even though it is. It's unlikely to be
> directly related to snapshots, though it's not impossible.
>
> Have you seen this on more than one MDS? If it's reproducible we could
> more easily figure out the cause; otherwise the best we can do is to
> maybe fix up the specific instance of it.

I had to recreate ceph fs several times today because of kernel problems. Now 
I have only one dir that is wrong:
ls -la .tmp/tiny14/.config/pcmanfm/LXDE/
insgesamt 0
drwxr-xr-x 1 32252 users 393  1. Feb 15:19 .
drwxr-xr-x 1 32252 users   0  1. Feb 17:21 ..

This is probably caused by another reboot I had to do, although I think Ceph
should have recovered here. It might also be caused by this setting, which I
tried for a while and have since turned off:
mds standby replay = true
With this setting, if the active MDS gets killed, no MDS is able to become
active, so everything hangs. I had to reboot again.

In the MDS log I found that the wrongly reported size matches the dir total:

2012-02-01 17:21:51.306561 4f830b70 mds.0.cache.dir(1000000b055) _fetched
badness: got (but i already had) [inode 100000066c9 [2,head]
/tiny14/.config/pcmanfm/LXDE.conf auth v4 s=393 n(v0 b393 1=1+0)
(iversion lock) cr={4711=0-4194304@1} caps={5313=pAsLsXsFscr/-@1} |
caps 0x1d13c600] mode 33188 mtime 2012-01-24 15:55:59.000000
2012-02-01 17:21:51.306646 4f830b70 log [ERR] : loaded dup inode
100000066c9 [2,head] v7 at /tiny14/.config/pcmanfm/LXDE/pcmanfm.conf,
but inode 100000066c9.head v4 already exists at
/tiny14/.config/pcmanfm/LXDE.conf
2012-02-01 17:21:51.349424 4f830b70 mds.0.cache.dir(100000066ae) mismatch 
between head items and fnode.fragstat! printing dentries
2012-02-01 17:21:51.349457 4f830b70 mds.0.cache.dir(100000066ae) 
get_num_head_items() = 2; fnode.fragstat.nfiles=0 fnode.fragstat.nsubdirs=1
2012-02-01 17:21:51.349493 4f830b70 mds.0.cache.dir(100000066ae) [dentry 
#1/tiny14/.config/pcmanfm/LXDE [2,head] auth (dversion lock) pv=0 v=16 
inode=0x1cff3828 | inodepin 0x1b9f4de0]
2012-02-01 17:21:51.349521 4f830b70 mds.0.cache.dir(100000066ae) [dentry 
#1/tiny14/.config/pcmanfm/LXDE.conf [2,head] auth (dn xlock x=1 by 
0x1ab21200) (dversion lock w=1 last_client=5313) pv=17 v=16 ap=2+2 
inode=0x1d13c600 | request lock inodepin authpin 0x1b90f064]
2012-02-01 17:21:51.349552 4f830b70 mds.0.cache.dir(100000066ae) mismatch 
between child accounted_rstats and my rstats!
2012-02-01 17:21:51.349573 4f830b70 mds.0.cache.dir(100000066ae) total of 
child dentrys: n(v0 rc2012-02-01 15:19:55.517733 b786 3=2+1)
2012-02-01 17:21:51.349591 4f830b70 mds.0.cache.dir(100000066ae) my rstats:              
n(v3 rc2012-02-01 15:19:55.517733 b393 2=1+1)
2012-02-01 17:21:51.349616 4f830b70 mds.0.cache.dir(100000066ae) [dentry 
#1/tiny14/.config/pcmanfm/LXDE [2,head] auth (dversion lock) pv=0 v=16 
inode=0x1cff3828 | inodepin 0x1b9f4de0] n(v0 rc2012-02-01 15:19:55.517733 
b393 2=1+1)
2012-02-01 17:21:51.349643 4f830b70 mds.0.cache.dir(100000066ae) [dentry 
#1/tiny14/.config/pcmanfm/LXDE.conf [2,head] auth (dn xlock x=1 by 
0x1ab21200) (dversion lock w=1 last_client=5313) pv=17 v=16 ap=2+2 
inode=0x1d13c600 | request lock inodepin authpin 0x1b90f064] n(v0 b393 1=1+0)
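
The rstats mismatch in the log above can be re-checked by hand. The sketch below assumes that in the n(...) notation "b" is the recursive byte count and "T=F+D" means total = files + subdirectories; that reading is inferred from the log lines, not confirmed here, and the names RStat and parse_rstat are made up for illustration.

```python
import re
from dataclasses import dataclass

# Matches e.g. "n(v0 rc2012-02-01 15:19:55.517733 b786 3=2+1)" and "n(v0 b393 1=1+0)".
RSTAT_RE = re.compile(r"n\(v\d+(?: rc\S+ \S+)? b(\d+) (\d+)=(\d+)\+(\d+)\)")


@dataclass(frozen=True)
class RStat:
    rbytes: int
    rfiles: int
    rsubdirs: int

    def __add__(self, other):
        return RStat(self.rbytes + other.rbytes,
                     self.rfiles + other.rfiles,
                     self.rsubdirs + other.rsubdirs)


def parse_rstat(text):
    """Extract the first n(...) group from a log fragment."""
    match = RSTAT_RE.search(text)
    if match is None:
        raise ValueError(f"no rstat found in {text!r}")
    rbytes, total, rfiles, rsubdirs = map(int, match.groups())
    if total != rfiles + rsubdirs:
        raise ValueError(f"inconsistent totals in {text!r}")
    return RStat(rbytes, rfiles, rsubdirs)


# Per-dentry values copied from the log lines for LXDE and LXDE.conf.
lxde = parse_rstat("n(v0 rc2012-02-01 15:19:55.517733 b393 2=1+1)")
lxde_conf = parse_rstat("n(v0 b393 1=1+0)")
child_total = lxde + lxde_conf
own = parse_rstat("my rstats: n(v3 rc2012-02-01 15:19:55.517733 b393 2=1+1)")

print(child_total)         # RStat(rbytes=786, rfiles=2, rsubdirs=1)
print(own)                 # RStat(rbytes=393, rfiles=1, rsubdirs=1)
print(child_total == own)  # False
```

The difference is exactly one file of 393 bytes, i.e. the LXDE.conf dentry, which the "loaded dup inode" error reports as the same inode as LXDE/pcmanfm.conf. That suggests the same file is being counted twice.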


Then I killed the active MDS; another took over, and suddenly the missing file
appeared:
ls -la .tmp/tiny14/.config/pcmanfm/LXDE/
insgesamt 1
drwxr-xr-x 1 32252 users 393  1. Feb 15:19 .
drwxr-xr-x 1 32252 users   0  1. Feb 17:21 ..
-rw-r--r-- 1 root  root  393 24. Jan 15:55 pcmanfm.conf

I restarted the original MDS, but it does not appear in "ceph mds dump",
although it is running at 100% CPU. The same happened with other MDS processes
after killing and restarting them; now I have only one left that is working
correctly.

I will leave the cluster in this state for now and have another look tomorrow -
maybe the spinning MDS processes recover by some miracle.

Amon Ott

* Re: Cannot delete some empty dirs and weird sizes
  2012-02-01 17:02   ` Amon Ott
@ 2012-02-02 21:09     ` Gregory Farnum
  2012-02-03  8:24       ` Amon Ott
  0 siblings, 1 reply; 5+ messages in thread
From: Gregory Farnum @ 2012-02-02 21:09 UTC (permalink / raw)
  To: Amon Ott; +Cc: ceph-devel, Sage Weil

On Wed, Feb 1, 2012 at 9:02 AM, Amon Ott <a.ott@m-privacy.de> wrote:
> On Tuesday 31 January 2012, Gregory Farnum wrote:
>> On Tue, Jan 31, 2012 at 4:00 AM, Amon Ott <a.ott@m-privacy.de> wrote:
>> > Hi again!
>> >
>> > We are running Ceph 0.41 and kernel 3.2.2 with current for-linus code
>> > (commit 3d882ce47de80e0294a536bec771b5651885b4d3) now.
>> >
>> > After some heavy workloads we see quite a few directories that cannot be
>> > deleted, although ls and find show that they are empty. rmdir says they
>> > are not empty.
>> >
>> > Additionally, ceph reports various weird size values for some, but not
>> > all of them:
>> > ls -la .tmp/tiny61/.mozilla/firefox/default.yat/
>> > insgesamt 0
>> > drwxr-xr-x 1 tiny61 users 18446744073705748665 25. Jan 10:02 .
>> > drwxr-xr-x 1 tiny61 users 18446744073705748665 25. Jan 10:02 ..
>> >
>> > Is this a known or a new bug? Can it be related to .snap pseudo dirs? The
>> > problem appeared without ever using snapshots, though.
>>
>> I believe this is new. Based on the odd sizes (that's a 64-bit -1
>> interpreted as unsigned, fyi), my guess is that the "recursive
>> accounting" statistics are off and that's leading the MDS to believe
>> the directory is not empty even though it is. It's unlikely to be
>> directly related to snapshots, though it's not impossible.
>>
>> Have you seen this on more than one MDS? If it's reproducible we could
>> more easily figure out the cause; otherwise the best we can do is to
>> maybe fix up the specific instance of it.
>
> I had to recreate ceph fs several times today because of kernel problems. Now
> I have only one dir that is wrong:
> ls -la .tmp/tiny14/.config/pcmanfm/LXDE/
> insgesamt 0
> drwxr-xr-x 1 32252 users 393  1. Feb 15:19 .
> drwxr-xr-x 1 32252 users   0  1. Feb 17:21 ..
>
> This is probably caused by another reboot I had to do, although I think ceph
> should have recovered here. Might also be caused by this setting that I tried
> for a while, it is off now:
> mds standby replay = true
> With this setting, if the active mds gets killed, no mds is able to become
> active, so everything hangs. Had to reboot again.

Hrm. That setting simply tells the non-active MDSes that they should
follow the journal of the active MDS(es). They should still go active
if the MDS they're following fails — although it does slightly
increase the chances of them running into the same bugs in code and
dying at the same time.


> Found that in mds log, the reported wrong size matches the dir total:
>
> 2012-02-01 17:21:51.306561 4f830b70 mds.0.cache.dir(1000000b055) _fetched
> badness: got (but i already had) [inode 100000066c9 [2,head]
> /tiny14/.config/pcmanfm/LXDE.conf auth v4 s=393 n(v0 b393 1=1+0)
> (iversion lock) cr={4711=0-4194304@1} caps={5313=pAsLsXsFscr/-@1} |
> caps 0x1d13c600] mode 33188 mtime 2012-01-24 15:55:59.000000
> 2012-02-01 17:21:51.306646 4f830b70 log [ERR] : loaded dup inode
> 100000066c9 [2,head] v7 at /tiny14/.config/pcmanfm/LXDE/pcmanfm.conf,
> but inode 100000066c9.head v4 already exists at
> /tiny14/.config/pcmanfm/LXDE.conf
> 2012-02-01 17:21:51.349424 4f830b70 mds.0.cache.dir(100000066ae) mismatch
> between head items and fnode.fragstat! printing dentries
> 2012-02-01 17:21:51.349457 4f830b70 mds.0.cache.dir(100000066ae)
> get_num_head_items() = 2; fnode.fragstat.nfiles=0 fnode.fragstat.nsubdirs=1
> 2012-02-01 17:21:51.349493 4f830b70 mds.0.cache.dir(100000066ae) [dentry
> #1/tiny14/.config/pcmanfm/LXDE [2,head] auth (dversion lock) pv=0 v=16
> inode=0x1cff3828 | inodepin 0x1b9f4de0]
> 2012-02-01 17:21:51.349521 4f830b70 mds.0.cache.dir(100000066ae) [dentry
> #1/tiny14/.config/pcmanfm/LXDE.conf [2,head] auth (dn xlock x=1 by
> 0x1ab21200) (dversion lock w=1 last_client=5313) pv=17 v=16 ap=2+2
> inode=0x1d13c600 | request lock inodepin authpin 0x1b90f064]
> 2012-02-01 17:21:51.349552 4f830b70 mds.0.cache.dir(100000066ae) mismatch
> between child accounted_rstats and my rstats!
> 2012-02-01 17:21:51.349573 4f830b70 mds.0.cache.dir(100000066ae) total of
> child dentrys: n(v0 rc2012-02-01 15:19:55.517733 b786 3=2+1)
> 2012-02-01 17:21:51.349591 4f830b70 mds.0.cache.dir(100000066ae) my rstats:
> n(v3 rc2012-02-01 15:19:55.517733 b393 2=1+1)
> 2012-02-01 17:21:51.349616 4f830b70 mds.0.cache.dir(100000066ae) [dentry
> #1/tiny14/.config/pcmanfm/LXDE [2,head] auth (dversion lock) pv=0 v=16
> inode=0x1cff3828 | inodepin 0x1b9f4de0] n(v0 rc2012-02-01 15:19:55.517733
> b393 2=1+1)
> 2012-02-01 17:21:51.349643 4f830b70 mds.0.cache.dir(100000066ae) [dentry
> #1/tiny14/.config/pcmanfm/LXDE.conf [2,head] auth (dn xlock x=1 by
> 0x1ab21200) (dversion lock w=1 last_client=5313) pv=17 v=16 ap=2+2
> inode=0x1d13c600 | request lock inodepin authpin 0x1b90f064] n(v0 b393 1=1+0)
>
>
> Then killed the active mds, another takes over and suddenly the missing file
> appears:
> ls -la .tmp/tiny14/.config/pcmanfm/LXDE/
> insgesamt 1
> drwxr-xr-x 1 32252 users 393  1. Feb 15:19 .
> drwxr-xr-x 1 32252 users   0  1. Feb 17:21 ..
> -rw-r--r-- 1 root  root  393 24. Jan 15:55 pcmanfm.conf

So were you able to delete that file once it reappeared?

> Restarted the original mds, it does not appear in "ceph mds dump", although it
> is running at 100% cpu. Same happened with other mds processes after killing
> and starting, now I have only one left that is working correctly.

Remind me, you do only have one active MDS, correct?
Did you look at the logs and see what the MDS was doing with 100% cpu?

> Will leave the cluster in this state now and have another look tomorrow -
> maybe the spinning mds processes recover by some miracle.

* Re: Cannot delete some empty dirs and weird sizes
  2012-02-02 21:09     ` Gregory Farnum
@ 2012-02-03  8:24       ` Amon Ott
  0 siblings, 0 replies; 5+ messages in thread
From: Amon Ott @ 2012-02-03  8:24 UTC (permalink / raw)
  To: Gregory Farnum; +Cc: ceph-devel, Sage Weil

On Thursday 02 February 2012, Gregory Farnum wrote:
> On Wed, Feb 1, 2012 at 9:02 AM, Amon Ott <a.ott@m-privacy.de> wrote:
> > ceph should have recovered here. Might also be caused by this setting
> > that I tried for a while, it is off now:
> > mds standby replay = true
> > With this setting, if the active mds gets killed, no mds is able to
> > become active, so everything hangs. Had to reboot again.
>
> Hrm. That setting simply tells the non-active MDSes that they should
> follow the journal of the active MDS(es). They should still go active
> if the MDS they're following fails — although it does slightly
> increase the chances of them running into the same bugs in code and
> dying at the same time.

Well, in this test the following MDS did not become active; it stayed stuck in
replay mode. Since takeover without standby replay is also fast enough for us,
I quickly turned it off again. I just wanted to mention that there might be
another bug lurking.

> > Then killed the active mds, another takes over and suddenly the missing
> > file appears:
> > ls -la .tmp/tiny14/.config/pcmanfm/LXDE/
> > insgesamt 1
> > drwxr-xr-x 1 32252 users 393  1. Feb 15:19 .
> > drwxr-xr-x 1 32252 users   0  1. Feb 17:21 ..
> > -rw-r--r-- 1 root  root  393 24. Jan 15:55 pcmanfm.conf
>
> So were you able to delete that file once it reappeared?

It got deleted correctly by the nightly cleanup cron job.

> > Restarted the original mds, it does not appear in "ceph mds dump",
> > although it is running at 100% cpu. Same happened with other mds
> > processes after killing and starting, now I have only one left that is
> > working correctly.
>
> Remind me, you do only have one active MDS, correct?
> Did you look at the logs and see what the MDS was doing with 100% cpu?

Four MDS defined, but using
max mds = 1

> > Will leave the cluster in this state now and have another look tomorrow -
> > maybe the spinning mds processes recover by some miracle.

After yet another complete reboot it seems to be stable again now. So far I
can say that Ceph FS runs fairly stably under the following conditions:
- 4 nodes, each with 8 to 12 CPU cores and 12 GB of RAM (CPUs and RAM mostly
  unused by Ceph)
- Ceph 0.41
- 3 mon, 4 mds (max mds 1), 4 osd; all 4 nodes are both server and client
- Kernel 3.1.10 with some hand-integrated patches for Ceph fixes (3.2.2 was
  unstable, need to check)
- OSD storage on a normal hard drive, ext4 without journal (btrfs crashes
  about once per day)
- OSD journals on a separate SSD
- MDS does not get restarted without a reboot (yes, only a reboot helps)

From our experience, it could be a bit faster when writing to many different
files, which is our main workload. However, the last few versions have already
brought significant improvements in stability and speed. The main problem we
see is MDS takeover and recovery, which has given us a lot of trouble so far.

Thanks once more for your good work!

Amon Ott
