* nilfs2 weird issue - snapshots are gone, cleanerd not running
@ 2012-07-09  7:33 Piotr Szymaniak
  2012-07-09  9:28 ` Vyacheslav Dubeyko
  2012-07-09  9:33 ` dexen deVries
  0 siblings, 2 replies; 20+ messages in thread
From: Piotr Szymaniak @ 2012-07-09  7:33 UTC
  To: linux-nilfs-u79uwXL29TY76Z2rM5mHXA

Hi.

I upgraded nilfs-utils (running Gentoo) on 29 July. Today I ran out
of space on my / and found that nilfs_cleanerd isn't working. When I
start it from the command line it exits instantly. Also, all previous
checkpoints on / (and on two other mountpoints on a different machine)
are gone.

What did I do? I downgraded nilfs-utils to 2.1.1 and remounted the
mountpoints. On the second machine the cleaner is running fine (it
cleaned _all_ checkpoints); on the first one, the machine with the disk
space issue, it exits just as it did with 2.1.3.

Here are some fs details. Machine with disk space issues, rootfs:

     CNO        DATE     TIME  MODE  FLG   NBLKINC       ICNT
  147688  2012-07-09 08:38:14   cp    -      11075     242915
  147689  2012-07-09 08:38:14   cp    -         60     242895
  (…)
  148999  2012-07-09 09:13:46   cp    -         60     242888
  149000  2012-07-09 09:19:45   cp    -         44     242888

  Filesystem      Size  Used Avail Use% Mounted on
  rootfs           24G   13G   11G  56% /

mount shows:
  /dev/sda2 on / type nilfs2 (rw,noatime,nodiratime,gcpid=15356)

There's no nilfs_cleanerd with pid 15356.


Second machine rootfs:

     CNO        DATE     TIME  MODE  FLG   NBLKINC       ICNT
   92246  2012-07-09 08:16:58   cp    -        118      44669
  (…)
   92439  2012-07-09 09:19:14   cp    -         29      44668
   92440  2012-07-09 09:19:46   cp    -         33      44668

  Filesystem      Size  Used Avail Use% Mounted on
  rootfs          3.7G  888M  2.6G  26% /

(it should be around 3G used)

Second machine second mountpoint:

     CNO        DATE     TIME  MODE  FLG   NBLKINC       ICNT
    1496  2012-07-09 03:31:23   cp    -       8837     132766
    1497  2012-07-09 03:31:26   cp    -        468     132766
    1498  2012-07-09 03:41:27   cp    -       1474     132765

(this fs should contain *all* 1498 checkpoints)

  Filesystem      Size  Used Avail Use% Mounted on
  /dev/dm-2       117G   58G   54G  76% /mnt/home_backup

(on this one there should be around 100G of used space)

mount:
  /dev/dm-2 on /mnt/home_backup type nilfs2 (rw,gcpid=13135)
  /dev/sda3 on / type nilfs2 (rw,noatime,nodiratime,gcpid=1363)

Both cleaners are running (the second mountpoint, /mnt/home_backup, is
under heavy load and I suppose it will end up with around 20G of used
space).

Where do I go from here? How can I debug the nilfs_cleanerd issue?


Piotr Szymaniak.
-- 
We built a cage for ourselves - civilization - because we were capable
of thinking, and now we have to think, because we are locked in this
cage.
  -- William Wharton, "Birdy"
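The check described above (read the gcpid option from the mount output, then see whether that process still exists) can be sketched in a few lines of Python. This is only an illustration: the function names parse_gcpids and pid_is_alive are made up for this sketch, the sample text is copied from the report, and the liveness test only confirms that some process owns the PID, not that it is nilfs_cleanerd.

```python
import os
import re

# One line of `mount` output for a nilfs2 filesystem, e.g.
#   /dev/sda2 on / type nilfs2 (rw,noatime,nodiratime,gcpid=15356)
NILFS_MOUNT = re.compile(r"^\S+ on (\S+) type nilfs2 \(([^)]*)\)", re.MULTILINE)


def parse_gcpids(mount_output):
    """Return {mountpoint: gcpid} for every nilfs2 entry that has a gcpid option."""
    result = {}
    for mountpoint, options in NILFS_MOUNT.findall(mount_output):
        for option in options.split(","):
            if option.startswith("gcpid="):
                result[mountpoint] = int(option[len("gcpid="):])
    return result


def pid_is_alive(pid):
    """True if /proc/<pid> exists (Linux only; does not check the process name)."""
    return os.path.exists("/proc/%d" % pid)


# Sample taken from the mount lines quoted in the report.
SAMPLE = """\
/dev/sda2 on / type nilfs2 (rw,noatime,nodiratime,gcpid=15356)
/dev/dm-2 on /mnt/home_backup type nilfs2 (rw,gcpid=13135)
"""

if __name__ == "__main__":
    for mountpoint, pid in sorted(parse_gcpids(SAMPLE).items()):
        state = "running" if pid_is_alive(pid) else "NOT running"
        print("%s: cleaner pid %d is %s" % (mountpoint, pid, state))
```

On a live system the SAMPLE string would be replaced by the real output of mount; a stale gcpid, as on the first machine, then shows up as "NOT running".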
* Re: nilfs2 weird issue - snapshots are gone, cleanerd not running
  2012-07-09  7:33 nilfs2 weird issue - snapshots are gone, cleanerd not running Piotr Szymaniak
@ 2012-07-09  9:28 ` Vyacheslav Dubeyko
  2012-07-09 16:56   ` Piotr Szymaniak
  2012-07-09  9:33 ` dexen deVries
  1 sibling, 1 reply; 20+ messages in thread
From: Vyacheslav Dubeyko @ 2012-07-09  9:28 UTC
  To: Piotr Szymaniak; +Cc: linux-nilfs-u79uwXL29TY76Z2rM5mHXA

Hi Piotr,

Do the system journals on your machines contain any interesting details
about the reported issue? Could you extract any error or warning
messages from the system journal?

Thanks,
Vyacheslav Dubeyko.

--
To unsubscribe from this list: send the line "unsubscribe linux-nilfs" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
* Re: nilfs2 weird issue - snapshots are gone, cleanerd not running
  2012-07-09  9:28 ` Vyacheslav Dubeyko
@ 2012-07-09 16:56   ` Piotr Szymaniak
  2012-07-09 18:55     ` Vyacheslav Dubeyko
  0 siblings, 1 reply; 20+ messages in thread
From: Piotr Szymaniak @ 2012-07-09 16:56 UTC
  To: Vyacheslav Dubeyko; +Cc: linux-nilfs-u79uwXL29TY76Z2rM5mHXA

On Mon, Jul 09, 2012 at 01:28:32PM +0400, Vyacheslav Dubeyko wrote:
> Hi Piotr,
>
> Do the system journals on your machines contain any interesting details
> about the reported issue? Could you extract any error or warning
> messages from the system journal?

(resent, as I replied only to Vyacheslav)

If by journals you mean logs, then no. All I can find are entries like
these:

Jul 3 10:32:45 wloczykij nilfs_cleanerd[1434]: resume (clean check)
Jul 3 10:41:37 wloczykij nilfs_cleanerd[1434]: pause (clean check)

That's everything about nilfs from the last week; the current log
contains only the manual runs related to the operations described
earlier.

Piotr Szymaniak.
-- 
Marriage is like a coffin and each kid is like another nail.
  -- Homer Simpson
* Re: nilfs2 weird issue - snapshots are gone, cleanerd not running
  2012-07-09 16:56   ` Piotr Szymaniak
@ 2012-07-09 18:55     ` Vyacheslav Dubeyko
       [not found]       ` <51D5FCEA-7103-4D4A-BADA-99A9780D9B68-yeENwD64cLxBDgjK7y7TUQ@public.gmane.org>
  0 siblings, 1 reply; 20+ messages in thread
From: Vyacheslav Dubeyko @ 2012-07-09 18:55 UTC
  To: Piotr Szymaniak; +Cc: linux-nilfs-u79uwXL29TY76Z2rM5mHXA

Hi Piotr,

You are right. I can reproduce this issue easily. nilfs_cleanerd really
does not start during mount.

I can see some suspicious strace output when mounting and then trying to
start nilfs_cleanerd:

....
set_tid_address(0xb76a0768) = 21036
set_robust_list(0xb76a0770, 0xc) = 0
futex(0xbfdd4f90, FUTEX_WAKE_PRIVATE, 1) = 0
futex(0xbfdd4f90, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 1, NULL, bfdd4fa0) = -1 EAGAIN (Resource temporarily unavailable)

....
mq_open("nilfs-cleanerq-2066", O_RDONLY|O_CREAT, 0600, {mq_maxmsg=6, mq_msgsize=4096}) = -1 ENOSYS (Function not implemented)

But maybe this is not the cause of the problem. The issue needs deeper
investigation.

Thanks,
Vyacheslav Dubeyko.
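Picking the failing calls out of a long strace log by eye is error-prone, so here is a minimal Python sketch that lists every syscall that returned -1 together with its errno name. The function name failed_syscalls and the regular expression are assumptions made for this sketch; it expects the classic one-call-per-line strace text format shown above and is not a general strace parser.

```python
import re

# Matches lines such as
#   name(args...) = -1 ERRNO (description)
# optionally prefixed by "[pid 1234]" as printed by `strace -f`.
FAILED_CALL = re.compile(
    r"^(?:\[pid\s+\d+\]\s+)?(\w+)\(.*\)\s+= -1 (E[A-Z0-9]+) \((.*)\)\s*$"
)


def failed_syscalls(strace_text):
    """Return a list of (syscall, errno_name, description) for failed calls."""
    failures = []
    for line in strace_text.splitlines():
        match = FAILED_CALL.match(line.strip())
        if match:
            failures.append(match.groups())
    return failures


# Sample copied from the strace excerpt in the message above.
SAMPLE = """\
set_tid_address(0xb76a0768) = 21036
set_robust_list(0xb76a0770, 0xc) = 0
futex(0xbfdd4f90, FUTEX_WAKE_PRIVATE, 1) = 0
futex(0xbfdd4f90, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 1, NULL, bfdd4fa0) = -1 EAGAIN (Resource temporarily unavailable)
mq_open("nilfs-cleanerq-2066", O_RDONLY|O_CREAT, 0600, {mq_maxmsg=6, mq_msgsize=4096}) = -1 ENOSYS (Function not implemented)
"""

if __name__ == "__main__":
    for name, errno_name, description in failed_syscalls(SAMPLE):
        print("%-10s %-8s %s" % (name, errno_name, description))
```

Run against the excerpt, this reports the futex call (EAGAIN) and the mq_open call (ENOSYS), the two lines flagged as suspicious in the message.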
[parent not found: <51D5FCEA-7103-4D4A-BADA-99A9780D9B68-yeENwD64cLxBDgjK7y7TUQ@public.gmane.org>]
* Re: nilfs2 weird issue - snapshots are gone, cleanerd not running
       [not found]       ` <51D5FCEA-7103-4D4A-BADA-99A9780D9B68-yeENwD64cLxBDgjK7y7TUQ@public.gmane.org>
@ 2012-07-10  1:53         ` Ryusuke Konishi
       [not found]           ` <20120710.105315.33988123.konishi.ryusuke-Zyj7fXuS5i5L9jVzuh4AOg@public.gmane.org>
  0 siblings, 1 reply; 20+ messages in thread
From: Ryusuke Konishi @ 2012-07-10  1:53 UTC
  To: slava-yeENwD64cLxBDgjK7y7TUQ
  Cc: szarpaj-TbOm9Ca2r9GrDJvtcaxF/A, linux-nilfs-u79uwXL29TY76Z2rM5mHXA

Hi Vyacheslav,

On Mon, 9 Jul 2012 22:55:40 +0400, Vyacheslav Dubeyko wrote:
> Hi Piotr,
>
> You are right. I can reproduce this issue easily. nilfs_cleanerd really
> does not start during mount.
>
> I can see some suspicious strace output when mounting and then trying to
> start nilfs_cleanerd:
>
> ....
> set_tid_address(0xb76a0768) = 21036
> set_robust_list(0xb76a0770, 0xc) = 0
> futex(0xbfdd4f90, FUTEX_WAKE_PRIVATE, 1) = 0
> futex(0xbfdd4f90, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 1, NULL, bfdd4fa0) = -1 EAGAIN (Resource temporarily unavailable)
>
> ....
> mq_open("nilfs-cleanerq-2066", O_RDONLY|O_CREAT, 0600, {mq_maxmsg=6, mq_msgsize=4096}) = -1 ENOSYS (Function not implemented)
>
> But maybe this is not the cause of the problem. The issue needs deeper
> investigation.

Your problem looks like FAQ #8 at http://www.nilfs.org/en/faq.html

> 8. cleanerd (or chcp/mkcp command) fails with an error: ``cannot open
>    nilfs on /dev/xxx: Function not implemented''.
>
>    Confirm whether tmpfs (former shm fs) is mounted on /dev/shm. POSIX
>    semaphores do not work if the filesystem on /dev/shm is wrong,
>    which causes the above failure.
>
>    Some systems are using ramfs instead of tmpfs. You may need to
>    change kernel configuration and rebuild kernel to enable tmpfs.

Please confirm whether tmpfs is mounted on /dev/shm.

The same issue is reported in the following thread:

  http://marc.info/?t=133190016900003&r=1&w=2


Regards,
Ryusuke Konishi
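The check requested above (is tmpfs mounted on /dev/shm?) can be done by hand with mount, or scripted. Below is a minimal Python sketch that reads text in /proc/mounts format and reports the filesystem type at a mountpoint. The function name filesystem_type is invented for this sketch, and the sample entries are illustrative rather than taken from any machine in this thread.

```python
import os


def filesystem_type(mounts_text, mountpoint):
    """Return the fstype mounted on `mountpoint`, or None if nothing is.

    `mounts_text` uses the /proc/mounts layout:
        device mountpoint fstype options dump pass
    If several entries share a mountpoint, the last one wins, because a
    later mount shadows an earlier one.
    """
    fstype = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 3 and fields[1] == mountpoint:
            fstype = fields[2]
    return fstype


# Illustrative sample only (hypothetical entries in /proc/mounts layout).
SAMPLE = """\
rootfs / rootfs rw 0 0
udev /dev tmpfs rw,nosuid,relatime,size=10240k,mode=755 0 0
shm /dev/shm tmpfs rw,nosuid,nodev,noexec 0 0
"""

if __name__ == "__main__":
    text = SAMPLE
    if os.path.exists("/proc/mounts"):
        with open("/proc/mounts") as handle:
            text = handle.read()
    found = filesystem_type(text, "/dev/shm")
    if found == "tmpfs":
        print("/dev/shm is tmpfs")
    else:
        print("/dev/shm is %s; FAQ #8 may apply" % (found or "not mounted"))
```

A result other than tmpfs (for example ramfs, or no entry at all) would point at the situation FAQ #8 describes.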
[parent not found: <20120710.105315.33988123.konishi.ryusuke-Zyj7fXuS5i5L9jVzuh4AOg@public.gmane.org>]
* Re: nilfs2 weird issue - snapshots are gone, cleanerd not running [not found] ` <20120710.105315.33988123.konishi.ryusuke-Zyj7fXuS5i5L9jVzuh4AOg@public.gmane.org> @ 2012-07-10 7:18 ` Vyacheslav Dubeyko 2012-07-10 8:51 ` Ryusuke Konishi 0 siblings, 1 reply; 20+ messages in thread From: Vyacheslav Dubeyko @ 2012-07-10 7:18 UTC (permalink / raw) To: Ryusuke Konishi Cc: szarpaj-TbOm9Ca2r9GrDJvtcaxF/A, linux-nilfs-u79uwXL29TY76Z2rM5mHXA [-- Attachment #1: Type: text/plain, Size: 7469 bytes --] Hi Ryusuke, Unfortunately, my kernel is compiled with enabled CONFIG_SHMEM and CONFIG_TMPFS options. Moreover, I can see that /dev/shm mounted with tmpfs type: linux-2.6.34-gentoo-r6 # mount /dev/sda3 on / type xfs (rw,noatime) proc on /proc type proc (rw,nosuid,nodev,noexec,relatime) sysfs on /sys type sysfs (rw,nosuid,nodev,noexec,relatime) udev on /dev type tmpfs (rw,nosuid,relatime,size=10240k,mode=755) devpts on /dev/pts type devpts (rw,nosuid,noexec,relatime,gid=5,mode=620) shm on /dev/shm type tmpfs (rw,noexec,nosuid,nodev) usbfs on /proc/bus/usb type usbfs (rw,noexec,nosuid,devmode=0664,devgid=85) /dev/sdb2 on /mnt/nilfs2 type nilfs2 (rw,gcpid=6331) And it is possible to see it from strace output also (tmpfs magic - 0x1021994): statfs("/dev/shm", {f_type=0x1021994, f_bsize=4096, f_blocks=64251, f_bfree=64250, f_bavail=64250, f_files=64251, f_ffree=64249, f_fsid={0, 0}, f_namelen=255, f_frsize=4096}) = 0 However, I have not last stable kernel on its machine (2.6.34-gentoo-r6). Maybe, it can make some side effects. I am going to continue investigation of the issue. I attach to the e-mail full strace output (10-07-2012-nilfs-cleanerd-issue.txt). With the best regards, Vyacheslav Dubeyko. On Tue, 2012-07-10 at 10:53 +0900, Ryusuke Konishi wrote: > Hi Vyacheslav, > On Mon, 9 Jul 2012 22:55:40 +0400, Vyacheslav Dubeyko wrote: > > Hi Piotr, > > > > You are right. I can reproduce this issue very simply. The nilfs_cleanerd doesn't started during mount really. 
> > > > I can detect some suspicious output of strace during mount and next trying to start of nilfs_cleanerd: > > > > .... > > set_tid_address(0xb76a0768) = 21036 > > set_robust_list(0xb76a0770, 0xc) = 0 > > futex(0xbfdd4f90, FUTEX_WAKE_PRIVATE, 1) = 0 > > futex(0xbfdd4f90, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 1, NULL, bfdd4fa0) = -1 EAGAIN (Resource temporarily unavailable) > > > > .... > > mq_open("nilfs-cleanerq-2066", O_RDONLY|O_CREAT, 0600, {mq_maxmsg=6, mq_msgsize=4096}) = -1 ENOSYS (Function not implemented) > > > > But maybe it is not reason of the problem. It needs to investigate the issue more deeply. > > Your problem looks that of FAQ #8 on http://www.nilfs.org/en/faq.html > > > 8. cleanerd (or chcp/mkcp command) fails with an error: ``cannot open > > nilfs on /dev/xxx: Function not implemented''. > > > > Confirm whether tmpfs (former shm fs) is mounted on /dev/shm. POSIX > > semaphores do not work if the filesystem on /dev/shm is wrong, > > which causes the above failure. > > > > Some systems are using ramfs instead of tmpfs. You may need to > > change kernel configuration and rebuild kernel to enable tmpfs. > > Please confirm if tmpfs is mounted on /dev/shm. > > The same issue is reported on the following thread: > > http://marc.info/?t=133190016900003&r=1&w=2 > > > Regards, > Ryusuke Konishi > > > Thanks, > > Vyacheslav Dubeyko. > > > > On Jul 9, 2012, at 8:56 PM, Piotr Szymaniak wrote: > > > > > On Mon, Jul 09, 2012 at 01:28:32PM +0400, Vyacheslav Dubeyko wrote: > > >> Hi Piotr, > > >> > > >> Does system journals on your machines contain any interested details > > >> about reported issue? Could you try to extract some error or warning > > >> messages from system journal? > > > > > > (resend as I replied only to Vyacheslav) > > > > > > If by journals you mean logs then no. 
I'm only able to find some like > > > this: > > > Jul 3 10:32:45 wloczykij nilfs_cleanerd[1434]: resume (clean check) > > > Jul 3 10:41:37 wloczykij nilfs_cleanerd[1434]: pause (clean check) > > > > > > That's all about nilfs in the last week and current log has only manual > > > runs related to those operation described before. > > > > > > Piotr Szymaniak. > > > > > > > > >> On Mon, 2012-07-09 at 09:33 +0200, Piotr Szymaniak wrote: > > >>> Hi. > > >>> > > >>> I've upgraded nilfs-utils (running Gentoo) on 29 july. Today I ran out > > >>> of space on my / and found that nilfs_cleanerd isn't working. When I > > >>> start it from the command line it exits instantly. Also, all previous > > >>> checkpoints on / (also on two other mountpoints on different machine) > > >>> are gone. > > >>> > > >>> What I did? Downgraded nilfs-utils to 2.1.1, remounted mountpoints. On > > >>> the second machine it's runnig fine (cleaned _all_ checkpoints), on the > > >>> first one with disk space issue it exits just like 2.1.3. > > >>> > > >>> Here are some fs details. Machine with disk space issues, rootfs: > > >>> CNO DATE TIME MODE FLG NBLKINC ICNT > > >>> 147688 2012-07-09 08:38:14 cp - 11075 242915 > > >>> 147689 2012-07-09 08:38:14 cp - 60 242895 > > >>> (…) > > >>> 148999 2012-07-09 09:13:46 cp - 60 242888 > > >>> 149000 2012-07-09 09:19:45 cp - 44 242888 > > >>> > > >>> Filesystem Size Used Avail Use% Mounted on > > >>> rootfs 24G 13G 11G 56% / > > >>> > > >>> mount shows: > > >>> /dev/sda2 on / type nilfs2 (rw,noatime,nodiratime,gcpid=15356) > > >>> > > >>> There's no nilfs_cleanerd with pid 15356. 
> > >>> > > >>> > > >>> Second machine rootfs: > > >>> CNO DATE TIME MODE FLG NBLKINC ICNT > > >>> 92246 2012-07-09 08:16:58 cp - 118 44669 > > >>> (…) > > >>> 92439 2012-07-09 09:19:14 cp - 29 44668 > > >>> 92440 2012-07-09 09:19:46 cp - 33 44668 > > >>> > > >>> Filesystem Size Used Avail Use% Mounted on > > >>> rootfs 3.7G 888M 2.6G 26% / > > >>> > > >>> (it should be around 3G used) > > >>> > > >>> Second machine second mountpoint: > > >>> CNO DATE TIME MODE FLG NBLKINC ICNT > > >>> 1496 2012-07-09 03:31:23 cp - 8837 132766 > > >>> 1497 2012-07-09 03:31:26 cp - 468 132766 > > >>> 1498 2012-07-09 03:41:27 cp - 1474 132765 > > >>> > > >>> (this fs should containt *all* 1498 checkpoints) > > >>> > > >>> Filesystem Size Used Avail Use% Mounted on > > >>> /dev/dm-2 117G 58G 54G 76% /mnt/home_backup > > >>> > > >>> (in this one it should be around 100G of used space) > > >>> > > >>> mount: > > >>> /dev/dm-2 on /mnt/home_backup type nilfs2 (rw,gcpid=13135) > > >>> /dev/sda3 on / type nilfs2 (rw,noatime,nodiratime,gcpid=1363) > > >>> > > >>> Both cleaners running (the second mountpoint - /mnt/home_backup - is under > > >>> heavy load and I suppose it will end with around 20G used space). > > >>> > > >>> Where to go from this point? How to debug nilfs_cleanerd issue? > > >>> > > >>> > > >>> Piotr Szymaniak. > > >> > > >> > > >> -- > > >> To unsubscribe from this list: send the line "unsubscribe linux-nilfs" in > > >> the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org > > >> More majordomo info at http://vger.kernel.org/majordomo-info.html > > > > > > -- > > > Marriage is like a coffin and each kid is like another nail. 
> > > -- Homer Simpson > > > > -- > > To unsubscribe from this list: send the line "unsubscribe linux-nilfs" in > > the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org > > More majordomo info at http://vger.kernel.org/majordomo-info.html [-- Attachment #2: 10-07-2012-nilfs-cleanerd-issue.txt --] [-- Type: text/plain, Size: 9213 bytes --] slavad-gentoo-pc linux-2.6.34-gentoo-r6 # strace -f nilfs_cleanerd execve("/sbin/nilfs_cleanerd", ["nilfs_cleanerd"], [/* 27 vars */]) = 0 brk(0) = 0x8051000 mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7702000 access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory) open("/etc/ld.so.cache", O_RDONLY) = 3 fstat64(3, {st_mode=S_IFREG|0644, st_size=13973, ...}) = 0 mmap2(NULL, 13973, PROT_READ, MAP_PRIVATE, 3, 0) = 0xb76fe000 close(3) = 0 open("/usr/lib/libnilfs.so.0", O_RDONLY) = 3 read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0\340\22\0\0004\0\0\0"..., 512) = 512 fstat64(3, {st_mode=S_IFREG|0755, st_size=58978, ...}) = 0 mmap2(NULL, 21704, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0xb76f8000 mmap2(0xb76fc000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x3) = 0xb76fc000 close(3) = 0 open("/usr/lib/libnilfsgc.so.0", O_RDONLY) = 3 read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0\320\16\0\0004\0\0\0"..., 512) = 512 fstat64(3, {st_mode=S_IFREG|0755, st_size=41262, ...}) = 0 mmap2(NULL, 16568, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0xb76f3000 mmap2(0xb76f6000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x2) = 0xb76f6000 close(3) = 0 open("/lib/librt.so.1", O_RDONLY) = 3 read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0`\31\0\0004\0\0\0"..., 512) = 512 fstat64(3, {st_mode=S_IFREG|0755, st_size=30552, ...}) = 0 mmap2(NULL, 33392, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0xb76ea000 mmap2(0xb76f1000, 8192, PROT_READ|PROT_WRITE, 
MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x6) = 0xb76f1000 close(3) = 0 open("/lib/libuuid.so.1", O_RDONLY) = 3 read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0P\21\0\0004\0\0\0"..., 512) = 512 fstat64(3, {st_mode=S_IFREG|0755, st_size=13872, ...}) = 0 mmap2(NULL, 16604, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0xb76e5000 mmap2(0xb76e8000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x2) = 0xb76e8000 close(3) = 0 open("/lib/libc.so.6", O_RDONLY) = 3 read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0\20m\1\0004\0\0\0"..., 512) = 512 fstat64(3, {st_mode=S_IFREG|0755, st_size=1323292, ...}) = 0 mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb76e4000 mmap2(NULL, 1333544, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0xb759e000 mmap2(0xb76de000, 12288, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x13f) = 0xb76de000 mmap2(0xb76e1000, 10536, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0xb76e1000 close(3) = 0 open("/lib/libpthread.so.0", O_RDONLY) = 3 read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0\0J\0\0004\0\0\0"..., 512) = 512 fstat64(3, {st_mode=S_IFREG|0755, st_size=116790, ...}) = 0 mmap2(NULL, 98796, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0xb7585000 mmap2(0xb759a000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x14) = 0xb759a000 mmap2(0xb759c000, 4588, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0xb759c000 close(3) = 0 mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7584000 set_thread_area({entry_number:-1 -> 6, base_addr:0xb7584700, limit:1048575, seg_32bit:1, contents:0, read_exec_only:0, limit_in_pages:1, seg_not_present:0, useable:1}) = 0 mprotect(0xb759a000, 4096, PROT_READ) = 0 mprotect(0xb76de000, 8192, PROT_READ) = 0 mprotect(0xb76e8000, 4096, PROT_READ) = 0 mprotect(0xb76f1000, 4096, PROT_READ) = 0 mprotect(0xb76f6000, 
4096, PROT_READ) = 0 mprotect(0xb76fc000, 4096, PROT_READ) = 0 mprotect(0x804e000, 4096, PROT_READ) = 0 mprotect(0xb7720000, 4096, PROT_READ) = 0 munmap(0xb76fe000, 13973) = 0 set_tid_address(0xb7584768) = 3805 set_robust_list(0xb7584770, 0xc) = 0 futex(0xbfcde720, FUTEX_WAKE_PRIVATE, 1) = 0 futex(0xbfcde720, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 1, NULL, bfcde730) = -1 EAGAIN (Resource temporarily unavailable) rt_sigaction(SIGRTMIN, {0xb75893e0, [], SA_SIGINFO}, NULL, 8) = 0 rt_sigaction(SIGRT_1, {0xb75898d0, [], SA_RESTART|SA_SIGINFO}, NULL, 8) = 0 rt_sigprocmask(SIG_UNBLOCK, [RTMIN RT_1], NULL, 8) = 0 getrlimit(RLIMIT_STACK, {rlim_cur=8192*1024, rlim_max=RLIM_INFINITY}) = 0 uname({sys="Linux", node="slavad-gentoo-pc", ...}) = 0 clone(Process 3806 attached child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0xb7584768) = 3806 [pid 3805] exit_group(0) = ? setsid() = 3806 chdir("/") = 0 close(0) = 0 close(1) = 0 close(2) = 0 open("/dev/null", O_RDONLY|O_LARGEFILE) = 0 open("/dev/null", O_WRONLY|O_LARGEFILE) = 1 open("/dev/null", O_WRONLY|O_LARGEFILE) = 2 brk(0) = 0x8051000 brk(0x8072000) = 0x8072000 time(NULL) = 1337504136 open("/etc/localtime", O_RDONLY) = 3 fstat64(3, {st_mode=S_IFREG|0644, st_size=2194, ...}) = 0 fstat64(3, {st_mode=S_IFREG|0644, st_size=2194, ...}) = 0 mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7701000 read(3, "TZif2\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\f\0\0\0\f\0\0\0\0"..., 4096) = 2194 _llseek(3, -27, [2167], SEEK_CUR) = 0 read(3, "\nMSK-3MSD,M3.5.0,M10.5.0/3\n", 4096) = 27 close(3) = 0 munmap(0xb7701000, 4096) = 0 socket(PF_FILE, SOCK_DGRAM|SOCK_CLOEXEC, 0) = 3 connect(3, {sa_family=AF_FILE, path="/dev/log"}, 110) = -1 EPROTOTYPE (Protocol wrong type for socket) close(3) = 0 socket(PF_FILE, SOCK_STREAM|SOCK_CLOEXEC, 0) = 3 connect(3, {sa_family=AF_FILE, path="/dev/log"}, 110) = 0 send(3, "<30>May 20 12:55:36 nilfs_cleane"..., 48, MSG_NOSIGNAL) = 48 
open("/proc/mounts", O_RDONLY|O_LARGEFILE) = 4
fstat64(4, {st_mode=S_IFREG|0444, st_size=0, ...}) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7701000
read(4, "rootfs / rootfs rw 0 0\n/dev/sda3"..., 1024) = 457
close(4) = 0
munmap(0xb7701000, 4096) = 0
open("/dev/sdb2", O_RDONLY|O_LARGEFILE) = 4
ioctl(4, BLKGETSIZE64, 1085736960) = 0
_llseek(4, 1024, [1024], SEEK_SET) = 0
read(4, "\2\0\0\0\0\00044\30\1\0\0\373\363\213N\324d\270\0\2\0\0\0\201\0\0\0\0\0\0\0"..., 1024) = 1024
_llseek(4, 1085730816, [1085730816], SEEK_SET) = 0
read(4, "\2\0\0\0\0\00044\30\1\0\0\373\363\213N\324d\270\0\2\0\0\0\201\0\0\0\0\0\0\0"..., 1024) = 1024
open("/proc/mounts", O_RDONLY|O_LARGEFILE) = 5
fstat64(5, {st_mode=S_IFREG|0444, st_size=0, ...}) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7701000
read(5, "rootfs / rootfs rw 0 0\n/dev/sda3"..., 1024) = 457
close(5) = 0
munmap(0xb7701000, 4096) = 0
open("/mnt/nilfs2", O_RDONLY|O_LARGEFILE) = 5
stat64("/dev/sdb2", {st_mode=S_IFBLK|0660, st_rdev=makedev(8, 18), ...}) = 0
statfs("/dev/shm", {f_type=0x1021994, f_bsize=4096, f_blocks=64251, f_bfree=64250, f_bavail=64250, f_files=64251, f_ffree=64249, f_fsid={0, 0}, f_namelen=255, f_frsize=4096}) = 0
futex(0xb759d188, FUTEX_WAKE_PRIVATE, 2147483647) = 0
open("/dev/shm/sem.nilfs-cleaner-2066", O_RDWR|O_NOFOLLOW) = 7
fstat64(7, {st_mode=S_IFREG|0700, st_size=16, ...}) = 0
mmap2(NULL, 16, PROT_READ|PROT_WRITE, MAP_SHARED, 7, 0) = 0xb7701000
close(7) = 0
stat64("/etc/nilfs_cleanerd.conf", {st_mode=S_IFREG|0644, st_size=1813, ...}) = 0
open("/etc/nilfs_cleanerd.conf", O_RDONLY|O_LARGEFILE) = 7
fstat64(7, {st_mode=S_IFREG|0644, st_size=1813, ...}) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7700000
read(7, "# nilfs_cleanerd.conf - configur"..., 4096) = 1813
read(7, "", 4096) = 0
close(7) = 0
munmap(0xb7700000, 4096) = 0
stat64("/dev/sdb2", {st_mode=S_IFBLK|0660, st_rdev=makedev(8, 18), ...}) = 0
mq_open("nilfs-cleanerq-2066", O_RDONLY|O_CREAT, 0600, {mq_maxmsg=6, mq_msgsize=4096}) = -1 ENOSYS (Function not implemented)
time(NULL) = 1337504136
send(3, "<27>May 20 12:55:36 nilfs_cleane"..., 102, MSG_NOSIGNAL) = 102
munmap(0xb7701000, 16) = 0
close(4) = 0
close(5) = 0
time(NULL) = 1337504136
send(3, "<27>May 20 12:55:36 nilfs_cleane"..., 101, MSG_NOSIGNAL) = 101
time(NULL) = 1337504136
send(3, "<30>May 20 12:55:36 nilfs_cleane"..., 51, MSG_NOSIGNAL) = 51
close(3) = 0
exit_group(1) = ?
Process 3806 detached

^ permalink raw reply [flat|nested] 20+ messages in thread
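To pick the failing calls out of a long trace like the one above, a small filter can help. The following is only a minimal sketch and not part of nilfs-utils: it assumes the strace output has been saved one call per line (as strace itself prints it), and the names `FAILED_CALL` and `failed_syscalls` are made up for this illustration.

```python
import re

# Matches a strace line whose return value is -1, for example:
#   mq_open("q", O_RDONLY|O_CREAT, ...) = -1 ENOSYS (Function not implemented)
# An optional "[pid N]" prefix (printed when strace follows children) is skipped.
FAILED_CALL = re.compile(
    r"^(?:\[pid\s+\d+\]\s+)?"      # optional "[pid 3805] " prefix
    r"(?P<name>\w+)\(.*\)"         # syscall name and its arguments
    r"\s+=\s+-1\s+"                # failed return value
    r"(?P<errno>E[A-Z0-9]+)\s+"    # symbolic errno, e.g. ENOSYS
    r"\((?P<message>[^)]*)\)\s*$"  # human-readable error text
)


def failed_syscalls(trace_text):
    """Return (syscall, errno, message) for every failed call in the trace."""
    failures = []
    for line in trace_text.splitlines():
        match = FAILED_CALL.match(line.strip())
        if match:
            failures.append(
                (match.group("name"), match.group("errno"), match.group("message"))
            )
    return failures


if __name__ == "__main__":
    # Three calls copied from the trace above; only two of them failed.
    sample = "\n".join([
        'connect(3, {sa_family=AF_FILE, path="/dev/log"}, 110) = -1 EPROTOTYPE (Protocol wrong type for socket)',
        'stat64("/dev/sdb2", {st_mode=S_IFBLK|0660, st_rdev=makedev(8, 18), ...}) = 0',
        'mq_open("nilfs-cleanerq-2066", O_RDONLY|O_CREAT, 0600, {mq_maxmsg=6, mq_msgsize=4096}) = -1 ENOSYS (Function not implemented)',
    ])
    for name, errno_name, message in failed_syscalls(sample):
        print(f"{name}: {errno_name} ({message})")
```

In this trace the failed connect() is followed by a successful retry with SOCK_STREAM, so the interesting entry is mq_open, which fails right before the daemon logs its errors and calls exit_group(1).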
* Re: nilfs2 weird issue - snapshots are gone, cleanerd not running 2012-07-10 7:18 ` Vyacheslav Dubeyko @ 2012-07-10 8:51 ` Ryusuke Konishi [not found] ` <20120710.175131.21311203.konishi.ryusuke-Zyj7fXuS5i5L9jVzuh4AOg@public.gmane.org> 0 siblings, 1 reply; 20+ messages in thread From: Ryusuke Konishi @ 2012-07-10 8:51 UTC (permalink / raw) To: Vyacheslav Dubeyko Cc: szarpaj-TbOm9Ca2r9GrDJvtcaxF/A, linux-nilfs-u79uwXL29TY76Z2rM5mHXA Hi, On Tue, 10 Jul 2012 11:18:54 +0400, Vyacheslav Dubeyko wrote: > Hi Ryusuke, > > Unfortunately, my kernel is compiled with enabled CONFIG_SHMEM and > CONFIG_TMPFS options. Moreover, I can see that /dev/shm mounted with > tmpfs type: > > linux-2.6.34-gentoo-r6 # mount > /dev/sda3 on / type xfs (rw,noatime) > proc on /proc type proc (rw,nosuid,nodev,noexec,relatime) > sysfs on /sys type sysfs (rw,nosuid,nodev,noexec,relatime) > udev on /dev type tmpfs (rw,nosuid,relatime,size=10240k,mode=755) > devpts on /dev/pts type devpts > (rw,nosuid,noexec,relatime,gid=5,mode=620) > shm on /dev/shm type tmpfs (rw,noexec,nosuid,nodev) > usbfs on /proc/bus/usb type usbfs > (rw,noexec,nosuid,devmode=0664,devgid=85) > /dev/sdb2 on /mnt/nilfs2 type nilfs2 (rw,gcpid=6331) > > And it is possible to see it from strace output also (tmpfs magic - > 0x1021994): > > statfs("/dev/shm", {f_type=0x1021994, f_bsize=4096, f_blocks=64251, > f_bfree=64250, f_bavail=64250, f_files=64251, f_ffree=64249, f_fsid={0, > 0}, f_namelen=255, f_frsize=4096}) = 0 > > However, I have not last stable kernel on its machine > (2.6.34-gentoo-r6). Maybe, it can make some side effects. > > I am going to continue investigation of the issue. I attach to the > e-mail full strace output (10-07-2012-nilfs-cleanerd-issue.txt). Ok, this looks a different problem. How is CONFIG_POSIX_MQEUEU ? Is it enabled in your kernel ? Regards, Ryusuke Konishi > With the best regards, > Vyacheslav Dubeyko. 
> > On Tue, 2012-07-10 at 10:53 +0900, Ryusuke Konishi wrote: > > Hi Vyacheslav, > > On Mon, 9 Jul 2012 22:55:40 +0400, Vyacheslav Dubeyko wrote: > > > Hi Piotr, > > > > > > You are right. I can reproduce this issue very simply. The nilfs_cleanerd doesn't started during mount really. > > > > > > I can detect some suspicious output of strace during mount and next trying to start of nilfs_cleanerd: > > > > > > .... > > > set_tid_address(0xb76a0768) = 21036 > > > set_robust_list(0xb76a0770, 0xc) = 0 > > > futex(0xbfdd4f90, FUTEX_WAKE_PRIVATE, 1) = 0 > > > futex(0xbfdd4f90, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 1, NULL, bfdd4fa0) = -1 EAGAIN (Resource temporarily unavailable) > > > > > > .... > > > mq_open("nilfs-cleanerq-2066", O_RDONLY|O_CREAT, 0600, {mq_maxmsg=6, mq_msgsize=4096}) = -1 ENOSYS (Function not implemented) > > > > > > But maybe it is not reason of the problem. It needs to investigate the issue more deeply. > > > > Your problem looks that of FAQ #8 on http://www.nilfs.org/en/faq.html > > > > > 8. cleanerd (or chcp/mkcp command) fails with an error: ``cannot open > > > nilfs on /dev/xxx: Function not implemented''. > > > > > > Confirm whether tmpfs (former shm fs) is mounted on /dev/shm. POSIX > > > semaphores do not work if the filesystem on /dev/shm is wrong, > > > which causes the above failure. > > > > > > Some systems are using ramfs instead of tmpfs. You may need to > > > change kernel configuration and rebuild kernel to enable tmpfs. > > > > Please confirm if tmpfs is mounted on /dev/shm. > > > > The same issue is reported on the following thread: > > > > http://marc.info/?t=133190016900003&r=1&w=2 > > > > > > Regards, > > Ryusuke Konishi > > > > > Thanks, > > > Vyacheslav Dubeyko. 
> > > > > > On Jul 9, 2012, at 8:56 PM, Piotr Szymaniak wrote: > > > > > > > On Mon, Jul 09, 2012 at 01:28:32PM +0400, Vyacheslav Dubeyko wrote: > > > >> Hi Piotr, > > > >> > > > >> Does system journals on your machines contain any interested details > > > >> about reported issue? Could you try to extract some error or warning > > > >> messages from system journal? > > > > > > > > (resend as I replied only to Vyacheslav) > > > > > > > > If by journals you mean logs then no. I'm only able to find some like > > > > this: > > > > Jul 3 10:32:45 wloczykij nilfs_cleanerd[1434]: resume (clean check) > > > > Jul 3 10:41:37 wloczykij nilfs_cleanerd[1434]: pause (clean check) > > > > > > > > That's all about nilfs in the last week and current log has only manual > > > > runs related to those operation described before. > > > > > > > > Piotr Szymaniak. > > > > > > > > > > > >> On Mon, 2012-07-09 at 09:33 +0200, Piotr Szymaniak wrote: > > > >>> Hi. > > > >>> > > > >>> I've upgraded nilfs-utils (running Gentoo) on 29 july. Today I ran out > > > >>> of space on my / and found that nilfs_cleanerd isn't working. When I > > > >>> start it from the command line it exits instantly. Also, all previous > > > >>> checkpoints on / (also on two other mountpoints on different machine) > > > >>> are gone. > > > >>> > > > >>> What I did? Downgraded nilfs-utils to 2.1.1, remounted mountpoints. On > > > >>> the second machine it's runnig fine (cleaned _all_ checkpoints), on the > > > >>> first one with disk space issue it exits just like 2.1.3. > > > >>> > > > >>> Here are some fs details. 
Machine with disk space issues, rootfs: > > > >>> CNO DATE TIME MODE FLG NBLKINC ICNT > > > >>> 147688 2012-07-09 08:38:14 cp - 11075 242915 > > > >>> 147689 2012-07-09 08:38:14 cp - 60 242895 > > > >>> (…) > > > >>> 148999 2012-07-09 09:13:46 cp - 60 242888 > > > >>> 149000 2012-07-09 09:19:45 cp - 44 242888 > > > >>> > > > >>> Filesystem Size Used Avail Use% Mounted on > > > >>> rootfs 24G 13G 11G 56% / > > > >>> > > > >>> mount shows: > > > >>> /dev/sda2 on / type nilfs2 (rw,noatime,nodiratime,gcpid=15356) > > > >>> > > > >>> There's no nilfs_cleanerd with pid 15356. > > > >>> > > > >>> > > > >>> Second machine rootfs: > > > >>> CNO DATE TIME MODE FLG NBLKINC ICNT > > > >>> 92246 2012-07-09 08:16:58 cp - 118 44669 > > > >>> (…) > > > >>> 92439 2012-07-09 09:19:14 cp - 29 44668 > > > >>> 92440 2012-07-09 09:19:46 cp - 33 44668 > > > >>> > > > >>> Filesystem Size Used Avail Use% Mounted on > > > >>> rootfs 3.7G 888M 2.6G 26% / > > > >>> > > > >>> (it should be around 3G used) > > > >>> > > > >>> Second machine second mountpoint: > > > >>> CNO DATE TIME MODE FLG NBLKINC ICNT > > > >>> 1496 2012-07-09 03:31:23 cp - 8837 132766 > > > >>> 1497 2012-07-09 03:31:26 cp - 468 132766 > > > >>> 1498 2012-07-09 03:41:27 cp - 1474 132765 > > > >>> > > > >>> (this fs should containt *all* 1498 checkpoints) > > > >>> > > > >>> Filesystem Size Used Avail Use% Mounted on > > > >>> /dev/dm-2 117G 58G 54G 76% /mnt/home_backup > > > >>> > > > >>> (in this one it should be around 100G of used space) > > > >>> > > > >>> mount: > > > >>> /dev/dm-2 on /mnt/home_backup type nilfs2 (rw,gcpid=13135) > > > >>> /dev/sda3 on / type nilfs2 (rw,noatime,nodiratime,gcpid=1363) > > > >>> > > > >>> Both cleaners running (the second mountpoint - /mnt/home_backup - is under > > > >>> heavy load and I suppose it will end with around 20G used space). > > > >>> > > > >>> Where to go from this point? How to debug nilfs_cleanerd issue? > > > >>> > > > >>> > > > >>> Piotr Szymaniak. 
> > > >> > > > >> > > > >> -- > > > >> To unsubscribe from this list: send the line "unsubscribe linux-nilfs" in > > > >> the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org > > > >> More majordomo info at http://vger.kernel.org/majordomo-info.html > > > > > > > > -- > > > > Marriage is like a coffin and each kid is like another nail. > > > > -- Homer Simpson > > > > > > -- > > > To unsubscribe from this list: send the line "unsubscribe linux-nilfs" in > > > the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org > > > More majordomo info at http://vger.kernel.org/majordomo-info.html > -- To unsubscribe from this list: send the line "unsubscribe linux-nilfs" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 20+ messages in thread
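The FAQ #8 check quoted above (is tmpfs mounted on /dev/shm?) can also be done against the text of /proc/mounts. This is a minimal sketch under the assumption that each entry has the usual "source mountpoint fstype options ..." layout; `fstype_at` is a hypothetical helper name, not something nilfs-utils provides.

```python
def fstype_at(mounts_text, mount_point):
    """Return the filesystem type mounted at mount_point, or None.

    mounts_text is the content of /proc/mounts. When several entries share
    the same mount point, the last one wins, because later mounts shadow
    earlier ones.
    """
    fstype = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 3 and fields[1] == mount_point:
            fstype = fields[2]
    return fstype


if __name__ == "__main__":
    # Entries modelled on the mount listing in the message above.
    sample = (
        "udev /dev tmpfs rw,nosuid,relatime,size=10240k,mode=755 0 0\n"
        "shm /dev/shm tmpfs rw,noexec,nosuid,nodev 0 0\n"
    )
    found = fstype_at(sample, "/dev/shm")
    if found == "tmpfs":
        print("/dev/shm is tmpfs, so FAQ #8 does not apply")
    else:
        print(f"/dev/shm is {found!r}; POSIX semaphores may not work")
```

On a live system the input would be the result of reading /proc/mounts. In this thread the answer was tmpfs, which is why the tmpfs explanation was ruled out and attention moved to the kernel's message queue support.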
[parent not found: <20120710.175131.21311203.konishi.ryusuke-Zyj7fXuS5i5L9jVzuh4AOg@public.gmane.org>]
* Re: nilfs2 weird issue - snapshots are gone, cleanerd not running
       [not found] ` <20120710.175131.21311203.konishi.ryusuke-Zyj7fXuS5i5L9jVzuh4AOg@public.gmane.org>
@ 2012-07-10 10:38 ` Vyacheslav Dubeyko
  2012-07-10 11:09   ` Ryusuke Konishi
  0 siblings, 1 reply; 20+ messages in thread
From: Vyacheslav Dubeyko @ 2012-07-10 10:38 UTC (permalink / raw)
  To: Ryusuke Konishi
  Cc: szarpaj-TbOm9Ca2r9GrDJvtcaxF/A, linux-nilfs-u79uwXL29TY76Z2rM5mHXA

Hi Ryusuke,

On Tue, 2012-07-10 at 17:51 +0900, Ryusuke Konishi wrote:
> Ok, this looks a different problem.
>
> How is CONFIG_POSIX_MQEUEU ?
> Is it enabled in your kernel ?
>

Yes, the CONFIG_POSIX_MQEUEU option was not enabled in my kernel. Now,
after recompiling the kernel with the CONFIG_POSIX_MQEUEU option enabled,
nilfs_cleanerd starts successfully and works.

After analyzing the strace output, I think that Piotr Szymaniak has the
same problem. But maybe I am wrong.

Thanks,
Vyacheslav Dubeyko.

^ permalink raw reply [flat|nested] 20+ messages in thread
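Whether a kernel was built with POSIX message queue support can be read from its configuration before rebuilding anything. The sketch below is an illustration only: `kernel_option` is a made-up helper, and it assumes the standard kernel .config text format. The option's actual spelling in the kernel tree is CONFIG_POSIX_MQUEUE.

```python
def kernel_option(config_text, name):
    """Return the value of a kernel config option, or None when it is unset.

    Kernel .config files store enabled options as NAME=y, NAME=m or
    NAME=<value>, and disabled ones as the comment "# NAME is not set".
    """
    for line in config_text.splitlines():
        line = line.strip()
        if line == f"# {name} is not set":
            return None
        if line.startswith(name + "="):
            return line.split("=", 1)[1]
    return None


if __name__ == "__main__":
    # Hypothetical excerpt of a config like the one described in this thread:
    # shared memory and tmpfs are enabled, POSIX message queues are not.
    sample = (
        "CONFIG_SHMEM=y\n"
        "CONFIG_TMPFS=y\n"
        "# CONFIG_POSIX_MQUEUE is not set\n"
    )
    for option in ("CONFIG_TMPFS", "CONFIG_POSIX_MQUEUE"):
        print(option, "->", kernel_option(sample, option))
```

On a running system the text could come from /proc/config.gz (read with `gzip.open(path, "rt")`), but that file only exists when the kernel was built with CONFIG_IKCONFIG_PROC; otherwise the .config in the kernel source tree has to be used.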
* Re: nilfs2 weird issue - snapshots are gone, cleanerd not running
  2012-07-10 10:38 ` Vyacheslav Dubeyko
@ 2012-07-10 11:09 ` Ryusuke Konishi
       [not found] ` <20120710.200937.163315083.konishi.ryusuke-Zyj7fXuS5i5L9jVzuh4AOg@public.gmane.org>
  0 siblings, 1 reply; 20+ messages in thread
From: Ryusuke Konishi @ 2012-07-10 11:09 UTC (permalink / raw)
  To: Vyacheslav Dubeyko, Piotr Szymaniak; +Cc: linux-nilfs-u79uwXL29TY76Z2rM5mHXA

On Tue, 10 Jul 2012 14:38:55 +0400, Vyacheslav Dubeyko wrote:
> Hi Ryusuke,
>
> On Tue, 2012-07-10 at 17:51 +0900, Ryusuke Konishi wrote:
> > Ok, this looks a different problem.
> >
> > How is CONFIG_POSIX_MQEUEU ?
> > Is it enabled in your kernel ?
> >
> Yes, the CONFIG_POSIX_MQEUEU option was not enabled in my kernel. Now,
> after recompiling the kernel with the CONFIG_POSIX_MQEUEU option enabled,
> nilfs_cleanerd starts successfully and works.

I misspelled the option name; of course it's CONFIG_POSIX_MQUEUE.
Anyway, I'll add this matter to the FAQ item.

> After analyzing the strace output, I think that Piotr Szymaniak has the
> same problem. But maybe I am wrong.

According to the log, his problem looks different.

cleanerd fails with an ENOENT error (No such file or directory)
if it cannot find the given device in /proc/mounts.

His /proc/mounts looks like this:

rootfs / rootfs rw 0 0
/dev/root / nilfs2 rw 0 0
..

And the strace log shows that /dev/root did not exist.

I guess the problem would be fixed if a proper symbolic link pointing
to the real device is created as /dev/root.

Regards,
Ryusuke Konishi

^ permalink raw reply [flat|nested] 20+ messages in thread
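The situation described above, a /proc/mounts entry whose device path does not exist, can be spotted with a short scan. This is a rough sketch, not what cleanerd itself does; `dangling_sources` is a hypothetical name, and the `exists` parameter is there only so the check can be tried without touching a real /dev.

```python
import os


def dangling_sources(mounts_text, exists=os.path.exists):
    """List (source, mount_point) pairs whose source path does not exist.

    Only absolute sources are checked, so pseudo filesystems such as
    "rootfs" or "proc" are ignored. os.path.exists follows symlinks, which
    means a broken symlink is reported as missing too.
    """
    missing = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0].startswith("/") and not exists(fields[0]):
            missing.append((fields[0], fields[1]))
    return missing


if __name__ == "__main__":
    # The two /proc/mounts lines quoted in the message above.
    sample = "rootfs / rootfs rw 0 0\n/dev/root / nilfs2 rw 0 0\n"
    # Pretend /dev/root is absent, as the strace log showed on that machine.
    print(dangling_sources(sample, exists=lambda path: path != "/dev/root"))
```

Passing the real content of /proc/mounts and leaving `exists` at its default would run the same check against the live system.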
[parent not found: <20120710.200937.163315083.konishi.ryusuke-Zyj7fXuS5i5L9jVzuh4AOg@public.gmane.org>]
* Re: nilfs2 weird issue - snapshots are gone, cleanerd not running
       [not found] ` <20120710.200937.163315083.konishi.ryusuke-Zyj7fXuS5i5L9jVzuh4AOg@public.gmane.org>
@ 2012-07-10 14:07 ` Piotr Szymaniak
  2012-07-10 16:40   ` Ryusuke Konishi
  0 siblings, 1 reply; 20+ messages in thread
From: Piotr Szymaniak @ 2012-07-10 14:07 UTC (permalink / raw)
  To: Ryusuke Konishi; +Cc: Vyacheslav Dubeyko, linux-nilfs-u79uwXL29TY76Z2rM5mHXA

[-- Attachment #1: Type: text/plain, Size: 1381 bytes --]

On Tue, Jul 10, 2012 at 08:09:37PM +0900, Ryusuke Konishi wrote:
> On Tue, 10 Jul 2012 14:38:55 +0400, Vyacheslav Dubeyko wrote:
> > After analyzing the strace output, I think that Piotr Szymaniak has the
> > same problem. But maybe I am wrong.
>
> According to the log, his problem looks different.
>
> cleanerd fails with an ENOENT error (No such file or directory)
> if it cannot find the given device in /proc/mounts.
>
> His /proc/mounts looks like this:
>
> rootfs / rootfs rw 0 0
> /dev/root / nilfs2 rw 0 0
> ..
>
> And the strace log shows that /dev/root did not exist.
>
> I guess the problem would be fixed if a proper symbolic link pointing
> to the real device is created as /dev/root.

Yes, it seems so. Just after I wrote my amateur strace analysis I tried
to fix it by hand. I made a symlink from /dev/root to /dev/sda2 and
remounted /. nilfs_cleanerd started fine.

But why is it failing on /dev/root anyway? It's started with
/dev/disk/by-uuid/uuid pointing to a proper device (sda2). Do we really
need /dev/root?

Also, maybe there's a reason to remove /dev/root? I will try later with
some older udev to see if it is created.

Piotr Szymaniak.
--
<kow`> "There are 10 types of people in the world... those who
understand binary and those who don't."
<SpaceRain> That's only 2 types of people, kow.
<SpaceRain> STUPID
  -- bash.org

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: nilfs2 weird issue - snapshots are gone, cleanerd not running
  2012-07-10 14:07 ` Piotr Szymaniak
@ 2012-07-10 16:40 ` Ryusuke Konishi
       [not found] ` <20120711.014049.157490457.konishi.ryusuke-Zyj7fXuS5i5L9jVzuh4AOg@public.gmane.org>
  0 siblings, 1 reply; 20+ messages in thread
From: Ryusuke Konishi @ 2012-07-10 16:40 UTC (permalink / raw)
  To: Piotr Szymaniak; +Cc: Vyacheslav Dubeyko, linux-nilfs-u79uwXL29TY76Z2rM5mHXA

Hi,

On Tue, 10 Jul 2012 16:07:11 +0200, Piotr Szymaniak wrote:
> Yes, it seems so. Just after I wrote my amateur strace analysis I tried
> to fix it by hand. I made a symlink from /dev/root to /dev/sda2 and
> remounted /. nilfs_cleanerd started fine.
>
> But why is it failing on /dev/root anyway? It's started with
> /dev/disk/by-uuid/uuid pointing to a proper device (sda2). Do we really
> need /dev/root?

I don't know; that is not a NILFS matter.
NILFS does not replace the root device with /dev/root.

The NILFS library just tries to find a mount instance by comparing
the canonical path name of the given device node with those in
/proc/mounts. Symbolic links are followed in this canonicalization.

The difference was simply that /dev/disk/by-uuid/uuid was a proper
symlink to the real device and /dev/root was not, on one of your
machines, for some reason.

> Also, maybe there's a reason to remove /dev/root? I will try later with
> some older udev to see if it is created.

On my nilfs-rooted machine, the root device node appears as /dev/sda?
in /proc/mounts, not /dev/root. I don't know what causes this
difference.

Regards,
Ryusuke Konishi

^ permalink raw reply [flat|nested] 20+ messages in thread
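The lookup described in the message above (canonicalize the given device path, canonicalize each source in /proc/mounts, compare) can be sketched in a few lines. This is only an approximation of the idea, not the NILFS library's actual code; `find_mount` is a hypothetical name, and `os.path.realpath` stands in for whatever canonicalization the library performs.

```python
import os


def find_mount(device, mounts_text):
    """Return (source, mount_point, fstype) for the entry matching device.

    Both the given device path and every absolute source in mounts_text are
    canonicalized with os.path.realpath, which follows symlinks. A source
    that does not exist (or a dangling symlink) canonicalizes to a path that
    cannot equal the real device node, so no entry matches and None is
    returned.
    """
    target = os.path.realpath(device)
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        source, mount_point, fstype = fields[:3]
        # Skip pseudo sources such as "rootfs" that are not paths at all.
        if source.startswith("/") and os.path.realpath(source) == target:
            return source, mount_point, fstype
    return None


if __name__ == "__main__":
    with open("/proc/mounts") as mounts:
        text = mounts.read()
    # "/dev/sda2" is just an example device taken from this thread.
    print(find_mount("/dev/sda2", text))
```

With a /proc/mounts that lists /dev/root and no such node on disk, a lookup via /dev/disk/by-uuid/... resolves to the real device while /dev/root resolves only to itself, so the comparison fails. Creating /dev/root as a symlink to the real device makes both sides canonicalize to the same path.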
[parent not found: <20120711.014049.157490457.konishi.ryusuke-Zyj7fXuS5i5L9jVzuh4AOg@public.gmane.org>]
* Re: nilfs2 weird issue - snapshots are gone, cleanerd not running [not found] ` <20120711.014049.157490457.konishi.ryusuke-Zyj7fXuS5i5L9jVzuh4AOg@public.gmane.org> @ 2012-07-12 11:42 ` Piotr Szymaniak 0 siblings, 0 replies; 20+ messages in thread From: Piotr Szymaniak @ 2012-07-12 11:42 UTC (permalink / raw) To: Ryusuke Konishi; +Cc: linux-nilfs-u79uwXL29TY76Z2rM5mHXA [-- Attachment #1: Type: text/plain, Size: 1671 bytes --] On Wed, Jul 11, 2012 at 01:40:49AM +0900, Ryusuke Konishi wrote: > I don't know, that is not a matter of NILFS. > NILFS does not replace the root device with /dev/root. > > NILFS library just tries to find a mount instance by comparing > canonical path names of the given device node and that in > /proc/mounts. Symbolic links are followed in this canonicalization. That's why I asked. I suppose it is checking whether the given partition is mounted, right? > The difference was simply that /dev/disk/by-uuid/uuid was a proper > symlink to the real device and /dev/root was not in your one machine > for some reason. I checked udev first. udev-182-r3 (from the Gentoo portage tree) with udev-init-scripts-10 creates /dev/root. udev-186 with udev-init-scripts-12 doesn't. There are some bugs about a missing /dev/root on Gentoo's Bugzilla (grub2 related). > > Also, maybe there's a reason to remove /dev/root? I will try later with > > some older udev to see if it is created. > > In my nilfs-rooted machine, the root device node appears as /dev/sda? > in /proc/mounts, not /dev/root. I don't know what makes this > difference. All of my Gentoo Linux boxes (at least those I checked) have /dev/root in /proc/mounts, but I don't know if this is Gentoo specific. I think one of the bugs mentioned above pointed to a similar "missing /dev/root" bug in Debian. Piotr Szymaniak. -- They have cones there, colored sodas, candy bars, sprinkles, nuts, malt beer, shakes, and this idiotic, prissy, goddamn bell that goes ding-dong or bing-bang whenever someone walks in. 
-- Peter Hedges, "What's Eating Gilbert Grape" [-- Attachment #2: Digital signature --] [-- Type: application/pgp-signature, Size: 836 bytes --] ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: nilfs2 weird issue - snapshots are gone, cleanerd not running 2012-07-09 7:33 nilfs2 weird issue - snapshots are gone, cleanerd not running Piotr Szymaniak 2012-07-09 9:28 ` Vyacheslav Dubeyko @ 2012-07-09 9:33 ` dexen deVries 2012-07-09 9:49 ` Vyacheslav Dubeyko 2012-07-10 7:51 ` Piotr Szymaniak 1 sibling, 2 replies; 20+ messages in thread From: dexen deVries @ 2012-07-09 9:33 UTC (permalink / raw) To: linux-nilfs-u79uwXL29TY76Z2rM5mHXA; +Cc: Piotr Szymaniak On Monday 09 of July 2012 09:33:00 you wrote: >(...) > Where to go from this point? How to debug nilfs_cleanerd issue? using strace -f on nilfs_cleanerd should give you a clue as to what causes the exit. it may exit if it can't connect to syslogd. if `lscp' reports no checkpoints, try `lscp -a' -- it should show even minor checkpoints (which are checkpoints created by GC, AFAIK). -- dexen deVries [[[↓][→]]] "all dichotomies are either true or false" is a true paradox because it's paradoxical only if it is a paradox ;) ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: nilfs2 weird issue - snapshots are gone, cleanerd not running 2012-07-09 9:33 ` dexen deVries @ 2012-07-09 9:49 ` Vyacheslav Dubeyko 2012-07-10 7:51 ` Piotr Szymaniak 1 sibling, 0 replies; 20+ messages in thread From: Vyacheslav Dubeyko @ 2012-07-09 9:49 UTC (permalink / raw) To: dexen deVries; +Cc: linux-nilfs-u79uwXL29TY76Z2rM5mHXA, Piotr Szymaniak Hi, On Mon, 2012-07-09 at 11:33 +0200, dexen deVries wrote: > using strace -f on the nilfs_cleanerd should give you a clue what causes the > exit. it may exit if can't connect to syslogd. I think that strace can generate a lot of information. :-) Maybe it makes sense first to raise the verbosity of nilfs_cleanerd to the debug level (please see the log_priority parameter in nilfs_cleanerd.conf). With the best regards, Vyacheslav Dubeyko. ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: nilfs2 weird issue - snapshots are gone, cleanerd not running 2012-07-09 9:33 ` dexen deVries 2012-07-09 9:49 ` Vyacheslav Dubeyko @ 2012-07-10 7:51 ` Piotr Szymaniak 2012-07-10 8:34 ` Vyacheslav Dubeyko 2012-07-10 9:52 ` Piotr Szymaniak 1 sibling, 2 replies; 20+ messages in thread From: Piotr Szymaniak @ 2012-07-10 7:51 UTC (permalink / raw) To: dexen deVries; +Cc: linux-nilfs-u79uwXL29TY76Z2rM5mHXA [-- Attachment #1: Type: text/plain, Size: 2074 bytes --] On Mon, Jul 09, 2012 at 11:33:13AM +0200, dexen deVries wrote: > On Monday 09 of July 2012 09:33:00 you wrote: > >(...) > > Where to go from this point? How to debug nilfs_cleanerd issue? > > using strace -f on the nilfs_cleanerd should give you a clue what causes the > exit. it may exit if can't connect to syslogd. Hi list, I will try to sort information from few messages together here. Attached strace output from my "nilfs_cleanerd" for rootfs. The script is in /etc/local.d (again, Gentoo Linux) and it looks like this (removed commented parts and empty lines): CONFILE="/etc/nilfs_cleanerd_rootfs.conf" TARGETDISK="/dev/disk/by-uuid/1aa9e6fb-cf7d-45bd-bbfb-08110a8840b7" /sbin/nilfs_cleanerd -c "${CONFILE}" "${TARGETDISK}" And here's the mentioned nilfs_cleanerd_rootfs.conf: protection_period 3600 min_clean_segments 13% max_clean_segments 25% clean_check_interval 10 selection_policy timestamp # timestamp in ascend order nsegments_per_clean 2 mc_nsegments_per_clean 4 cleaning_interval 5 mc_cleaning_interval 1 retry_interval 60 use_mmap log_priority info > if `lscp' reports no checkpoints, try `lscp -a' -- should show even minor > checkpoints (which are checkpoints created by GC, AFAIK). 
That's right, lscp -a gives some more output: maszyn ~ # lscp -a | wc -l 1428 Running nilfs_cleanerd with more verbose output (log_priority debug) gives me this: Jul 10 09:45:26 [nilfs_cleanerd] start Jul 10 09:45:26 [nilfs_cleanerd] cannot open nilfs on /dev/sda2: No such file or directory Jul 10 09:45:26 [nilfs_cleanerd] cannot create cleanerd on /dev/sda2: No such file or directory Jul 10 09:45:26 [nilfs_cleanerd] shutdown maszyn ~ # ll /dev/sda2 brw-rw---- 1 root disk 8, 2 lip 10 2012 /dev/sda2 maszyn ~ # ll /dev/disk/by-uuid/1aa9e6fb-cf7d-45bd-bbfb-08110a8840b7 lrwxrwxrwx 1 root root 10 lip 10 2012 /dev/disk/by-uuid/1aa9e6fb-cf7d-45bd-bbfb-08110a8840b7 -> ../../sda2 Piotr Szymaniak. -- When you say what next year will bring, the devil chuckles. -- Japanese proverb [-- Attachment #2: Digital signature --] [-- Type: application/pgp-signature, Size: 836 bytes --] ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: nilfs2 weird issue - snapshots are gone, cleanerd not running 2012-07-10 7:51 ` Piotr Szymaniak @ 2012-07-10 8:34 ` Vyacheslav Dubeyko 2012-07-10 9:50 ` Piotr Szymaniak 2012-07-10 9:52 ` Piotr Szymaniak 1 sibling, 1 reply; 20+ messages in thread From: Vyacheslav Dubeyko @ 2012-07-10 8:34 UTC (permalink / raw) To: Piotr Szymaniak; +Cc: dexen deVries, linux-nilfs-u79uwXL29TY76Z2rM5mHXA Hi Piotr, Could you send your output of "strace -f"? I guess that it is problem with semaphore opening as I can see in my reproduction. But it needs to compare our cases. Thanks, Vyacheslav Dubeyko. On Tue, 2012-07-10 at 09:51 +0200, Piotr Szymaniak wrote: > On Mon, Jul 09, 2012 at 11:33:13AM +0200, dexen deVries wrote: > > On Monday 09 of July 2012 09:33:00 you wrote: > > >(...) > > > Where to go from this point? How to debug nilfs_cleanerd issue? > > > > using strace -f on the nilfs_cleanerd should give you a clue what causes the > > exit. it may exit if can't connect to syslogd. > > Hi list, > > I will try to sort information from few messages together here. > > Attached strace output from my "nilfs_cleanerd" for rootfs. The script > is in /etc/local.d (again, Gentoo Linux) and it looks like this (removed > commented parts and empty lines): > CONFILE="/etc/nilfs_cleanerd_rootfs.conf" > TARGETDISK="/dev/disk/by-uuid/1aa9e6fb-cf7d-45bd-bbfb-08110a8840b7" > /sbin/nilfs_cleanerd -c "${CONFILE}" "${TARGETDISK}" > > And here's the mentioned nilfs_cleanerd_rootfs.conf: > protection_period 3600 > min_clean_segments 13% > max_clean_segments 25% > clean_check_interval 10 > selection_policy timestamp # timestamp in ascend order > nsegments_per_clean 2 > mc_nsegments_per_clean 4 > cleaning_interval 5 > mc_cleaning_interval 1 > retry_interval 60 > use_mmap > log_priority info > > > > if `lscp' reports no checkpoints, try `lscp -a' -- should show even minor > > checkpoints (which are checkpoints created by GC, AFAIK). 
> > That's right, lscp -a gives some more output: > maszyn ~ # lscp -a | wc -l > 1428 > > Running nilfs_cleanerd with more verbose output (log_priority debug) > gives me this: > Jul 10 09:45:26 [nilfs_cleanerd] start > Jul 10 09:45:26 [nilfs_cleanerd] cannot open nilfs on /dev/sda2: No such > file or directory > Jul 10 09:45:26 [nilfs_cleanerd] cannot create cleanerd on /dev/sda2: No > such file or directory > Jul 10 09:45:26 [nilfs_cleanerd] shutdown > > > maszyn ~ # ll /dev/sda2 > brw-rw---- 1 root disk 8, 2 lip 10 2012 /dev/sda2 > > maszyn ~ # ll /dev/disk/by-uuid/1aa9e6fb-cf7d-45bd-bbfb-08110a8840b7 > lrwxrwxrwx 1 root root 10 lip 10 2012 > /dev/disk/by-uuid/1aa9e6fb-cf7d-45bd-bbfb-08110a8840b7 -> ../../sda2 > > > Piotr Szymaniak. ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: nilfs2 weird issue - snapshots are gone, cleanerd not running 2012-07-10 8:34 ` Vyacheslav Dubeyko @ 2012-07-10 9:50 ` Piotr Szymaniak 2012-07-10 10:43 ` Vyacheslav Dubeyko 0 siblings, 1 reply; 20+ messages in thread From: Piotr Szymaniak @ 2012-07-10 9:50 UTC (permalink / raw) To: Vyacheslav Dubeyko; +Cc: linux-nilfs-u79uwXL29TY76Z2rM5mHXA [-- Attachment #1.1: Type: text/plain, Size: 3287 bytes --] On Tue, Jul 10, 2012 at 12:34:42PM +0400, Vyacheslav Dubeyko wrote: > Hi Piotr, > > Could you send your output of "strace -f"? I guess that it is problem > with semaphore opening as I can see in my reproduction. But it needs to > compare our cases. I'm not a strace guru, but it looks like some issue with new udev (or whatever creating devices/symlinks in /dev). Right now I'm missing /dev/root at machine with issues with nilfs_cleanerd: maszyn ~ (: LANG="C" ls -l /dev/root ls: cannot access /dev/root: No such file or directory While on the other: wloczykij ~ (: LANG="C" ls -l /dev/root lrwxrwxrwx 1 root root 4 Jun 27 21:50 /dev/root -> sda3 strace -f log attached. Piotr Szymaniak. > On Tue, 2012-07-10 at 09:51 +0200, Piotr Szymaniak wrote: > > On Mon, Jul 09, 2012 at 11:33:13AM +0200, dexen deVries wrote: > > > On Monday 09 of July 2012 09:33:00 you wrote: > > > >(...) > > > > Where to go from this point? How to debug nilfs_cleanerd issue? > > > > > > using strace -f on the nilfs_cleanerd should give you a clue what causes the > > > exit. it may exit if can't connect to syslogd. > > > > Hi list, > > > > I will try to sort information from few messages together here. > > > > Attached strace output from my "nilfs_cleanerd" for rootfs. 
The script > > is in /etc/local.d (again, Gentoo Linux) and it looks like this (removed > > commented parts and empty lines): > > CONFILE="/etc/nilfs_cleanerd_rootfs.conf" > > TARGETDISK="/dev/disk/by-uuid/1aa9e6fb-cf7d-45bd-bbfb-08110a8840b7" > > /sbin/nilfs_cleanerd -c "${CONFILE}" "${TARGETDISK}" > > > > And here's the mentioned nilfs_cleanerd_rootfs.conf: > > protection_period 3600 > > min_clean_segments 13% > > max_clean_segments 25% > > clean_check_interval 10 > > selection_policy timestamp # timestamp in ascend order > > nsegments_per_clean 2 > > mc_nsegments_per_clean 4 > > cleaning_interval 5 > > mc_cleaning_interval 1 > > retry_interval 60 > > use_mmap > > log_priority info > > > > > > > if `lscp' reports no checkpoints, try `lscp -a' -- should show even minor > > > checkpoints (which are checkpoints created by GC, AFAIK). > > > > That's right, lscp -a gives some more output: > > maszyn ~ # lscp -a | wc -l > > 1428 > > > > Running nilfs_cleanerd with more verbose output (log_priority debug) > > gives me this: > > Jul 10 09:45:26 [nilfs_cleanerd] start > > Jul 10 09:45:26 [nilfs_cleanerd] cannot open nilfs on /dev/sda2: No such > > file or directory > > Jul 10 09:45:26 [nilfs_cleanerd] cannot create cleanerd on /dev/sda2: No > > such file or directory > > Jul 10 09:45:26 [nilfs_cleanerd] shutdown > > > > > > maszyn ~ # ll /dev/sda2 > > brw-rw---- 1 root disk 8, 2 lip 10 2012 /dev/sda2 > > > > maszyn ~ # ll /dev/disk/by-uuid/1aa9e6fb-cf7d-45bd-bbfb-08110a8840b7 > > lrwxrwxrwx 1 root root 10 lip 10 2012 > > /dev/disk/by-uuid/1aa9e6fb-cf7d-45bd-bbfb-08110a8840b7 -> ../../sda2 > > > > > > Piotr Szymaniak. > -- (...) all of this made him cross that notional line which his friend, the lawyer Dan Tabares, called the MTWD line. Once you have crossed the MTWD line, whatever happens, you simply Don't Give A Damn (Masz To W Dupie). 
-- Graham Masterton, "The Burning" [-- Attachment #1.2: nilfs2.strace-f.log --] [-- Type: text/plain, Size: 8188 bytes --] execve("/sbin/nilfs_cleanerd", ["nilfs_cleanerd", "-c", "/etc/nilfs_cleanerd_rootfs.conf", "/dev/disk/by-uuid/1aa9e6fb-cf7d-"...], [/* 41 vars */]) = 0 brk(0) = 0x9121000 mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7798000 access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory) open("/etc/ld.so.cache", O_RDONLY) = 3 fstat64(3, {st_mode=S_IFREG|0644, st_size=108478, ...}) = 0 mmap2(NULL, 108478, PROT_READ, MAP_PRIVATE, 3, 0) = 0xb777d000 close(3) = 0 open("/lib/libnilfs.so.0", O_RDONLY) = 3 read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0 \25\0\0004\0\0\0"..., 512) = 512 fstat64(3, {st_mode=S_IFREG|0755, st_size=22884, ...}) = 0 mmap2(NULL, 25800, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0xb7776000 mmap2(0xb777b000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x4) = 0xb777b000 close(3) = 0 open("/lib/libnilfsgc.so.0", O_RDONLY) = 3 read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0\220\20\0\0004\0\0\0"..., 512) = 512 fstat64(3, {st_mode=S_IFREG|0755, st_size=17748, ...}) = 0 mmap2(NULL, 20664, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0xb7770000 mmap2(0xb7774000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x3) = 0xb7774000 close(3) = 0 open("/lib/librt.so.1", O_RDONLY) = 3 read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0\220\36\0\0004\0\0\0"..., 512) = 512 fstat64(3, {st_mode=S_IFREG|0755, st_size=34648, ...}) = 0 mmap2(NULL, 37488, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0xb7766000 mmap2(0xb776e000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x7) = 0xb776e000 close(3) = 0 open("/lib/libuuid.so.1", O_RDONLY) = 3 read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0\340\22\0\0004\0\0\0"..., 512) = 512 fstat64(3, {st_mode=S_IFREG|0755, st_size=17928, 
...}) = 0 mmap2(NULL, 20712, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0xb7760000 mmap2(0xb7764000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x3) = 0xb7764000 close(3) = 0 open("/lib/libc.so.6", O_RDONLY) = 3 read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0\240\306\1\0004\0\0\0"..., 512) = 512 fstat64(3, {st_mode=S_IFREG|0755, st_size=1536064, ...}) = 0 mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb775f000 mmap2(NULL, 1550904, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0xb75e4000 mprotect(0xb7758000, 4096, PROT_NONE) = 0 mmap2(0xb7759000, 12288, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x174) = 0xb7759000 mmap2(0xb775c000, 10808, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0xb775c000 close(3) = 0 open("/lib/libpthread.so.0", O_RDONLY) = 3 read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0\20k\0\0004\0\0\0"..., 512) = 512 fstat64(3, {st_mode=S_IFREG|0755, st_size=124430, ...}) = 0 mmap2(NULL, 107020, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0xb75c9000 mmap2(0xb75e0000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x16) = 0xb75e0000 mmap2(0xb75e2000, 4620, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0xb75e2000 close(3) = 0 mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb75c8000 set_thread_area({entry_number:-1 -> 6, base_addr:0xb75c8910, limit:1048575, seg_32bit:1, contents:0, read_exec_only:0, limit_in_pages:1, seg_not_present:0, useable:1}) = 0 mprotect(0xb75e0000, 4096, PROT_READ) = 0 mprotect(0xb7759000, 8192, PROT_READ) = 0 mprotect(0xb7764000, 4096, PROT_READ) = 0 mprotect(0xb776e000, 4096, PROT_READ) = 0 mprotect(0xb7774000, 4096, PROT_READ) = 0 mprotect(0xb777b000, 4096, PROT_READ) = 0 mprotect(0x804f000, 4096, PROT_READ) = 0 mprotect(0xb77b8000, 4096, PROT_READ) = 0 munmap(0xb777d000, 108478) = 0 set_tid_address(0xb75c8978) 
= 2246 set_robust_list(0xb75c8980, 12) = 0 futex(0xbff0883c, FUTEX_WAKE_PRIVATE, 1) = 0 futex(0xbff0883c, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 1, NULL, b75c8910) = -1 EAGAIN (Resource temporarily unavailable) rt_sigaction(SIGRTMIN, {0xb75cf4c0, [], SA_SIGINFO}, NULL, 8) = 0 rt_sigaction(SIGRT_1, {0xb75cf540, [], SA_RESTART|SA_SIGINFO}, NULL, 8) = 0 rt_sigprocmask(SIG_UNBLOCK, [RTMIN RT_1], NULL, 8) = 0 getrlimit(RLIMIT_STACK, {rlim_cur=8192*1024, rlim_max=RLIM_INFINITY}) = 0 uname({sys="Linux", node="maszyn", ...}) = 0 readlink("/dev", 0xbff0649f, 4096) = -1 EINVAL (Invalid argument) readlink("/dev/disk", 0xbff0649f, 4096) = -1 EINVAL (Invalid argument) readlink("/dev/disk/by-uuid", 0xbff0649f, 4096) = -1 EINVAL (Invalid argument) readlink("/dev/disk/by-uuid/1aa9e6fb-cf7d-45bd-bbfb-08110a8840b7", "../../sda2", 4096) = 10 brk(0) = 0x9121000 brk(0x9142000) = 0x9142000 readlink("/dev/sda2", 0xbff0649f, 4096) = -1 EINVAL (Invalid argument) clone(Process 2247 attached child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0xb75c8978) = 2247 [pid 2246] exit_group(0) = ? [pid 2247] setsid() = 2247 [pid 2247] chdir("/" <unfinished ...> [pid 2246] +++ exited with 0 +++ <... 
chdir resumed> ) = 0 close(0) = 0 close(1) = 0 close(2) = 0 open("/dev/null", O_RDONLY|O_LARGEFILE) = 0 open("/dev/null", O_WRONLY|O_LARGEFILE) = 1 open("/dev/null", O_WRONLY|O_LARGEFILE) = 2 time(NULL) = 1341913492 open("/etc/localtime", O_RDONLY) = 3 fstat64(3, {st_mode=S_IFREG|0644, st_size=2679, ...}) = 0 fstat64(3, {st_mode=S_IFREG|0644, st_size=2679, ...}) = 0 mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7797000 read(3, "TZif2\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\n\0\0\0\n\0\0\0\0"..., 4096) = 2679 _llseek(3, -28, [2651], SEEK_CUR) = 0 read(3, "\nCET-1CEST,M3.5.0,M10.5.0/3\n", 4096) = 28 close(3) = 0 munmap(0xb7797000, 4096) = 0 socket(PF_FILE, SOCK_DGRAM|SOCK_CLOEXEC, 0) = 3 connect(3, {sa_family=AF_FILE, sun_path="/dev/log"}, 110) = 0 send(3, "<30>Jul 10 11:44:52 nilfs_cleane"..., 47, MSG_NOSIGNAL) = 47 open("/dev/sda2", O_RDONLY|O_LARGEFILE) = 4 ioctl(4, BLKGETSIZE64, 25769803776) = 0 _llseek(4, 1024, [1024], SEEK_SET) = 0 read(4, "\2\0\0\0\0\00044\30\1\0\0\221\fK\363\265~$\234\2\0\0\0\377\v\0\0\0\0\0\0"..., 1024) = 1024 _llseek(4, 25769799680, [25769799680], SEEK_SET) = 0 read(4, "\2\0\0\0\0\00044\30\1\0\0\221\fK\363\265~$\234\2\0\0\0\377\v\0\0\0\0\0\0"..., 1024) = 1024 readlink("/dev", 0xbff04bef, 4096) = -1 EINVAL (Invalid argument) readlink("/dev/sda2", 0xbff04bef, 4096) = -1 EINVAL (Invalid argument) open("/proc/mounts", O_RDONLY|O_LARGEFILE) = 5 fstat64(5, {st_mode=S_IFREG|0444, st_size=0, ...}) = 0 mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7797000 read(5, "rootfs / rootfs rw 0 0\n/dev/root"..., 1024) = 1024 readlink("/dev", 0xbff04bef, 4096) = -1 EINVAL (Invalid argument) readlink("/dev/root", 0xbff04bef, 4096) = -1 ENOENT (No such file or directory) read(5, " 0 0\n/dev/sdd5 /mnt/de fuseblk r"..., 1024) = 207 read(5, "", 1024) = 0 close(5) = 0 munmap(0xb7797000, 4096) = 0 time(NULL) = 1341913492 send(3, "<27>Jul 10 11:44:52 nilfs_cleane"..., 99, MSG_NOSIGNAL) = 99 time(NULL) = 
1341913492 send(3, "<27>Jul 10 11:44:52 nilfs_cleane"..., 104, MSG_NOSIGNAL) = 104 time(NULL) = 1341913492 send(3, "<30>Jul 10 11:44:52 nilfs_cleane"..., 50, MSG_NOSIGNAL) = 50 close(3) = 0 exit_group(1) = ? +++ exited with 1 +++ [-- Attachment #2: Digital signature --] [-- Type: application/pgp-signature, Size: 836 bytes --] ^ permalink raw reply [flat|nested] 20+ messages in thread
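Reading a full strace log by eye is tedious, so a small filter for failed system calls can help narrow it down. The sketch below is a hypothetical helper (failed_calls and the regular expression are illustrative names, not part of strace or nilfs-utils) and it assumes one system call per line in the usual strace text format.

```python
import re

# Matches lines such as:
#   readlink("/dev/root", 0xbff04bef, 4096) = -1 ENOENT (No such file or directory)
# with an optional "[pid N]" prefix as printed by "strace -f".
FAILED_CALL = re.compile(
    r'^(?:\[pid\s+\d+\]\s+)?(\w+)\((.*)\)\s+= -1 (E[A-Z0-9]+) \((.*)\)\s*$'
)


def failed_calls(strace_text, errno_name=None):
    """Return (syscall, arguments, errno) tuples for calls that returned -1.

    If `errno_name` is given (for example "ENOENT"), only calls that
    failed with that error are returned.
    """
    results = []
    for line in strace_text.splitlines():
        match = FAILED_CALL.match(line.strip())
        if match is None:
            continue
        name, args, err = match.group(1), match.group(2), match.group(3)
        if errno_name is None or err == errno_name:
            results.append((name, args, err))
    return results
```

Applied to the log above with errno_name="ENOENT", the interesting entry is the readlink on /dev/root just before the error messages are sent to syslog. The earlier ENOENT on /etc/ld.so.preload is routine dynamic-loader behaviour and can be ignored.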
* Re: nilfs2 weird issue - snapshots are gone, cleanerd not running 2012-07-10 9:50 ` Piotr Szymaniak @ 2012-07-10 10:43 ` Vyacheslav Dubeyko 0 siblings, 0 replies; 20+ messages in thread From: Vyacheslav Dubeyko @ 2012-07-10 10:43 UTC (permalink / raw) To: Piotr Szymaniak; +Cc: linux-nilfs-u79uwXL29TY76Z2rM5mHXA Hi Piotr, Could you check that the CONFIG_POSIX_MQUEUE (General Setup \POSIX Message Queues), CONFIG_SHMEM and CONFIG_TMPFS options are enabled in your kernel? Thanks, Vyacheslav Dubeyko. On Tue, 2012-07-10 at 11:50 +0200, Piotr Szymaniak wrote: > On Tue, Jul 10, 2012 at 12:34:42PM +0400, Vyacheslav Dubeyko wrote: > > Hi Piotr, > > > > Could you send your output of "strace -f"? I guess that it is problem > > with semaphore opening as I can see in my reproduction. But it needs to > > compare our cases. > > I'm not a strace guru, but it looks like some issue with new udev (or > whatever creating devices/symlinks in /dev). Right now I'm missing /dev/root > at machine with issues with nilfs_cleanerd: > > maszyn ~ (: LANG="C" ls -l /dev/root > ls: cannot access /dev/root: No such file or directory > > While on the other: > wloczykij ~ (: LANG="C" ls -l /dev/root > lrwxrwxrwx 1 root root 4 Jun 27 21:50 /dev/root -> sda3 > > strace -f log attached. > > Piotr Szymaniak. > > > > > On Tue, 2012-07-10 at 09:51 +0200, Piotr Szymaniak wrote: > > > On Mon, Jul 09, 2012 at 11:33:13AM +0200, dexen deVries wrote: > > > > On Monday 09 of July 2012 09:33:00 you wrote: > > > > >(...) > > > > > Where to go from this point? How to debug nilfs_cleanerd issue? > > > > > > > > using strace -f on the nilfs_cleanerd should give you a clue what causes the > > > > exit. it may exit if can't connect to syslogd. > > > > > > Hi list, > > > > > > I will try to sort information from few messages together here. > > > > > > Attached strace output from my "nilfs_cleanerd" for rootfs. 
The script > > > is in /etc/local.d (again, Gentoo Linux) and it looks like this (removed > > > commented parts and empty lines): > > > CONFILE="/etc/nilfs_cleanerd_rootfs.conf" > > > TARGETDISK="/dev/disk/by-uuid/1aa9e6fb-cf7d-45bd-bbfb-08110a8840b7" > > > /sbin/nilfs_cleanerd -c "${CONFILE}" "${TARGETDISK}" > > > > > > And here's the mentioned nilfs_cleanerd_rootfs.conf: > > > protection_period 3600 > > > min_clean_segments 13% > > > max_clean_segments 25% > > > clean_check_interval 10 > > > selection_policy timestamp # timestamp in ascend order > > > nsegments_per_clean 2 > > > mc_nsegments_per_clean 4 > > > cleaning_interval 5 > > > mc_cleaning_interval 1 > > > retry_interval 60 > > > use_mmap > > > log_priority info > > > > > > > > > > if `lscp' reports no checkpoints, try `lscp -a' -- should show even minor > > > > checkpoints (which are checkpoints created by GC, AFAIK). > > > > > > That's right, lscp -a gives some more output: > > > maszyn ~ # lscp -a | wc -l > > > 1428 > > > > > > Running nilfs_cleanerd with more verbose output (log_priority debug) > > > gives me this: > > > Jul 10 09:45:26 [nilfs_cleanerd] start > > > Jul 10 09:45:26 [nilfs_cleanerd] cannot open nilfs on /dev/sda2: No such > > > file or directory > > > Jul 10 09:45:26 [nilfs_cleanerd] cannot create cleanerd on /dev/sda2: No > > > such file or directory > > > Jul 10 09:45:26 [nilfs_cleanerd] shutdown > > > > > > > > > maszyn ~ # ll /dev/sda2 > > > brw-rw---- 1 root disk 8, 2 lip 10 2012 /dev/sda2 > > > > > > maszyn ~ # ll /dev/disk/by-uuid/1aa9e6fb-cf7d-45bd-bbfb-08110a8840b7 > > > lrwxrwxrwx 1 root root 10 lip 10 2012 > > > /dev/disk/by-uuid/1aa9e6fb-cf7d-45bd-bbfb-08110a8840b7 -> ../../sda2 > > > > > > > > > Piotr Szymaniak. 
> > > ^ permalink raw reply [flat|nested] 20+ messages in thread
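The kernel option check Vyacheslav asks for above can be automated with a few lines of Python. This is a minimal sketch under stated assumptions: missing_options is a hypothetical helper, and the configuration text has to be supplied by the caller, for example from a saved .config file or from /proc/config.gz on kernels that expose it.

```python
def missing_options(config_text, wanted):
    """Return the options from `wanted` that are not set to y or m.

    `config_text` is the text of a kernel .config file. Lines such as
    "# CONFIG_TMPFS is not set" are comments and count as disabled.
    """
    enabled = set()
    for line in config_text.splitlines():
        line = line.strip()
        if line.startswith("#") or "=" not in line:
            continue
        name, value = line.split("=", 1)
        if value in ("y", "m"):
            enabled.add(name)
    return [option for option in wanted if option not in enabled]


# Options named in the message above.
WANTED = ["CONFIG_POSIX_MQUEUE", "CONFIG_SHMEM", "CONFIG_TMPFS"]
```

Where /proc/config.gz exists, its text can be read with gzip.open("/proc/config.gz", "rt").read() and passed in as config_text. An empty result means all three options are enabled.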
* Re: nilfs2 weird issue - snapshots are gone, cleanerd not running 2012-07-10 7:51 ` Piotr Szymaniak 2012-07-10 8:34 ` Vyacheslav Dubeyko @ 2012-07-10 9:52 ` Piotr Szymaniak 1 sibling, 0 replies; 20+ messages in thread From: Piotr Szymaniak @ 2012-07-10 9:52 UTC (permalink / raw) To: linux-nilfs-u79uwXL29TY76Z2rM5mHXA [-- Attachment #1: Type: text/plain, Size: 405 bytes --] On Tue, Jul 10, 2012 at 09:51:31AM +0200, Piotr Szymaniak wrote: > Attached strace output from my "nilfs_cleanerd" for rootfs. Ooops, sorry. Looks like the attachment was missing. Piotr Szymaniak. -- I did not know what to say. I felt awkward. I did not know how to reach him, how to comfort him. The land of tears is so mysterious. -- Antoine De Saint-Exupery, "Le Petit Prince" [-- Attachment #2: Digital signature --] [-- Type: application/pgp-signature, Size: 836 bytes --] ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: nilfs2 weird issue - snapshots are gone, cleanerd not running @ 2012-07-09 10:31 Vyacheslav Dubeyko 0 siblings, 0 replies; 20+ messages in thread From: Vyacheslav Dubeyko @ 2012-07-09 10:31 UTC (permalink / raw) To: szarpaj-TbOm9Ca2r9GrDJvtcaxF/A; +Cc: linux-nilfs-u79uwXL29TY76Z2rM5mHXA Hi Piotr, Does your nilfs_cleanerd.conf file contain any special values for its configuration parameters? Could you post the content of your nilfs_cleanerd.conf file? Thanks, Vyacheslav Dubeyko. On Mon, 2012-07-09 at 11:41 +0200, Piotr Szymaniak wrote: > On Mon, Jul 09, 2012 at 01:28:32PM +0400, Vyacheslav Dubeyko wrote: > > Hi Piotr, > > > > Does system journals on your machines contain any interested details > > about reported issue? Could you try to extract some error or warning > > messages from system journal? > > If by journals you mean logs then no. I'm only able to find some like > this: > Jul 3 10:32:45 wloczykij nilfs_cleanerd[1434]: resume (clean check) > Jul 3 10:41:37 wloczykij nilfs_cleanerd[1434]: pause (clean check) > > That's all about nilfs in the last week and current log has only manual > runs related to those operation described before. > > Piotr Szymaniak. ^ permalink raw reply [flat|nested] 20+ messages in thread
end of thread, other threads:[~2012-07-12 11:42 UTC | newest] Thread overview: 20+ messages 2012-07-09 7:33 nilfs2 weird issue - snapshots are gone, cleanerd not running Piotr Szymaniak 2012-07-09 9:28 ` Vyacheslav Dubeyko 2012-07-09 16:56 ` Piotr Szymaniak 2012-07-09 18:55 ` Vyacheslav Dubeyko [not found] ` <51D5FCEA-7103-4D4A-BADA-99A9780D9B68-yeENwD64cLxBDgjK7y7TUQ@public.gmane.org> 2012-07-10 1:53 ` Ryusuke Konishi [not found] ` <20120710.105315.33988123.konishi.ryusuke-Zyj7fXuS5i5L9jVzuh4AOg@public.gmane.org> 2012-07-10 7:18 ` Vyacheslav Dubeyko 2012-07-10 8:51 ` Ryusuke Konishi [not found] ` <20120710.175131.21311203.konishi.ryusuke-Zyj7fXuS5i5L9jVzuh4AOg@public.gmane.org> 2012-07-10 10:38 ` Vyacheslav Dubeyko 2012-07-10 11:09 ` Ryusuke Konishi [not found] ` <20120710.200937.163315083.konishi.ryusuke-Zyj7fXuS5i5L9jVzuh4AOg@public.gmane.org> 2012-07-10 14:07 ` Piotr Szymaniak 2012-07-10 16:40 ` Ryusuke Konishi [not found] ` <20120711.014049.157490457.konishi.ryusuke-Zyj7fXuS5i5L9jVzuh4AOg@public.gmane.org> 2012-07-12 11:42 ` Piotr Szymaniak 2012-07-09 9:33 ` dexen deVries 2012-07-09 9:49 ` Vyacheslav Dubeyko 2012-07-10 7:51 ` Piotr Szymaniak 2012-07-10 8:34 ` Vyacheslav Dubeyko 2012-07-10 9:50 ` Piotr Szymaniak 2012-07-10 10:43 ` Vyacheslav Dubeyko 2012-07-10 9:52 ` Piotr Szymaniak 2012-07-09 10:31 Vyacheslav Dubeyko