All of lore.kernel.org
 help / color / mirror / Atom feed
From: Yehuda Sadeh Weinraub <yehudasa@gmail.com>
To: Bogdan Lobodzinski <bogdan@mail.desy.de>
Cc: Wido den Hollander <wido@widodh.nl>,
	Sage Weil <sage@newdream.net>,
	ceph-devel@vger.kernel.org
Subject: Re: Write operation is stuck
Date: Fri, 3 Sep 2010 12:20:21 -0700	[thread overview]
Message-ID: <AANLkTi=dnfxFqSbj-nqCHOXz7+kR+5PH2Uc-jSse7s-w@mail.gmail.com> (raw)
In-Reply-To: <AANLkTikSCXufwJC+=Ze4PhkvKnD6QK9hCqDasdrGsmLQ@mail.gmail.com>

> On Fri, Sep 3, 2010 at 8:02 AM, Bogdan Lobodzinski <bogdan@mail.desy.de> wrote:
>>
>> Hello all,
>>
>> let me continue my troubles, the title can stay the same.
>> As I wrote, my ceph configuration survived my critical test
>> svn co https://root.cern.ch/svn/root/trunk root
>> and suddenly, during the night, at 5 oclock ceph became stuck again - without any kind of user activity, no work at all with /ceph directory.
>> The node is running as
>> mds1, mon1, osd0
>>
>> System log file reports (the problem starts with entry:
>> "Sep  2 05:44:42 h1farm183 kernel: [72426.976029] ceph: mds0 caps stale" ):
>> --------
>> Sep  1 12:40:38 h1farm183 kernel: [10983.398458] Btrfs loaded
>> Sep  1 12:44:25 h1farm183 kernel: [11210.109913] ceph: loaded (mon/mds/osd proto 15/32/24, osdmap 5/5 5/5)
>> Sep  1 13:08:25 h1farm183 kernel: [12650.255052] device fsid 754ae49f827ffac4-290543ed0a3b19a1 devid 1 transid 7 /dev/sdb
>> 1
>> Sep  1 14:25:06 h1farm183 kernel: [17251.100851] RPC: Registered udp transport module.
>> Sep  1 14:25:06 h1farm183 kernel: [17251.100854] RPC: Registered tcp transport module.
>> Sep  1 14:25:06 h1farm183 kernel: [17251.100855] RPC: Registered tcp NFSv4.1 backchannel transport module.
>> Sep  1 14:25:20 h1farm183 kernel: [17265.404967] device fsid 754ae49f827ffac4-290543ed0a3b19a1 devid 1 transid 7 /dev/sdb
>> 1
>> Sep  1 14:25:20 h1farm183 kernel: [17265.562870] udev: starting version 151
>> Sep  1 14:25:26 h1farm183 kernel: [17271.752817] device fsid 754ae49f827ffac4-290543ed0a3b19a1 devid 1 transid 7 /dev/sdb
>> 1
>> ...
>> Sep  1 16:41:51 h1farm183 kernel: [25456.385184] device fsid 4940eafa1c110ce7-c14b44192348589f devid 1 transid 12 /dev/sdb1
>> Sep  1 16:42:21 h1farm183 kernel: [25486.297025] ceph: client4100 fsid 4ea08089-acf1-b738-6f72-96c3ed029b71
>> Sep  1 16:42:21 h1farm183 kernel: [25486.297169] ceph: mon0 131.169.74.116:6789 session established
>> Sep  2 02:37:54 h1farm183 rsyslogd: [origin software="rsyslogd" swVersion="4.2.0" x-pid="863" x-info="http://www.rsyslog.com"] rsyslogd was HUPed, type 'lightweight'.
>> Sep  2 05:44:42 h1farm183 kernel: [72426.976029] ceph: mds0 caps stale
>> Sep  2 05:44:57 h1farm183 kernel: [72441.976037] ceph: mds0 caps stale
>> Sep  2 05:45:27 h1farm183 kernel: [72472.066320] ceph: mds0 reconnect start
>> Sep  2 05:45:27 h1farm183 kernel: [72472.069681] Modules linked in: nfs lockd nfs_acl auth_rpcgss sunrpc ceph btrfs zlib_deflate crc32c libcrc32c ppdev lp parport openafs(P) ipt_MASQUERADE iptable_nat nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack ipt_REJECT xt_tcpudp iptable_filter ip_tables x_tables bridge stp fbcon tileblit font bitblit softcursor vga16fb vgastate radeon ttm mptctl drm_kms_helper bnx2 drm usbhid i5000_edac hid dell_wmi shpchp edac_core agpgart i2c_algo_bit i5k_amb dcdbas psmouse serio_raw mptsas mptscsih mptbase scsi_transport_sas [last unloaded: kvm]
>> Sep  2 05:45:27 h1farm183 kernel: [72472.072332]
>> Sep  2 05:45:27 h1farm183 kernel: [72472.072332] Pid: 6184, comm: ceph-msgr/1 Tainted: P           (2.6.32-24-generic-pae #42-Ubuntu) PowerEdge 1950
>> Sep  2 05:45:27 h1farm183 kernel: [72472.072332] EIP: 0060:[<c01ea907>] EFLAGS: 00010246 CPU: 1
>> Sep  2 05:45:27 h1farm183 kernel: [72472.072332] EIP is at kunmap_high+0x97/0xa0
>> Sep  2 05:45:27 h1farm183 kernel: [72472.072332] EAX: 00000000 EBX: f5d17000 ECX: c0916848 EDX: 00000292
>> Sep  2 05:45:27 h1farm183 kernel: [72472.072332] ESI: c17ee940 EDI: f5d18000 EBP: f5fb3c6c ESP: f5fb3c64
>> Sep  2 05:45:27 h1farm183 kernel: [72472.072332]  DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
>> Sep  2 05:45:27 h1farm183 kernel: [72472.072332]  c07d9280 f50b10a0 f5fb3c74 c0138307 f5fb3c98 f9ad7d54 00000000 f5fb3cbc
>> Sep  2 05:45:27 h1farm183 kernel: [72472.072332] <0> 00000038 0000002b eaee1018 ee4bcd70 00000000 f5fb3d14 f9ada09d 00000000
>> Sep  2 05:45:27 h1farm183 kernel: [72472.072332] <0> eaee108c 0000005c f60bab40 eaee0e00 ee788440 f50b10a0 00000a21 00000000
>> Sep  2 05:45:27 h1farm183 kernel: [72472.072332]  [<c0138307>] ? kunmap+0x57/0x60
>> Sep  2 05:45:27 h1farm183 kernel: [72472.072332]  [<f9ad7d54>] ? ceph_pagelist_append+0x54/0x110 [ceph]
...
>> The node was stuck at all.
>> Do you know what can be a reason ?

Maybe the following patch fixes it? I'll push a fix to the unstable
branch, let me know if it works for you.

Thanks,
Yehuda

diff --git a/fs/ceph/pagelist.c b/fs/ceph/pagelist.c
index b6859f4..46a368b 100644
--- a/fs/ceph/pagelist.c
+++ b/fs/ceph/pagelist.c
@@ -5,10 +5,18 @@

 #include "pagelist.h"

+static void ceph_pagelist_unmap_tail(struct ceph_pagelist *pl)
+{
+	struct page *page = list_entry(pl->head.prev, struct page,
+				       lru);
+	kunmap(page);
+}
+
 int ceph_pagelist_release(struct ceph_pagelist *pl)
 {
 	if (pl->mapped_tail)
-		kunmap(pl->mapped_tail);
+		ceph_pagelist_unmap_tail(pl);
+
 	while (!list_empty(&pl->head)) {
 		struct page *page = list_first_entry(&pl->head, struct page,
 						     lru);
@@ -26,7 +34,7 @@ static int ceph_pagelist_addpage(struct ceph_pagelist *pl)
 	pl->room += PAGE_SIZE;
 	list_add_tail(&page->lru, &pl->head);
 	if (pl->mapped_tail)
-		kunmap(pl->mapped_tail);
+		ceph_pagelist_unmap_tail(pl);
 	pl->mapped_tail = kmap(page);
 	return 0;
 }
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

  reply	other threads:[~2010-09-03 19:20 UTC|newest]

Thread overview: 27+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-08-27 12:18 Write operation is stuck Bogdan Lobodzinski
2010-08-27 15:42 ` Wido den Hollander
2010-08-27 16:09 ` Sage Weil
2010-08-30 15:32   ` Bogdan Lobodzinski
2010-08-30 19:39     ` Sage Weil
2010-08-31  7:56       ` Bogdan Lobodzinski
2010-09-01 15:21         ` Bogdan Lobodzinski
2010-09-01 19:29           ` Wido den Hollander
2010-09-03 15:02             ` Bogdan Lobodzinski
2010-09-03 17:10               ` Yehuda Sadeh Weinraub
2010-09-03 19:20                 ` Yehuda Sadeh Weinraub [this message]
  -- strict thread matches above, loose matches on Subject: below --
2010-02-10 21:26 Talyansky, Roman
2010-02-10 21:39 ` Sage Weil
2010-02-10 22:44   ` Talyansky, Roman
2010-02-10 22:49     ` Sage Weil
2010-02-16 17:27   ` Talyansky, Roman
2010-02-16 18:35     ` Sage Weil
2010-02-19 15:40       ` Talyansky, Roman
2010-02-19 18:39         ` Sage Weil
2010-02-23 14:11           ` Talyansky, Roman
2010-02-23 18:11             ` Yehuda Sadeh Weinraub
2010-02-24 13:34               ` Talyansky, Roman
2010-02-24 14:56                 ` Sage Weil
2010-02-24 16:42                   ` Talyansky, Roman
2010-02-24 18:43                     ` Sage Weil
2010-02-24 23:21                       ` Talyansky, Roman
2010-02-25 10:07                       ` Talyansky, Roman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='AANLkTi=dnfxFqSbj-nqCHOXz7+kR+5PH2Uc-jSse7s-w@mail.gmail.com' \
    --to=yehudasa@gmail.com \
    --cc=bogdan@mail.desy.de \
    --cc=ceph-devel@vger.kernel.org \
    --cc=sage@newdream.net \
    --cc=wido@widodh.nl \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.