From: "Talyansky, Roman" <roman.talyansky@sap.com>
To: Sage Weil <sage@newdream.net>
Cc: "ceph-devel@lists.sourceforge.net" <ceph-devel@lists.sourceforge.net>
Subject: Re: Write operation is stuck
Date: Fri, 19 Feb 2010 16:40:12 +0100
Message-ID: <C6A64D82E3A5D24B949315CFBC1FA1AD072A238242@DEWDFECCR01.wdf.sap.corp>
In-Reply-To: <Pine.LNX.4.64.1002161032280.31640@cobra.newdream.net>

Hi Sage,

Thanks for the answer.

> It looks like dmesg shows it trying to connect to the monitor at .70, but you tested .83?
Since I am testing several ceph versions simultaneously, I may have mixed up the checks I ran on different nodes.
I'll double check this and let you know.
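
One way to cross-check this is to extract the monitor addresses from the dmesg output and compare them with the address that was tested by hand. The following is only a minimal sketch in Python, assuming the "connection failed" line format shown in the quoted dmesg listing below; the function names failed_monitors and untested are hypothetical helpers, not part of ceph.

```python
import re

# Matches lines such as:
#   [17015.888143] ceph: mon0 10.55.147.70:6789 connection failed
# The pattern is an assumption based on the dmesg excerpt in this thread.
MON_FAIL = re.compile(
    r"ceph: mon\d+ (\d+\.\d+\.\d+\.\d+):(\d+) connection failed"
)


def failed_monitors(dmesg_text):
    """Return the set of (ip, port) pairs the client failed to reach."""
    return {(m.group(1), int(m.group(2))) for m in MON_FAIL.finditer(dmesg_text)}


def untested(dmesg_text, tested):
    """Return failing monitor addresses that are not in the tested list."""
    return failed_monitors(dmesg_text) - set(tested)


sample = """\
[17008.244739] ceph: loaded 0.18.0 (mon/mds/osd proto 15/30/22)
[17015.888143] ceph: mon0 10.55.147.70:6789 connection failed
[17025.880170] ceph: mon0 10.55.147.70:6789 connection failed
"""

# The address checked with nc was .83, so .70 is reported as never tested.
print(untested(sample, [("10.55.147.83", 6789)]))
```

Any address printed here is one the client failed to reach but that was never probed directly.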

> It also looks like the IO is synchronous, which may have something 
> to do with your performance.  Are you mounting with -o sync or using 
> direct IO, or are multiple clients reading and writing to the same file or 
> something?
The IO is indeed synchronous. However, the performance under ceph is much worse than even under nfs, which looks strange. I do not mount with -o sync. In our experiments, multiple clients read and write the same file.
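
The mount option can be confirmed on each client by looking for "sync" in the options column of /proc/mounts. The following is a minimal sketch in Python, assuming the usual Linux layout of device, mount point, filesystem type and options; the sample lines and the sync_mounts helper are hypothetical. It covers only the mount option and does not detect direct IO or several clients sharing a file.

```python
def sync_mounts(mounts_text, fstype="ceph"):
    """Map each mount point of the given filesystem type to True if it
    was mounted with the 'sync' option, else False."""
    result = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        # Expected fields: device, mount point, fstype, options, dump, pass
        if len(fields) < 4 or fields[2] != fstype:
            continue
        options = fields[3].split(",")
        result[fields[1]] = "sync" in options
    return result


# Hypothetical sample in /proc/mounts format, for illustration only.
sample_mounts = """\
10.55.147.70:6789:/ /mnt/ceph ceph rw,relatime 0 0
/dev/sda1 / ext3 rw,sync 0 0
"""

print(sync_mounts(sample_mounts))
```

On a real client, the text would come from open("/proc/mounts").read() instead of the sample string.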

Thanks,
Roman


-----Original Message-----
From: Sage Weil [mailto:sage@newdream.net] 
Sent: Tuesday, February 16, 2010 8:35 PM
To: Talyansky, Roman
Cc: ceph-devel@lists.sourceforge.net
Subject: Re: [ceph-devel] Write operation is stuck

On Tue, 16 Feb 2010, Talyansky, Roman wrote:

> Hi Sage,
> 
> I am trying to reproduce the hang with the latest client and servers.
> I am able to start the servers, however mount fails with input/output error 5. The dmesg listing shows the following info:
> 
> [17008.244739] ceph: loaded 0.18.0 (mon/mds/osd proto 15/30/22)
> [17015.888143] ceph: mon0 10.55.147.70:6789 connection failed
> [17025.880170] ceph: mon0 10.55.147.70:6789 connection failed
> [17035.880121] ceph: mon0 10.55.147.70:6789 connection failed
> [17045.880189] ceph: mon0 10.55.147.70:6789 connection failed
> [17055.880130] ceph: mon0 10.55.147.70:6789 connection failed
> [17065.880113] ceph: mon0 10.55.147.70:6789 connection failed
> [17075.880170] ceph: mon0 10.55.147.70:6789 connection failed
> 
> The server is reachable, as the following command output shows:
> 
> $ nc 10.55.147.83 6789
> ceph v027

It looks like dmesg shows it trying to connect to the monitor at .70, but 
you tested .83?

> I started running the experiments with ceph 0.18 using the 
> configuration, where clients and servers run on separate nodes. It turns 
> out that the performance is extremely bad. Looking at dmesg trace I see 
> ceph-related faults (the partial trace is attached to the email).

The oops in the attached trace.txt was fixed last week in the unstable 
code.  It also looks like the IO is synchronous, which may have something 
to do with your performance.  Are you mounting with -o sync or using 
direct IO, or are multiple clients reading and writing to the same file or 
something?

Thanks-
sage


