* Trouble with paxos service for large PG count
From: Jim Schutt @ 2013-04-01 21:14 UTC
To: ceph-devel
Hi,
I've been having trouble starting a new file system
created using the current next branch (most recently,
commit 3b5f663f11).
I believe the trouble is related to how long it takes paxos to
process a pgmap proposal.
For a configuration with 1 mon, 1 mds, and 576 osds, using
pg_bits = 3 and debug paxos = 10, if I start just the monitor,
here's what I get when paxos processes the first non-trivial
pgmap proposal:
2013-04-01 14:04:16.330735 7ffff7fbe780 10 mon.cs31@0(leader).paxosservice(pgmap) propose_pending
2013-04-01 14:04:16.358973 7ffff7fbe780 5 mon.cs31@0(leader).paxos(paxos active c 1..3) queue_proposal bl 4943990 bytes; ctx = 0x11e81f0
2013-04-01 14:04:16.359021 7ffff7fbe780 5 mon.cs31@0(leader).paxos(paxos preparing update c 1..3) propose_queued 4 4943990 bytes
2013-04-01 14:04:16.359025 7ffff7fbe780 10 mon.cs31@0(leader).paxos(paxos preparing update c 1..3) propose_queued list_proposals 1 in queue:
2013-04-01 14:04:28.096284 7ffff7fbe780 10 mon.cs31@0(leader).paxos(paxos preparing update c 1..3) begin for 4 4943990 bytes
2013-04-01 14:04:28.164876 7ffff7fbe780 10 mon.cs31@0(leader).paxos(paxos updating c 1..3) commit 4
2013-04-01 14:04:28.226789 7ffff7fbe780 10 mon.cs31@0(leader).paxos(paxos active c 1..4) finish_queued_proposal finishing proposal
2013-04-01 14:04:28.226885 7ffff7fbe780 10 mon.cs31@0(leader).paxos(paxos active c 1..4) finish_queued_proposal finish it (proposal = 0x1260f50)
2013-04-01 14:04:28.226892 7ffff7fbe780 10 mon.cs31@0(leader).paxos(paxos active c 1..4) finish_queued_proposal proposal took 11.867914 to finish
2013-04-01 14:04:28.226941 7ffff7fbe780 10 mon.cs31@0(leader).paxosservice(pgmap) _active
When I redo the test with pg_bits = 4 (27696 PGs total) I get this:
2013-04-01 14:25:16.993255 7ffff7fbe780 10 mon.cs31@0(leader).paxosservice(pgmap) propose_pending
2013-04-01 14:25:17.051729 7ffff7fbe780 5 mon.cs31@0(leader).paxos(paxos active c 1..3) queue_proposal bl 9887726 bytes; ctx = 0x11e81f0
2013-04-01 14:25:17.051882 7ffff7fbe780 5 mon.cs31@0(leader).paxos(paxos preparing update c 1..3) propose_queued 4 9887726 bytes
2013-04-01 14:25:17.051888 7ffff7fbe780 10 mon.cs31@0(leader).paxos(paxos preparing update c 1..3) propose_queued list_proposals 1 in queue:
2013-04-01 14:26:08.185143 7ffff7fbe780 10 mon.cs31@0(leader).paxos(paxos preparing update c 1..3) begin for 4 9887726 bytes
2013-04-01 14:26:08.296351 7ffff7fbe780 10 mon.cs31@0(leader).paxos(paxos updating c 1..3) commit 4
2013-04-01 14:26:08.408900 7ffff7fbe780 10 mon.cs31@0(leader).paxos(paxos active c 1..4) finish_queued_proposal finishing proposal
2013-04-01 14:26:08.408928 7ffff7fbe780 10 mon.cs31@0(leader).paxos(paxos active c 1..4) finish_queued_proposal finish it (proposal = 0x1260f50)
2013-04-01 14:26:08.408931 7ffff7fbe780 10 mon.cs31@0(leader).paxos(paxos active c 1..4) finish_queued_proposal proposal took 51.357088 to finish
2013-04-01 14:26:08.408974 7ffff7fbe780 10 mon.cs31@0(leader).paxosservice(pgmap) _active
and at pg_bits = 5 (55392 PGs total) I get this:
2013-04-01 14:36:15.385280 7ffff7fbe780 10 mon.cs31@0(leader).paxosservice(pgmap) propose_pending
2013-04-01 14:36:15.500292 7ffff7fbe780 5 mon.cs31@0(leader).paxos(paxos active c 1..3) queue_proposal bl 19775198 bytes; ctx = 0x11e81f0
2013-04-01 14:36:15.500433 7ffff7fbe780 5 mon.cs31@0(leader).paxos(paxos preparing update c 1..3) propose_queued 4 19775198 bytes
2013-04-01 14:36:15.500438 7ffff7fbe780 10 mon.cs31@0(leader).paxos(paxos preparing update c 1..3) propose_queued list_proposals 1 in queue:
2013-04-01 14:39:43.645062 7ffff7fbe780 10 mon.cs31@0(leader).paxos(paxos preparing update c 1..3) begin for 4 19775198 bytes
2013-04-01 14:39:43.875099 7ffff7fbe780 10 mon.cs31@0(leader).paxos(paxos updating c 1..3) commit 4
2013-04-01 14:39:44.086097 7ffff7fbe780 10 mon.cs31@0(leader).paxos(paxos active c 1..4) finish_queued_proposal finishing proposal
2013-04-01 14:39:44.086128 7ffff7fbe780 10 mon.cs31@0(leader).paxos(paxos active c 1..4) finish_queued_proposal finish it (proposal = 0x1260f50)
2013-04-01 14:39:44.086136 7ffff7fbe780 10 mon.cs31@0(leader).paxos(paxos active c 1..4) finish_queued_proposal proposal took 208.585710 to finish
2013-04-01 14:39:44.086156 7ffff7fbe780 10 mon.cs31@0(leader).paxosservice(pgmap) _active
For this last configuration, after collecting the above
I waited a bit, started all the OSDs, waited a bit longer,
then collected this:
2013-04-01 14:54:37.364686 7ffff328d700 10 mon.cs31@0(leader).paxosservice(pgmap) propose_pending
2013-04-01 14:54:37.433641 7ffff328d700 5 mon.cs31@0(leader).paxos(paxos active c 1..27) queue_proposal bl 10629660 bytes; ctx = 0x11ece50
2013-04-01 14:54:37.433750 7ffff328d700 5 mon.cs31@0(leader).paxos(paxos preparing update c 1..27) propose_queued 28 10629660 bytes
2013-04-01 14:54:37.433755 7ffff328d700 10 mon.cs31@0(leader).paxos(paxos preparing update c 1..27) propose_queued list_proposals 1 in queue:
2013-04-01 14:55:38.684532 7ffff328d700 10 mon.cs31@0(leader).paxos(paxos preparing update c 1..27) begin for 28 10629660 bytes
2013-04-01 14:55:38.814528 7ffff328d700 10 mon.cs31@0(leader).paxos(paxos updating c 1..27) commit 28
2013-04-01 14:55:38.937087 7ffff328d700 10 mon.cs31@0(leader).paxos(paxos active c 1..28) finish_queued_proposal finishing proposal
2013-04-01 14:55:38.937120 7ffff328d700 10 mon.cs31@0(leader).paxos(paxos active c 1..28) finish_queued_proposal finish it (proposal = 0x1c6e3c0)
2013-04-01 14:55:38.937124 7ffff328d700 10 mon.cs31@0(leader).paxos(paxos active c 1..28) finish_queued_proposal proposal took 61.503375 to finish
2013-04-01 14:55:38.937168 7ffff328d700 10 mon.cs31@0(leader).paxosservice(pgmap) _active
It looks like finish_queued_proposal processing time is scaling
quadratically with the proposal length for pgmaps.
FWIW, I don't believe I saw any issues of this sort for
versions 0.58 and earlier.
Please let me know if there is any other information I can
provide to help fix this.
Thanks -- Jim
* Re: Trouble with paxos service for large PG count
From: Joao Eduardo Luis @ 2013-04-02 13:41 UTC
To: Jim Schutt; +Cc: ceph-devel
Hi Jim,
One thing to keep in mind is that, with the monitor rework, we now share
the Paxos instance across all the Paxos services. That may slow things down
a bit, since paxos proposals for different services are now queued and
have to wait their turn. But what's happening to you appears to be
something completely different -- see below.
On 04/01/2013 10:14 PM, Jim Schutt wrote:
> [snip]
>
> For this last configuration, after collecting the above
> I waited a bit, started all the OSDs, waited a bit longer,
> then collected this:
>
> 2013-04-01 14:54:37.364686 7ffff328d700 10 mon.cs31@0(leader).paxosservice(pgmap) propose_pending
> 2013-04-01 14:54:37.433641 7ffff328d700 5 mon.cs31@0(leader).paxos(paxos active c 1..27) queue_proposal bl 10629660 bytes; ctx = 0x11ece50
> 2013-04-01 14:54:37.433750 7ffff328d700 5 mon.cs31@0(leader).paxos(paxos preparing update c 1..27) propose_queued 28 10629660 bytes
> 2013-04-01 14:54:37.433755 7ffff328d700 10 mon.cs31@0(leader).paxos(paxos preparing update c 1..27) propose_queued list_proposals 1 in queue:
> 2013-04-01 14:55:38.684532 7ffff328d700 10 mon.cs31@0(leader).paxos(paxos preparing update c 1..27) begin for 28 10629660 bytes
> 2013-04-01 14:55:38.814528 7ffff328d700 10 mon.cs31@0(leader).paxos(paxos updating c 1..27) commit 28
> 2013-04-01 14:55:38.937087 7ffff328d700 10 mon.cs31@0(leader).paxos(paxos active c 1..28) finish_queued_proposal finishing proposal
> 2013-04-01 14:55:38.937120 7ffff328d700 10 mon.cs31@0(leader).paxos(paxos active c 1..28) finish_queued_proposal finish it (proposal = 0x1c6e3c0)
> 2013-04-01 14:55:38.937124 7ffff328d700 10 mon.cs31@0(leader).paxos(paxos active c 1..28) finish_queued_proposal proposal took 61.503375 to finish
> 2013-04-01 14:55:38.937168 7ffff328d700 10 mon.cs31@0(leader).paxosservice(pgmap) _active
>
> It looks like finish_queued_proposal processing time is scaling
> quadratically with the proposal length for pgmaps.
Ah! The reason for such a delay just became obvious, and it's due to
a quite dumb mistake. Basically, during Paxos::begin() we run the
whole transaction through the JSON formatter and then output it at
log level 30 -- we should check the log level first to avoid
spending valuable time on that, especially when transactions are huge.
Besides that, while looking into another bug, I noticed a slight
problem in the monitor's logic: at a given point, each transaction we
create ends up containing not only an incremental but also a full
version, which is bound to slow things down and exacerbate what I just
described in the previous paragraph.
-Joao
>
> FWIW, I don't believe I saw any issues of this sort for
> versions 0.58 and earlier.
>
> Please let me know if there is any other information I can
> provide to help fix this.
>
> Thanks -- Jim
>
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
* Re: Trouble with paxos service for large PG count
From: Joao Eduardo Luis @ 2013-04-02 15:37 UTC
To: Jim Schutt; +Cc: ceph-devel
On 04/02/2013 02:41 PM, Joao Eduardo Luis wrote:
> Hi Jim,
>
> One thing to keep in mind is that with the monitor's rework we now share
> the Paxos instance across all Paxos services. That may slow things down
> a bit, given paxos proposals for different services are now queued and
> have to wait their turn. But what's happening to you appears to be
> something completely different -- see below.
>
> On 04/01/2013 10:14 PM, Jim Schutt wrote:
>> [snip]
>>
>> For this last configuration, after collecting the above
>> I waited a bit, started all the OSDs, waited a bit longer,
>> then collected this:
>>
>> 2013-04-01 14:54:37.364686 7ffff328d700 10
>> mon.cs31@0(leader).paxosservice(pgmap) propose_pending
>> 2013-04-01 14:54:37.433641 7ffff328d700 5
>> mon.cs31@0(leader).paxos(paxos active c 1..27) queue_proposal bl
>> 10629660 bytes; ctx = 0x11ece50
>> 2013-04-01 14:54:37.433750 7ffff328d700 5
>> mon.cs31@0(leader).paxos(paxos preparing update c 1..27)
>> propose_queued 28 10629660 bytes
>> 2013-04-01 14:54:37.433755 7ffff328d700 10
>> mon.cs31@0(leader).paxos(paxos preparing update c 1..27)
>> propose_queued list_proposals 1 in queue:
>> 2013-04-01 14:55:38.684532 7ffff328d700 10
>> mon.cs31@0(leader).paxos(paxos preparing update c 1..27) begin for 28
>> 10629660 bytes
>> 2013-04-01 14:55:38.814528 7ffff328d700 10
>> mon.cs31@0(leader).paxos(paxos updating c 1..27) commit 28
>> 2013-04-01 14:55:38.937087 7ffff328d700 10
>> mon.cs31@0(leader).paxos(paxos active c 1..28) finish_queued_proposal
>> finishing proposal
>> 2013-04-01 14:55:38.937120 7ffff328d700 10
>> mon.cs31@0(leader).paxos(paxos active c 1..28) finish_queued_proposal
>> finish it (proposal = 0x1c6e3c0)
>> 2013-04-01 14:55:38.937124 7ffff328d700 10
>> mon.cs31@0(leader).paxos(paxos active c 1..28) finish_queued_proposal
>> proposal took 61.503375 to finish
>> 2013-04-01 14:55:38.937168 7ffff328d700 10
>> mon.cs31@0(leader).paxosservice(pgmap) _active
>>
>> It looks like finish_queued_proposal processing time is scaling
>> quadratically with the proposal length for pgmaps.
>
> Ah! The reason for such a delay just became obvious, and it's due to
> a quite dumb mistake. Basically, during Paxos::begin() we run the
> whole transaction through the JSON formatter and then output it at
> log level 30 -- we should check the log level first to avoid
> spending valuable time on that, especially when transactions are huge.
Well, this might not be right after all.
>
> Besides that, while looking into another bug, I noticed a slight
> problem in the monitor's logic: at a given point, each transaction we
> create ends up containing not only an incremental but also a full
> version, which is bound to slow things down and exacerbate what I just
> described in the previous paragraph.
But this would certainly explain the 10MB transaction in 'Paxos::begin'.
I'm now actively looking into this and created ticket #4620 on the
tracker [1].
-Joao
[1] - http://tracker.ceph.com/issues/4620
>
>
> -Joao
>
>>
>> FWIW, I don't believe I saw any issues of this sort for
>> versions 0.58 and earlier.
>>
>> Please let me know if there is any other information I can
>> provide to help fix this.
>>
>> Thanks -- Jim
>>
>
* Re: Trouble with paxos service for large PG count
From: Joao Eduardo Luis @ 2013-04-02 15:42 UTC
To: Jim Schutt; +Cc: ceph-devel
On 04/01/2013 10:14 PM, Jim Schutt wrote:
> Hi,
>
> I've been having trouble starting a new file system
> created using the current next branch (most recently,
> commit 3b5f663f11).
>
> I believe the trouble is related to how long it takes paxos to
> process a pgmap proposal.
>
> For a configuration with 1 mon, 1 mds, and 576 osds, using
> pg_bits = 3 and debug paxos = 10, if I start just the monitor,
> here's what I get when paxos processes the first non-trivial
> pgmap proposal:
>
Just noticed one other thing. With 'debug paxos = 10', you should have
a whole bunch of output (the proposal's dump) after this:
> 2013-04-01 14:04:16.330735 7ffff7fbe780 10 mon.cs31@0(leader).paxosservice(pgmap) propose_pending
> 2013-04-01 14:04:16.358973 7ffff7fbe780 5 mon.cs31@0(leader).paxos(paxos active c 1..3) queue_proposal bl 4943990 bytes; ctx = 0x11e81f0
> 2013-04-01 14:04:16.359021 7ffff7fbe780 5 mon.cs31@0(leader).paxos(paxos preparing update c 1..3) propose_queued 4 4943990 bytes
> 2013-04-01 14:04:16.359025 7ffff7fbe780 10 mon.cs31@0(leader).paxos(paxos preparing update c 1..3) propose_queued list_proposals 1 in queue:
and before this:
> 2013-04-01 14:04:28.096284 7ffff7fbe780 10 mon.cs31@0(leader).paxos(paxos preparing update c 1..3) begin for 4 4943990 bytes
for every snippet you sent in your previous email.
The code responsible for that should never have made it into master, and
is likely to blame for a great deal of the time spent.
Jim, can you confirm such output is present?
-Joao
* Re: Trouble with paxos service for large PG count
From: Jim Schutt @ 2013-04-02 18:18 UTC
To: Joao Eduardo Luis; +Cc: ceph-devel
On 04/02/2013 09:42 AM, Joao Eduardo Luis wrote:
> On 04/01/2013 10:14 PM, Jim Schutt wrote:
>> Hi,
>>
>> I've been having trouble starting a new file system
>> created using the current next branch (most recently,
>> commit 3b5f663f11).
>>
>> I believe the trouble is related to how long it takes paxos to
>> process a pgmap proposal.
>>
>> For a configuration with 1 mon, 1 mds, and 576 osds, using
>> pg_bits = 3 and debug paxos = 10, if I start just the monitor,
>> here's what I get when paxos processes the first non-trivial
>> pgmap proposal:
>>
>
> Just noticed one other thing. With 'debug paxos = 10', you should have a whole bunch of output (the proposal's dump) after this:
>
>> 2013-04-01 14:04:16.330735 7ffff7fbe780 10 mon.cs31@0(leader).paxosservice(pgmap) propose_pending
>> 2013-04-01 14:04:16.358973 7ffff7fbe780 5 mon.cs31@0(leader).paxos(paxos active c 1..3) queue_proposal bl 4943990 bytes; ctx = 0x11e81f0
>> 2013-04-01 14:04:16.359021 7ffff7fbe780 5 mon.cs31@0(leader).paxos(paxos preparing update c 1..3) propose_queued 4 4943990 bytes
>> 2013-04-01 14:04:16.359025 7ffff7fbe780 10 mon.cs31@0(leader).paxos(paxos preparing update c 1..3) propose_queued list_proposals 1 in queue:
>
> and before this:
>
>> 2013-04-01 14:04:28.096284 7ffff7fbe780 10 mon.cs31@0(leader).paxos(paxos preparing update c 1..3) begin for 4 4943990 bytes
>
> for every snippet you sent in your previous email.
>
> The code responsible for that should never have made it into master, and is likely to blame for a great deal of the time spent.
>
> Jim, can you confirm such output is present?
Yep, here's what I see:
2013-04-01 14:04:16.359021 7ffff7fbe780 5 mon.cs31@0(leader).paxos(paxos preparing update c 1..3) propose_queued 4 4943990 bytes
2013-04-01 14:04:16.359025 7ffff7fbe780 10 mon.cs31@0(leader).paxos(paxos preparing update c 1..3) propose_queued list_proposals 1 in queue:
-- entry #0
proposed queued 0.000008 tx dump:
{ "ops": [
{ "op_num": 0,
"type": "PUT",
"prefix": "pgmap",
"key": "full_1",
"length": 46,
"bl": "0000 : 05 04 28 00 00 00 01 00 00 00 00 00 00 00 00 00 : ..(.............\n0010 : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 33 33 : ..............33\n0020 : 73 3f 9a 99 59 3f 40 e8 59 51 90 f0 a4 07 : s?..Y?@.YQ....\n"},
{ "op_num": 1,
"type": "PUT",
"prefix": "pgmap",
"key": "full_latest",
"length": 8,
"bl": "0000 : 01 00 00 00 00 00 00 00 : ........\n"},
{ "op_num": 2,
"type": "PUT",
"prefix": "pgmap",
"key": "2",
"length": 4943790,
"bl": "0000 : 06 05 a8 6f 4b 00 02 00 00 00 00 00 00 00 18 36 : ...oK..........6
[snipped lots more of the bufferlist, then]
{ "op_num": 3,
"type": "PUT",
"prefix": "pgmap",
"key": "last_committed",
"length": 8,
"bl": "0000 : 02 00 00 00 00 00 00 00 : ........\n"}]}
2013-04-01 14:04:28.096284 7ffff7fbe780 10 mon.cs31@0(leader).paxos(paxos preparing update c 1..3) begin for 4 4943990 bytes
As you suggested, I see that sort of thing for every "propose_queued list_proposals" entry in my logs.
-- Jim
>
>
> -Joao
>
>
* Re: Trouble with paxos service for large PG count
From: Jim Schutt @ 2013-04-02 19:16 UTC
To: Joao Luis; +Cc: ceph-devel
On 04/02/2013 12:28 PM, Joao Luis wrote:
> Right. I'll push a patch to bump that sort of output to 30 when I get home.
Thanks - but FWIW, I don't think it's the root cause of my
issue -- more below....
>
> If you're willing, try reducing the paxos debug level to 0 and let us know
> if those delays in the proposal go away.
When I started debugging why I couldn't start new filesystems, several
days ago, I wasn't using any debugging. It took me until last Friday
to track the issue into paxos, which is when I turned on its debugging.
Also, I've added the following bits of debug output:
diff --git a/src/mon/Paxos.cc b/src/mon/Paxos.cc
index bf3eb92..70a8b5d 100644
--- a/src/mon/Paxos.cc
+++ b/src/mon/Paxos.cc
@@ -494,6 +494,8 @@ void Paxos::begin(bufferlist& v)
   MonitorDBStore::Transaction t;
   t.put(get_name(), last_committed+1, new_value);
 
+  utime_t tstart = ceph_clock_now(g_ceph_context);
+
   dout(30) << __func__ << " transaction dump:\n";
   JSONFormatter f(true);
   t.dump(&f);
@@ -506,7 +508,14 @@ void Paxos::begin(bufferlist& v)
   f.flush(*_dout);
   *_dout << dendl;
 
+  dout(10) << __func__ << " - JSON formatting, debug_tx decode took "
+           << ceph_clock_now(g_ceph_context) - tstart << dendl;
+
+  tstart = ceph_clock_now(g_ceph_context);
   get_store()->apply_transaction(t);
+  dout(10) << __func__ << " - get_store()->apply_transaction took "
+           << ceph_clock_now(g_ceph_context) - tstart << dendl;
+
 
   if (mon->get_quorum().size() == 1) {
     // we're alone, take it easy
diff --git a/src/mon/PaxosService.cc b/src/mon/PaxosService.cc
index a66c5ec..808ce62 100644
--- a/src/mon/PaxosService.cc
+++ b/src/mon/PaxosService.cc
@@ -158,6 +158,8 @@ void PaxosService::propose_pending()
   encode_pending(&t);
   have_pending = false;
 
+  utime_t tstart = ceph_clock_now(g_ceph_context);
+
   dout(30) << __func__ << " transaction dump:\n";
   JSONFormatter f(true);
   t.dump(&f);
@@ -166,6 +168,9 @@ void PaxosService::propose_pending()
 
   t.encode(bl);
 
+  dout(10) << __func__ << " - JSON formatting, encode took "
+           << ceph_clock_now(g_ceph_context) - tstart << dendl;
+
   // apply to paxos
   proposing.set(1);
   paxos->propose_new_value(bl, new C_Committed(this));
which produces this:
2013-04-02 13:04:22.311743 7ffff7fbe780 10 mon.cs31@0(leader).paxosservice(pgmap) propose_pending
2013-04-02 13:04:22.426810 7ffff7fbe780 10 mon.cs31@0(leader).paxosservice(pgmap) propose_pending - JSON formatting, encode took 0.000170
2013-04-02 13:04:22.426874 7ffff7fbe780 5 mon.cs31@0(leader).paxos(paxos active c 1..3) queue_proposal bl 19775198 bytes; ctx = 0x11e81f0
2013-04-02 13:04:22.426999 7ffff7fbe780 5 mon.cs31@0(leader).paxos(paxos preparing update c 1..3) propose_queued 4 19775198 bytes
2013-04-02 13:04:22.427003 7ffff7fbe780 10 mon.cs31@0(leader).paxos(paxos preparing update c 1..3) propose_queued list_proposals 1 in queue:
2013-04-02 13:07:51.190312 7ffff7fbe780 10 mon.cs31@0(leader).paxos(paxos preparing update c 1..3) begin for 4 19775198 bytes
2013-04-02 13:07:51.190860 7ffff7fbe780 10 mon.cs31@0(leader).paxos(paxos updating c 1..3) begin - JSON formatting, debug_tx decode took 0.000002
2013-04-02 13:07:51.402134 7ffff7fbe780 10 mon.cs31@0(leader).paxos(paxos updating c 1..3) begin - get_store()->apply_transaction took 0.211275
2013-04-02 13:07:51.402178 7ffff7fbe780 10 mon.cs31@0(leader).paxos(paxos updating c 1..3) commit 4
2013-04-02 13:07:51.635113 7ffff7fbe780 10 mon.cs31@0(leader).paxos(paxos active c 1..4) finish_queued_proposal finishing proposal
2013-04-02 13:07:51.635152 7ffff7fbe780 10 mon.cs31@0(leader).paxos(paxos active c 1..4) finish_queued_proposal finish it (proposal = 0x1261340)
2013-04-02 13:07:51.635159 7ffff7fbe780 10 mon.cs31@0(leader).paxos(paxos active c 1..4) finish_queued_proposal proposal took 209.208162 to finish
2013-04-02 13:07:51.635193 7ffff7fbe780 10 mon.cs31@0(leader).paxosservice(pgmap) _active
I thought I was timing the same JSON formatting in propose_pending()
and begin(), but they take very different amounts of time.
I don't understand why yet...
Also, I just noticed the elapsed time between these two lines:
2013-04-02 13:04:22.427003 7ffff7fbe780 10 mon.cs31@0(leader).paxos(paxos preparing update c 1..3) propose_queued list_proposals 1 in queue:
2013-04-02 13:07:51.190312 7ffff7fbe780 10 mon.cs31@0(leader).paxos(paxos preparing update c 1..3) begin for 4 19775198 bytes
It accounts for nearly the whole of the delay that finish_queued_proposal()
is reporting. Why did begin() take so long to start, I wonder?
-- Jim
>
> -Joao
* Re: Trouble with paxos service for large PG count
From: Jim Schutt @ 2013-04-02 20:36 UTC
To: Jim Schutt; +Cc: Joao Luis, ceph-devel
On 04/02/2013 01:16 PM, Jim Schutt wrote:
> On 04/02/2013 12:28 PM, Joao Luis wrote:
>> Right. I'll push a patch to bump that sort of output to 30 when I get home.
>
> Thanks - but FWIW, I don't think it's the root cause of my
> issue -- more below....
OK, I see now that you're talking about the output from list_proposals().
Indeed, when I change propose_queued() to call list_proposals() only at
paxos debug 30, the stall in finish_queued_proposal() mostly goes away.
Now, with 55392 PGs I'm seeing this:
2013-04-02 14:21:19.401820 7ffff328d700 10 mon.cs31@0(leader).paxosservice(pgmap) propose_pending
2013-04-02 14:21:19.535731 7ffff328d700 5 mon.cs31@0(leader).paxos(paxos active c 1..33) queue_proposal bl 19870402 bytes; ctx = 0x11ecff0
2013-04-02 14:21:19.535917 7ffff328d700 5 mon.cs31@0(leader).paxos(paxos preparing update c 1..33) propose_queued 34 19870402 bytes
2013-04-02 14:21:19.535925 7ffff328d700 10 mon.cs31@0(leader).paxos(paxos preparing update c 1..33) begin for 34 19870402 bytes
2013-04-02 14:21:19.754326 7ffff328d700 10 mon.cs31@0(leader).paxos(paxos updating c 1..33) commit 34
2013-04-02 14:21:19.975747 7ffff328d700 10 mon.cs31@0(leader).paxos(paxos active c 1..34) finish_queued_proposal finishing proposal
2013-04-02 14:21:19.975799 7ffff328d700 10 mon.cs31@0(leader).paxos(paxos active c 1..34) finish_queued_proposal finish it (proposal = 0x790c0c0)
2013-04-02 14:21:19.975802 7ffff328d700 10 mon.cs31@0(leader).paxos(paxos active c 1..34) finish_queued_proposal proposal took 0.439887 to finish
2013-04-02 14:21:19.975913 7ffff328d700 10 mon.cs31@0(leader).paxosservice(pgmap) _active
>>
>> If you're willing, try reducing the paxos debug level to 0 and let us know
>> if those delays in the proposal go away.
>
> When I started debugging why I couldn't start new filesystems, several
> days ago, I wasn't using any debugging. It took me until last Friday
> to track the issue into paxos, which is when I turned on its debugging.
>
With that list_proposals() debug output issue out of the way, I'm
still having trouble starting up a new filesystem -- still working
to isolate the source of the issue...
Thanks -- Jim
* Re: Trouble with paxos service for large PG count
From: Joao Eduardo Luis @ 2013-04-02 23:24 UTC
To: Jim Schutt; +Cc: ceph-devel
On 04/02/2013 09:36 PM, Jim Schutt wrote:
> On 04/02/2013 01:16 PM, Jim Schutt wrote:
>> On 04/02/2013 12:28 PM, Joao Luis wrote:
>>> Right. I'll push a patch to bump that sort of output to 30 when I get home.
>>
>> Thanks - but FWIW, I don't think it's the root cause of my
>> issue -- more below....
>
> OK, I see now that you're talking about the output from list_proposals().
> Indeed, when I cause list_proposals() in propose_queued() to only
> be called at paxos debug 30, the stall in finish_queued_proposal()
> mostly goes away. Now, with 55392 PGs I'm seeing this:
>
> 2013-04-02 14:21:19.401820 7ffff328d700 10 mon.cs31@0(leader).paxosservice(pgmap) propose_pending
> 2013-04-02 14:21:19.535731 7ffff328d700 5 mon.cs31@0(leader).paxos(paxos active c 1..33) queue_proposal bl 19870402 bytes; ctx = 0x11ecff0
> 2013-04-02 14:21:19.535917 7ffff328d700 5 mon.cs31@0(leader).paxos(paxos preparing update c 1..33) propose_queued 34 19870402 bytes
> 2013-04-02 14:21:19.535925 7ffff328d700 10 mon.cs31@0(leader).paxos(paxos preparing update c 1..33) begin for 34 19870402 bytes
> 2013-04-02 14:21:19.754326 7ffff328d700 10 mon.cs31@0(leader).paxos(paxos updating c 1..33) commit 34
> 2013-04-02 14:21:19.975747 7ffff328d700 10 mon.cs31@0(leader).paxos(paxos active c 1..34) finish_queued_proposal finishing proposal
> 2013-04-02 14:21:19.975799 7ffff328d700 10 mon.cs31@0(leader).paxos(paxos active c 1..34) finish_queued_proposal finish it (proposal = 0x790c0c0)
> 2013-04-02 14:21:19.975802 7ffff328d700 10 mon.cs31@0(leader).paxos(paxos active c 1..34) finish_queued_proposal proposal took 0.439887 to finish
> 2013-04-02 14:21:19.975913 7ffff328d700 10 mon.cs31@0(leader).paxosservice(pgmap) _active
>
>>>
>>> If you're willing, try reducing the paxos debug level to 0 and let us know
>>> if those delays in the proposal go away.
>>
>> When I started debugging why I couldn't start new filesystems, several
>> days ago, I wasn't using any debugging. It took me until last Friday
>> to track the issue into paxos, which is when I turned on its debugging.
>>
>
> With that list_proposals() debug output issue out of the way, I'm
> still having trouble starting up a new filesystem -- still working
> to try to isolate the source of the issue....
>
> Thanks -- Jim
>
>
Good to know!
With regard to your troubles, what appears to be going wrong?
-Joao