All of lore.kernel.org
 help / color / mirror / Atom feed
From: Willem Jan Withagen <wjw@digiware.nl>
To: Ceph Development <ceph-devel@vger.kernel.org>,
	Haomai Wang <haomai@xsky.com>
Subject: Re: Not matching event states in ./msg/async/AsyncConnection.cc
Date: Wed, 22 Jun 2016 15:47:55 +0200	[thread overview]
Message-ID: <a95dcbdc-2447-076a-516d-bb717d58ca0c@digiware.nl> (raw)
In-Reply-To: <0a8a87d5-777c-7e78-0400-cf1978217928@digiware.nl>

This time also to the list.

On 20-6-2016 04:40, Haomai Wang wrote:
> On Mon, Jun 20, 2016 at 5:28 AM, Willem Jan Withagen <wjw@digiware.nl> wrote:
>> Hi,
>>
>> I 've just rebased, but now I'm getting BUG warnings in del_event()
>> calls. And I suspect that it is because of the snippet below...
>>
>> In STATE_CONNECTING EVENT_READABLE is set on sd, but we try a bit latter
>> to remove EVENT_WRITABLE. Which abort since asserts were added to the
>> kevent code.
>>
>> Could it be that in STATE_CONNECTING_RE EVENT_READABLE needs to be removed?
> 
> https://github.com/ceph/ceph/pull/9086/commits/a74ce419133881ff8618733a0501c4a47e1368e3

'mmmm,

That piece of code is already in the code that I'm testing.
So that could be the source of problems. I don't know.
Not going to pretend I understand anything of the statmachine for
setting up connections.

>> ./msg/async/AsyncConnection.cc at line 1014
>>     case STATE_CONNECTING_RE:
>>       {
>>         r = net.reconnect(get_peer_addr(), sd);
>>         if (r < 0) {
>>           ldout(async_msgr->cct, 1) << __func__ << " reconnect failed "
>> << dendl;
>>           goto fail;
>>         } else if (r > 0) {
>>           ldout(async_msgr->cct, 10) << __func__ << " nonblock connect
>> inprogress" << dendl;
>>           center->create_file_event(sd, EVENT_WRITABLE, read_handler);
>>           break;
>>         }
>>
>>       lderr(async_msgr->cct) << __func__ << ":" <<__LINE__
>>         << " delete_file_event(" << sd << ", EVENT_WRITABLE)"
>>         << dendl;
>>         center->delete_file_event(sd, EVENT_WRITABLE);
>>         state = STATE_CONNECTING_WAIT_BANNER;
>>         break;
>>       }

If I put debug code in Event and {Epoll,Kqueue}Event, and the trace for
this file handle is:
2016-06-22 12:54:59.183258 811653300 -1 Event(0x81164f448
owner=0x811653300 nevent=5000 time_id=1).create_file_event create event
started fd=13 mask=1 original mask is 0
2016-06-22 12:54:59.183279 811653300 -1 KqueueDriver.add_event add event
fd = 13 to kqfd = 10 cur_mask = 0 add_mask = 1
2016-06-22 12:54:59.183286 811653300 -1 Event(0x81164f448
owner=0x811653300 nevent=5000 time_id=1).create_file_event create event
end fd=13 mask=1 original mask is 1
2016-06-22 12:54:59.183293 811653300 -1 Event(0x81164f448
owner=0x811653300 nevent=5000 time_id=1).delete_file_event delete event
started fd=13 mask=2 original mask is 1
2016-06-22 12:54:59.183304 811653300 -1 KqueueDriver.del_event delete
event fd = 13 to kqfd = 10 cur_mask = 1 delmask = 2
2016-06-22 12:54:59.183308 811653300 -1 KqueueDriver.del_event unable to
delete event: (2) No such file or directory. for handle: 13 kqfd handle: 10

And then we hit assert(BUG)

Looking at the e_poll output from Linux, it looks like:
2016-06-22 15:28:15.237564 7ff58d7fa700 -1 Event(0x7ff5900b1038
owner=140692617668352 nevent=5000 time_id=1).create_file_event create
event started fd=13 mask=1 original mask is 0
2016-06-22 15:28:15.237595 7ff58d7fa700 -1 Event(0x7ff5900b1038
owner=140692617668352 nevent=5000 time_id=1).create_file_event create
event end fd=13 mask=1 original mask is 1
2016-06-22 15:28:15.237609 7ff58d7fa700 -1 Event(0x7ff5900b1038
owner=140692617668352 nevent=5000 time_id=1).delete_file_event delete
event started fd=13 mask=2 original mask is 1
2016-06-22 15:28:15.237613 7ff58d7fa700 -1 EpollDriver.del_event delete
event fd = 13 to epfd = 10 cur_mask = 1 delmask = 2

The access pattern is the same, but that does not generate an error.

I think the error message actually means that there is no event in the
Keventqueue, that actually matches. And that is because EVENT_WRITABLE
is asked to be removed, but the event actually only has EVENT_READ
activated. So the EV_DELETE fails.

I have not tried this trace on Linux to see what is the result there.
Perhaps Linux e_poll stuff does not really care if you want a
non-existent filter?
But the assert is rather detrimental on FreeBSD code :(

But the other strange part for me is that the code tries to remove
events that are really not inserted....
Wether that is a problem in the current code? I do not know, but it
certainly strange.

--WjW



       reply	other threads:[~2016-06-22 13:48 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <0a8a87d5-777c-7e78-0400-cf1978217928@digiware.nl>
2016-06-22 13:47 ` Willem Jan Withagen [this message]
2016-06-22 14:36   ` Not matching event states in ./msg/async/AsyncConnection.cc Haomai Wang
2016-06-22 15:25     ` Willem Jan Withagen
2016-06-22 22:09       ` Willem Jan Withagen
2016-06-22 22:16     ` Willem Jan Withagen
2016-06-23  7:22       ` Willem Jan Withagen
2016-06-23  7:44         ` Haomai Wang
2016-06-23  9:40           ` Willem Jan Withagen
2016-06-23 12:28             ` Haomai Wang
2016-06-23 13:16               ` Willem Jan Withagen
2016-06-23 15:16                 ` Willem Jan Withagen
2016-06-19 21:28 Willem Jan Withagen
2016-06-20  2:40 ` Haomai Wang

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=a95dcbdc-2447-076a-516d-bb717d58ca0c@digiware.nl \
    --to=wjw@digiware.nl \
    --cc=ceph-devel@vger.kernel.org \
    --cc=haomai@xsky.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.