From: "Xinze Chi (信泽)" <xmdxcxz@gmail.com>
To: Gregory Farnum <gfarnum@redhat.com>,
	"ceph-devel@vger.kernel.org" <ceph-devel@vger.kernel.org>
Subject: Re: some issue about peering progress
Date: Thu, 2 Nov 2017 11:35:15 +0800
Message-ID: <CANE=7sVop6coM0oTppBxK=CWH5B1XQQ9AHpd-MpAoxrb-Q1gFQ@mail.gmail.com>
In-Reply-To: <CAJ4mKGaFP+5vePBNzm0fk3pqONirfCyGhrGynZ4BTD3AwfNMuw@mail.gmail.com>

Does the Stray set send_notify to false only when it goes to activate?
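
For reference, the flag behaviour Greg describes in the quoted text below (set to true on entering a new peering interval, set to false once the info has been shared) can be modelled with a small C++ sketch. This is a toy model under those stated assumptions, not the Ceph source: ToyStrayPG, start_peering_interval and on_act_map are hypothetical names, and only send_notify and should_send_notify appear in the thread.

```cpp
#include <cassert>

// Toy model of the send_notify flag as described in this thread.
// NOT Ceph code: ToyStrayPG, start_peering_interval and on_act_map
// are hypothetical names used only for illustration.
struct ToyStrayPG {
    bool send_notify = false;  // flag checked before sending a notify
    int notifies_sent = 0;     // counts notifies sent to the primary

    // Assumption from the thread: entering a new peering interval
    // sets the flag to true.
    void start_peering_interval() { send_notify = true; }

    bool should_send_notify() const { return send_notify; }

    // Models the guarded block in Stray::react(const ActMap&):
    // a notify goes out only while the flag is set, and sharing
    // the info clears the flag.
    void on_act_map() {
        if (should_send_notify()) {
            ++notifies_sent;
            send_notify = false;
        }
    }
};
```

Under this model, a second ActMap within the same peering interval sends nothing, because the flag was already cleared when the first notify went out. Whether the real code clears the flag at that point or only on activation is exactly the open question above.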

2017-11-02 10:27 GMT+08:00 Gregory Farnum <gfarnum@redhat.com>:
> On Wed, Nov 1, 2017 at 5:27 PM Xinze Chi (信泽) <xmdxcxz@gmail.com> wrote:
>>
>> 2017-11-02 4:26 GMT+08:00 Gregory Farnum <gfarnum@redhat.com>:
>> > On Fri, Oct 27, 2017 at 12:46 AM Xinze Chi (信泽) <xmdxcxz@gmail.com> wrote:
>> >>
>> >> hi, all:
>> >>
>> >>      I am confused about the notify message during peering. For example:
>> >>
>> >>     In epoch 1, the primary OSD goes through Peering, GetInfo and
>> >> GetMissing, and calls proc_replica_log. In this function,
>> >> last_complete and last_update may be reset.
>> >>
>> >>     Before it goes to Activate, the OSDMap changes (the new OSDMap
>> >> does not restart peering), and the non-primary OSD sends a notify
>> >> to the primary.
>> >
>> >
>> > I don't think this can happen. The OSD won't re-send a notify during
>> > the same peering interval, and even if it did the message would be
>> > tagged with a new (higher) epoch so the PG wouldn't process it until
>> > after it had switched states, right?
>> >
>>
>>    I just want to understand this algorithm. When the Stray OSD receives
>> an ActMap, it will send a notify even within the same peering interval.
>> See Stray::react(const ActMap&).
>
> Note the
>
> if (pg->should_send_notify()
>
> check preceding that block. It checks a boolean send_notify value that
> is set true only when it enters a new peering interval, and is set
> false as soon as it shares its info. So I don't think the primary's
> behavior matters at all (other than from a security perspective,
> anyway).
>
>
>>   You say the primary OSD wouldn't process the notify message, but I
>> cannot find the code for that. The primary calls handle_pg_notify and
>> processes it.
>
> I didn't actually track the order of the state machine here; I just
> saw that PG::RecoveryState::Active::react(const MNotifyRec& notevt)
> will throw them out if it's already seen the info. You're right that
> PG::RecoveryState::Primary::react(const MNotifyRec& notevt) will
> process it unconditionally. I'm not sure if those are the replica and
> primary states, or if you move from Primary to Active (or vice versa).
> -Greg
>
>>
>>
>> >>
>> >>
>> >>     When the primary receives the notify, Primary::react(const
>> >> MNotifyRec& notevt) runs and calls proc_replica_info.
>> >>
>> >>     In that function, we update the PG info, including the
>> >> last_complete and last_update that were modified in proc_replica_log.
>> >
>> > Note also that "PG::RecoveryState::Active::react(const MNotifyRec&
>> > notevt)" does *not* unconditionally invoke proc_replica_info(). I
>> > think you were trying to say we hadn't reached this state on receipt
>> > of the message? But as I mentioned above, I think we block so that's
>> > not actually possible either.
>> >
>> >>
>> >>     When the primary calls activate, it drives recovery based on the
>> >> PG info received via the notify instead of the info from
>> >> proc_replica_log.
>> >>
>> >>     So is this a bug?
>> >
>> > Have you seen issues in the wild, or just trying to understand this
>> > code/algorithm? I would be surprised if we had undiscovered issues
>> > here just because our tests exercise peering quite vigorously, but I
>> > might be missing what's happening in my own code skims.
>> > -Greg
>>
>>
>>
>> --
>> Regards,
>> Xinze Chi



-- 
Regards,
Xinze Chi
