From: Ning Yao
Subject: Re: crc error when decode_message?
Date: Wed, 18 Mar 2015 10:52:14 +0800
To: Gregory Farnum
Cc: Haomai Wang, Xinze Chi, "ceph-devel@vger.kernel.org"

Thanks, everyone. I got the idea.

Regards
Ning Yao

2015-03-17 21:58 GMT+08:00 Gregory Farnum:
> On Tue, Mar 17, 2015 at 6:46 AM, Sage Weil wrote:
>> On Tue, 17 Mar 2015, Ning Yao wrote:
>>> 2015-03-16 22:06 GMT+08:00 Haomai Wang:
>>> > On Mon, Mar 16, 2015 at 10:04 PM, Xinze Chi wrote:
>>> >> How is the write request processed on the primary?
>>> >>
>>> >> Thanks.
>>> >>
>>> >> 2015-03-16 22:01 GMT+08:00 Haomai Wang:
>>> >>> AFAIR, Pipe and AsyncConnection will both mark themselves faulted
>>> >>> and shut down the socket, and the peer will detect this reset. So
>>> >>> each side has a chance to rebuild the session.
>>> >>>
>>> >>> On Mon, Mar 16, 2015 at 9:19 PM, Xinze Chi wrote:
>>> >>>> Say a client sends a write request to osd.0 (the primary), and
>>> >>>> osd.0 sends MOSDSubOp to osd.1 and osd.2.
>>> >>>>
>>> >>>> osd.1 sends a reply to osd.0 (the primary), but an accident happens:
>>> >>>>
>>> >>>> 1. decode_message hits a crc error when decoding the reply msg,
>>> >>>> or
>>> >>>> 2. the reply msg is lost on the way to osd.0, so osd.0 never
>>> >>>>    receives the reply msg.
>>> >>>>
>>> >>>> Could anyone tell me what the behavior of osd.0 (the primary) is?
>>> >>>>
>>> >
>>> > osd.0 and osd.1 will both try to reconnect to the peer side, and the
>>> > lost message will be resent from osd.1 to osd.0.
>>> So I wonder: if a different routing path delays the arrival of one
>>> message, then in_seq will already have been advanced past it, and by
>>> that logic, when the delayed message finally arrives it will be
>>> dropped and discarded. So if it is just a sub_op reply message, as
>>> Xinze describes, how does Ceph proceed after that? It seems the repop
>>> for the write op would wait indefinitely until the osd restarts?
>>
>> These sorts of scenarios are why src/msg/simple/Pipe.cc (and in
>> particular, accept()) is not so simple. The case you describe is
>>
>> https://github.com/ceph/ceph/blob/master/src/msg/simple/Pipe.cc#L492
>>
>> In other words, this is all masked by the Messenger layer, so that the
>> higher layers (OSD.cc etc.) see a single, ordered, reliable stream of
>> messages, and all of the failure/retry/reconnect logic is hidden.
>
> Just to be clear, that's for the originally described case of
> reconnecting. The different-routing-paths stuff is all handled by TCP
> underneath us, which is one of the reasons we use it. ;)
> -Greg
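
For readers following the thread: the behavior Haomai and Sage describe
boils down to per-connection sequence numbers. Below is a minimal C++
sketch of that idea only, not the actual Pipe.cc code; the names here
(Session, write_to_socket, handle_reconnect) and the simplified
handshake are hypothetical, and the real accept() path handles many
more races.

    #include <cstdint>
    #include <deque>
    #include <iostream>
    #include <memory>

    // Wire message with the per-connection sequence number that drives
    // replay and dedup. Payload omitted.
    struct Message {
        uint64_t seq = 0;
    };

    // Stand-in for writing a message's wire encoding to the socket.
    void write_to_socket(const Message& m) {
        std::cout << "sending seq " << m.seq << "\n";
    }

    struct Session {
        uint64_t out_seq = 0;                       // last seq we assigned
        uint64_t in_seq = 0;                        // last seq delivered in order
        std::deque<std::shared_ptr<Message>> sent;  // unacked, kept for replay

        // Sender side: assign the next seq and keep the message until acked.
        // On a crc error or TCP reset the socket is torn down, but the
        // message stays queued here so it can be replayed after reconnect.
        void send(std::shared_ptr<Message> m) {
            m->seq = ++out_seq;
            sent.push_back(m);
            write_to_socket(*m);
        }

        // Receiver side: a seq at or below in_seq was already handed to the
        // OSD layer before the fault, so a retransmitted copy is dropped.
        bool accept(const Message& m) {
            if (m.seq <= in_seq)
                return false;   // duplicate from replay; discard silently
            in_seq = m.seq;
            return true;        // deliver upward, in order
        }

        // Reconnect handshake: the peer reports the last seq it received;
        // anything it already has is trimmed, the rest is replayed in order.
        void handle_reconnect(uint64_t peer_in_seq) {
            while (!sent.empty() && sent.front()->seq <= peer_in_seq)
                sent.pop_front();
            for (const auto& m : sent)
                write_to_socket(*m);
        }
    };

    int main() {
        Session osd1;                     // sender, e.g. osd.1
        Session osd0;                     // receiver, e.g. osd.0
        auto reply = std::make_shared<Message>();
        osd1.send(reply);                 // the sub_op reply goes out
        // Suppose osd.0's decode_message hit a crc error: the connection
        // faults, both sides reconnect, and osd.0 reports in_seq = 0, so
        // the reply is replayed rather than lost.
        osd1.handle_reconnect(osd0.in_seq);
        osd0.accept(*reply);              // delivered exactly once
    }

After a fault, the sender still holds everything past the peer's
acknowledged in_seq and replays it on reconnect, while the receiver's
in_seq check drops retransmitted duplicates. That is how the sub_op
reply in Xinze's example eventually reaches osd.0 without OSD.cc ever
observing the failure.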