Message-ID: <52CC1453.3090804@citrix.com>
Date: Tue, 7 Jan 2014 14:50:59 +0000
From: Zoltan Kiss
To: Wei Liu
Subject: Re: [PATCH net-next v2 1/9] xen-netback: Introduce TX grant map definitions
In-Reply-To: <20131216175036.GB25969@zion.uk.xensource.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On 16/12/13 17:50, Wei Liu wrote:
> On Mon, Dec 16, 2013 at 03:21:40PM +0000, Zoltan Kiss wrote:
> [...]
>>>>>>>
>>>>>>> Should this be BUG_ON? AIUI this kthread should be the only one doing
>>>>>>> unmap, right?
>>>>> The NAPI instance can do it as well if it is a small packet that fits
>>>>> into PKT_PROT_LEN. But still this scenario shouldn't really happen,
>>>>> I was just not sure we have to crash immediately. Maybe handle it as
>>>>> a fatal error and destroy the vif?
>>>>>
>>> It depends. If this is within the trust boundary, i.e. everything at this
>>> stage should have been sanitized, then we should BUG_ON because there's
>>> clearly a bug somewhere in the sanitization process, or in the
>>> interaction of various backend routines.
>>
>> My understanding is that crashing should be avoided if we can bail
>> out somehow. At this point there is clearly a bug in netback
>> somewhere, something unmapped that page before it should have
>> happened, or at least that array got corrupted somehow. However
>> there is a chance that xenvif_fatal_tx_err() can contain the issue,
>> and the rest of the system can go unaffected.
>>
>
> That would make debugging much harder if a crash is caused by a previously
> corrupted array and we pretend we can carry on serving IMHO. Now netback
> has three routines (NAPI, two kthreads) to serve a single vif, the
> interaction among them makes bugs hard to reproduce.

OK, I'll make this a BUG() in the next series.

Zoli