From mboxrd@z Thu Jan  1 00:00:00 1970
From: Wei Liu <wei.liu2@citrix.com>
Subject: Re: [PATCH net-next v2 1/9] xen-netback: Introduce TX
 grant map definitions
Date: Mon, 16 Dec 2013 17:50:36 +0000
Message-ID: <20131216175036.GB25969__36637.8178274774$1387216325$gmane$org@zion.uk.xensource.com>
References: <1386892097-15502-1-git-send-email-zoltan.kiss@citrix.com>
	<1386892097-15502-2-git-send-email-zoltan.kiss@citrix.com>
	<20131213153138.GL21900@zion.uk.xensource.com>
	<52AB506E.3040509@citrix.com>
	<20131213191423.GA12582@zion.uk.xensource.com>
	<52AF1A84.3090304@citrix.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Return-path: <xen-devel-bounces@lists.xen.org>
Received: from mail6.bemta5.messagelabs.com ([195.245.231.135])
	by lists.xen.org with esmtp (Exim 4.72)
	(envelope-from <wei.liu2@citrix.com>) id 1VscJU-0004vf-KN
	for xen-devel@lists.xenproject.org; Mon, 16 Dec 2013 17:50:40 +0000
Content-Disposition: inline
In-Reply-To: <52AF1A84.3090304@citrix.com>
List-Unsubscribe: <http://lists.xen.org/cgi-bin/mailman/options/xen-devel>,
	<mailto:xen-devel-request@lists.xen.org?subject=unsubscribe>
List-Post: <mailto:xen-devel@lists.xen.org>
List-Help: <mailto:xen-devel-request@lists.xen.org?subject=help>
List-Subscribe: <http://lists.xen.org/cgi-bin/mailman/listinfo/xen-devel>,
	<mailto:xen-devel-request@lists.xen.org?subject=subscribe>
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: Zoltan Kiss <zoltan.kiss@citrix.com>
Cc: jonathan.davies@citrix.com, Wei Liu <wei.liu2@citrix.com>, ian.campbell@citrix.com, netdev@vger.kernel.org, linux-kernel@vger.kernel.org, xen-devel@lists.xenproject.org
List-Id: xen-devel@lists.xenproject.org

On Mon, Dec 16, 2013 at 03:21:40PM +0000, Zoltan Kiss wrote:
[...]
> >>>> >
> >>>> >Should this be BUG_ON? AIUI this kthread should be the only one doing
> >>>> >unmap, right?
> >>>The NAPI instance can do it as well if it is a small packet that fits
> >>>into PKT_PROT_LEN. But still this scenario shouldn't really happen,
> >>>I was just not sure we have to crash immediately. Maybe handle it as
> >>>a fatal error and destroy the vif?
> >>>
> >It depends. If this is within the trust boundary, i.e. everything at this
> >stage should have been sanitized, then we should BUG_ON because there's
> >clearly a bug somewhere in the sanitization process, or in the
> >interaction of various backend routines.
> 
> My understanding is that crashing should be avoided if we can bail
> out somehow. At this point there is clearly a bug in netback
> somewhere: something unmapped that page before it should have
> happened, or at least that array got corrupted somehow. However
> there is a chance that xenvif_fatal_tx_err() can contain the issue,
> and the rest of the system can go unaffected.
> 

That would make debugging much harder IMHO if a crash is caused by a
previously corrupted array and we pretend we can carry on serving. Netback
now has three routines (NAPI, two kthreads) serving a single vif, and the
interaction among them makes bugs hard to reproduce.
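
To make the two options concrete, here is a rough userspace sketch. It is
not the actual netback code: toy_vif, toy_unmap(), the policy enum and the
constants are all made up for illustration. POLICY_BUG stands in for
BUG_ON, and POLICY_CONTAIN loosely stands in for the "declare a fatal
error and stop serving the vif" idea behind xenvif_fatal_tx_err().

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

#define MAX_PENDING    256      /* hypothetical number of pending slots */
#define INVALID_HANDLE 0xFFFFu  /* hypothetical "slot is not mapped" marker */

/* Hypothetical, heavily reduced stand-in for the per-vif state. */
struct toy_vif {
	uint16_t grant_handle[MAX_PENDING];
	bool disabled;          /* set once a fatal TX error was declared */
};

enum unmap_policy { POLICY_BUG, POLICY_CONTAIN };

static void toy_vif_init(struct toy_vif *vif)
{
	for (unsigned int i = 0; i < MAX_PENDING; i++)
		vif->grant_handle[i] = INVALID_HANDLE;
	vif->disabled = false;
}

/* Record that slot idx is mapped with the given handle. */
static void toy_map(struct toy_vif *vif, unsigned int idx, uint16_t handle)
{
	vif->grant_handle[idx] = handle;
}

/*
 * Unmap slot idx (caller guarantees idx < MAX_PENDING).
 * Returns 0 on success. If the slot is already unmapped:
 *   POLICY_BUG     - abort on the spot, at the first detected inconsistency
 *   POLICY_CONTAIN - mark the vif disabled and return -1
 */
static int toy_unmap(struct toy_vif *vif, unsigned int idx,
		     enum unmap_policy policy)
{
	if (vif->grant_handle[idx] == INVALID_HANDLE) {
		fprintf(stderr, "slot %u unmapped twice\n", idx);
		if (policy == POLICY_BUG)
			abort();
		vif->disabled = true;
		return -1;
	}
	vif->grant_handle[idx] = INVALID_HANDLE;
	return 0;
}
```

With POLICY_CONTAIN the second unmap of the same slot only flags the vif,
so whatever corrupted the array earlier keeps running and the state at the
time of detection is lost. With POLICY_BUG we stop at the point where the
inconsistency is first seen.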

Wei.