Subject: Re: [PATCH v2] xen-netfront: Fix Rx stall during network stress and OOM
From: Boris Ostrovsky
To: Vineeth Remanan Pillai, linux-kernel@vger.kernel.org
Cc: David Miller, netdev@vger.kernel.org, Wei Liu, Paul Durrant, xen-devel
Date: Mon, 30 Jan 2017 12:06:34 -0500
Message-ID: <30069778-9509-8112-5089-2eea7b679236@oracle.com>
In-Reply-To: <989bd104-13a9-f25f-b857-24ec49781f9c@amazon.com>
References: <1484771149-12699-1-git-send-email-vineethp@u480fcf3b67f557f68df1.ant.amazon.com>
 <66b10c64-936a-8001-6855-2ff1ed626642@amazon.com>
 <38ccfaea-0a65-a6f3-c19a-e6f9c0d4ef76@oracle.com>
 <989bd104-13a9-f25f-b857-24ec49781f9c@amazon.com>

On 01/30/2017 11:47 AM, Vineeth Remanan Pillai wrote:
>
>> 2. It tickles a latent bug during resume where the timer triggers
>> before we re-connect. The trouble is that we now try to dereference
>> queue->rx.sring which is NULL since we disconnect in
>> netfront_resume(). (Curiously, I only observe it with 32-bit guests)
> I think we may hit this bug after removing the timer as well. We call
> RING_PUSH_REQUESTS_AND_CHECK_NOTIFY soon after, which also dereference
> queue->rx.sring.

If the timer is deleted in xennet_disconnect_backend() then why would
anyone be pushing anything to the backend after that?

-boris