All of lore.kernel.org
 help / color / mirror / Atom feed
From: Andres Freund <andres@anarazel.de>
To: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Jens Axboe <axboe@kernel.dk>,
	"Martin K. Petersen" <martin.petersen@oracle.com>,
	netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
	virtualization@lists.linux-foundation.org,
	James Bottomley <James.Bottomley@hansenpartnership.com>,
	Eric Dumazet <edumazet@google.com>,
	Greg KH <gregkh@linuxfoundation.org>,
	c@redhat.com, Jakub Kicinski <kuba@kernel.org>,
	Paolo Abeni <pabeni@redhat.com>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	"David S. Miller" <davem@davemloft.net>,
	Guenter Roeck <linux@roeck-us.net>
Subject: Re: upstream kernel crashes
Date: Mon, 15 Aug 2022 13:53:30 -0700	[thread overview]
Message-ID: <20220815205330.m54g7vcs77r6owd6@awork3.anarazel.de> (raw)
In-Reply-To: <20220815161423-mutt-send-email-mst@kernel.org>

Hi,

On 2022-08-15 16:21:51 -0400, Michael S. Tsirkin wrote:
> On Mon, Aug 15, 2022 at 10:46:17AM -0700, Andres Freund wrote:
> > Hi,
> >
> > On 2022-08-15 12:50:52 -0400, Michael S. Tsirkin wrote:
> > > On Mon, Aug 15, 2022 at 09:45:03AM -0700, Andres Freund wrote:
> > > > Hi,
> > > >
> > > > On 2022-08-15 11:40:59 -0400, Michael S. Tsirkin wrote:
> > > > > OK so this gives us a quick revert as a solution for now.
> > > > > Next, I would appreciate it if you just try this simple hack.
> > > > > If it crashes we either have a long standing problem in virtio
> > > > > code or more likely a gcp bug where it can't handle smaller
> > > > > rings than what device requestes.
> > > > > Thanks!
> > > >
> > > > I applied the below and the problem persists.
> > > >
> > > > [...]
> > >
> > > Okay!
> >
> > Just checking - I applied and tested this atop 6.0-rc1, correct? Or did you
> > want me to test it with the 762faee5a267 reverted? I guess what you're trying
> > to test if a smaller queue than what's requested you'd want to do so without
> > the problematic patch applied...
> >
> >
> > Either way, I did this, and there are no issues that I could observe. No
> > oopses, no broken networking. But:
> >
> > To make sure it does something I added a debugging printk - which doesn't show
> > up. I assume this is at a point at least earlyprintk should work (which I see
> > getting enabled via serial)?
> >

> Sorry if I was unclear.  I wanted to know whether the change somehow
> exposes a driver bug or a GCP bug. So what I wanted to do is to test
> this patch on top of *5.19*, not on top of the revert.

Right, the 5.19 part was clear, just the earlier test:

> > > > On 2022-08-15 11:40:59 -0400, Michael S. Tsirkin wrote:
> > > > > OK so this gives us a quick revert as a solution for now.
> > > > > Next, I would appreciate it if you just try this simple hack.
> > > > > If it crashes we either have a long standing problem in virtio
> > > > > code or more likely a gcp bug where it can't handle smaller
> > > > > Thanks!

I wasn't sure about.

After I didn't see any effect on 5.19 + your patch, I grew a bit suspicious
and added the printks.


> Yes I think printk should work here.

The reason the debug patch didn't change anything, and that my debug printk
didn't show, is that gcp uses the legacy paths...

If there were a bug in the legacy path, it'd explain why the problem only
shows on gcp, and not in other situations.

I'll queue testing the legacy path with the equivalent change.

- Andres


Greetings,

Andres Freund
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

WARNING: multiple messages have this Message-ID (diff)
From: Andres Freund <andres@anarazel.de>
To: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Xuan Zhuo <xuanzhuo@linux.alibaba.com>,
	Jason Wang <jasowang@redhat.com>,
	"David S. Miller" <davem@davemloft.net>,
	Eric Dumazet <edumazet@google.com>,
	Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
	virtualization@lists.linux-foundation.org,
	netdev@vger.kernel.org,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Jens Axboe <axboe@kernel.dk>,
	James Bottomley <James.Bottomley@hansenpartnership.com>,
	"Martin K. Petersen" <martin.petersen@oracle.com>,
	Guenter Roeck <linux@roeck-us.net>,
	linux-kernel@vger.kernel.org,
	Greg KH <gregkh@linuxfoundation.org>,
	c@redhat.com
Subject: Re: upstream kernel crashes
Date: Mon, 15 Aug 2022 13:53:30 -0700	[thread overview]
Message-ID: <20220815205330.m54g7vcs77r6owd6@awork3.anarazel.de> (raw)
In-Reply-To: <20220815161423-mutt-send-email-mst@kernel.org>

Hi,

On 2022-08-15 16:21:51 -0400, Michael S. Tsirkin wrote:
> On Mon, Aug 15, 2022 at 10:46:17AM -0700, Andres Freund wrote:
> > Hi,
> >
> > On 2022-08-15 12:50:52 -0400, Michael S. Tsirkin wrote:
> > > On Mon, Aug 15, 2022 at 09:45:03AM -0700, Andres Freund wrote:
> > > > Hi,
> > > >
> > > > On 2022-08-15 11:40:59 -0400, Michael S. Tsirkin wrote:
> > > > > OK so this gives us a quick revert as a solution for now.
> > > > > Next, I would appreciate it if you just try this simple hack.
> > > > > If it crashes we either have a long standing problem in virtio
> > > > > code or more likely a gcp bug where it can't handle smaller
> > > > > rings than what device requestes.
> > > > > Thanks!
> > > >
> > > > I applied the below and the problem persists.
> > > >
> > > > [...]
> > >
> > > Okay!
> >
> > Just checking - I applied and tested this atop 6.0-rc1, correct? Or did you
> > want me to test it with the 762faee5a267 reverted? I guess what you're trying
> > to test if a smaller queue than what's requested you'd want to do so without
> > the problematic patch applied...
> >
> >
> > Either way, I did this, and there are no issues that I could observe. No
> > oopses, no broken networking. But:
> >
> > To make sure it does something I added a debugging printk - which doesn't show
> > up. I assume this is at a point at least earlyprintk should work (which I see
> > getting enabled via serial)?
> >

> Sorry if I was unclear.  I wanted to know whether the change somehow
> exposes a driver bug or a GCP bug. So what I wanted to do is to test
> this patch on top of *5.19*, not on top of the revert.

Right, the 5.19 part was clear, just the earlier test:

> > > > On 2022-08-15 11:40:59 -0400, Michael S. Tsirkin wrote:
> > > > > OK so this gives us a quick revert as a solution for now.
> > > > > Next, I would appreciate it if you just try this simple hack.
> > > > > If it crashes we either have a long standing problem in virtio
> > > > > code or more likely a gcp bug where it can't handle smaller
> > > > > Thanks!

I wasn't sure about.

After I didn't see any effect on 5.19 + your patch, I grew a bit suspicious
and added the printks.


> Yes I think printk should work here.

The reason the debug patch didn't change anything, and that my debug printk
didn't show, is that gcp uses the legacy paths...

If there were a bug in the legacy path, it'd explain why the problem only
shows on gcp, and not in other situations.

I'll queue testing the legacy path with the equivalent change.

- Andres


Greetings,

Andres Freund

  reply	other threads:[~2022-08-15 21:01 UTC|newest]

Thread overview: 62+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-08-14 21:26 upstream kernel crashes Guenter Roeck
2022-08-14 21:40 ` Linus Torvalds
2022-08-14 22:37   ` Andres Freund
2022-08-14 22:47     ` Linus Torvalds
2022-08-15  1:04       ` Jens Axboe
2022-08-15  1:36         ` Andres Freund
2022-08-15  3:18           ` Linus Torvalds
2022-08-15  7:11             ` Andres Freund
2022-08-15  7:29               ` Michael S. Tsirkin
2022-08-15  7:46                 ` Andres Freund
2022-08-15  7:53                   ` Michael S. Tsirkin
2022-08-15  8:02                   ` Michael S. Tsirkin
2022-08-15  8:02                     ` Michael S. Tsirkin
2022-08-15  7:51               ` Michael S. Tsirkin
2022-08-15  8:15                 ` Andres Freund
2022-08-15  8:28                   ` Michael S. Tsirkin
2022-08-15  8:34                     ` Andres Freund
2022-08-15 15:40                       ` Michael S. Tsirkin
2022-08-15 15:40                         ` Michael S. Tsirkin
2022-08-15 16:45                         ` Andres Freund
2022-08-15 16:45                           ` Andres Freund
2022-08-15 16:50                           ` Michael S. Tsirkin
2022-08-15 16:50                             ` Michael S. Tsirkin
2022-08-15 17:46                             ` Andres Freund
2022-08-15 17:46                               ` Andres Freund
2022-08-15 20:21                               ` Michael S. Tsirkin
2022-08-15 20:21                                 ` Michael S. Tsirkin
2022-08-15 20:53                                 ` Andres Freund [this message]
2022-08-15 20:53                                   ` Andres Freund
2022-08-15 21:04                                   ` Andres Freund
2022-08-15 21:04                                     ` Andres Freund
2022-08-15 21:10                                     ` Andres Freund
2022-08-15 21:10                                       ` Andres Freund
2022-08-15 21:32                                   ` Michael S. Tsirkin
2022-08-15 21:32                                     ` Michael S. Tsirkin
2022-08-16  2:45                                     ` Xuan Zhuo
2022-08-16  2:45                                       ` Xuan Zhuo
2022-08-17  6:13                                     ` Dmitry Vyukov
2022-08-17  6:13                                       ` Dmitry Vyukov via Virtualization
2022-08-17  6:36                                       ` Xuan Zhuo
2022-08-17  6:36                                         ` Xuan Zhuo
2022-08-17 10:53                                         ` Michael S. Tsirkin
2022-08-17 10:53                                           ` Michael S. Tsirkin
2022-08-17 15:58                                         ` Linus Torvalds
2022-08-17 15:58                                           ` Linus Torvalds
2022-08-18  1:55                                           ` Xuan Zhuo
2022-08-18  1:55                                             ` Xuan Zhuo
2022-08-15 20:45                             ` Guenter Roeck
2022-08-15 20:45                               ` Guenter Roeck
2022-08-15  6:36           ` Michael S. Tsirkin
2022-08-15  7:17             ` Andres Freund
2022-08-15  7:43               ` Michael S. Tsirkin
2022-08-15  1:17       ` Guenter Roeck
2022-08-15  1:29         ` Jens Axboe
2022-08-15  9:43 ` Michael S. Tsirkin
2022-08-15 15:49   ` Guenter Roeck
2022-08-15 16:01     ` Michael S. Tsirkin
2022-08-15 18:22       ` Guenter Roeck
2022-08-15 18:37         ` Linus Torvalds
2022-08-15 20:38           ` Guenter Roeck
2022-08-17 17:12 ` Linus Torvalds
2022-08-18  1:08   ` Andres Freund

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20220815205330.m54g7vcs77r6owd6@awork3.anarazel.de \
    --to=andres@anarazel.de \
    --cc=James.Bottomley@hansenpartnership.com \
    --cc=axboe@kernel.dk \
    --cc=c@redhat.com \
    --cc=davem@davemloft.net \
    --cc=edumazet@google.com \
    --cc=gregkh@linuxfoundation.org \
    --cc=kuba@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux@roeck-us.net \
    --cc=martin.petersen@oracle.com \
    --cc=mst@redhat.com \
    --cc=netdev@vger.kernel.org \
    --cc=pabeni@redhat.com \
    --cc=torvalds@linux-foundation.org \
    --cc=virtualization@lists.linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.