From: Jason Wang <jasowang@redhat.com> To: "Michael S. Tsirkin" <mst@redhat.com> Cc: davem@davemloft.net, edumazet@google.com, kuba@kernel.org, pabeni@redhat.com, virtualization@lists.linux-foundation.org, netdev@vger.kernel.org, linux-kernel@vger.kernel.org, maxime.coquelin@redhat.com, alvaro.karsz@solid-run.com, eperezma@redhat.com Subject: Re: [PATCH 3/4] virtio_ring: introduce a per virtqueue waitqueue Date: Wed, 28 Dec 2022 19:53:08 +0800 [thread overview] Message-ID: <CACGkMEv=+D+Es4sfde_X7F0zspVdy4Rs1Wi9qfCudsznsUrOTQ@mail.gmail.com> (raw) In-Reply-To: <0d9f1b89-9374-747b-3fb0-b4b28ad0ace1@redhat.com> On Wed, Dec 28, 2022 at 2:34 PM Jason Wang <jasowang@redhat.com> wrote: > > > 在 2022/12/27 17:38, Michael S. Tsirkin 写道: > > On Tue, Dec 27, 2022 at 05:12:58PM +0800, Jason Wang wrote: > >> 在 2022/12/27 15:33, Michael S. Tsirkin 写道: > >>> On Tue, Dec 27, 2022 at 12:30:35PM +0800, Jason Wang wrote: > >>>>> But device is still going and will later use the buffers. > >>>>> > >>>>> Same for timeout really. > >>>> Avoiding infinite wait/poll is one of the goals, another is to sleep. > >>>> If we think the timeout is hard, we can start from the wait. > >>>> > >>>> Thanks > >>> If the goal is to avoid disrupting traffic while CVQ is in use, > >>> that sounds more reasonable. E.g. someone is turning on promisc, > >>> a spike in CPU usage might be unwelcome. > >> > >> Yes, this would be more obvious is UP is used. > >> > >> > >>> things we should be careful to address then: > >>> 1- debugging. Currently it's easy to see a warning if CPU is stuck > >>> in a loop for a while, and we also get a backtrace. > >>> E.g. with this - how do we know who has the RTNL? > >>> We need to integrate with kernel/watchdog.c for good results > >>> and to make sure policy is consistent. > >> > >> That's fine, will consider this. So after some investigation, it seems the watchdog.c doesn't help. The only export helper is touch_softlockup_watchdog() which tries to avoid triggering the lockups warning for the known slow path. And before the patch, we end up with a real infinite loop which could be caught by RCU stall detector which is not the case of the sleep. What we can do is probably do a periodic netdev_err(). Thanks > >> > >> > >>> 2- overhead. In a very common scenario when device is in hypervisor, > >>> programming timers etc has a very high overhead, at bootup > >>> lots of CVQ commands are run and slowing boot down is not nice. > >>> let's poll for a bit before waiting? > >> > >> Then we go back to the question of choosing a good timeout for poll. And > >> poll seems problematic in the case of UP, scheduler might not have the > >> chance to run. > > Poll just a bit :) Seriously I don't know, but at least check once > > after kick. > > > I think it is what the current code did where the condition will be > check before trying to sleep in the wait_event(). > > > > > >>> 3- suprise removal. need to wake up thread in some way. what about > >>> other cases of device breakage - is there a chance this > >>> introduces new bugs around that? at least enumerate them please. > >> > >> The current code did: > >> > >> 1) check for vq->broken > >> 2) wakeup during BAD_RING() > >> > >> So we won't end up with a never woke up process which should be fine. > >> > >> Thanks > > > > BTW BAD_RING on removal will trigger dev_err. Not sure that is a good > > idea - can cause crashes if kernel panics on error. > > > Yes, it's better to use __virtqueue_break() instead. > > But consider we will start from a wait first, I will limit the changes > in virtio-net without bothering virtio core. > > Thanks > > > > > >>>
WARNING: multiple messages have this Message-ID (diff)
From: Jason Wang <jasowang@redhat.com> To: "Michael S. Tsirkin" <mst@redhat.com> Cc: netdev@vger.kernel.org, linux-kernel@vger.kernel.org, virtualization@lists.linux-foundation.org, eperezma@redhat.com, edumazet@google.com, maxime.coquelin@redhat.com, kuba@kernel.org, pabeni@redhat.com, davem@davemloft.net Subject: Re: [PATCH 3/4] virtio_ring: introduce a per virtqueue waitqueue Date: Wed, 28 Dec 2022 19:53:08 +0800 [thread overview] Message-ID: <CACGkMEv=+D+Es4sfde_X7F0zspVdy4Rs1Wi9qfCudsznsUrOTQ@mail.gmail.com> (raw) In-Reply-To: <0d9f1b89-9374-747b-3fb0-b4b28ad0ace1@redhat.com> On Wed, Dec 28, 2022 at 2:34 PM Jason Wang <jasowang@redhat.com> wrote: > > > 在 2022/12/27 17:38, Michael S. Tsirkin 写道: > > On Tue, Dec 27, 2022 at 05:12:58PM +0800, Jason Wang wrote: > >> 在 2022/12/27 15:33, Michael S. Tsirkin 写道: > >>> On Tue, Dec 27, 2022 at 12:30:35PM +0800, Jason Wang wrote: > >>>>> But device is still going and will later use the buffers. > >>>>> > >>>>> Same for timeout really. > >>>> Avoiding infinite wait/poll is one of the goals, another is to sleep. > >>>> If we think the timeout is hard, we can start from the wait. > >>>> > >>>> Thanks > >>> If the goal is to avoid disrupting traffic while CVQ is in use, > >>> that sounds more reasonable. E.g. someone is turning on promisc, > >>> a spike in CPU usage might be unwelcome. > >> > >> Yes, this would be more obvious is UP is used. > >> > >> > >>> things we should be careful to address then: > >>> 1- debugging. Currently it's easy to see a warning if CPU is stuck > >>> in a loop for a while, and we also get a backtrace. > >>> E.g. with this - how do we know who has the RTNL? > >>> We need to integrate with kernel/watchdog.c for good results > >>> and to make sure policy is consistent. > >> > >> That's fine, will consider this. So after some investigation, it seems the watchdog.c doesn't help. The only export helper is touch_softlockup_watchdog() which tries to avoid triggering the lockups warning for the known slow path. And before the patch, we end up with a real infinite loop which could be caught by RCU stall detector which is not the case of the sleep. What we can do is probably do a periodic netdev_err(). Thanks > >> > >> > >>> 2- overhead. In a very common scenario when device is in hypervisor, > >>> programming timers etc has a very high overhead, at bootup > >>> lots of CVQ commands are run and slowing boot down is not nice. > >>> let's poll for a bit before waiting? > >> > >> Then we go back to the question of choosing a good timeout for poll. And > >> poll seems problematic in the case of UP, scheduler might not have the > >> chance to run. > > Poll just a bit :) Seriously I don't know, but at least check once > > after kick. > > > I think it is what the current code did where the condition will be > check before trying to sleep in the wait_event(). > > > > > >>> 3- suprise removal. need to wake up thread in some way. what about > >>> other cases of device breakage - is there a chance this > >>> introduces new bugs around that? at least enumerate them please. > >> > >> The current code did: > >> > >> 1) check for vq->broken > >> 2) wakeup during BAD_RING() > >> > >> So we won't end up with a never woke up process which should be fine. > >> > >> Thanks > > > > BTW BAD_RING on removal will trigger dev_err. Not sure that is a good > > idea - can cause crashes if kernel panics on error. > > > Yes, it's better to use __virtqueue_break() instead. > > But consider we will start from a wait first, I will limit the changes > in virtio-net without bothering virtio core. > > Thanks > > > > > >>> _______________________________________________ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
next prev parent reply other threads:[~2022-12-28 11:54 UTC|newest] Thread overview: 104+ messages / expand[flat|nested] mbox.gz Atom feed top 2022-12-26 7:49 [PATCH 0/4] virtio-net: don't busy poll for cvq command Jason Wang 2022-12-26 7:49 ` Jason Wang 2022-12-26 7:49 ` [PATCH 1/4] virtio-net: convert rx mode setting to use workqueue Jason Wang 2022-12-26 7:49 ` Jason Wang 2022-12-27 7:39 ` Michael S. Tsirkin 2022-12-27 7:39 ` Michael S. Tsirkin 2022-12-27 9:06 ` Jason Wang 2022-12-27 9:06 ` Jason Wang 2022-12-30 2:51 ` Jakub Kicinski 2022-12-30 3:40 ` Jason Wang 2022-12-30 3:40 ` Jason Wang 2022-12-26 7:49 ` [PATCH 2/4] virtio_ring: switch to use BAD_RING() Jason Wang 2022-12-26 7:49 ` Jason Wang 2022-12-26 23:36 ` Michael S. Tsirkin 2022-12-26 23:36 ` Michael S. Tsirkin 2022-12-27 3:51 ` Jason Wang 2022-12-27 3:51 ` Jason Wang 2022-12-27 7:21 ` Michael S. Tsirkin 2022-12-27 7:21 ` Michael S. Tsirkin 2022-12-26 7:49 ` [PATCH 3/4] virtio_ring: introduce a per virtqueue waitqueue Jason Wang 2022-12-26 7:49 ` Jason Wang 2022-12-26 23:34 ` Michael S. Tsirkin 2022-12-26 23:34 ` Michael S. Tsirkin 2022-12-27 3:47 ` Jason Wang 2022-12-27 3:47 ` Jason Wang 2022-12-27 7:19 ` Michael S. Tsirkin 2022-12-27 7:19 ` Michael S. Tsirkin 2022-12-27 9:09 ` Jason Wang 2022-12-27 9:09 ` Jason Wang 2022-12-26 23:38 ` Michael S. Tsirkin 2022-12-26 23:38 ` Michael S. Tsirkin 2022-12-27 4:30 ` Jason Wang 2022-12-27 4:30 ` Jason Wang 2022-12-27 7:33 ` Michael S. Tsirkin 2022-12-27 7:33 ` Michael S. Tsirkin 2022-12-27 9:12 ` Jason Wang 2022-12-27 9:12 ` Jason Wang 2022-12-27 9:38 ` Michael S. Tsirkin 2022-12-27 9:38 ` Michael S. Tsirkin 2022-12-28 6:34 ` Jason Wang 2022-12-28 6:34 ` Jason Wang 2022-12-28 11:53 ` Jason Wang [this message] 2022-12-28 11:53 ` Jason Wang 2022-12-29 7:07 ` Michael S. Tsirkin 2022-12-29 7:07 ` Michael S. Tsirkin 2022-12-29 8:04 ` Jason Wang 2022-12-29 8:04 ` Jason Wang 2022-12-29 8:10 ` Michael S. Tsirkin 2022-12-29 8:10 ` Michael S. Tsirkin 2022-12-30 3:43 ` Jason Wang 2022-12-30 3:43 ` Jason Wang 2023-01-27 10:35 ` Michael S. Tsirkin 2023-01-27 10:35 ` Michael S. Tsirkin 2023-01-29 5:48 ` Jason Wang 2023-01-29 5:48 ` Jason Wang 2023-01-29 7:30 ` Michael S. Tsirkin 2023-01-29 7:30 ` Michael S. Tsirkin 2023-01-30 2:53 ` Jason Wang 2023-01-30 2:53 ` Jason Wang 2023-01-30 5:43 ` Michael S. Tsirkin 2023-01-30 5:43 ` Michael S. Tsirkin 2023-01-30 7:44 ` Jason Wang 2023-01-30 7:44 ` Jason Wang 2023-01-30 11:18 ` Michael S. Tsirkin 2023-01-30 11:18 ` Michael S. Tsirkin 2023-01-31 3:24 ` Jason Wang 2023-01-31 3:24 ` Jason Wang 2023-01-31 7:32 ` Michael S. Tsirkin 2023-01-31 7:32 ` Michael S. Tsirkin [not found] ` <20230129073713.5236-1-hdanton@sina.com> 2023-01-30 3:58 ` Jason Wang 2022-12-26 7:49 ` [PATCH 4/4] virtio-net: sleep instead of busy waiting for cvq command Jason Wang 2022-12-26 7:49 ` Jason Wang 2022-12-27 2:19 ` Xuan Zhuo 2022-12-27 2:19 ` Xuan Zhuo 2022-12-27 4:33 ` Jason Wang 2022-12-27 4:33 ` Jason Wang 2022-12-27 6:58 ` Michael S. Tsirkin 2022-12-27 6:58 ` Michael S. Tsirkin 2022-12-27 9:17 ` Jason Wang 2022-12-27 9:17 ` Jason Wang 2022-12-27 9:31 ` Michael S. Tsirkin 2022-12-27 9:31 ` Michael S. Tsirkin 2022-12-28 6:35 ` Jason Wang 2022-12-28 6:35 ` Jason Wang 2022-12-28 8:31 ` Xuan Zhuo 2022-12-28 8:31 ` Xuan Zhuo 2022-12-28 11:41 ` Jason Wang 2022-12-28 11:41 ` Jason Wang 2022-12-29 2:09 ` Xuan Zhuo 2022-12-29 2:09 ` Xuan Zhuo 2022-12-29 3:22 ` Jason Wang 2022-12-29 3:22 ` Jason Wang 2022-12-29 3:41 ` Xuan Zhuo 2022-12-29 3:41 ` Xuan Zhuo 2022-12-29 4:08 ` Jason Wang 2022-12-29 4:08 ` Jason Wang 2022-12-29 6:13 ` Xuan Zhuo 2022-12-29 6:13 ` Xuan Zhuo 2022-12-28 8:39 ` Xuan Zhuo 2022-12-28 8:39 ` Xuan Zhuo 2022-12-28 11:43 ` Jason Wang 2022-12-28 11:43 ` Jason Wang 2022-12-29 2:01 ` Xuan Zhuo 2022-12-29 2:01 ` Xuan Zhuo
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to='CACGkMEv=+D+Es4sfde_X7F0zspVdy4Rs1Wi9qfCudsznsUrOTQ@mail.gmail.com' \ --to=jasowang@redhat.com \ --cc=alvaro.karsz@solid-run.com \ --cc=davem@davemloft.net \ --cc=edumazet@google.com \ --cc=eperezma@redhat.com \ --cc=kuba@kernel.org \ --cc=linux-kernel@vger.kernel.org \ --cc=maxime.coquelin@redhat.com \ --cc=mst@redhat.com \ --cc=netdev@vger.kernel.org \ --cc=pabeni@redhat.com \ --cc=virtualization@lists.linux-foundation.org \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: linkBe sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.