* [PATCH] virtio-scsi: Fix the race condition in virtscsi_handle_event
@ 2015-01-05 6:04 Fam Zheng
2015-01-05 19:51 ` Venkatesh Srinivas
[not found] ` <CAHdzE-_QmNZVhmChHmy3FgRjJ692Apw2Va-SGoTz25s0fLjbvQ@mail.gmail.com>
0 siblings, 2 replies; 5+ messages in thread
From: Fam Zheng @ 2015-01-05 6:04 UTC (permalink / raw)
To: linux-scsi
Cc: James E.J. Bottomley, linux-kernel, Paolo Bonzini,
Christoph Hellwig, Michael S. Tsirkin
There is a race condition in virtscsi_handle_event when many device
hotplug/unplug events arrive in quick succession.

The scsi_remove_device in virtscsi_handle_transport_reset may trigger
the BUG_ON in scsi_target_reap, because the state is altered behind it,
probably by scsi_scan_host of another event. I'm able to reproduce it by
repeatedly plugging and unplugging a SCSI disk with the same LUN.

To make it safe, the mutex added in struct virtio_scsi is held in
virtscsi_handle_event, so that all the events are processed in a
synchronized way. With this lock, the panic goes away.
Signed-off-by: Fam Zheng <famz@redhat.com>
---
drivers/scsi/virtio_scsi.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/drivers/scsi/virtio_scsi.c b/drivers/scsi/virtio_scsi.c
index c52bb5d..7f194d4 100644
--- a/drivers/scsi/virtio_scsi.c
+++ b/drivers/scsi/virtio_scsi.c
@@ -110,6 +110,9 @@ struct virtio_scsi {
/* CPU hotplug notifier */
struct notifier_block nb;
+ /* Protect the hotplug/unplug event handling */
+ struct mutex scan_lock;
+
/* Protected by event_vq lock */
bool stop_events;
@@ -377,6 +380,7 @@ static void virtscsi_handle_event(struct work_struct *work)
struct virtio_scsi *vscsi = event_node->vscsi;
struct virtio_scsi_event *event = &event_node->event;
+ mutex_lock(&vscsi->scan_lock);
if (event->event &
cpu_to_virtio32(vscsi->vdev, VIRTIO_SCSI_T_EVENTS_MISSED)) {
event->event &= ~cpu_to_virtio32(vscsi->vdev,
@@ -397,6 +401,7 @@ static void virtscsi_handle_event(struct work_struct *work)
pr_err("Unsupport virtio scsi event %x\n", event->event);
}
virtscsi_kick_event(vscsi, event_node);
+ mutex_unlock(&vscsi->scan_lock);
}
static void virtscsi_complete_event(struct virtio_scsi *vscsi, void *buf)
@@ -894,6 +899,7 @@ static int virtscsi_init(struct virtio_device *vdev,
const char **names;
struct virtqueue **vqs;
+ mutex_init(&vscsi->scan_lock);
num_vqs = vscsi->num_queues + VIRTIO_SCSI_VQ_BASE;
vqs = kmalloc(num_vqs * sizeof(struct virtqueue *), GFP_KERNEL);
callbacks = kmalloc(num_vqs * sizeof(vq_callback_t *), GFP_KERNEL);
--
1.9.3
* Re: [PATCH] virtio-scsi: Fix the race condition in virtscsi_handle_event
2015-01-05 6:04 [PATCH] virtio-scsi: Fix the race condition in virtscsi_handle_event Fam Zheng
@ 2015-01-05 19:51 ` Venkatesh Srinivas
[not found] ` <CAHdzE-_QmNZVhmChHmy3FgRjJ692Apw2Va-SGoTz25s0fLjbvQ@mail.gmail.com>
1 sibling, 0 replies; 5+ messages in thread
From: Venkatesh Srinivas @ 2015-01-05 19:51 UTC (permalink / raw)
To: Fam Zheng
Cc: linux-scsi, James E.J. Bottomley, Linux Kernel Developers List,
Paolo Bonzini, Christoph Hellwig, Michael S. Tsirkin
On Sun, Jan 4, 2015 at 10:04 PM, Fam Zheng <famz@redhat.com> wrote:
>
> There is a race condition in virtscsi_handle_event, when many device
> hotplug/unplug events flush in quickly.
>
> The scsi_remove_device in virtscsi_handle_transport_reset may trigger
> the BUG_ON in scsi_target_reap, because the state is altered behind it,
> probably by scsi_scan_host of another event. I'm able to reproduce it by
> repeatedly plugging and unplugging a scsi disk with the same lun number.
>
> To make it safe, the mutex added in struct virtio_scsi is held in
> virtscsi_handle_event, so that all the events are processed in a
> synchronized way. With this lock, the panic goes away.
>
> Signed-off-by: Fam Zheng <famz@redhat.com>
> ---
> drivers/scsi/virtio_scsi.c | 6 ++++++
> 1 file changed, 6 insertions(+)
>
> diff --git a/drivers/scsi/virtio_scsi.c b/drivers/scsi/virtio_scsi.c
> index c52bb5d..7f194d4 100644
> --- a/drivers/scsi/virtio_scsi.c
> +++ b/drivers/scsi/virtio_scsi.c
> @@ -110,6 +110,9 @@ struct virtio_scsi {
> /* CPU hotplug notifier */
> struct notifier_block nb;
>
> + /* Protect the hotplug/unplug event handling */
> + struct mutex scan_lock;
> +
> /* Protected by event_vq lock */
> bool stop_events;
>
> @@ -377,6 +380,7 @@ static void virtscsi_handle_event(struct work_struct *work)
> struct virtio_scsi *vscsi = event_node->vscsi;
> struct virtio_scsi_event *event = &event_node->event;
>
> + mutex_lock(&vscsi->scan_lock);
> if (event->event &
> cpu_to_virtio32(vscsi->vdev, VIRTIO_SCSI_T_EVENTS_MISSED)) {
> event->event &= ~cpu_to_virtio32(vscsi->vdev,
> @@ -397,6 +401,7 @@ static void virtscsi_handle_event(struct work_struct *work)
> pr_err("Unsupport virtio scsi event %x\n", event->event);
> }
> virtscsi_kick_event(vscsi, event_node);
> + mutex_unlock(&vscsi->scan_lock);
> }
>
> static void virtscsi_complete_event(struct virtio_scsi *vscsi, void *buf)
> @@ -894,6 +899,7 @@ static int virtscsi_init(struct virtio_device *vdev,
> const char **names;
> struct virtqueue **vqs;
>
> + mutex_init(&vscsi->scan_lock);
> num_vqs = vscsi->num_queues + VIRTIO_SCSI_VQ_BASE;
> vqs = kmalloc(num_vqs * sizeof(struct virtqueue *), GFP_KERNEL);
> callbacks = kmalloc(num_vqs * sizeof(vq_callback_t *), GFP_KERNEL);
> --
> 1.9.3
Nice find.
This fix does have the effect of serializing all event handling via
scan_lock; perhaps you want to instead create a singlethreaded
workqueue in virtio_scsi and queue handle_event there, rather than
waiting on scan_lock on the system workqueue?
Reviewed-by: Venkatesh Srinivas <venkateshs@google.com>
-- vs;
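
As a rough illustration of the ordered-queue idea suggested above, here
is a minimal userspace sketch in C. It is not kernel code and not the
driver's implementation: every name in it (ordered_queue, work_item,
target_state, oq_push, oq_drain, handle_event) is hypothetical. It only
shows the property the suggestion relies on: if items run one at a time
in submission order, each handler finishes before the next starts, so
the handlers themselves need no extra lock. In the kernel, the analogous
tools would presumably be a dedicated queue created with
alloc_ordered_workqueue() and fed with queue_work(); the sketch does not
use them.

```c
/*
 * Userspace model of an ordered work queue: items run one at a time, in
 * submission order, so handlers never overlap and need no extra lock.
 * All names here are hypothetical; this is not kernel code.
 */
#include <stddef.h>

enum event_kind { EV_PLUG, EV_UNPLUG };

struct work_item {
	enum event_kind kind;
	struct work_item *next;
};

struct ordered_queue {
	struct work_item *head;
	struct work_item *tail;
};

/* Stand-in for the per-host state that the real handlers mutate. */
struct target_state {
	int present;	/* 1 while the modelled disk is attached */
	int handled;	/* number of events processed so far */
};

void oq_init(struct ordered_queue *q)
{
	q->head = NULL;
	q->tail = NULL;
}

/* Append an item at the tail, preserving submission order. */
void oq_push(struct ordered_queue *q, struct work_item *w)
{
	w->next = NULL;
	if (q->tail)
		q->tail->next = w;
	else
		q->head = w;
	q->tail = w;
}

/* Apply one event to the state; runs to completion before the next. */
void handle_event(struct target_state *t, const struct work_item *w)
{
	t->present = (w->kind == EV_PLUG);
	t->handled++;
}

/* Drain the queue on the calling thread; returns the number of items run. */
int oq_drain(struct ordered_queue *q, struct target_state *t)
{
	int n = 0;

	while (q->head) {
		struct work_item *w = q->head;

		q->head = w->next;
		if (!q->head)
			q->tail = NULL;
		handle_event(t, w);
		n++;
	}
	return n;
}
```

A caller would initialise the queue with oq_init(), push one work_item
per incoming event, and call oq_drain() from a single thread. The
trade-off is the same one raised in the review: ordering is enforced by
the queue rather than by sleeping on a mutex inside each handler.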
* Re: [PATCH] virtio-scsi: Fix the race condition in virtscsi_handle_event
[not found] ` <CAHdzE-_QmNZVhmChHmy3FgRjJ692Apw2Va-SGoTz25s0fLjbvQ@mail.gmail.com>
@ 2015-01-05 22:10 ` Michael S. Tsirkin
2015-01-06 3:57 ` Fam Zheng
2015-01-06 7:15 ` Michael S. Tsirkin
0 siblings, 2 replies; 5+ messages in thread
From: Michael S. Tsirkin @ 2015-01-05 22:10 UTC (permalink / raw)
To: Venkatesh Srinivas
Cc: Fam Zheng, linux-scsi, James E.J. Bottomley,
Linux Kernel Developers List, Paolo Bonzini, Christoph Hellwig
On Mon, Jan 05, 2015 at 11:48:47AM -0800, Venkatesh Srinivas wrote:
> On Sun, Jan 4, 2015 at 10:04 PM, Fam Zheng <famz@redhat.com> wrote:
>
> There is a race condition in virtscsi_handle_event, when many device
> hotplug/unplug events flush in quickly.
>
> The scsi_remove_device in virtscsi_handle_transport_reset may trigger
> the BUG_ON in scsi_target_reap, because the state is altered behind it,
> probably by scsi_scan_host of another event. I'm able to reproduce it by
> repeatedly plugging and unplugging a scsi disk with the same lun number.
>
> To make it safe, the mutex added in struct virtio_scsi is held in
> virtscsi_handle_event, so that all the events are processed in a
> synchronized way. With this lock, the panic goes away.
>
> Signed-off-by: Fam Zheng <famz@redhat.com>
> ---
> drivers/scsi/virtio_scsi.c | 6 ++++++
> 1 file changed, 6 insertions(+)
>
> diff --git a/drivers/scsi/virtio_scsi.c b/drivers/scsi/virtio_scsi.c
> index c52bb5d..7f194d4 100644
> --- a/drivers/scsi/virtio_scsi.c
> +++ b/drivers/scsi/virtio_scsi.c
> @@ -110,6 +110,9 @@ struct virtio_scsi {
> /* CPU hotplug notifier */
> struct notifier_block nb;
>
> + /* Protect the hotplug/unplug event handling */
> + struct mutex scan_lock;
> +
> /* Protected by event_vq lock */
> bool stop_events;
>
> @@ -377,6 +380,7 @@ static void virtscsi_handle_event(struct work_struct *work)
> struct virtio_scsi *vscsi = event_node->vscsi;
> struct virtio_scsi_event *event = &event_node->event;
>
> + mutex_lock(&vscsi->scan_lock);
> if (event->event &
> cpu_to_virtio32(vscsi->vdev, VIRTIO_SCSI_T_EVENTS_MISSED)) {
> event->event &= ~cpu_to_virtio32(vscsi->vdev,
> @@ -397,6 +401,7 @@ static void virtscsi_handle_event(struct work_struct *work)
> pr_err("Unsupport virtio scsi event %x\n", event->event);
> }
> virtscsi_kick_event(vscsi, event_node);
> + mutex_unlock(&vscsi->scan_lock);
> }
>
> static void virtscsi_complete_event(struct virtio_scsi *vscsi, void *buf)
> @@ -894,6 +899,7 @@ static int virtscsi_init(struct virtio_device *vdev,
> const char **names;
> struct virtqueue **vqs;
>
> + mutex_init(&vscsi->scan_lock);
> num_vqs = vscsi->num_queues + VIRTIO_SCSI_VQ_BASE;
> vqs = kmalloc(num_vqs * sizeof(struct virtqueue *), GFP_KERNEL);
> callbacks = kmalloc(num_vqs * sizeof(vq_callback_t *), GFP_KERNEL);
> --
> 1.9.3
>
>
> Nice find.
>
> This fix does have the effect of serializing all event handling via scan_lock;
> perhaps you want to instead create a singlethreaded workqueue in virtio_scsi
> and queue handle_event there, rather than waiting on scan_lock on the system
> workqueue?
Or use the system single-threaded wq.
> Reviewed-by: Venkatesh Srinivas <venkateshs@google.com>
>
> -- vs;
* Re: [PATCH] virtio-scsi: Fix the race condition in virtscsi_handle_event
2015-01-05 22:10 ` Michael S. Tsirkin
@ 2015-01-06 3:57 ` Fam Zheng
2015-01-06 7:15 ` Michael S. Tsirkin
1 sibling, 0 replies; 5+ messages in thread
From: Fam Zheng @ 2015-01-06 3:57 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: Venkatesh Srinivas, linux-scsi, James E.J. Bottomley,
Linux Kernel Developers List, Paolo Bonzini, Christoph Hellwig
On Tue, 01/06 00:10, Michael S. Tsirkin wrote:
> On Mon, Jan 05, 2015 at 11:48:47AM -0800, Venkatesh Srinivas wrote:
> > On Sun, Jan 4, 2015 at 10:04 PM, Fam Zheng <famz@redhat.com> wrote:
> >
> > There is a race condition in virtscsi_handle_event, when many device
> > hotplug/unplug events flush in quickly.
> >
> > The scsi_remove_device in virtscsi_handle_transport_reset may trigger
> > the BUG_ON in scsi_target_reap, because the state is altered behind it,
> > probably by scsi_scan_host of another event. I'm able to reproduce it by
> > repeatedly plugging and unplugging a scsi disk with the same lun number.
> >
> > To make it safe, the mutex added in struct virtio_scsi is held in
> > virtscsi_handle_event, so that all the events are processed in a
> > synchronized way. With this lock, the panic goes away.
> >
> > Signed-off-by: Fam Zheng <famz@redhat.com>
> > ---
> > drivers/scsi/virtio_scsi.c | 6 ++++++
> > 1 file changed, 6 insertions(+)
> >
> > diff --git a/drivers/scsi/virtio_scsi.c b/drivers/scsi/virtio_scsi.c
> > index c52bb5d..7f194d4 100644
> > --- a/drivers/scsi/virtio_scsi.c
> > +++ b/drivers/scsi/virtio_scsi.c
> > @@ -110,6 +110,9 @@ struct virtio_scsi {
> > /* CPU hotplug notifier */
> > struct notifier_block nb;
> >
> > + /* Protect the hotplug/unplug event handling */
> > + struct mutex scan_lock;
> > +
> > /* Protected by event_vq lock */
> > bool stop_events;
> >
> > @@ -377,6 +380,7 @@ static void virtscsi_handle_event(struct work_struct *work)
> > struct virtio_scsi *vscsi = event_node->vscsi;
> > struct virtio_scsi_event *event = &event_node->event;
> >
> > + mutex_lock(&vscsi->scan_lock);
> > if (event->event &
> > cpu_to_virtio32(vscsi->vdev, VIRTIO_SCSI_T_EVENTS_MISSED)) {
> > event->event &= ~cpu_to_virtio32(vscsi->vdev,
> > @@ -397,6 +401,7 @@ static void virtscsi_handle_event(struct work_struct *work)
> > pr_err("Unsupport virtio scsi event %x\n", event->event);
> > }
> > virtscsi_kick_event(vscsi, event_node);
> > + mutex_unlock(&vscsi->scan_lock);
> > }
> >
> > static void virtscsi_complete_event(struct virtio_scsi *vscsi, void *buf)
> > @@ -894,6 +899,7 @@ static int virtscsi_init(struct virtio_device *vdev,
> > const char **names;
> > struct virtqueue **vqs;
> >
> > + mutex_init(&vscsi->scan_lock);
> > num_vqs = vscsi->num_queues + VIRTIO_SCSI_VQ_BASE;
> > vqs = kmalloc(num_vqs * sizeof(struct virtqueue *), GFP_KERNEL);
> > callbacks = kmalloc(num_vqs * sizeof(vq_callback_t *), GFP_KERNEL);
> > --
> > 1.9.3
> >
> >
> > Nice find.
> >
> > This fix does have the effect of serializing all event handling via scan_lock;
> > perhaps you want to instead create a singlethreaded workqueue in virtio_scsi
> > and queue handle_event there, rather than waiting on scan_lock on the system
> > workqueue?
>
> Or use the system single-threaded wq.
Good idea, I'll change to that.
Thanks,
Fam
>
>
> > Reviewed-by: Venkatesh Srinivas <venkateshs@google.com>
> >
> > -- vs;
* Re: [PATCH] virtio-scsi: Fix the race condition in virtscsi_handle_event
2015-01-05 22:10 ` Michael S. Tsirkin
2015-01-06 3:57 ` Fam Zheng
@ 2015-01-06 7:15 ` Michael S. Tsirkin
1 sibling, 0 replies; 5+ messages in thread
From: Michael S. Tsirkin @ 2015-01-06 7:15 UTC (permalink / raw)
To: Venkatesh Srinivas
Cc: Fam Zheng, linux-scsi, James E.J. Bottomley,
Linux Kernel Developers List, Paolo Bonzini, Christoph Hellwig
On Tue, Jan 06, 2015 at 12:10:59AM +0200, Michael S. Tsirkin wrote:
> On Mon, Jan 05, 2015 at 11:48:47AM -0800, Venkatesh Srinivas wrote:
> > On Sun, Jan 4, 2015 at 10:04 PM, Fam Zheng <famz@redhat.com> wrote:
> >
> > There is a race condition in virtscsi_handle_event, when many device
> > hotplug/unplug events flush in quickly.
> >
> > The scsi_remove_device in virtscsi_handle_transport_reset may trigger
> > the BUG_ON in scsi_target_reap, because the state is altered behind it,
> > probably by scsi_scan_host of another event. I'm able to reproduce it by
> > repeatedly plugging and unplugging a scsi disk with the same lun number.
> >
> > To make it safe, the mutex added in struct virtio_scsi is held in
> > virtscsi_handle_event, so that all the events are processed in a
> > synchronized way. With this lock, the panic goes away.
> >
> > Signed-off-by: Fam Zheng <famz@redhat.com>
> > ---
> > drivers/scsi/virtio_scsi.c | 6 ++++++
> > 1 file changed, 6 insertions(+)
> >
> > diff --git a/drivers/scsi/virtio_scsi.c b/drivers/scsi/virtio_scsi.c
> > index c52bb5d..7f194d4 100644
> > --- a/drivers/scsi/virtio_scsi.c
> > +++ b/drivers/scsi/virtio_scsi.c
> > @@ -110,6 +110,9 @@ struct virtio_scsi {
> > /* CPU hotplug notifier */
> > struct notifier_block nb;
> >
> > + /* Protect the hotplug/unplug event handling */
> > + struct mutex scan_lock;
> > +
> > /* Protected by event_vq lock */
> > bool stop_events;
> >
> > @@ -377,6 +380,7 @@ static void virtscsi_handle_event(struct work_struct *work)
> > struct virtio_scsi *vscsi = event_node->vscsi;
> > struct virtio_scsi_event *event = &event_node->event;
> >
> > + mutex_lock(&vscsi->scan_lock);
> > if (event->event &
> > cpu_to_virtio32(vscsi->vdev, VIRTIO_SCSI_T_EVENTS_MISSED)) {
> > event->event &= ~cpu_to_virtio32(vscsi->vdev,
> > @@ -397,6 +401,7 @@ static void virtscsi_handle_event(struct work_struct *work)
> > pr_err("Unsupport virtio scsi event %x\n", event->event);
> > }
> > virtscsi_kick_event(vscsi, event_node);
> > + mutex_unlock(&vscsi->scan_lock);
> > }
> >
> > static void virtscsi_complete_event(struct virtio_scsi *vscsi, void *buf)
> > @@ -894,6 +899,7 @@ static int virtscsi_init(struct virtio_device *vdev,
> > const char **names;
> > struct virtqueue **vqs;
> >
> > + mutex_init(&vscsi->scan_lock);
> > num_vqs = vscsi->num_queues + VIRTIO_SCSI_VQ_BASE;
> > vqs = kmalloc(num_vqs * sizeof(struct virtqueue *), GFP_KERNEL);
> > callbacks = kmalloc(num_vqs * sizeof(vq_callback_t *), GFP_KERNEL);
> > --
> > 1.9.3
> >
> >
> > Nice find.
> >
> > This fix does have the effect of serializing all event handling via scan_lock;
> > perhaps you want to instead create a singlethreaded workqueue in virtio_scsi
> > and queue handle_event there, rather than waiting on scan_lock on the system
> > workqueue?
>
> Or use the system single-threaded wq.
I was sure we have one, but apparently not :(
Pls ignore the comment, sorry about the noise.
>
> > Reviewed-by: Venkatesh Srinivas <venkateshs@google.com>
> >
> > -- vs;