All of lore.kernel.org
 help / color / mirror / Atom feed
From: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
To: Shuah Khan <skhan@linuxfoundation.org>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	Valentina Manea <valentina.manea.m@gmail.com>,
	Shuah Khan <shuah@kernel.org>
Cc: Hillf Danton <hdanton@sina.com>,
	linux-usb@vger.kernel.org, Arnd Bergmann <arnd@arndb.de>
Subject: Re: [PATCH] usb: usbip: fix error handling of kthread_get_run()
Date: Thu, 11 Feb 2021 22:40:28 +0900	[thread overview]
Message-ID: <676d4518-0faa-9fab-15db-0db8d216d7fb@i-love.sakura.ne.jp> (raw)
In-Reply-To: <16608833-02b7-7ba8-e9fc-7e45c87fc7f1@i-love.sakura.ne.jp>

On 2021/02/11 12:01, Tetsuo Handa wrote:
> On 2021/02/11 10:04, Tetsuo Handa wrote:
>> On 2021/02/11 5:15, Shuah Khan wrote:
>>> I you would like to fix this, please send me a complete fix.
>>
>> If you want to handle the unlikely "__kthread_create_on_node() fails without
>> being killed" case, such change ( the drivers/usb/usbip/vhci_sysfs.c portion in
>> https://syzkaller.appspot.com/x/patch.diff?x=16c3c090d00000 ) should be a separate
>> patch. Since this patch declares "Fixes: 9720b4bc76a83807 ("staging/usbip: convert to kthread")",
>> this patch intends for the minimal change and does not want to do extra things.
>>
> 
> If you want a complete fix, the (untested) diff will become large.

I guess that we need to serialize attach operation and reset/detach operations, for
it seems there is a race window that might result in "general protection fault in
tomoyo_socket_sendmsg_permission". Details follows...

> diff --git a/drivers/usb/usbip/vhci_sysfs.c b/drivers/usb/usbip/vhci_sysfs.c
> index be37aec250c2..0d10021c4186 100644
> --- a/drivers/usb/usbip/vhci_sysfs.c
> +++ b/drivers/usb/usbip/vhci_sysfs.c
> @@ -305,20 +305,22 @@ static ssize_t attach_store(struct device *dev, struct device_attribute *attr,
>  {
>  	struct socket *socket;
>  	int sockfd = 0;
>  	__u32 port = 0, pdev_nr = 0, rhport = 0, devid = 0, speed = 0;
>  	struct usb_hcd *hcd;
>  	struct vhci_hcd *vhci_hcd;
>  	struct vhci_device *vdev;
>  	struct vhci *vhci;
>  	int err;
>  	unsigned long flags;
> +	struct task_struct *tx;
> +	struct task_struct *rx;
>  
>  	/*
>  	 * @rhport: port number of vhci_hcd
>  	 * @sockfd: socket descriptor of an established TCP connection
>  	 * @devid: unique device identifier in a remote host
>  	 * @speed: usb device speed in a remote host
>  	 */
>  	if (sscanf(buf, "%u %u %u %u", &port, &sockfd, &devid, &speed) != 4)
>  		return -EINVAL;
>  	pdev_nr = port_to_pdev_nr(port);
> @@ -345,62 +347,83 @@ static ssize_t attach_store(struct device *dev, struct device_attribute *attr,
>  	if (speed == USB_SPEED_SUPER)
>  		vdev = &vhci->vhci_hcd_ss->vdev[rhport];
>  	else
>  		vdev = &vhci->vhci_hcd_hs->vdev[rhport];
>  
>  	/* Extract socket from fd. */
>  	socket = sockfd_lookup(sockfd, &err);
>  	if (!socket)
>  		return -EINVAL;
>  
> +	/* Create threads now. */
> +	rx = kthread_create(vhci_rx_loop, &vdev->ud, "vhci_rx");
> +	tx = kthread_create(vhci_tx_loop, &vdev->ud, "vhci_tx");
> +	if (IS_ERR(rx) || IS_ERR(tx))
> +		goto thread_error;
> +
>  	/* now need lock until setting vdev status as used */
>  
>  	/* begin a lock */
>  	spin_lock_irqsave(&vhci->lock, flags);
>  	spin_lock(&vdev->ud.lock);
>  
>  	if (vdev->ud.status != VDEV_ST_NULL) {
>  		/* end of the lock */
>  		spin_unlock(&vdev->ud.lock);
>  		spin_unlock_irqrestore(&vhci->lock, flags);
>  
> -		sockfd_put(socket);
> -
>  		dev_err(dev, "port %d already used\n", rhport);
>  		/*
>  		 * Will be retried from userspace
>  		 * if there's another free port.
>  		 */
> -		return -EBUSY;
> +		err = -EBUSY;
> +		goto thread_error;
>  	}
>  
>  	dev_info(dev, "pdev(%u) rhport(%u) sockfd(%d)\n",
>  		 pdev_nr, rhport, sockfd);
>  	dev_info(dev, "devid(%u) speed(%u) speed_str(%s)\n",
>  		 devid, speed, usb_speed_string(speed));
>  
>  	vdev->devid         = devid;
>  	vdev->speed         = speed;
>  	vdev->ud.sockfd     = sockfd;
>  	vdev->ud.tcp_socket = socket;
>  	vdev->ud.status     = VDEV_ST_NOTASSIGNED;
>  
>  	spin_unlock(&vdev->ud.lock);
>  	spin_unlock_irqrestore(&vhci->lock, flags);
>  	/* end the lock */
>  
> -	vdev->ud.tcp_rx = kthread_get_run(vhci_rx_loop, &vdev->ud, "vhci_rx");
> -	vdev->ud.tcp_tx = kthread_get_run(vhci_tx_loop, &vdev->ud, "vhci_tx");
> +	/* Start the threads. */
> +	get_task_struct(rx);
> +	vdev->ud.tcp_rx = rx;
> +	wake_up_process(rx);

Even this seemingly complete fix turned out to be incomplete.

kthread_get_run(vhci_rx_loop, &vdev->ud, "vhci_rx") creates a vhci_rx kernel thread
and immediately starts vhci_rx_loop() function. Then, vdev->ud.tcp_rx becomes non-NULL
because the vhci_rx kernel thread was successfully created. Then in vhci_rx_loop(),
vhci_rx_pdu(ud) will be called because kthread_should_stop() is false and
usbip_event_happened() is 0. In vhci_rx_pdu(), sock_recvmsg() is called from usbip_recv().

If the peer side of ud->tcp_socket already called shutdown(SHUT_WR), sock_recvmsg() detects
end of stream and returns 0. Then, vhci_rx_pdu() prints "connection closed" message and
calls usbip_event_add(ud, VDEV_EVENT_DOWN).
(Well, where is the code that verifies that tcp_socket is indeed a SOCK_STREAM socket?
If the file descriptor passed was a SOCK_DGRAM socket, sock_recvmsg() can't detect
end of stream...)

Now, kthread_get_run(vhci_tx_loop, &vdev->ud, "vhci_tx") creates a vhci_tx kernel thread
and immediately starts vhci_tx_loop() function. But if current thread which called
kthread_get_run() is preempted at this moment, vdev->ud.tcp_tx remains NULL despite
vhci_tx kernel thread was successfully created.

Then in vhci_tx_loop(), wait_event_interruptible() is called because kthread_should_stop()
is false and vhci_send_cmd_submit() is 0 and vhci_send_cmd_unlink() is 0. But the vhci_tx
kernel thread will call vhci_send_cmd_submit() again when list_empty(&vdev->priv_tx) becomes false.

Now, event_handler() is called by the usbip_event singlethreaded workqueue kernel thread
because the vhci_rx kernel thread called queue_work(usbip_queue, &usbip_work) from
usbip_event_add(). Since VDEV_EVENT_DOWN is defined as USBIP_EH_SHUTDOWN | USBIP_EH_RESET,
firstly ud->eh_ops.shutdown(ud) (which is mapped to vhci_shutdown_connection()) is called
and then ud->eh_ops.reset(ud) (which is mapped to vhci_device_reset()) is called.

In vhci_shutdown_connection(), it tries to shutdown the vhci_tx kernel thread and
the vhci_rx kernel thread before resetting vdev->ud.tcp_socket to NULL. But recall that
vdev->ud.tcp_tx can remains NULL due to preemption. As a result, despite the effort to handle
USBIP_EH_SHUTDOWN before USBIP_EH_RESET, kthread_stop_put(vdev->ud.tcp_tx) won't be called
before vdev->ud.tcp_socket = NULL is called.

When the vhci_tx kernel thread found that list_empty(&vdev->priv_tx) became false, it calls
vhci_send_cmd_submit() and triggers "general protection fault in tomoyo_socket_sendmsg_permission".
Therefore, I think

  "general protection fault in tomoyo_socket_sendmsg_permission" is a NULL pointer dereference
  which can happen if vhci_shutdown_connection() is called before vdev->ud.tcp_tx becomes non-NULL
  due to a race condition

and that the easiest way is to serialize attach operation and reset/detach operations using
a mutex, for event_handler() which calls reset/detach operations is a schedulable context.

> +	get_task_struct(tx);
> +	vdev->ud.tcp_tx = tx;
> +	wake_up_process(tx);

Maybe just grouping assignment of vdev->ud.tcp_socket, vdev->ud.status, vdev->ud.tcp_{rx,tx}
within one spinlock-protected section might be fine? But since vhci_port_disconnect() from
detach_store() seems to be racy with attach_store() (vhci_port_disconnect() can trigger
vhci_shutdown_connection() via usbip_event_add(&vdev->ud, VDEV_EVENT_DOWN) as soon as
attach_store() did vdev->ud.status = VDEV_ST_NOTASSIGNED), I suggest using a killable mutex
for serialization purpose is simpler and safer.

>  
>  	rh_port_connect(vdev, speed);
>  
>  	return count;
> +thread_error:
> +	sockfd_put(socket);
> +	if (IS_ERR(rx))
> +		err = PTR_ERR(rx);
> +	else
> +		kthread_stop(rx);
> +	if (IS_ERR(tx))
> +		err = PTR_ERR(tx);
> +	else
> +		kthread_stop(tx);
> +	return err;
>  }
>  static DEVICE_ATTR_WO(attach);
>  
>  #define MAX_STATUS_NAME 16
>  
>  struct status_attr {
>  	struct device_attribute attr;
>  	char name[MAX_STATUS_NAME+1];
>  };
>  


      reply	other threads:[~2021-02-11 13:44 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-12-20 18:44 KASAN: null-ptr-deref Write in vhci_shutdown_connection syzbot
2021-02-05 13:57 ` [PATCH] usb: usbip: fix error handling of kthread_get_run() Tetsuo Handa
2021-02-05 16:27   ` Shuah Khan
2021-02-06  1:08     ` Tetsuo Handa
2021-02-10 18:11       ` Shuah Khan
2021-02-10 18:16         ` Tetsuo Handa
2021-02-10 18:20           ` Shuah Khan
2021-02-10 18:43             ` Tetsuo Handa
2021-02-10 20:15               ` Shuah Khan
2021-02-11  1:04                 ` Tetsuo Handa
2021-02-11  3:01                   ` Tetsuo Handa
2021-02-11 13:40                     ` Tetsuo Handa [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=676d4518-0faa-9fab-15db-0db8d216d7fb@i-love.sakura.ne.jp \
    --to=penguin-kernel@i-love.sakura.ne.jp \
    --cc=arnd@arndb.de \
    --cc=gregkh@linuxfoundation.org \
    --cc=hdanton@sina.com \
    --cc=linux-usb@vger.kernel.org \
    --cc=shuah@kernel.org \
    --cc=skhan@linuxfoundation.org \
    --cc=valentina.manea.m@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.