Date: Fri, 29 Mar 2019 21:01:58 +0100
From: Greg KH
To: Martin Christian
Cc: kernelnewbies@kernelnewbies.org
Subject: Re: Unexpected scheduling with mutexes
Message-ID: <20190329200158.GF12004@kroah.com>

On Wed, Mar 27, 2019 at 11:56:51AM +0100, Martin Christian wrote:
> Hi,
>
> I've written a linux kernel module for an USB device. The USB driver
> provides 2 read-only character devices, which can be opened only
> exclusively by one process:
> - `/dev/cdev_a`
> - `/dev/cdev_b`

"exclusive" opening really isn't that, unless you go through _HUGE_
gyrations to try to control this.  It's not worth it in the end, so just
let userspace deal with it.  If it wants to interleave data to a device
node, let it, it can handle the fallout.

As an example of this, serial ports are not "exclusively owned", right?

> The USB device can only handle one request at a time.
>
> The test setup is a follows:
> - Processes A reads data from 1st device: `dd if=/dev/cdev_a of=/tmp/a
> bs=X`
> - Processes B reads data from 2nd device: `dd if=/dev/cdev_b of=/tmp/b
> bs=X`
> - Process A and B run in parallel
> - After 10 seconds both processes are killed and size of both output
> files is compared.
>
> For certain values of `X` there is a significant difference in size
> between the two files, which I don't expect.
>
> A read call to the driver does the following:
> 1. `mutex_lock_interruptible(iolock)`
> 2. `usb_bulk_msg(dev, pipe, buf, X, timeout)`
> 3. `mutex_unlock(iolock)`
> 4. `copy_to_user(buf)`

What are these values of X that cause differences here?

> What I would expect is the following:
> 1. Proc A: `mutex_lock_interruptible(iolock)`
> 2. Proc A: `usb_bulk_msg(dev, pipe, buf, X, timeout)`
> 3. Scheduling: A -> B
> 4. Proc B: `mutex_lock_interruptible(iolock)` -> blocks
> 5. Scheduling: B -> A
> 6. Proc A: `mutex_unlock(iolock)`
> 7. Proc A: `copy_to_user(buf)`
> 8. Proc A: `mutex_lock_interruptible(iolock)` -> blocks
> 9. Scheduling: A -> B
> 10. Proc B: `usb_bulk_msg(dev, pipe, buf, X, timeout)`
>
> But what I see with ftrace is that in step 8, process A still continues.
> And it seems that for certain values of X the time inside the critical
> region is a multiple of the time slice, so that process B always gets
> the time slice when the critical region is blocked. What would be a best
> practise solution for this? I was thinking of calling `schedule()` each
> time after copying to user space or playing with nice values or using
> wait_queues?

Step back a second and let me ask: what exactly are you trying to solve
here?

If you are just playing around and want to watch mutexes being grabbed
and passed off, that's fine, and fun.  You are getting a good education
in how scheduling works and of course, the hell^Wmess that USB really
is.

But if you are trying to create a real API that has to enforce handing
off data transfers between two different character devices in a strictly
interleaved fashion, you are doing this totally wrong.  That is not
going to work with a simple mutex, as you have found out.

There are so many different variables in play here that you are trying
to somehow sync up with just a single lock.  You have, just off the top
of my head:
  - scheduling issues of the different userspace programs
  - variability in USB transports (usb_bulk_msg() is just about the most
    inefficient way of ever sending data on a USB device, you end up
    wasting loads of time sleeping and waking up and waiting for things
    to happen, and the USB pipeline is almost totally empty for most of
    it.)
  - scheduling issues of the USB wakequeue that handles the data to be
    sent
  - sleeping/memory fault issues when copying data to/from userspace,
    which can be a huge factor on some systems.
and I am sure there are more.

A mutex isn't always "fair" in the sense that it instantly gives up and
passes control to someone else who is waiting for it at the same time.
Sometimes it can be grabbed by someone else, like the person who just
dropped it, based on a whole raft of factors that have been worked out
over the years to provide a robust and scalable general purpose
operating system.

Schedulers too are not always "fair".  That depends on a whole raft of
things like what CPU is running when, where your task happens to be
living at the moment, and of course, what else is happening in the
system at the exact same time (I'm sure other things are happening,
right?)  Again, this is all due to the way CPUs work, and how Linux
manages tasks in order to try to make the best use of all resources at
the moment.

So, I really haven't answered your question here except to say, "it's
complicated" and "you aren't measuring what you think you are
measuring" :)

Try to take USB out of the picture as well as userspace, and try running
two kernel threads that each grab a mutex, print out "A" or "B" to the
kernel log, and then give it up.  Is that output nicely interleaved or
are there some duplicated messages?[1]

Again, what are you really trying to determine here?  Odds are there is
a better way to do it, given that your above sequence of events is
highly variable for a whole raft of reasons.

thanks,

greg k-h

[1] Extra bonus points for those that recognize this task...
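
A rough userspace version of the two-thread experiment can be sketched
as below.  This is only an analogue written in Python, not kernel code:
threading.Lock stands in for the kernel mutex, appending to a list
stands in for printing "A" or "B" to the kernel log, and the names
run_contention() and count_repeats() are made up for illustration.  The
numbers it prints say nothing about kernel mutex behaviour; it only
shows how to check whether a plain lock hands off strictly in turn.

```python
import threading


def run_contention(iterations=1000):
    """Let two threads repeatedly take one lock and log who got it."""
    lock = threading.Lock()
    log = []

    def worker(name):
        for _ in range(iterations):
            with lock:            # stands in for mutex_lock()/mutex_unlock()
                log.append(name)  # stands in for printing "A" or "B"

    threads = [threading.Thread(target=worker, args=(n,)) for n in ("A", "B")]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return log


def count_repeats(log):
    """Count entries whose immediate predecessor came from the same thread."""
    return sum(1 for prev, cur in zip(log, log[1:]) if prev == cur)


if __name__ == "__main__":
    log = run_contention()
    print(f"{len(log)} acquisitions, {count_repeats(log)} back-to-back repeats")
```

A strictly alternating lock would give zero back-to-back repeats.  Any
non-zero count means the thread that just dropped the lock got it again
before the other one did, which is the same kind of duplicate to look
for in the kernel log.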