On Tue, May 24, 2022 at 01:30:48PM +0200, David Jander wrote:

> > But that turned out be not working properly:
> >
> > | https://lore.kernel.org/all/f86eaebb-0359-13be-f4a2-4f2b8832252e@nvidia.com/

> Ah, thanks for sharing. Added Martin to CC here.

> I have been struggling with this too. There are definitely dragons somewhere.
> I have tried to do tear-down in the same context if possible, similar to this:

There's a potential issue there with ending up spending noticeable extra
time turning the controller on and off, and how costly that is might vary
from controller to controller.

> I have not yet discovered exactly why. In the occasions the code didn't hit
> the race, it seemed to have a notable performance impact though, so removing
> this context switch may be worth the effort.

Or come up with a mechanism for ensuring we only switch to do the cleanup
when we're not otherwise busy.

> From what I understand of this, bus_lock_mutex is used to serialize spi_sync
> calls for this bus, so there cannot be any concurrency from different threads
> doing spi_sync() calls to the same bus. This means, that if spi_sync was the
> only path in existence, bus_lock_mutex would suffice, and all the other

The bus lock is there because some devices have an unfortunate need to do
multiple SPI transfers with no other devices able to generate any traffic
on the bus in between.  It seems that, even more sadly, some of the users
are using it to protect against multiple calls into themselves, so we
can't just do the simple thing and turn the bus locks into noops if
there's only one client on the bus.

However it *is* quite rarely used, so I'm thinking that what we could do
is default to not having it and then arrange to create it on first use,
or just update the clients to do something during initialisation to cause
it to be created.  That way only controllers with an affected client
would pay the cost.  I don't *think* it's otherwise needed, but I'd need
to go through and verify that.

> spinlocks and mutexes could go. Big win. But the async path is what
> complicates everything. So I have been thinking... what if we could make the
> sync and the async case two separate paths, with two separate message queues?
> In fact the sync path doesn't even need a queue really, since it will just
> handle one message beginning to end, and not return until the message is done.
> It doesn't need the worker thread either AFAICS. Or am I missing something?
> In the end, both paths would converge at the io_mutex. I am tempted to try
> this route. Any thoughts?

The sync path, like you say, doesn't exactly need a queue itself.  It's
partly looking at the queue so it can fall back to pushing work through
the thread when the controller is busy (hopefully opening up
opportunities for us to minimise the latency between completing whatever
is already going on and starting the next message), and partly about
pushing the work of idling the hardware into the thread so that it's
deferred a bit and we're less likely to end up spending time bouncing the
controller on and off when there's a sequence of synchronous operations
going on.  That second bit doesn't need us to actually look at the queue
though; we just need to kick the thread so it gets run at some point and
sees that the queue is empty.
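To make that concrete (only a sketch, not even compile tested):
__spi_pump_sync_message() below is a made-up name for whatever ends up
driving the hardware directly from the caller's context, and the race
between the idle check and taking io_mutex is cheerfully ignored.
__spi_sync() would try this first and only fall back to the existing
queued path (and its wait_for_completion()) when it returns false:

static bool __spi_sync_direct(struct spi_device *spi, struct spi_message *msg)
{
	struct spi_controller *ctlr = spi->controller;
	unsigned long flags;
	bool idle;

	spin_lock_irqsave(&ctlr->queue_lock, flags);
	/* Only bypass the queue if nothing is queued or in flight */
	idle = list_empty(&ctlr->queue) && !ctlr->busy;
	spin_unlock_irqrestore(&ctlr->queue_lock, flags);

	if (!idle)
		return false;

	/* Drive the message to completion in the caller's context */
	mutex_lock(&ctlr->io_mutex);
	__spi_pump_sync_message(ctlr, msg);	/* hypothetical helper */
	mutex_unlock(&ctlr->io_mutex);

	/*
	 * Kick the worker so that idling the hardware is deferred to it:
	 * it will see an empty queue and power things down later rather
	 * than us bouncing the controller on and off between a run of
	 * sync messages.
	 */
	kthread_queue_work(ctlr->kworker, &ctlr->pump_messages);

	return true;
}

The point being that we only touch queue_lock once to decide whether we
can bypass, and the power management work still happens in the thread.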
Again I need to think this through properly but we can probably arrange
things so that

> --> __spi_sync()
>     --> bus_lock_spinlock
>         --> queue_lock
>             --> list_add_tail()
>     --> __spi_pump_messages() (also entered here from WQ)
>         --> queue_lock
>             --> list_first_entry()

the work we do under the first queue_lock here gets shuffled into
__spi_pump_messages() (possibly replace in_kthread with passing in a
message?  Would need comments.).  That'd mean we'd at least only be
taking the queue lock once; there's a sketch of the shape at the end of
this mail.

The other potential issue with eliminating the queue entirely would be
if there are clients which are mixing async and sync operations but
expecting them to complete in order (eg, start a bunch of async stuff
then do a sync operation at the end rather than have a custom
wait/completion).
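Sketch of that reshuffled __spi_pump_messages(), purely illustrative
(the empty-queue idling and all the error handling are hand-waved away);
the worker side would just pass a NULL message instead of in_kthread:

static void __spi_pump_messages(struct spi_controller *ctlr,
				struct spi_message *msg)
{
	unsigned long flags;

	spin_lock_irqsave(&ctlr->queue_lock, flags);

	if (msg)
		/* Sync caller: add its message while we hold the lock */
		list_add_tail(&msg->queue, &ctlr->queue);

	if (ctlr->cur_msg || list_empty(&ctlr->queue)) {
		spin_unlock_irqrestore(&ctlr->queue_lock, flags);
		return;
	}

	/* Extract the head of the queue, exactly as now */
	ctlr->cur_msg = list_first_entry(&ctlr->queue,
					 struct spi_message, queue);
	list_del_init(&ctlr->cur_msg->queue);

	spin_unlock_irqrestore(&ctlr->queue_lock, flags);

	/* ... push ctlr->cur_msg to the hardware under io_mutex as now ... */
}

One nice side effect of keeping the sync message on the one queue like
this is that ordering against any async messages already queued is
preserved, which should help with the mixing concern above.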