All of lore.kernel.org
 help / color / mirror / Atom feed
From: Lukasz Luba <lukasz.luba@arm.com>
To: Cristian Marussi <cristian.marussi@arm.com>,
	linux-kernel@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org
Cc: sudeep.holla@arm.com
Subject: Re: [PATCH v4 07/13] firmware: arm_scmi: Add notification dispatch and delivery
Date: Wed, 20 May 2020 11:23:21 +0100	[thread overview]
Message-ID: <0234c256-2e29-69f8-99ae-2599f982961e@arm.com> (raw)
In-Reply-To: <20200520070907.GE17648@e119603-lin.cambridge.arm.com>

Hi Cristian,

On 5/20/20 8:09 AM, Cristian Marussi wrote:
> On Mon, Mar 16, 2020 at 02:46:05PM +0000, Cristian Marussi wrote:
>> On Thu, Mar 12, 2020 at 09:43:31PM +0000, Lukasz Luba wrote:
>>>
>>>
> 
> Hi Lukasz,
> 
> I went back looking deeper into the possible race issue you pointed out a
> while ago understanding it a bit better down below.
> 
>>> On 3/12/20 6:34 PM, Cristian Marussi wrote:
>>>> On 12/03/2020 13:51, Lukasz Luba wrote:
>>>>> Hi Cristian,
>>>>>
>> Hi Lukasz
>>
>>>>> just one comment below...
>> [snip]
>>>>>> +	eh.timestamp = ts;
>>>>>> +	eh.evt_id = evt_id;
>>>>>> +	eh.payld_sz = len;
>>>>>> +	kfifo_in(&r_evt->proto->equeue.kfifo, &eh, sizeof(eh));
>>>>>> +	kfifo_in(&r_evt->proto->equeue.kfifo, buf, len);
>>>>>> +	queue_work(r_evt->proto->equeue.wq,
>>>>>> +		   &r_evt->proto->equeue.notify_work);
>>>>>
>>>>> Is it safe to ignore the return value from the queue_work here?
>>>>>
>>>>
>>>> In fact yes, we do not want to care: it returns true or false depending on the
>>>> fact that the specific work was or not already queued, and we just rely on
>>>> this behavior to keep kicking the worker only when needed but never kick
>>>> more than one instance of it per-queue (so that there's only one reader
>>>> wq and one writer here in the scmi_notify)...explaining better:
>>>>
>>>> 1. we push an event (hdr+payld) to the protocol queue if we found that there was
>>>> enough space on the queue
>>>>
>>>> 2a. if at the time of the kfifo_in( ) the worker was already running
>>>> (queue not empty) it will process our new event sooner or later and here
>>>> the queue_work will return false, but we do not care in fact ... we
>>>> tried to kick it just in case
>>>>
>>>> 2b. if instead at the time of the kfifo_in() the queue was empty the worker would
>>>> have probably already gone to the sleep and this queue_work() will return true and
>>>> so this time it will effectively wake up the worker to process our items
>>>>
>>>> The important thing here is that we are sure to wakeup the worker when needed
>>>> but we are equally sure we are never causing the scheduling of more than one worker
>>>> thread consuming from the same queue (because that would break the one reader/one writer
>>>> assumption which let us use the fifo in a lockless manner): this is possible because
>>>> queue_work checks if the required work item is already pending and in such a case backs
>>>> out returning false and we have one work_item (notify_work) defined per-protocol and
>>>> so per-queue.
>>>
>>> I see. That's a good assumption: one work_item per protocol and simplify
>>> the locking. What if there would be an edge case scenario when the
>>> consumer (work_item) has handled the last item (there was NULL from
>>> scmi_process_event_header()), while in meantime scmi_notify put into
>>> the fifo new event but couldn't kick the queue_work. Would it stay there
>>> till the next IRQ which triggers queue_work to consume two events (one
>>> potentially a bit old)? Or we can ignore such race situation assuming
>>> that cleaning of work item is instant and kfifo_in is slow?
>>>
>>
>> In fact, this is a very good point, since between the moment the worker
>> determines that the queue is empty and the moment in which the worker
>> effectively exits (and it's marked as no more pending by the Kernel cmwq)
>> there is a window of opportunity for a race in which the ISR could fill
>> the queue with one more event and then fail to kick with queue_work() since
>> the work is in fact still nominally marked as pending from the point of view
>> of Kernel cmwq, as below:
>>
>> ISR (core N)		|	WQ (core N+1)		cmwq flags	       	queued events
>> ------------------------------------------------------------------------------------------------
>> 			| if (queue_is_empty)		- WORK_PENDING		0 events queued
>> 			+     ...			- WORK_PENDING		0 events queued
>> 			+ } while (scmi_process_event_payload);
>> 			+}// worker function exit
>> kfifo_in()		+     ...cmwq backing out	- WORK_PENDING		1 events queued
>> kfifo_in()		+     ...cmwq backing out	- WORK_PENDING		1 events queued
>> queue_work()		+     ...cmwq backing out	- WORK_PENDING		1 events queued
>>    -> FALSE (pending)	+     ...cmwq backing out	- WORK_PENDING		1 events queued
>> 			+     ...cmwq backing out	- WORK_PENDING		1 events queued
>> 			+     ...cmwq backing out	- WORK_PENDING		1 events queued
>> 			| ---- WORKER THREAD EXIT	- !WORK_PENDING		1 events queued
>> 			| 		 		- !WORK_PENDING		1 events queued
>> kfifo_in()		|     				- !WORK_PENDING		2 events queued
>> kfifo_in()		|  				- !WORK_PENDING		2 events queued
>> queue_work()		|     				- !WORK_PENDING		2 events queued
>>     -> TRUE		| --- WORKER ENTER		- WORK_PENDING		2 events queued
>> 			|  				- WORK_PENDING		2 events consumed
>> 		
>> where effectively the last event queued won't be consumed till the next
>> iteration once another event is queued.
>>
> 
> In summary, looking better at Kernel cmwq code, my explanation above about
> how the possible race could be exposed by a particular tricky limit condition
> and the values assumed by the WORK_STRUCT_PENDING_BIT was ... bullshit :D
> 
> In fact there's no race at all because Kernel cmwq takes care to clear the above
> PENDING flag BEFORE the user-provided worker-function starts to finally run:
> such flag is active only when a work instance is queued pending for execution
> but it is cleared just before execution effctively starts.
> 
> kernel/workqueue.c:process_one_work()
> 
> 	set_work_pool_and_clear_pending(work, pool->id);
> 	....
> 	worker->current_func(work);
> 
> As a consequence in the racy scenario above where the ISR pushes events on the
> queues after the worker has already determined the queue to be empty but while
> the worker func is still being deactivated in terms of Kernel cmwq internal
> handling, it is not a problem since the worker while running is already NO more
> marked pending so the queue_work succeeds and a new work will simply be queued
> and run once the current instance terminates fully and it is removed from pool.

Sounds good, thanks to for digging into this workqueue code and figuring 
it out.

> 
> On the other side in the normal non racy scenario, when the worker is processing
> normally a non-empty queue, we'll end-up anyway queueing new items and a new work
> from the ISR even if the currently executing one will in fact consume already
> naturally the queued items: this will result (it's what I observe in fact) in a
> final un-needed quick worker activation/deactivation processing zero items (empty
> queue) which is in fact harmless.
> 
> Basically the racy condition is taken care by the Kernel cmwq itself, and in fact
> there is an extensive explanation also of the barriers employed to properly
> realize this in the comments around set_work_pool_and_clear_pending()
> 
> I'll add a comment in v8 just to note this behaviour.

Great research.

Regards,
Lukasz

> 
> Thanks
> 
> Cristian
> 

WARNING: multiple messages have this Message-ID (diff)
From: Lukasz Luba <lukasz.luba@arm.com>
To: Cristian Marussi <cristian.marussi@arm.com>,
	linux-kernel@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org
Cc: sudeep.holla@arm.com
Subject: Re: [PATCH v4 07/13] firmware: arm_scmi: Add notification dispatch and delivery
Date: Wed, 20 May 2020 11:23:21 +0100	[thread overview]
Message-ID: <0234c256-2e29-69f8-99ae-2599f982961e@arm.com> (raw)
In-Reply-To: <20200520070907.GE17648@e119603-lin.cambridge.arm.com>

Hi Cristian,

On 5/20/20 8:09 AM, Cristian Marussi wrote:
> On Mon, Mar 16, 2020 at 02:46:05PM +0000, Cristian Marussi wrote:
>> On Thu, Mar 12, 2020 at 09:43:31PM +0000, Lukasz Luba wrote:
>>>
>>>
> 
> Hi Lukasz,
> 
> I went back looking deeper into the possible race issue you pointed out a
> while ago understanding it a bit better down below.
> 
>>> On 3/12/20 6:34 PM, Cristian Marussi wrote:
>>>> On 12/03/2020 13:51, Lukasz Luba wrote:
>>>>> Hi Cristian,
>>>>>
>> Hi Lukasz
>>
>>>>> just one comment below...
>> [snip]
>>>>>> +	eh.timestamp = ts;
>>>>>> +	eh.evt_id = evt_id;
>>>>>> +	eh.payld_sz = len;
>>>>>> +	kfifo_in(&r_evt->proto->equeue.kfifo, &eh, sizeof(eh));
>>>>>> +	kfifo_in(&r_evt->proto->equeue.kfifo, buf, len);
>>>>>> +	queue_work(r_evt->proto->equeue.wq,
>>>>>> +		   &r_evt->proto->equeue.notify_work);
>>>>>
>>>>> Is it safe to ignore the return value from the queue_work here?
>>>>>
>>>>
>>>> In fact yes, we do not want to care: it returns true or false depending on the
>>>> fact that the specific work was or not already queued, and we just rely on
>>>> this behavior to keep kicking the worker only when needed but never kick
>>>> more than one instance of it per-queue (so that there's only one reader
>>>> wq and one writer here in the scmi_notify)...explaining better:
>>>>
>>>> 1. we push an event (hdr+payld) to the protocol queue if we found that there was
>>>> enough space on the queue
>>>>
>>>> 2a. if at the time of the kfifo_in( ) the worker was already running
>>>> (queue not empty) it will process our new event sooner or later and here
>>>> the queue_work will return false, but we do not care in fact ... we
>>>> tried to kick it just in case
>>>>
>>>> 2b. if instead at the time of the kfifo_in() the queue was empty the worker would
>>>> have probably already gone to the sleep and this queue_work() will return true and
>>>> so this time it will effectively wake up the worker to process our items
>>>>
>>>> The important thing here is that we are sure to wakeup the worker when needed
>>>> but we are equally sure we are never causing the scheduling of more than one worker
>>>> thread consuming from the same queue (because that would break the one reader/one writer
>>>> assumption which let us use the fifo in a lockless manner): this is possible because
>>>> queue_work checks if the required work item is already pending and in such a case backs
>>>> out returning false and we have one work_item (notify_work) defined per-protocol and
>>>> so per-queue.
>>>
>>> I see. That's a good assumption: one work_item per protocol and simplify
>>> the locking. What if there would be an edge case scenario when the
>>> consumer (work_item) has handled the last item (there was NULL from
>>> scmi_process_event_header()), while in meantime scmi_notify put into
>>> the fifo new event but couldn't kick the queue_work. Would it stay there
>>> till the next IRQ which triggers queue_work to consume two events (one
>>> potentially a bit old)? Or we can ignore such race situation assuming
>>> that cleaning of work item is instant and kfifo_in is slow?
>>>
>>
>> In fact, this is a very good point, since between the moment the worker
>> determines that the queue is empty and the moment in which the worker
>> effectively exits (and it's marked as no more pending by the Kernel cmwq)
>> there is a window of opportunity for a race in which the ISR could fill
>> the queue with one more event and then fail to kick with queue_work() since
>> the work is in fact still nominally marked as pending from the point of view
>> of Kernel cmwq, as below:
>>
>> ISR (core N)		|	WQ (core N+1)		cmwq flags	       	queued events
>> ------------------------------------------------------------------------------------------------
>> 			| if (queue_is_empty)		- WORK_PENDING		0 events queued
>> 			+     ...			- WORK_PENDING		0 events queued
>> 			+ } while (scmi_process_event_payload);
>> 			+}// worker function exit
>> kfifo_in()		+     ...cmwq backing out	- WORK_PENDING		1 events queued
>> kfifo_in()		+     ...cmwq backing out	- WORK_PENDING		1 events queued
>> queue_work()		+     ...cmwq backing out	- WORK_PENDING		1 events queued
>>    -> FALSE (pending)	+     ...cmwq backing out	- WORK_PENDING		1 events queued
>> 			+     ...cmwq backing out	- WORK_PENDING		1 events queued
>> 			+     ...cmwq backing out	- WORK_PENDING		1 events queued
>> 			| ---- WORKER THREAD EXIT	- !WORK_PENDING		1 events queued
>> 			| 		 		- !WORK_PENDING		1 events queued
>> kfifo_in()		|     				- !WORK_PENDING		2 events queued
>> kfifo_in()		|  				- !WORK_PENDING		2 events queued
>> queue_work()		|     				- !WORK_PENDING		2 events queued
>>     -> TRUE		| --- WORKER ENTER		- WORK_PENDING		2 events queued
>> 			|  				- WORK_PENDING		2 events consumed
>> 		
>> where effectively the last event queued won't be consumed till the next
>> iteration once another event is queued.
>>
> 
> In summary, looking better at Kernel cmwq code, my explanation above about
> how the possible race could be exposed by a particular tricky limit condition
> and the values assumed by the WORK_STRUCT_PENDING_BIT was ... bullshit :D
> 
> In fact there's no race at all because Kernel cmwq takes care to clear the above
> PENDING flag BEFORE the user-provided worker-function starts to finally run:
> such flag is active only when a work instance is queued pending for execution
> but it is cleared just before execution effctively starts.
> 
> kernel/workqueue.c:process_one_work()
> 
> 	set_work_pool_and_clear_pending(work, pool->id);
> 	....
> 	worker->current_func(work);
> 
> As a consequence in the racy scenario above where the ISR pushes events on the
> queues after the worker has already determined the queue to be empty but while
> the worker func is still being deactivated in terms of Kernel cmwq internal
> handling, it is not a problem since the worker while running is already NO more
> marked pending so the queue_work succeeds and a new work will simply be queued
> and run once the current instance terminates fully and it is removed from pool.

Sounds good, thanks to for digging into this workqueue code and figuring 
it out.

> 
> On the other side in the normal non racy scenario, when the worker is processing
> normally a non-empty queue, we'll end-up anyway queueing new items and a new work
> from the ISR even if the currently executing one will in fact consume already
> naturally the queued items: this will result (it's what I observe in fact) in a
> final un-needed quick worker activation/deactivation processing zero items (empty
> queue) which is in fact harmless.
> 
> Basically the racy condition is taken care by the Kernel cmwq itself, and in fact
> there is an extensive explanation also of the barriers employed to properly
> realize this in the comments around set_work_pool_and_clear_pending()
> 
> I'll add a comment in v8 just to note this behaviour.

Great research.

Regards,
Lukasz

> 
> Thanks
> 
> Cristian
> 

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

  reply	other threads:[~2020-05-20 10:23 UTC|newest]

Thread overview: 70+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-03-04 16:25 [PATCH v4 00/13] SCMI Notifications Core Support Cristian Marussi
2020-03-04 16:25 ` Cristian Marussi
2020-03-04 16:25 ` [PATCH v4 01/13] firmware: arm_scmi: Add receive buffer support for notifications Cristian Marussi
2020-03-04 16:25   ` Cristian Marussi
2020-03-04 16:25 ` [PATCH v4 02/13] firmware: arm_scmi: Update protocol commands and notification list Cristian Marussi
2020-03-04 16:25   ` Cristian Marussi
2020-03-04 16:25 ` [PATCH v4 03/13] firmware: arm_scmi: Add notifications support in transport layer Cristian Marussi
2020-03-04 16:25   ` Cristian Marussi
2020-03-04 16:25 ` [PATCH v4 04/13] firmware: arm_scmi: Add support for notifications message processing Cristian Marussi
2020-03-04 16:25   ` Cristian Marussi
2020-03-04 16:25 ` [PATCH v4 05/13] firmware: arm_scmi: Add notification protocol-registration Cristian Marussi
2020-03-04 16:25   ` Cristian Marussi
2020-03-09 11:33   ` Jonathan Cameron
2020-03-09 11:33     ` Jonathan Cameron
2020-03-09 12:04     ` Cristian Marussi
2020-03-09 12:04       ` Cristian Marussi
2020-03-04 16:25 ` [PATCH v4 06/13] firmware: arm_scmi: Add notification callbacks-registration Cristian Marussi
2020-03-04 16:25   ` Cristian Marussi
2020-03-09 11:50   ` Jonathan Cameron
2020-03-09 11:50     ` Jonathan Cameron
2020-03-09 12:25     ` Cristian Marussi
2020-03-09 12:25       ` Cristian Marussi
2020-03-04 16:25 ` [PATCH v4 07/13] firmware: arm_scmi: Add notification dispatch and delivery Cristian Marussi
2020-03-04 16:25   ` Cristian Marussi
2020-03-09 12:26   ` Jonathan Cameron
2020-03-09 12:26     ` Jonathan Cameron
2020-03-09 16:37     ` Cristian Marussi
2020-03-09 16:37       ` Cristian Marussi
2020-03-10 10:01       ` Jonathan Cameron
2020-03-10 10:01         ` Jonathan Cameron
2020-03-12 13:51   ` Lukasz Luba
2020-03-12 13:51     ` Lukasz Luba
2020-03-12 14:06     ` Lukasz Luba
2020-03-12 14:06       ` Lukasz Luba
2020-03-12 19:24       ` Cristian Marussi
2020-03-12 19:24         ` Cristian Marussi
2020-03-12 20:57         ` Lukasz Luba
2020-03-12 20:57           ` Lukasz Luba
2020-03-12 18:34     ` Cristian Marussi
2020-03-12 18:34       ` Cristian Marussi
2020-03-12 21:43       ` Lukasz Luba
2020-03-12 21:43         ` Lukasz Luba
2020-03-16 14:46         ` Cristian Marussi
2020-03-16 14:46           ` Cristian Marussi
2020-03-18  8:26           ` Lukasz Luba
2020-03-18  8:26             ` Lukasz Luba
2020-03-23  8:28             ` Cristian Marussi
2020-03-23  8:28               ` Cristian Marussi
2020-05-20  7:09           ` Cristian Marussi
2020-05-20  7:09             ` Cristian Marussi
2020-05-20 10:23             ` Lukasz Luba [this message]
2020-05-20 10:23               ` Lukasz Luba
2020-03-04 16:25 ` [PATCH v4 08/13] firmware: arm_scmi: Enable notification core Cristian Marussi
2020-03-04 16:25   ` Cristian Marussi
2020-03-04 16:25 ` [PATCH v4 09/13] firmware: arm_scmi: Add Power notifications support Cristian Marussi
2020-03-04 16:25   ` Cristian Marussi
2020-03-09 12:28   ` Jonathan Cameron
2020-03-09 12:28     ` Jonathan Cameron
2020-03-09 16:39     ` Cristian Marussi
2020-03-09 16:39       ` Cristian Marussi
2020-03-04 16:25 ` [PATCH v4 10/13] firmware: arm_scmi: Add Perf " Cristian Marussi
2020-03-04 16:25   ` Cristian Marussi
2020-03-04 16:25 ` [PATCH v4 11/13] firmware: arm_scmi: Add Sensor " Cristian Marussi
2020-03-04 16:25   ` Cristian Marussi
2020-03-04 16:25 ` [PATCH v4 12/13] firmware: arm_scmi: Add Reset " Cristian Marussi
2020-03-04 16:25   ` Cristian Marussi
2020-03-04 16:25 ` [PATCH v4 13/13] firmware: arm_scmi: Add Base " Cristian Marussi
2020-03-04 16:25   ` Cristian Marussi
2020-03-09 12:33 ` [PATCH v4 00/13] SCMI Notifications Core Support Jonathan Cameron
2020-03-09 12:33   ` Jonathan Cameron

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=0234c256-2e29-69f8-99ae-2599f982961e@arm.com \
    --to=lukasz.luba@arm.com \
    --cc=cristian.marussi@arm.com \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=sudeep.holla@arm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.