All of lore.kernel.org
 help / color / mirror / Atom feed
* libfc mutex under RCU read section error since "libfc: Fixup disc_mutex handling"
@ 2018-07-31 17:56 Chris Leech
       [not found] ` <20180731175607.GA1237244-r8IHplWLGbA5tHQWs+pTeqPFFGjUI2lm2LY78lusg7I@public.gmane.org>
  0 siblings, 1 reply; 4+ messages in thread
From: Chris Leech @ 2018-07-31 17:56 UTC (permalink / raw)
  To: linux-scsi-u79uwXL29TY76Z2rM5mHXA,
	fcoe-devel-s9riP+hp16TNLxjTenLetw, Hannes Reinecke,
	Johannes Thumshirn

The libfc and libfcoe modules look to have issues with mutexs being used
under rcu_read_lock, which triggers "Illegal context switch in RCU read
side critical section" warnings under debug kernels.

I started looking at fc_disc.c where rport discovery code appears to
take a mutex for each rport in fc_rport_login/fc_rport_logoff while
being called under rcu_read_lock from fc_disc_done.

With the rdata->kref held, I know rcu_read_unlock can be safely called
before login fc_rport_login/logoff.  But I don't know if it's safe to
then recall rcu_read_lock and simply continue on with the
list_for_each_entry_rcu or not.

And as I start looking into this one case, I also get the same class of
problem popping up in at least libfcoe:fcoe_ctrl.c from
fcoe_ctrl_timer_work as well :(

- Chris Leech

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: libfc mutex under RCU read section error since "libfc: Fixup disc_mutex handling"
       [not found] ` <20180731175607.GA1237244-r8IHplWLGbA5tHQWs+pTeqPFFGjUI2lm2LY78lusg7I@public.gmane.org>
@ 2018-08-01  6:29   ` Hannes Reinecke
       [not found]     ` <e3bd6958-ca17-bc75-2a66-29d7d2a38ebf-l3A5Bk7waGM@public.gmane.org>
  0 siblings, 1 reply; 4+ messages in thread
From: Hannes Reinecke @ 2018-08-01  6:29 UTC (permalink / raw)
  To: Chris Leech, linux-scsi-u79uwXL29TY76Z2rM5mHXA,
	fcoe-devel-s9riP+hp16TNLxjTenLetw, Johannes Thumshirn

On 07/31/2018 07:56 PM, Chris Leech wrote:
> The libfc and libfcoe modules look to have issues with mutexs being used
> under rcu_read_lock, which triggers "Illegal context switch in RCU read
> side critical section" warnings under debug kernels.
> 
> I started looking at fc_disc.c where rport discovery code appears to
> take a mutex for each rport in fc_rport_login/fc_rport_logoff while
> being called under rcu_read_lock from fc_disc_done.
> 
> With the rdata->kref held, I know rcu_read_unlock can be safely called
> before login fc_rport_login/logoff.  But I don't know if it's safe to
> then recall rcu_read_lock and simply continue on with the
> list_for_each_entry_rcu or not.
> 
> And as I start looking into this one case, I also get the same class of
> problem popping up in at least libfcoe:fcoe_ctrl.c from
> fcoe_ctrl_timer_work as well :(
> 
This should've been fixed by my patch series "libfc/fcoe: disc_mutex fixes".

Can you check if this series is applied in your tree?

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		   Teamlead Storage & Networking
hare@suse.de			               +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)
_______________________________________________
fcoe-devel mailing list
fcoe-devel@open-fcoe.org
http://lists.open-fcoe.org/mailman/listinfo/fcoe-devel

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: libfc mutex under RCU read section error since "libfc: Fixup disc_mutex handling"
       [not found]     ` <e3bd6958-ca17-bc75-2a66-29d7d2a38ebf-l3A5Bk7waGM@public.gmane.org>
@ 2018-08-02  1:09       ` Chris Leech
       [not found]         ` <20180802010900.GA2222132-r8IHplWLGbA5tHQWs+pTeqPFFGjUI2lm2LY78lusg7I@public.gmane.org>
  0 siblings, 1 reply; 4+ messages in thread
From: Chris Leech @ 2018-08-02  1:09 UTC (permalink / raw)
  To: Hannes Reinecke
  Cc: fcoe-devel-s9riP+hp16TNLxjTenLetw, Johannes Thumshirn,
	linux-scsi-u79uwXL29TY76Z2rM5mHXA

On Wed, Aug 01, 2018 at 08:29:00AM +0200, Hannes Reinecke wrote:
> On 07/31/2018 07:56 PM, Chris Leech wrote:
> > The libfc and libfcoe modules look to have issues with mutexs being used
> > under rcu_read_lock, which triggers "Illegal context switch in RCU read
> > side critical section" warnings under debug kernels.
> > 
> > I started looking at fc_disc.c where rport discovery code appears to
> > take a mutex for each rport in fc_rport_login/fc_rport_logoff while
> > being called under rcu_read_lock from fc_disc_done.
> > 
> > With the rdata->kref held, I know rcu_read_unlock can be safely called
> > before login fc_rport_login/logoff.  But I don't know if it's safe to
> > then recall rcu_read_lock and simply continue on with the
> > list_for_each_entry_rcu or not.
> > 
> > And as I start looking into this one case, I also get the same class of
> > problem popping up in at least libfcoe:fcoe_ctrl.c from
> > fcoe_ctrl_timer_work as well :(
> > 
> This should've been fixed by my patch series "libfc/fcoe: disc_mutex fixes".
> 
> Can you check if this series is applied in your tree?

Thanks, I missed those and they do fix things up!

One question about "Add WARN_ON() when deleting rports"

>	--- a/drivers/scsi/libfc/fc_rport.c
>	+++ b/drivers/scsi/libfc/fc_rport.c
>	@@ -184,6 +184,7 @@ void fc_rport_destroy(struct kref *kref)
>		struct fc_rport_priv *rdata;
>	 
>		rdata = container_of(kref, struct fc_rport_priv, kref);
>	+	WARN_ON(!list_empty(&rdata->peers));
>		kfree_rcu(rdata, rcu);
>	 }
>	 EXPORT_SYMBOL(fc_rport_destroy);

The list_head isn't reinitialized to make list_empty() return true after
list_del_rcu(), as there may still be readers traversing the list.

So doesn't this warning trigger on every deleted rport?

Thanks!
- Chris

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: libfc mutex under RCU read section error since "libfc: Fixup disc_mutex handling"
       [not found]         ` <20180802010900.GA2222132-r8IHplWLGbA5tHQWs+pTeqPFFGjUI2lm2LY78lusg7I@public.gmane.org>
@ 2018-08-02  5:37           ` Hannes Reinecke
  0 siblings, 0 replies; 4+ messages in thread
From: Hannes Reinecke @ 2018-08-02  5:37 UTC (permalink / raw)
  To: Chris Leech, linux-scsi-u79uwXL29TY76Z2rM5mHXA,
	fcoe-devel-s9riP+hp16TNLxjTenLetw, Johannes Thumshirn

On 08/02/2018 03:09 AM, Chris Leech wrote:
> On Wed, Aug 01, 2018 at 08:29:00AM +0200, Hannes Reinecke wrote:
>> On 07/31/2018 07:56 PM, Chris Leech wrote:
>>> The libfc and libfcoe modules look to have issues with mutexs being used
>>> under rcu_read_lock, which triggers "Illegal context switch in RCU read
>>> side critical section" warnings under debug kernels.
>>>
>>> I started looking at fc_disc.c where rport discovery code appears to
>>> take a mutex for each rport in fc_rport_login/fc_rport_logoff while
>>> being called under rcu_read_lock from fc_disc_done.
>>>
>>> With the rdata->kref held, I know rcu_read_unlock can be safely called
>>> before login fc_rport_login/logoff.  But I don't know if it's safe to
>>> then recall rcu_read_lock and simply continue on with the
>>> list_for_each_entry_rcu or not.
>>>
>>> And as I start looking into this one case, I also get the same class of
>>> problem popping up in at least libfcoe:fcoe_ctrl.c from
>>> fcoe_ctrl_timer_work as well :(
>>>
>> This should've been fixed by my patch series "libfc/fcoe: disc_mutex fixes".
>>
>> Can you check if this series is applied in your tree?
> 
> Thanks, I missed those and they do fix things up!
> 
> One question about "Add WARN_ON() when deleting rports"
> 
>> 	--- a/drivers/scsi/libfc/fc_rport.c
>> 	+++ b/drivers/scsi/libfc/fc_rport.c
>> 	@@ -184,6 +184,7 @@ void fc_rport_destroy(struct kref *kref)
>> 		struct fc_rport_priv *rdata;
>> 	 
>> 		rdata = container_of(kref, struct fc_rport_priv, kref);
>> 	+	WARN_ON(!list_empty(&rdata->peers));
>> 		kfree_rcu(rdata, rcu);
>> 	 }
>> 	 EXPORT_SYMBOL(fc_rport_destroy);
> 
> The list_head isn't reinitialized to make list_empty() return true after
> list_del_rcu(), as there may still be readers traversing the list.
> 
> So doesn't this warning trigger on every deleted rport?
> 
Not that I've seen, but I'll check.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		   Teamlead Storage & Networking
hare@suse.de			               +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)
_______________________________________________
fcoe-devel mailing list
fcoe-devel@open-fcoe.org
http://lists.open-fcoe.org/mailman/listinfo/fcoe-devel

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2018-08-02  5:37 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-07-31 17:56 libfc mutex under RCU read section error since "libfc: Fixup disc_mutex handling" Chris Leech
     [not found] ` <20180731175607.GA1237244-r8IHplWLGbA5tHQWs+pTeqPFFGjUI2lm2LY78lusg7I@public.gmane.org>
2018-08-01  6:29   ` Hannes Reinecke
     [not found]     ` <e3bd6958-ca17-bc75-2a66-29d7d2a38ebf-l3A5Bk7waGM@public.gmane.org>
2018-08-02  1:09       ` Chris Leech
     [not found]         ` <20180802010900.GA2222132-r8IHplWLGbA5tHQWs+pTeqPFFGjUI2lm2LY78lusg7I@public.gmane.org>
2018-08-02  5:37           ` Hannes Reinecke

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.