linux-block.vger.kernel.org archive mirror
* Re: Question on handling managed IRQs when hotplugging CPUs
       [not found]         ` <alpine.DEB.2.21.1901301338170.5537@nanos.tec.linutronix.de>
@ 2019-01-31 17:48           ` John Garry
  2019-02-01 15:56             ` Hannes Reinecke
  0 siblings, 1 reply; 15+ messages in thread
From: John Garry @ 2019-01-31 17:48 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Keith Busch, Christoph Hellwig, Marc Zyngier, axboe,
	Peter Zijlstra, Michael Ellerman, Linuxarm, linux-kernel,
	Hannes Reinecke, linux-scsi, linux-block

On 30/01/2019 12:43, Thomas Gleixner wrote:
> On Wed, 30 Jan 2019, John Garry wrote:
>> On 29/01/2019 17:20, Keith Busch wrote:
>>> On Tue, Jan 29, 2019 at 05:12:40PM +0000, John Garry wrote:
>>>> On 29/01/2019 15:44, Keith Busch wrote:
>>>>>
>>>>> Hm, we used to freeze the queues with CPUHP_BLK_MQ_PREPARE callback,
>>>>> which would reap all outstanding commands before the CPU and IRQ are
>>>>> taken offline. That was removed with commit 4b855ad37194f ("blk-mq:
>>>>> Create hctx for each present CPU"). It sounds like we should bring
>>>>> something like that back, but make it more fine-grained for the per-cpu
>>>>> context.
>>>>>
>>>>
>>>> Seems reasonable. But we would need it to deal with drivers that
>>>> only
>>>> expose a single queue to BLK MQ, but use many queues internally. I think
>>>> megaraid sas does this, for example.
>>>>
>>>> I would also be slightly concerned with commands issued by the
>>>> driver that are unknown to blk-mq, like SCSI TMFs.
>>>
>>> I don't think either of those descriptions sound like good candidates
>>> for using managed IRQ affinities.
>>
>> I wouldn't say that this behaviour is obvious to the developer. I can't see
>> anything in Documentation/PCI/MSI-HOWTO.txt
>>
>> It also seems that this policy to rely on upper layer to flush+freeze queues
>> would cause issues if managed IRQs are used by drivers in other subsystems.
>> Network controllers may have multiple queues and unsolicited interrupts.
>
> It doesn't matter which part is managing flush/freeze of queues as long
> as something (either common subsystem code, upper layers or the driver
> itself) does it.
>
> So for the megaraid SAS example the BLK MQ layer obviously can't do
> anything because it only sees a single request queue. But the driver could,
> if the hardware supports it, tell the device to stop queueing
> completions on the completion queue which is associated with a particular
> CPU (or set of CPUs) during offline and then wait for the in-flight stuff
> to be finished. If the hardware does not allow that, then managed
> interrupts can't work for it.
>

A rough audit of current SCSI drivers shows that these set 
PCI_IRQ_AFFINITY in some path but don't set Scsi_host.nr_hw_queues at all:
aacraid, be2iscsi, csiostor, megaraid, mpt3sas

I don't know the specific driver details, like how to change the completion queue.
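
For illustration, the pattern looks roughly like this (foo_* is a made-up
driver, not code from any of the drivers above):

#include <linux/pci.h>

#define FOO_MAX_VECS	16

/* hypothetical LLDD probe path */
static int foo_setup_irqs(struct pci_dev *pdev)
{
	int nvec;

	/*
	 * Managed affinity: the core spreads the vectors over the CPUs
	 * and shuts a vector down once the last CPU in its mask goes
	 * offline, but nothing drains the matching internal queue first.
	 */
	nvec = pci_alloc_irq_vectors(pdev, 1, FOO_MAX_VECS,
				     PCI_IRQ_MSIX | PCI_IRQ_AFFINITY);
	if (nvec < 0)
		return nvec;

	/*
	 * Scsi_Host.nr_hw_queues is never set, so blk-mq only ever sees
	 * a single hardware queue and cannot know which requests belong
	 * to which vector/CPU set.
	 */
	return 0;
}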

Thanks,
John

> Thanks,
>
> 	tglx
>
> .
>




* Re: Question on handling managed IRQs when hotplugging CPUs
  2019-01-31 17:48           ` Question on handling managed IRQs when hotplugging CPUs John Garry
@ 2019-02-01 15:56             ` Hannes Reinecke
  2019-02-01 21:57               ` Thomas Gleixner
  0 siblings, 1 reply; 15+ messages in thread
From: Hannes Reinecke @ 2019-02-01 15:56 UTC (permalink / raw)
  To: John Garry, Thomas Gleixner
  Cc: Keith Busch, Christoph Hellwig, Marc Zyngier, axboe,
	Peter Zijlstra, Michael Ellerman, Linuxarm, linux-kernel,
	Hannes Reinecke, linux-scsi, linux-block

On 1/31/19 6:48 PM, John Garry wrote:
> On 30/01/2019 12:43, Thomas Gleixner wrote:
>> On Wed, 30 Jan 2019, John Garry wrote:
>>> On 29/01/2019 17:20, Keith Busch wrote:
>>>> On Tue, Jan 29, 2019 at 05:12:40PM +0000, John Garry wrote:
>>>>> On 29/01/2019 15:44, Keith Busch wrote:
>>>>>>
>>>>>> Hm, we used to freeze the queues with CPUHP_BLK_MQ_PREPARE callback,
>>>>>> which would reap all outstanding commands before the CPU and IRQ are
>>>>>> taken offline. That was removed with commit 4b855ad37194f ("blk-mq:
>>>>>> Create hctx for each present CPU"). It sounds like we should bring
>>>>>> something like that back, but make it more fine-grained for the per-cpu
>>>>>> context.
>>>>>>
>>>>>
>>>>> Seems reasonable. But we would need it to deal with drivers that
>>>>> only
>>>>> expose a single queue to BLK MQ, but use many queues internally. I 
>>>>> think
>>>>> megaraid sas does this, for example.
>>>>>
>>>>> I would also be slightly concerned with commands issued by the
>>>>> driver that are unknown to blk-mq, like SCSI TMFs.
>>>>
>>>> I don't think either of those descriptions sound like good candidates
>>>> for using managed IRQ affinities.
>>>
>>> I wouldn't say that this behaviour is obvious to the developer. I 
>>> can't see
>>> anything in Documentation/PCI/MSI-HOWTO.txt
>>>
>>> It also seems that this policy to rely on upper layer to flush+freeze 
>>> queues
>>> would cause issues if managed IRQs are used by drivers in other 
>>> subsystems.
>>> Network controllers may have multiple queues and unsolicited 
>>> interrupts.
>>
>> It doesn't matter which part is managing flush/freeze of queues as long
>> as something (either common subsystem code, upper layers or the driver
>> itself) does it.
>>
>> So for the megaraid SAS example the BLK MQ layer obviously can't do
>> anything because it only sees a single request queue. But the driver 
>> could,
>> if the hardware supports it, tell the device to stop queueing
>> completions on the completion queue which is associated with a particular
>> CPU (or set of CPUs) during offline and then wait for the in-flight stuff
>> to be finished. If the hardware does not allow that, then managed
>> interrupts can't work for it.
>>
> 
> A rough audit of current SCSI drivers shows that these set 
> PCI_IRQ_AFFINITY in some path but don't set Scsi_host.nr_hw_queues at all:
> aacraid, be2iscsi, csiostor, megaraid, mpt3sas
> 
Megaraid and mpt3sas don't have that functionality (or, at least, not 
that I'm aware of).
And in general I'm not sure if the above approach is feasible.

Thing is, if we have _managed_ CPU hotplug (ie if the hardware provides 
some means of quiescing the CPU before hotplug) then the whole thing is 
trivial; disable SQ and wait for all outstanding commands to complete.
Then trivially all requests are completed and the issue is resolved.
Even with today's infrastructure.
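
To sketch what I mean (foo_* is made up; it assumes the HBA can disable an
individual SQ and that the driver tracks what the hardware still owns):

#include <linux/pci.h>
#include <linux/cpumask.h>
#include <linux/atomic.h>
#include <linux/delay.h>

struct foo_queue {
	atomic_t	outstanding;	/* commands the HW still owns */
	int		msix_index;
};

struct foo_dev {
	struct pci_dev		*pdev;
	struct foo_queue	*queues;	/* one SQ/CQ pair per vector */
	int			nr_queues;
};

/* hardware specific, assumed to exist for this sketch */
static void foo_hw_disable_sq(struct foo_dev *fdev, int idx) { }

/*
 * Disable the SQ(s) whose vector is affine to @cpu and wait until the
 * hardware has completed everything it was given.  For simplicity this
 * ignores the case where other CPUs in the vector's mask are still online.
 */
static void foo_quiesce_cpu(struct foo_dev *fdev, unsigned int cpu)
{
	int i;

	for (i = 0; i < fdev->nr_queues; i++) {
		struct foo_queue *q = &fdev->queues[i];
		const struct cpumask *mask;

		mask = pci_irq_get_affinity(fdev->pdev, q->msix_index);
		if (!mask || !cpumask_test_cpu(cpu, mask))
			continue;

		foo_hw_disable_sq(fdev, i);
		while (atomic_read(&q->outstanding))
			msleep(1);
	}
}

Wired up from a CPU hotplug callback this would run before the vector for
that CPU is shut down.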

And I'm not sure if we can handle surprise CPU hotplug at all, given all 
the possible race conditions.
But then I might be wrong.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		   Teamlead Storage & Networking
hare@suse.de			               +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)


* Re: Question on handling managed IRQs when hotplugging CPUs
  2019-02-01 15:56             ` Hannes Reinecke
@ 2019-02-01 21:57               ` Thomas Gleixner
  2019-02-04  7:12                 ` Hannes Reinecke
  0 siblings, 1 reply; 15+ messages in thread
From: Thomas Gleixner @ 2019-02-01 21:57 UTC (permalink / raw)
  To: Hannes Reinecke
  Cc: John Garry, Keith Busch, Christoph Hellwig, Marc Zyngier, axboe,
	Peter Zijlstra, Michael Ellerman, Linuxarm, linux-kernel,
	Hannes Reinecke, linux-scsi, linux-block

On Fri, 1 Feb 2019, Hannes Reinecke wrote:
> Thing is, if we have _managed_ CPU hotplug (ie if the hardware provides some
> means of quiescing the CPU before hotplug) then the whole thing is trivial;
> disable SQ and wait for all outstanding commands to complete.
> Then trivially all requests are completed and the issue is resolved.
> Even with today's infrastructure.
> 
> And I'm not sure if we can handle surprise CPU hotplug at all, given all the
> possible race conditions.
> But then I might be wrong.

The kernel would completely fall apart when a CPU would vanish by surprise,
i.e. uncontrolled by the kernel. Then the SCSI driver exploding would be
the least of our problems.

Thanks,

	tglx


* Re: Question on handling managed IRQs when hotplugging CPUs
  2019-02-01 21:57               ` Thomas Gleixner
@ 2019-02-04  7:12                 ` Hannes Reinecke
  2019-02-05 13:24                   ` John Garry
  0 siblings, 1 reply; 15+ messages in thread
From: Hannes Reinecke @ 2019-02-04  7:12 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: John Garry, Keith Busch, Christoph Hellwig, Marc Zyngier, axboe,
	Peter Zijlstra, Michael Ellerman, Linuxarm, linux-kernel,
	Hannes Reinecke, linux-scsi, linux-block

On 2/1/19 10:57 PM, Thomas Gleixner wrote:
> On Fri, 1 Feb 2019, Hannes Reinecke wrote:
>> Thing is, if we have _managed_ CPU hotplug (ie if the hardware provides some
>> means of quiescing the CPU before hotplug) then the whole thing is trivial;
>> disable SQ and wait for all outstanding commands to complete.
>> Then trivially all requests are completed and the issue is resolved.
>> Even with today's infrastructure.
>>
>> And I'm not sure if we can handle surprise CPU hotplug at all, given all the
>> possible race conditions.
>> But then I might be wrong.
> 
> The kernel would completely fall apart when a CPU would vanish by surprise,
> i.e. uncontrolled by the kernel. Then the SCSI driver exploding would be
> the least of our problems.
> 
Hehe. As I thought.

So, as the user then has to wait for the system to declare 'ready for 
CPU remove', why can't we just disable the SQ and wait for all I/O to 
complete?
We can make it more fine-grained by just waiting on all outstanding I/O 
on that SQ to complete, but waiting for all I/O should be good as an 
initial try.
With that we wouldn't need to fiddle with driver internals, and could 
make it pretty generic.
And we could always add more detailed logic if the driver has the means 
for doing so.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		   Teamlead Storage & Networking
hare@suse.de			               +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)


* Re: Question on handling managed IRQs when hotplugging CPUs
  2019-02-04  7:12                 ` Hannes Reinecke
@ 2019-02-05 13:24                   ` John Garry
  2019-02-05 14:52                     ` Keith Busch
  0 siblings, 1 reply; 15+ messages in thread
From: John Garry @ 2019-02-05 13:24 UTC (permalink / raw)
  To: Hannes Reinecke, Thomas Gleixner
  Cc: Keith Busch, Christoph Hellwig, Marc Zyngier, axboe,
	Peter Zijlstra, Michael Ellerman, Linuxarm, linux-kernel,
	Hannes Reinecke, linux-scsi, linux-block

On 04/02/2019 07:12, Hannes Reinecke wrote:
> On 2/1/19 10:57 PM, Thomas Gleixner wrote:
>> On Fri, 1 Feb 2019, Hannes Reinecke wrote:
>>> Thing is, if we have _managed_ CPU hotplug (ie if the hardware
>>> provides some
>>> means of quiescing the CPU before hotplug) then the whole thing is
>>> trivial;
>>> disable SQ and wait for all outstanding commands to complete.
>>> Then trivially all requests are completed and the issue is resolved.
>>> Even with today's infrastructure.
>>>
>>> And I'm not sure if we can handle surprise CPU hotplug at all, given
>>> all the
>>> possible race conditions.
>>> But then I might be wrong.
>>
>> The kernel would completely fall apart when a CPU would vanish by
>> surprise,
>> i.e. uncontrolled by the kernel. Then the SCSI driver exploding would be
>> the least of our problems.
>>
> Hehe. As I thought.

Hi Hannes,

>
> So, as the user then has to wait for the system to declare 'ready for
> CPU remove', why can't we just disable the SQ and wait for all I/O to
> complete?
> We can make it more fine-grained by just waiting on all outstanding I/O
> on that SQ to complete, but waiting for all I/O should be good as an
> initial try.
> With that we wouldn't need to fiddle with driver internals, and could
> make it pretty generic.

I don't fully understand this idea - specifically, at which layer would 
we be waiting for all the IO to complete?

> And we could always add more detailed logic if the driver has the means
> for doing so.
>

Thanks,
John

> Cheers,
>
> Hannes




* Re: Question on handling managed IRQs when hotplugging CPUs
  2019-02-05 13:24                   ` John Garry
@ 2019-02-05 14:52                     ` Keith Busch
  2019-02-05 15:09                       ` John Garry
  2019-02-05 15:10                       ` Hannes Reinecke
  0 siblings, 2 replies; 15+ messages in thread
From: Keith Busch @ 2019-02-05 14:52 UTC (permalink / raw)
  To: John Garry
  Cc: Hannes Reinecke, Thomas Gleixner, Christoph Hellwig,
	Marc Zyngier, axboe, Peter Zijlstra, Michael Ellerman, Linuxarm,
	linux-kernel, Hannes Reinecke, linux-scsi, linux-block

On Tue, Feb 05, 2019 at 05:24:11AM -0800, John Garry wrote:
> On 04/02/2019 07:12, Hannes Reinecke wrote:
> 
> Hi Hannes,
> 
> >
> > So, as the user then has to wait for the system to declare 'ready for
> > CPU remove', why can't we just disable the SQ and wait for all I/O to
> > complete?
> > We can make it more fine-grained by just waiting on all outstanding I/O
> > on that SQ to complete, but waiting for all I/O should be good as an
> > initial try.
> > With that we wouldn't need to fiddle with driver internals, and could
> > make it pretty generic.
> 
> I don't fully understand this idea - specifically, at which layer would 
> we be waiting for all the IO to complete?

Whichever layer dispatched the IO to a CPU specific context should
be the one to wait for its completion. That should be blk-mq for most
block drivers.
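
Roughly something like the below, as a sketch only (blk_mq_wait_for_cpu()
doesn't exist today; since there is no per-hctx drain helper it simply
freezes the whole queue, i.e. the "waiting for all I/O" initial try):

#include <linux/blkdev.h>
#include <linux/blk-mq.h>
#include <linux/cpumask.h>

static void blk_mq_wait_for_cpu(struct request_queue *q, unsigned int cpu)
{
	struct blk_mq_hw_ctx *hctx;
	unsigned int i;
	bool affected = false;

	queue_for_each_hw_ctx(q, hctx, i)
		if (cpumask_test_cpu(cpu, hctx->cpumask))
			affected = true;

	if (!affected)
		return;

	/* waits for every outstanding request on this queue to complete */
	blk_mq_freeze_queue(q);

	/*
	 * A real implementation would keep the queue frozen (or remap the
	 * hctx) until the CPU is gone, otherwise new requests can still be
	 * dispatched from the outgoing CPU's context after this point.
	 */
	blk_mq_unfreeze_queue(q);
}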


* Re: Question on handling managed IRQs when hotplugging CPUs
  2019-02-05 14:52                     ` Keith Busch
@ 2019-02-05 15:09                       ` John Garry
  2019-02-05 15:11                         ` Keith Busch
                                           ` (2 more replies)
  2019-02-05 15:10                       ` Hannes Reinecke
  1 sibling, 3 replies; 15+ messages in thread
From: John Garry @ 2019-02-05 15:09 UTC (permalink / raw)
  To: Keith Busch
  Cc: Hannes Reinecke, Thomas Gleixner, Christoph Hellwig,
	Marc Zyngier, axboe, Peter Zijlstra, Michael Ellerman, Linuxarm,
	linux-kernel, Hannes Reinecke, linux-scsi, linux-block

On 05/02/2019 14:52, Keith Busch wrote:
> On Tue, Feb 05, 2019 at 05:24:11AM -0800, John Garry wrote:
>> On 04/02/2019 07:12, Hannes Reinecke wrote:
>>
>> Hi Hannes,
>>
>>>
>>> So, as the user then has to wait for the system to declare 'ready for
>>> CPU remove', why can't we just disable the SQ and wait for all I/O to
>>> complete?
>>> We can make it more fine-grained by just waiting on all outstanding I/O
>>> on that SQ to complete, but waiting for all I/O should be good as an
>>> initial try.
>>> With that we wouldn't need to fiddle with driver internals, and could
>>> make it pretty generic.
>>
>> I don't fully understand this idea - specifically, at which layer would
>> we be waiting for all the IO to complete?
>
> Whichever layer dispatched the IO to a CPU specific context should
> be the one to wait for its completion. That should be blk-mq for most
> block drivers.

For SCSI devices, unfortunately not all IO sent to the HW originates 
from blk-mq or any other single entity.

Thanks,
John

>
> .
>




* Re: Question on handling managed IRQs when hotplugging CPUs
  2019-02-05 14:52                     ` Keith Busch
  2019-02-05 15:09                       ` John Garry
@ 2019-02-05 15:10                       ` Hannes Reinecke
  2019-02-05 15:16                         ` Keith Busch
  1 sibling, 1 reply; 15+ messages in thread
From: Hannes Reinecke @ 2019-02-05 15:10 UTC (permalink / raw)
  To: Keith Busch, John Garry
  Cc: Hannes Reinecke, Thomas Gleixner, Christoph Hellwig,
	Marc Zyngier, axboe, Peter Zijlstra, Michael Ellerman, Linuxarm,
	linux-kernel, linux-scsi, linux-block

On 2/5/19 3:52 PM, Keith Busch wrote:
> On Tue, Feb 05, 2019 at 05:24:11AM -0800, John Garry wrote:
>> On 04/02/2019 07:12, Hannes Reinecke wrote:
>>
>> Hi Hannes,
>>
>>>
>>> So, as the user then has to wait for the system to declare 'ready for
>>> CPU remove', why can't we just disable the SQ and wait for all I/O to
>>> complete?
>>> We can make it more fine-grained by just waiting on all outstanding I/O
>>> on that SQ to complete, but waiting for all I/O should be good as an
>>> initial try.
>>> With that we wouldn't need to fiddle with driver internals, and could
>>> make it pretty generic.
>>
>> I don't fully understand this idea - specifically, at which layer would
>> we be waiting for all the IO to complete?
> 
> Whichever layer dispatched the IO to a CPU specific context should
> be the one to wait for its completion. That should be blk-mq for most
> block drivers.
> 
Indeed.
But we don't provide any mechanisms for that ATM, right?

Maybe this would be a topic fit for LSF/MM?

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		               zSeries & Storage
hare@suse.com			               +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)


* Re: Question on handling managed IRQs when hotplugging CPUs
  2019-02-05 15:09                       ` John Garry
@ 2019-02-05 15:11                         ` Keith Busch
  2019-02-05 15:15                         ` Hannes Reinecke
  2019-02-05 18:23                         ` Christoph Hellwig
  2 siblings, 0 replies; 15+ messages in thread
From: Keith Busch @ 2019-02-05 15:11 UTC (permalink / raw)
  To: John Garry
  Cc: Hannes Reinecke, Thomas Gleixner, Christoph Hellwig,
	Marc Zyngier, axboe, Peter Zijlstra, Michael Ellerman, Linuxarm,
	linux-kernel, Hannes Reinecke, linux-scsi, linux-block

On Tue, Feb 05, 2019 at 03:09:28PM +0000, John Garry wrote:
> On 05/02/2019 14:52, Keith Busch wrote:
> > On Tue, Feb 05, 2019 at 05:24:11AM -0800, John Garry wrote:
> > > On 04/02/2019 07:12, Hannes Reinecke wrote:
> > > 
> > > Hi Hannes,
> > > 
> > > > 
> > > > So, as the user then has to wait for the system to declare 'ready for
> > > > CPU remove', why can't we just disable the SQ and wait for all I/O to
> > > > complete?
> > > > We can make it more fine-grained by just waiting on all outstanding I/O
> > > > on that SQ to complete, but waiting for all I/O should be good as an
> > > > initial try.
> > > > With that we wouldn't need to fiddle with driver internals, and could
> > > > make it pretty generic.
> > > 
> > > I don't fully understand this idea - specifically, at which layer would
> > > we be waiting for all the IO to complete?
> > 
> > Whichever layer dispatched the IO to a CPU specific context should
> > be the one to wait for its completion. That should be blk-mq for most
> > block drivers.
> 
> For SCSI devices, unfortunately not all IO sent to the HW originates from
> blk-mq or any other single entity.

Then they'll need to register their own CPU notifiers and handle the
ones they dispatched.
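
Something along these lines (foo_* names are made up; the callbacks only
mark where the driver-private draining would go):

#include <linux/cpuhotplug.h>

static enum cpuhp_state foo_cpuhp_state;

static int foo_cpu_online(unsigned int cpu)
{
	return 0;
}

static int foo_cpu_offline(unsigned int cpu)
{
	/*
	 * Drain the commands the driver issued itself (TMFs, SMP, vendor
	 * commands) on a context tied to @cpu; blk-mq knows nothing about
	 * these.  The intent is to run before the managed vectors for
	 * @cpu are shut down later in the offline sequence.
	 */
	return 0;
}

static int foo_register_cpuhp(void)
{
	int ret;

	ret = cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "scsi/foo:online",
				foo_cpu_online, foo_cpu_offline);
	if (ret < 0)
		return ret;

	foo_cpuhp_state = ret;	/* dynamic state id, needed for teardown */
	return 0;
}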


* Re: Question on handling managed IRQs when hotplugging CPUs
  2019-02-05 15:09                       ` John Garry
  2019-02-05 15:11                         ` Keith Busch
@ 2019-02-05 15:15                         ` Hannes Reinecke
  2019-02-05 15:27                           ` John Garry
  2019-02-05 18:23                         ` Christoph Hellwig
  2 siblings, 1 reply; 15+ messages in thread
From: Hannes Reinecke @ 2019-02-05 15:15 UTC (permalink / raw)
  To: John Garry, Keith Busch
  Cc: Thomas Gleixner, Christoph Hellwig, Marc Zyngier, axboe,
	Peter Zijlstra, Michael Ellerman, Linuxarm, linux-kernel,
	Hannes Reinecke, linux-scsi, linux-block

On 2/5/19 4:09 PM, John Garry wrote:
> On 05/02/2019 14:52, Keith Busch wrote:
>> On Tue, Feb 05, 2019 at 05:24:11AM -0800, John Garry wrote:
>>> On 04/02/2019 07:12, Hannes Reinecke wrote:
>>>
>>> Hi Hannes,
>>>
>>>>
>>>> So, as the user then has to wait for the system to declare 'ready for
>>>> CPU remove', why can't we just disable the SQ and wait for all I/O to
>>>> complete?
>>>> We can make it more fine-grained by just waiting on all outstanding I/O
>>>> on that SQ to complete, but waiting for all I/O should be good as an
>>>> initial try.
>>>> With that we wouldn't need to fiddle with driver internals, and could
>>>> make it pretty generic.
>>>
>>> I don't fully understand this idea - specifically, at which layer would
>>> we be waiting for all the IO to complete?
>>
>> Whichever layer dispatched the IO to a CPU specific context should
>> be the one to wait for its completion. That should be blk-mq for most
>> block drivers.
> 
> For SCSI devices, unfortunately not all IO sent to the HW originates 
> from blk-mq or any other single entity.
> 
No, not as such.
But each IO sent to the HW requires a unique identification (ie a valid 
tag). And as the tagspace is managed by block-mq (minus management 
commands, but I'm working on that currently) we can easily figure out if 
the device is busy by checking for an empty tag map.

Should be doable for most modern HBAs.
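
For the commands that do go through blk-mq, scsi_host_busy() already
reports how many commands the host currently owns, so a first stab at the
"wait for an empty tag map" step could be as simple as:

#include <scsi/scsi_host.h>
#include <linux/delay.h>

/* sketch: block until the host has no commands in flight */
static void foo_wait_until_idle(struct Scsi_Host *shost)
{
	while (scsi_host_busy(shost))
		msleep(20);
}

A real version would obviously want a timeout and would have to block new
submissions while it waits.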

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		   Teamlead Storage & Networking
hare@suse.de			               +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)


* Re: Question on handling managed IRQs when hotplugging CPUs
  2019-02-05 15:10                       ` Hannes Reinecke
@ 2019-02-05 15:16                         ` Keith Busch
  0 siblings, 0 replies; 15+ messages in thread
From: Keith Busch @ 2019-02-05 15:16 UTC (permalink / raw)
  To: Hannes Reinecke
  Cc: John Garry, Hannes Reinecke, Thomas Gleixner, Christoph Hellwig,
	Marc Zyngier, axboe, Peter Zijlstra, Michael Ellerman, Linuxarm,
	linux-kernel, linux-scsi, linux-block

On Tue, Feb 05, 2019 at 04:10:47PM +0100, Hannes Reinecke wrote:
> On 2/5/19 3:52 PM, Keith Busch wrote:
> > Whichever layer dispatched the IO to a CPU specific context should
> > be the one to wait for its completion. That should be blk-mq for most
> > block drivers.
> > 
> Indeed.
> But we don't provide any mechanisms for that ATM, right?
> 
> Maybe this would be a topic fit for LSF/MM?

Right, there's nothing handling this now, and it sounds like it'd be a good
discussion to bring to the storage track.


* Re: Question on handling managed IRQs when hotplugging CPUs
  2019-02-05 15:15                         ` Hannes Reinecke
@ 2019-02-05 15:27                           ` John Garry
  0 siblings, 0 replies; 15+ messages in thread
From: John Garry @ 2019-02-05 15:27 UTC (permalink / raw)
  To: Hannes Reinecke, Keith Busch
  Cc: Thomas Gleixner, Christoph Hellwig, Marc Zyngier, axboe,
	Peter Zijlstra, Michael Ellerman, Linuxarm, linux-kernel,
	Hannes Reinecke, linux-scsi, linux-block

On 05/02/2019 15:15, Hannes Reinecke wrote:
> On 2/5/19 4:09 PM, John Garry wrote:
>> On 05/02/2019 14:52, Keith Busch wrote:
>>> On Tue, Feb 05, 2019 at 05:24:11AM -0800, John Garry wrote:
>>>> On 04/02/2019 07:12, Hannes Reinecke wrote:
>>>>
>>>> Hi Hannes,
>>>>
>>>>>
>>>>> So, as the user then has to wait for the system to declare 'ready for
>>>>> CPU remove', why can't we just disable the SQ and wait for all I/O to
>>>>> complete?
>>>>> We can make it more fine-grained by just waiting on all outstanding
>>>>> I/O
>>>>> on that SQ to complete, but waiting for all I/O should be good as an
>>>>> initial try.
>>>>> With that we wouldn't need to fiddle with driver internals, and could
>>>>> make it pretty generic.
>>>>
>>>> I don't fully understand this idea - specifically, at which layer would
>>>> we be waiting for all the IO to complete?
>>>
>>> Whichever layer dispatched the IO to a CPU specific context should
>>> be the one to wait for its completion. That should be blk-mq for most
>>> block drivers.
>>
>> For SCSI devices, unfortunately not all IO sent to the HW originates
>> from blk-mq or any other single entity.
>>
> No, not as such.
> But each IO sent to the HW requires a unique identification (ie a valid
> tag). And as the tagspace is managed by block-mq (minus management
> commands, but I'm working on that currently) we can easily figure out if
> the device is busy by checking for an empty tag map.

That sounds like a reasonable starting solution.

Thanks,
John

>
> Should be doable for most modern HBAs.
>
> Cheers,
>
> Hannes




* Re: Question on handling managed IRQs when hotplugging CPUs
  2019-02-05 15:09                       ` John Garry
  2019-02-05 15:11                         ` Keith Busch
  2019-02-05 15:15                         ` Hannes Reinecke
@ 2019-02-05 18:23                         ` Christoph Hellwig
  2019-02-06  9:21                           ` John Garry
  2 siblings, 1 reply; 15+ messages in thread
From: Christoph Hellwig @ 2019-02-05 18:23 UTC (permalink / raw)
  To: John Garry
  Cc: Keith Busch, Hannes Reinecke, Thomas Gleixner, Christoph Hellwig,
	Marc Zyngier, axboe, Peter Zijlstra, Michael Ellerman, Linuxarm,
	linux-kernel, Hannes Reinecke, linux-scsi, linux-block

On Tue, Feb 05, 2019 at 03:09:28PM +0000, John Garry wrote:
> For SCSI devices, unfortunately not all IO sent to the HW originates from 
> blk-mq or any other single entity.

Where else would SCSI I/O originate from?


* Re: Question on handling managed IRQs when hotplugging CPUs
  2019-02-05 18:23                         ` Christoph Hellwig
@ 2019-02-06  9:21                           ` John Garry
  2019-02-06 13:34                             ` Benjamin Block
  0 siblings, 1 reply; 15+ messages in thread
From: John Garry @ 2019-02-06  9:21 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Keith Busch, Hannes Reinecke, Thomas Gleixner, Marc Zyngier,
	axboe, Peter Zijlstra, Michael Ellerman, Linuxarm, linux-kernel,
	Hannes Reinecke, linux-scsi, linux-block

On 05/02/2019 18:23, Christoph Hellwig wrote:
> On Tue, Feb 05, 2019 at 03:09:28PM +0000, John Garry wrote:
>> For SCSI devices, unfortunately not all IO sent to the HW originates from
>> blk-mq or any other single entity.
>
> Where else would SCSI I/O originate from?

Please note that I was referring to other management IO, like SAS SMP, 
TMFs, and other proprietary commands which the driver may generate for 
the HBA - https://marc.info/?l=linux-scsi&m=154831889001973&w=2 
discusses some of them also.
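
To show where some of those enter an LLDD (skeleton only, foo_* made up):
the TMFs come in through the SCSI EH callbacks below, and the commands the
driver then sends to the HBA are built and tracked by the driver itself,
outside the blk-mq tag space:

#include <scsi/scsi.h>
#include <scsi/scsi_cmnd.h>
#include <scsi/scsi_host.h>

static int foo_eh_abort(struct scsi_cmnd *scmd)
{
	/* driver sends ABORT TASK on an internal queue of its choosing */
	return SUCCESS;
}

static int foo_eh_device_reset(struct scsi_cmnd *scmd)
{
	/* likewise for LOGICAL UNIT RESET */
	return SUCCESS;
}

static struct scsi_host_template foo_sht = {
	.name				= "foo",
	.eh_abort_handler		= foo_eh_abort,
	.eh_device_reset_handler	= foo_eh_device_reset,
};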

Thanks,
John

>
> .
>




* Re: Question on handling managed IRQs when hotplugging CPUs
  2019-02-06  9:21                           ` John Garry
@ 2019-02-06 13:34                             ` Benjamin Block
  0 siblings, 0 replies; 15+ messages in thread
From: Benjamin Block @ 2019-02-06 13:34 UTC (permalink / raw)
  To: John Garry
  Cc: Christoph Hellwig, Keith Busch, Hannes Reinecke, Thomas Gleixner,
	Marc Zyngier, axboe, Peter Zijlstra, Michael Ellerman, Linuxarm,
	linux-kernel, Hannes Reinecke, linux-scsi, linux-block

On Wed, Feb 06, 2019 at 09:21:40AM +0000, John Garry wrote:
> On 05/02/2019 18:23, Christoph Hellwig wrote:
> > On Tue, Feb 05, 2019 at 03:09:28PM +0000, John Garry wrote:
> > > For SCSI devices, unfortunately not all IO sent to the HW originates from
> > > blk-mq or any other single entity.
> > 
> > Where else would SCSI I/O originate from?
> 
> Please note that I was referring to other management IO, like SAS SMP, TMFs,
> and other proprietary commands which the driver may generate for the HBA -
> https://marc.info/?l=linux-scsi&m=154831889001973&w=2 discusses some of them
> also.
> 

Especially the TMFs sent via SCSI EH are a bit of a pain, I guess,
because they are entirely managed by the device drivers, but depending
on the device driver they might not even qualify for the problem Hannes
is seeing.

-- 
With Best Regards, Benjamin Block      /      Linux on IBM Z Kernel Development
IBM Systems & Technology Group   /  IBM Deutschland Research & Development GmbH
Vorsitz. AufsR.: Matthias Hartmann       /      Geschäftsführung: Dirk Wittkopp
Sitz der Gesellschaft: Böblingen / Registergericht: AmtsG Stuttgart, HRB 243294


