* Question on handling managed IRQs when hotplugging CPUs @ 2019-01-29 11:25 John Garry 2019-01-29 11:54 ` Hannes Reinecke 2019-01-29 15:44 ` Keith Busch 0 siblings, 2 replies; 26+ messages in thread From: John Garry @ 2019-01-29 11:25 UTC (permalink / raw) To: tglx, Christoph Hellwig Cc: Marc Zyngier, axboe, Keith Busch, Peter Zijlstra, Michael Ellerman, Linuxarm, linux-kernel, Hannes Reinecke Hi, I have a question on $subject which I hope you can shed some light on. According to commit c5cb83bb337c25 ("genirq/cpuhotplug: Handle managed IRQs on CPU hotplug"), if we offline the last CPU in a managed IRQ affinity mask, the IRQ is shut down. The reasoning is that this IRQ is thought to be associated with a specific queue on an MQ device, and the CPUs in the IRQ affinity mask are the same CPUs associated with the queue. So, if no CPU is using the queue, then there is no need for the IRQ. However, how does this handle the scenario of the last CPU in the IRQ affinity mask being offlined while IO associated with the queue is still in flight? Or if we make the decision to use the queue associated with the current CPU, and then that CPU (being the last CPU online in the queue's IRQ affinity mask) goes offline and we finish the delivery with another CPU? In these cases, when the IO completes, it would not be serviced and would time out. I have actually tried this on my arm64 system and I see IO timeouts. Thanks in advance, John ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: Question on handling managed IRQs when hotplugging CPUs 2019-01-29 11:25 Question on handling managed IRQs when hotplugging CPUs John Garry @ 2019-01-29 11:54 ` Hannes Reinecke 2019-01-29 12:01 ` Thomas Gleixner 2019-01-29 15:44 ` Keith Busch 1 sibling, 1 reply; 26+ messages in thread From: Hannes Reinecke @ 2019-01-29 11:54 UTC (permalink / raw) To: John Garry, tglx, Christoph Hellwig Cc: Marc Zyngier, axboe, Keith Busch, Peter Zijlstra, Michael Ellerman, Linuxarm, linux-kernel, SCSI Mailing List On 1/29/19 12:25 PM, John Garry wrote: > Hi, > > I have a question on $subject which I hope you can shed some light on. > > According to commit c5cb83bb337c25 ("genirq/cpuhotplug: Handle managed > IRQs on CPU hotplug"), if we offline the last CPU in a managed IRQ > affinity mask, the IRQ is shut down. > > The reasoning is that this IRQ is thought to be associated with a > specific queue on an MQ device, and the CPUs in the IRQ affinity mask are > the same CPUs associated with the queue. So, if no CPU is using the > queue, then there is no need for the IRQ. > > However, how does this handle the scenario of the last CPU in the IRQ affinity mask > being offlined while IO associated with the queue is still in flight? > > Or if we make the decision to use the queue associated with the current CPU, > and then that CPU (being the last CPU online in the queue's IRQ > affinity mask) goes offline and we finish the delivery with another CPU? > > In these cases, when the IO completes, it would not be serviced and > would time out. > > I have actually tried this on my arm64 system and I see IO timeouts. > That actually is a very good question, and I have been wondering about this for quite some time. I find it a bit hard to envision a scenario where the IRQ affinity is automatically (and, more importantly, atomically!) re-routed to one of the other CPUs. And even if it were, chances are that there are checks in the driver _preventing_ them from handling those requests, seeing that they should have been handled by another CPU ... I guess the safest bet is to implement a 'cleanup' worker queue which is responsible for looking through all the outstanding commands (on all hardware queues), and then complete those for which no corresponding CPU / irqhandler can be found. But I defer to the higher authorities here; maybe I'm totally wrong and it's already been taken care of. But if there is no generic mechanism this really is a fit topic for LSF/MM, as most other drivers would be affected, too. Cheers, Hannes -- Dr. Hannes Reinecke zSeries & Storage hare@suse.com +49 911 74053 688 SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg GF: F. Imendörffer, J. Smithard, D. Upmanyu, G. Norton HRB 21284 (AG Nürnberg)
* Re: Question on handling managed IRQs when hotplugging CPUs 2019-01-29 11:54 ` Hannes Reinecke @ 2019-01-29 12:01 ` Thomas Gleixner 2019-01-29 15:27 ` John Garry 0 siblings, 1 reply; 26+ messages in thread From: Thomas Gleixner @ 2019-01-29 12:01 UTC (permalink / raw) To: Hannes Reinecke Cc: John Garry, Christoph Hellwig, Marc Zyngier, axboe, Keith Busch, Peter Zijlstra, Michael Ellerman, Linuxarm, linux-kernel, SCSI Mailing List On Tue, 29 Jan 2019, Hannes Reinecke wrote: > That actually is a very good question, and I have been wondering about this > for quite some time. > > I find it a bit hard to envision a scenario where the IRQ affinity is > automatically (and, more importantly, atomically!) re-routed to one of the > other CPUs. > And even if it were, chances are that there are checks in the driver > _preventing_ them from handling those requests, seeing that they should have > been handled by another CPU ... > > I guess the safest bet is to implement a 'cleanup' worker queue which is > responsible for looking through all the outstanding commands (on all hardware > queues), and then complete those for which no corresponding CPU / irqhandler > can be found. > > But I defer to the higher authorities here; maybe I'm totally wrong and it's > already been taken care of. TBH, I don't know. I merely was involved in the genirq side of this. But yes, in order to make this work correctly the basic contract for the CPU hotplug case must be: If the last CPU which is associated to a queue (and the corresponding interrupt) goes offline, then the subsystem/driver code has to make sure that: 1) No more requests can be queued on that queue 2) All outstanding requests of that queue have been completed or redirected (don't know if that's possible at all) to some other queue. That has to be done in that order obviously. Whether any of the subsystems/drivers actually implements this, I can't tell. Thanks, tglx
* Re: Question on handling managed IRQs when hotplugging CPUs 2019-01-29 12:01 ` Thomas Gleixner @ 2019-01-29 15:27 ` John Garry 2019-01-29 16:27 ` Thomas Gleixner 0 siblings, 1 reply; 26+ messages in thread From: John Garry @ 2019-01-29 15:27 UTC (permalink / raw) To: Thomas Gleixner, Hannes Reinecke Cc: Christoph Hellwig, Marc Zyngier, axboe, Keith Busch, Peter Zijlstra, Michael Ellerman, Linuxarm, linux-kernel, SCSI Mailing List Hi Hannes, Thomas, On 29/01/2019 12:01, Thomas Gleixner wrote: > On Tue, 29 Jan 2019, Hannes Reinecke wrote: >> That actually is a very good question, and I have been wondering about this >> for quite some time. >> >> I find it a bit hard to envision a scenario where the IRQ affinity is >> automatically (and, more importantly, atomically!) re-routed to one of the >> other CPUs. Isn't this what happens today for non-managed IRQs? >> And even if it were, chances are that there are checks in the driver >> _preventing_ them from handling those requests, seeing that they should have >> been handled by another CPU ... Really? I would not think that it matters which CPU we service the interrupt on. >> >> I guess the safest bet is to implement a 'cleanup' worker queue which is >> responsible for looking through all the outstanding commands (on all hardware >> queues), and then complete those for which no corresponding CPU / irqhandler >> can be found. >> >> But I defer to the higher authorities here; maybe I'm totally wrong and it's >> already been taken care of. > > TBH, I don't know. I merely was involved in the genirq side of this. But > yes, in order to make this work correctly the basic contract for the CPU > hotplug case must be: > > If the last CPU which is associated to a queue (and the corresponding > interrupt) goes offline, then the subsystem/driver code has to make sure > that: > > 1) No more requests can be queued on that queue > > 2) All outstanding requests of that queue have been completed or redirected > (don't know if that's possible at all) to some other queue. This may not be possible. For the HW I deal with, we have symmetrical delivery and completion queues, and a command delivered on DQx will always complete on CQx. Each completion queue has a dedicated IRQ. > > That has to be done in that order obviously. Whether any of the > subsystems/drivers actually implements this, I can't tell. Going back to c5cb83bb337c25, it seems to me that the change was made with the idea that we can maintain the affinity for the IRQ as we're shutting it down, as no interrupts should occur. However I don't see why we can't instead keep the IRQ up and set the affinity to all online CPUs in the offline path, and restore the original affinity in the online path. The reason we set the queue affinity to specific CPUs is for performance, but I would not say that this matters for handling residual IRQs. Thanks, John > > Thanks, > > tglx > > . >
* Re: Question on handling managed IRQs when hotplugging CPUs 2019-01-29 15:27 ` John Garry @ 2019-01-29 16:27 ` Thomas Gleixner 2019-01-29 17:23 ` John Garry 0 siblings, 1 reply; 26+ messages in thread From: Thomas Gleixner @ 2019-01-29 16:27 UTC (permalink / raw) To: John Garry Cc: Hannes Reinecke, Christoph Hellwig, Marc Zyngier, axboe, Keith Busch, Peter Zijlstra, Michael Ellerman, Linuxarm, linux-kernel, SCSI Mailing List On Tue, 29 Jan 2019, John Garry wrote: > On 29/01/2019 12:01, Thomas Gleixner wrote: > > If the last CPU which is associated to a queue (and the corresponding > > interrupt) goes offline, then the subsystem/driver code has to make sure > > that: > > > > 1) No more requests can be queued on that queue > > > > 2) All outstanding requests of that queue have been completed or redirected > > (don't know if that's possible at all) to some other queue. > > This may not be possible. For the HW I deal with, we have symmetrical delivery > and completion queues, and a command delivered on DQx will always complete on > CQx. Each completion queue has a dedicated IRQ. So you can stop queueing on DQx and wait for all outstanding ones to come in on CQx, right? > > That has to be done in that order obviously. Whether any of the > > subsystems/drivers actually implements this, I can't tell. > > Going back to c5cb83bb337c25, it seems to me that the change was made with the > idea that we can maintain the affinity for the IRQ as we're shutting it down, > as no interrupts should occur. > > However I don't see why we can't instead keep the IRQ up and set the affinity > to all online CPUs in the offline path, and restore the original affinity in > the online path. The reason we set the queue affinity to specific CPUs is for > performance, but I would not say that this matters for handling residual IRQs. Oh yes it does. The problem is especially on x86, that if you have a large number of queues and you take a large number of CPUs offline, then you run into vector space exhaustion on the remaining online CPUs. In the worst case a single CPU on x86 has only 186 vectors available for device interrupts. So just take a quad socket machine with 144 CPUs and two multiqueue devices with a queue per cpu. ---> FAIL It probably fails already with one device because there are lots of other devices which have regular interrupts which cannot be shut down. Thanks, tglx
* Re: Question on handling managed IRQs when hotplugging CPUs 2019-01-29 16:27 ` Thomas Gleixner @ 2019-01-29 17:23 ` John Garry 0 siblings, 0 replies; 26+ messages in thread From: John Garry @ 2019-01-29 17:23 UTC (permalink / raw) To: Thomas Gleixner Cc: Hannes Reinecke, Christoph Hellwig, Marc Zyngier, axboe, Keith Busch, Peter Zijlstra, Michael Ellerman, Linuxarm, linux-kernel, SCSI Mailing List On 29/01/2019 16:27, Thomas Gleixner wrote: > On Tue, 29 Jan 2019, John Garry wrote: >> On 29/01/2019 12:01, Thomas Gleixner wrote: >>> If the last CPU which is associated to a queue (and the corresponding >>> interrupt) goes offline, then the subsystem/driver code has to make sure >>> that: >>> >>> 1) No more requests can be queued on that queue >>> >>> 2) All outstanding requests of that queue have been completed or redirected >>> (don't know if that's possible at all) to some other queue. >> >> This may not be possible. For the HW I deal with, we have symmetrical delivery >> and completion queues, and a command delivered on DQx will always complete on >> CQx. Each completion queue has a dedicated IRQ. > > So you can stop queueing on DQx and wait for all outstanding ones to come > in on CQx, right? Right, and this sounds like what Keith Busch mentioned in his reply. > >>> That has to be done in that order obviously. Whether any of the >>> subsystems/drivers actually implements this, I can't tell. >> >> Going back to c5cb83bb337c25, it seems to me that the change was made with the >> idea that we can maintain the affinity for the IRQ as we're shutting it down, >> as no interrupts should occur. >> >> However I don't see why we can't instead keep the IRQ up and set the affinity >> to all online CPUs in the offline path, and restore the original affinity in >> the online path. The reason we set the queue affinity to specific CPUs is for >> performance, but I would not say that this matters for handling residual IRQs. > > Oh yes it does. The problem is especially on x86, that if you have a large > number of queues and you take a large number of CPUs offline, then you run > into vector space exhaustion on the remaining online CPUs. > > In the worst case a single CPU on x86 has only 186 vectors available for > device interrupts. So just take a quad socket machine with 144 CPUs and two > multiqueue devices with a queue per cpu. ---> FAIL > > It probably fails already with one device because there are lots of other > devices which have regular interrupts which cannot be shut down. OK, understood. Thanks, John > > Thanks, > > tglx > > > . >
* Re: Question on handling managed IRQs when hotplugging CPUs 2019-01-29 11:25 Question on handling managed IRQs when hotplugging CPUs John Garry 2019-01-29 11:54 ` Hannes Reinecke @ 2019-01-29 15:44 ` Keith Busch 2019-01-29 17:12 ` John Garry 1 sibling, 1 reply; 26+ messages in thread From: Keith Busch @ 2019-01-29 15:44 UTC (permalink / raw) To: John Garry Cc: tglx, Christoph Hellwig, Marc Zyngier, axboe, Peter Zijlstra, Michael Ellerman, Linuxarm, linux-kernel, Hannes Reinecke On Tue, Jan 29, 2019 at 03:25:48AM -0800, John Garry wrote: > Hi, > > I have a question on $subject which I hope you can shed some light on. > > According to commit c5cb83bb337c25 ("genirq/cpuhotplug: Handle managed > IRQs on CPU hotplug"), if we offline the last CPU in a managed IRQ > affinity mask, the IRQ is shut down. > > The reasoning is that this IRQ is thought to be associated with a > specific queue on an MQ device, and the CPUs in the IRQ affinity mask are > the same CPUs associated with the queue. So, if no CPU is using the > queue, then there is no need for the IRQ. > > However, how does this handle the scenario of the last CPU in the IRQ affinity mask > being offlined while IO associated with the queue is still in flight? > > Or if we make the decision to use the queue associated with the current CPU, > and then that CPU (being the last CPU online in the queue's IRQ > affinity mask) goes offline and we finish the delivery with another CPU? > > In these cases, when the IO completes, it would not be serviced and would time out. > > I have actually tried this on my arm64 system and I see IO timeouts. Hm, we used to freeze the queues with the CPUHP_BLK_MQ_PREPARE callback, which would reap all outstanding commands before the CPU and IRQ are taken offline. That was removed with commit 4b855ad37194f ("blk-mq: Create hctx for each present CPU"). It sounds like we should bring something like that back, but make it more fine-grained to the per-cpu context.
* Re: Question on handling managed IRQs when hotplugging CPUs 2019-01-29 15:44 ` Keith Busch @ 2019-01-29 17:12 ` John Garry 2019-01-29 17:20 ` Keith Busch 0 siblings, 1 reply; 26+ messages in thread From: John Garry @ 2019-01-29 17:12 UTC (permalink / raw) To: Keith Busch Cc: tglx, Christoph Hellwig, Marc Zyngier, axboe, Peter Zijlstra, Michael Ellerman, Linuxarm, linux-kernel, Hannes Reinecke On 29/01/2019 15:44, Keith Busch wrote: > On Tue, Jan 29, 2019 at 03:25:48AM -0800, John Garry wrote: >> Hi, >> >> I have a question on $subject which I hope you can shed some light on. >> >> According to commit c5cb83bb337c25 ("genirq/cpuhotplug: Handle managed >> IRQs on CPU hotplug"), if we offline the last CPU in a managed IRQ >> affinity mask, the IRQ is shut down. >> >> The reasoning is that this IRQ is thought to be associated with a >> specific queue on an MQ device, and the CPUs in the IRQ affinity mask are >> the same CPUs associated with the queue. So, if no CPU is using the >> queue, then there is no need for the IRQ. >> >> However, how does this handle the scenario of the last CPU in the IRQ affinity mask >> being offlined while IO associated with the queue is still in flight? >> >> Or if we make the decision to use the queue associated with the current CPU, >> and then that CPU (being the last CPU online in the queue's IRQ >> affinity mask) goes offline and we finish the delivery with another CPU? >> >> In these cases, when the IO completes, it would not be serviced and would time out. >> >> I have actually tried this on my arm64 system and I see IO timeouts. > > Hm, we used to freeze the queues with the CPUHP_BLK_MQ_PREPARE callback, > which would reap all outstanding commands before the CPU and IRQ are > taken offline. That was removed with commit 4b855ad37194f ("blk-mq: > Create hctx for each present CPU"). It sounds like we should bring > something like that back, but make it more fine-grained to the per-cpu context. > Seems reasonable. But we would need it to deal with drivers where they only expose a single queue to BLK MQ, but use many queues internally. I think megaraid sas does this, for example. I would also be slightly concerned with commands being issued from the driver unknown to blk mq, like SCSI TMF. Thanks, John > . >
* Re: Question on handling managed IRQs when hotplugging CPUs 2019-01-29 17:12 ` John Garry @ 2019-01-29 17:20 ` Keith Busch 2019-01-30 10:38 ` John Garry 0 siblings, 1 reply; 26+ messages in thread From: Keith Busch @ 2019-01-29 17:20 UTC (permalink / raw) To: John Garry Cc: tglx, Christoph Hellwig, Marc Zyngier, axboe, Peter Zijlstra, Michael Ellerman, Linuxarm, linux-kernel, Hannes Reinecke On Tue, Jan 29, 2019 at 05:12:40PM +0000, John Garry wrote: > On 29/01/2019 15:44, Keith Busch wrote: > > > > Hm, we used to freeze the queues with the CPUHP_BLK_MQ_PREPARE callback, > > which would reap all outstanding commands before the CPU and IRQ are > > taken offline. That was removed with commit 4b855ad37194f ("blk-mq: > > Create hctx for each present CPU"). It sounds like we should bring > > something like that back, but make it more fine-grained to the per-cpu context. > > > > Seems reasonable. But we would need it to deal with drivers where they only > expose a single queue to BLK MQ, but use many queues internally. I think > megaraid sas does this, for example. > > I would also be slightly concerned with commands being issued from the > driver unknown to blk mq, like SCSI TMF. I don't think either of those descriptions sound like good candidates for using managed IRQ affinities.
* Re: Question on handling managed IRQs when hotplugging CPUs 2019-01-29 17:20 ` Keith Busch @ 2019-01-30 10:38 ` John Garry 2019-01-30 12:43 ` Thomas Gleixner 0 siblings, 1 reply; 26+ messages in thread From: John Garry @ 2019-01-30 10:38 UTC (permalink / raw) To: Keith Busch Cc: tglx, Christoph Hellwig, Marc Zyngier, axboe, Peter Zijlstra, Michael Ellerman, Linuxarm, linux-kernel, Hannes Reinecke On 29/01/2019 17:20, Keith Busch wrote: > On Tue, Jan 29, 2019 at 05:12:40PM +0000, John Garry wrote: >> On 29/01/2019 15:44, Keith Busch wrote: >>> >>> Hm, we used to freeze the queues with the CPUHP_BLK_MQ_PREPARE callback, >>> which would reap all outstanding commands before the CPU and IRQ are >>> taken offline. That was removed with commit 4b855ad37194f ("blk-mq: >>> Create hctx for each present CPU"). It sounds like we should bring >>> something like that back, but make it more fine-grained to the per-cpu context. >>> >> >> Seems reasonable. But we would need it to deal with drivers where they only >> expose a single queue to BLK MQ, but use many queues internally. I think >> megaraid sas does this, for example. >> >> I would also be slightly concerned with commands being issued from the >> driver unknown to blk mq, like SCSI TMF. > > I don't think either of those descriptions sound like good candidates > for using managed IRQ affinities. I wouldn't say that this behaviour is obvious to the developer. I can't see anything in Documentation/PCI/MSI-HOWTO.txt It also seems that this policy of relying on the upper layer to flush+freeze queues would cause issues if managed IRQs are used by drivers in other subsystems. Network controllers may have multiple queues and unsolicited interrupts. Thanks, John > > . >
* Re: Question on handling managed IRQs when hotplugging CPUs 2019-01-30 10:38 ` John Garry @ 2019-01-30 12:43 ` Thomas Gleixner 2019-01-31 17:48 ` John Garry 0 siblings, 1 reply; 26+ messages in thread From: Thomas Gleixner @ 2019-01-30 12:43 UTC (permalink / raw) To: John Garry Cc: Keith Busch, Christoph Hellwig, Marc Zyngier, axboe, Peter Zijlstra, Michael Ellerman, Linuxarm, linux-kernel, Hannes Reinecke On Wed, 30 Jan 2019, John Garry wrote: > On 29/01/2019 17:20, Keith Busch wrote: > > On Tue, Jan 29, 2019 at 05:12:40PM +0000, John Garry wrote: > > > On 29/01/2019 15:44, Keith Busch wrote: > > > > > > > > Hm, we used to freeze the queues with the CPUHP_BLK_MQ_PREPARE callback, > > > > which would reap all outstanding commands before the CPU and IRQ are > > > > taken offline. That was removed with commit 4b855ad37194f ("blk-mq: > > > > Create hctx for each present CPU"). It sounds like we should bring > > > > something like that back, but make it more fine-grained to the per-cpu > > > > context. > > > > > > > > > > Seems reasonable. But we would need it to deal with drivers where they > > > only > > > expose a single queue to BLK MQ, but use many queues internally. I think > > > megaraid sas does this, for example. > > > > > > I would also be slightly concerned with commands being issued from the > > > driver unknown to blk mq, like SCSI TMF. > > > > I don't think either of those descriptions sound like good candidates > > for using managed IRQ affinities. > > I wouldn't say that this behaviour is obvious to the developer. I can't see > anything in Documentation/PCI/MSI-HOWTO.txt > > It also seems that this policy of relying on the upper layer to flush+freeze queues > would cause issues if managed IRQs are used by drivers in other subsystems. > Network controllers may have multiple queues and unsolicited interrupts. It doesn't matter which part is managing flush/freeze of queues as long as something (either common subsystem code, upper layers or the driver itself) does it. So for the megaraid SAS example the BLK MQ layer obviously can't do anything because it only sees a single request queue. But the driver could, if the hardware supports it, tell the device to stop queueing completions on the completion queue which is associated with a particular CPU (or set of CPUs) during offline and then wait for the in-flight stuff to be finished. If the hardware does not allow that, then managed interrupts can't work for it. Thanks, tglx
* Re: Question on handling managed IRQs when hotplugging CPUs 2019-01-30 12:43 ` Thomas Gleixner @ 2019-01-31 17:48 ` John Garry 2019-02-01 15:56 ` Hannes Reinecke 0 siblings, 1 reply; 26+ messages in thread From: John Garry @ 2019-01-31 17:48 UTC (permalink / raw) To: Thomas Gleixner Cc: Keith Busch, Christoph Hellwig, Marc Zyngier, axboe, Peter Zijlstra, Michael Ellerman, Linuxarm, linux-kernel, Hannes Reinecke, linux-scsi, linux-block On 30/01/2019 12:43, Thomas Gleixner wrote: > On Wed, 30 Jan 2019, John Garry wrote: >> On 29/01/2019 17:20, Keith Busch wrote: >>> On Tue, Jan 29, 2019 at 05:12:40PM +0000, John Garry wrote: >>>> On 29/01/2019 15:44, Keith Busch wrote: >>>>> >>>>> Hm, we used to freeze the queues with the CPUHP_BLK_MQ_PREPARE callback, >>>>> which would reap all outstanding commands before the CPU and IRQ are >>>>> taken offline. That was removed with commit 4b855ad37194f ("blk-mq: >>>>> Create hctx for each present CPU"). It sounds like we should bring >>>>> something like that back, but make it more fine-grained to the per-cpu >>>>> context. >>>>> >>>> >>>> Seems reasonable. But we would need it to deal with drivers where they >>>> only >>>> expose a single queue to BLK MQ, but use many queues internally. I think >>>> megaraid sas does this, for example. >>>> >>>> I would also be slightly concerned with commands being issued from the >>>> driver unknown to blk mq, like SCSI TMF. >>> >>> I don't think either of those descriptions sound like good candidates >>> for using managed IRQ affinities. >> >> I wouldn't say that this behaviour is obvious to the developer. I can't see >> anything in Documentation/PCI/MSI-HOWTO.txt >> >> It also seems that this policy of relying on the upper layer to flush+freeze queues >> would cause issues if managed IRQs are used by drivers in other subsystems. >> Network controllers may have multiple queues and unsolicited interrupts. > > It doesn't matter which part is managing flush/freeze of queues as long > as something (either common subsystem code, upper layers or the driver > itself) does it. > > So for the megaraid SAS example the BLK MQ layer obviously can't do > anything because it only sees a single request queue. But the driver could, > if the hardware supports it, tell the device to stop queueing > completions on the completion queue which is associated with a particular > CPU (or set of CPUs) during offline and then wait for the in-flight stuff > to be finished. If the hardware does not allow that, then managed > interrupts can't work for it. > A rough audit of current SCSI drivers tells that these set PCI_IRQ_AFFINITY in some path but don't set Scsi_host.nr_hw_queues at all: aacraid, be2iscsi, csiostor, megaraid, mpt3sas I don't know specific driver details, like changing the completion queue. Thanks, John > Thanks, > > tglx > > . >
* Re: Question on handling managed IRQs when hotplugging CPUs 2019-01-31 17:48 ` John Garry @ 2019-02-01 15:56 ` Hannes Reinecke 2019-02-01 21:57 ` Thomas Gleixner 0 siblings, 1 reply; 26+ messages in thread From: Hannes Reinecke @ 2019-02-01 15:56 UTC (permalink / raw) To: John Garry, Thomas Gleixner Cc: Keith Busch, Christoph Hellwig, Marc Zyngier, axboe, Peter Zijlstra, Michael Ellerman, Linuxarm, linux-kernel, Hannes Reinecke, linux-scsi, linux-block On 1/31/19 6:48 PM, John Garry wrote: > On 30/01/2019 12:43, Thomas Gleixner wrote: >> On Wed, 30 Jan 2019, John Garry wrote: >>> On 29/01/2019 17:20, Keith Busch wrote: >>>> On Tue, Jan 29, 2019 at 05:12:40PM +0000, John Garry wrote: >>>>> On 29/01/2019 15:44, Keith Busch wrote: >>>>>> >>>>>> Hm, we used to freeze the queues with the CPUHP_BLK_MQ_PREPARE callback, >>>>>> which would reap all outstanding commands before the CPU and IRQ are >>>>>> taken offline. That was removed with commit 4b855ad37194f ("blk-mq: >>>>>> Create hctx for each present CPU"). It sounds like we should bring >>>>>> something like that back, but make it more fine-grained to the per-cpu >>>>>> context. >>>>>> >>>>> >>>>> Seems reasonable. But we would need it to deal with drivers where they >>>>> only >>>>> expose a single queue to BLK MQ, but use many queues internally. I >>>>> think >>>>> megaraid sas does this, for example. >>>>> >>>>> I would also be slightly concerned with commands being issued from the >>>>> driver unknown to blk mq, like SCSI TMF. >>>> >>>> I don't think either of those descriptions sound like good candidates >>>> for using managed IRQ affinities. >>> >>> I wouldn't say that this behaviour is obvious to the developer. I >>> can't see >>> anything in Documentation/PCI/MSI-HOWTO.txt >>> >>> It also seems that this policy of relying on the upper layer to flush+freeze >>> queues >>> would cause issues if managed IRQs are used by drivers in other >>> subsystems. >>> Network controllers may have multiple queues and unsolicited >>> interrupts. >> >> It doesn't matter which part is managing flush/freeze of queues as long >> as something (either common subsystem code, upper layers or the driver >> itself) does it. >> >> So for the megaraid SAS example the BLK MQ layer obviously can't do >> anything because it only sees a single request queue. But the driver >> could, >> if the hardware supports it, tell the device to stop queueing >> completions on the completion queue which is associated with a particular >> CPU (or set of CPUs) during offline and then wait for the in-flight stuff >> to be finished. If the hardware does not allow that, then managed >> interrupts can't work for it. >> > > A rough audit of current SCSI drivers tells that these set > PCI_IRQ_AFFINITY in some path but don't set Scsi_host.nr_hw_queues at all: > aacraid, be2iscsi, csiostor, megaraid, mpt3sas > Megaraid and mpt3sas don't have that functionality (or, at least, not that I'm aware). And in general I'm not sure if the above approach is feasible. Thing is, if we have _managed_ CPU hotplug (ie if the hardware provides some means of quiescing the CPU before hotplug) then the whole thing is trivial; disable SQ and wait for all outstanding commands to complete. Then trivially all requests are completed and the issue is resolved. Even with today's infrastructure. And I'm not sure if we can handle surprise CPU hotplug at all, given all the possible race conditions. But then I might be wrong. Cheers, Hannes -- Dr. Hannes Reinecke Teamlead Storage & Networking hare@suse.de +49 911 74053 688 SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton HRB 21284 (AG Nürnberg)
* Re: Question on handling managed IRQs when hotplugging CPUs 2019-02-01 15:56 ` Hannes Reinecke @ 2019-02-01 21:57 ` Thomas Gleixner 2019-02-04 7:12 ` Hannes Reinecke 0 siblings, 1 reply; 26+ messages in thread From: Thomas Gleixner @ 2019-02-01 21:57 UTC (permalink / raw) To: Hannes Reinecke Cc: John Garry, Keith Busch, Christoph Hellwig, Marc Zyngier, axboe, Peter Zijlstra, Michael Ellerman, Linuxarm, linux-kernel, Hannes Reinecke, linux-scsi, linux-block On Fri, 1 Feb 2019, Hannes Reinecke wrote: > Thing is, if we have _managed_ CPU hotplug (ie if the hardware provides some > means of quiescing the CPU before hotplug) then the whole thing is trivial; > disable SQ and wait for all outstanding commands to complete. > Then trivially all requests are completed and the issue is resolved. > Even with today's infrastructure. > > And I'm not sure if we can handle surprise CPU hotplug at all, given all the > possible race conditions. > But then I might be wrong. The kernel would completely fall apart when a CPU would vanish by surprise, i.e. uncontrolled by the kernel. Then the SCSI driver exploding would be the least of our problems. Thanks, tglx
* Re: Question on handling managed IRQs when hotplugging CPUs 2019-02-01 21:57 ` Thomas Gleixner @ 2019-02-04 7:12 ` Hannes Reinecke 2019-02-05 13:24 ` John Garry 0 siblings, 1 reply; 26+ messages in thread From: Hannes Reinecke @ 2019-02-04 7:12 UTC (permalink / raw) To: Thomas Gleixner Cc: John Garry, Keith Busch, Christoph Hellwig, Marc Zyngier, axboe, Peter Zijlstra, Michael Ellerman, Linuxarm, linux-kernel, Hannes Reinecke, linux-scsi, linux-block On 2/1/19 10:57 PM, Thomas Gleixner wrote: > On Fri, 1 Feb 2019, Hannes Reinecke wrote: >> Thing is, if we have _managed_ CPU hotplug (ie if the hardware provides some >> means of quiescing the CPU before hotplug) then the whole thing is trivial; >> disable SQ and wait for all outstanding commands to complete. >> Then trivially all requests are completed and the issue is resolved. >> Even with today's infrastructure. >> >> And I'm not sure if we can handle surprise CPU hotplug at all, given all the >> possible race conditions. >> But then I might be wrong. > > The kernel would completely fall apart when a CPU would vanish by surprise, > i.e. uncontrolled by the kernel. Then the SCSI driver exploding would be > the least of our problems. > Hehe. As I thought. So, as the user then has to wait for the system to declare 'ready for CPU remove', why can't we just disable the SQ and wait for all I/O to complete? We can make it more fine-grained by just waiting on all outstanding I/O on that SQ to complete, but waiting for all I/O should be good as an initial try. With that we wouldn't need to fiddle with driver internals, and could make it pretty generic. And we could always add more detailed logic if the driver has the means for doing so. Cheers, Hannes -- Dr. Hannes Reinecke Teamlead Storage & Networking hare@suse.de +49 911 74053 688 SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton HRB 21284 (AG Nürnberg)
* Re: Question on handling managed IRQs when hotplugging CPUs
From: John Garry @ 2019-02-05 13:24 UTC (permalink / raw)
To: Hannes Reinecke, Thomas Gleixner
Cc: Keith Busch, Christoph Hellwig, Marc Zyngier, axboe, Peter Zijlstra,
    Michael Ellerman, Linuxarm, linux-kernel, Hannes Reinecke,
    linux-scsi, linux-block

On 04/02/2019 07:12, Hannes Reinecke wrote:
> On 2/1/19 10:57 PM, Thomas Gleixner wrote:
>> On Fri, 1 Feb 2019, Hannes Reinecke wrote:
>>> Thing is, if we have _managed_ CPU hotplug (ie if the hardware
>>> provides some means of quiescing the CPU before hotplug) then the
>>> whole thing is trivial; disable SQ and wait for all outstanding
>>> commands to complete.
>>> Then trivially all requests are completed and the issue is resolved.
>>> Even with todays infrastructure.
>>>
>>> And I'm not sure if we can handle surprise CPU hotplug at all, given
>>> all the possible race conditions.
>>> But then I might be wrong.
>>
>> The kernel would completely fall apart when a CPU would vanish by
>> surprise, i.e. uncontrolled by the kernel. Then the SCSI driver
>> exploding would be the least of our problems.
>>
> Hehe. As I thought.

Hi Hannes,

> So, as the user then has to wait for the system to declars 'ready for
> CPU remove', why can't we just disable the SQ and wait for all I/O to
> complete?
> We can make it more fine-grained by just waiting on all outstanding I/O
> on that SQ to complete, but waiting for all I/O should be good as an
> initial try.
> With that we wouldn't need to fiddle with driver internals, and could
> make it pretty generic.

I don't fully understand this idea - specifically, at which layer would
we be waiting for all the IO to complete?

> And we could always add more detailed logic if the driver has the means
> for doing so.

Thanks,
John

> Cheers,
>
> Hannes
* Re: Question on handling managed IRQs when hotplugging CPUs
From: Keith Busch @ 2019-02-05 14:52 UTC (permalink / raw)
To: John Garry
Cc: Hannes Reinecke, Thomas Gleixner, Christoph Hellwig, Marc Zyngier,
    axboe, Peter Zijlstra, Michael Ellerman, Linuxarm, linux-kernel,
    Hannes Reinecke, linux-scsi, linux-block

On Tue, Feb 05, 2019 at 05:24:11AM -0800, John Garry wrote:
> On 04/02/2019 07:12, Hannes Reinecke wrote:
>
> Hi Hannes,
>
>> So, as the user then has to wait for the system to declars 'ready for
>> CPU remove', why can't we just disable the SQ and wait for all I/O to
>> complete?
>> We can make it more fine-grained by just waiting on all outstanding I/O
>> on that SQ to complete, but waiting for all I/O should be good as an
>> initial try.
>> With that we wouldn't need to fiddle with driver internals, and could
>> make it pretty generic.
>
> I don't fully understand this idea - specifically, at which layer would
> we be waiting for all the IO to complete?

Whichever layer dispatched the IO to a CPU specific context should
be the one to wait for its completion. That should be blk-mq for most
block drivers.
* Re: Question on handling managed IRQs when hotplugging CPUs
From: John Garry @ 2019-02-05 15:09 UTC (permalink / raw)
To: Keith Busch
Cc: Hannes Reinecke, Thomas Gleixner, Christoph Hellwig, Marc Zyngier,
    axboe, Peter Zijlstra, Michael Ellerman, Linuxarm, linux-kernel,
    Hannes Reinecke, linux-scsi, linux-block

On 05/02/2019 14:52, Keith Busch wrote:
> On Tue, Feb 05, 2019 at 05:24:11AM -0800, John Garry wrote:
>> On 04/02/2019 07:12, Hannes Reinecke wrote:
>>
>> Hi Hannes,
>>
>>> So, as the user then has to wait for the system to declars 'ready for
>>> CPU remove', why can't we just disable the SQ and wait for all I/O to
>>> complete?
>>> We can make it more fine-grained by just waiting on all outstanding I/O
>>> on that SQ to complete, but waiting for all I/O should be good as an
>>> initial try.
>>> With that we wouldn't need to fiddle with driver internals, and could
>>> make it pretty generic.
>>
>> I don't fully understand this idea - specifically, at which layer would
>> we be waiting for all the IO to complete?
>
> Whichever layer dispatched the IO to a CPU specific context should
> be the one to wait for its completion. That should be blk-mq for most
> block drivers.

For SCSI devices, unfortunately not all IO sent to the HW originates
from blk-mq or any other single entity.

Thanks,
John
* Re: Question on handling managed IRQs when hotplugging CPUs
From: Keith Busch @ 2019-02-05 15:11 UTC (permalink / raw)
To: John Garry
Cc: Hannes Reinecke, Thomas Gleixner, Christoph Hellwig, Marc Zyngier,
    axboe, Peter Zijlstra, Michael Ellerman, Linuxarm, linux-kernel,
    Hannes Reinecke, linux-scsi, linux-block

On Tue, Feb 05, 2019 at 03:09:28PM +0000, John Garry wrote:
> On 05/02/2019 14:52, Keith Busch wrote:
>> On Tue, Feb 05, 2019 at 05:24:11AM -0800, John Garry wrote:
>>> On 04/02/2019 07:12, Hannes Reinecke wrote:
>>>
>>> Hi Hannes,
>>>
>>>> So, as the user then has to wait for the system to declars 'ready for
>>>> CPU remove', why can't we just disable the SQ and wait for all I/O to
>>>> complete?
>>>> We can make it more fine-grained by just waiting on all outstanding I/O
>>>> on that SQ to complete, but waiting for all I/O should be good as an
>>>> initial try.
>>>> With that we wouldn't need to fiddle with driver internals, and could
>>>> make it pretty generic.
>>>
>>> I don't fully understand this idea - specifically, at which layer would
>>> we be waiting for all the IO to complete?
>>
>> Whichever layer dispatched the IO to a CPU specific context should
>> be the one to wait for its completion. That should be blk-mq for most
>> block drivers.
>
> For SCSI devices, unfortunately not all IO sent to the HW originates from
> blk-mq or any other single entity.

Then they'll need to register their own CPU notifiers and handle the
ones they dispatched.
* Re: Question on handling managed IRQs when hotplugging CPUs
From: Hannes Reinecke @ 2019-02-05 15:15 UTC (permalink / raw)
To: John Garry, Keith Busch
Cc: Thomas Gleixner, Christoph Hellwig, Marc Zyngier, axboe,
    Peter Zijlstra, Michael Ellerman, Linuxarm, linux-kernel,
    Hannes Reinecke, linux-scsi, linux-block

On 2/5/19 4:09 PM, John Garry wrote:
> On 05/02/2019 14:52, Keith Busch wrote:
>> On Tue, Feb 05, 2019 at 05:24:11AM -0800, John Garry wrote:
>>> On 04/02/2019 07:12, Hannes Reinecke wrote:
>>>
>>> Hi Hannes,
>>>
>>>> So, as the user then has to wait for the system to declars 'ready for
>>>> CPU remove', why can't we just disable the SQ and wait for all I/O to
>>>> complete?
>>>> We can make it more fine-grained by just waiting on all outstanding I/O
>>>> on that SQ to complete, but waiting for all I/O should be good as an
>>>> initial try.
>>>> With that we wouldn't need to fiddle with driver internals, and could
>>>> make it pretty generic.
>>>
>>> I don't fully understand this idea - specifically, at which layer would
>>> we be waiting for all the IO to complete?
>>
>> Whichever layer dispatched the IO to a CPU specific context should
>> be the one to wait for its completion. That should be blk-mq for most
>> block drivers.
>
> For SCSI devices, unfortunately not all IO sent to the HW originates
> from blk-mq or any other single entity.
>
No, not as such.
But each IO sent to the HW requires a unique identification (ie a valid
tag). And as the tagspace is managed by blk-mq (minus management
commands, but I'm working on that currently) we can easily figure out if
the device is busy by checking for an empty tag map.

Should be doable for most modern HBAs.

Cheers,

Hannes
--
Dr. Hannes Reinecke		Teamlead Storage & Networking
hare@suse.de			+49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)
* Re: Question on handling managed IRQs when hotplugging CPUs
From: John Garry @ 2019-02-05 15:27 UTC (permalink / raw)
To: Hannes Reinecke, Keith Busch
Cc: Thomas Gleixner, Christoph Hellwig, Marc Zyngier, axboe,
    Peter Zijlstra, Michael Ellerman, Linuxarm, linux-kernel,
    Hannes Reinecke, linux-scsi, linux-block

On 05/02/2019 15:15, Hannes Reinecke wrote:
> On 2/5/19 4:09 PM, John Garry wrote:
>> On 05/02/2019 14:52, Keith Busch wrote:
>>> On Tue, Feb 05, 2019 at 05:24:11AM -0800, John Garry wrote:
>>>> On 04/02/2019 07:12, Hannes Reinecke wrote:
>>>>
>>>> Hi Hannes,
>>>>
>>>>> So, as the user then has to wait for the system to declars 'ready for
>>>>> CPU remove', why can't we just disable the SQ and wait for all I/O to
>>>>> complete?
>>>>> We can make it more fine-grained by just waiting on all outstanding
>>>>> I/O on that SQ to complete, but waiting for all I/O should be good
>>>>> as an initial try.
>>>>> With that we wouldn't need to fiddle with driver internals, and could
>>>>> make it pretty generic.
>>>>
>>>> I don't fully understand this idea - specifically, at which layer would
>>>> we be waiting for all the IO to complete?
>>>
>>> Whichever layer dispatched the IO to a CPU specific context should
>>> be the one to wait for its completion. That should be blk-mq for most
>>> block drivers.
>>
>> For SCSI devices, unfortunately not all IO sent to the HW originates
>> from blk-mq or any other single entity.
>>
> No, not as such.
> But each IO sent to the HW requires a unique identifcation (ie a valid
> tag). And as the tagspace is managed by block-mq (minus management
> commands, but I'm working on that currently) we can easily figure out if
> the device is busy by checking for an empty tag map.

That sounds like a reasonable starting solution.

Thanks,
John

> Should be doable for most modern HBAs.
>
> Cheers,
>
> Hannes
* Re: Question on handling managed IRQs when hotplugging CPUs
From: Christoph Hellwig @ 2019-02-05 18:23 UTC (permalink / raw)
To: John Garry
Cc: Keith Busch, Hannes Reinecke, Thomas Gleixner, Christoph Hellwig,
    Marc Zyngier, axboe, Peter Zijlstra, Michael Ellerman, Linuxarm,
    linux-kernel, Hannes Reinecke, linux-scsi, linux-block

On Tue, Feb 05, 2019 at 03:09:28PM +0000, John Garry wrote:
> For SCSI devices, unfortunately not all IO sent to the HW originates from
> blk-mq or any other single entity.

Where else would SCSI I/O originate from?
* Re: Question on handling managed IRQs when hotplugging CPUs
From: John Garry @ 2019-02-06  9:21 UTC (permalink / raw)
To: Christoph Hellwig
Cc: Keith Busch, Hannes Reinecke, Thomas Gleixner, Marc Zyngier, axboe,
    Peter Zijlstra, Michael Ellerman, Linuxarm, linux-kernel,
    Hannes Reinecke, linux-scsi, linux-block

On 05/02/2019 18:23, Christoph Hellwig wrote:
> On Tue, Feb 05, 2019 at 03:09:28PM +0000, John Garry wrote:
>> For SCSI devices, unfortunately not all IO sent to the HW originates from
>> blk-mq or any other single entity.
>
> Where else would SCSI I/O originate from?

Please note that I was referring to other management IO, like SAS SMP,
TMFs, and other proprietary commands which the driver may generate for
the HBA - https://marc.info/?l=linux-scsi&m=154831889001973&w=2
discusses some of them also.

Thanks,
John
* Re: Question on handling managed IRQs when hotplugging CPUs
From: Benjamin Block @ 2019-02-06 13:34 UTC (permalink / raw)
To: John Garry
Cc: Christoph Hellwig, Keith Busch, Hannes Reinecke, Thomas Gleixner,
    Marc Zyngier, axboe, Peter Zijlstra, Michael Ellerman, Linuxarm,
    linux-kernel, Hannes Reinecke, linux-scsi, linux-block

On Wed, Feb 06, 2019 at 09:21:40AM +0000, John Garry wrote:
> On 05/02/2019 18:23, Christoph Hellwig wrote:
>> On Tue, Feb 05, 2019 at 03:09:28PM +0000, John Garry wrote:
>>> For SCSI devices, unfortunately not all IO sent to the HW originates from
>>> blk-mq or any other single entity.
>>
>> Where else would SCSI I/O originate from?
>
> Please note that I was referring to other management IO, like SAS SMP,
> TMFs, and other proprietary commands which the driver may generate for
> the HBA - https://marc.info/?l=linux-scsi&m=154831889001973&w=2
> discusses some of them also.

Especially the TMFs sent via SCSI EH are a bit of a pain I guess,
because they are entirely managed by the device drivers, but depending
on the device driver they might not even qualify for the problem Hannes
is seeing.

--
With Best Regards, Benjamin Block / Linux on IBM Z Kernel Development
IBM Systems & Technology Group / IBM Deutschland Research & Development GmbH
Vorsitz. AufsR.: Matthias Hartmann / Geschäftsführung: Dirk Wittkopp
Sitz der Gesellschaft: Böblingen / Registergericht: AmtsG Stuttgart, HRB 243294
* Re: Question on handling managed IRQs when hotplugging CPUs
From: Hannes Reinecke @ 2019-02-05 15:10 UTC (permalink / raw)
To: Keith Busch, John Garry
Cc: Hannes Reinecke, Thomas Gleixner, Christoph Hellwig, Marc Zyngier,
    axboe, Peter Zijlstra, Michael Ellerman, Linuxarm, linux-kernel,
    linux-scsi, linux-block

On 2/5/19 3:52 PM, Keith Busch wrote:
> On Tue, Feb 05, 2019 at 05:24:11AM -0800, John Garry wrote:
>> On 04/02/2019 07:12, Hannes Reinecke wrote:
>>
>> Hi Hannes,
>>
>>> So, as the user then has to wait for the system to declars 'ready for
>>> CPU remove', why can't we just disable the SQ and wait for all I/O to
>>> complete?
>>> We can make it more fine-grained by just waiting on all outstanding I/O
>>> on that SQ to complete, but waiting for all I/O should be good as an
>>> initial try.
>>> With that we wouldn't need to fiddle with driver internals, and could
>>> make it pretty generic.
>>
>> I don't fully understand this idea - specifically, at which layer would
>> we be waiting for all the IO to complete?
>
> Whichever layer dispatched the IO to a CPU specific context should
> be the one to wait for its completion. That should be blk-mq for most
> block drivers.
>
Indeed.
But we don't provide any mechanisms for that ATM, right?

Maybe this would be a topic fit for LSF/MM?

Cheers,

Hannes
--
Dr. Hannes Reinecke		zSeries & Storage
hare@suse.com			+49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)
* Re: Question on handling managed IRQs when hotplugging CPUs
From: Keith Busch @ 2019-02-05 15:16 UTC (permalink / raw)
To: Hannes Reinecke
Cc: John Garry, Hannes Reinecke, Thomas Gleixner, Christoph Hellwig,
    Marc Zyngier, axboe, Peter Zijlstra, Michael Ellerman, Linuxarm,
    linux-kernel, linux-scsi, linux-block

On Tue, Feb 05, 2019 at 04:10:47PM +0100, Hannes Reinecke wrote:
> On 2/5/19 3:52 PM, Keith Busch wrote:
>> Whichever layer dispatched the IO to a CPU specific context should
>> be the one to wait for its completion. That should be blk-mq for most
>> block drivers.
>>
> Indeed.
> But we don't provide any mechanisms for that ATM, right?
>
> Maybe this would be a topic fit for LSF/MM?

Right, there's nothing handling this now, and sounds like it'd be a good
discussion to bring to the storage track.