From: Daniel Lezcano <daniel.lezcano@linaro.org>
To: Ming Lei <ming.lei@redhat.com>
Cc: Bart Van Assche <bvanassche@acm.org>, Jens Axboe <axboe@fb.com>,
	Hannes Reinecke <hare@suse.com>, Sagi Grimberg <sagi@grimberg.me>,
	linux-scsi@vger.kernel.org, Peter Zijlstra <peterz@infradead.org>,
	Long Li <longli@microsoft.com>,
	John Garry <john.garry@huawei.com>,
	LKML <linux-kernel@vger.kernel.org>,
	linux-nvme@lists.infradead.org,
	Keith Busch <keith.busch@intel.com>,
	Ingo Molnar <mingo@redhat.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Christoph Hellwig <hch@lst.de>
Subject: Re: [PATCH 1/4] softirq: implement IRQ flood detection mechanism
Date: Thu, 5 Sep 2019 12:37:13 +0200	[thread overview]
Message-ID: <6a36ccc7-24cd-1d92-fef1-2c5e0f798c36@linaro.org> (raw)
In-Reply-To: <20190905090617.GB4432@ming.t460p>


Hi Ming,

On 05/09/2019 11:06, Ming Lei wrote:
> On Wed, Sep 04, 2019 at 07:31:48PM +0200, Daniel Lezcano wrote:
>> Hi,
>>
>> On 04/09/2019 19:07, Bart Van Assche wrote:
>>> On 9/3/19 12:50 AM, Daniel Lezcano wrote:
>>>> On 03/09/2019 09:28, Ming Lei wrote:
>>>>> On Tue, Sep 03, 2019 at 08:40:35AM +0200, Daniel Lezcano wrote:
>>>>>> It is a scheduler problem then ?
>>>>>
>>>>> Scheduler can do nothing if the CPU is taken completely by handling
>>>>> interrupt & softirq, so seems not a scheduler problem, IMO.
>>>>
>>>> Why? If there is a irq pressure on one CPU reducing its capacity, the
>>>> scheduler will balance the tasks on another CPU, no?
>>>
>>> Only if CONFIG_IRQ_TIME_ACCOUNTING has been enabled. However, I don't
>>> know any Linux distro that enables that option. That's probably because
>>> that option introduces two rdtsc() calls in each interrupt. Given the
>>> overhead introduced by this option, I don't think this is the solution
>>> Ming is looking for.
>>
>> Was this overhead reported somewhere ?
> 
> The syscall of gettimeofday() calls ktime_get_real_ts64() which finally
> calls tk_clock_read() which calls rdtsc too.
> 
> But gettimeofday() is often used in fast path, and block IO_STAT needs to
> read it too.
> 
>>
>>> See also irqtime_account_irq() in kernel/sched/cputime.c.
>>
>> From my POV, this framework could be interesting to detect this situation.
> 
> Now we are talking about IRQ_TIME_ACCOUNTING instead of IRQ_TIMINGS, and the
> former one could be used to implement the detection. And the only sharing
> should be the read of timestamp.
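
As a side note on the cost of that timestamp read: an order of magnitude
is easy to get from user space. A rough microbenchmark like the one
below (my own sketch; on x86 the vDSO clock_gettime() ends up doing
rdtsc) gives a ballpark per-read figure:

#include <stdint.h>
#include <stdio.h>
#include <time.h>

#define ITERATIONS 10000000

int main(void)
{
	struct timespec ts, start, end;
	uint64_t acc = 0;

	clock_gettime(CLOCK_MONOTONIC, &start);
	for (int i = 0; i < ITERATIONS; i++) {
		clock_gettime(CLOCK_MONOTONIC, &ts); /* rdtsc via vDSO on x86 */
		acc += (uint64_t)ts.tv_nsec;	     /* keep the result live  */
	}
	clock_gettime(CLOCK_MONOTONIC, &end);

	int64_t ns = (int64_t)(end.tv_sec - start.tv_sec) * 1000000000LL
		   + (end.tv_nsec - start.tv_nsec);
	printf("avg clock read: %.1f ns (ignore: %llu)\n",
	       (double)ns / ITERATIONS, (unsigned long long)acc);
	return 0;
}

On typical x86 hardware I would expect something in the tens of
nanoseconds per read, which is why real numbers for the per-interrupt
overhead would be interesting to see.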

You have not yet shared the analysis of the problem (the kernel warnings
give the symptoms) or the reasoning behind the solution. That makes it
hard to understand what exactly you are looking for and how to connect
the dots.

AFAIU, there are media fast enough that responses to requests come back
faster than they can be processed, right?

I don't see how detecting IRQ flooding and using a threaded irq is the
solution; can you explain?

If the responses come at a very high rate, then whatever the mechanism
(interrupts, threaded interrupts, polling), we are still in the same
situation.
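
To make sure we are talking about the same thing, here is a minimal
user-space sketch of the kind of detection I understand is being
proposed: track the fraction of a time window spent in interrupt
context and declare a flood above some threshold. The structure, the
window and the threshold below are mine, purely for illustration:

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Illustrative values only: window and threshold are arbitrary. */
#define FLOOD_WINDOW_NS  (100 * 1000 * 1000ULL)   /* 100 ms window   */
#define FLOOD_THRESHOLD  80                       /* flood above 80% */

struct irq_flood_state {
	uint64_t window_start;	/* start of the current window        */
	uint64_t irq_time;	/* irq time accumulated in the window */
	uint64_t irq_enter;	/* timestamp of the last irq entry    */
};

/* Called on interrupt entry: record the entry timestamp. */
static void flood_irq_enter(struct irq_flood_state *s, uint64_t now)
{
	s->irq_enter = now;
}

/* Called on interrupt exit: accumulate irq time, evaluate the window. */
static bool flood_irq_exit(struct irq_flood_state *s, uint64_t now)
{
	bool flooded = false;

	s->irq_time += now - s->irq_enter;

	if (now - s->window_start >= FLOOD_WINDOW_NS) {
		/* Fraction of the elapsed window spent in irq context. */
		uint64_t pct = 100 * s->irq_time / (now - s->window_start);

		flooded = pct > FLOOD_THRESHOLD;
		s->window_start = now;
		s->irq_time = 0;
	}
	return flooded;
}

int main(void)
{
	struct irq_flood_state s = { 0 };
	uint64_t now = 0;

	/* Simulate interrupts burning 90% of every 1 ms period. */
	for (int i = 0; i < 200; i++) {
		flood_irq_enter(&s, now);
		now += 900000;			/* 0.9 ms in the handler */
		if (flood_irq_exit(&s, now))
			printf("IRQ flood detected at %llu ns\n",
			       (unsigned long long)now);
		now += 100000;			/* 0.1 ms outside        */
	}
	return 0;
}

Whatever form the real detection takes, the work still has to be done
somewhere once the flood is flagged.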

My initial suggestion was to see whether the interrupt load could be
taken into account in the CPU load, favoring task migration to a less
loaded CPU through the scheduler's load balancing. The CPU processing
interrupts would then end up doing only that, while other CPUs handle
the "threaded" side.

Besides that, I'm wondering whether the block scheduler should somehow
be involved in this [1].

  -- Daniel

[1]
https://www.linaro.org/blog/io-bandwidth-management-for-production-quality-services/



-- 
 <http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog

