From: Mike Travis <travis@sgi.com>
To: Ingo Molnar <mingo@kernel.org>
Cc: Toshi Kani <toshi.kani@hp.com>,
	tglx@linutronix.de, mingo@redhat.com, hpa@zytor.com,
	akpm@linux-foundation.org, roland@purestorage.com,
	dan.j.williams@intel.com, x86@kernel.org,
	linux-nvdimm@lists.01.org, linux-kernel@vger.kernel.org,
	Clive Harding <clive@sgi.com>, Russ Anderson <rja@sgi.com>,
	Mel Gorman <mgorman@suse.de>
Subject: Re: [PATCH 2/3] mm, x86: Remove region_is_ram() call from ioremap
Date: Tue, 23 Jun 2015 11:57:24 -0700
Message-ID: <5589AC14.4080003@sgi.com>
In-Reply-To: <20150623090154.GA3402@gmail.com>



On 6/23/2015 2:01 AM, Ingo Molnar wrote:
> 
> * Mike Travis <travis@sgi.com> wrote:
> 
>> <<<
>> We have a large university system in the UK that is experiencing
>> very long delays modprobing the driver for a specific I/O device.
>> The delay is from 8-10 minutes per device and there are 31 devices
>> in the system.  This 4 to 5 hour delay in starting up those I/O
>> devices is very much a burden on the customer.
>> ...
>> The problem was tracked down to a very slow IOREMAP operation and
>> the excessively long ioresource lookup to ensure that the user is
>> not attempting to ioremap RAM.  These patches provide a speed up
>> to that function.
>>>>>
>>
>> The speed up was pretty dramatic, I think to about 15-20 minutes
>> (the test was done by our local CS person in the UK).  I think this
>> would prove the function was working since it would have fallen
>> back to the previous page_is_ram function and the 4 to 5 hour
>> startup.
> 
> Btw., I think even 15-20 minutes is still in the 'ridiculously slow' category.
> Any chance to fix all of this properly, not just hack by hack?
> 
> Thanks,
> 
> 	Ingo
> 


The primary cause of the slow startup now lies in loading the
kernel and other software onto the 31 coprocessors serially.
We have suggested to the vendor that they look at booting and
starting these in parallel.
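
For a rough sense of scale, the difference between serial and
parallel startup can be sketched as below.  This is only an
idealized model: it assumes every device takes the same time and
that nothing shared (bus, storage, firmware locks) serializes the
work, neither of which was measured in this thread.  The function
names and the worker counts are hypothetical; the 31 devices and
the 8-10 minutes per device are the figures quoted above.

```python
def serial_minutes(per_device_minutes, devices):
    """Total time when devices are started one after another."""
    return per_device_minutes * devices


def parallel_minutes(per_device_minutes, devices, workers):
    """Idealized total when up to `workers` devices start at once.

    Assumes identical per-device time and no shared bottleneck.
    """
    batches = -(-devices // workers)  # ceiling division
    return per_device_minutes * batches


DEVICES = 31  # device count quoted in the thread

# Originally reported case: 8-10 minutes per device, started serially.
print(serial_minutes(8, DEVICES), serial_minutes(10, DEVICES))  # 248 310

# Hypothetical: the same per-device cost, started 4 at a time or all at once.
print(parallel_minutes(10, DEVICES, 4))        # 80
print(parallel_minutes(10, DEVICES, DEVICES))  # 10
```

The serial figures, 248 to 310 minutes, line up with the "4 to 5
hour" delay originally reported.  The parallel figures are an upper
bound on the benefit, not a prediction.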

The problem is that not many systems can handle more than 4 of
these coprocessors, let alone 32, so these optimizations are
driven mostly by the interaction between the customers and the
vendor.

Any speedup of the kernel startup helps here as well.

[off topic]
Btw, this ~20 minutes is just for the startup of the
coprocessors.  The entire system takes much longer to boot, as
this is a huge UV system.  Most of that time is still due to memory
initialization.  Mel's "defer page init" patches help here
tremendously, though it's not clear they will be backported
to SLES11, which the customer is running.

Thanks,
Mike
