* New fast(?)-boot results on ARM
@ 2009-08-14 17:02 Robert Schwebel
  2009-08-14 18:19 ` Zan Lynx
                   ` (2 more replies)
  0 siblings, 3 replies; 43+ messages in thread
From: Robert Schwebel @ 2009-08-14 17:02 UTC (permalink / raw)
  To: linux-kernel, linux-embedded; +Cc: Arjan van de Ven, Tim Bird, kernel

Hi,

On Thu, Aug 13, 2009 at 05:33:26PM +0200, Robert Schwebel wrote:
> On Thu, Aug 13, 2009 at 08:28:26AM -0700, Arjan van de Ven wrote:
> > > That's bad :-) So there is no room for improvement any more in our
> > > ARM boot sequences ...
> >
> > on x86 we're doing pretty well ;-)
>
> On i.MX27 (400 MHz ARM926EJ-S) we currently need 7 s, measured from
> power-on through the kernel up to "starting init". This is with
>
> - no delay in u-boot-v2
> - rootfs on NAND (UBIFS)
> - quiet
> - precalculated loops-per-jiffy
> - zImage kernel instead of uImage

Here's a little video of our demo system booting:
http://www.youtube.com/watch?v=xDbUnNsj0cI

As you can see there, it needs about 15 s from the release of the reset button
up to the moment where the application shows its Qt 4.5.2 based GUI (which is
when we fade over from the initial framebuffer to the final one, in order to
hide the Qt application startup noise).

And below is the boot log (after turning "quiet" off again). The numbers are
the timestamp and the delta to the previous timestamp, measured on the
controlling PC by looking at the serial console output. The ptx_ts script
starts counting when the regexp is found, so the numbers basically start at
the moment when u-boot-v2 has initialized the system up to the point where we
can see something.
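
For readers who want to reproduce this kind of measurement, here is a
minimal sketch of a host-side timestamping filter. It is not the actual
ptx_ts script; the function name stamp_lines and the injectable clock
parameter are assumptions made for illustration. It prefixes every line
with the elapsed time and the delta to the previous line, and restarts
the clock at the first line matching the given pattern.

```python
import re
import time


def stamp_lines(lines, pattern, clock=time.monotonic):
    """Yield each line prefixed with '[elapsed] <delta>' in seconds.

    The elapsed counter restarts at zero on the first line that matches
    `pattern`; before that it counts from the start of the filter.
    """
    regex = re.compile(pattern)
    start = last = clock()
    matched = False
    for line in lines:
        now = clock()
        if not matched and regex.search(line):
            # First match: restart the clock so this line reads about zero.
            start = last = now
            matched = True
        text = line.rstrip("\n")
        yield "[%10.6f] <%10.6f> %s" % (now - start, now - last, text)
        last = now
```

Fed from a serial terminal program, it could be used by iterating over
stamp_lines(sys.stdin, "U-Boot 2.0.0-rc9") and printing each result with
flush=True. Note that the timestamps include serial transmission time,
so they describe what the host sees rather than what the target does
internally.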

Result:

- 2.4 s up from u-boot to the end of "Uncompressing Linux"
- 300 ms until ubifs initialization starts
- 3.7 s for ubifs, until "mounted root"

So we basically have 7 s for the kernel. The rest is userspace, which hasn't
seen much optimization yet, other than trying to start the GUI application as
early as possible, while doing all other init stuff in parallel. Adding "quiet"
brings us another 300 ms.

That's a factor of 70 away from the 110 ms boot time Tim talked about some days
ago (and he measured on an ARM CPU which had almost half the speed of this
one), and I'm wondering what we can do to improve the boot time.

Robert

rsc@thebe:~$ microcom | ptx_ts "U-Boot 2.0.0-rc9"
[  2.395740] <  2.395740>
[  2.395860] <  0.000120>
[  0.000011] <  0.000011> U-Boot 2.0.0-rc9 (Aug  5 2009 - 10:05:58)
[  0.000059] <  0.000048>
[  0.003823] <  0.003764> Board: Phytec phyCORE-i.MX27
[  0.010753] <  0.006930> cfi_probe: cfi_flash base: 0xc0000000 size: 0x02000000
[  0.018711] <  0.007958> NAND device: Manufacturer ID: 0x20, Chip ID: 0x36 (ST Micro NAND 64MiB 1,8V 8-bit)
[  0.026592] <  0.007881> imxfb@imxfb0: i.MX Framebuffer driver
[  0.178655] <  0.152063> dev_protect: currently broken
[  0.178736] <  0.000081> Using environment in NOR Flash
[  0.182577] <  0.003841> initialising PLLs
[  0.367142] <  0.184565> Malloc space: 0xa3f00000 -> 0xa7f00000 (size 64 MB)
[  0.370568] <  0.003426> Stack space : 0xa3ef8000 -> 0xa3f00000 (size 32 kB)
[  0.445993] <  0.075425> running /env/bin/init...
[  0.870592] <  0.424599>
[  0.874559] <  0.003967> Hit any key to stop autoboot:  0
[  1.326621] <  0.452062> loaded zImage from /dev/nand0.kernel.bb with size 1679656
[  2.009996] <  0.683375> Uncompressing Linux............................................................................................................... done, booting the kernel.
[  2.416999] <  0.407003> Linux version 2.6.31-rc4-g056f82f-dirty (sha@octopus) (gcc version 4.3.2 (OSELAS.Toolchain-1.99.3) ) #1 PREEMPT Thu Aug 6 08:37:19 CEST 2009
[  2.418729] <  0.001730> CPU: ARM926EJ-S [41069264] revision 4 (ARMv5TEJ), cr=00053177
[  2.423081] <  0.004352> CPU: VIVT data cache, VIVT instruction cache
[  2.426592] <  0.003511> Machine: phyCORE-i.MX27
[  2.430609] <  0.004017> Memory policy: ECC disabled, Data cache writeback
[  2.439704] <  0.009095> Built 1 zonelists in Zone order, mobility grouping on.  Total pages: 32512
[  2.463977] <  0.024273> Kernel command line: console=ttymxc0,115200 mt9v022.sensor_type=color ip=192.168.23.197:192.168.23.2:192.168.23.2:255.255.0.0::: ubi.mtd=7 root=ubi0:root rootfstype=ubifs mtdparts="physmap-flash.0:256k(uboot)ro,128k(ubootenv),3M(kernel),-(root);mxc_nand:256k(uboot)ro,128k(ubootenv),3M(kernel),-(root)"
[  2.467580] <  0.003603> Unknown boot option `mt9v022.sensor_type=color': ignoring
[  2.471632] <  0.004052> PID hash table entries: 512 (order: 9, 2048 bytes)
[  2.479971] <  0.008339> Dentry cache hash table entries: 16384 (order: 4, 65536 bytes)
[  2.485485] <  0.005514> Inode-cache hash table entries: 8192 (order: 3, 32768 bytes)
[  2.485560] <  0.000075> Memory: 128MB = 128MB total
[  2.490595] <  0.005035> Memory: 126108KB available (3104K code, 350K data, 100K init, 0K highmem)
[  2.494609] <  0.004014> NR_IRQS:272
[  2.494654] <  0.000045> MXC GPIO hardware
[  2.498595] <  0.003941> MXC IRQ initialized
[  2.502600] <  0.004005> Console: colour dummy device 80x30
[  2.506591] <  0.003991> Calibrating delay loop... 199.06 BogoMIPS (lpj=995328)
[  2.506641] <  0.000050> Mount-cache hash table entries: 512
[  2.510651] <  0.004010> CPU: Testing write buffer coherency: ok
[  2.514594] <  0.003943> NET: Registered protocol family 16
[  2.518678] <  0.004084> bio: create slab <bio-0> at 0
[  2.522584] <  0.003906> NET: Registered protocol family 2
[  2.526639] <  0.004055> IP route cache hash table entries: 1024 (order: 0, 4096 bytes)
[  2.530592] <  0.003953> TCP established hash table entries: 4096 (order: 3, 32768 bytes)
[  2.539728] <  0.009136> TCP bind hash table entries: 4096 (order: 2, 16384 bytes)
[  2.542647] <  0.002919> TCP: Hash tables configured (established 4096 bind 4096)
[  2.542696] <  0.000049> TCP reno registered
[  2.546573] <  0.003877> NET: Registered protocol family 1
[  2.555898] <  0.009325> NetWinder Floating Point Emulator V0.97 (extended precision)
[  2.555969] <  0.000071> msgmni has been set to 246
[  2.560063] <  0.004094> io scheduler noop registered (default)
[  2.560109] <  0.000046> i.MX Framebuffer driver
[  2.564237] <  0.004128> Serial: IMX driver
[  2.567840] <  0.003603> Platform driver 'imx-uart' needs updating - please use dev_pm_ops
[  2.571898] <  0.004058> imx-uart.0: ttymxc0 at MMIO 0x1000a000 (irq = 20) is a IMX
[  2.576220] <  0.004322> console [ttymxc0] enabled
[  2.583937] <  0.007717> imx-uart.1: ttymxc1 at MMIO 0x1000b000 (irq = 19) is a IMX
[  2.590616] <  0.006679> imx-uart.2: ttymxc2 at MMIO 0x1000c000 (irq = 18) is a IMX
[  2.590734] <  0.000118> FEC Ethernet Driver
[  2.599694] <  0.008960> Platform driver 'fec' needs updating - please use dev_pm_ops
[  2.599745] <  0.000051> fec: PHY @ 0x0, ID 0x00221613 -- KS8721BL
[  2.612095] <  0.012350> physmap platform flash device: 02000000 at c0000000
[  2.615634] <  0.003539> physmap-flash.0: Found 1 x16 devices at 0x0 in 16-bit bank
[  2.620073] <  0.004439>  Intel/Sharp Extended Query Table at 0x010A
[  2.624187] <  0.004114>  Intel/Sharp Extended Query Table at 0x010A
[  2.627835] <  0.003648>  Intel/Sharp Extended Query Table at 0x010A
[  2.631976] <  0.004141>  Intel/Sharp Extended Query Table at 0x010A
[  2.636182] <  0.004206>  Intel/Sharp Extended Query Table at 0x010A
[  2.640245] <  0.004063> Using buffer write method
[  2.642569] <  0.002324> Using auto-unlock on power-up/resume
[  2.646576] <  0.004007> cfi_cmdset_0001: Erase suspend on write enabled
[  2.650581] <  0.004005> 4 cmdlinepart partitions found on MTD device physmap-flash.0
[  2.654627] <  0.004046> Creating 4 MTD partitions on "physmap-flash.0":
[  2.658594] <  0.003967> 0x000000000000-0x000000040000 : "uboot"
[  2.666646] <  0.008052> 0x000000040000-0x000000060000 : "ubootenv"
[  2.674574] <  0.007928> 0x000000060000-0x000000360000 : "kernel"
[  2.678570] <  0.003996> 0x000000360000-0x000002000000 : "root"
[  2.690623] <  0.012053> NAND device: Manufacturer ID: 0x20, Chip ID: 0x36 (ST Micro NAND 64MiB 1,8V 8-bit)
[  2.694572] <  0.003949> RedBoot partition parsing not available
[  2.698575] <  0.004003> 4 cmdlinepart partitions found on MTD device mxc_nand
[  2.706596] <  0.008021> Creating 4 MTD partitions on "mxc_nand":
[  2.706645] <  0.000049> 0x000000000000-0x000000040000 : "uboot"
[  2.717895] <  0.011250> 0x000000040000-0x000000060000 : "ubootenv"
[  2.726578] <  0.008683> 0x000000060000-0x000000360000 : "kernel"
[  2.742628] <  0.016050> 0x000000360000-0x000004000000 : "root"
[  3.058610] <  0.315982> UBI: attaching mtd7 to ubi0
[  3.062878] <  0.004268> UBI: physical eraseblock size:   16384 bytes (16 KiB)
[  3.070601] <  0.007723> UBI: logical eraseblock size:    15360 bytes
[  3.070665] <  0.000064> UBI: smallest flash I/O unit:    512
[  3.078564] <  0.007899> UBI: VID header offset:          512 (aligned 512)
[  3.078609] <  0.000045> UBI: data offset:                1024
[  5.006609] <  1.928000> UBI: attached mtd7 to ubi0
[  5.013157] <  0.006548> UBI: MTD device name:            "root"
[  5.014566] <  0.001409> UBI: MTD device size:            60 MiB
[  5.018660] <  0.004094> UBI: number of good PEBs:        3880
[  5.022585] <  0.003925> UBI: number of bad PEBs:         0
[  5.026797] <  0.004212> UBI: max. allowed volumes:       89
[  5.026849] <  0.000052> UBI: wear-leveling threshold:    4096
[  5.030779] <  0.003930> UBI: number of internal volumes: 1
[  5.034583] <  0.003804> UBI: number of user volumes:     1
[  5.046572] <  0.011989> UBI: available PEBs:             0
[  5.046622] <  0.000050> UBI: total number of reserved PEBs: 3880
[  5.046657] <  0.000035> UBI: number of PEBs reserved for bad PEB handling: 38
[  5.050606] <  0.003949> UBI: max/mean erase counter: 2/0
[  5.050668] <  0.000062> UBI: image sequence number: 0
[  5.058619] <  0.007951> UBI: background thread "ubi_bgt0d" started, PID 215
[  5.062620] <  0.004001> oprofile: using timer interrupt.
[  5.070584] <  0.007964> TCP cubic registered
[  5.070637] <  0.000053> NET: Registered protocol family 17
[  5.074624] <  0.003987> RPC: Registered udp transport module.
[  5.082616] <  0.007992> RPC: Registered tcp transport module.
[  5.605159] <  0.522543> eth0: config: auto-negotiation on, 100FDX, 100HDX, 10FDX, 10HDX.
[  6.602621] <  0.997462> IP-Config: Complete:
[  6.606638] <  0.004017>      device=eth0, addr=192.168.23.197, mask=255.255.0.0, gw=192.168.23.2,
[  6.614588] <  0.007950>      host=192.168.23.197, domain=, nis-domain=(none),
[  6.618652] <  0.004064>      bootserver=192.168.23.2, rootserver=192.168.23.2, rootpath=
[  6.630579] <  0.011927> UBIFS: recovery needed
[  6.662655] <  0.032076> UBIFS: recovery completed
[  6.666587] <  0.003932> UBIFS: mounted UBI device 0, volume 1, name "root"
[  6.670570] <  0.003983> UBIFS: file system size:   58490880 bytes (57120 KiB, 55 MiB, 3808 LEBs)
[  6.678572] <  0.008002> UBIFS: journal size:       7741440 bytes (7560 KiB, 7 MiB, 504 LEBs)
[  6.682573] <  0.004001> UBIFS: media format:       w4/r0 (latest is w4/r0)
[  6.686572] <  0.003999> UBIFS: default compressor: lzo
[  6.690562] <  0.003990> UBIFS: reserved for root:  0 bytes (0 KiB)
[  6.694599] <  0.004037> VFS: Mounted root (ubifs filesystem) on device 0:12.
[  6.702568] <  0.007969> Freeing init memory: 100K
init started: BusyBox v1.13.4 (2009-08-06 08:30:14 CEST)
[  7.050625] <  0.187504> mounting filesystems...done.
[  7.078608] <  0.027983> running rc.d services...
[  7.137924] <  0.059316> starting udev
[  7.147925] <  0.010001> mounting tmpfs at /dev
[  7.182299] <  0.034374> creating static nodes
[  7.410613] <  0.228314> starting udevd...done
[  8.811097] <  1.400484> waiting for devices...done
[  8.918710] <  0.107613> syslogd starting
[  9.050585] <  0.131875> tweaking ondemand scaling governor
[ 10.010600] <  0.960015> Starting system message bus: dbus.
[ 10.118607] <  0.108007> /etc/rc.d/S16openssh: .: line 11: can't open /lib/init/initmethod-bbinit-functions.sh
[ 10.122561] <  0.003954> run-parts: /etc/rc.d/S16openssh exited with code 2
[ 10.246641] <  0.124080> Starting telnetd...
[ 10.442761] <  0.196120> sound not supported, skipping mixer state
[ 10.756354] <  0.313593> sound:  restoring
[ 10.940567] <  0.184213> alsactl: load_state:1608: No soundcards found...
[ 11.046578] <  0.106011> starting network interfaces...
[ 11.370586] <  0.324008> ALSA lib confmisc.c:768:(parse_card) cannot find card '0'
[ 11.377491] <  0.006905> ALSA lib conf.c:3513:(_snd_config_evaluate) function snd_func_card_driver returned error: No such file or directory
[ 11.384742] <  0.007251> ALSA lib confmisc.c:392:(snd_func_concat) error evaluating strings
[ 11.392825] <  0.008083> ALSA lib conf.c:3513:(_snd_config_evaluate) function snd_func_concat returned error: No such file or directory
[ 11.398567] <  0.005742> ALSA lib confmisc.c:1251:(snd_func_refer) error evaluating name
[ 11.410599] <  0.012032> ALSA lib conf.c:3513:(_snd_config_evaluate) function snd_func_refer returned error: No such file or directory
[ 11.418625] <  0.008026> ALSA lib conf.c:3985:(snd_config_expand) Evaluate error: No such file or directory
[ 11.422605] <  0.003980> ALSA lib pcm.c:2202:(snd_pcm_open_noupdate) Unknown PCM default
[ 11.428764] <  0.006159> aplay: main:590: audio open error: No such file or directory
[ 11.490669] <  0.061905> ip: RTNETLINK answers: File exists
[ 12.150608] <  0.659939> ip: cannot find device "can0"
[ 12.213228] <  0.062620> ip: SIOCGIFFLAGS: No such device
[ 12.314609] <  0.101381> lighttpd: starting
[ 12.767898] <  0.453289> lighttpd: done
[ 12.846591] <  0.078693> pure-ftpd: no /etc/pure-ftpd.defaults found.
[ 12.914630] <  0.068039> /usr/sbin/pure-ftpd
[ 13.035649] <  0.121019> pure-ftpd: starting pure-ftpd: /usr/sbin/pure-ftpd
[ 13.082624] <  0.046975> pure-ftpd: no upload script defined, skipping
[ 13.090595] <  0.007971> done
[ 13.242901] <  0.152306> loading modules
[ 13.291592] <  0.048691>     mx27_camera
[ 13.334611] <  0.043019> FATAL: Module mx27_camera not found.
[ 13.354552] <  0.019941>     pca953x
[ 13.414614] <  0.060062> FATAL: Module pca953x not found.
[ 13.434597] <  0.019983>     plat-ram
[ 13.479436] <  0.044839> FATAL: Module plat_ram not found.
[ 13.522625] <  0.043189>
[ 13.546627] <  0.024002> OSELAS(R)-phyCORE-trunk (PTXdist-1.99.svn/2009-08-06T08:37:25+0200)
[ 13.558613] <  0.011986>
[ 13.690643] <  0.132030>        _            ____ ___  ____  _____
[ 13.690731] <  0.000088>  _ __ | |__  _   _ / ___/ _ \|  _ \| ____|
[ 13.698595] <  0.007864> | '_ \| '_ \| | | | |  | | | | |_) |  _|
[ 13.698654] <  0.000059> | |_) | | | | |_| | |__| |_| |  _ <| |___
[ 13.702581] <  0.003927> | .__/|_| |_|\__, |\____\___/|_| \_\_____|
[ 13.706573] <  0.003992> |_|          |___/
[ 13.706622] <  0.000049>
[ 13.725043] <  0.018421>
[ 14.742608] <  1.017565>
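
A log of this length is easier to judge when the largest gaps are pulled
out automatically. The following is a small sketch, assuming the
"[timestamp] <delta> message" layout shown above; the helper name
slowest_steps is hypothetical. Keep in mind that a delta is printed on
the line that ends a wait, so the time was spent before that message
appeared.

```python
import re

# Matches lines such as "[  5.006609] <  1.928000> UBI: attached mtd7 to ubi0"
LINE_RE = re.compile(r"^\[\s*(\d+\.\d+)\]\s*<\s*(\d+\.\d+)>\s?(.*)$")


def slowest_steps(log_lines, count=5):
    """Return the `count` entries with the largest delta, slowest first.

    Each entry is (delta_seconds, timestamp_seconds, message). Lines
    without a timestamp prefix are skipped.
    """
    entries = []
    for line in log_lines:
        match = LINE_RE.match(line.strip())
        if match:
            timestamp, delta, message = match.groups()
            entries.append((float(delta), float(timestamp), message))
    return sorted(entries, reverse=True)[:count]
```

Calling slowest_steps(open("boot.log")) on a saved copy of the log
(boot.log being a made-up file name) would return the five longest
waits.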

Robert
-- 
Pengutronix e.K.                           |                             |
Industrial Linux Solutions                 | http://www.pengutronix.de/  |
Peiner Str. 6-8, 31137 Hildesheim, Germany | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |


* Re: New fast(?)-boot results on ARM
  2009-08-14 17:02 New fast(?)-boot results on ARM Robert Schwebel
@ 2009-08-14 18:19 ` Zan Lynx
  2009-08-14 18:46   ` Jamie Lokier
  2009-08-14 18:57   ` Robert Schwebel
  2009-08-14 20:04 ` Denys Vlasenko
  2009-08-18 14:06 ` Sascha Hauer
  2 siblings, 2 replies; 43+ messages in thread
From: Zan Lynx @ 2009-08-14 18:19 UTC (permalink / raw)
  To: Robert Schwebel
  Cc: linux-kernel, linux-embedded, Arjan van de Ven, Tim Bird, kernel

Robert Schwebel wrote:

> - 2.4 s up from u-boot to the end of "Uncompressing Linux"
> - 300 ms until ubifs initialization starts
> - 3.7 s for ubifs, until "mounted root"
> 
> So we basically have 7 s for the kernel. The rest is userspace, which hasn't
> seen much optimization yet, other than trying to start the GUI application as
> early as possible, while doing all other init stuff in parallel. Adding "quiet"
> brings us another 300 ms.
> 
> That's factor 70 away from the 110 ms boot time Tim has talked about some days
> ago (and he measured on an ARM cpu which had almost half the speed of this
> one), and I'm wondering what we can do to improve the boot time.

2.4s in uncompression? That seems like an obvious target for improvement.

Your kernel seems awfully large. 3104K code? You should definitely find 
out what is making it that big and cut out everything you do not need. 
You might even try some of the embedded system scripts that rip out all 
the printk strings.

If you get the kernel size way down, then use an uncompressed kernel and
it should boot a lot faster if the bottleneck is CPU speed.

However, it is probably IO speed. There could be something really wrong 
and slow with your MTD. Does it DMA or is it doing something crazy like 
using the CPU to read a byte at a time?
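
One rough way to approach the I/O question is to derive a read rate from
the log in the first message. The sketch below assumes that the delta
printed in front of the "loaded zImage" line is the time spent reading
the image from NAND; that is an interpretation of the log, not something
it states. The function name is hypothetical.

```python
# Figures copied from the boot log in the first message.
ZIMAGE_BYTES = 1679656   # "loaded zImage ... with size 1679656"
LOAD_SECONDS = 0.452062  # delta printed in front of that line


def read_rate_mib_per_s(size_bytes, seconds):
    """Average read rate in MiB/s for `size_bytes` read in `seconds`."""
    return size_bytes / seconds / (1024 * 1024)


rate = read_rate_mib_per_s(ZIMAGE_BYTES, LOAD_SECONDS)
print("NAND read rate while loading zImage: %.2f MiB/s" % rate)
```

Under that assumption the boot loader reads about 3.5 MiB/s. Whether
this is slow for the NAND part and controller in question cannot be
decided from the log alone.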

Or maybe it's cheap and slow flash. In that case I think your only hope
is to make all the code as small as possible and/or find a different 
flash filesystem that does not have to read so much of the device to 
mount. Perhaps use a read-only compressed filesystem for the system 
binaries and reflash it for software upgrades. Only init and mount the 
writable flash for user-storable data well after system boot has finished.
-- 
Zan Lynx
zlynx@acm.org

"Knowledge is Power.  Power Corrupts.  Study Hard.  Be Evil."


* Re: New fast(?)-boot results on ARM
  2009-08-14 18:19 ` Zan Lynx
@ 2009-08-14 18:46   ` Jamie Lokier
  2009-08-14 18:58     ` Robert Schwebel
  2009-08-14 18:57   ` Robert Schwebel
  1 sibling, 1 reply; 43+ messages in thread
From: Jamie Lokier @ 2009-08-14 18:46 UTC (permalink / raw)
  To: Zan Lynx
  Cc: Robert Schwebel, linux-kernel, linux-embedded, Arjan van de Ven,
	Tim Bird, kernel

Zan Lynx wrote:
> Or maybe its cheap and slow flash. In that case I think your only hope 
> is to make all the code as small as possible and/or find a different 
> flash filesystem that does not have to read so much of the device to 
> mount. Perhaps use a read-only compressed filesystem for the system 
> binaries and reflash it for software upgrades. Only init and mount the 
> writable flash for user-storable data well after system boot has finished.

Fwiw, logfs claims to mount quickly, but I haven't heard much about it
in recent months and http://logfs.org/logfs/ implies it's not really
stable yet.  But maybe if you're working on a prototype that doesn't
matter so much.

-- Jamie


* Re: New fast(?)-boot results on ARM
  2009-08-14 18:19 ` Zan Lynx
  2009-08-14 18:46   ` Jamie Lokier
@ 2009-08-14 18:57   ` Robert Schwebel
  2009-08-14 21:01     ` Linus Walleij
  1 sibling, 1 reply; 43+ messages in thread
From: Robert Schwebel @ 2009-08-14 18:57 UTC (permalink / raw)
  To: Zan Lynx; +Cc: linux-kernel, linux-embedded, Arjan van de Ven, Tim Bird, kernel

Zan,

On Fri, Aug 14, 2009 at 12:19:48PM -0600, Zan Lynx wrote:
> > That's factor 70 away from the 110 ms boot time Tim has talked about
> > some days ago (and he measured on an ARM cpu which had almost half
> > the speed of this one), and I'm wondering what we can do to improve
> > the boot time.
>
> 2.4s in uncompression? That seems like an obvious target for
> improvement.

Indeed, we'll check that.

However, I have the impression that most systems which are hyped as
"fast boot" out there are optimized so aggressively that they are not
really usable in real-life applications any more. So we try to
configure the systems in a "realistic" way. I know that we won't get the
last milliseconds that way - but I'd like to find out how far we can go.

> Your kernel seems awfully large. 3104K code? You should definitely find
> out what is making it that big and cut out everything you do not need.

Definitely, will audit again.

> You might even try some of the embedded system scripts that rip out
> all the printk strings.

Hmm, that's definitely in the "last-minute-before-product" category.

> If you get the kernel size way down then use a uncompressed kernel and
> it should boot a lot faster if the bottleneck is CPU speed.

I'll try that.

> However, it is probably IO speed. There could be something really wrong
> and slow with your MTD. Does it DMA or is it doing something crazy like
> using the CPU to read a byte at a time?

Will check.

> Or maybe its cheap and slow flash. In that case I think your only hope
> is to make all the code as small as possible and/or find a different
> flash filesystem that does not have to read so much of the device to
> mount. Perhaps use a read-only compressed filesystem for the system
> binaries and reflash it for software upgrades. Only init and mount the
> writable flash for user-storable data well after system boot has
> finished.

That would also be a last-minute change, but surely worth
evaluating.

We recently changed from jffs2 to ubifs and hoped to gain speed during
that step.

Thanks for your feedback!

rsc
-- 
Pengutronix e.K.                           |                             |
Industrial Linux Solutions                 | http://www.pengutronix.de/  |
Peiner Str. 6-8, 31137 Hildesheim, Germany | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |


* Re: New fast(?)-boot results on ARM
  2009-08-14 18:46   ` Jamie Lokier
@ 2009-08-14 18:58     ` Robert Schwebel
  0 siblings, 0 replies; 43+ messages in thread
From: Robert Schwebel @ 2009-08-14 18:58 UTC (permalink / raw)
  To: Jamie Lokier
  Cc: Zan Lynx, linux-kernel, linux-embedded, Arjan van de Ven,
	Tim Bird, kernel

On Fri, Aug 14, 2009 at 07:46:51PM +0100, Jamie Lokier wrote:
> Zan Lynx wrote:
> > Or maybe its cheap and slow flash. In that case I think your only
> > hope is to make all the code as small as possible and/or find a
> > different flash filesystem that does not have to read so much of the
> > device to mount. Perhaps use a read-only compressed filesystem for
> > the system binaries and reflash it for software upgrades. Only init
> > and mount the writable flash for user-storable data well after
> > system boot has finished.
>
> Fwiw, logfs claims to mount quickly, but I haven't heard much about it
> in recent months and http://logfs.org/logfs/ implies it's not really
> stable yet. But maybe if you're working on a prototype that doesn't
> matter so much.

Is logfs ready for production by now? Last time I checked, it was
still more or less Jörn's pet project, and ubifs seemed much more
mature.

rsc
-- 
Pengutronix e.K.                           |                             |
Industrial Linux Solutions                 | http://www.pengutronix.de/  |
Peiner Str. 6-8, 31137 Hildesheim, Germany | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |


* Re: New fast(?)-boot results on ARM
  2009-08-14 17:02 New fast(?)-boot results on ARM Robert Schwebel
  2009-08-14 18:19 ` Zan Lynx
@ 2009-08-14 20:04 ` Denys Vlasenko
  2009-08-14 20:43   ` Robert Schwebel
  2009-08-15  6:14   ` New fast(?)-boot results on ARM Artem Bityutskiy
  2009-08-18 14:06 ` Sascha Hauer
  2 siblings, 2 replies; 43+ messages in thread
From: Denys Vlasenko @ 2009-08-14 20:04 UTC (permalink / raw)
  To: Robert Schwebel
  Cc: linux-kernel, linux-embedded, Arjan van de Ven, Tim Bird, kernel

On Fri, Aug 14, 2009 at 7:02 PM, Robert
Schwebel<r.schwebel@pengutronix.de> wrote:
> So we basically have 7 s for the kernel. The rest is userspace, which hasn't
> seen much optimization yet, other than trying to start the GUI application as
> early as possible, while doing all other init stuff in parallel. Adding "quiet"
> brings us another 300 ms.
>
> That's factor 70 away from the 110 ms boot time Tim has talked about some days
> ago (and he measured on an ARM cpu which had almost half the speed of this
> one), and I'm wondering what we can do to improve the boot time.
>
> Robert
>
> rsc@thebe:~$ microcom | ptx_ts "U-Boot 2.0.0-rc9"
> [  2.395740] <  2.395740>
> [  2.395860] <  0.000120>
> [  0.000011] <  0.000011> U-Boot 2.0.0-rc9 (Aug  5 2009 - 10:05:58)
> [  0.000059] <  0.000048>
> [  0.003823] <  0.003764> Board: Phytec phyCORE-i.MX27
> [  0.010753] <  0.006930> cfi_probe: cfi_flash base: 0xc0000000 size: 0x02000000
> [  0.018711] <  0.007958> NAND device: Manufacturer ID: 0x20, Chip ID: 0x36 (ST Micro NAND 64MiB 1,8V 8-bit)
> [  0.026592] <  0.007881> imxfb@imxfb0: i.MX Framebuffer driver
> [  0.178655] <  0.152063> dev_protect: currently broken
> [  0.178736] <  0.000081> Using environment in NOR Flash
> [  0.182577] <  0.003841> initialising PLLs
> [  0.367142] <  0.184565> Malloc space: 0xa3f00000 -> 0xa7f00000 (size 64 MB)
> [  0.370568] <  0.003426> Stack space : 0xa3ef8000 -> 0xa3f00000 (size 32 kB)
> [  0.445993] <  0.075425> running /env/bin/init...
> [  0.870592] <  0.424599>
> [  0.874559] <  0.003967> Hit any key to stop autoboot:  0

The boot loader is not fast. Considering its simple task,
it can be made faster.

> [  1.326621] <  0.452062> loaded zImage from /dev/nand0.kernel.bb with size 1679656
> [  2.009996] <  0.683375> Uncompressing Linux............................................................................................................... done, booting the kernel.
> [  2.416999] <  0.407003> Linux version 2.6.31-rc4-g056f82f-dirty (sha@octopus) (gcc version 4.3.2 (OSELAS.Toolchain-1.99.3) ) #1 PREEMPT Thu Aug 6 08:37:19 CEST 2009

Other people already commented on this (kernel is too big)

> [  2.418729] <  0.001730> CPU: ARM926EJ-S [41069264] revision 4 (ARMv5TEJ), cr=00053177
> [  2.423081] <  0.004352> CPU: VIVT data cache, VIVT instruction cache
> [  2.426592] <  0.003511> Machine: phyCORE-i.MX27
...
> [  2.742628] <  0.016050> 0x000000360000-0x000004000000 : "root"
> [  3.058610] <  0.315982> UBI: attaching mtd7 to ubi0
> [  3.062878] <  0.004268> UBI: physical eraseblock size:   16384 bytes (16 KiB)
> [  3.070601] <  0.007723> UBI: logical eraseblock size:    15360 bytes
> [  3.070665] <  0.000064> UBI: smallest flash I/O unit:    512
> [  3.078564] <  0.007899> UBI: VID header offset:          512 (aligned 512)
> [  3.078609] <  0.000045> UBI: data offset:                1024
> [  5.006609] <  1.928000> UBI: attached mtd7 to ubi0
> [  5.013157] <  0.006548> UBI: MTD device name:            "root"

As others commented, ubi looks slow and you probably need to find out why.

> [  5.014566] <  0.001409> UBI: MTD device size:            60 MiB
> [  5.018660] <  0.004094> UBI: number of good PEBs:        3880
> [  5.022585] <  0.003925> UBI: number of bad PEBs:         0
> [  5.026797] <  0.004212> UBI: max. allowed volumes:       89
> [  5.026849] <  0.000052> UBI: wear-leveling threshold:    4096
> [  5.030779] <  0.003930> UBI: number of internal volumes: 1
> [  5.034583] <  0.003804> UBI: number of user volumes:     1
> [  5.046572] <  0.011989> UBI: available PEBs:             0
> [  5.046622] <  0.000050> UBI: total number of reserved PEBs: 3880
> [  5.046657] <  0.000035> UBI: number of PEBs reserved for bad PEB handling: 38
> [  5.050606] <  0.003949> UBI: max/mean erase counter: 2/0
> [  5.050668] <  0.000062> UBI: image sequence number: 0
> [  5.058619] <  0.007951> UBI: background thread "ubi_bgt0d" started, PID 215
> [  5.062620] <  0.004001> oprofile: using timer interrupt.
> [  5.070584] <  0.007964> TCP cubic registered
> [  5.070637] <  0.000053> NET: Registered protocol family 17
> [  5.074624] <  0.003987> RPC: Registered udp transport module.
> [  5.082616] <  0.007992> RPC: Registered tcp transport module.
> [  5.605159] <  0.522543> eth0: config: auto-negotiation on, 100FDX, 100HDX, 10FDX, 10HDX.
> [  6.602621] <  0.997462> IP-Config: Complete:
> [  6.606638] <  0.004017>      device=eth0, addr=192.168.23.197, mask=255.255.0.0, gw=192.168.23.2,
> [  6.614588] <  0.007950>      host=192.168.23.197, domain=, nis-domain=(none),
> [  6.618652] <  0.004064>      bootserver=192.168.23.2, rootserver=192.168.23.2, rootpath=

Well, this ~1 second is not really the kernel's fault, it's DHCP delay.
But do you need to do it at this moment?
You do not seem to be using network filesystems.
You can run a DHCP client in userspace.
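
Whatever the exact cause of the delay, trying this out means booting
once without the ip= parameter and comparing the timestamps. A minimal
sketch of deriving such a command line follows; the function name is
hypothetical, and the network would then have to be brought up from an
init script instead.

```python
def drop_kernel_ip_config(cmdline):
    """Return `cmdline` without any ip=... parameter.

    Parameters are split on whitespace, which is enough for the command
    line shown in the log because its quoted mtdparts value contains no
    spaces.
    """
    kept = [arg for arg in cmdline.split() if not arg.startswith("ip=")]
    return " ".join(kept)
```

Whether this really saves the full second is something only a new
measurement can show.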

> [  6.630579] <  0.011927> UBIFS: recovery needed
> [  6.662655] <  0.032076> UBIFS: recovery completed
> [  6.666587] <  0.003932> UBIFS: mounted UBI device 0, volume 1, name "root"
> [  6.670570] <  0.003983> UBIFS: file system size:   58490880 bytes (57120 KiB, 55 MiB, 3808 LEBs)
> [  6.678572] <  0.008002> UBIFS: journal size:       7741440 bytes (7560 KiB, 7 MiB, 504 LEBs)
> [  6.682573] <  0.004001> UBIFS: media format:       w4/r0 (latest is w4/r0)
> [  6.686572] <  0.003999> UBIFS: default compressor: lzo
> [  6.690562] <  0.003990> UBIFS: reserved for root:  0 bytes (0 KiB)
> [  6.694599] <  0.004037> VFS: Mounted root (ubifs filesystem) on device 0:12.
> [  6.702568] <  0.007969> Freeing init memory: 100K

So, about 4 seconds for kernel init (I subtracted DHCP and boot loader times).

And now userspace takes 7 seconds, mostly because it does not
parallelize the boot process:

> init started: BusyBox v1.13.4 (2009-08-06 08:30:14 CEST)
> [  7.050625] <  0.187504> mounting filesystems...done.
> [  7.078608] <  0.027983> running rc.d services...

these services seem to start one by one:

> [  7.137924] <  0.059316> starting udev
> [  7.147925] <  0.010001> mounting tmpfs at /dev
> [  7.182299] <  0.034374> creating static nodes
> [  7.410613] <  0.228314> starting udevd...done
> [  8.811097] <  1.400484> waiting for devices...done
> [  8.918710] <  0.107613> syslogd starting
> [  9.050585] <  0.131875> tweaking ondemand scaling governor
> [ 10.010600] <  0.960015> Starting system message bus: dbus.
...
> [ 13.546627] <  0.024002> OSELAS(R)-phyCORE-trunk (PTXdist-1.99.svn/2009-08-06T08:37:25+0200)
> [ 13.558613] <  0.011986>
> [ 13.690643] <  0.132030>        _            ____ ___  ____  _____
> [ 13.690731] <  0.000088>  _ __ | |__  _   _ / ___/ _ \|  _ \| ____|
> [ 13.698595] <  0.007864> | '_ \| '_ \| | | | |  | | | | |_) |  _|
> [ 13.698654] <  0.000059> | |_) | | | | |_| | |__| |_| |  _ <| |___
> [ 13.702581] <  0.003927> | .__/|_| |_|\__, |\____\___/|_| \_\_____|
> [ 13.706573] <  0.003992> |_|          |___/
> [ 13.706622] <  0.000049>
> [ 13.725043] <  0.018421>
> [ 14.742608] <  1.017565>
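
To illustrate the point about services being started one by one, here is
a toy sketch comparing a sequential and a parallel start. The service
names and durations are made up, the "services" only sleep, and real
init scripts have ordering dependencies (udev before most other things,
for example) that this sketch ignores.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def start_service(name, duration):
    """Stand-in for one init script; sleeps instead of doing real work."""
    time.sleep(duration)
    return name


def start_all(services, parallel=True):
    """Start every (name, duration) pair and return the elapsed seconds."""
    begin = time.monotonic()
    if parallel:
        with ThreadPoolExecutor(max_workers=len(services)) as pool:
            futures = [pool.submit(start_service, n, d) for n, d in services]
            for future in futures:
                future.result()  # re-raises any failure from the worker
    else:
        for name, duration in services:
            start_service(name, duration)
    return time.monotonic() - begin


# Hypothetical, mutually independent services with made-up durations.
SERVICES = [("syslogd", 0.1), ("dbus", 0.1), ("telnetd", 0.1), ("lighttpd", 0.1)]
```

With these numbers the sequential start takes the sum of all durations,
while the parallel start takes roughly as long as the slowest service.
On a single-core ARM9 the gain only applies to time spent waiting, not
to time spent computing.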

-- 
vda


* Re: New fast(?)-boot results on ARM
  2009-08-14 20:04 ` Denys Vlasenko
@ 2009-08-14 20:43   ` Robert Schwebel
  2009-08-15  5:59     ` Dirk Behme
                       ` (2 more replies)
  2009-08-15  6:14   ` New fast(?)-boot results on ARM Artem Bityutskiy
  1 sibling, 3 replies; 43+ messages in thread
From: Robert Schwebel @ 2009-08-14 20:43 UTC (permalink / raw)
  To: Denys Vlasenko
  Cc: linux-kernel, linux-embedded, Arjan van de Ven, Tim Bird, kernel

On Fri, Aug 14, 2009 at 10:04:57PM +0200, Denys Vlasenko wrote:
> > rsc@thebe:~$ microcom | ptx_ts "U-Boot 2.0.0-rc9"
> > [  2.395740] <  2.395740>
> > [  2.395860] <  0.000120>
> > [  0.000011] <  0.000011> U-Boot 2.0.0-rc9 (Aug  5 2009 - 10:05:58)
> > [  0.000059] <  0.000048>
> > [  0.003823] <  0.003764> Board: Phytec phyCORE-i.MX27
> > [  0.010753] <  0.006930> cfi_probe: cfi_flash base: 0xc0000000 size: 0x02000000
> > [  0.018711] <  0.007958> NAND device: Manufacturer ID: 0x20, Chip ID: 0x36 (ST Micro NAND 64MiB 1,8V 8-bit)
> > [  0.026592] <  0.007881> imxfb@imxfb0: i.MX Framebuffer driver
> > [  0.178655] <  0.152063> dev_protect: currently broken
> > [  0.178736] <  0.000081> Using environment in NOR Flash
> > [  0.182577] <  0.003841> initialising PLLs
> > [  0.367142] <  0.184565> Malloc space: 0xa3f00000 -> 0xa7f00000 (size 64 MB)
> > [  0.370568] <  0.003426> Stack space : 0xa3ef8000 -> 0xa3f00000 (size 32 kB)
> > [  0.445993] <  0.075425> running /env/bin/init...
> > [  0.870592] <  0.424599>
> > [  0.874559] <  0.003967> Hit any key to stop autoboot:  0
>
> boot loader is not fast. considering its simple task, it can be made
> faster.

Yup, will check. Almost 1 s seems really long.

> > [  1.326621] <  0.452062> loaded zImage from /dev/nand0.kernel.bb with size 1679656
> > [  2.009996] <  0.683375> Uncompressing Linux............................................................................................................... done, booting the kernel.
> > [  2.416999] <  0.407003> Linux version 2.6.31-rc4-g056f82f-dirty (sha@octopus) (gcc version 4.3.2 (OSELAS.Toolchain-1.99.3) ) #1 PREEMPT Thu Aug 6 08:37:19 CEST 2009
> 
> Other people already commented on this (kernel is too big)

Not sure (the kernel is already customized for the board), but I'll take
a look again.

> > [  2.418729] <  0.001730> CPU: ARM926EJ-S [41069264] revision 4 (ARMv5TEJ), cr=00053177
> > [  2.423081] <  0.004352> CPU: VIVT data cache, VIVT instruction cache
> > [  2.426592] <  0.003511> Machine: phyCORE-i.MX27
> ...
> > [  2.742628] <  0.016050> 0x000000360000-0x000004000000 : "root"
> > [  3.058610] <  0.315982> UBI: attaching mtd7 to ubi0
> > [  3.062878] <  0.004268> UBI: physical eraseblock size:   16384 bytes (16 KiB)
> > [  3.070601] <  0.007723> UBI: logical eraseblock size:    15360 bytes
> > [  3.070665] <  0.000064> UBI: smallest flash I/O unit:    512
> > [  3.078564] <  0.007899> UBI: VID header offset:          512 (aligned 512)
> > [  3.078609] <  0.000045> UBI: data offset:                1024
> > [  5.006609] <  1.928000> UBI: attached mtd7 to ubi0
> > [  5.013157] <  0.006548> UBI: MTD device name:            "root"
> 
> As others commented, ubi looks slow and you probably need to find out why.

So it seems like our UBI is much slower than usual?

> > [  5.014566] <  0.001409> UBI: MTD device size:            60 MiB
> > [  5.018660] <  0.004094> UBI: number of good PEBs:        3880
> > [  5.022585] <  0.003925> UBI: number of bad PEBs:         0
> > [  5.026797] <  0.004212> UBI: max. allowed volumes:       89
> > [  5.026849] <  0.000052> UBI: wear-leveling threshold:    4096
> > [  5.030779] <  0.003930> UBI: number of internal volumes: 1
> > [  5.034583] <  0.003804> UBI: number of user volumes:     1
> > [  5.046572] <  0.011989> UBI: available PEBs:             0
> > [  5.046622] <  0.000050> UBI: total number of reserved PEBs: 3880
> > [  5.046657] <  0.000035> UBI: number of PEBs reserved for bad PEB handling: 38
> > [  5.050606] <  0.003949> UBI: max/mean erase counter: 2/0
> > [  5.050668] <  0.000062> UBI: image sequence number: 0
> > [  5.058619] <  0.007951> UBI: background thread "ubi_bgt0d" started, PID 215
> > [  5.062620] <  0.004001> oprofile: using timer interrupt.
> > [  5.070584] <  0.007964> TCP cubic registered
> > [  5.070637] <  0.000053> NET: Registered protocol family 17
> > [  5.074624] <  0.003987> RPC: Registered udp transport module.
> > [  5.082616] <  0.007992> RPC: Registered tcp transport module.
> > [  5.605159] <  0.522543> eth0: config: auto-negotiation on, 100FDX, 100HDX, 10FDX, 10HDX.
> > [  6.602621] <  0.997462> IP-Config: Complete:
> > [  6.606638] <  0.004017>      device=eth0, addr=192.168.23.197, mask=255.255.0.0, gw=192.168.23.2,
> > [  6.614588] <  0.007950>      host=192.168.23.197, domain=, nis-domain=(none),
> > [  6.618652] <  0.004064>      bootserver=192.168.23.2, rootserver=192.168.23.2, rootpath=
> 
> Well, this ~1 second is not really kernel's fault, it's DHCP delay.
> But, do you need to do it at this moment?
> You do not seem to be using networking filesystems.
> You can run DHCP client in userspace.

The board has ip autoconfig configured in, because we also use tftp/nfs
boot for development. But it had been disabled on the commandline:

ip=192.168.23.197:192.168.23.2:192.168.23.2:255.255.0.0:::

That shouldn't do dhcp, right?
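
For reference, the fields of that ip= string can be split out as in the
sketch below. The field order follows the kernel's nfsroot documentation
for the ip= parameter; the helper name parse_ip_param is made up for this
illustration. It only shows which fields are filled in, not what
net/ipv4/ipconfig.c then does with them.

```python
# Field order as documented for the kernel's ip= parameter:
# ip=<client-ip>:<server-ip>:<gw-ip>:<netmask>:<hostname>:<device>:<autoconf>
IP_FIELDS = ("client_ip", "server_ip", "gw_ip", "netmask",
             "hostname", "device", "autoconf")


def parse_ip_param(value):
    """Split an ip= value into named fields (hypothetical helper).

    Missing or empty fields are returned as empty strings.
    """
    parts = value.split(":")
    # Pad so that a short string still yields every field name.
    parts += [""] * (len(IP_FIELDS) - len(parts))
    return dict(zip(IP_FIELDS, parts))


if __name__ == "__main__":
    fields = parse_ip_param(
        "192.168.23.197:192.168.23.2:192.168.23.2:255.255.0.0:::")
    for name in IP_FIELDS:
        print(f"{name:10s} = {fields[name]!r}")
```

With the string above, client, server, gateway and netmask are set, while
hostname, device and autoconf are empty.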

> So, about 4 seconds for kernel init (I subtracted DHCP and boot loader times).
> 
> And now userspace takes 7 seconds, mostly because it does not
> parallelize boot process:
> 
> > init started: BusyBox v1.13.4 (2009-08-06 08:30:14 CEST)
> > [  7.050625] <  0.187504> mounting filesystems...done.
> > [  7.078608] <  0.027983> running rc.d services...
> 
> these services seem to start one by one:
> 
> > [  7.137924] <  0.059316> starting udev
> > [  7.147925] <  0.010001> mounting tmpfs at /dev
> > [  7.182299] <  0.034374> creating static nodes
> > [  7.410613] <  0.228314> starting udevd...done
> > [  8.811097] <  1.400484> waiting for devices...done
> > [  8.918710] <  0.107613> syslogd starting
> > [  9.050585] <  0.131875> tweaking ondemand scaling governor
> > [ 10.010600] <  0.960015> Starting system message bus: dbus.
> ...
> > [ 13.546627] <  0.024002> OSELAS(R)-phyCORE-trunk (PTXdist-1.99.svn/2009-08-06T08:37:25+0200)

Well, the application is started in the very beginning, so all the
services should start while the app is already gaining speed.

Thanks,

rsc
-- 
Pengutronix e.K.                           |                             |
Industrial Linux Solutions                 | http://www.pengutronix.de/  |
Peiner Str. 6-8, 31137 Hildesheim, Germany | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: New fast(?)-boot results on ARM
  2009-08-14 18:57   ` Robert Schwebel
@ 2009-08-14 21:01     ` Linus Walleij
  2009-08-14 21:15       ` Robert Schwebel
  2009-08-14 21:35       ` Zan Lynx
  0 siblings, 2 replies; 43+ messages in thread
From: Linus Walleij @ 2009-08-14 21:01 UTC (permalink / raw)
  To: Robert Schwebel
  Cc: Zan Lynx, linux-kernel, linux-embedded, Arjan van de Ven,
	Tim Bird, kernel

2009/8/14 Robert Schwebel <r.schwebel@pengutronix.de>:
> On Fri, Aug 14, 2009 at 12:19:48PM -0600, Zan Lynx wrote:

>> > That's factor 70 away from the 110 ms boot time Tim has talked about
>> > some days ago (and he measured on an ARM cpu which had almost half
>> > the speed of this one), and I'm wondering what we can do to improve
>> > the boot time.
>>
>> 2.4s in uncompression? That seems like an obvious target for
>> improvement.
>
> Indeed, we'll check that.

We got rid of uncompression on a flash-based system vastly improving
boot time. The reason is that compressed kernels are faster only when
the throughput to the persistent storage is lower than the decompression
throughput, and on typical embedded systems with DMA the throughput to
memory outperforms the CPU-based decompression.

Of course it depends on a lot of stuff like performance of flash controller,
kernel storage filesystem performance, DMA controller performance,
cache architecture etc so it's individual per-system.
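
A rough way to check this trade-off for a given board is sketched below.
It is a simplified model, not a measurement: it assumes the compressed
path costs "read small image + decompress" and the uncompressed path
costs "read large image", with nothing overlapping. The load and
decompression times are taken from Robert's boot log; the uncompressed
image size is not given in the thread, so the value used here is only a
placeholder.

```python
def load_times(compressed_bytes, uncompressed_bytes,
               storage_bps, decompress_bps):
    """Return (compressed_path_s, uncompressed_path_s).

    storage_bps    : read throughput from flash, bytes per second
    decompress_bps : decompressor output rate, bytes per second
    """
    compressed_path = (compressed_bytes / storage_bps
                       + uncompressed_bytes / decompress_bps)
    uncompressed_path = uncompressed_bytes / storage_bps
    return compressed_path, uncompressed_path


if __name__ == "__main__":
    # From the boot log: 1679656 bytes loaded in 0.452062 s, then
    # 0.683375 s until "done, booting the kernel".
    # ASSUMPTION: both deltas are dominated by reading / decompressing.
    zimage = 1679656
    storage_bps = zimage / 0.452062
    # ASSUMPTION: placeholder size, not stated anywhere in the thread.
    image = 3_300_000
    decompress_bps = image / 0.683375

    c, u = load_times(zimage, image, storage_bps, decompress_bps)
    print(f"storage read rate : {storage_bps / 1e6:.2f} MB/s")
    print(f"compressed path   : {c:.3f} s")
    print(f"uncompressed path : {u:.3f} s")
```

Which path wins depends entirely on the two throughput figures, so the
numbers should be measured on the target rather than taken from here.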

Linus Walleij

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: New fast(?)-boot results on ARM
  2009-08-14 21:01     ` Linus Walleij
@ 2009-08-14 21:15       ` Robert Schwebel
  2009-08-14 21:35       ` Zan Lynx
  1 sibling, 0 replies; 43+ messages in thread
From: Robert Schwebel @ 2009-08-14 21:15 UTC (permalink / raw)
  To: Linus Walleij
  Cc: Robert Schwebel, Zan Lynx, linux-kernel, linux-embedded,
	Arjan van de Ven, Tim Bird, kernel

On Fri, Aug 14, 2009 at 11:01:58PM +0200, Linus Walleij wrote:
> >> > That's factor 70 away from the 110 ms boot time Tim has talked about
> >> > some days ago (and he measured on an ARM cpu which had almost half
> >> > the speed of this one), and I'm wondering what we can do to improve
> >> > the boot time.
> >>
> >> 2.4s in uncompression? That seems like an obvious target for
> >> improvement.
> >
> > Indeed, we'll check that.
> 
> We got rid of uncompression on a flash-based system vastly improving
> boot time. The reason is that compressed kernels are faster only when
> the throughput to the persistent storage is lower than the decompression
> throughput, and on typical embedded systems with DMA the throughput to
> memory outperforms the CPU-based decompression.
> 
> Of course it depends on a lot of stuff like performance of flash controller,
> kernel storage filesystem performance, DMA controller performance,
> cache architecture etc so it's individual per-system.

We have also done that on NOR based systems, but I'm not sure if it will
work out for NAND as well.

Thanks,

rsc
-- 
Pengutronix e.K.                           |                             |
Industrial Linux Solutions                 | http://www.pengutronix.de/  |
Peiner Str. 6-8, 31137 Hildesheim, Germany | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: New fast(?)-boot results on ARM
  2009-08-14 21:01     ` Linus Walleij
  2009-08-14 21:15       ` Robert Schwebel
@ 2009-08-14 21:35       ` Zan Lynx
  2009-08-15  6:21         ` Artem Bityutskiy
  1 sibling, 1 reply; 43+ messages in thread
From: Zan Lynx @ 2009-08-14 21:35 UTC (permalink / raw)
  To: Linus Walleij
  Cc: Robert Schwebel, linux-kernel, linux-embedded, Arjan van de Ven,
	Tim Bird, kernel

Linus Walleij wrote:
> 2009/8/14 Robert Schwebel <r.schwebel@pengutronix.de>:
>> On Fri, Aug 14, 2009 at 12:19:48PM -0600, Zan Lynx wrote:
> 
>>>> That's factor 70 away from the 110 ms boot time Tim has talked about
>>>> some days ago (and he measured on an ARM cpu which had almost half
>>>> the speed of this one), and I'm wondering what we can do to improve
>>>> the boot time.
>>> 2.4s in uncompression? That seems like an obvious target for
>>> improvement.
>> Indeed, we'll check that.
> 
> We got rid of uncompression on a flash-based system vastly improving
> boot time. The reason is that compressed kernels are faster only when
> the throughput to the persistent storage is lower than the decompression
> throughput, and on typical embedded systems with DMA the throughput to
> memory outperforms the CPU-based decompression.

I thought of another thing to check related to slow decompression. If 
the kernel, bootloader or hardware is in charge of setting CPU power and 
speed scaling, then you should check that it boots with the CPU set at 
maximum speed instead of slowest.

-- 
Zan Lynx
zlynx@acm.org

"Knowledge is Power.  Power Corrupts.  Study Hard.  Be Evil."

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: New fast(?)-boot results on ARM
  2009-08-14 20:43   ` Robert Schwebel
@ 2009-08-15  5:59     ` Dirk Behme
  2009-08-15 10:35     ` Johannes Stezenbach
  2009-08-17 19:15     ` Tim Bird
  2 siblings, 0 replies; 43+ messages in thread
From: Dirk Behme @ 2009-08-15  5:59 UTC (permalink / raw)
  To: Robert Schwebel
  Cc: Denys Vlasenko, linux-kernel, linux-embedded, Arjan van de Ven,
	Tim Bird, kernel

Robert Schwebel wrote:
> On Fri, Aug 14, 2009 at 10:04:57PM +0200, Denys Vlasenko wrote:
>>> rsc@thebe:~$ microcom | ptx_ts "U-Boot 2.0.0-rc9"
>>> [  2.395740] <  2.395740>
>>> [  2.395860] <  0.000120>
>>> [  0.000011] <  0.000011> U-Boot 2.0.0-rc9 (Aug  5 2009 - 10:05:58)
>>> [  0.000059] <  0.000048>
>>> [  0.003823] <  0.003764> Board: Phytec phyCORE-i.MX27
>>> [  0.010753] <  0.006930> cfi_probe: cfi_flash base: 0xc0000000 size: 0x02000000
>>> [  0.018711] <  0.007958> NAND device: Manufacturer ID: 0x20, Chip ID: 0x36 (ST Micro NAND 64MiB 1,8V 8-bit)
>>> [  0.026592] <  0.007881> imxfb@imxfb0: i.MX Framebuffer driver
>>> [  0.178655] <  0.152063> dev_protect: currently broken
>>> [  0.178736] <  0.000081> Using environment in NOR Flash
>>> [  0.182577] <  0.003841> initialising PLLs
>>> [  0.367142] <  0.184565> Malloc space: 0xa3f00000 -> 0xa7f00000 (size 64 MB)
>>> [  0.370568] <  0.003426> Stack space : 0xa3ef8000 -> 0xa3f00000 (size 32 kB)
>>> [  0.445993] <  0.075425> running /env/bin/init...
>>> [  0.870592] <  0.424599>
>>> [  0.874559] <  0.003967> Hit any key to stop autoboot:  0
>> boot loader is not fast. considering its simple task, it can be made
>> faster.
> 
> Yup, will check. Almost 1 s seems really long.

Some things to check regarding this and kernel uncompression (copy):

- How often is (compressed/uncompressed) kernel data copied? Once the 
compressed one from storage (NOR/NAND?) to RAM by boot loader? Then by 
kernel's uncompression from RAM to its final location in RAM?

- For boot loader and uncompression, is D-Cache enabled?

- Is data (image) copy done by optimized functions? Using (a) DMA or 
at least (b) some optimized memcpy using ARM's ldmia/stmia?

Best regards

Dirk

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: New fast(?)-boot results on ARM
  2009-08-14 20:04 ` Denys Vlasenko
  2009-08-14 20:43   ` Robert Schwebel
@ 2009-08-15  6:14   ` Artem Bityutskiy
  1 sibling, 0 replies; 43+ messages in thread
From: Artem Bityutskiy @ 2009-08-15  6:14 UTC (permalink / raw)
  To: Denys Vlasenko
  Cc: Robert Schwebel, linux-kernel, linux-embedded, Arjan van de Ven,
	Tim Bird, kernel

On 08/14/2009 11:04 PM, Denys Vlasenko wrote:
>> [  2.742628]<    0.016050>  0x000000360000-0x000004000000 : "root"
>> [  3.058610]<    0.315982>  UBI: attaching mtd7 to ubi0
>> [  3.062878]<    0.004268>  UBI: physical eraseblock size:   16384 bytes (16 KiB)
>> [  3.070601]<    0.007723>  UBI: logical eraseblock size:    15360 bytes
>> [  3.070665]<    0.000064>  UBI: smallest flash I/O unit:    512
>> [  3.078564]<    0.007899>  UBI: VID header offset:          512 (aligned 512)
>> [  3.078609]<    0.000045>  UBI: data offset:                1024
>> [  5.006609]<    1.928000>  UBI: attached mtd7 to ubi0
>> [  5.013157]<    0.006548>  UBI: MTD device name:            "root"
>
> As others commented, ubi looks slow and you probably need to find out why.

Right. UBI is rather slow in attaching MTD devices. Everything is explained
here:

http://www.linux-mtd.infradead.org/doc/ubi.html#L_scalability
http://www.linux-mtd.infradead.org/doc/ubifs.html#L_scalability

There is not very much you can do to speed it up but implement UBI2. UBIFS
would stay intact. There were discussions about this and it does not seem
to be impossibly difficult to do UBI2. Here are a few ideas:

http://www.linux-mtd.infradead.org/faq/ubi.html#L_attach_faster
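
To get a feel for the numbers, the sketch below derives an average
per-PEB attach cost from Robert's log (attach started at 3.058610 s,
finished at 5.006609 s, 3880 good PEBs) and extrapolates it. It assumes
attach time grows linearly with the number of PEBs; the larger device in
the example is hypothetical.

```python
def attach_time_per_peb(attach_seconds, peb_count):
    """Average attach cost per physical eraseblock, in seconds."""
    return attach_seconds / peb_count


def estimate_attach_time(per_peb_seconds, peb_count):
    """Linear extrapolation to a device with peb_count eraseblocks."""
    return per_peb_seconds * peb_count


if __name__ == "__main__":
    # Timestamps of "UBI: attaching mtd7" and "UBI: attached mtd7".
    attach_seconds = 5.006609 - 3.058610
    per_peb = attach_time_per_peb(attach_seconds, 3880)
    print(f"per-PEB cost: {per_peb * 1e3:.3f} ms")

    # ASSUMPTION: hypothetical device with four times as many PEBs
    # and the same per-PEB cost.
    print(f"4x PEBs     : {estimate_attach_time(per_peb, 4 * 3880):.2f} s")
```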

-- 
Best Regards,
Artem Bityutskiy (Артём Битюцкий)

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: New fast(?)-boot results on ARM
  2009-08-14 21:35       ` Zan Lynx
@ 2009-08-15  6:21         ` Artem Bityutskiy
  0 siblings, 0 replies; 43+ messages in thread
From: Artem Bityutskiy @ 2009-08-15  6:21 UTC (permalink / raw)
  To: Zan Lynx
  Cc: Linus Walleij, Robert Schwebel, linux-kernel, linux-embedded,
	Arjan van de Ven, Tim Bird, kernel

On 08/15/2009 12:35 AM, Zan Lynx wrote:
> Linus Walleij wrote:
>> 2009/8/14 Robert Schwebel <r.schwebel@pengutronix.de>:
>>> On Fri, Aug 14, 2009 at 12:19:48PM -0600, Zan Lynx wrote:
>>
>>>>> That's factor 70 away from the 110 ms boot time Tim has talked about
>>>>> some days ago (and he measured on an ARM cpu which had almost half
>>>>> the speed of this one), and I'm wondering what we can do to improve
>>>>> the boot time.
>>>> 2.4s in uncompression? That seems like an obvious target for
>>>> improvement.
>>> Indeed, we'll check that.
>>
>> We got rid of uncompression on a flash-based system vastly improving
>> boot time. The reason is that compressed kernels are faster only when
>> the throughput to the persistent storage is lower than the decompression
>> throughput, and on typical embedded systems with DMA the throughput to
>> memory outperforms the CPU-based decompression.
>
> I thought of another thing to check related to slow decompression. If
> the kernel, bootloader or hardware is in charge of setting CPU power and
> speed scaling, then you should check that it boots with the CPU set at
> maximum speed instead of slowest.

zlib is slow on decompression, and lzo is much faster. So if you implement
lzo compression, you'll probably speed things up a little as well. I saw
some discussions about this on lkml. Having no compression at all may also
be a good thing to try.

-- 
Best Regards,
Artem Bityutskiy (Артём Битюцкий)

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: New fast(?)-boot results on ARM
  2009-08-14 20:43   ` Robert Schwebel
  2009-08-15  5:59     ` Dirk Behme
@ 2009-08-15 10:35     ` Johannes Stezenbach
  2009-08-18 10:06       ` Marco Stornelli
  2009-09-04 16:16       ` Wolfram Sang
  2009-08-17 19:15     ` Tim Bird
  2 siblings, 2 replies; 43+ messages in thread
From: Johannes Stezenbach @ 2009-08-15 10:35 UTC (permalink / raw)
  To: Robert Schwebel
  Cc: Denys Vlasenko, linux-kernel, linux-embedded, Arjan van de Ven,
	Tim Bird, kernel

On Fri, Aug 14, 2009 at 10:43:05PM +0200, Robert Schwebel wrote:
> On Fri, Aug 14, 2009 at 10:04:57PM +0200, Denys Vlasenko wrote:
> > > rsc@thebe:~$ microcom | ptx_ts "U-Boot 2.0.0-rc9"

Now that microcom is in Debian sid (thanks!), where can I find ptx_ts?
It seems to be quite useful.
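
Until the real script turns up, a filter with similar behaviour can be
sketched in a few lines. This is not ptx_ts itself, only an
approximation of what Robert described: it prefixes every line with the
elapsed time and the delta to the previous line, and restarts the clock
at the first line matching a regular expression. The function name and
the injectable clock parameter are choices made for this example.

```python
import re
import sys
import time


def stamp_lines(lines, start_pattern, clock=time.monotonic):
    """Yield each line prefixed with '[elapsed] <delta>' in seconds.

    The elapsed counter restarts at zero on the first line that
    matches start_pattern.
    """
    start_re = re.compile(start_pattern)
    origin = last = clock()
    started = False
    for line in lines:
        now = clock()
        if not started and start_re.search(line):
            # Restart the clock at the trigger line.
            origin = last = now
            started = True
        text = line.rstrip("\r\n")
        yield f"[{now - origin:10.6f}] <{now - last:10.6f}> {text}"
        last = now


# Only runs as a filter when a pattern is given on the command line.
if __name__ == "__main__" and len(sys.argv) > 1:
    for stamped in stamp_lines(sys.stdin, sys.argv[1]):
        print(stamped, flush=True)
```

It would be used like the original, e.g. piping the serial console
program into it with "U-Boot 2.0.0-rc9" as the pattern. The timestamps
are taken on the host when a line arrives, so buffering in the pipe adds
to the measured deltas.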


> > > [  0.874559] <  0.003967> Hit any key to stop autoboot:  0
> >
> > boot loader is not fast. considering its simple task, it can be made
> > faster.
> 
> Yup, will check. Almost 1 s seems really long.


I'm working on a SoC with a 200MHz ARM926EJ-S.  We managed to get
to 1.5sec from power-on to "starting init". The main difference to
your platform seems to be that we use NOR flash.  The kernel is
not optimized, it still has some debug options turned on and
is used during development. (however, the 1.5sec is with "quiet")
The root fs is cramfs. The kernel version is 2.6.20.

For u-boot we enabled the D-cache which gave a decent speed up
(on ARM926EJ-S this requires one to set up page tables and enable
MMU, but it's not that difficult). I don't have the numbers here
but I think it still takes ~300ms in u-boot, and ~1.2s for the kernel boot.


> > > [  1.326621] <  0.452062> loaded zImage from /dev/nand0.kernel.bb with size 1679656
> > > [  2.009996] <  0.683375> Uncompressing Linux............................................................................................................... done, booting the kernel.
> > > [  2.416999] <  0.407003> Linux version 2.6.31-rc4-g056f82f-dirty (sha@octopus) (gcc version 4.3.2 (OSELAS.Toolchain-1.99.3) ) #1 PREEMPT Thu Aug 6 08:37:19 CEST 2009
> > 
> > Other people already commented on this (kernel is too big)
> 
> Not sure (the kernel is already customized for the board), but I'll take
> a look again.

We are booting an uncompressed kernel (~2.8MB).  Uncompressing (running the uncompressor
XIP in NOR flash) took ~0.5s longer than copying 2.8MB from flash to RAM.
BTW, we are using uImage and set verify=no in u-boot. We use u-boot-1.3.0.


> > > [  5.082616] <  0.007992> RPC: Registered tcp transport module.
> > > [  5.605159] <  0.522543> eth0: config: auto-negotiation on, 100FDX, 100HDX, 10FDX, 10HDX.

What is happening here? Waiting for eth link negotiation?

> > > [  6.602621] <  0.997462> IP-Config: Complete:
> > > [  6.606638] <  0.004017>      device=eth0, addr=192.168.23.197, mask=255.255.0.0, gw=192.168.23.2,
> > > [  6.614588] <  0.007950>      host=192.168.23.197, domain=, nis-domain=(none),
> > > [  6.618652] <  0.004064>      bootserver=192.168.23.2, rootserver=192.168.23.2, rootpath=
> > 
> > Well, this ~1 second is not really kernel's fault, it's DHCP delay.
> > But, do you need to do it at this moment?
> > You do not seem to be using networking filesystems.
> > You can run DHCP client in userspace.
> 
> The board has ip autoconfig configured in, because we also use tftp/nfs
> boot for development. But it had been disabled on the commandline:
> 
> ip=192.168.23.197:192.168.23.2:192.168.23.2:255.255.0.0:::
> 
> That shouldn't do dhcp, right?

Try to boot with eth cable unplugged, see if it hangs in IP-config.
If it were doing static configuration it would be faster.

However, unless you need ethernet to boot (NFS root) I'd suggest
doing eth config in userspace.


> > > [  7.137924] <  0.059316> starting udev
> > > [  7.147925] <  0.010001> mounting tmpfs at /dev
> > > [  7.182299] <  0.034374> creating static nodes
> > > [  7.410613] <  0.228314> starting udevd...done
> > > [  8.811097] <  1.400484> waiting for devices...done

And suddenly devtmpfs sounds like a good idea ;-)

We use static device nodes during boot, and later
set up busybox mdev for hotplug.


Johannes

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: New fast(?)-boot results on ARM
  2009-08-14 20:43   ` Robert Schwebel
  2009-08-15  5:59     ` Dirk Behme
  2009-08-15 10:35     ` Johannes Stezenbach
@ 2009-08-17 19:15     ` Tim Bird
  2009-08-17 22:35       ` new ipdelay= option for faster netboot (was Re: New fast(?)-boot results on ARM) Tim Bird
  2 siblings, 1 reply; 43+ messages in thread
From: Tim Bird @ 2009-08-17 19:15 UTC (permalink / raw)
  To: Robert Schwebel
  Cc: Denys Vlasenko, linux-kernel, linux-embedded, Arjan van de Ven, kernel

Robert Schwebel wrote:
> On Fri, Aug 14, 2009 at 10:04:57PM +0200, Denys Vlasenko wrote:
>>> [  5.082616] <  0.007992> RPC: Registered tcp transport module.
>>> [  5.605159] <  0.522543> eth0: config: auto-negotiation on, 100FDX, 100HDX, 10FDX, 10HDX.
>>> [  6.602621] <  0.997462> IP-Config: Complete:
>>> [  6.606638] <  0.004017>      device=eth0, addr=192.168.23.197, mask=255.255.0.0, gw=192.168.23.2,
>>> [  6.614588] <  0.007950>      host=192.168.23.197, domain=, nis-domain=(none),
>>> [  6.618652] <  0.004064>      bootserver=192.168.23.2, rootserver=192.168.23.2, rootpath=
>> Well, this ~1 second is not really kernel's fault, it's DHCP delay.
>> But, do you need to do it at this moment?
>> You do not seem to be using networking filesystems.
>> You can run DHCP client in userspace.
> 
> The board has ip autoconfig configured in, because we also use tftp/nfs
> boot for development. But it had been disabled on the commandline:
> 
> ip=192.168.23.197:192.168.23.2:192.168.23.2:255.255.0.0:::
> 
> That shouldn't do dhcp, right?

I think it doesn't, but I'm not positive.  The DHCP transmissions
themselves don't take very long.  There are some very long timeouts
in the network code paths, which appear to be used whether you specify
a static address or not.

See the definitions of CONF_PRE_OPEN and CONF_POST_OPEN
in net/ipv4/ipconfig.c

They are set to ridiculously long values.  In my experience,
you can cut them down considerably with no dangerous side
effects (but I haven't asked the network guys about the
possible downsides).

Here's a patch which I've used in the past.  (Sorry
if it doesn't apply cleanly, I just extracted it from
a PDF and the whitespace may have gotten messed up.
It's short enough that you can hand-edit the files if
there's a problem.)

I'd like to hear back, if you apply this, whether it shortens
the network startup time for you.

diff --git a/net/ipv4/ipconfig.c b/net/ipv4/ipconfig.c
index 42065ff..e42d83f 100644
--- a/net/ipv4/ipconfig.c
+++ b/net/ipv4/ipconfig.c
@@ -86,8 +86,10 @@
 #endif

 /* Define the friendly delay before and after opening net devices */
-#define CONF_PRE_OPEN 500 /* Before opening: 1/2 second */
-#define CONF_POST_OPEN 1 /* After opening: 1 second */
+/*#define CONF_PRE_OPEN 500 /* Before opening: 1/2 second */
+/*#define CONF_POST_OPEN 1 /* After opening: 1 second */
+#define CONF_PRE_OPEN 5 /* Before opening: 5 milli seconds */
+#define CONF_POST_OPEN 10 /* After opening: 10 milli seconds */

 /* Define the timeout for waiting for a DHCP/BOOTP/RARP reply */
 #define CONF_OPEN_RETRIES 2 /* (Re)open devices twice */
@@ -1292,7 +1294,7 @@ static int __init ip_auto_config(void)
 		return -1;
 	/* Give drivers a chance to settle */
-	ssleep(CONF_POST_OPEN);
+	msleep(CONF_POST_OPEN);

 	/*
 	 * If the config information is insufficient (e.g., our IP address or
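
The effect of the two constants can be worked out directly. The sketch
below adds up the fixed sleeps before and after the patch and compares
them with the two deltas in Robert's log that precede "eth0: config" and
"IP-Config: Complete". That those deltas are dominated by these sleeps
is an assumption, not something the log proves.

```python
def ipconfig_fixed_delay(pre_open_ms, post_open_ms):
    """Total fixed sleep around opening the net devices, in seconds."""
    return (pre_open_ms + post_open_ms) / 1000.0


if __name__ == "__main__":
    # Removed lines: CONF_PRE_OPEN 500 (ms) and ssleep(1), i.e. 1000 ms.
    stock = ipconfig_fixed_delay(500, 1000)
    # Added lines: CONF_PRE_OPEN 5 (ms) and msleep(10).
    patched = ipconfig_fixed_delay(5, 10)
    # ASSUMPTION: these two log deltas correspond to the two sleeps.
    observed = 0.522543 + 0.997462

    print(f"stock    : {stock:.3f} s")
    print(f"patched  : {patched:.3f} s")
    print(f"observed : {observed:.3f} s")
```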

=============================
Tim Bird
Architecture Group Chair, CE Linux Forum
Senior Staff Engineer, Sony Corporation of America
=============================


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* new ipdelay= option for faster netboot (was Re: New fast(?)-boot results on ARM)
  2009-08-17 19:15     ` Tim Bird
@ 2009-08-17 22:35       ` Tim Bird
  2009-08-18  1:03         ` new ipdelay= option for faster netboot David Miller
  0 siblings, 1 reply; 43+ messages in thread
From: Tim Bird @ 2009-08-17 22:35 UTC (permalink / raw)
  To: Robert Schwebel
  Cc: Denys Vlasenko, linux-kernel, linux-embedded, Arjan van de Ven, kernel

Tim Bird wrote:
> See the definitions of CONF_PRE_OPEN and CONF_POST_OPEN
> in net/ipv4/ipconfig.c
> 
> They are set to ridiculously long values.  In my experience,
> you can cut them down considerably with no dangerous side
> effects (but I haven't asked the network guys about the
> possible downsides).

It turns out that others have seen this delay.  Simon
Arlott recently posted a patch to make the delay avoidable
at boot time from the kernel command line.

See http://patchwork.kernel.org/patch/31678/
 -- Tim

=============================
Tim Bird
Architecture Group Chair, CE Linux Forum
Senior Staff Engineer, Sony Corporation of America
=============================


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: new ipdelay= option for faster netboot
  2009-08-17 22:35       ` new ipdelay= option for faster netboot (was Re: New fast(?)-boot results on ARM) Tim Bird
@ 2009-08-18  1:03         ` David Miller
  2009-08-18  1:24           ` Tim Bird
  2009-08-18  1:31           ` Rick Jones
  0 siblings, 2 replies; 43+ messages in thread
From: David Miller @ 2009-08-18  1:03 UTC (permalink / raw)
  To: tim.bird
  Cc: r.schwebel, vda.linux, linux-kernel, linux-embedded, arjan,
	kernel, netdev

From: Tim Bird <tim.bird@am.sony.com>
Date: Mon, 17 Aug 2009 15:35:01 -0700

> Tim Bird wrote:
>> See the definitions of CONF_PRE_OPEN and CONF_POST_OPEN
>> in net/ipv4/ipconfig.c
>> 
>> They are set to ridiculously long values.  In my experience,
>> you can cut them down considerably with no dangerous side
>> effects (but I haven't asked the network guys about the
>> possible downsides).
> 
> It turns out that others have seen this delay.  Simon
> Arlott recently posted a patch to make the delay avoidable
> at boot time from the kernel command line.
> 
> See http://patchwork.kernel.org/patch/31678/

"Ridiculously long" is a relative term.

I have card/switch combinations that take up to 10 seconds to
negotiate a proper link.

So what's there now is actually a quite aggressive setting.

And BTW, discussions about stuff like this belong on
netdev@vger.kernel.org, which has been added to the CC:

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: new ipdelay= option for faster netboot
  2009-08-18  1:03         ` new ipdelay= option for faster netboot David Miller
@ 2009-08-18  1:24           ` Tim Bird
  2009-08-18  1:27             ` David Miller
  2009-08-18  1:31           ` Rick Jones
  1 sibling, 1 reply; 43+ messages in thread
From: Tim Bird @ 2009-08-18  1:24 UTC (permalink / raw)
  To: David Miller
  Cc: r.schwebel, vda.linux, linux-kernel, linux-embedded, arjan,
	kernel, netdev

David Miller wrote:
> From: Tim Bird <tim.bird@am.sony.com>
> Date: Mon, 17 Aug 2009 15:35:01 -0700
> 
>> Tim Bird wrote:
>>> See the definitions of CONF_PRE_OPEN and CONF_POST_OPEN
>>> in net/ipv4/ipconfig.c
>>>
>>> They are set to ridiculously long values.  In my experience,
>>> you can cut them down considerably with no dangerous side
>>> effects (but I haven't asked the network guys about the
>>> possible downsides).
>> It turns out that others have seen this delay.  Simon
>> Arlott recently posted a patch to make the delay avoidable
>> at boot time from the kernel command line.
>>
>> See http://patchwork.kernel.org/patch/31678/
> 
> "Ridiculously long" is a relative term.
No offense intended.  I could have phrased this
better.  The delays were a few orders of
magnitude longer than apparently needed, on my
embedded test systems with ethernet.  I didn't
try eliminating them completely, as in the Arlott patch.

1.5 seconds is a long time for me.  My bootup time budget for
the kernel ranges from 0.5 to 3.0 seconds, depending on the
product.

> I have card/switch combinations that take up to 10 seconds to
> negotiate a proper link.

What types of delays are these timeouts supposed to
cover?  Networking delays or hardware bring-up delays?
(Or both)?  If for networking delays, is this for all
types of networks, or just some (e.g. ones that create
virtual circuits)?

I'm trying to get a sense for whether the card/switch
combinations that would take this long would be encountered
in the types of embedded devices I code for.  (TVs, camcorders,
etc.)

>
> So what's there now is actually a quite aggressive setting.
> 
> And BTW, discussions about stuff like this belong on
> netdev@vger.kernel.org, which has been added to the CC:

I was going to wait to see if this solved Robert's
problem, before widening the discussion.  But I'm happy
to find out more about these delays now.

Thanks,
 -- Tim

=============================
Tim Bird
Architecture Group Chair, CE Linux Forum
Senior Staff Engineer, Sony Corporation of America
=============================


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: new ipdelay= option for faster netboot
  2009-08-18  1:24           ` Tim Bird
@ 2009-08-18  1:27             ` David Miller
  2009-08-18  1:40               ` Tim Bird
  2009-08-18  4:56               ` Denys Vlasenko
  0 siblings, 2 replies; 43+ messages in thread
From: David Miller @ 2009-08-18  1:27 UTC (permalink / raw)
  To: tim.bird
  Cc: r.schwebel, vda.linux, linux-kernel, linux-embedded, arjan,
	kernel, netdev

From: Tim Bird <tim.bird@am.sony.com>
Date: Mon, 17 Aug 2009 18:24:26 -0700

> David Miller wrote:
>> I have card/switch combinations that take up to 10 seconds to
>> negotiate a proper link.
> 
> What types of delays are these timeouts supposed to
> cover?

The problem is that if you don't first give at least some time for the
link to come up, the remaining time it takes the link to come up will
end up chewing into the actual bootp/dhcp protocol timeouts.  And
that's what we're trying to avoid.

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: new ipdelay= option for faster netboot
  2009-08-18  1:03         ` new ipdelay= option for faster netboot David Miller
  2009-08-18  1:24           ` Tim Bird
@ 2009-08-18  1:31           ` Rick Jones
  2009-08-18  2:45             ` david
  1 sibling, 1 reply; 43+ messages in thread
From: Rick Jones @ 2009-08-18  1:31 UTC (permalink / raw)
  To: David Miller
  Cc: tim.bird, r.schwebel, vda.linux, linux-kernel, linux-embedded,
	arjan, kernel, netdev

David Miller wrote:
> I have card/switch combinations that take up to 10 seconds to
> negotiate a proper link.

Gotta love it when things adhere to specs...

rick jones
has also experienced nic/whatnot combinations that are far from IEEE specs... :(

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: new ipdelay= option for faster netboot
  2009-08-18  1:27             ` David Miller
@ 2009-08-18  1:40               ` Tim Bird
  2009-08-18  1:56                 ` David Miller
  2009-08-19 11:57                 ` Jamie Lokier
  2009-08-18  4:56               ` Denys Vlasenko
  1 sibling, 2 replies; 43+ messages in thread
From: Tim Bird @ 2009-08-18  1:40 UTC (permalink / raw)
  To: David Miller
  Cc: r.schwebel, vda.linux, linux-kernel, linux-embedded, arjan,
	kernel, netdev

David Miller wrote:
> From: Tim Bird <tim.bird@am.sony.com>
> Date: Mon, 17 Aug 2009 18:24:26 -0700
>
>> David Miller wrote:
>>> I have card/switch combinations that take up to 10 seconds to
>>> negotiate a proper link.
>> What types of delays are these timeouts supposed to
>> cover?
>
> The problem is that if you don't first give at least some time for the
> link to come up, the remaining time it takes the link to come up will
> end up chewing into the actual bootp/dhcp protocol timeouts.  And
> that's what we're trying to avoid.

What link?  I'm not that familiar with networking.

Assuming I'm using ethernet, what link needs to come up?
Is this something to do with power propagation to the
physical wire?  Is there some MAC layer negotiation
between the card and the switch?  Is it the time for
the switch to do speed detection?

And, can any of this be more accurately determined
or guessed-at with knowledge of the onboard hardware?
Or is it dependent on external conditions?

Where would be a good place to find out more about
startup delays for networking chips and/or protocols?

Our usual solution is to kick the can down the road
and let user-space initialize anything that takes a long
time,  while we do other stuff like focus the camera or display
the TV picture.  It would be good to learn more about
this.
 -- Tim

=============================
Tim Bird
Architecture Group Chair, CE Linux Forum
Senior Staff Engineer, Sony Corporation of America
=============================


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: new ipdelay= option for faster netboot
  2009-08-18  1:40               ` Tim Bird
@ 2009-08-18  1:56                 ` David Miller
  2009-08-19 11:57                 ` Jamie Lokier
  1 sibling, 0 replies; 43+ messages in thread
From: David Miller @ 2009-08-18  1:56 UTC (permalink / raw)
  To: tim.bird
  Cc: r.schwebel, vda.linux, linux-kernel, linux-embedded, arjan,
	kernel, netdev

From: Tim Bird <tim.bird@am.sony.com>
Date: Mon, 17 Aug 2009 18:40:48 -0700

> David Miller wrote:
>> From: Tim Bird <tim.bird@am.sony.com>
>> Date: Mon, 17 Aug 2009 18:24:26 -0700
>>
>>> David Miller wrote:
>>>> I have card/switch combinations that take up to 10 seconds to
>>>> negotiate a proper link.
>>> What types of delays are these timeouts supposed to
>>> cover?
>>
>> The problem is that if you don't first give at least some time for the
>> link to come up, the remaining time it takes the link to come up will
>> end up chewing into the actual bootp/dhcp protocol timeouts.  And
>> that's what we're trying to avoid.
> 
> What link?  I'm not that familiar with networking.

The speed and duplex settings which are negotiated or forced between
the ethernet card and whatever is at the other end of the cable.

> Assuming I'm using ethernet, what link needs to come up?

All modern ethernet cards do autonegotiation of link parameters
with whatever is at the other end of the ethernet cable.  Cards
created ages ago which only support 10 Mbit/s half-duplex typically
do not support autonegotiation at all.

This autonegotiation works like a protocol where the two link partners
go back and forth trying to figure out the best speed and duplex
settings to use.  There are advertisements of link capabilities and
stuff like that.

It should happen almost instantaneously, but there are millions
upon millions of cruddy parts out there, and some of them take
a long time to go through this negotiation process.

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: new ipdelay= option for faster netboot
  2009-08-18  1:31           ` Rick Jones
@ 2009-08-18  2:45             ` david
  2009-08-18  4:56               ` Willy Tarreau
  0 siblings, 1 reply; 43+ messages in thread
From: david @ 2009-08-18  2:45 UTC (permalink / raw)
  To: Rick Jones
  Cc: David Miller, tim.bird, r.schwebel, vda.linux, linux-kernel,
	linux-embedded, arjan, kernel, netdev

On Mon, 17 Aug 2009, Rick Jones wrote:

> David Miller wrote:
>> I have card/switch combinations that take up to 10 seconds to
>> negotiate a proper link.
>
> Gotta love it when things adhere to specs...

the default on Cisco switches is to wait 30 seconds before fully enabling 
the port so that it can listen for spanning tree broadcasts.

for windows systems this doesn't cause any problems (they take long enough 
to boot), but for a tightly configured linux box it can be fully booted 
long before the switch decides to enable the port (for that matter, I have 
a bare-metal install process that takes about 90 seconds from hitting 
power, from a CD. I could probably speed things up by switching to USB 
boot media)

David Lang

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: new ipdelay= option for faster netboot
  2009-08-18  2:45             ` david
@ 2009-08-18  4:56               ` Willy Tarreau
  0 siblings, 0 replies; 43+ messages in thread
From: Willy Tarreau @ 2009-08-18  4:56 UTC (permalink / raw)
  To: david
  Cc: Rick Jones, David Miller, tim.bird, r.schwebel, vda.linux,
	linux-kernel, linux-embedded, arjan, kernel, netdev

On Mon, Aug 17, 2009 at 07:45:33PM -0700, david@lang.hm wrote:
> On Mon, 17 Aug 2009, Rick Jones wrote:
> 
> >David Miller wrote:
> >>I have card/switch combinations that take up to 10 seconds to
> >>negotiate a proper link.
> >
> >Gotta love it when things adhere to specs...
> 
> the default on Cisco switches is to wait 30 seconds before fully enabling 
> the port so that it can listen for spanning tree broadcasts.

And this causes a lot of trouble in high availability environments,
because the link is up but unusable. So if you're using it as a primary
bond link you can lose connectivity for that time. Fortunately, you can
configure the port in "switchport mode access", "portfast" mode to avoid
this annoying delay.

Willy


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: new ipdelay= option for faster netboot
  2009-08-18  1:27             ` David Miller
  2009-08-18  1:40               ` Tim Bird
@ 2009-08-18  4:56               ` Denys Vlasenko
  2009-08-18  5:00                 ` David Miller
  1 sibling, 1 reply; 43+ messages in thread
From: Denys Vlasenko @ 2009-08-18  4:56 UTC (permalink / raw)
  To: David Miller
  Cc: tim.bird, r.schwebel, linux-kernel, linux-embedded, arjan,
	kernel, netdev

On Tuesday 18 August 2009 03:27, David Miller wrote:
> From: Tim Bird <tim.bird@am.sony.com>
> Date: Mon, 17 Aug 2009 18:24:26 -0700
> 
> > David Miller wrote:
> >> I have card/switch combinations that take up to 10 seconds to
> >> negotiate a proper link.
> > 
> > What types of delays are these timeouts supposed to
> > cover?
> 
> The problem is that if you don't first give at least some time for the
> link to come up, the remaining time it takes the link to come up will
> end up chewing into the actual bootp/dhcp protocol timeouts.  And
> that's what we're trying to avoid.

But in this case, they assign a static IP. They do not use DHCP.
So they pay this time penalty even if they don't use networking
until some time later (or never).

Since DHCP and any other networking activity like TCP connects
accommodate packet loss, things should work even without any delay
in kernel IP config code. The delay will be just shifted to the
moment when first DHCP/TCP/whatever negotiation happens.

If dropping delays altogether sounds too big a change,
then it makes sense at least to allow people to tweak it with
ipdelay=TIME_IN_MS

-- 
vda


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: new ipdelay= option for faster netboot
  2009-08-18  4:56               ` Denys Vlasenko
@ 2009-08-18  5:00                 ` David Miller
  0 siblings, 0 replies; 43+ messages in thread
From: David Miller @ 2009-08-18  5:00 UTC (permalink / raw)
  To: vda.linux
  Cc: tim.bird, r.schwebel, linux-kernel, linux-embedded, arjan,
	kernel, netdev

From: Denys Vlasenko <vda.linux@googlemail.com>
Date: Tue, 18 Aug 2009 06:56:53 +0200

> Since DHCP and any other networking activity like TCP connects
> accommodate packet loss, things should work even without any delay
> in kernel IP config code. The delay will be just shifted to the
> moment when first DHCP/TCP/whatever negotiation happens.

Until the link is up, the packet scheduler just holds onto packets
and queues them up.

When the link comes up, this queue is released and the packets
sent out.

That's why it's beneficial to wait some time until the link
comes up before we start sending stuff out.  Because otherwise
any timeouts used will be inaccurate.

This code could be spiffed up to wait for the link up indication
on devices it cares about.  Feel free to code that up :-)
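
For illustration only, a userspace sketch of that idea: poll a link-state
reader until it reports carrier or a deadline passes. On Linux the carrier
state is exposed in /sys/class/net/<iface>/carrier; the function names, the
50 ms poll interval and the injectable reader are assumptions made for this
sketch, and it is not the in-kernel ipconfig change suggested above.

```python
import time


def read_carrier(iface):
    """Return True if sysfs reports carrier on iface (Linux only)."""
    try:
        with open(f"/sys/class/net/{iface}/carrier") as f:
            return f.read().strip() == "1"
    except OSError:
        # The read fails if the interface is missing or administratively down.
        return False


def wait_for_link(iface, timeout_s, is_up=read_carrier, poll_s=0.05):
    """Poll until is_up(iface) is true.

    Returns the seconds waited, or None if timeout_s elapsed first.
    """
    start = time.monotonic()
    while True:
        if is_up(iface):
            return time.monotonic() - start
        if time.monotonic() - start >= timeout_s:
            return None
        time.sleep(poll_s)


# Example with a fake reader that reports link-up on the third poll.
states = iter([False, False, True])
waited = wait_for_link("eth0", 1.0, is_up=lambda _iface: next(states), poll_s=0.001)
print("link up" if waited is not None else "timed out")
```

Compared with a fixed delay, this returns as soon as the link is reported up
and only pays the full timeout on hardware that negotiates slowly.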

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: New fast(?)-boot results on ARM
  2009-08-15 10:35     ` Johannes Stezenbach
@ 2009-08-18 10:06       ` Marco Stornelli
  2009-08-18 10:21         ` Robert Schwebel
  2009-09-04 16:16       ` Wolfram Sang
  1 sibling, 1 reply; 43+ messages in thread
From: Marco Stornelli @ 2009-08-18 10:06 UTC (permalink / raw)
  To: Robert Schwebel
  Cc: Johannes Stezenbach, Denys Vlasenko, linux-kernel,
	linux-embedded, Arjan van de Ven, Tim Bird, kernel

Johannes Stezenbach wrote:
> On Fri, Aug 14, 2009 at 10:43:05PM +0200, Robert Schwebel wrote:
>> On Fri, Aug 14, 2009 at 10:04:57PM +0200, Denys Vlasenko wrote:
>>>> rsc@thebe:~$ microcom | ptx_ts "U-Boot 2.0.0-rc9"
> 
>>>> [  7.137924] <  0.059316> starting udev
>>>> [  7.147925] <  0.010001> mounting tmpfs at /dev
>>>> [  7.182299] <  0.034374> creating static nodes
>>>> [  7.410613] <  0.228314> starting udevd...done
>>>> [  8.811097] <  1.400484> waiting for devices...done
> 
> And suddenly devtmpfs sounds like a good idea ;-)
> 
> We use static device nodes during boot, and later
> setup busybox mdev for hotplug.
> 
> 
> Johannes
> 

Yeah, I agree. Do you really need udevd and device file creation in /dev
at every start-up? Usually static device nodes and mdev for hotplug are
enough, or at least you could use a simple script to create the device
files only once (mdev -s). About the fs, do you really need a
rootfs with ubifs? I mean, you could "split" your fs. You could use a
read-only fs (SquashFS for example) for your root-fs, ubifs for
permanent storage data (mounted under /data for example) and a ram fs
for volatile data.

Marco

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: New fast(?)-boot results on ARM
  2009-08-18 10:06       ` Marco Stornelli
@ 2009-08-18 10:21         ` Robert Schwebel
  2009-08-18 10:34           ` Alex Riesen
  0 siblings, 1 reply; 43+ messages in thread
From: Robert Schwebel @ 2009-08-18 10:21 UTC (permalink / raw)
  To: Marco Stornelli
  Cc: Johannes Stezenbach, Denys Vlasenko, linux-kernel,
	linux-embedded, Arjan van de Ven, Tim Bird, kernel

Marco,

On Tue, Aug 18, 2009 at 12:06:48PM +0200, Marco Stornelli wrote:
> Yeah, I agree. Do you really need udevd and device file creation in /dev
> at every start-up? Usually static device nodes and mdev for hotplug are
> enough, or at least you could use a simple script to create the device
> files only once (mdev -s). About the fs, do you really need a
> rootfs with ubifs? I mean, you could "split" your fs. You could use a
> read-only fs (SquashFS for example) for your root-fs, ubifs for
> permanent storage data (mounted under /data for example) and a ram fs
> for volatile data.

Well, we try to find out what is possible with a fast booting Linux
system which *still* is as "vanilla" as possible.

All the "boot-in-one-second" systems out there are highly squeezed,
which is surely good if you have a scenario with high production
volumes. You can then do the optimization in the last steps, and it
doesn't really matter how much time you spend on testing to get from
a system that works for a developer to a production system.

For most of our use cases here at Pengutronix, we see that:

- Customers want in-system upgradability on a per-package basis; so the
  flash filesystems should be normally r/o, but may be remounted r/w.

- Development systems should be close to production systems, in order to
  be able to have more "early testing"; so things like printk-ripout or
  special non-mainline patches/tweaks should be avoided as far as
  possible.

- In general we want to have our systems close to what the mainline
  does; Automation & Embedded is only a small market, and anything
  which is *not* specific to these markets but mainline is good.

So let's see what we'll reach while trying what people have suggested.

Thanks,
rsc
-- 
Pengutronix e.K.                           |                             |
Industrial Linux Solutions                 | http://www.pengutronix.de/  |
Peiner Str. 6-8, 31137 Hildesheim, Germany | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: New fast(?)-boot results on ARM
  2009-08-18 10:21         ` Robert Schwebel
@ 2009-08-18 10:34           ` Alex Riesen
  2009-08-18 10:44             ` Robert Schwebel
  0 siblings, 1 reply; 43+ messages in thread
From: Alex Riesen @ 2009-08-18 10:34 UTC (permalink / raw)
  To: Robert Schwebel
  Cc: Marco Stornelli, Johannes Stezenbach, Denys Vlasenko,
	linux-kernel, linux-embedded, Arjan van de Ven, Tim Bird, kernel

On Tue, Aug 18, 2009 at 12:21, Robert Schwebel<r.schwebel@pengutronix.de> wrote:
> - In general we want to have our systems close to what the mainline
>  does; Automation & Embedded is only a small market, and anything
>  which is *not* specific to these markets but mainline is good.

BTW, what is your mainline (or it looks like you mean "upstream")?

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: New fast(?)-boot results on ARM
  2009-08-18 10:34           ` Alex Riesen
@ 2009-08-18 10:44             ` Robert Schwebel
  2009-08-18 10:48               ` Alex Riesen
  0 siblings, 1 reply; 43+ messages in thread
From: Robert Schwebel @ 2009-08-18 10:44 UTC (permalink / raw)
  To: Alex Riesen
  Cc: Marco Stornelli, Johannes Stezenbach, Denys Vlasenko,
	linux-kernel, linux-embedded, Arjan van de Ven, Tim Bird, kernel

On Tue, Aug 18, 2009 at 12:34:52PM +0200, Alex Riesen wrote:
> On Tue, Aug 18, 2009 at 12:21, Robert Schwebel<r.schwebel@pengutronix.de> wrote:
> > - In general we want to have our systems close to what the mainline
> >  does; Automation & Embedded is only a small market, and anything
> >  which is *not* specific to these markets but mainline is good.
>
> BTW, what is your mainline (or it looks like you mean "upstream")?

unpatched kernel.org

rsc
-- 
Pengutronix e.K.                           |                             |
Industrial Linux Solutions                 | http://www.pengutronix.de/  |
Peiner Str. 6-8, 31137 Hildesheim, Germany | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: New fast(?)-boot results on ARM
  2009-08-18 10:44             ` Robert Schwebel
@ 2009-08-18 10:48               ` Alex Riesen
  2009-08-18 10:53                 ` Robert Schwebel
  0 siblings, 1 reply; 43+ messages in thread
From: Alex Riesen @ 2009-08-18 10:48 UTC (permalink / raw)
  To: Robert Schwebel
  Cc: Marco Stornelli, Johannes Stezenbach, Denys Vlasenko,
	linux-kernel, linux-embedded, Arjan van de Ven, Tim Bird, kernel

On Tue, Aug 18, 2009 at 12:44, Robert Schwebel<r.schwebel@pengutronix.de> wrote:
> On Tue, Aug 18, 2009 at 12:34:52PM +0200, Alex Riesen wrote:
>> On Tue, Aug 18, 2009 at 12:21, Robert Schwebel<r.schwebel@pengutronix.de> wrote:
>> > - In general we want to have our systems close to what the mainline
>> >  does; Automation & Embedded is only a small market, and anything
>> >  which is *not* specific to these markets but mainline is good.
>>
>> BTW, what is your mainline (or it looks like you mean "upstream")?
>
> unpatched kernel.org
>

But many of the problems you described and the suggested solutions
point at userspace, right? (like a pre-defined static /dev, an mdev script,
or the use of a specially designed rootfs)

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: New fast(?)-boot results on ARM
  2009-08-18 10:48               ` Alex Riesen
@ 2009-08-18 10:53                 ` Robert Schwebel
  0 siblings, 0 replies; 43+ messages in thread
From: Robert Schwebel @ 2009-08-18 10:53 UTC (permalink / raw)
  To: Alex Riesen
  Cc: Marco Stornelli, Johannes Stezenbach, Denys Vlasenko,
	linux-kernel, linux-embedded, Arjan van de Ven, Tim Bird, kernel

On Tue, Aug 18, 2009 at 12:48:50PM +0200, Alex Riesen wrote:
> But many of the problems you described and suggested solutions
> point at userspace, right? (like pre-defined static /dev, mdev script,
> or using of specially designed rootfs)

Yes, right. But even there, mdev is more in the "embedded special"
league than udev, for example, and highly specialized read-only root
filesystems as well.

rsc 
-- 
Pengutronix e.K.                           |                             |
Industrial Linux Solutions                 | http://www.pengutronix.de/  |
Peiner Str. 6-8, 31137 Hildesheim, Germany | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: New fast(?)-boot results on ARM
  2009-08-14 17:02 New fast(?)-boot results on ARM Robert Schwebel
  2009-08-14 18:19 ` Zan Lynx
  2009-08-14 20:04 ` Denys Vlasenko
@ 2009-08-18 14:06 ` Sascha Hauer
  2009-08-18 15:31   ` Dirk Behme
  2 siblings, 1 reply; 43+ messages in thread
From: Sascha Hauer @ 2009-08-18 14:06 UTC (permalink / raw)
  To: Robert Schwebel
  Cc: linux-kernel, linux-embedded, Arjan van de Ven, Tim Bird, kernel

On Fri, Aug 14, 2009 at 07:02:28PM +0200, Robert Schwebel wrote:
> Hi,
> 
> On Thu, Aug 13, 2009 at 05:33:26PM +0200, Robert Schwebel wrote:
> > On Thu, Aug 13, 2009 at 08:28:26AM -0700, Arjan van de Ven wrote:
> > > > That's bad :-) So there is no room for improvement any more in our
> > > > ARM boot sequences ...
> > >
> > > on x86 we're doing pretty well ;-)
> >
> > On i.MX27 (400 MHz ARM926EJ-S) we currently need 7 s, measured from
> > power-on through the kernel up to "starting init". This is with
> >
> > - no delay in u-boot-v2
> > - rootfs on NAND (UBIFS)
> > - quiet
> > - precalculated loops-per-jiffy
> > - zImage kernel instead of uImage
> 
> Here's a little video of our demo system booting:
> http://www.youtube.com/watch?v=xDbUnNsj0cI
> 
> As you can see there, it needs about 15 s from the release of the reset button
> up to the moment where the application shows its Qt 4.5.2 based GUI (which is
> when we fade over from the initial framebuffer to the final one, in order to
> hide the Qt application startup noise).
> 
> And below is the boot log (after turning "quiet" off again). The numbers are
> the timestamp and the delta to the last timestamp, measured on the controlling
> PC by looking at the serial console output. The ptx_ts script starts when the
> regexp was found, so the numbers start basically in the moment when u-boot-v2
> has initialized the system up to the point where we can see something.
> 
> Result:
> 
> - 2.4 s up from u-boot to the end of "Uncompressing Linux"
> - 300 ms until ubifs initialization starts
> - 3.7 s for ubifs, until "mounted root"
> 
> So we basically have 7 s for the kernel. The rest is userspace, which hasn't
> seen much optimization yet, other than trying to start the GUI application as
> early as possible, while doing all other init stuff in parallel. Adding "quiet"
> brings us another 300 ms.
> 
> That's factor 70 away from the 110 ms boot time Tim has talked about some days
> ago (and he measured on an ARM cpu which had almost half the speed of this
> one), and I'm wondering what we can do to improve the boot time.
> 
> Robert
> 
> rsc@thebe:~$ microcom | ptx_ts "U-Boot 2.0.0-rc9"
> [ 13.522625] <  0.043189>
> [ 13.546627] <  0.024002> OSELAS(R)-phyCORE-trunk (PTXdist-1.99.svn/2009-08-06T08:37:25+0200)
> [ 13.558613] <  0.011986>
> [ 13.690643] <  0.132030>        _            ____ ___  ____  _____
> [ 13.690731] <  0.000088>  _ __ | |__  _   _ / ___/ _ \|  _ \| ____|
> [ 13.698595] <  0.007864> | '_ \| '_ \| | | | |  | | | | |_) |  _|
> [ 13.698654] <  0.000059> | |_) | | | | |_| | |__| |_| |  _ <| |___
> [ 13.702581] <  0.003927> | .__/|_| |_|\__, |\____\___/|_| \_\_____|
> [ 13.706573] <  0.003992> |_|          |___/
> [ 13.706622] <  0.000049>
> [ 13.725043] <  0.018421>
> [ 14.742608] <  1.017565>

I made some changes suggested in this thread:

- enable MMU in the bootloader
- use assembler optimized memcpy/memset in the bootloader
- start an uncompressed image
- disable IP autoconfiguration in the Kernel
- use lpj= command line parameter
- use static device nodes instead of udev
- skip some init scripts
- made the kernel smaller (I do not have both configs handy, so I do not
  know what exactly I changed)
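
As a cross-check on the lpj= value, a minimal sketch. It assumes HZ=100
(not stated in this mail) and assumes the kernel derives the printed BogoMIPS
figure from loops-per-jiffy with the integer arithmetic shown; the helper
name bogomips is made up for illustration.

```python
def bogomips(lpj, hz=100):
    """Format a BogoMIPS string from loops-per-jiffy.

    Integer arithmetic only; hz=100 is an assumption about the kernel config.
    """
    whole = lpj // (500000 // hz)
    frac = (lpj // (5000 // hz)) % 100
    return f"{whole}.{frac:02d}"


# lpj=995328 is the value passed on the kernel command line in the log below.
print(bogomips(995328))  # prints 199.06
```

The result matches the "199.06 BogoMIPS (lpj=995328)" line in the log, so a
preset lpj= of this size is consistent with the skipped calibration.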

Already looks much better:

[  0.000005] <  0.000005> U-Boot 2.0.0-rc10-00241-g3f10fe9-dirty (Aug 18 2009 - 13:29:25)
[  0.000026] <  0.000021>
[  0.000041] <  0.000015> Board: Phytec phyCORE-i.MX27
[  0.000054] <  0.000013> cfi_probe: cfi_flash base: 0xc0000000 size: 0x02000000
[  0.000067] <  0.000013> NAND device: Manufacturer ID: 0x20, Chip ID: 0x36 (ST Micro NAND 64MiB 1,8V 8-bit)
[  0.000080] <  0.000013> imxfb@imxfb0: i.MX Framebuffer driver
[  0.000092] <  0.000012> dma_alloc: 0xa6f56e40 0x10000000
[  0.000105] <  0.000013> dma_alloc: 0xa6f57088 0x10000000
[  0.000118] <  0.000013> dev_protect: currently broken
[  0.000129] <  0.000011> Using environment in NOR Flash
[  0.000141] <  0.000012> initialising PLLs
[  0.128972] <  0.128831> Malloc space: 0xa6f00000 -> 0xa7f00000 (size 16 MB)
[  0.128995] <  0.000023> Stack space : 0xa6ef8000 -> 0xa6f00000 (size 32 kB)
[  0.129008] <  0.000013> running /env/bin/init...
[  0.224963] <  0.095955>
[  0.224984] <  0.000021> Hit any key to stop autoboot:  0
[  0.224999] <  0.000015> copy
[  0.592964] <  0.367965> done
[  0.652010] <  0.059046> Linux version 2.6.31-rc4-00004-g05786f8-dirty (sha@octopus) (gcc version 4.3.2 (OSELAS.Toolchain-1.99.3) ) #206 PREEMPT Tue Aug 18 14:08:51 CEST 2009
[  0.652030] <  0.000020> CPU: ARM926EJ-S [41069264] revision 4 (ARMv5TEJ), cr=00053177
[  0.652044] <  0.000014> CPU: VIVT data cache, VIVT instruction cache
[  0.652057] <  0.000013> Machine: phyCORE-i.MX27
[  0.652069] <  0.000012> Memory policy: ECC disabled, Data cache writeback
[  0.652082] <  0.000013> Built 1 zonelists in Zone order, mobility grouping on.  Total pages: 32512
[  0.706012] <  0.053930> Kernel command line: console=ttymxc0,115200 earlyprintk lpj=995328 mt9v022.sensor_type=color ip=192.168.23.197:192.168.23.2:192.168.23.2:255.255.0.0::: ubi.mtd=7 root=ubi0:root rootfstype=ubifs mtdparts="physmap-flash.0:256k(uboot)ro,128k(ubootenv),3M(kernel),-(root);mxc_nand:256k(uboot)ro,128k(ubootenv),3M(kernel),-(root)"
[  0.706034] <  0.000022> console [earlyser0] enabled
[  0.706049] <  0.000015> Unknown boot option `mt9v022.sensor_type=color': ignoring
[  0.706062] <  0.000013> PID hash table entries: 512 (order: 9, 2048 bytes)
[  0.706075] <  0.000013> Dentry cache hash table entries: 16384 (order: 4, 65536 bytes)
[  0.706087] <  0.000012> Inode-cache hash table entries: 8192 (order: 3, 32768 bytes)
[  0.755997] <  0.049910> Memory: 128MB = 128MB total
[  0.756016] <  0.000019> Memory: 127004KB available (2404K code, 205K data, 80K init, 0K highmem)
[  0.756030] <  0.000014> NR_IRQS:272
[  0.756042] <  0.000012> MXC GPIO hardware
[  0.756055] <  0.000013> MXC IRQ initialized
[  0.756067] <  0.000012> Console: colour dummy device 80x30
[  0.756079] <  0.000012> Calibrating delay loop (skipped) preset value.. 199.06 BogoMIPS (lpj=995328)
[  0.756092] <  0.000013> Mount-cache hash table entries: 512
[  0.756104] <  0.000012> CPU: Testing write buffer coherency: ok
[  0.771968] <  0.015864> NET: Registered protocol family 16
[  0.803967] <  0.031999> bio: create slab <bio-0> at 0
[  0.869007] <  0.065040> NET: Registered protocol family 2
[  0.869025] <  0.000018> IP route cache hash table entries: 1024 (order: 0, 4096 bytes)
[  0.869040] <  0.000015> TCP established hash table entries: 4096 (order: 3, 32768 bytes)
[  0.869053] <  0.000013> TCP bind hash table entries: 4096 (order: 2, 16384 bytes)
[  0.869066] <  0.000013> TCP: Hash tables configured (established 4096 bind 4096)
[  0.869078] <  0.000012> TCP reno registered
[  0.869090] <  0.000012> NET: Registered protocol family 1
[  0.869103] <  0.000013> msgmni has been set to 248
[  0.869115] <  0.000012> io scheduler noop registered (default)
[  0.869127] <  0.000012> i.MX Framebuffer driver
[  0.884970] <  0.015843> Console: switching to colour frame buffer device 30x40
[  0.974022] <  0.089052> Serial: IMX driver
[  0.974127] <  0.000105> Platform driver 'imx-uart' needs updating - please use dev_pm_ops
[  0.974217] <  0.000090> imx-uart.0: ttymxc0 at MMIO 0x1000a000 (irq = 20) is a IMX
[  0.974306] <  0.000089> console handover: boot [earlyser0] -> real [ttymxc0]
[  0.974392] <  0.000086> imx-uart.1: ttymxc1 at MMIO 0x1000b000 (irq = 19) is a IMX
[  0.974481] <  0.000089> imx-uart.2: ttymxc2 at MMIO 0x1000c000 (irq = 18) is a IMX
[  0.974569] <  0.000088> FEC Ethernet Driver
[  0.974651] <  0.000082> Platform driver 'fec' needs updating - please use dev_pm_ops
[  0.974737] <  0.000086> fec: PHY @ 0x0, ID 0x00221613 -- KS8721BL
[  1.019018] <  0.044281> physmap platform flash device: 02000000 at c0000000
[  1.019118] <  0.000100> physmap-flash.0: Found 1 x16 devices at 0x0 in 16-bit bank
[  1.019207] <  0.000089>  Intel/Sharp Extended Query Table at 0x010A
[  1.019293] <  0.000086>  Intel/Sharp Extended Query Table at 0x010A
[  1.019377] <  0.000084>  Intel/Sharp Extended Query Table at 0x010A
[  1.019460] <  0.000083>  Intel/Sharp Extended Query Table at 0x010A
[  1.019544] <  0.000084>  Intel/Sharp Extended Query Table at 0x010A
[  1.019627] <  0.000083> Using buffer write method
[  1.019714] <  0.000087> Using auto-unlock on power-up/resume
[  1.019797] <  0.000083> cfi_cmdset_0001: Erase suspend on write enabled
[  1.019881] <  0.000084> 4 cmdlinepart partitions found on MTD device physmap-flash.0
[  1.082018] <  0.062137> Creating 4 MTD partitions on "physmap-flash.0":
[  1.082112] <  0.000094> 0x000000000000-0x000000040000 : "uboot"
[  1.082199] <  0.000087> 0x000000040000-0x000000060000 : "ubootenv"
[  1.082287] <  0.000088> 0x000000060000-0x000000360000 : "kernel"
[  1.082371] <  0.000084> 0x000000360000-0x000002000000 : "root"
[  1.082453] <  0.000082> NAND device: Manufacturer ID: 0x20, Chip ID: 0x36 (ST Micro NAND 64MiB 1,8V 8-bit)
[  1.082543] <  0.000090> RedBoot partition parsing not available
[  1.082627] <  0.000084> 4 cmdlinepart partitions found on MTD device mxc_nand
[  1.082715] <  0.000088> Creating 4 MTD partitions on "mxc_nand":
[  1.082798] <  0.000083> 0x000000000000-0x000000040000 : "uboot"
[  1.082882] <  0.000084> 0x000000040000-0x000000060000 : "ubootenv"
[  1.097976] <  0.015094> 0x000000060000-0x000000360000 : "kernel"
[  1.113978] <  0.016002> 0x000000360000-0x000004000000 : "root"
[  1.425012] <  0.311034> UBI: attaching mtd7 to ubi0
[  1.425043] <  0.000031> UBI: physical eraseblock size:   16384 bytes (16 KiB)
[  1.425057] <  0.000014> UBI: logical eraseblock size:    15360 bytes
[  1.425071] <  0.000014> UBI: smallest flash I/O unit:    512
[  1.425083] <  0.000012> UBI: VID header offset:          512 (aligned 512)
[  1.425096] <  0.000013> UBI: data offset:                1024
[  3.008058] <  1.582962> UBI: attached mtd7 to ubi0
[  3.008090] <  0.000032> UBI: MTD device name:            "root"
[  3.008105] <  0.000015> UBI: MTD device size:            60 MiB
[  3.008119] <  0.000014> UBI: number of good PEBs:        3880
[  3.008132] <  0.000013> UBI: number of bad PEBs:         0
[  3.008145] <  0.000013> UBI: max. allowed volumes:       89
[  3.008159] <  0.000014> UBI: wear-leveling threshold:    4096
[  3.008172] <  0.000013> UBI: number of internal volumes: 1
[  3.008185] <  0.000013> UBI: number of user volumes:     1
[  3.008199] <  0.000014> UBI: available PEBs:             0
[  3.008212] <  0.000013> UBI: total number of reserved PEBs: 3880
[  3.008226] <  0.000014> UBI: number of PEBs reserved for bad PEB handling: 38
[  3.051029] <  0.042803> UBI: max/mean erase counter: 2/0
[  3.051052] <  0.000023> UBI: image sequence number: 0
[  3.051066] <  0.000014> UBI: background thread "ubi_bgt0d" started, PID 218
[  3.051081] <  0.000015> i2c /dev entries driver
[  3.051094] <  0.000013> rtc-pcf8563 1-0051: chip found, driver version 0.4.3
[  3.051108] <  0.000014> rtc-pcf8563 1-0051: rtc core: registered rtc-pcf8563 as rtc0
[  3.051122] <  0.000014> Driver for 1-wire Dallas network protocol.
[  3.148042] <  0.096920> i.MX SDHC driver
[  3.148067] <  0.000025> mxc-mmc: probe of mxc-mmc.1 failed with error -16
[  3.148082] <  0.000015> TCP cubic registered
[  3.148095] <  0.000013> NET: Registered protocol family 17
[  3.148107] <  0.000012> RPC: Registered udp transport module.
[  3.148119] <  0.000012> RPC: Registered tcp transport module.
[  3.148132] <  0.000013> rtc-pcf8563 1-0051: low voltage detected, date/time is not reliable.
[  3.148145] <  0.000013> rtc-pcf8563 1-0051: retrieved date/time is not valid.
[  3.148157] <  0.000012> rtc-pcf8563 1-0051: hctosys: invalid date/time
[  3.148170] <  0.000013> UBIFS: recovery needed
[  3.211043] <  0.062873> UBIFS: recovery completed
[  3.211064] <  0.000021> UBIFS: mounted UBI device 0, volume 1, name "root"
[  3.211080] <  0.000016> UBIFS: file system size:   58490880 bytes (57120 KiB, 55 MiB, 3808 LEBs)
[  3.211093] <  0.000013> UBIFS: journal size:       7741440 bytes (7560 KiB, 7 MiB, 504 LEBs)
[  3.211105] <  0.000012> UBIFS: media format:       w4/r0 (latest is w4/r0)
[  3.211118] <  0.000013> UBIFS: default compressor: lzo
[  3.211130] <  0.000012> UBIFS: reserved for root:  0 bytes (0 KiB)
[  3.211143] <  0.000013> VFS: Mounted root (ubifs filesystem) on device 0:12.
[  3.211155] <  0.000012> Freeing init memory: 80K
init started: BusyBox v1.13.4 (2009-08-06 08:30:14 CEST)
[  3.514007] <  0.159993> mounting filesystems...done.
[  3.546005] <  0.031998> running rc.d services...
[  3.626007] <  0.080002> syslogd starting
[  3.786013] <  0.160006> Starting telnetd...
[  3.962014] <  0.176001> starting network interfaces...
[  4.818032] <  0.856018> eth0: config: auto-negotiation on, 100FDX, 100HDX, 10FDX, 10HDX.
[  5.058038] <  0.240006> ip: cannot find device "can0"
[  5.250040] <  0.192002> ip: SIOCGIFFLAGS: No such device
[  5.298033] <  0.047993>
[  5.336039] <  0.038006> OSELAS(R)-phyCORE-trunk (PTXdist-1.99.svn/2009-08-06T08:37:25+0200)
[  5.368028] <  0.031989>
[  5.840066] <  0.472038>        _            ____ ___  ____  _____
[  5.840090] <  0.000024>  _ __ | |__  _   _ / ___/ _ \|  _ \| ____|
[  5.840104] <  0.000014> | '_ \| '_ \| | | | |  | | | | |_) |  _|
[  5.840116] <  0.000012> | |_) | | | | |_| | |__| |_| |  _ <| |___
[  5.840129] <  0.000013> | .__/|_| |_|\__, |\____\___/|_| \_\_____|
[  5.840141] <  0.000012> |_|          |___/
[  5.840154] <  0.000013>
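
To find the remaining hot spots in a log like this, the deltas can be ranked
mechanically. A minimal sketch, assuming the "[ timestamp] < delta> message"
layout shown above; the regular expression and the function name are
illustrative and are not part of ptx_ts. Each delta is the time that passed
before its line was printed, so it belongs to whatever ran just before that
message.

```python
import re

# "[  3.008058] <  1.582962> UBI: attached mtd7 to ubi0"
LINE = re.compile(r"^\[\s*(\d+\.\d+)\] <\s*(\d+\.\d+)> ?(.*)$")


def slowest_steps(log_text, n=3):
    """Return the n (delta_seconds, message) pairs with the largest deltas."""
    rows = []
    for line in log_text.splitlines():
        m = LINE.match(line)
        if m:  # lines without a timestamp prefix are skipped
            rows.append((float(m.group(2)), m.group(3)))
    return sorted(rows, key=lambda r: r[0], reverse=True)[:n]


# Four lines copied from the log above.
sample = """\
[  1.113978] <  0.016002> 0x000000360000-0x000004000000 : "root"
[  1.425012] <  0.311034> UBI: attaching mtd7 to ubi0
[  3.008058] <  1.582962> UBI: attached mtd7 to ubi0
[  4.818032] <  0.856018> eth0: config: auto-negotiation on, 100FDX, 100HDX, 10FDX, 10HDX.
"""

for delta, msg in slowest_steps(sample, n=2):
    print(f"{delta:8.3f} s  {msg}")
```

In the log above, the largest single gap is the 1.58 s before
"UBI: attached mtd7 to ubi0", followed by 0.86 s before the eth0
auto-negotiation message.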

Sascha

-- 
Pengutronix e.K.                           |                             |
Industrial Linux Solutions                 | http://www.pengutronix.de/  |
Peiner Str. 6-8, 31137 Hildesheim, Germany | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: New fast(?)-boot results on ARM
  2009-08-18 14:06 ` Sascha Hauer
@ 2009-08-18 15:31   ` Dirk Behme
  2009-08-18 16:34     ` Marco Stornelli
                       ` (2 more replies)
  0 siblings, 3 replies; 43+ messages in thread
From: Dirk Behme @ 2009-08-18 15:31 UTC (permalink / raw)
  To: Sascha Hauer
  Cc: Robert Schwebel, linux-kernel, linux-embedded, Arjan van de Ven,
	Tim Bird, kernel

Sascha Hauer wrote:
> On Fri, Aug 14, 2009 at 07:02:28PM +0200, Robert Schwebel wrote:
>> Hi,
>>
>> On Thu, Aug 13, 2009 at 05:33:26PM +0200, Robert Schwebel wrote:
>>> On Thu, Aug 13, 2009 at 08:28:26AM -0700, Arjan van de Ven wrote:
>>>>> That's bad :-) So there is no room for improvement any more in our
>>>>> ARM boot sequences ...
>>>> on x86 we're doing pretty well ;-)
>>> On i.MX27 (400 MHz ARM926EJ-S) we currently need 7 s, measured from
>>> power-on through the kernel up to "starting init". This is with
>>>
>>> - no delay in u-boot-v2
>>> - rootfs on NAND (UBIFS)
>>> - quiet
>>> - precalculated loops-per-jiffy
>>> - zImage kernel instead of uImage
>> Here's a little video of our demo system booting:
>> http://www.youtube.com/watch?v=xDbUnNsj0cI
>>
>> As you can see there, it needs about 15 s from the release of the reset button
>> up to the moment where the application shows it's Qt 4.5.2 based GUI (which is
>> when we fade over from the initial framebuffer to the final one, in order to
>> hide the qt application startup noise).
>>
>> And below is the boot log (after turning "quiet" off again). The numbers are
>> the timestamp and the delta to the last timestamp, measured on the controlling
>> PC by looking at the serial console output. The ptx_ts script starts when the
>> regexp was found, so the numbers start basically in the moment when u-boot-v2
>> has initialized the system up to the point where we can see something.
>>
>> Result:
>>
>> - 2.4 s up from u-boot to the end of "Uncompressing Linux"
>> - 300 ms until ubifs initialization starts
>> - 3.7 s for ubifs, until "mounted root"
>>
>> So we basically have 7 s for the kernel. The rest is userspace, which hasn't
>> seen much optimization yet, other than trying to start the GUI application as
>> early as possible, while doing all other init stuff in parallel. Adding "quiet"
>> brings us another 300 ms.
>>
>> That's factor 70 away from the 110 ms boot time Tim has talked about some days
>> ago (and he measured on an ARM cpu which had almost half the speed of this
>> one), and I'm wondering what we can do to improve the boot time.
>>
>> Robert
>>
>> rsc@thebe:~$ microcom | ptx_ts "U-Boot 2.0.0-rc9"
>> [ 13.522625] <  0.043189>
>> [ 13.546627] <  0.024002> OSELAS(R)-phyCORE-trunk (PTXdist-1.99.svn/2009-08-06T08:37:25+0200)
>> [ 13.558613] <  0.011986>
>> [ 13.690643] <  0.132030>        _            ____ ___  ____  _____
>> [ 13.690731] <  0.000088>  _ __ | |__  _   _ / ___/ _ \|  _ \| ____|
>> [ 13.698595] <  0.007864> | '_ \| '_ \| | | | |  | | | | |_) |  _|
>> [ 13.698654] <  0.000059> | |_) | | | | |_| | |__| |_| |  _ <| |___
>> [ 13.702581] <  0.003927> | .__/|_| |_|\__, |\____\___/|_| \_\_____|
>> [ 13.706573] <  0.003992> |_|          |___/
>> [ 13.706622] <  0.000049>
>> [ 13.725043] <  0.018421>
>> [ 14.742608] <  1.017565>
> 
> I made some changes suggested in this thread:
> 
> - enable MMU in the bootloader
> - use assembler optimized memcpy/memset in the bootloader
> - start an uncompressed image
> - disable IP autoconfiguration in the Kernel
> - use lpj= command line parameter
> - use static device nodes instead of udev
> - skip some init scripts
> - made the kernel smaller (I do not have both configs handy, so I do not
>   know what exactly I changed)
> 
> Already looks much better:
> 
> [  0.000005] <  0.000005> U-Boot 2.0.0-rc10-00241-g3f10fe9-dirty (Aug 18 2009 - 13:29:25)
> [  0.000026] <  0.000021>
> [  0.000041] <  0.000015> Board: Phytec phyCORE-i.MX27
> [  0.000054] <  0.000013> cfi_probe: cfi_flash base: 0xc0000000 size: 0x02000000
> [  0.000067] <  0.000013> NAND device: Manufacturer ID: 0x20, Chip ID: 0x36 (ST Micro NAND 64MiB 1,8V 8-bit)
> [  0.000080] <  0.000013> imxfb@imxfb0: i.MX Framebuffer driver
> [  0.000092] <  0.000012> dma_alloc: 0xa6f56e40 0x10000000
> [  0.000105] <  0.000013> dma_alloc: 0xa6f57088 0x10000000
> [  0.000118] <  0.000013> dev_protect: currently broken
> [  0.000129] <  0.000011> Using environment in NOR Flash
> [  0.000141] <  0.000012> initialising PLLs
> [  0.128972] <  0.128831> Malloc space: 0xa6f00000 -> 0xa7f00000 (size 16 MB)
> [  0.128995] <  0.000023> Stack space : 0xa6ef8000 -> 0xa6f00000 (size 32 kB)
> [  0.129008] <  0.000013> running /env/bin/init...
> [  0.224963] <  0.095955>
> [  0.224984] <  0.000021> Hit any key to stop autoboot:  0
> [  0.224999] <  0.000015> copy
> [  0.592964] <  0.367965> done
> [  0.652010] <  0.059046> Linux version 2.6.31-rc4-00004-g05786f8-dirty (sha@octopus) (gcc version 4.3.2 (OSELAS.Toolchain-1.99.3) ) #206 PREEMPT Tue Aug 18 14:08:51 CEST 2009

So that is ~0.6 s in the boot loader and kernel copy until the kernel
starts, correct?
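
The ~0.6 s figure can be read directly off the timestamps in the log quoted
above. As a rough sketch (not the actual ptx_ts tooling; the helper names
parse() and time_of() are made up for illustration), the lines in that
"[timestamp] <delta> message" format can be parsed and the phases subtracted
like this:

```python
import re

# Matches ptx_ts-style lines: "[  0.592964] <  0.367965> message"
LINE_RE = re.compile(r"\[\s*(\d+\.\d+)\]\s*<\s*(\d+\.\d+)>\s?(.*)")

def parse(lines):
    """Yield (timestamp, delta, message) for every line in ptx_ts format."""
    for line in lines:
        m = LINE_RE.search(line)
        if m:
            yield float(m.group(1)), float(m.group(2)), m.group(3)

def time_of(entries, prefix):
    """Return the timestamp of the first message starting with prefix."""
    for ts, _delta, msg in entries:
        if msg.startswith(prefix):
            return ts
    raise ValueError(f"no message starting with {prefix!r}")

# Four lines copied from the log above (the last one shortened).
LOG = """\
[  0.000005] <  0.000005> U-Boot 2.0.0-rc10-00241-g3f10fe9-dirty (Aug 18 2009 - 13:29:25)
[  0.224999] <  0.000015> copy
[  0.592964] <  0.367965> done
[  0.652010] <  0.059046> Linux version 2.6.31-rc4-00004-g05786f8-dirty
"""

entries = list(parse(LOG.splitlines()))
copy_time = time_of(entries, "done") - time_of(entries, "copy")
until_kernel = time_of(entries, "Linux version") - time_of(entries, "U-Boot")
print(f"kernel copy: {copy_time:.3f} s, until kernel starts: {until_kernel:.3f} s")
```

With these four lines, the copy accounts for about 0.37 s of the roughly
0.65 s that pass before the first kernel message.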

What's the size of the uncompressed kernel copied here?

Best regards

Dirk

Btw.: I tried to summarize some hints given in this thread in

http://elinux.org/Boot_Time#Boot_time_check_list

Please feel free to add and correct stuff!

> [  0.652030] <  0.000020> CPU: ARM926EJ-S [41069264] revision 4 (ARMv5TEJ), cr=00053177
> [  0.652044] <  0.000014> CPU: VIVT data cache, VIVT instruction cache
> [  0.652057] <  0.000013> Machine: phyCORE-i.MX27
> [  0.652069] <  0.000012> Memory policy: ECC disabled, Data cache writeback
> [  0.652082] <  0.000013> Built 1 zonelists in Zone order, mobility grouping on.  Total pages: 32512
> [  0.706012] <  0.053930> Kernel command line: console=ttymxc0,115200 earlyprintk lpj=995328 mt9v022.sensor_type=color ip=192.168.23.197:192.168.23.2:192.168.23.2:255.255.0.0::: ubi.mtd=7 root=ubi0:root rootfstype=ubifs mtdparts="physmap-flash.0:256k(uboot)ro,128k(ubootenv),3M(kernel),-(root);mxc_nand:256k(uboot)ro,128k(ubootenv),3M(kernel),-(root)"
> [  0.706034] <  0.000022> console [earlyser0] enabled
> [  0.706049] <  0.000015> Unknown boot option `mt9v022.sensor_type=color': ignoring
> [  0.706062] <  0.000013> PID hash table entries: 512 (order: 9, 2048 bytes)
> [  0.706075] <  0.000013> Dentry cache hash table entries: 16384 (order: 4, 65536 bytes)
> [  0.706087] <  0.000012> Inode-cache hash table entries: 8192 (order: 3, 32768 bytes)
> [  0.755997] <  0.049910> Memory: 128MB = 128MB total
> [  0.756016] <  0.000019> Memory: 127004KB available (2404K code, 205K data, 80K init, 0K highmem)
> [  0.756030] <  0.000014> NR_IRQS:272
> [  0.756042] <  0.000012> MXC GPIO hardware
> [  0.756055] <  0.000013> MXC IRQ initialized
> [  0.756067] <  0.000012> Console: colour dummy device 80x30
> [  0.756079] <  0.000012> Calibrating delay loop (skipped) preset value.. 199.06 BogoMIPS (lpj=995328)
> [  0.756092] <  0.000013> Mount-cache hash table entries: 512
> [  0.756104] <  0.000012> CPU: Testing write buffer coherency: ok
> [  0.771968] <  0.015864> NET: Registered protocol family 16
> [  0.803967] <  0.031999> bio: create slab <bio-0> at 0
> [  0.869007] <  0.065040> NET: Registered protocol family 2
> [  0.869025] <  0.000018> IP route cache hash table entries: 1024 (order: 0, 4096 bytes)
> [  0.869040] <  0.000015> TCP established hash table entries: 4096 (order: 3, 32768 bytes)
> [  0.869053] <  0.000013> TCP bind hash table entries: 4096 (order: 2, 16384 bytes)
> [  0.869066] <  0.000013> TCP: Hash tables configured (established 4096 bind 4096)
> [  0.869078] <  0.000012> TCP reno registered
> [  0.869090] <  0.000012> NET: Registered protocol family 1
> [  0.869103] <  0.000013> msgmni has been set to 248
> [  0.869115] <  0.000012> io scheduler noop registered (default)
> [  0.869127] <  0.000012> i.MX Framebuffer driver
> [  0.884970] <  0.015843> Console: switching to colour frame buffer device 30x40
> [  0.974022] <  0.089052> Serial: IMX driver
> [  0.974127] <  0.000105> Platform driver 'imx-uart' needs updating - please use dev_pm_ops
> [  0.974217] <  0.000090> imx-uart.0: ttymxc0 at MMIO 0x1000a000 (irq = 20) is a IMX
> [  0.974306] <  0.000089> console handover: boot [earlyser0] -> real [ttymxc0]
> [  0.974392] <  0.000086> imx-uart.1: ttymxc1 at MMIO 0x1000b000 (irq = 19) is a IMX
> [  0.974481] <  0.000089> imx-uart.2: ttymxc2 at MMIO 0x1000c000 (irq = 18) is a IMX
> [  0.974569] <  0.000088> FEC Ethernet Driver
> [  0.974651] <  0.000082> Platform driver 'fec' needs updating - please use dev_pm_ops
> [  0.974737] <  0.000086> fec: PHY @ 0x0, ID 0x00221613 -- KS8721BL
> [  1.019018] <  0.044281> physmap platform flash device: 02000000 at c0000000
> [  1.019118] <  0.000100> physmap-flash.0: Found 1 x16 devices at 0x0 in 16-bit bank
> [  1.019207] <  0.000089>  Intel/Sharp Extended Query Table at 0x010A
> [  1.019293] <  0.000086>  Intel/Sharp Extended Query Table at 0x010A
> [  1.019377] <  0.000084>  Intel/Sharp Extended Query Table at 0x010A
> [  1.019460] <  0.000083>  Intel/Sharp Extended Query Table at 0x010A
> [  1.019544] <  0.000084>  Intel/Sharp Extended Query Table at 0x010A
> [  1.019627] <  0.000083> Using buffer write method
> [  1.019714] <  0.000087> Using auto-unlock on power-up/resume
> [  1.019797] <  0.000083> cfi_cmdset_0001: Erase suspend on write enabled
> [  1.019881] <  0.000084> 4 cmdlinepart partitions found on MTD device physmap-flash.0
> [  1.082018] <  0.062137> Creating 4 MTD partitions on "physmap-flash.0":
> [  1.082112] <  0.000094> 0x000000000000-0x000000040000 : "uboot"
> [  1.082199] <  0.000087> 0x000000040000-0x000000060000 : "ubootenv"
> [  1.082287] <  0.000088> 0x000000060000-0x000000360000 : "kernel"
> [  1.082371] <  0.000084> 0x000000360000-0x000002000000 : "root"
> [  1.082453] <  0.000082> NAND device: Manufacturer ID: 0x20, Chip ID: 0x36 (ST Micro NAND 64MiB 1,8V 8-bit)
> [  1.082543] <  0.000090> RedBoot partition parsing not available
> [  1.082627] <  0.000084> 4 cmdlinepart partitions found on MTD device mxc_nand
> [  1.082715] <  0.000088> Creating 4 MTD partitions on "mxc_nand":
> [  1.082798] <  0.000083> 0x000000000000-0x000000040000 : "uboot"
> [  1.082882] <  0.000084> 0x000000040000-0x000000060000 : "ubootenv"
> [  1.097976] <  0.015094> 0x000000060000-0x000000360000 : "kernel"
> [  1.113978] <  0.016002> 0x000000360000-0x000004000000 : "root"
> [  1.425012] <  0.311034> UBI: attaching mtd7 to ubi0
> [  1.425043] <  0.000031> UBI: physical eraseblock size:   16384 bytes (16 KiB)
> [  1.425057] <  0.000014> UBI: logical eraseblock size:    15360 bytes
> [  1.425071] <  0.000014> UBI: smallest flash I/O unit:    512
> [  1.425083] <  0.000012> UBI: VID header offset:          512 (aligned 512)
> [  1.425096] <  0.000013> UBI: data offset:                1024
> [  3.008058] <  1.582962> UBI: attached mtd7 to ubi0
> [  3.008090] <  0.000032> UBI: MTD device name:            "root"
> [  3.008105] <  0.000015> UBI: MTD device size:            60 MiB
> [  3.008119] <  0.000014> UBI: number of good PEBs:        3880
> [  3.008132] <  0.000013> UBI: number of bad PEBs:         0
> [  3.008145] <  0.000013> UBI: max. allowed volumes:       89
> [  3.008159] <  0.000014> UBI: wear-leveling threshold:    4096
> [  3.008172] <  0.000013> UBI: number of internal volumes: 1
> [  3.008185] <  0.000013> UBI: number of user volumes:     1
> [  3.008199] <  0.000014> UBI: available PEBs:             0
> [  3.008212] <  0.000013> UBI: total number of reserved PEBs: 3880
> [  3.008226] <  0.000014> UBI: number of PEBs reserved for bad PEB handling: 38
> [  3.051029] <  0.042803> UBI: max/mean erase counter: 2/0
> [  3.051052] <  0.000023> UBI: image sequence number: 0
> [  3.051066] <  0.000014> UBI: background thread "ubi_bgt0d" started, PID 218
> [  3.051081] <  0.000015> i2c /dev entries driver
> [  3.051094] <  0.000013> rtc-pcf8563 1-0051: chip found, driver version 0.4.3
> [  3.051108] <  0.000014> rtc-pcf8563 1-0051: rtc core: registered rtc-pcf8563 as rtc0
> [  3.051122] <  0.000014> Driver for 1-wire Dallas network protocol.
> [  3.148042] <  0.096920> i.MX SDHC driver
> [  3.148067] <  0.000025> mxc-mmc: probe of mxc-mmc.1 failed with error -16
> [  3.148082] <  0.000015> TCP cubic registered
> [  3.148095] <  0.000013> NET: Registered protocol family 17
> [  3.148107] <  0.000012> RPC: Registered udp transport module.
> [  3.148119] <  0.000012> RPC: Registered tcp transport module.
> [  3.148132] <  0.000013> rtc-pcf8563 1-0051: low voltage detected, date/time is not reliable.
> [  3.148145] <  0.000013> rtc-pcf8563 1-0051: retrieved date/time is not valid.
> [  3.148157] <  0.000012> rtc-pcf8563 1-0051: hctosys: invalid date/time
> [  3.148170] <  0.000013> UBIFS: recovery needed
> [  3.211043] <  0.062873> UBIFS: recovery completed
> [  3.211064] <  0.000021> UBIFS: mounted UBI device 0, volume 1, name "root"
> [  3.211080] <  0.000016> UBIFS: file system size:   58490880 bytes (57120 KiB, 55 MiB, 3808 LEBs)
> [  3.211093] <  0.000013> UBIFS: journal size:       7741440 bytes (7560 KiB, 7 MiB, 504 LEBs)
> [  3.211105] <  0.000012> UBIFS: media format:       w4/r0 (latest is w4/r0)
> [  3.211118] <  0.000013> UBIFS: default compressor: lzo
> [  3.211130] <  0.000012> UBIFS: reserved for root:  0 bytes (0 KiB)
> [  3.211143] <  0.000013> VFS: Mounted root (ubifs filesystem) on device 0:12.
> [  3.211155] <  0.000012> Freeing init memory: 80K
> init started: BusyBox v1.13.4 (2009-08-06 08:30:14 CEST)
> [  3.514007] <  0.159993> mounting filesystems...done.
> [  3.546005] <  0.031998> running rc.d services...
> [  3.626007] <  0.080002> syslogd starting
> [  3.786013] <  0.160006> Starting telnetd...
> [  3.962014] <  0.176001> starting network interfaces...
> [  4.818032] <  0.856018> eth0: config: auto-negotiation on, 100FDX, 100HDX, 10FDX, 10HDX.
> [  5.058038] <  0.240006> ip: cannot find device "can0"
> [  5.250040] <  0.192002> ip: SIOCGIFFLAGS: No such device
> [  5.298033] <  0.047993>
> [  5.336039] <  0.038006> OSELAS(R)-phyCORE-trunk (PTXdist-1.99.svn/2009-08-06T08:37:25+0200)
> [  5.368028] <  0.031989>
> [  5.840066] <  0.472038>        _            ____ ___  ____  _____
> [  5.840090] <  0.000024>  _ __ | |__  _   _ / ___/ _ \|  _ \| ____|
> [  5.840104] <  0.000014> | '_ \| '_ \| | | | |  | | | | |_) |  _|
> [  5.840116] <  0.000012> | |_) | | | | |_| | |__| |_| |  _ <| |___
> [  5.840129] <  0.000013> | .__/|_| |_|\__, |\____\___/|_| \_\_____|
> [  5.840141] <  0.000012> |_|          |___/
> [  5.840154] <  0.000013>
> 
> Sascha
> 



* Re: New fast(?)-boot results on ARM
  2009-08-18 15:31   ` Dirk Behme
@ 2009-08-18 16:34     ` Marco Stornelli
  2009-08-18 18:23     ` Tim Bird
  2009-08-19  7:21     ` Sascha Hauer
  2 siblings, 0 replies; 43+ messages in thread
From: Marco Stornelli @ 2009-08-18 16:34 UTC (permalink / raw)
  To: Dirk Behme
  Cc: Sascha Hauer, Robert Schwebel, linux-kernel, linux-embedded,
	Arjan van de Ven, Tim Bird, kernel

Dirk Behme wrote:
> Sascha Hauer wrote:
>> On Fri, Aug 14, 2009 at 07:02:28PM +0200, Robert Schwebel wrote:
>>> Hi,
>>>
>>> On Thu, Aug 13, 2009 at 05:33:26PM +0200, Robert Schwebel wrote:
>>>> On Thu, Aug 13, 2009 at 08:28:26AM -0700, Arjan van de Ven wrote:
>>>>>> That's bad :-) So there is no room for improvement any more in our
>>>>>> ARM boot sequences ...
>>>>> on x86 we're doing pretty well ;-)
>>>> On i.MX27 (400 MHz ARM926EJ-S) we currently need 7 s, measured from
>>>> power-on through the kernel up to "starting init". This is with
>>>>
>>>> - no delay in u-boot-v2
>>>> - rootfs on NAND (UBIFS)
>>>> - quiet
>>>> - precalculated loops-per-jiffy
>>>> - zImage kernel instead of uImage
>>> Here's a little video of our demo system booting:
>>> http://www.youtube.com/watch?v=xDbUnNsj0cI
>>>
>>> As you can see there, it needs about 15 s from the release of the
>>> reset button
>>> up to the moment where the application shows it's Qt 4.5.2 based GUI
>>> (which is
>>> when we fade over from the initial framebuffer to the final one, in
>>> order to
>>> hide the qt application startup noise).
>>>
>>> And below is the boot log (after turning "quiet" off again). The
>>> numbers are
>>> the timestamp and the delta to the last timestamp, measured on the
>>> controlling
>>> PC by looking at the serial console output. The ptx_ts script starts
>>> when the
>>> regexp was found, so the numbers start basically in the moment when
>>> u-boot-v2
>>> has initialized the system up to the point where we can see something.
>>>
>>> Result:
>>>
>>> - 2.4 s up from u-boot to the end of "Uncompressing Linux"
>>> - 300 ms until ubifs initialization starts
>>> - 3.7 s for ubifs, until "mounted root"
>>>
>>> So we basically have 7 s for the kernel. The rest is userspace, which
>>> hasn't
>>> seen much optimization yet, other than trying to start the GUI
>>> application as
>>> early as possible, while doing all other init stuff in parallel.
>>> Adding "quiet"
>>> brings us another 300 ms.
>>>
>>> That's factor 70 away from the 110 ms boot time Tim has talked about
>>> some days
>>> ago (and he measured on an ARM cpu which had almost half the speed of
>>> this
>>> one), and I'm wondering what we can do to improve the boot time.
>>>
>>> Robert
>>>
>>> rsc@thebe:~$ microcom | ptx_ts "U-Boot 2.0.0-rc9"
>>> [ 13.522625] <  0.043189>
>>> [ 13.546627] <  0.024002> OSELAS(R)-phyCORE-trunk
>>> (PTXdist-1.99.svn/2009-08-06T08:37:25+0200)
>>> [ 13.558613] <  0.011986>
>>> [ 13.690643] <  0.132030>        _            ____ ___  ____  _____
>>> [ 13.690731] <  0.000088>  _ __ | |__  _   _ / ___/ _ \|  _ \| ____|
>>> [ 13.698595] <  0.007864> | '_ \| '_ \| | | | |  | | | | |_) |  _|
>>> [ 13.698654] <  0.000059> | |_) | | | | |_| | |__| |_| |  _ <| |___
>>> [ 13.702581] <  0.003927> | .__/|_| |_|\__, |\____\___/|_| \_\_____|
>>> [ 13.706573] <  0.003992> |_|          |___/
>>> [ 13.706622] <  0.000049>
>>> [ 13.725043] <  0.018421>
>>> [ 14.742608] <  1.017565>
>>
>> I made some changes suggested in this thread:
>>
>> - enable MMU in the bootloader
>> - use assembler optimized memcpy/memset in the bootloader
>> - start an uncompressed image
>> - disable IP autoconfiguration in the Kernel
>> - use lpj= command line parameter
>> - use static device nodes instead of udev
>> - skip some init scripts
>> - made the kernel smaller (I do not have both configs handy, so I do not
>>   know what exactly I changed)
>>
>> Already looks much better:
>>
>> [  0.000005] <  0.000005> U-Boot 2.0.0-rc10-00241-g3f10fe9-dirty (Aug
>> 18 2009 - 13:29:25)
>> [  0.000026] <  0.000021>
>> [  0.000041] <  0.000015> Board: Phytec phyCORE-i.MX27
>> [  0.000054] <  0.000013> cfi_probe: cfi_flash base: 0xc0000000 size:
>> 0x02000000
>> [  0.000067] <  0.000013> NAND device: Manufacturer ID: 0x20, Chip ID:
>> 0x36 (ST Micro NAND 64MiB 1,8V 8-bit)
>> [  0.000080] <  0.000013> imxfb@imxfb0: i.MX Framebuffer driver
>> [  0.000092] <  0.000012> dma_alloc: 0xa6f56e40 0x10000000
>> [  0.000105] <  0.000013> dma_alloc: 0xa6f57088 0x10000000
>> [  0.000118] <  0.000013> dev_protect: currently broken
>> [  0.000129] <  0.000011> Using environment in NOR Flash
>> [  0.000141] <  0.000012> initialising PLLs
>> [  0.128972] <  0.128831> Malloc space: 0xa6f00000 -> 0xa7f00000 (size
>> 16 MB)
>> [  0.128995] <  0.000023> Stack space : 0xa6ef8000 -> 0xa6f00000 (size
>> 32 kB)
>> [  0.129008] <  0.000013> running /env/bin/init...
>> [  0.224963] <  0.095955>
>> [  0.224984] <  0.000021> Hit any key to stop autoboot:  0
>> [  0.224999] <  0.000015> copy
>> [  0.592964] <  0.367965> done
>> [  0.652010] <  0.059046> Linux version
>> 2.6.31-rc4-00004-g05786f8-dirty (sha@octopus) (gcc version 4.3.2
>> (OSELAS.Toolchain-1.99.3) ) #206 PREEMPT Tue Aug 18 14:08:51 CEST 2009
> 
> So, this are ~0.6 s in boot loader and kernel copy until kernel starts,
> correct?
> 
> What's the size of the uncompressed kernel copied here?
> 
> Best regards
> 
> Dirk
> 
> Btw.: I tried to summarize some hints given in this thread in
> 
> http://elinux.org/Boot_Time#Boot_time_check_list
> 
> Please feel free to add and correct stuff!
> 

That's good documentation, good work. Going from 14 s to 5 s is a very
good result, I think. Regarding Robert's previous response: I agree that
it's good to use a vanilla kernel and to avoid exotic, overly specific
or immature solutions, but you still need the "right" tool for the
"right" platform. SquashFS is in mainline, and mdev is part of busybox
and is used in several projects. You cannot expect an embedded system to
look like a normal desktop; imho some tools and solutions have to be
very specific, that's the embedded world. However, your problems are
very common in production environments.

Marco


* Re: New fast(?)-boot results on ARM
  2009-08-18 15:31   ` Dirk Behme
  2009-08-18 16:34     ` Marco Stornelli
@ 2009-08-18 18:23     ` Tim Bird
  2009-08-19  7:21     ` Sascha Hauer
  2 siblings, 0 replies; 43+ messages in thread
From: Tim Bird @ 2009-08-18 18:23 UTC (permalink / raw)
  To: Dirk Behme
  Cc: Sascha Hauer, Robert Schwebel, linux-kernel, linux-embedded,
	Arjan van de Ven, kernel

Dirk Behme wrote:
> Btw.: I tried to summarize some hints given in this thread in
> 
> http://elinux.org/Boot_Time#Boot_time_check_list
> 
> Please feel free to add and correct stuff!

That's a great summary of the points raised in the discussion.
It's good to organize the information and save it in an
easy-to-read format.

Thanks very much for doing that!
 -- Tim

=============================
Tim Bird
Architecture Group Chair, CE Linux Forum
Senior Staff Engineer, Sony Corporation of America
=============================



* Re: New fast(?)-boot results on ARM
  2009-08-18 15:31   ` Dirk Behme
  2009-08-18 16:34     ` Marco Stornelli
  2009-08-18 18:23     ` Tim Bird
@ 2009-08-19  7:21     ` Sascha Hauer
  2009-08-19 16:20       ` Dirk Behme
  2 siblings, 1 reply; 43+ messages in thread
From: Sascha Hauer @ 2009-08-19  7:21 UTC (permalink / raw)
  To: Dirk Behme
  Cc: Robert Schwebel, linux-kernel, linux-embedded, Arjan van de Ven,
	Tim Bird, kernel

On Tue, Aug 18, 2009 at 05:31:42PM +0200, Dirk Behme wrote:
> Sascha Hauer wrote:
>> On Fri, Aug 14, 2009 at 07:02:28PM +0200, Robert Schwebel wrote:
>>> Hi,
>>>
>>> On Thu, Aug 13, 2009 at 05:33:26PM +0200, Robert Schwebel wrote:
>>>> On Thu, Aug 13, 2009 at 08:28:26AM -0700, Arjan van de Ven wrote:
>>>>>> That's bad :-) So there is no room for improvement any more in our
>>>>>> ARM boot sequences ...
>>>>> on x86 we're doing pretty well ;-)
>>>> On i.MX27 (400 MHz ARM926EJ-S) we currently need 7 s, measured from
>>>> power-on through the kernel up to "starting init". This is with
>>>>
>>>> - no delay in u-boot-v2
>>>> - rootfs on NAND (UBIFS)
>>>> - quiet
>>>> - precalculated loops-per-jiffy
>>>> - zImage kernel instead of uImage
>>> Here's a little video of our demo system booting:
>>> http://www.youtube.com/watch?v=xDbUnNsj0cI
>>>
>>> As you can see there, it needs about 15 s from the release of the reset button
>>> up to the moment where the application shows it's Qt 4.5.2 based GUI (which is
>>> when we fade over from the initial framebuffer to the final one, in order to
>>> hide the qt application startup noise).
>>>
>>> And below is the boot log (after turning "quiet" off again). The numbers are
>>> the timestamp and the delta to the last timestamp, measured on the controlling
>>> PC by looking at the serial console output. The ptx_ts script starts when the
>>> regexp was found, so the numbers start basically in the moment when u-boot-v2
>>> has initialized the system up to the point where we can see something.
>>>
>>> Result:
>>>
>>> - 2.4 s up from u-boot to the end of "Uncompressing Linux"
>>> - 300 ms until ubifs initialization starts
>>> - 3.7 s for ubifs, until "mounted root"
>>>
>>> So we basically have 7 s for the kernel. The rest is userspace, which hasn't
>>> seen much optimization yet, other than trying to start the GUI application as
>>> early as possible, while doing all other init stuff in parallel. Adding "quiet"
>>> brings us another 300 ms.
>>>
>>> That's factor 70 away from the 110 ms boot time Tim has talked about some days
>>> ago (and he measured on an ARM cpu which had almost half the speed of this
>>> one), and I'm wondering what we can do to improve the boot time.
>>>
>>> Robert
>>>
>>> rsc@thebe:~$ microcom | ptx_ts "U-Boot 2.0.0-rc9"
>>> [ 13.522625] <  0.043189>
>>> [ 13.546627] <  0.024002> OSELAS(R)-phyCORE-trunk (PTXdist-1.99.svn/2009-08-06T08:37:25+0200)
>>> [ 13.558613] <  0.011986>
>>> [ 13.690643] <  0.132030>        _            ____ ___  ____  _____
>>> [ 13.690731] <  0.000088>  _ __ | |__  _   _ / ___/ _ \|  _ \| ____|
>>> [ 13.698595] <  0.007864> | '_ \| '_ \| | | | |  | | | | |_) |  _|
>>> [ 13.698654] <  0.000059> | |_) | | | | |_| | |__| |_| |  _ <| |___
>>> [ 13.702581] <  0.003927> | .__/|_| |_|\__, |\____\___/|_| \_\_____|
>>> [ 13.706573] <  0.003992> |_|          |___/
>>> [ 13.706622] <  0.000049>
>>> [ 13.725043] <  0.018421>
>>> [ 14.742608] <  1.017565>
>>
>> I made some changes suggested in this thread:
>>
>> - enable MMU in the bootloader
>> - use assembler optimized memcpy/memset in the bootloader
>> - start an uncompressed image
>> - disable IP autoconfiguration in the Kernel
>> - use lpj= command line parameter
>> - use static device nodes instead of udev
>> - skip some init scripts
>> - made the kernel smaller (I do not have both configs handy, so I do not
>>   know what exactly I changed)
>>
>> Already looks much better:
>>
>> [  0.000005] <  0.000005> U-Boot 2.0.0-rc10-00241-g3f10fe9-dirty (Aug 18 2009 - 13:29:25)
>> [  0.000026] <  0.000021>
>> [  0.000041] <  0.000015> Board: Phytec phyCORE-i.MX27
>> [  0.000054] <  0.000013> cfi_probe: cfi_flash base: 0xc0000000 size: 0x02000000
>> [  0.000067] <  0.000013> NAND device: Manufacturer ID: 0x20, Chip ID: 0x36 (ST Micro NAND 64MiB 1,8V 8-bit)
>> [  0.000080] <  0.000013> imxfb@imxfb0: i.MX Framebuffer driver
>> [  0.000092] <  0.000012> dma_alloc: 0xa6f56e40 0x10000000
>> [  0.000105] <  0.000013> dma_alloc: 0xa6f57088 0x10000000
>> [  0.000118] <  0.000013> dev_protect: currently broken
>> [  0.000129] <  0.000011> Using environment in NOR Flash
>> [  0.000141] <  0.000012> initialising PLLs
>> [  0.128972] <  0.128831> Malloc space: 0xa6f00000 -> 0xa7f00000 (size 16 MB)
>> [  0.128995] <  0.000023> Stack space : 0xa6ef8000 -> 0xa6f00000 (size 32 kB)
>> [  0.129008] <  0.000013> running /env/bin/init...
>> [  0.224963] <  0.095955>
>> [  0.224984] <  0.000021> Hit any key to stop autoboot:  0
>> [  0.224999] <  0.000015> copy
>> [  0.592964] <  0.367965> done
>> [  0.652010] <  0.059046> Linux version 2.6.31-rc4-00004-g05786f8-dirty (sha@octopus) (gcc version 4.3.2 (OSELAS.Toolchain-1.99.3) ) #206 PREEMPT Tue Aug 18 14:08:51 CEST 2009
>
> So, this are ~0.6 s in boot loader and kernel copy until kernel starts, 
> correct?

Yes, correct. The copying itself is between 'copy' and 'done' so it
takes about 0.4s.

>
> What's the size of the uncompressed kernel copied here?

The image is about 2.8 MB, but I copied the whole 3 MB partition
because the size of a raw image can't be detected.
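
As a back-of-the-envelope check, these numbers give the effective copy
throughput and an estimate of what copying only the image would save. This
is just arithmetic on the figures above; it assumes MB means MiB and that
the copy time scales linearly with size:

```python
# Figures taken from the log and the reply above; MB is read as MiB (assumption).
partition_mib = 3.0                  # whole kernel partition that was copied
image_mib = 2.8                      # approximate size of the uncompressed image
copy_seconds = 0.592964 - 0.224999   # 'done' timestamp minus 'copy' timestamp

throughput = partition_mib / copy_seconds          # MiB/s
saved = (partition_mib - image_mib) / throughput   # seconds, assuming linear scaling

print(f"throughput: {throughput:.1f} MiB/s")
print(f"copying only the image would save about {saved * 1000:.0f} ms")
```

That works out to roughly 8 MiB/s, so trimming the copy to the real image
size would only gain a few tens of milliseconds.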

>
> Btw.: I tried to summarize some hints given in this thread in
>
> http://elinux.org/Boot_Time#Boot_time_check_list

Nice work!

Regards
  Sascha

-- 
Pengutronix e.K.                           |                             |
Industrial Linux Solutions                 | http://www.pengutronix.de/  |
Peiner Str. 6-8, 31137 Hildesheim, Germany | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |


* Re: new ipdelay= option for faster netboot
  2009-08-18  1:40               ` Tim Bird
  2009-08-18  1:56                 ` David Miller
@ 2009-08-19 11:57                 ` Jamie Lokier
  1 sibling, 0 replies; 43+ messages in thread
From: Jamie Lokier @ 2009-08-19 11:57 UTC (permalink / raw)
  To: Tim Bird
  Cc: David Miller, r.schwebel, vda.linux, linux-kernel,
	linux-embedded, arjan, kernel, netdev

Tim Bird wrote:
> David Miller wrote:
> > From: Tim Bird <tim.bird@am.sony.com>
> > Date: Mon, 17 Aug 2009 18:24:26 -0700
> >
> >> David Miller wrote:
> >>> I have card/switch combinations that take up to 10 seconds to
> >>> negotiate a proper link.
> >> What types of delays are these timeouts supposed to
> >> cover?
> >
> > The problem is that if you don't first give at least some time for the
> > link to come up, the remaining time it takes the link to come up will
> > end up chewing into the actual bootp/dhcp protocol timeouts.  And
> > that's what we're trying to avoid.
> 
> What link?  I'm not that familiar with networking.
> 
> Assuming I'm using ethernet, what link needs to come up?

When you plug an ethernet cable in, you may have noticed it takes a
short time before the signal light comes on.  That's negotiation time.
Some are slower than others, but none of them do it instantly.

> Is this something to do with power propagation to the
> physical wire?

Not really.

> Is there some MAC layer negotiation between the card and the switch?
> Is it the time for the switch to do speed detection?

Yes and yes.

> And, can any of this be more accurately determined
> or guessed-at with knowledge of the onboard hardware?
> Or is it dependent on external conditions?

It can be accurately determined with most cards (all modern ones)
because you get a notification when it's done, or you can poll the card.

That's why on the desktop it's able to detect when you plug in an
ethernet cable and start DHCP as soon as link negotiation is complete.

So the right thing to do, as David Miller suggested too, isn't a fixed
timeout.  It should wait for link state UP and then start DHCP
immediately.
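
To illustrate the idea from userspace (this is only a sketch, not the kernel
ipconfig code the thread is about): on Linux the link state can be polled
through the sysfs "carrier" attribute of the interface. The function names,
the polling interval and the 10 s default timeout below are assumptions
chosen for the example.

```python
import time
from pathlib import Path

def read_carrier(iface):
    """Return True if sysfs reports carrier on iface (Linux only)."""
    try:
        return Path(f"/sys/class/net/{iface}/carrier").read_text().strip() == "1"
    except OSError:
        # Interface missing, or administratively down (the read fails then).
        return False

def wait_for_link(iface, timeout=10.0, poll=0.05, carrier=read_carrier):
    """Poll until the link is up; return seconds waited, or None on timeout.

    'carrier' is injectable so the loop can be exercised without hardware.
    """
    start = time.monotonic()
    while True:
        if carrier(iface):
            return time.monotonic() - start
        if time.monotonic() - start >= timeout:
            return None
        time.sleep(poll)

# Example use (hypothetical interface name):
#   waited = wait_for_link("eth0")
#   if waited is not None: start the DHCP client right away
```

The point is that the wait ends as soon as negotiation is done, so a fast
link costs almost nothing and a slow one is still covered by the timeout.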

-- Jamie


* Re: New fast(?)-boot results on ARM
  2009-08-19  7:21     ` Sascha Hauer
@ 2009-08-19 16:20       ` Dirk Behme
  2009-08-20  8:57         ` Sascha Hauer
  0 siblings, 1 reply; 43+ messages in thread
From: Dirk Behme @ 2009-08-19 16:20 UTC (permalink / raw)
  To: Sascha Hauer
  Cc: Robert Schwebel, linux-kernel, linux-embedded, Arjan van de Ven,
	Tim Bird, kernel

Sascha Hauer wrote:
> On Tue, Aug 18, 2009 at 05:31:42PM +0200, Dirk Behme wrote:
>> Sascha Hauer wrote:
>>> On Fri, Aug 14, 2009 at 07:02:28PM +0200, Robert Schwebel wrote:
>>>> Hi,
>>>>
>>>> On Thu, Aug 13, 2009 at 05:33:26PM +0200, Robert Schwebel wrote:
>>>>> On Thu, Aug 13, 2009 at 08:28:26AM -0700, Arjan van de Ven wrote:
>>>>>>> That's bad :-) So there is no room for improvement any more in our
>>>>>>> ARM boot sequences ...
>>>>>> on x86 we're doing pretty well ;-)
>>>>> On i.MX27 (400 MHz ARM926EJ-S) we currently need 7 s, measured from
>>>>> power-on through the kernel up to "starting init". This is with
>>>>>
>>>>> - no delay in u-boot-v2
>>>>> - rootfs on NAND (UBIFS)
>>>>> - quiet
>>>>> - precalculated loops-per-jiffy
>>>>> - zImage kernel instead of uImage
>>>> Here's a little video of our demo system booting:
>>>> http://www.youtube.com/watch?v=xDbUnNsj0cI
>>>>
>>>> As you can see there, it needs about 15 s from the release of the reset button
>>>> up to the moment where the application shows it's Qt 4.5.2 based GUI (which is
>>>> when we fade over from the initial framebuffer to the final one, in order to
>>>> hide the qt application startup noise).
>>>>
>>>> And below is the boot log (after turning "quiet" off again). The numbers are
>>>> the timestamp and the delta to the last timestamp, measured on the controlling
>>>> PC by looking at the serial console output. The ptx_ts script starts when the
>>>> regexp was found, so the numbers start basically in the moment when u-boot-v2
>>>> has initialized the system up to the point where we can see something.
>>>>
>>>> Result:
>>>>
>>>> - 2.4 s up from u-boot to the end of "Uncompressing Linux"
>>>> - 300 ms until ubifs initialization starts
>>>> - 3.7 s for ubifs, until "mounted root"
>>>>
>>>> So we basically have 7 s for the kernel. The rest is userspace, which hasn't
>>>> seen much optimization yet, other than trying to start the GUI application as
>>>> early as possible, while doing all other init stuff in parallel. Adding "quiet"
>>>> brings us another 300 ms.
>>>>
>>>> That's factor 70 away from the 110 ms boot time Tim has talked about some days
>>>> ago (and he measured on an ARM cpu which had almost half the speed of this
>>>> one), and I'm wondering what we can do to improve the boot time.
>>>>
>>>> Robert
>>>>
>>>> rsc@thebe:~$ microcom | ptx_ts "U-Boot 2.0.0-rc9"
>>>> [ 13.522625] <  0.043189>
>>>> [ 13.546627] <  0.024002> OSELAS(R)-phyCORE-trunk (PTXdist-1.99.svn/2009-08-06T08:37:25+0200)
>>>> [ 13.558613] <  0.011986>
>>>> [ 13.690643] <  0.132030>        _            ____ ___  ____  _____
>>>> [ 13.690731] <  0.000088>  _ __ | |__  _   _ / ___/ _ \|  _ \| ____|
>>>> [ 13.698595] <  0.007864> | '_ \| '_ \| | | | |  | | | | |_) |  _|
>>>> [ 13.698654] <  0.000059> | |_) | | | | |_| | |__| |_| |  _ <| |___
>>>> [ 13.702581] <  0.003927> | .__/|_| |_|\__, |\____\___/|_| \_\_____|
>>>> [ 13.706573] <  0.003992> |_|          |___/
>>>> [ 13.706622] <  0.000049>
>>>> [ 13.725043] <  0.018421>
>>>> [ 14.742608] <  1.017565>
>>> I made some changes suggested in this thread:
>>>
>>> - enable MMU in the bootloader
>>> - use assembler optimized memcpy/memset in the bootloader
>>> - start an uncompressed image
>>> - disable IP autoconfiguration in the Kernel
>>> - use lpj= command line parameter
>>> - use static device nodes instead of udev
>>> - skip some init scripts
>>> - made the kernel smaller (I do not have both configs handy, so I do not
>>>   know what exactly I changed)
>>>
>>> Already looks much better:
>>>
>>> [  0.000005] <  0.000005> U-Boot 2.0.0-rc10-00241-g3f10fe9-dirty (Aug 18 2009 - 13:29:25)
>>> [  0.000026] <  0.000021>
>>> [  0.000041] <  0.000015> Board: Phytec phyCORE-i.MX27
>>> [  0.000054] <  0.000013> cfi_probe: cfi_flash base: 0xc0000000 size: 0x02000000
>>> [  0.000067] <  0.000013> NAND device: Manufacturer ID: 0x20, Chip ID: 0x36 (ST Micro NAND 64MiB 1,8V 8-bit)
>>> [  0.000080] <  0.000013> imxfb@imxfb0: i.MX Framebuffer driver
>>> [  0.000092] <  0.000012> dma_alloc: 0xa6f56e40 0x10000000
>>> [  0.000105] <  0.000013> dma_alloc: 0xa6f57088 0x10000000
>>> [  0.000118] <  0.000013> dev_protect: currently broken
>>> [  0.000129] <  0.000011> Using environment in NOR Flash
>>> [  0.000141] <  0.000012> initialising PLLs
>>> [  0.128972] <  0.128831> Malloc space: 0xa6f00000 -> 0xa7f00000 (size 16 MB)
>>> [  0.128995] <  0.000023> Stack space : 0xa6ef8000 -> 0xa6f00000 (size 32 kB)
>>> [  0.129008] <  0.000013> running /env/bin/init...
>>> [  0.224963] <  0.095955>
>>> [  0.224984] <  0.000021> Hit any key to stop autoboot:  0
>>> [  0.224999] <  0.000015> copy
>>> [  0.592964] <  0.367965> done
>>> [  0.652010] <  0.059046> Linux version 2.6.31-rc4-00004-g05786f8-dirty (sha@octopus) (gcc version 4.3.2 (OSELAS.Toolchain-1.99.3) ) #206 PREEMPT Tue Aug 18 14:08:51 CEST 2009
>> So, these are ~0.6 s in boot loader and kernel copy until kernel starts,
>> correct?
> 
> Yes, correct. The copying itself is between 'copy' and 'done' so it
> takes about 0.4s.
> 
>> What's the size of the uncompressed kernel copied here?
> 
> The image is about 2.8MB, but I copied the whole partition of 3MB
> because with raw images you can't detect the image size.

With 3MB copied in ~0.4s you get ~8MB/s. This really depends on your 
HW, but I would think with standard NOR flashes you should be able to 
do at least two (three?) times better. Have you already checked the 
memory (NOR flash) timings configured in your SoC?
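
The estimate above can be checked with a few lines of Python. This is only a sketch of the arithmetic: it assumes "MB" means MiB, and it takes the 0.367965 s delta from the 'done' line of the quoted log as the copy time. The helper name throughput_mib_s is made up for illustration.

```python
MIB = 1024 * 1024


def throughput_mib_s(num_bytes, seconds):
    """Return the copy rate in MiB/s for num_bytes moved in the given time."""
    return num_bytes / MIB / seconds


copied = 3 * MIB  # the whole 3 MB partition was copied, not just the 2.8 MB image

rough = throughput_mib_s(copied, 0.4)        # rounded copy time from the thread
logged = throughput_mib_s(copied, 0.367965)  # delta on the 'done' line of the log

print(f"{rough:.1f} MiB/s, {logged:.1f} MiB/s")  # prints "7.5 MiB/s, 8.2 MiB/s"
```

Both values are consistent with the ~8MB/s figure.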

See the second topic of

http://elinux.org/Boot_Time#Boot_time_check_list

too ;)

Best regards

Dirk


* Re: New fast(?)-boot results on ARM
  2009-08-19 16:20       ` Dirk Behme
@ 2009-08-20  8:57         ` Sascha Hauer
  0 siblings, 0 replies; 43+ messages in thread
From: Sascha Hauer @ 2009-08-20  8:57 UTC (permalink / raw)
  To: Dirk Behme
  Cc: Robert Schwebel, linux-kernel, linux-embedded, Arjan van de Ven,
	Tim Bird, kernel

On Wed, Aug 19, 2009 at 06:20:13PM +0200, Dirk Behme wrote:
>>
>> Yes, correct. The copying itself is between 'copy' and 'done' so it
>> takes about 0.4s.
>>
>>> What's the size of the uncompressed kernel copied here?
>>
>> The image is about 2.8MB, but I copied the whole partition of 3MB
>> because with raw images you can't detect the image size.
>
> With 3MB copied in ~0.4s you get ~8MB/s. This really depends on your HW, 
> but I would think with standard NOR flashes you should be able to do at 
> least two (three?) times better. Have you already checked the memory (NOR 
> flash) timings configured in your SoC?

It's NAND flash, so there's not much timing to optimize. What's
interesting about this is that the kernel NAND driver is much slower
than the one in U-Boot. Looking at it, it turned out that the kernel
driver uses interrupts to wait for the controller to get ready.
Switching this to polling nearly doubles the NAND performance. UBI
mounts much faster now and this cuts off another few seconds from the
boot process :)

Sascha


-- 
Pengutronix e.K.                           |                             |
Industrial Linux Solutions                 | http://www.pengutronix.de/  |
Peiner Str. 6-8, 31137 Hildesheim, Germany | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |


* Re: New fast(?)-boot results on ARM
  2009-08-15 10:35     ` Johannes Stezenbach
  2009-08-18 10:06       ` Marco Stornelli
@ 2009-09-04 16:16       ` Wolfram Sang
  2009-09-09 14:33         ` Johannes Stezenbach
  1 sibling, 1 reply; 43+ messages in thread
From: Wolfram Sang @ 2009-09-04 16:16 UTC (permalink / raw)
  To: Johannes Stezenbach
  Cc: Robert Schwebel, Denys Vlasenko, linux-kernel, linux-embedded,
	Arjan van de Ven, Tim Bird, kernel


> Now that microcom is in Debian sid (thanks!), where can I find ptx_ts?
> It seems to be quite useful.

Back from the holidays, so here it is:

http://pengutronix.de/software/ptx_ts/index_en.html

Hope it can be useful...
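
For readers who only want the idea behind such a tool, here is a minimal Python sketch of a timestamping filter. It is not the actual ptx_ts code. The function name stamp_lines, the injectable clock parameter, and the choice to zero the clock at the first matching line are all assumptions made for illustration. The output format imitates the "[timestamp] <delta>" prefix seen in the logs in this thread.

```python
import re
import time


def stamp_lines(lines, start_pattern, clock=time.monotonic):
    """Yield lines prefixed with '[elapsed] <delta>' once start_pattern is seen.

    Lines arriving before the first match are dropped. Times are taken on the
    host when each line is read, not on the target.
    """
    start_re = re.compile(start_pattern)
    t0 = None    # host time of the first matching line
    last = None  # host time of the previously emitted line
    for line in lines:
        now = clock()
        if t0 is None:
            if not start_re.search(line):
                continue  # still waiting for the trigger line
            t0 = last = now
        yield f"[{now - t0:10.6f}] <{now - last:10.6f}> {line.rstrip()}"
        last = now
```

To use it as a filter behind a serial terminal program, one could pass sys.stdin as lines and the regexp from the command line, then print each yielded string with flush=True so the output is not held back by buffering.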

Regards,

   Wolfram

-- 
Pengutronix e.K.                           | Wolfram Sang                |
Industrial Linux Solutions                 | http://www.pengutronix.de/  |



* Re: New fast(?)-boot results on ARM
  2009-09-04 16:16       ` Wolfram Sang
@ 2009-09-09 14:33         ` Johannes Stezenbach
  2009-09-10  0:03           ` Denys Vlasenko
  0 siblings, 1 reply; 43+ messages in thread
From: Johannes Stezenbach @ 2009-09-09 14:33 UTC (permalink / raw)
  To: Wolfram Sang
  Cc: Robert Schwebel, Denys Vlasenko, linux-kernel, linux-embedded,
	Arjan van de Ven, Tim Bird, kernel

Sorry for slow reply.

On Fri, Sep 04, 2009 at 06:16:26PM +0200, Wolfram Sang wrote:
> 
> > Now that microcom is in Debian sid (thanks!), where can I find ptx_ts?
> > It seems to be quite useful.
> 
> Back from the holidays, so here it is:
> 
> http://pengutronix.de/software/ptx_ts/index_en.html
> 
> Hope it can be useful...

Yes, it is.  Thanks!

BTW, some feedback about microcom:

- the choice of ^\ as an escape character is unfortunate since that
  is usually mapped to send SIGQUIT in the tty; a better choice would
  be ^] (like telnet) or ^A (like minicom)
- typing the escape character immediately causes the menu to be displayed,
  so one cannot send a break sequence for SysRq without cluttering up the screen
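
To illustrate the first point, here is a small Python sketch that computes the byte each candidate escape key sends and, when run on a terminal, reads the tty's configured QUIT character. The helper names ctrl and quit_char are made up for this example. The termios module is Unix-only, and the QUIT character is assumed to be ^\ (0x1c) only where the tty settings have not been changed.

```python
import os
import termios


def ctrl(ch):
    """Return the control code produced by Ctrl plus the given key."""
    return ord(ch.upper()) & 0x1F


# Candidate escape characters mentioned above and the bytes they send.
candidates = {"^\\": ctrl("\\"), "^]": ctrl("]"), "^A": ctrl("A")}


def quit_char(fd):
    """Return the tty's QUIT character code, or None if fd is not a terminal."""
    if not os.isatty(fd):
        return None
    # Index 6 of the tcgetattr() result is the list of control characters.
    return termios.tcgetattr(fd)[6][termios.VQUIT][0]
```

If quit_char(0) returns the same value as candidates["^\\"], the tty's QUIT character and the escape character are the same byte.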

Would you take patches for that?


Johannes


* Re: New fast(?)-boot results on ARM
  2009-09-09 14:33         ` Johannes Stezenbach
@ 2009-09-10  0:03           ` Denys Vlasenko
  0 siblings, 0 replies; 43+ messages in thread
From: Denys Vlasenko @ 2009-09-10  0:03 UTC (permalink / raw)
  To: Johannes Stezenbach
  Cc: Wolfram Sang, Robert Schwebel, linux-kernel, linux-embedded,
	Arjan van de Ven, Tim Bird, kernel

On Wednesday 09 September 2009 16:33, Johannes Stezenbach wrote:
> Sorry for slow reply.
> 
> On Fri, Sep 04, 2009 at 06:16:26PM +0200, Wolfram Sang wrote:
> > 
> > > Now that microcom is in Debian sid (thanks!), where can I find ptx_ts?
> > > It seems to be quite useful.
> > 
> > Back from the holidays, so here it is:
> > 
> > http://pengutronix.de/software/ptx_ts/index_en.html
> > 
> > Hope it can be useful...
> 
> Yes, it is.  Thanks!
> 
> BTW, some feedback about microcom:
> 
> - the choice of ^\ as an escape character is unfortunate since that
>   is usually mapped to send SIGQUIT in the tty; a better choice would
>   be ^] (like telnet) or ^A (like minicom)
> - typing the escape character immediately causes the menu to be displayed,
>   so one cannot send a break sequence for SysRq without cluttering up the screen
> 
> Would you take patches for that?

Sure.
--
vda


end of thread, other threads:[~2009-09-10  0:01 UTC | newest]

Thread overview: 43+ messages
2009-08-14 17:02 New fast(?)-boot results on ARM Robert Schwebel
2009-08-14 18:19 ` Zan Lynx
2009-08-14 18:46   ` Jamie Lokier
2009-08-14 18:58     ` Robert Schwebel
2009-08-14 18:57   ` Robert Schwebel
2009-08-14 21:01     ` Linus Walleij
2009-08-14 21:15       ` Robert Schwebel
2009-08-14 21:35       ` Zan Lynx
2009-08-15  6:21         ` Artem Bityutskiy
2009-08-14 20:04 ` Denys Vlasenko
2009-08-14 20:43   ` Robert Schwebel
2009-08-15  5:59     ` Dirk Behme
2009-08-15 10:35     ` Johannes Stezenbach
2009-08-18 10:06       ` Marco Stornelli
2009-08-18 10:21         ` Robert Schwebel
2009-08-18 10:34           ` Alex Riesen
2009-08-18 10:44             ` Robert Schwebel
2009-08-18 10:48               ` Alex Riesen
2009-08-18 10:53                 ` Robert Schwebel
2009-09-04 16:16       ` Wolfram Sang
2009-09-09 14:33         ` Johannes Stezenbach
2009-09-10  0:03           ` Denys Vlasenko
2009-08-17 19:15     ` Tim Bird
2009-08-17 22:35       ` new ipdelay= option for faster netboot (was Re: New fast(?)-boot results on ARM) Tim Bird
2009-08-18  1:03         ` new ipdelay= option for faster netboot David Miller
2009-08-18  1:24           ` Tim Bird
2009-08-18  1:27             ` David Miller
2009-08-18  1:40               ` Tim Bird
2009-08-18  1:56                 ` David Miller
2009-08-19 11:57                 ` Jamie Lokier
2009-08-18  4:56               ` Denys Vlasenko
2009-08-18  5:00                 ` David Miller
2009-08-18  1:31           ` Rick Jones
2009-08-18  2:45             ` david
2009-08-18  4:56               ` Willy Tarreau
2009-08-15  6:14   ` New fast(?)-boot results on ARM Artem Bityutskiy
2009-08-18 14:06 ` Sascha Hauer
2009-08-18 15:31   ` Dirk Behme
2009-08-18 16:34     ` Marco Stornelli
2009-08-18 18:23     ` Tim Bird
2009-08-19  7:21     ` Sascha Hauer
2009-08-19 16:20       ` Dirk Behme
2009-08-20  8:57         ` Sascha Hauer
