* 2.6.18-mm2
From: Andrew Morton @ 2006-09-28  8:46 UTC (permalink / raw)
  To: linux-kernel


ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.18/2.6.18-mm2/


- Added the SuperH architecture git tree to the -mm lineup as git-sh.patch
  (Paul Mundt)

- Added the SuperH64 architecture git tree to the -mm lineup as git-sh64.patch
  (Paul Mundt)

- Added the PCI-Domain support tree to the -mm lineup as git-pciseg.patch
  (Jeff Garzik)

- The git-input tree has been temporarily dropped due to various USB mouse
  related failures.

- More updates to the MSI code.  If your machine supports Message Signalled
  Interrupts, please enable MSI and give it a try.

- The reboot command doesn't work if you're using netconsole-over-e100.




Boilerplate:

- See the `hot-fixes' directory for any important updates to this patchset.

- To fetch an -mm tree using git, use (for example)

  git fetch git://git.kernel.org/pub/scm/linux/kernel/git/smurf/linux-trees.git v2.6.16-rc2-mm1

- -mm kernel commit activity can be reviewed by subscribing to the
  mm-commits mailing list.

        echo "subscribe mm-commits" | mail majordomo@vger.kernel.org

- If you hit a bug in -mm and it is not obvious which patch caused it, it is
  most valuable if you can perform a bisection search to identify which patch
  introduced the bug.  Instructions for this process are at

        http://www.zip.com.au/~akpm/linux/patches/stuff/bisecting-mm-trees.txt

  But beware that this process takes some time (around ten rebuilds and
  reboots), so consider reporting the bug first.  If we cannot immediately
  identify the faulty patch, then perform the bisection search.

- When reporting bugs, please try to Cc: the relevant maintainer and mailing
  list on any email.

- When reporting bugs in this kernel via email, please also rewrite the
  email Subject: in some manner to reflect the nature of the bug.  Some
  developers filter by Subject: when looking for messages to read.

- Semi-daily snapshots of the -mm lineup are uploaded to
  ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/mm/ and are announced on
  the mm-commits list.
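
The bisection search mentioned above can be sketched roughly as follows.
This is only an illustration in Python, not the procedure described in
bisecting-mm-trees.txt.  The function name find_first_bad, the is_bad
callback and the made-up patch names are all assumptions; in practice each
is_bad call stands for one manual apply/rebuild/reboot/test cycle.  The
sketch assumes that the tree with no patches applied is good, that the full
series is bad, and that the bug does not come and go along the series.

```python
def find_first_bad(patches, is_bad):
    """Binary-search an ordered patch series for the patch that introduced a bug.

    patches: list of patch names in series order (hypothetical names here).
    is_bad(n): returns True if a tree with the first n patches applied shows
               the bug.  Assumed: is_bad(0) is False, is_bad(len(patches)) is True.
    Returns (name of the first bad patch, number of test builds needed).
    """
    lo, hi = 0, len(patches)  # invariant: first lo patches -> good, first hi -> bad
    builds = 0
    while hi - lo > 1:
        mid = (lo + hi) // 2
        builds += 1            # one rebuild-and-reboot cycle per probe
        if is_bad(mid):
            hi = mid           # bug already present with the first mid patches
        else:
            lo = mid           # bug introduced somewhere after patch mid
    return patches[hi - 1], builds


if __name__ == "__main__":
    # Hypothetical series of 1000 patches; the 637th one (index 636) is the culprit.
    series = [f"patch-{i:04d}.patch" for i in range(1000)]
    culprit, builds = find_first_bad(series, lambda n: n > 636)
    print(culprit, builds)
```

Each probe halves the range of suspect patches, so a hypothetical series of
a thousand patches needs at most ten test builds, which is in line with the
"around ten rebuilds and reboots" estimate above.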




Changes since 2.6.18-mm1:


 origin.patch
 git-acpi.patch
 git-agpgart.patch
 git-arm.patch
 git-block.patch
 git-cifs.patch
 git-cpufreq.patch
 git-drm.patch
 git-dvb.patch
 git-geode.patch
 git-gfs2.patch
 git-ia64.patch
 git-ieee1394.patch
 git-intelfb.patch
 git-jfs.patch
 git-libata-all.patch
 git-lxdialog.patch
 git-mmc.patch
 git-mtd.patch
 git-netdev-all.patch
 git-net.patch
 git-ocfs2.patch
 git-parisc.patch
 git-pcmcia.patch
 git-powerpc.patch
 git-serial.patch
 git-pciseg.patch
 git-s390.patch
 git-scsi-misc.patch
 git-block-vs-git-sas.patch
 git-scsi-target.patch
 git-watchdog.patch

 git trees

+__percpu_alloc_mask-has-to-be-__always_inline-in-up-case.patch
+sys_getcpu-prototype-annotated.patch
+remove-generic__raw_read_trylock.patch
+jbd-memory-leak-in-journal_init_dev.patch

 Queued for 2.6.19-rc1.

-autofs4-zero-timeout-prevents-shutdown.patch
-rtc-lockdep-fix-workaround.patch
-i386-bootioremap--kexec-fix.patch
-do-not-free-non-slab-allocated-per_cpu_pageset.patch
-vidioc_enumstd-bug.patch
-backlight-fix-oops-in-__mutex_lock_slowpath-during-head-sys-class-graphics-fb0.patch
-cpu-to-node-relationship-fixup-take2.patch
-cpu-to-node-relationship-fixup-map-cpu-to-node.patch
-i386-fix-flat-mode-numa-on-a-real-numa-system.patch
-load_module-no-bug-if-module_subsys-uninitialized.patch
-fix-longstanding-load-balancing-bug-in-the-scheduler.patch
-trigger-a-syntax-error-if-percpu-macros-are-incorrectly-used.patch
-allow-file-systems-to-manually-d_move-inside-of-rename.patch
-jbd-fix-commit-of-ordered-data-buffers.patch
-update-to-the-kernel-kmap-kunmap-api.patch
-acpi-mwait-c-state-fixes.patch
-kthread-switch-arch-arm-kernel-apmc.patch
-gregkh-driver-documentation-abi-devfs-is-not-obsolete-but-removed.patch
-gregkh-driver-deprecate-physdev-keys.patch
-gregkh-driver-class_device_create-make-fmt-argument-const-char.patch
-gregkh-driver-device_create-make-fmt-argument-const-char.patch
-gregkh-driver-driver-core-add-const-to-class_create.patch
-gregkh-driver-sysfs-make-poll-behaviour-consistent.patch
-gregkh-driver-debugfs-kernel-doc-fixes-for-debugfs.patch
-gregkh-driver-sysfs_symlink_in_root.patch
-gregkh-driver-suspend-infrastructure-cleanup-and-extension.patch
-gregkh-driver-suspend-pci.patch
-gregkh-driver-make-suspend-quieter.patch
-gregkh-driver-fix-broken-dubious-driver-suspend-methods.patch
-gregkh-driver-pm-define-pm_event_prethaw.patch
-gregkh-driver-pm-pci-and-ide-handle-pm_event_prethaw.patch
-gregkh-driver-pm-video-drivers-and-pm_event_prethaw.patch
-gregkh-driver-pm-usb-hcds-use-pm_event_prethaw.patch
-gregkh-driver-pm-issue-pm_event_prethaw.patch
-gregkh-driver-pm-update-docs-for-writing-...-power-state.patch
-gregkh-driver-pm-add-kconfig-option-for-deprecated-...-power-state-files.patch
-gregkh-driver-pm-schedule-sys-devices-...-power-state-for-removal.patch
-gregkh-driver-pm-no-suspend_prepare-phase.patch
-gregkh-driver-pm-add-sys-power-documentation-to-documentation-abi.patch
-gregkh-driver-pm-device_suspend-resume-may-sleep.patch
-gregkh-driver-pm-platform_bus-and-late_suspend-early_resume.patch
-gregkh-driver-device-groups.patch
-gregkh-driver-device-class-parent.patch
-gregkh-driver-device-class-attr.patch
-gregkh-driver-device_rename.patch
-gregkh-driver-device-virtual.patch
-gregkh-driver-class_device_interface.patch
-gregkh-driver-device_bin_file.patch
-gregkh-driver-kobject-must_check-fixes.patch
-gregkh-driver-sysfs_remove_bin_file-no-return-value-dump_stack-on-error.patch
-gregkh-driver-driver-core-fix-comments-in-drivers-base-power-resume.c.patch
-gregkh-driver-driver-core-fixed-add_bind_files-definition.patch
-gregkh-driver-add-__must_check-to-device-management-code.patch
-gregkh-driver-add-config_enable_must_check.patch
-gregkh-driver-v4l-dev2-handle-__must_check.patch
-gregkh-driver-drivers-base-platform-notify-needs-to-occur-before-drivers-attach-to-the-device.patch
-gregkh-driver-drivers-base-check-errors.patch
-gregkh-driver-sysfs-add-proper-sysfs_init-prototype.patch
-gregkh-driver-driver-multithread.patch
-gregkh-driver-pci-multithreaded-probe.patch
-gregkh-driver-driver-core-fix-potential-deadlock-in-driver-core.patch
-gregkh-driver-driver-core-remove-unneeded-routines-from-driver-core.patch
-gregkh-driver-driver-core-don-t-call-put-methods-while-holding-a-spinlock.patch
-scsi-device_reprobe-can-fail.patch
-gregkh-i2c-i2c-dev-cleanups.patch
-gregkh-i2c-i2c-dev-convert-array-to-list.patch
-gregkh-i2c-i2c-dev-drop-template-client.patch
-gregkh-i2c-i2c-dev-device.patch
-gregkh-i2c-i2c-__must_check-fixes.patch
-gregkh-i2c-i2c-__must_check-fixes-i2c-dev.patch
-gregkh-i2c-i2c-algo-sibyte-cleanups.patch
-gregkh-i2c-i2c-algo-sibyte-merge-in-i2c-sibyte.patch
-gregkh-i2c-i2c-sibyte-drop-kip-walker-address.patch
-gregkh-i2c-i2c-au1550-fix-timeout-problem.patch
-gregkh-i2c-i2c-au1550-add-smbus-functionality-flag.patch
-gregkh-i2c-i2c-au1550-add-au1200-support.patch
-gregkh-i2c-i2c-fix-copy-n-paste-in-subsystem-Kconfig.patch
-gregkh-i2c-i2c-matroxfb-c99-struct-init.patch
-gregkh-i2c-i2c-algo-bit-kill-mdelay.patch
-gregkh-i2c-i2c-bus-driver-for-TI-OMAP-boards.patch
-gregkh-i2c-i2c-isa-plan-for-removal.patch
-gregkh-i2c-i2c-stub-add-chip_addr-param.patch
-gregkh-i2c-i2c-dev-attach-detach-adapter-cleanups.patch
-gregkh-i2c-i2c-chips-__must_check-fixes.patch
-gregkh-i2c-i2c-isa-return-attach_adapter.patch
-gregkh-i2c-i2c-algo-bit-cleanups.patch
-gregkh-i2c-i2c-algo-pcf-kill-mdelay.patch
-gregkh-i2c-i2c-drop-useless-masking.patch
-gregkh-i2c-i2c-warn-on-failed-client-attach.patch
-gregkh-i2c-i2c-viapro-add-VT8251-VT8237A.patch
-gregkh-i2c-i2c-isa-restore-driver-owner.patch
-gregkh-i2c-i2c-constify-i2c_algorithm.patch
-gregkh-i2c-i2c-algos-constify-i2c_algorithm.patch
-gregkh-i2c-i2c-busses-constify-i2c_algorithm.patch
-gregkh-i2c-i2c-drop-slave-functions.patch
-i2c-mpc-fix-up-error-handling.patch
-ia64-kprobes-fixup-the-pagefault-exception-caused-by-probehandlers.patch
-stowaway-keyboard-support.patch
-stowaway-keyboard-support-update.patch
-wistron-fix-detection-of-special-buttons.patch
-fail-kernel-compilation-in-case-of-unresolved-symbols-v2.patch
-kerneldoc-error-on-ata_piixc.patch
-1-of-2-jmicron-driver.patch
-1-of-2-jmicron-driver-fix.patch
-2-of-2-jmicron-driver-plumbing-and-quirk.patch
-2-of-2-jmicron-driver-plumbing-and-quirk-cleanup.patch
-via-sata-oops-on-init.patch
-e1000-memory-leak-in-e1000_set_ringparam.patch
-drivers-net-acenicc-removal-of-old-code.patch
-drivers-net-tokenring-lanstreamerc-removal-of-old-code.patch
-drivers-net-tokenring-lanstreamerh-removal-of-old-code.patch
-drivers-net-typhoonc-removal-of-old-code.patch
-signedness-issue-in-drivers-net-phy-phy_devicec.patch
-fix-possible-null-ptr-deref-in-forcedeth.patch
-e1000-account-for-net_ip_align-when-calculating-bufsiz.patch
-net-ipv6-bh_lock_sock_nested-on-tcp_v6_rcv.patch
-via-ircc-fix-memory-leak.patch
-atm-he-fix-section-mismatch.patch
-add-netpoll-netconsole-support-to-vlan-devices.patch
-neighbourc-pneigh_get_next-skips-published-entry.patch
-nfs-replace-null-dentries-that-appear-in-readdirs-list-2.patch
-add-newline-to-nfs-dprintk.patch
-fs-nfs-make-code-static.patch
-gregkh-pci-resources-insert-identical-resources-above-existing-resources.patch
-gregkh-pci-msi-cleanup-existing-msi-quirks.patch
-gregkh-pci-msi-factorize-common-code-in-pci_msi_supported.patch
-gregkh-pci-msi-export-the-pci_bus_flags_no_msi-flag-in-sysfs.patch
-gregkh-pci-msi-rename-pci_cap_id_ht_irqconf-into-pci_cap_id_ht.patch
-gregkh-pci-msi-blacklist-pci-e-chipsets-depending-on-hypertransport-msi-capability.patch
-gregkh-pci-pcie-check-and-return-bus_register-errors.patch
-gregkh-pci-pci-express-aer-implemetation-aer-howto-document.patch
-gregkh-pci-pci-express-aer-implemetation-export-pcie_port_bus_type.patch
-gregkh-pci-pci-express-aer-implemetation-aer-core-and-aerdriver.patch
-gregkh-pci-pci-express-aer-implemetation-pcie_portdrv-error-handler.patch
-gregkh-pci-shpchp-must_check-fixes.patch
-gregkh-pci-pci-hotplug-must_check-fixes.patch
-gregkh-pci-pci-must_check-fixes.patch
-gregkh-pci-pci-multiprobe-sanitizer.patch
-gregkh-pci-pci-drivers-pci-hotplug-acpiphp_glue.c-make-a-function-static.patch
-gregkh-pci-pci-restore-pci-express-capability-registers-after-pm-event.patch
-gregkh-pci-pci-hotplug-cleanup-pcihp-skeleton-code.patch
-gregkh-pci-acpiphp-set-hpp-values-before-starting-devices.patch
-gregkh-pci-acpiphp-initialize-ioapics-before-starting-devices.patch
-gregkh-pci-acpiphp-do-not-initialize-existing-ioapics.patch
-gregkh-pci-pci-add-pci_stop_bus_device.patch
-gregkh-pci-acpiphp-stop-bus-device-before-acpi_bus_trim.patch
-gregkh-pci-acpiphp-disable-bridges.patch
-gregkh-pci-pci-assign-ioapic-resource-at-hotplug.patch
-gregkh-pci-acpiphp-add-support-for-ioapic-hot-remove.patch
-gregkh-pci-ia64-pci-dont-disable-irq-which-is-not-enabled.patch
-gregkh-pci-pciehp-fix-wrong-return-value.patch
-revert-scsi-improve-inquiry-printing.patch
-dc395x-fix-printk-format-warning.patch
-pci_module_init-conversion-in-scsi-subsys-2nd-try.patch
-megaraid-use-the-proper-type-to-hold-the-irq-number.patch
-drivers-scsi-dpt-dpti_i2oh-removal-of-old.patch
-drivers-scsi-gdthh-removal-of-old-scsi-code.patch
-drivers-scsi-nsp32h-removal-of-old-scsi-code.patch
-drivers-message-fusion-linux_compath-removal-of-old-code.patch
-signedness-issue-in-drivers-scsi-iprc.patch
-signedness-issue-in-drivers-scsi-osstc.patch
-bodge-scsi-misc-module-reference-count-checks-with-no-module_unload.patch
-scsi-remove-seagateh.patch
-scsi-seagate-scsi_cmnd-conversion.patch
-3w-xxxx-fix-ata-udma-upgrade-message-number.patch
-scsi-included-header-cleanup.patch
-gregkh-usb-usb-unusual_devs-entry-for-lacie-dvd-rw.patch
-gregkh-usb-usb-unusual_dev-entry-for-sony-p990i.patch
-gregkh-usb-usb-doc-patch-1.patch
-gregkh-usb-usb-doc-patch-2.patch
-gregkh-usb-usb-ohci-avoids-root-hub-timer-polling.patch
-gregkh-usb-usb-ohci-s3c2410.c-clock-now-usb-bus-host.patch
-gregkh-usb-usb-ohci-controller-support-for-pnx4008.patch
-gregkh-usb-usb-kill-usb-kconfig-warning.patch
-gregkh-usb-usb-move-linux-usb_otg.h-to-linux-usb-otg.h.patch
-gregkh-usb-usb-pxa2xx_udc-understands-gpio-based-vbus-sensing.patch
-gregkh-usb-usb-allow-compile-in-g_ether-fix-typo.patch
-gregkh-usb-usb-ark3116-add-tiocgserial-and-tiocsserial-ioctl-calls.patch
-gregkh-usb-usb-ark3116-formatting-cleanups.patch
-gregkh-usb-usb-make-usb_buffer_free-null-safe.patch
-gregkh-usb-usbcore-add-configuration_string-to-attribute-group.patch
-gregkh-usb-usb-add-driver-for-phidgetmotorcontrol.patch
-gregkh-usb-usb-put-phidgets-driver-in-a-sysfs-class.patch
-gregkh-usb-usb-phidgets-should-check-create_device_file-return-value.patch
-gregkh-usb-usbfs-private-mutex-for-open-release-and-remove.patch
-gregkh-usb-usbfs-detect-device-unregistration.patch
-gregkh-usb-usb-skeleton-don-t-submit-urbs-after-disconnection.patch
-gregkh-usb-usbcore-rename-usb_suspend_device-to-usb_port_suspend.patch
-gregkh-usb-usbcore-move-code-among-source-files.patch
-gregkh-usb-usbcore-add-usb_device_driver-definition.patch
-gregkh-usb-usbcore-make-usb_generic-a-usb_device_driver.patch
-gregkh-usb-usbcore-split-suspend-resume-for-device-and-interfaces.patch
-gregkh-usb-usbcore-resume-device-resume-recursion.patch
-gregkh-usb-usbcore-track-whether-interfaces-are-suspended.patch
-gregkh-usb-usbcore-set-device-and-power-states-properly.patch
-gregkh-usb-usbcore-fix-up-device-and-power-state-tests.patch
-gregkh-usb-usbcore-suspending-devices-with-no-driver.patch
-gregkh-usb-hub-driver-improve-use-of-ifdef.patch
-gregkh-usb-usb-usbtouchscreen-version-0.4.patch
-gregkh-usb-usb-pl2303-removes-unneeded-goto.patch
-gregkh-usb-usb-pl2303-remove-80-columns-limit-violations-in-pl2303-driver.patch
-gregkh-usb-usb-pl2303-cosmetic-changes-to-pl2303_buf_-clear-data_avail.patch
-gregkh-usb-usb-pl2303-reduce-number-of-prototypes.patch
-gregkh-usb-usb-pl2303-cosmetic-changes-to-quirk.patch
-gregkh-usb-usb-usbnet-add-unlink_rx_urbs-call-to-allow-for-jumbo-frames.patch
-gregkh-usb-usb-asix-add-ax88178-support-and-many-other-changes.patch
-gregkh-usb-usbnet-printk-format-warning.patch
-gregkh-usb-usb-ipaq-minor-ipaq_open-cleanup.patch
-gregkh-usb-usb-usbcore-get-rid-of-the-timer-in-usb_start_wait_urb.patch
-gregkh-usb-usb-wacom-tablet-driver-reorganization.patch
-gregkh-usb-usb-garmin_gps-support-for-new-generation-of-gps-receivers.patch
-gregkh-usb-usb-build-fixes-ohci-omap.patch
-gregkh-usb-usb-onetouch-handle-errors-from-input_register_device.patch
-gregkh-usb-usb-correct-locking-in-gadgetfs_disconnect.patch
-gregkh-usb-usb-fix-ep_config-to-return-correct-value.patch
-gregkh-usb-usb-gadgetfs-protect-ep_release-with-lock.patch
-gregkh-usb-usb-gmidi-new-usb-midi-gadget-class-driver.patch
-gregkh-usb-usb-make-file-operations-structs-in-drivers-usb-const.patch
-gregkh-usb-usb-making-the-kernel-wshadow-clean-usb-completion.patch
-gregkh-usb-usb-new-functions-to-check-endpoints-info.patch
-gregkh-usb-usb-usblp-use-usb_endpoint_-functions.patch
-gregkh-usb-usb-hub-use-usb_endpoint_-functions.patch
-gregkh-usb-usb-appletouch-use-usb_endpoint_-functions.patch
-gregkh-usb-usb-acecad-use-usb_endpoint_-functions.patch
-gregkh-usb-usb-ati_remote-use-usb_endpoint_-functions.patch
-gregkh-usb-usb-keyspan_remote-use-usb_endpoint_-functions.patch
-gregkh-usb-usb-powermate-use-usb_endpoint_-functions.patch
-gregkh-usb-usb-usb-serial-use-usb_endpoint_-functions.patch
-gregkh-usb-usb-usblcd-use-usb_endpoint_-functions.patch
-gregkh-usb-usb-ldusb-use-usb_endpoint_-functions.patch
-gregkh-usb-usb-net1080-inherent-pad-length.patch
-gregkh-usb-usb-add-poll-to-gadgetfs-s-endpoint-zero.patch
-gregkh-usb-usb-gadget-gadgetfs-dont-try-to-lock-before-free.patch
-gregkh-usb-usb-properly-unregister-reboot-notifier-in-case-of-failure-in-ehci-hcd.patch
-gregkh-usb-uhci-increase-resume-detect-off-delay.patch
-gregkh-usb-usbcore-make-hcd_endpoint_disable-wait-for-queue-to-drain.patch
-gregkh-usb-usbcore-khubd-and-busy-port-handling.patch
-gregkh-usb-usb-skeleton-small-update.patch
-gregkh-usb-usb-storage-add-rio-karma-eject-support.patch
-gregkh-usb-usb-deal-with-broken-config-descriptors.patch
-gregkh-usb-wusb-hub-recognizes-wusb-ports.patch
-gregkh-usb-wusb-handle-wusb-device-ep0-speed-settings.patch
-gregkh-usb-wusb-pretty-print-new-devices.patch
-gregkh-usb-usb-core-use-const-where-possible.patch
-gregkh-usb-usb-fix-signedness-issue-in-drivers-usb-gadget-ether.c.patch
-gregkh-usb-usb-fix-typo-in-drivers-usb-gadget-kconfig.patch
-gregkh-usb-usb-storage-fix-for-ufi-lun-detection.patch
-gregkh-usb-usbcore-help-drivers-to-change-device-configs.patch
-gregkh-usb-usb-turn-usb_resume_both-into-static-inline.patch
-gregkh-usb-usb-usb-hub-driver-improve-use-of-ifdef-fix.patch
-gregkh-usb-usb-remove-struct-usb_operations.patch
-gregkh-usb-usbcore-add-flag-for-whether-a-host-controller-uses-dma.patch
-gregkh-usb-usbcore-trim-down-usb_bus-structure.patch
-gregkh-usb-usbmon-don-t-call-mon_dmapeek-if-dma-isn-t-being-used.patch
-gregkh-usb-usb-ethernet-gadget-avoids-zlps-for-musb_hdrc.patch
-gregkh-usb-usb-ehci-whitespace-fixes.patch
-gregkh-usb-gadgetfs-patch-for-ep0out.patch
-gregkh-usb-usb-replace-kernel_thread-with-kthread_run-in-libusual.c.patch
-gregkh-usb-usb-usb-serial-gadget-smp-related-bug.patch
-gregkh-usb-usb-net2280-update-dma-buffer-allocation.patch
-gregkh-usb-usb-ohci-at91-two-one-liners.patch
-gregkh-usb-usb-usb-input-usbmouse.c-whitespace-cleanup.patch
-gregkh-usb-usb-ub-let-cdrecord-to-see-a-device-with-media-absent.patch
-gregkh-usb-usbcore-store-each-usb_device-s-level-in-the-tree.patch
-gregkh-usb-usbcore-add-autosuspend-autoresume-infrastructure.patch
-gregkh-usb-usbcore-non-hub-specific-uses-of-autosuspend.patch
-gregkh-usb-usbcore-remove-usb_suspend_root_hub.patch
-gregkh-usb-usb-fix-root-hub-resume-when-config_usb_suspend-is-not-set.patch
-gregkh-usb-usb-core-must_check.patch
-gregkh-usb-usb-misc-must_check.patch
-gregkh-usb-usb-atm-must_check.patch
-gregkh-usb-usb-class-must_check.patch
-gregkh-usb-usb-input-must_check.patch
-gregkh-usb-usb-host-must_check.patch
-gregkh-usb-usb-serial-must_check-fixes.patch
-gregkh-usb-cypress_m8-use-appropriate-urb-polling-interval.patch
-gregkh-usb-cypress_m8-use-usb_fill_int_urb-where-appropriate.patch
-gregkh-usb-cypress_m8-improve-control-endpoint-error-handling.patch
-gregkh-usb-cypress_m8-implement-graceful-failure-handling.patch
-gregkh-usb-add-aircable-usb-bluetooth-dongle-driver.patch
-gregkh-usb-aircable-fix-printk-format-warnings.patch
-gregkh-usb-usb-adutux-driver.patch
-gregkh-usb-usb-add-playstation-2-trance-vibrator-driver.patch
-gregkh-usb-usb-moschip-7840-usb-serial-driver.patch
-gregkh-usb-usb-serial-support-alcor-micro-corp.-usb-2.0-to-rs-232-through-pl2303-driver.patch
-gregkh-usb-usb-ftdi-elan-client-driver-for-elan-uxxx-adapters.patch
-gregkh-usb-usb-u132-hcd-host-controller-driver-for-elan-u132-adapter.patch
-gregkh-usb-usb-remove-unneeded-void-casts-in-core-files.patch
-gregkh-usb-usb-dealias-110-code.patch
-gregkh-usb-usb-ohci_usb-can-oops-on-shutdown.patch
-gregkh-usb-usb-force-root-hub-resume-after-power-loss.patch
-gregkh-usb-usb-ehci-update-via-workaround.patch
-gregkh-usb-usb-remove-otg-build-warning.patch
-gregkh-usb-airprime_major_update.patch
-gregkh-usb-usb-storage-add-rio-karma-eject-support-fix.patch
-fix-gregkh-usb-usbcore-add-autosuspend-autoresume-infrastructure.patch
-x86_64-mm-i386-up-generic-arch.patch
-x86_64-mm-temp-revert-arch-perfmon.patch
-x86_64-mm-add-performance-counter-reservation-framework-for-up-kernels.patch
-x86_64-mm-utilize-performance-counter-reservation-framework-in-oprofile.patch
-x86_64-mm-add-smp-support-on-x86_64-to-reservation-framework.patch
-x86_64-mm-add-smp-support-on-i386-to-reservation-framework.patch
-x86_64-mm-cleanup-nmi-interrupt-path.patch
-x86_64-mm-tif-restore-sigmask.patch
-x86_64-mm-add-ppoll-pselect.patch
-x86_64-mm-remove-un-set_nmi_callback-and-reserve-release_lapic_nmi-functions.patch
-x86_64-mm-add-abilty-to-enable-disable-nmi-watchdog-from-sysfs.patch
-x86_64-mm-add-abilty-to-enable-disable-nmi-watchdog-from-procfs-update.patch
-x86_64-mm-allow-users-to-force-a-panic-on-nmi.patch
-x86_64-mm-x86-clean-up-nmi-panic-messages.patch
-x86_64-mm-x86-nmi-watchdog-suspend.patch
-x86_64-mm-unknown-nmi-panic.patch
-x86_64-mm-make-functions-static.patch
-x86_64-mm-kdump-x86_64-nmi-event-notification-fix.patch
-x86_64-mm-kdump-i386-nmi-event-notification-fix.patch
-x86_64-mm-i386-enable-nmi-wdog.patch
-x86_64-mm-add-nmi-watchdog-support-for-new-intel-cpus.patch
-x86_64-mm-rdtscp-macros.patch
-x86_64-mm-init-rdtscp.patch
-x86_64-mm-getcpu-vsyscall.patch
-x86_64-mm-generic-getcpu-syscall.patch
-x86_64-mm-no-asm-smp.patch
-x86_64-mm-tif-flags-for-debug-regs-and-io-bitmap-in-ctxsw.patch
-x86_64-mm-hpet-cosmetics.patch
-x86_64-mm-a-few-trivial-spelling-and-grammar-fixes.patch
-x86_64-mm-randomize-check.patch
-x86_64-mm-i386-profile-pc.patch
-x86_64-mm-simplify-profile-pc.patch
-x86_64-mm-backtracer-docs.patch
-x86_64-mm-asm-alternative.patch
-x86_64-mm-rwlock-to-asm.patch
-x86_64-mm-i386-remove-const-rwlock.patch
-x86_64-mm-fix-align.patch
-x86_64-mm-i386-asm-alternative.patch
-x86_64-mm-i386-semaphore-to-asm.patch
-x86_64-mm-remove-thunk-cvs-id.patch
-x86_64-mm-tce-comment.patch
-x86_64-mm-remove-apic-ifdefs.patch
-x86_64-mm-remove-apic-mismatch.patch
-x86_64-mm-remove-focus-disabled-workaround.patch
-x86_64-mm-tlb-flush-cleanup.patch
-x86_64-mm-i386-tlbflush-fixes.patch
-x86_64-mm-entry-comments.patch
-x86_64-mm-remove-pirq.patch
-x86_64-mm-remove-mca-eisa.patch
-x86_64-mm-remove-pic-mode.patch
-x86_64-mm-remove-mpparse-checks.patch
-x86_64-mm-io-apic-access.patch
-x86_64-mm-i386-io-apic-access.patch
-x86_64-mm-aux_device_info-is-one-byte-long,-use-movb.patch
-x86_64-mm-remove-apic-renumbering.patch
-x86_64-mm-quirks-own-file.patch
-x86_64-mm-mp-bus-type-bitmap.patch
-x86_64-mm-remove-mpparse-wrapper.patch
-x86_64-mm-remove-acpi-externs-in-mpparse.patch
-x86_64-mm-mpparse-acpi-style.patch
-x86_64-mm-i386-mpparse-acpi-style.patch
-x86_64-mm-apic-build-bug-on.patch
-x86_64-mm-detect-cfi.patch
-x86_64-mm-kernel-asm-remove-cvs-id.patch
-x86_64-mm-initialize-end-of-memory-variables-as-early-as.patch
-x86_64-mm-remove-int_delivery_dest.patch
-x86_64-mm-i386-end-of-memory.patch
-x86_64-mm-kernel-stack-doc.patch
-x86_64-mm-calgary-rearrange-struct-iommu_table.patch
-x86_64-mm-calgary-consolidate-per-bus-data.patch
-x86_64-mm-calgary-break-out-of.patch
-x86_64-mm-calgary-fix-error-path-memleak-in.patch
-x86_64-mm-calgary-fix-reference-counting-of.patch
-x86_64-mm-calgary-init-one.patch
-x86_64-mm-calgary-save-a-bit-of-space-in-bus_info.patch
-x86_64-mm-i386-remove-lock-section.patch
-x86_64-mm-remove-lock-section.patch
-x86_64-mm-fix-is_at_popf-for-compat-tasks.patch
-x86_64-mm-spinlock-cleanup.patch
-x86_64-mm-i386-spinlock-cleanup.patch
-x86_64-mm-annotate-lib.patch
-x86_64-mm-fix-gdt-table-size-in-trampoline.s.patch
-x86_64-mm-remove-superflous-bug_ons-in-nommu-and-gart.patch
-x86_64-mm-remove-lock-prefix-from-is_at_popf-tests.patch
-x86_64-mm-early-cpu-identify.patch
-x86_64-mm-allow-early_param-and-identical-__setup-to-exist.patch
-x86_64-mm-i386-early-param.patch
-x86_64-mm-early-param.patch
-x86_64-mm-remove-early-lockdep.patch
-x86_64-mm-move-acpi-disabled.patch
-x86_64-mm-move-acpi-numa.patch
-x86_64-mm-move-e820map.patch
-x86_64-mm-vsyscall-sparse.patch
-x86_64-mm-fault-sparse.patch
-x86_64-mm-sys_ia32-sparse.patch
-x86_64-mm-aout-sparse.patch
-x86_64-mm-replace-local_save_flags+local_irq_disable-with.patch
-x86_64-mm-acpi-remove-extern.patch
-x86_64-mm-tf-iret.patch
-x86_64-mm-print-whether-config_iommu_debug-is.patch
-x86_64-mm-only-verify-the-allocation-bitmap-if.patch
-x86_64-mm-remove-tce_cache_blast_stress.patch
-x86_64-mm-eradicate-sole-remaining-80-chars.patch
-x86_64-mm-stacktrace-cleanup.patch
-x86_64-mm-lockdep-stacktrace-no-recursion.patch
-x86_64-mm-early-safe-smp-processor-id.patch
-x86_64-mm-early-unwind-init.patch
-x86_64-mm-stacktrace-unwinder.patch
-x86_64-mm-stacktrace-terminate.patch
-x86_64-mm-i386-stacktrace-unwinder.patch
-x86_64-mm-i386-stacktrace-terminate.patch
-x86_64-mm-i386-backtrace-ebp-fallback.patch
-x86_64-mm-lockdep-dont-force-framepointer.patch
-x86_64-mm-fix-dubious-segment-register-clear-in-cpu_init.patch
-x86_64-mm-dont-taint-up-k7s-running-smp-kernels..patch
-x86_64-mm-kprobes-error_code.patch
-x86_64-mm-monotonic-clock.patch
-x86_64-mm-improve-crash-dump-description.patch
-x86_64-mm-boot-param-bss.patch
-x86_64-mm-i386-fix-mpparse-warning.patch
-x86_64-mm-fault-notifier-export.patch
-x86_64-mm-i386-fault-notifier-export.patch
-x86_64-mm-i386-acpi_force-static.patch
-x86_64-mm-i386-enable_local_apic-static.patch
-x86_64-mm-i386-kernel-thread.patch
-x86_64-mm-i386-desc-cleanup.patch
-x86_64-mm-per-cpu-area-size.patch
-x86_64-mm-i386-topology-cleanup.patch
-x86_64-mm-i386-more-init.patch
-x86_64-mm-fix-bus-numbering-format-in-mmconfig-warning.patch
-x86_64-mm-support-physical-cpu-hotplug-for-x86_64.patch
-x86_64-mm-less-lazy-fpu.patch
-x86_64-mm-wire-up-oops_enter-oops_exit.patch
-x86_64-mm-add-mem-fix.patch
-x86_64-mm-remove-redundant-generic_identify-calls-when-identifying-cpus.patch
-x86_64-mm-mark-init_amd-as-__cpuinit.patch
-x86_64-mm-mark-cpu_dev-structures-as-__cpuinitdata.patch
-x86_64-mm-mark-cpu-init-functions-as-__cpuinit,-data-as-__cpuinitdata.patch
-x86_64-mm-mark-cpu-identify-functions-as-__cpuinit.patch
-x86_64-mm-mark-cpu-cache-functions-as-__cpuinit.patch
-x86_64-mm-i386-kprobes-mca.patch
-x86_64-mm-i386-kprobes-nmi.patch
-x86_64-mm-remove-config.h-includes-from-asm-i386-asm-x86_64.patch
-x86_64-mm-drop-640k-reservation.patch
-x86_64-mm-move-compiler-check-to-ia64.patch
-x86_64-mm-make-numa_emulation-__init.patch
-x86_64-mm-i386-cfi-nmi.patch
-x86_64-mm-detect-clock-skew-during-suspend.patch
-x86_64-mm-remove-safe_smp_processor_id.patch
-x86_64-mm-early_ioremap-warning.patch
-x86_64-mm-pte-exec.patch
-x86_64-mm-cpa-pse-cleanup.patch
-x86_64-mm-remove-apic-version-capability.patch
-x86_64-mm-cleanup-apic-id-checking.patch
-x86_64-mm-mpparse-style.patch
-x86_64-mm-nmi-irqtrace-check.patch
-x86_64-mm-fix-head.S-warning.patch
-x86_64-mm-remove-e820-fallback.patch
-x86_64-mm-optimize-hweight64-for-x86_64.patch
-x86_64-mm-reload-cs-in-head.patch
-x86_64-mm-note-section.patch
-x86_64-mm-e820-comment.patch
-x86_64-mm-proxy-pda.patch
-x86_64-mm-fix-the-edd-code-misparsing-the-command-line.patch
-x86_64-mm-remove-most-of-the-special-cases-for-the-debug-ist-stack.patch
-x86_64-mm-kexec-dont-overwrite-pgd.patch
-x86_64-mm-i386-kexec-dont-overwrite-pgd.patch
-x86_64-mm-trace-kernel-text-address.patch
-x86_64-mm-document-tree.patch
-x86_64-mm-stack-protector-annotate-the-pda-offsets.patch
-x86_64-mm-stack-protector-add-the-kconfig-option.patch
-x86_64-mm-stack-protector-add-canary.patch
-x86_64-mm-stack-protector-add_stack_chk_fail.patch
-x86_64-mm-stack-protector-cflags.patch
-x86_64-mm-fix-irqcount-comment.patch
-x86_64-mm-pda-use-c-output-modifier.patch
-x86_64-mm-type-checking-for-write_pda.patch
-x86_64-mm-fix-pda-warning.patch
-x86_64-mm-i386-replace-sensitive-instructions.patch
-x86_64-mm-i386-allow-a-kernel-to-not-be-in-ring0.patch
-x86_64-mm-i386-pda-asm-offsets.patch
-x86_64-mm-i386-pda-basics.patch
-x86_64-mm-i386-pda-init-pda.patch
-x86_64-mm-i386-pda-use-gs.patch
-x86_64-mm-i386-pda-user-abi.patch
-x86_64-mm-i386-pda-vm86.patch
-x86_64-mm-i386-pda-smp-processorid.patch
-x86_64-mm-i386-pda-current.patch
-x86_64-mm-i386-early-fault.patch
-x86_64-mm-insert-ioapics-and-local-apic-into-resource-map.patch
-x86_64-mm-acpi-add-hpet-into-resource-map.patch
-x86_64-mm-copy-user-zeroing.patch
-x86_64-mm-copy-user-mustcheck.patch
-x86_64-mm-compat-pselect-must-check.patch
-x86_64-mm-compat-uname-must-check.patch
-x86_64-mm-copy-user-style.patch
-x86_64-mm-pda-style.patch
-x86_64-mm-pda-noreturn.patch
-x86_64-mm-remove-mmx.patch
-x86_64-mm-init-per-cpu-data-again.patch
-x86_64-mm-i386-kexec-not-experimental.patch
-x86_64-mm-kexec-not-experimental.patch
-x86_64-mm-fix-idle-notifiers.patch
-x86_64-mm-pci-probe-type1-first.patch
-x86_64-mm-mcfg-type1-heuristic.patch
-x86_64-mm-insert-gart-region-into-resource-map.patch
-x86_64-mm-mcfg-resource.patch
-x86_64-mm-i386-mcfg-resource.patch
-x86_64-mm-i386-pack-descriptor.patch
-x86_64-mm-i386-multiline-oops.patch
-x86_64-mm-restore-i8259a-eoi.patch
-x86_64-mm-core2-rep-good.patch
-x86_64-mm-mmconfig-fix-comment.patch
-x86_64-mm-amd-single-cpu-sync-rdtsc.patch
-x86_64-mm-remove-signal-map.patch
-x86_64-mm-ia32-signal-regparm.patch
-x86_64-mm-ia32-signal-style.patch
-x86_64-mm-unwind-signal-frame-detect.patch
-x86_64-mm-dont-leak-nt.patch
-x86_64-mm-early-scan-depends-on-pci.patch
-x86_64-mm-move-pci-direct-out-of-line.patch
-x86_64-mm-allow-disabling-early-pci-scans.patch
-x86_64-mm-fix-unw-pc-warning.patch
-x86_64-mm-i386-fix-unwind-disabled.patch
-x86_64-mm-add-64bit-jiffies-compares-for-use-with-get_jiffies_64.patch
-x86_64-mm-refactor-thermal-throttle-processing.patch
-x86_64-mm-make-the-jiffies-compares-use-the-64bit-safe-macros..patch
-x86_64-mm-add-a-cumulative-thermal-throttle-event-counter..patch
-fix-x86_64-mm-i386-pda-smp-processorid.patch
-fix-x86_64-mm-spinlock-cleanup.patch
-mm-vm_bug_on.patch
-mm-tracking-shared-dirty-pages.patch
-mm-tracking-shared-dirty-pages-nommu-fix-2.patch
-mm-balance-dirty-pages.patch
-mm-optimize-the-new-mprotect-code-a-bit.patch
-mm-small-cleanup-of-install_page.patch
-mm-fixup-do_wp_page.patch
-mm-msync-cleanup.patch
-mm-tracking-shared-dirty-pages-checks.patch
-mm-tracking-shared-dirty-pages-wimp.patch
-mm-make-functions-static.patch
-convert-i386-numa-kva-space-to-bootmem.patch
-convert-i386-numa-kva-space-to-bootmem-tidy.patch
-bootmem-remove-useless-__init-in-header-file.patch
-bootmem-mark-link_bootmem-as-part-of-the-__init-section.patch
-bootmem-remove-useless-parentheses-in-bootmem-header.patch
-bootmem-limit-to-80-columns-width.patch
-bootmem-remove-useless-headers-inclusions.patch
-bootmem-use-pfn-page-conversion-macros.patch
-bootmem-miscellaneous-coding-style-fixes.patch
-reduce-max_nr_zones-remove-two-strange-uses-of-max_nr_zones.patch
-reduce-max_nr_zones-fix-max_nr_zones-array-initializations.patch
-reduce-max_nr_zones-make-display-of-highmem-counters-conditional-on-config_highmem.patch
-reduce-max_nr_zones-make-display-of-highmem-counters-conditional-on-config_highmem-tidy.patch
-reduce-max_nr_zones-move-highmem-counters-into-highmemc-h.patch
-reduce-max_nr_zones-move-highmem-counters-into-highmemc-h-fix.patch
-reduce-max_nr_zones-page-allocator-zone_highmem-cleanup.patch
-reduce-max_nr_zones-use-enum-to-define-zones-reformat-and-comment.patch
-reduce-max_nr_zones-use-enum-to-define-zones-reformat-and-comment-cleanup.patch
-reduce-max_nr_zones-use-enum-to-define-zones-reformat-and-comment-fix.patch
-reduce-max_nr_zones-make-zone_dma32-optional.patch
-reduce-max_nr_zones-make-zone_highmem-optional.patch
-reduce-max_nr_zones-make-zone_highmem-optional-fix.patch
-reduce-max_nr_zones-make-zone_highmem-optional-fix-fix.patch
-reduce-max_nr_zones-make-zone_highmem-optional-fix-fix-fix.patch
-reduce-max_nr_zones-remove-display-of-counters-for-unconfigured-zones.patch
-reduce-max_nr_zones-remove-display-of-counters-for-unconfigured-zones-s390-fix.patch
-reduce-max_nr_zones-remove-display-of-counters-for-unconfigured-zones-s390-fix-fix.patch
-reduce-max_nr_zones-fix-i386-srat-check-for-max_nr_zones.patch
-mempolicies-fix-policy_zone-check.patch
-apply-type-enum-zone_type.patch
-apply-type-enum-zone_type-fix.patch
-linearly-index-zone-node_zonelists.patch
-out-of-memory-notifier.patch
-out-of-memory-notifier-tidy.patch
-cpu-hotplug-compatible-alloc_percpu.patch
-cpu-hotplug-compatible-alloc_percpu-fix.patch
-cpu-hotplug-compatible-alloc_percpu-fix-2.patch
-add-kerneldocs-for-some-functions-in-mm-memoryc.patch
-mm-remove_mapping-safeness.patch
-mm-remove_mapping-safeness-fix.patch
-mm-non-syncing-lock_page.patch
-slab-respect-architecture-and-caller-mandated-alignment.patch
-mm-swap-write-failure-fixup.patch
-mm-swap-write-failure-fixup-update.patch
-mm-swap-write-failure-fixup-fix.patch
-oom-use-unreclaimable-info.patch
-oom-reclaim_mapped-on-oom.patch
-oom-cpuset-hint.patch
-oom-handle-current-exiting.patch
-oom-handle-oom_disable-exiting.patch
-oom-swapoff-tasks-tweak.patch
-oom-kthread-infinite-loop-fix.patch
-oom-more-printk.patch
-bootmem-use-max_dma_address-instead-of-low32limit.patch
-add-some-comments-to-slabc.patch
-update-some-mm-comments.patch
-slab-optimize-kmalloc_node-the-same-way-as-kmalloc.patch
-slab-optimize-kmalloc_node-the-same-way-as-kmalloc-fix.patch
-slab-extract-__kmem_cache_destroy-from-kmem_cache_destroy.patch
-slab-do-not-panic-when-alloc_kmemlist-fails-and-slab-is-up.patch
-slab-fix-lockdep-warnings.patch
-slab-fix-lockdep-warnings-fix.patch
-slab-fix-lockdep-warnings-fix-2.patch
-add-__gfp_thisnode-to-avoid-fallback-to-other-nodes-and-ignore.patch
-add-__gfp_thisnode-to-avoid-fallback-to-other-nodes-and-ignore-fix.patch
-sys_move_pages-do-not-fall-back-to-other-nodes.patch
-guarantee-that-the-uncached-allocator-gets-pages-on-the-correct.patch
-cleanup-add-zone-pointer-to-get_page_from_freelist.patch
-profiling-require-buffer-allocation-on-the-correct-node.patch
-define-easier-to-handle-gfp_thisnode.patch
-standardize-pxx_page-macros.patch
-standardize-pxx_page-macros-fix.patch
-optimize-free_one_page.patch
-do-not-check-unpopulated-zones-for-draining-and-counter.patch
-extract-the-allocpercpu-functions-from-the-slab-allocator.patch
-introduce-mechanism-for-registering-active-regions-of-memory.patch
-have-power-use-add_active_range-and-free_area_init_nodes.patch
-have-power-use-add_active_range-and-free_area_init_nodes-ppc-fix.patch
-have-x86-use-add_active_range-and-free_area_init_nodes.patch
-have-x86-use-add_active_range-and-free_area_init_nodes-fix.patch
-have-x86_64-use-add_active_range-and-free_area_init_nodes.patch
-have-ia64-use-add_active_range-and-free_area_init_nodes.patch
-have-ia64-use-add_active_range-and-free_area_init_nodes-fix.patch
-account-for-memmap-and-optionally-the-kernel-image-as-holes.patch
-account-for-memmap-and-optionally-the-kernel-image-as-holes-fix.patch
-account-for-holes-that-are-outside-the-range-of-physical-memory.patch
-allow-an-arch-to-expand-node-boundaries.patch
-replace-min_unmapped_ratio-by-min_unmapped_pages-in-struct-zone.patch
-zvc-support-nr_slab_reclaimable--nr_slab_unreclaimable.patch
-zone_reclaim-dynamic-slab-reclaim.patch
-zone_reclaim-dynamic-slab-reclaim-tidy.patch
-zone-reclaim-with-slab-avoid-unecessary-off-node-allocations.patch
-oom-kill-update-comments-to-reflect-current-code.patch
-hugepages-use-page_to_nid-rather-than-traversing-zone-pointers.patch
-numa-add-zone_to_nid-function.patch
-numa-add-zone_to_nid-function-update.patch
-vm-add-per-zone-writeout-counter.patch
-own-header-file-for-struct-page.patch
-page-invalidation-cleanup.patch
-slab-fix-kmalloc_node-applying-memory-policies-if-nodeid-==-numa_node_id.patch
-slab-fix-kmalloc_node-applying-memory-policies-if-nodeid-==-numa_node_id-fix.patch
-condense-output-of-show_free_areas.patch
-add-numa_build-definition-in-kernelh-to-avoid-ifdef.patch
-disable-gfp_thisnode-in-the-non-numa-case.patch
-gfp_thisnode-for-the-slab-allocator-v2.patch
-gfp_thisnode-for-the-slab-allocator-v2-fix.patch
-gfp_thisnode-for-the-slab-allocator-v2-fix-3.patch
-add-node-to-zone-for-the-numa-case.patch
-add-node-to-zone-for-the-numa-case-fix.patch
-do-not-allocate-pagesets-for-unpopulated-zones.patch
-zone_statistics-use-hot-node-instead-of-cold-zone_pgdat.patch
-do_no_pfn.patch
-do_no_pfn-tweaks.patch
-mspec-driver.patch
-shared-page-table-for-hugetlb-page-v2.patch
-shared-page-table-for-hugetlb-page-v2-tidy.patch
-shared-page-table-for-hugetlb-page-v2-comments.patch
-selinux-eliminate-selinux_task_ctxid.patch
-selinux-rename-selinux_ctxid_to_string.patch
-selinux-replace-ctxid-with-sid-in.patch
-selinux-enable-configuration-of-max-policy-version.patch
-selinux-enable-configuration-of-max-policy-version-improve-security_selinux_policydb_version_max_value-help-texts.patch
-selinux-add-support-for-range-transitions-on-object.patch
-selinux-1-3-eliminate-inode_security_set_security.patch
-selinux-2-3-change-isec-semaphore-to-a-mutex.patch
-selinux-3-3-convert-sbsec-semaphore-to-a-mutex.patch
-selinux-fix-tty-locking.patch
-binfmt_elf-consistently-use-loff_t.patch
-frv-use-the-generic-irq-stuff.patch
-frv-improve-frvs-use-of-generic-irq-handling.patch
-frv-permit-__do_irq-to-be-dispensed-with.patch
-frv-fix-fls-to-handle-bit-31-being-set-correctly.patch
-frv-implement-fls64.patch
-frv-optimise-ffs.patch
-alchemy-delete-unused-pt_regs-argument-from-au1xxx_dbdma_chan_alloc.patch
-avr32-arch.patch
-avr32-config_debug_bugverbose-and-config_frame_pointer.patch
-avr32-fix-invalid-constraints-for-stcond.patch
-avr32-add-support-for-irq-flags-state-tracing.patch
-avr32-turn-off-support-for-discontigmem-and-sparsemem.patch
-avr32-always-enable-config_embedded.patch
-avr32-export-the-find__bit-functions.patch
-avr32-add-defconfig-for-at32stk1002.patch
-avr32-use-autoconf-instead-of-marker.patch
-avr32-dont-assume-anything-about-max_nr_zones.patch
-avr32-add-i-o-port-access-primitives.patch
-avr32-use-linux-pfnh.patch
-avr32-kill-config_discontigmem-support-completely.patch
-avr32-fix-bug-in-__avr32_asr64.patch
-avr32-switch-to-generic-timekeeping-framework.patch
-avr32-set-kbuild_defconfig.patch
-avr32-kprobes-compile-fix.patch
-avr32-asm-ioh-should-include-asm-byteorderh.patch
-avr32-fix-output-constraints-in-asm-bitopsh.patch
-avr32-standardize-pxx_page-macros-fix.patch
-avr32-rename-at32stk100x-atstk100x.patch
-avr32-dont-leave-dbe-set-when-resetting-cpu.patch
-avr32-make-prot_write-prot_exec-imply-prot_read.patch
-avr32-remove-set_wmb.patch
-avr32-use-parse_early_param.patch
-avr32-fix-exported-headers.patch
-avr32-fix-__const_udelay-overflow-bug.patch
-remove-zone_dma-remains-from-avr32.patch
-avr32-mtd-static-memory-controller-driver-try-2.patch
-avr32-mtd-at49bv6416-platform-device-for-atstk1000.patch
-nommu-check-that-access_process_vm-has-a-valid-target.patch
-nommu-set-bdi-capabilities-for-dev-mem-and-dev-kmem.patch
-nommu-set-bdi-capabilities-for-dev-mem-and-dev-kmem-tidy.patch
-nommu-use-find_vma-rather-than-reimplementing-a-vma-search.patch
-check-if-start-address-is-in-vma-region-in-nommu-function-get_user_pages.patch
-nommu-check-vma-protections.patch
-nommu-permit-ptrace-to-ignore-non-prot_write-vmas-in-nommu-mode.patch
-nommu-implement-proc-pid-maps-for-nommu.patch
-nommu-order-the-per-mm_struct-vma-list.patch
-nommu-make-mremap-partially-work-for-nommu-kernels.patch
-nommu-add-docs-about-shared-memory.patch
-nommu-make-futexes-work-under-nommu-conditions.patch
-nommu-make-futexes-work-under-nommu-conditions-doc.patch
-nommu-move-the-fallback-arch_vma_name-to-a-sensible-place.patch
-nommu-move-the-fallback-arch_vma_name-to-a-sensible-place-fix.patch
-hpet-rtc-emulation-add-watchdog-timer-2.patch
-i386-show_registers-try-harder-to-print-failing.patch
-use-bug_onfoo-instead-of-if-foo-bug-in-include-asm-i386-dma-mappingh.patch
-apm-clean-up-module-initalization.patch
-x86-remove-locally-defined-ldt-structure-in-favour-of-standard-type.patch
-x86-implement-always-locked-bit-ops-for-memory-shared-with-an-smp-hypervisor.patch
-x86-roll-all-the-cpuid-asm-into-one-__cpuid-call.patch
-x86-make-__fixaddr_top-variable-to-allow-it-to-make-space-for-a-hypervisor.patch
-x86-add-a-bootparameter-to-reserve-high-linear-address-space.patch
-x86-put-note-sections-into-a-pt_note-segment-in-vmlinux.patch
-x86-put-note-sections-into-a-pt_note-segment-in-vmlinux-fix.patch
-x86-enable-vmsplit-for-highmem-kernels.patch
-x86-trivial-pgtableh-__assembly__-move.patch
-x86-trivial-move-of-__have-macros-in-i386-pagetable-headers.patch
-x86-trivial-move-of-ptep_set_access_flags.patch
-x86-remove-unused-include-from-efi_stubs.patch
-i386-adds-smp_call_function_single.patch
-voyager-tty-locking.patch
-i386-kill-references-to-xtime.patch
-mtrr-add-lock-annotations-for-prepare_set-and.patch
-i386-adds-smp_call_function_single-fix.patch
-alpha-fix-alpha_ev56-dependencies-typo.patch
-swsusp-write-timer.patch
-swsusp-write-speedup.patch
-swsusp-read-timer.patch
-swsusp-read-speedup.patch
-swsusp-read-speedup-fix.patch
-swsusp-read-speedup-cleanup.patch
-swsusp-read-speedup-cleanup-2.patch
-swsusp-read-speedup-fix-fix-2.patch
-swsusp-clean-up-browsing-of-pfns.patch
-swsusp-struct-snapshot_handle-cleanup.patch
-make-swsusp-avoid-memory-holes-and-reserved-memory-regions-on-x86_64.patch
-disable-cpu-hotplug-during-suspend-2.patch
-swsusp-fix-mark_free_pages.patch
-swsusp-reorder-memory-allocating-functions.patch
-swsusp-fix-alloc_pagedir.patch
-clean-up-suspend-header.patch
-change-the-name-of-pagedir_nosave.patch
-swsusp-introduce-some-helpful-constants.patch
-swsusp-introduce-memory-bitmaps.patch
-swsusp-use-memory-bitmaps-during-resume.patch
-swsusp-use-memory-bitmaps-during-resume-fix.patch
-pm-make-it-possible-to-disable-console-suspending.patch
-pm-make-it-possible-to-disable-console-suspending-fix.patch
-pm-make-it-possible-to-disable-console-suspending-fix-2.patch
-make-it-possible-to-disable-serial-console-suspend.patch
-i386-detect-clock-skew-during-suspend.patch
-pm-add-pm_trace-switch.patch
-pm-add-pm_trace-switch-doc.patch
-m32r-fix-make-headers_check.patch
-uml-use-klibc-setjmp-longjmp.patch
-uml-use-array_size-more-assiduously.patch
-uml-fix-stack-alignment.patch
-uml-whitespace-fixes.patch
-uml-fix-handling-of-failed-execs-of-helpers.patch
-uml-improve-sigbus-diagnostics.patch
-uml-sigio-cleanups.patch
-uml-move-signal-handlers-to-arch-code.patch
-uml-move-signal-handlers-to-arch-code-fix.patch
-uml-timer-cleanups.patch
-uml-remove-unused-variable.patch
-uml-clean-our-set_ether_mac.patch
-uml-stack-usage-reduction.patch
-uml-tty-locking.patch
-split-i386-and-x86_64-ptraceh.patch
-split-i386-and-x86_64-ptraceh-fix.patch
-make-uml-use-ptrace-abih.patch
-uml-use-mcmodel=kernel-for-x86_64.patch
-uml-fix-proc-vs-interrupt-context-spinlock-deadlock.patch
-s390-fix-cmm-kernel-thread-handling.patch
-autofs4-needs-to-force-fail-return-revalidate.patch
-kdump-introduce-reset_devices-command-line-option.patch
-fat-cleanup-fat_get_blocks.patch
-inode_diet-replace-inodeugeneric_ip-with-inodei_private.patch
-inode_diet-replace-inodeugeneric_ip-with-inodei_private-gfs-fix.patch
-inode-diet-move-i_pipe-into-a-union.patch
-inode-diet-move-i_bdev-into-a-union.patch
-inode-diet-move-i_cdev-into-a-union.patch
-inode-diet-eliminate-i_blksize-and-use-a-per-superblock-default.patch
-inode-diet-eliminate-i_blksize-and-use-a-per-superblock-default-fix.patch
-inode-diet-eliminate-i_blksize-and-use-a-per-superblock-default-fix-fix.patch
-inode-diet-eliminate-i_blksize-and-use-a-per-superblock-default-xfs-fix.patch
-reiserfs-warn-about-the-useless-nolargeio-option.patch
-x86-microcode-microcode-driver-cleanup.patch
-x86-microcode-microcode-driver-cleanup-tidy.patch
-x86-microcode-using-request_firmware-to-pull-microcode.patch
-x86-microcode-add-sysfs-and-hotplug-support.patch
-x86-microcode-add-sysfs-and-hotplug-support-fix.patch
-x86-microcode-add-sysfs-and-hotplug-support-fix-fix-2.patch
-x86-microcode-dont-check-the-size.patch
-consistently-use-max_errno-in-__syscall_return.patch
-consistently-use-max_errno-in-__syscall_return-fix.patch
-eisa-bus-modalias-attributes-support-1.patch
-eisa-bus-modalias-attributes-support-1-fix.patch
-eisa-bus-modalias-attributes-support-1-fix-git-kbuild-fix.patch
-alloc_fdtable-cleanup.patch
-include-__param-section-in-read-only-data-range.patch
-msi-use-kmem_cache_zalloc.patch
-sysctl-allow-proc-sys-without-sys_sysctl.patch
-sysctl-allow-proc-sys-without-sys_sysctl-fix.patch
-sysctl-document-that-sys_sysctl-will-be-removed.patch
-pid-implement-transfer_pid-and-use-it-to-simplify-de_thread.patch
-pid-remove-temporary-debug-code-in-attach_pid.patch
-de_thread-use-tsk-not-current.patch
-add-probe_kernel_address.patch
-x86-use-probe_kernel_address-in-handle_bug.patch
-fs-conversions-from-kmallocmemset-to-kzcalloc.patch
-fs-removing-useless-casts.patch
-jbd-add-lock-annotation-to-jbd_sync_bh.patch
-ext3-and-jbd-cleanup-remove-whitespace.patch
-ext3-turn-on-reservation-dump-on-block-allocation-errors.patch
-ext3-add-more-comments-in-block-allocation-reservation-code.patch
-jbd-use-build_bug_on-in-journal-init.patch
-fix-ext3-mounts-at-16t.patch
-fix-ext3-mounts-at-16t-fix.patch
-fix-ext2-mounts-at-16t.patch
-fix-ext2-mounts-at-16t-fix.patch
-more-ext3-16t-overflow-fixes.patch
-more-ext3-16t-overflow-fixes-fix.patch
-ext3-inode-numbers-are-unsigned-long.patch
-ext3-inode-numbers-are-unsigned-long-fix.patch
-really-ignore-kmem_cache_destroy-return-value.patch
-make-kmem_cache_destroy-return-void.patch
-ibm-acpi-documentation-delete-irrelevant-how-to-compile-external-module.patch
-ext3-wrong-error-behavior.patch
-ext3-more-whitespace-cleanups.patch
-ext3-fix-sparse-warnings.patch
-jbd-16t-fixes.patch
-dontdiff-add-utsreleaseh.patch

 Merged into mainline or a subsystem tree.

+acpi-preserve-correct-battery-state-through-suspend-resume-cycles.patch
+acpi-preserve-correct-battery-state-through-suspend-resume-cycles-tidy.patch

 ACPI fix.

+driver-core-fixes-check-for-return-value-of-sysfs_create_link.patch

 More __must_check fixes

+fix-gregkh-driver-nozomi.patch

 Fix nozomi driver a bit.

-git-dvb-fixup.patch

 Dropped.

+drivers-media-use-null-instead-of-0-for-ptrs.patch

 Cleanup

-git-gfs2-fixup.patch

 Dropped.

+inode_diet-replace-inodeugeneric_ip-with-inodei_private-gfs2.patch
+inode-diet-eliminate-i_blksize-and-use-a-per-superblock-default-gfs2.patch

 GFS2 fixes

+possible-dereference-in.patch

 Input possible-oops fix

+drivers-input-misc-added-acer-travelmate-2424nwxci-support-to-the-wistron-button-interface.patch

 Add new machine to Wistron driver.

-libata-add-40pin-short-cable-support-honour-drive-fix.patch

 Folded into libata-add-40pin-short-cable-support-honour-drive.patch

-via-pata-controller-xfer-fixes-fix.patch

 Folded into via-pata-controller-xfer-fixes.patch

-mmc-driver-for-ti-flashmedia-card-reader-source-tidy.patch
-mmc-driver-for-ti-flashmedia-card-reader-source-alpha-fix.patch
-mmc-driver-for-ti-flashmedia-card-reader-source-vs-git-mmc.patch

 Folded into mmc-driver-for-ti-flashmedia-card-reader-source.patch

+git-mtd-fixup.patch

 Fix rejects in git-mtd.patch

-git-netdev-all-fixup.patch

 Dropped.

+forcedeth-power-management-support.patch
+forcedeth-power-management-support-tidy.patch
+remove-unnecessary-check-in-drivers-net-depcac.patch

 netdev updates

+nfs-kill-obsolete-nfs_paranoia.patch

 NFS cleanup

-revert-allow-file-systems-to-manually-d_move-inside-of-rename.patch

 Dropped.

+off-by-one-in-arch-ppc-platforms-mpc8.patch
+ehea-firmware-interface-based-on-anton-blanchards-new-hvcall-interface.patch

 ppc fixes

-tickle-nmi-watchdog-on-serial-output-fix.patch

 Folded into tickle-nmi-watchdog-on-serial-output.patch

+remove-unnecessary-check-in.patch
+pci-turn-pci_fixup_video-into-generic-for-embedded.patch
+pcie_portdrv_restore_config-undefined-without-config_pm.patch

 PCI fixes

+remove-unnecessary-check-in-drivers-scsi-sgc.patch
+remove-extra-newline-from-info-message.patch
+fix-scsi-scsi_transporth-compile-error.patch
+overrun-in-drivers-scsi-scsic.patch
+megaraid-check-for-firmware-version.patch

 SCSI fixes

+scsi-target-needs-pci.patch

 Fix git-scsi-target.patch

+fix-gregkh-usb-usbcore-add-autosuspend-autoresume-infrastructure-2.patch
+usb-hubc-build-fix.patch
+usb-serial-possible-irq-lock-inversion-ppp-vs.patch
+usb-allow-both-root-hub-interrupts-and-polling.patch
+ohci-remove-existing-autosuspend-code.patch
+ohci-add-auto-stop-support.patch
+ohci-add-auto-stop-support-hack-hack.patch
+pegasus-driver-failing-for-admtek-8515-network-device.patch

 USB fixes

+x86_64-mm-copy-user-inatomic.patch
+x86_64-mm-allow-disabling-dac.patch
+x86_64-mm-iommu-setup-style.patch
+x86_64-mm-document-iommu-panic.patch
+x86_64-mm-unify-ioapic-checking.patch
+x86_64-mm-nmi-sysctl-cleanup.patch
+x86_64-mm-i386-setup-array-size.patch
+x86_64-mm-setup-array-size.patch
+x86_64-mm-i386-mmconfig-flush.patch
+x86_64-mm-re-positioning-the-bss-segment.patch
+x86_64-mm-vsyscall-blob-header.patch
+x86_64-mm-sem-early-clobber.patch

 x86 tree updates

-revert-x86_64-mm-i386-remove-lock-section.patch

 Dropped

-revert-x86_64-mm-i386-pda-current.patch
-revert-x86_64-mm-i386-pda-smp-processorid.patch
-revert-x86_64-mm-i386-pda-vm86.patch
-revert-x86_64-mm-i386-pda-user-abi.patch
-revert-x86_64-mm-i386-pda-use-gs.patch
-revert-x86_64-mm-i386-pda-init-pda.patch

 Dropped.

-hot-add-mem-x86_64-memory_add_physaddr_to_nid-node-fixup-fix.patch
-hot-add-mem-x86_64-memory_add_physaddr_to_nid-node-fixup-fix-2.patch

 Folded into hot-add-mem-x86_64-memory_add_physaddr_to_nid-node-fixup.patch

-hot-add-mem-x86_64-use-config_memory_hotplug_reserve-fix.patch

 Folded into hot-add-mem-x86_64-use-config_memory_hotplug_reserve.patch

+arch-i386-pci-mmconfigc-tlb-flush-fix-tweaks.patch

 x86 fix

+deal-with-cases-of-zone_dma-meaning-the-first-zone-fix.patch

 Fix deal-with-cases-of-zone_dma-meaning-the-first-zone.patch

-redo-radix-tree-fixes.patch
-adix-tree-rcu-lockless-readside-update.patch
-radix-tree-rcu-lockless-readside-semicolon.patch
-adix-tree-rcu-lockless-readside-update-tidy.patch
-adix-tree-rcu-lockless-readside-fix-2.patch
-adix-tree-rcu-lockless-readside-fix-3.patch
-radix-tree-cleanup-radix_tree_deref_slot-and.patch
-cleanup-radix_tree_derefreplace_slot-calling-conventions.patch
-cleanup-radix_tree_derefreplace_slot-calling-conventions-warning-fixes.patch

 Folded into radix-tree-rcu-lockless-readside.patch

+mm-fix-a-race-condition-under-smc-cow.patch

 MM fix

+uswsusp-add-pmops-prepareenterfinish-support-aka-platform-mode.patch
+swsusp-use-partition-device-and-offset-to-identify-swap-areas.patch
+swsusp-rearrange-swap-handling-code.patch
+swsusp-use-block-device-offsets-to-identify-swap-locations-rev-2.patch
+swsusp-add-resume_offset-command-line-parameter-rev-2.patch
+swsusp-add-resume_offset-command-line-parameter-rev-2-fix.patch
+swsusp-document-support-for-swap-files-rev-2.patch
+swsusp-debugging.patch

 swsusp updates

+uml-assign-random-macs-to-interfaces-if-necessary.patch
+uml-mechanical-tidying-after-random-macs-change.patch
+uml-locking-documentation.patch
+uml-close-file-descriptor-leaks.patch
+uml-stack-consumption-reduction.patch

 UML updates

-apple-motion-sensor-driver-2.patch
-apple-motion-sensor-driver-2-fixes-update.patch
-apple-motion-sensor-driver-kconfig-fix.patch
-ams-check-return-values-from-device_create_file.patch

 Dropped - I couldn't keep up with all the changes.

-make-reiserfs-default-to-barrier=flush.patch
-make-ext3-mount-default-to-barrier=1.patch

 Dropped - these slow things down too much.

+remove-sysrq_key-and-related-defines-from-ppc-sh-h8300.patch
+mmc-mainly-add-or-later-clause-to-licence-statement.patch
+prevent-multiple-inclusion-of-linux-sysrqh.patch
+move-ncpfs-32bit-compat-ioctl-to-ncpfs.patch
+ipmi-per-channel-command-registration.patch
+update-legacy-io-handling-for-pmac.patch
+ip2-use-newer-pci_get-functions.patch
+i2o-switch-to-pci_get-api.patch
+cardbus-switch-to-ref-counting-hotplug-safe-api.patch
+epoll_pwait.patch
+sysrq-disable-lockdep-on-reboot.patch
+trident-fix-pci_dev-reference-counting-and-buglet.patch
+off-by-one-in-drivers-char-mwave-mwaveddc.patch
+hdaps-support-lenovo-thinkpad-t60.patch
+typo-fixes-for-rt-mutex-designtxt.patch
+remove-bug_onunlikely-in-include-linux-aioh.patch

 Misc fixes and updates

+csa-accounting-taskstats-update-update-comments-in-linux-taskstatsh.patch

 Fix CSA accounting patches in -mm.

+char-mxser_new-correct-include-file.patch
+char-mxser_new-upgrade-to-191.patch
+char-mxser_new-rework-to-allow-dynamic-structs.patch

 Update the new mxser driver

+kprobe-whitespace-cleanup.patch
+disallow-kprobes-on-notifier_call_chain.patch
+kretprobe-spinlock-deadlock-patch.patch

 kprobes updates

+cpumask-add-highest_possible_node_id-fix.patch

 Fix cpumask-add-highest_possible_node_id.patch

+ecryptfs-file-operations-readdir-fix-for-seeking-in-directory-streams.patch
+ecryptfs-grab-lock-on-lower_page-in-ecryptfs_sync_page.patch

 ecryptfs updates

+reiser4-reiser4_drop_page-dont-call-remove_from_page_cache.patch
+reiser4-get-rid-of-semaphores-wherever-it-is-possible.patch

 reiser4 fixes

+fbdev-correct-buffer-size-limit-in-fbmem_read_proc.patch

 fbdev fix

-genirq-msi-restore-__do_irq-compat-logic-temporarily.patch

 Dropped, unneeded.

+msi-simplify-msi-sanity-checks-by-adding-with-generic-irq-code.patch
+msi-only-use-a-single-irq_chip-for-msi-interrupts.patch
+msi-refactor-and-move-the-msi-irq_chip-into-the-arch-code.patch
+msi-move-the-ia64-code-into-arch-ia64.patch

 MSI updates

+htirq-tidy-up-the-htirq-code.patch

 Update hypertransport driver.



All 1259 patches:

ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.18/2.6.18-mm2/patch-list 



* Re: 2.6.18-mm2
  2006-09-28  8:46 2.6.18-mm2 Andrew Morton
@ 2006-09-28 11:54 ` Michal Piotrowski
  2006-09-29 12:12   ` md deadlock (was Re: 2.6.18-mm2) Peter Zijlstra
  2006-09-28 17:50 ` 2.6.18-mm2 Steve Fox
                   ` (6 subsequent siblings)
  7 siblings, 1 reply; 140+ messages in thread
From: Michal Piotrowski @ 2006-09-28 11:54 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Ingo Molnar, Neil Brown, linux-raid, linux-kernel

Hi,

On 28/09/06, Andrew Morton <akpm@osdl.org> wrote:
>
> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.18/2.6.18-mm2/
>
>

=======================================================
[ INFO: possible circular locking dependency detected ]
2.6.18-mm2 #1
-------------------------------------------------------
nash/1264 is trying to acquire lock:
 (&bdev_part_lock_key){--..}, at: [<c0310d4a>] mutex_lock+0x1c/0x1f

but task is already holding lock:
 (&new->reconfig_mutex){--..}, at: [<c03108ff>] mutex_lock_interruptible+0x1c/0x1f

which lock already depends on the new lock.


the existing dependency chain (in reverse order) is:

-> #2 (&new->reconfig_mutex){--..}:
       [<c01390b8>] add_lock_to_list+0x5c/0x7a
       [<c013b1dd>] __lock_acquire+0x9f3/0xaef
       [<c013b643>] lock_acquire+0x71/0x91
       [<c031068f>] __mutex_lock_interruptible_slowpath+0xd2/0x326
       [<c03108ff>] mutex_lock_interruptible+0x1c/0x1f
       [<c02ba4e3>] md_open+0x28/0x5d
       [<c0197853>] do_open+0x8b/0x377
       [<c0197cd5>] blkdev_open+0x1d/0x46
       [<c0172f36>] __dentry_open+0x133/0x260
       [<c01730d1>] nameidata_to_filp+0x1c/0x2e
       [<c0173111>] do_filp_open+0x2e/0x35
       [<c0173170>] do_sys_open+0x58/0xde
       [<c0173222>] sys_open+0x16/0x18
       [<c0103297>] syscall_call+0x7/0xb
       [<ffffffff>] 0xffffffff

-> #1 (&bdev->bd_mutex){--..}:
       [<c01390b8>] add_lock_to_list+0x5c/0x7a
       [<c013b1dd>] __lock_acquire+0x9f3/0xaef
       [<c013b643>] lock_acquire+0x71/0x91
       [<c0310b0f>] __mutex_lock_slowpath+0xd2/0x2f1
       [<c0310d4a>] mutex_lock+0x1c/0x1f
       [<c0197824>] do_open+0x5c/0x377
       [<c0197bab>] blkdev_get+0x6c/0x77
       [<c01978d0>] do_open+0x108/0x377
       [<c0197bab>] blkdev_get+0x6c/0x77
       [<c0197eb1>] open_by_devnum+0x30/0x3c
       [<c0147419>] swsusp_check+0x14/0xc5
       [<c0145865>] software_resume+0x7e/0x100
       [<c010049e>] init+0x121/0x29f
       [<c0103f23>] kernel_thread_helper+0x7/0x10
       [<c0109523>] save_stack_trace+0x17/0x30
       [<c0138fb0>] save_trace+0x4f/0xfb
       [<c01390b8>] add_lock_to_list+0x5c/0x7a
       [<c013b1dd>] __lock_acquire+0x9f3/0xaef
       [<c013b643>] lock_acquire+0x71/0x91
       [<c0310b0f>] __mutex_lock_slowpath+0xd2/0x2f1
       [<c0310d4a>] mutex_lock+0x1c/0x1f
       [<c0197824>] do_open+0x5c/0x377
       [<c0197bab>] blkdev_get+0x6c/0x77
       [<c01978d0>] do_open+0x108/0x377
       [<c0197bab>] blkdev_get+0x6c/0x77
       [<c0197eb1>] open_by_devnum+0x30/0x3c
       [<c0147419>] swsusp_check+0x14/0xc5
       [<c0145865>] software_resume+0x7e/0x100
       [<c010049e>] init+0x121/0x29f
       [<c0103f23>] kernel_thread_helper+0x7/0x10
       [<ffffffff>] 0xffffffff

-> #0 (&bdev_part_lock_key){--..}:
       [<c013a7b6>] print_circular_bug_tail+0x30/0x64
       [<c013b114>] __lock_acquire+0x92a/0xaef
       [<c013b643>] lock_acquire+0x71/0x91
       [<c0310b0f>] __mutex_lock_slowpath+0xd2/0x2f1
       [<c0310d4a>] mutex_lock+0x1c/0x1f
       [<c0197323>] bd_claim_by_disk+0x5f/0x18e
       [<c02b44ec>] bind_rdev_to_array+0x1f0/0x20e
       [<c02b6453>] autostart_arrays+0x24b/0x322
       [<c02b9158>] md_ioctl+0x91/0x13f4
       [<c01ea5bc>] blkdev_driver_ioctl+0x49/0x5b
       [<c01ead23>] blkdev_ioctl+0x755/0x7a2
       [<c0196f9d>] block_ioctl+0x16/0x1b
       [<c01801d2>] do_ioctl+0x22/0x67
       [<c0180460>] vfs_ioctl+0x249/0x25c
       [<c01804ba>] sys_ioctl+0x47/0x75
       [<c0103297>] syscall_call+0x7/0xb
       [<ffffffff>] 0xffffffff

other info that might help us debug this:

1 lock held by nash/1264:
 #0:  (&new->reconfig_mutex){--..}, at: [<c03108ff>] mutex_lock_interruptible+0x1c/0x1f
stack backtrace:
 [<c0104215>] dump_trace+0x64/0x1cd
 [<c0104390>] show_trace_log_lvl+0x12/0x25
 [<c01049e5>] show_trace+0xd/0x10
 [<c0104aad>] dump_stack+0x19/0x1b
 [<c013a7df>] print_circular_bug_tail+0x59/0x64
 [<c013b114>] __lock_acquire+0x92a/0xaef
 [<c013b643>] lock_acquire+0x71/0x91
 [<c0310b0f>] __mutex_lock_slowpath+0xd2/0x2f1
 [<c0310d4a>] mutex_lock+0x1c/0x1f
 [<c0197323>] bd_claim_by_disk+0x5f/0x18e
 [<c02b44ec>] bind_rdev_to_array+0x1f0/0x20e
 [<c02b6453>] autostart_arrays+0x24b/0x322
 [<c02b9158>] md_ioctl+0x91/0x13f4
 [<c01ea5bc>] blkdev_driver_ioctl+0x49/0x5b
 [<c01ead23>] blkdev_ioctl+0x755/0x7a2
 [<c0196f9d>] block_ioctl+0x16/0x1b
 [<c01801d2>] do_ioctl+0x22/0x67
 [<c0180460>] vfs_ioctl+0x249/0x25c
 [<c01804ba>] sys_ioctl+0x47/0x75
 [<c0103297>] syscall_call+0x7/0xb
DWARF2 unwinder stuck at syscall_call+0x7/0xb

Leftover inexact backtrace:

 =======================
md: bind<hdb2>
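
The report above boils down to a cycle in lock ordering. Below is a
minimal userspace sketch of that check, not lockdep's actual
implementation. The enum names and helper functions are made up for
illustration, and the two recorded edges are only one reading of
chain entries #1 and #2 (bdev_part_lock_key held while bd_mutex is
taken, bd_mutex held while reconfig_mutex is taken).

```c
#include <string.h>

/* Lock classes named in the report; the enum itself is hypothetical. */
enum lock_class { BDEV_PART, BD_MUTEX, RECONFIG, NR_CLASSES };

/* dep[a][b] != 0 means "b has been taken while a was held". */
static int dep[NR_CLASSES][NR_CLASSES];

static void record_dependency(enum lock_class held, enum lock_class taken)
{
	dep[held][taken] = 1;
}

/* Depth-first search: is 'to' reachable from 'from' over recorded edges? */
static int reachable(enum lock_class from, enum lock_class to, int *seen)
{
	int next;

	if (from == to)
		return 1;
	seen[from] = 1;
	for (next = 0; next < NR_CLASSES; next++)
		if (dep[from][next] && !seen[next] &&
		    reachable((enum lock_class)next, to, seen))
			return 1;
	return 0;
}

/*
 * Taking 'wanted' while holding 'held' adds the edge held -> wanted.
 * That closes a cycle if 'held' is already reachable from 'wanted'.
 */
static int would_close_cycle(enum lock_class held, enum lock_class wanted)
{
	int seen[NR_CLASSES];

	memset(seen, 0, sizeof(seen));
	return reachable(wanted, held, seen);
}

/* Assumed reading of the existing dependency chain in the report. */
static void load_reported_chain(void)
{
	memset(dep, 0, sizeof(dep));
	record_dependency(BDEV_PART, BD_MUTEX);	/* chain entry #1 */
	record_dependency(BD_MUTEX, RECONFIG);	/* chain entry #2 */
}
```

After load_reported_chain(), would_close_cycle(RECONFIG, BDEV_PART)
returns nonzero, which corresponds to nash holding reconfig_mutex and
asking for bdev_part_lock_key. The opposite order,
would_close_cycle(BDEV_PART, RECONFIG), returns 0.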

config & dmesg http://www.stardust.webpages.pl/files/mm/2.6.18-mm2/

Regards,
Michal

-- 
Michal K. K. Piotrowski
LTG - Linux Testers Group
(http://www.stardust.webpages.pl/ltg/)


* Re: 2.6.18-mm2
  2006-09-28  8:46 2.6.18-mm2 Andrew Morton
  2006-09-28 11:54 ` 2.6.18-mm2 Michal Piotrowski
@ 2006-09-28 17:50 ` Steve Fox
  2006-09-28 19:00   ` 2.6.18-mm2 thunder7
  2006-09-28 21:01   ` 2.6.18-mm2 Andrew Morton
  2006-09-28 22:39 ` 2.6.18-mm2 Jim Cromie
                   ` (5 subsequent siblings)
  7 siblings, 2 replies; 140+ messages in thread
From: Steve Fox @ 2006-09-28 17:50 UTC (permalink / raw)
  To: linux-kernel

On Thu, 28 Sep 2006 01:46:23 -0700, Andrew Morton wrote:

> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.18/2.6.18-mm2/

Panic on boot. This machine booted 2.6.18-mm1 fine. em64t machine.

TCP bic registered
TCP westwood registered
TCP htcp registered
NET: Registered protocol family 1
NET: Registered protocol family 17
Unable to handle kernel paging request at ffffffffffffffff RIP: 
 [<ffffffff8047ef93>] packet_notifier+0x163/0x1a0
PGD 203027 PUD 2b031067 PMD 0 
Oops: 0000 [1] SMP 
last sysfs file: 
CPU 0 
Modules linked in:
Pid: 1, comm: swapper Not tainted 2.6.18-mm2-autokern1 #1
RIP: 0010:[<ffffffff8047ef93>]  [<ffffffff8047ef93>] packet_notifier+0x163/0x1a0
RSP: 0000:ffff810bffcbde90  EFLAGS: 00010286
RAX: 0000000000000000 RBX: ffff810bff4a1000 RCX: 2222222222222222
RDX: ffff810bff4a1000 RSI: 0000000000000005 RDI: ffffffff8055f5e0
RBP: ffffffffffffffff R08: 0000000000007616 R09: 000000000000000e
R10: 0000000000000006 R11: ffffffff803373f0 R12: 0000000000000000
R13: 0000000000000005 R14: ffff810bff4a1000 R15: 0000000000000000
FS:  0000000000000000(0000) GS:ffffffff805d8000(0000) knlGS:0000000000000000
CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: ffffffffffffffff CR3: 0000000000201000 CR4: 00000000000006e0
Process swapper (pid: 1, threadinfo ffff810bffcbc000, task ffff810bffcbb510)
Stack:  ffff810bff4a1000 ffffffff8055f4c0 0000000000000000 ffff810bffcbdef0
 0000000000000000 ffffffff8042736e 0000000000000000 0000000000000000
 0000000000000000 ffffffff8061c68d ffffffff806260f0 ffffffff80207182
Call Trace:
 [<ffffffff8042736e>] register_netdevice_notifier+0x3e/0x70
 [<ffffffff8061c68d>] packet_init+0x2d/0x53
 [<ffffffff80207182>] init+0x162/0x330
 [<ffffffff8020a9d8>] child_rip+0xa/0x12
 [<ffffffff8033c2a2>] acpi_ds_init_one_object+0x0/0x82
 [<ffffffff80207020>] init+0x0/0x330
 [<ffffffff8020a9ce>] child_rip+0x0/0x12


Code: 48 8b 45 00 0f 18 08 49 83 fd 02 4c 8d 65 f8 0f 84 f8 fe ff 
RIP  [<ffffffff8047ef93>] packet_notifier+0x163/0x1a0
 RSP <ffff810bffcbde90>
CR2: ffffffffffffffff
 <0>Kernel panic - not syncing: Attempted to kill init!

-- 

Steve Fox
IBM Linux Technology Center



* Re: 2.6.18-mm2
  2006-09-28 17:50 ` 2.6.18-mm2 Steve Fox
@ 2006-09-28 19:00   ` thunder7
  2006-09-28 21:01   ` 2.6.18-mm2 Andrew Morton
  1 sibling, 0 replies; 140+ messages in thread
From: thunder7 @ 2006-09-28 19:00 UTC (permalink / raw)
  To: Steve Fox; +Cc: linux-kernel

From: Steve Fox <drfickle@us.ibm.com>
Date: Thu, Sep 28, 2006 at 05:50:31PM +0000
> On Thu, 28 Sep 2006 01:46:23 -0700, Andrew Morton wrote:
> 
> > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.18/2.6.18-mm2/
> 
> Panic on boot. This machine booted 2.6.18-mm1 fine. em64t machine.
> 
> TCP bic registered
> TCP westwood registered
> TCP htcp registered
> NET: Registered protocol family 1
> NET: Registered protocol family 17
> Unable to handle kernel paging request at ffffffffffffffff RIP: 

I think you need to post additional details, such as .config files.
2.6.18-mm2 boots fine here (x86-64, X2 4600 cpu, smp)

Linux version 2.6.18-mm2 (jurriaan@middle) (gcc version 4.1.2 20060920 (prerelease) (Debian 4.1.1-14)) #5 SMP Thu Sep 28 19:56:29 CEST 2006
Command line: root=/dev/md2 video=nvidiafb:1600x1200-32@85 atkbd.softrepeat=1
protocol family 1
NET: Registered protocol family 10
lo: Disabled Privacy Extensions
IPv6 over IPv4 tunneling driver
NET: Registered protocol family 17
NET: Registered protocol family 15
NET: Registered protocol family 8
NET: Registered protocol family 20
powernow-k8: Found 2 AMD Athlon(tm) 64 X2 Dual Core Processor 4600+ processors (version 2.00.00)

Kind regards,
Jurriaan
-- 
"I resent it as well," said Scharde. "I am working to keep my rage under
control."
        Jack Vance - Ecce and Old Earth
Debian (Unstable) GNU/Linux 2.6.18-mm2 2x4826 bogomips load 1.35


* Re: 2.6.18-mm2
  2006-09-28 17:50 ` 2.6.18-mm2 Steve Fox
  2006-09-28 19:00   ` 2.6.18-mm2 thunder7
@ 2006-09-28 21:01   ` Andrew Morton
  2006-09-28 22:45     ` 2.6.18-mm2 Stephen Hemminger
  2006-10-04 13:42     ` 2.6.18-mm2 boot failure on x86-64 Steve Fox
  1 sibling, 2 replies; 140+ messages in thread
From: Andrew Morton @ 2006-09-28 21:01 UTC (permalink / raw)
  To: Steve Fox; +Cc: linux-kernel, netdev


(please always do reply-to-all)

On Thu, 28 Sep 2006 17:50:31 +0000 (UTC)
"Steve Fox" <drfickle@us.ibm.com> wrote:

> On Thu, 28 Sep 2006 01:46:23 -0700, Andrew Morton wrote:
> 
> > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.18/2.6.18-mm2/
> 
> Panic on boot. This machine booted 2.6.18-mm1 fine. em64t machine.
> 
> TCP bic registered
> TCP westwood registered
> TCP htcp registered
> NET: Registered protocol family 1
> NET: Registered protocol family 17
> Unable to handle kernel paging request at ffffffffffffffff RIP: 
>  [<ffffffff8047ef93>] packet_notifier+0x163/0x1a0
> PGD 203027 PUD 2b031067 PMD 0 
> Oops: 0000 [1] SMP 
> last sysfs file: 
> CPU 0 
> Modules linked in:
> Pid: 1, comm: swapper Not tainted 2.6.18-mm2-autokern1 #1
> RIP: 0010:[<ffffffff8047ef93>]  [<ffffffff8047ef93>] packet_notifier+0x163/0x1a0
> RSP: 0000:ffff810bffcbde90  EFLAGS: 00010286
> RAX: 0000000000000000 RBX: ffff810bff4a1000 RCX: 2222222222222222
> RDX: ffff810bff4a1000 RSI: 0000000000000005 RDI: ffffffff8055f5e0
> RBP: ffffffffffffffff R08: 0000000000007616 R09: 000000000000000e
> R10: 0000000000000006 R11: ffffffff803373f0 R12: 0000000000000000
> R13: 0000000000000005 R14: ffff810bff4a1000 R15: 0000000000000000
> FS:  0000000000000000(0000) GS:ffffffff805d8000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
> CR2: ffffffffffffffff CR3: 0000000000201000 CR4: 00000000000006e0
> Process swapper (pid: 1, threadinfo ffff810bffcbc000, task ffff810bffcbb510)
> Stack:  ffff810bff4a1000 ffffffff8055f4c0 0000000000000000 ffff810bffcbdef0
>  0000000000000000 ffffffff8042736e 0000000000000000 0000000000000000
>  0000000000000000 ffffffff8061c68d ffffffff806260f0 ffffffff80207182
> Call Trace:
>  [<ffffffff8042736e>] register_netdevice_notifier+0x3e/0x70
>  [<ffffffff8061c68d>] packet_init+0x2d/0x53
>  [<ffffffff80207182>] init+0x162/0x330
>  [<ffffffff8020a9d8>] child_rip+0xa/0x12
>  [<ffffffff8033c2a2>] acpi_ds_init_one_object+0x0/0x82
>  [<ffffffff80207020>] init+0x0/0x330
>  [<ffffffff8020a9ce>] child_rip+0x0/0x12
> 
> 
> Code: 48 8b 45 00 0f 18 08 49 83 fd 02 4c 8d 65 f8 0f 84 f8 fe ff 
> RIP  [<ffffffff8047ef93>] packet_notifier+0x163/0x1a0
>  RSP <ffff810bffcbde90>
> CR2: ffffffffffffffff
>  <0>Kernel panic - not syncing: Attempted to kill init!
> 

I'm really struggling to work out what went wrong there.  Comparing your
miserable 20 bytes of code to my object code makes me think that this:

		struct packet_sock *po = pkt_sk(sk);

returned -1, perhaps in %rbp.  But it's all very crude.

Perhaps you could compile that kernel with CONFIG_DEBUG_INFO, rerun it (the
addresses might change) then have a poke around with `gdb vmlinux' (or
maybe just addr2line) to work out where it's really oopsing?

I don't see much which has changed in that area recently.
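
As a rough aid for that kind of poking around: the oops prints the faulting
location as symbol+offset/length, so the bounds of the function follow from
simple arithmetic on the RIP value.  A minimal sketch (the helper names are
made up for illustration, they are not kernel or binutils APIs):

```c
#include <assert.h>
#include <stdint.h>

/*
 * Given the RIP from an oops line such as
 *   [<ffffffff8047ef93>] packet_notifier+0x163/0x1a0
 * recover the address where the function starts.
 * Hypothetical helper, for illustration only.
 */
uint64_t sym_start(uint64_t rip, uint64_t offset)
{
    assert(offset <= rip);          /* an offset cannot exceed the address */
    return rip - offset;
}

/* Address one past the last byte of the function (start + length). */
uint64_t sym_end(uint64_t rip, uint64_t offset, uint64_t length)
{
    return sym_start(rip, offset) + length;
}
```

With the values quoted above, sym_start(0xffffffff8047ef93, 0x163) gives
0xffffffff8047ee30.  After a rebuild with CONFIG_DEBUG_INFO the absolute
addresses (and possibly the offsets) may differ, so it is probably safer to
take them from the new oops and feed them to something like
`addr2line -e vmlinux <address>` or, inside `gdb vmlinux`,
`list *(packet_notifier+0x163)`.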

^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: 2.6.18-mm2
  2006-09-28  8:46 2.6.18-mm2 Andrew Morton
  2006-09-28 11:54 ` 2.6.18-mm2 Michal Piotrowski
  2006-09-28 17:50 ` 2.6.18-mm2 Steve Fox
@ 2006-09-28 22:39 ` Jim Cromie
  2006-09-28 23:08   ` 2.6.18-mm2 Andi Kleen
  2006-09-28 22:44 ` 2.6.18-mm2 Matthias Hentges
                   ` (4 subsequent siblings)
  7 siblings, 1 reply; 140+ messages in thread
From: Jim Cromie @ 2006-09-28 22:39 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel, ak


[jimc@harpo linux-2.6.18-mm2-sk]$ make
  CHK     include/linux/version.h
  CHK     include/linux/utsrelease.h
  CHK     include/linux/compile.h
  GEN     .version
  CHK     include/linux/compile.h
  UPD     include/linux/compile.h
  CC      init/version.o
  LD      init/built-in.o
  LD      .tmp_vmlinux1
arch/i386/kernel/built-in.o(.text+0x34f1): In function `do_nmi':
arch/i386/kernel/traps.c:752: undefined reference to `panic_on_unrecovered_nmi'
arch/i386/kernel/built-in.o(.text+0x3564):arch/i386/kernel/traps.c:712: undefined reference to `panic_on_unrecovered_nmi'


$ grep nmi arch/i386/kernel/Makefile
obj-$(CONFIG_X86_LOCAL_APIC)    += apic.o nmi.o

which I don't have enabled.

It looks to be due to changes in x86_64-mm-nmi-sysctl-cleanup.patch
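
For illustration, one common way to keep such a reference linkable when the
defining object file is optional is to provide a constant fallback under the
same config symbol.  This is only a sketch of the general pattern, assuming
(as the grep above suggests) that the variable is defined in nmi.c; it is not
the fix that went into the tree, and should_panic_on_nmi() is a hypothetical
stand-in for the check in do_nmi():

```c
/*
 * CONFIG_X86_LOCAL_APIC is deliberately left undefined here to mimic
 * the failing .config.
 */
#ifdef CONFIG_X86_LOCAL_APIC
extern int panic_on_unrecovered_nmi;   /* assumed to be defined in nmi.c */
#else
#define panic_on_unrecovered_nmi 0     /* nmi.o is not built: use a constant */
#endif

/* Hypothetical stand-in for the test performed in do_nmi(). */
int should_panic_on_nmi(void)
{
    return panic_on_unrecovered_nmi != 0;
}
```

With the config symbol undefined, the compiler sees a constant 0 and no
undefined reference is left for the linker.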

^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: 2.6.18-mm2
  2006-09-28  8:46 2.6.18-mm2 Andrew Morton
                   ` (2 preceding siblings ...)
  2006-09-28 22:39 ` 2.6.18-mm2 Jim Cromie
@ 2006-09-28 22:44 ` Matthias Hentges
  2006-09-29  3:19 ` 2.6.18-mm2 - oops in cache_alloc_refill() Valdis.Kletnieks
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 140+ messages in thread
From: Matthias Hentges @ 2006-09-28 22:44 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel


[-- Attachment #1.1: Type: text/plain, Size: 796 bytes --]

Hello all,

I've just tested -mm2 on my C2D system and I'm getting a lot of these
messages:

"[  139.143807] printk: 131 messages suppressed.
[  139.148235] sky2 0000:03:00.0: pci express error (0x500547)"

Please note that the "sky2" driver has always been the black sheep on
that system due to regular full lock-ups of the driver, requiring a
rmmod sky2 + modprobe sky2 cycle.

This happens often enough to warrant writing a cronjob checking the
network and auto-rmmod'ing the module.....

While the above is bloody annoying at times (heh), the driver never
caused any messages like the ones I now get with -mm2 .

Dmesg of a fresh boot is attached. -mm 1 works perfectly fine on that
machine.

-- 
Matthias Hentges 

My OS: Debian SID. Geek by Nature, Linux by Choice

[-- Attachment #1.2: dmesg_2.6.18-mm2.txt.gz --]
[-- Type: application/x-gzip, Size: 9027 bytes --]

[-- Attachment #2: Dies ist ein digital signierter Nachrichtenteil --]
[-- Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: 2.6.18-mm2
  2006-09-28 21:01   ` 2.6.18-mm2 Andrew Morton
@ 2006-09-28 22:45     ` Stephen Hemminger
  2006-10-04 13:42     ` 2.6.18-mm2 boot failure on x86-64 Steve Fox
  1 sibling, 0 replies; 140+ messages in thread
From: Stephen Hemminger @ 2006-09-28 22:45 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Steve Fox, linux-kernel, netdev

On Thu, 28 Sep 2006 14:01:24 -0700
Andrew Morton <akpm@osdl.org> wrote:

> 
> (please always do reply-to-all)
> 
> On Thu, 28 Sep 2006 17:50:31 +0000 (UTC)
> "Steve Fox" <drfickle@us.ibm.com> wrote:
> 
> > On Thu, 28 Sep 2006 01:46:23 -0700, Andrew Morton wrote:
> > 
> > > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.18/2.6.18-mm2/
> > 
> > Panic on boot. This machine booted 2.6.18-mm1 fine. em64t machine.
> > 
> > TCP bic registered
> > TCP westwood registered
> > TCP htcp registered
> > NET: Registered protocol family 1
> > NET: Registered protocol family 17
> > Unable to handle kernel paging request at ffffffffffffffff RIP: 
> >  [<ffffffff8047ef93>] packet_notifier+0x163/0x1a0
> > PGD 203027 PUD 2b031067 PMD 0 
> > Oops: 0000 [1] SMP 
> > last sysfs file: 
> > CPU 0 
> > Modules linked in:
> > Pid: 1, comm: swapper Not tainted 2.6.18-mm2-autokern1 #1
> > RIP: 0010:[<ffffffff8047ef93>]  [<ffffffff8047ef93>] packet_notifier+0x163/0x1a0
> > RSP: 0000:ffff810bffcbde90  EFLAGS: 00010286
> > RAX: 0000000000000000 RBX: ffff810bff4a1000 RCX: 2222222222222222
> > RDX: ffff810bff4a1000 RSI: 0000000000000005 RDI: ffffffff8055f5e0
> > RBP: ffffffffffffffff R08: 0000000000007616 R09: 000000000000000e
> > R10: 0000000000000006 R11: ffffffff803373f0 R12: 0000000000000000
> > R13: 0000000000000005 R14: ffff810bff4a1000 R15: 0000000000000000
> > FS:  0000000000000000(0000) GS:ffffffff805d8000(0000) knlGS:0000000000000000
> > CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
> > CR2: ffffffffffffffff CR3: 0000000000201000 CR4: 00000000000006e0
> > Process swapper (pid: 1, threadinfo ffff810bffcbc000, task ffff810bffcbb510)
> > Stack:  ffff810bff4a1000 ffffffff8055f4c0 0000000000000000 ffff810bffcbdef0
> >  0000000000000000 ffffffff8042736e 0000000000000000 0000000000000000
> >  0000000000000000 ffffffff8061c68d ffffffff806260f0 ffffffff80207182
> > Call Trace:
> >  [<ffffffff8042736e>] register_netdevice_notifier+0x3e/0x70
> >  [<ffffffff8061c68d>] packet_init+0x2d/0x53
> >  [<ffffffff80207182>] init+0x162/0x330
> >  [<ffffffff8020a9d8>] child_rip+0xa/0x12
> >  [<ffffffff8033c2a2>] acpi_ds_init_one_object+0x0/0x82
> >  [<ffffffff80207020>] init+0x0/0x330
> >  [<ffffffff8020a9ce>] child_rip+0x0/0x12
> > 
> > 
> > Code: 48 8b 45 00 0f 18 08 49 83 fd 02 4c 8d 65 f8 0f 84 f8 fe ff 
> > RIP  [<ffffffff8047ef93>] packet_notifier+0x163/0x1a0
> >  RSP <ffff810bffcbde90>
> > CR2: ffffffffffffffff
> >  <0>Kernel panic - not syncing: Attempted to kill init!
> > 
> 
> I'm really struggling to work out what went wrong there.  Comparing your
> miserable 20 bytes of code to my object code makes me think that this:
> 
> 		struct packet_sock *po = pkt_sk(sk);
> 
> returned -1, perhaps in %rbp.  But it's all very crude.

That doesn't seem possible given:

static inline struct packet_sock *pkt_sk(struct sock *sk)
{
	return (struct packet_sock *)sk;
}

That means the packet socket list is corrupted??
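
To spell the argument out: because pkt_sk() is only a cast, whatever comes out
is bit-for-bit what went in, so a bogus result implies a bogus socket pointer
was handed to it by the list walk.  A small user-space sketch with stand-in
struct definitions (these are NOT the kernel's struct sock / struct
packet_sock; cast_preserves_address() is a made-up helper):

```c
#include <stdint.h>

/* Stand-in types; the only property that matters is "sock comes first". */
struct sock { struct sock *next; int id; };
struct packet_sock { struct sock sk; int extra; };

static inline struct packet_sock *pkt_sk(struct sock *sk)
{
    return (struct packet_sock *)sk;    /* same cast as quoted above */
}

/* Returns 1 when the cast leaves the address unchanged. */
int cast_preserves_address(struct sock *sk)
{
    return (uintptr_t)pkt_sk(sk) == (uintptr_t)sk;
}
```

The first bytes of the quoted Code: line (48 8b 45 00 0f 18 08) appear to
decode as a load through %rbp followed by a prefetch, which would be
consistent with a list traversal picking up a bad node pointer rather than
with pkt_sk() itself going wrong.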

Stephen Hemminger <shemminger@osdl.org>

^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: 2.6.18-mm2
  2006-09-28 22:39 ` 2.6.18-mm2 Jim Cromie
@ 2006-09-28 23:08   ` Andi Kleen
  2006-09-29 20:14     ` 2.6.18-mm2 Ingo Molnar
  0 siblings, 1 reply; 140+ messages in thread
From: Andi Kleen @ 2006-09-28 23:08 UTC (permalink / raw)
  To: Jim Cromie; +Cc: Andrew Morton, linux-kernel

On Friday 29 September 2006 00:39, Jim Cromie wrote:
> 
> [jimc@harpo linux-2.6.18-mm2-sk]$ make
>   CHK     include/linux/version.h
>   CHK     include/linux/utsrelease.h
>   CHK     include/linux/compile.h
>   GEN     .version
>   CHK     include/linux/compile.h
>   UPD     include/linux/compile.h
>   CC      init/version.o
>   LD      init/built-in.o
>   LD      .tmp_vmlinux1
> arch/i386/kernel/built-in.o(.text+0x34f1): In function `do_nmi':
> arch/i386/kernel/traps.c:752: undefined reference to `panic_on_unrecovered_nmi'
> arch/i386/kernel/built-in.o(.text+0x3564):arch/i386/kernel/traps.c:712: undefined reference to `panic_on_unrecovered_nmi'
> 
> 
> $ grep nmi arch/i386/kernel/Makefile
> obj-$(CONFIG_X86_LOCAL_APIC)    += apic.o nmi.o
> 
> which I don't have enabled.

Will fix.

BTW I was planning to make LOCAL_APIC unconditional on i386 too, like on x86-64.
There is basically no reason ever to disable it, and the workaround for
buggy BIOSes can be done at runtime. Overall, the ratio of #ifdef / compile
breakage to code saved by disabling the APIC is definitely unbalanced.

-Andi

^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: 2.6.18-mm2 - oops in cache_alloc_refill()
  2006-09-28  8:46 2.6.18-mm2 Andrew Morton
                   ` (3 preceding siblings ...)
  2006-09-28 22:44 ` 2.6.18-mm2 Matthias Hentges
@ 2006-09-29  3:19 ` Valdis.Kletnieks
  2006-09-29  3:29   ` Andrew Morton
  2006-09-29 13:57 ` 2.6.18-mm2 J.A. Magallón
                   ` (2 subsequent siblings)
  7 siblings, 1 reply; 140+ messages in thread
From: Valdis.Kletnieks @ 2006-09-29  3:19 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel

[-- Attachment #1: Type: text/plain, Size: 9312 bytes --]

On Thu, 28 Sep 2006 01:46:23 PDT, Andrew Morton said:
> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.18/2.6.18-mm2/

Yowza.  This has been one of the most unstable -mm I've personally tried since
2.6.0 came out (and I've tried to give each and every single one a shot).

Something is giving cache_alloc_refill() massive indigestion, I'm taking
lots of oopsen in it.  Usually within 5-10 minutes I'm dead in the water.

From an untainted kernel:

Sep 28 21:51:59 turing-police kernel: [  526.046000] BUG: unable to handle kernel paging request at virtual address 00100104
Sep 28 21:51:59 turing-police kernel: [  526.046000]  printing eip:
Sep 28 21:51:59 turing-police kernel: [  526.046000] c0150c43
Sep 28 21:51:59 turing-police kernel: [  526.046000] *pde = 00000000

as far as it got logging it to disk - at that point the machine locked up
hard, even alt-sysrq was dead, had to power-cycle. Long time since that
happened.  Admittedly, that's not much to go on, but it shows that I'm having
issues in cache_alloc_refill() even when untainted.  I'll probably get more
complete untainted traces while playing  bisect-the-mm tomorrow....

Another few traces, more complete, almost same EIP (inside cache_alloc_refill
both times), but admittedly nvidia-tainted:

Sep 28 21:40:07 turing-police kernel: [  825.672000] BUG: unable to handle kernel paging request at virtual address 646c617a 
Sep 28 21:40:07 turing-police kernel: [  825.672000]  printing eip:
Sep 28 21:40:07 turing-police kernel: [  825.672000] c0150f9b
Sep 28 21:40:07 turing-police kernel: [  825.672000] *pde = 00000000
Sep 28 21:40:07 turing-police kernel: [  825.672000] Oops: 0002 [#1]
Sep 28 21:40:07 turing-police kernel: [  825.672000] PREEMPT 
Sep 28 21:40:07 turing-police kernel: [  825.672000] last sysfs file: /devices/system/cpu/cpu0/cpufreq/scaling_setspeed
Sep 28 21:40:07 turing-police kernel: [  825.672000] Modules linked in: aes cryptomgr xt_SECMARK xt_CONNSECMARK ip6table_mangle iptable_mangle nf_conntrack_ftp xt_pkttype ipt_REJECT nf_conntrack_ipv4 ipt_LOG iptable_filter ip_tables xt_tcpudp nf_conntrack_ipv6 xt_state nf_conntrack ip6t_LOG xt_limit ip6table_filter ip6_tables x_tables thermal sony_acpi processor fan button battery ac nfnetlink i8k floppy nvram orinoco_cs orinoco hermes pcmcia firmware_class nvidia yenta_socket ohci1394 ieee1394 rsrc_nonstatic intel_agp pcmcia_core agpgart iTCO_wdt rtc
Sep 28 21:40:07 turing-police kernel: [  825.672000] CPU:    0
Sep 28 21:40:07 turing-police kernel: [  825.672000] EIP:    0060:[<c0150f9b>]    Tainted: P      VLI
Sep 28 21:40:07 turing-police kernel: [  825.672000] EFLAGS: 00210002   (2.6.18-mm2 #1)
Sep 28 21:40:07 turing-police kernel: [  825.672000] EIP is at cache_alloc_refill+0x12a/0x453
Sep 28 21:40:07 turing-police kernel: [  825.672000] eax: effdf4d0   ebx: effdfa40   ecx: 00000001   edx: 646c6176
Sep 28 21:40:07 turing-police kernel: [  825.672000] esi: dffedd00   edi: effdf4c0   ebp: def37f0c   esp: def37ec8
Sep 28 21:40:07 turing-police kernel: [  825.672000] ds: 007b   es: 007b   ss: 0068  
Sep 28 21:40:07 turing-police kernel: [  825.672000] Process badpost (pid: 3474, ti=def36000 task=dfe9aaa0 task.ti=def36000) 
Sep 28 21:40:07 turing-police kernel: [  825.672000] Stack: effe03e0 66666174 000000d0 effe18c0 00000003 effdfa40 00000000 ffffffff 
Sep 28 21:40:07 turing-police kernel: [  825.672000]        00000000 ffffffff 00000001 def37fbc 01200011 00000000 00200286 fffffff4 
Sep 28 21:40:07 turing-police kernel: [  825.672000]        dfe9aaa0 def37f18 c0150e68 def37fbc def37f5c c0111b6a def37fbc bfda5158 
Sep 28 21:40:07 turing-police kernel: [  825.672000] Call Trace:
Sep 28 21:40:07 turing-police kernel: [  825.672000]  [<c0150e68>] kmem_cache_alloc+0x25/0x2e
Sep 28 21:40:07 turing-police kernel: [  825.672000]  [<c0111b6a>] copy_process+0xa2/0x1183
Sep 28 21:40:07 turing-police kernel: [  825.672000]  [<c0112dbf>] do_fork+0x8d/0x172
Sep 28 21:40:07 turing-police kernel: [  825.672000]  [<c0101216>] sys_clone+0x25/0x2a
Sep 28 21:40:07 turing-police kernel: [  825.672000]  [<c0102d23>] syscall_call+0x7/0xb
Sep 28 21:40:07 turing-police kernel: [  825.672000] DWARF2 unwinder stuck at syscall_call+0x7/0xb
Sep 28 21:40:07 turing-police kernel: [  825.672000] 
Sep 28 21:40:07 turing-police kernel: [  825.672000] Leftover inexact backtrace:
Sep 28 21:40:07 turing-police kernel: [  825.672000]  
Sep 28 21:40:07 turing-police kernel: [  825.672000]  =======================
Sep 28 21:40:07 turing-police kernel: [  825.672000] Code: 9e 1c 89 46 14 8b 5d d0 89 54 8b 10 41 89 0b 8b 46 10 89 45 c0 8b 55 c8 3b 42 1c 73 09 ff 4d cc 83 7d cc ff 75 bd 8b 16 8b 46 04 <89> 42 04 89 10 c7 06 00 01 10 00 c7 46 04 00 02 20 00 83 7e 14
Sep 28 21:40:07 turing-police kernel: [  825.672000] EIP: [<c0150f9b>] cache_alloc_refill+0x12a/0x453 SS:ESP 0068:def37ec8
Sep 28 21:40:07 turing-police kernel: [  825.672000]  <6>note: badpost[3474] exited with preempt_count 1

And then a second oops at the same exact EIP as the first tainted one:

Sep 28 21:40:11 turing-police kernel: [  829.630000] BUG: unable to handle kernel paging request at virtual address 646c617a 
Sep 28 21:40:11 turing-police kernel: [  829.630000]  printing eip:
Sep 28 21:40:11 turing-police kernel: [  829.630000] c0150f9b
Sep 28 21:40:11 turing-police kernel: [  829.630000] *pde = 00000000
Sep 28 21:40:11 turing-police kernel: [  829.630000] Oops: 0002 [#2]
Sep 28 21:40:11 turing-police kernel: [  829.630000] PREEMPT 
Sep 28 21:40:11 turing-police kernel: [  829.630000] last sysfs file: /devices/system/cpu/cpu0/cpufreq/scaling_setspeed
Sep 28 21:40:11 turing-police kernel: [  829.630000] Modules linked in: aes cryptomgr xt_SECMARK xt_CONNSECMARK ip6table_mangle iptable_mangle nf_conntrack_ftp xt_pkttype ipt_REJECT nf_conntrack_ipv4 ipt_LOG iptable_filter ip_tables xt_tcpudp nf_conntrack_ipv6 xt_state nf_conntrack ip6t_LOG xt_limit ip6table_filter ip6_tables x_tables thermal sony_acpi processor fan button battery ac nfnetlink i8k floppy nvram orinoco_cs orinoco hermes pcmcia firmware_class nvidia yenta_socket ohci1394 ieee1394 rsrc_nonstatic intel_agp pcmcia_core agpgart iTCO_wdt rtc
Sep 28 21:40:11 turing-police kernel: [  829.630000] CPU:    0
Sep 28 21:40:11 turing-police kernel: [  829.630000] EIP:    0060:[<c0150f9b>]    Tainted: P      VLI
Sep 28 21:40:11 turing-police kernel: [  829.630000] EFLAGS: 00210002   (2.6.18-mm2 #1)
Sep 28 21:40:11 turing-police kernel: [  829.630000] EIP is at cache_alloc_refill+0x12a/0x453
Sep 28 21:40:11 turing-police kernel: [  829.630000] eax: effdf4d0   ebx: effdfa40   ecx: 00000000   edx: 646c6176
Sep 28 21:40:11 turing-police kernel: [  829.630000] esi: dffedd00   edi: effdf4c0   ebp: e11d3f0c   esp: e11d3ec8
Sep 28 21:40:11 turing-police kernel: [  829.630000] ds: 007b   es: 007b   ss: 0068

I've seen mostly 3 different stack traces for this:

EIP is at cache_alloc_refill+0x12d/0x453
eax: 00000167   ebx: effdfa40   ecx: 00000001   edx: d9eede00
esi: daf19700   edi: effdf4c0   ebp: db237f0c   esp: db237ec8
ds: 007b   es: 007b   ss: 0068
Process procmail (pid: 3206, ti=db236000 task=db299550 task.ti=db236000)
Stack: effe03e0 00000001 000000d0 effe18c0 00000003 effdfa40 00000000 ffffffff
       00000000 ffffffff 00000001 db237fbc 01200011 00000000 00200286 fffffff4
       db299550 db237f18 c0150e68 db237fbc db237f5c c0111b6a db237fbc bfbc3678
Call Trace:
 [<c0150e68>] kmem_cache_alloc+0x25/0x2e
 [<c0111b6a>] copy_process+0xa2/0x1183
 [<c0112dbf>] do_fork+0x8d/0x172
 [<c0101216>] sys_clone+0x25/0x2a
 [<c0102d23>] syscall_call+0x7/0xb

and

EIP is at cache_alloc_refill+0x12d/0x453
eax: 00000167   ebx: effdfa40   ecx: 00000000   edx: d9eede00
esi: daf19700   edi: effdf4c0   ebp: dceedda8   esp: dceedd64
ds: 007b   es: 007b   ss: 0068
Process fetchmail (pid: 2752, ti=dceec000 task=dbfb9aa0 task.ti=dceec000)
Stack: effe03e0 00000001 000000d0 effe18c0 00000004 effdfa40 00000000 e2774500
       dceeddd4 c02fe47b 0000014f 0000014f 0000000f 00000473 00200286 00000f80
       db1c1680 dceeddb4 c015130c db1c1680 dceeddd8 c02d2fb9 00000001 000000d0
Call Trace:
 [<c015130c>] __kmalloc+0x48/0x55
 [<c02d2fb9>] __alloc_skb+0x4f/0xf7
 [<c02f4b2a>] tcp_sendmsg+0x14c/0x965
 [<c030bdf4>] inet_sendmsg+0x3b/0x48
 [<c02cdb8b>] sock_aio_write+0xf5/0x102
 [<c0153691>] do_sync_write+0xae/0xec
 [<c0153e6b>] vfs_write+0xbc/0x157
 [<c01543be>] sys_write+0x3b/0x60
 [<c0102d23>] syscall_call+0x7/0xb

and

EIP is at cache_alloc_refill+0x12d/0x453
eax: 00000167   ebx: effdfa40   ecx: 00000000   edx: d9eede00
esi: daf19700   edi: effdf4c0   ebp: ddb2fdc4   esp: ddb2fd80
ds: 007b   es: 007b   ss: 0068
Process Eterm (pid: 2700, ti=ddb2e000 task=e39ab000 task.ti=ddb2e000)
Stack: effe03e0 00000001 000000d0 effe18c0 00000004 effdfa40 00000000 00000017
       00170001 00200082 ddb2fdd0 00200082 00000000 00000000 00200286 00000f80
       ee80cd80 ddb2fdd0 c015130c ee80cd80 ddb2fdf4 c02d2fb9 00000000 000000d0
Call Trace:
 [<c015130c>] __kmalloc+0x48/0x55
 [<c02d2fb9>] __alloc_skb+0x4f/0xf7
 [<c02cfef1>] sock_alloc_send_skb+0x5a/0x17b
 [<c03258b9>] unix_stream_sendmsg+0x13b/0x2e6
 [<c02cdb8b>] sock_aio_write+0xf5/0x102
 [<c0153691>] do_sync_write+0xae/0xec
 [<c0153e6b>] vfs_write+0xbc/0x157
 [<c01543be>] sys_write+0x3b/0x60
 [<c0102d23>] syscall_call+0x7/0xb

[-- Attachment #2: Type: application/pgp-signature, Size: 226 bytes --]

^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: 2.6.18-mm2 - oops in cache_alloc_refill()
  2006-09-29  3:19 ` 2.6.18-mm2 - oops in cache_alloc_refill() Valdis.Kletnieks
@ 2006-09-29  3:29   ` Andrew Morton
  2006-09-29  3:58     ` Valdis.Kletnieks
  2006-09-29 15:19     ` Valdis.Kletnieks
  0 siblings, 2 replies; 140+ messages in thread
From: Andrew Morton @ 2006-09-29  3:29 UTC (permalink / raw)
  To: Valdis.Kletnieks; +Cc: linux-kernel

On Thu, 28 Sep 2006 23:19:11 -0400
Valdis.Kletnieks@vt.edu wrote:

> On Thu, 28 Sep 2006 01:46:23 PDT, Andrew Morton said:
> > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.18/2.6.18-mm2/
> 
> Yowza.  This has been one of the most unstable -mm I've personally tried since
> 2.6.0 came out (and I've tried to give each and every single one a shot).
> 
> Something is giving cache_alloc_refill() massive indigestion, I'm taking
> lots of oopsen in it.  Usually within 5-10 minutes I'm dead in the water.

Could be anything I'm afraid.  But you're the first to report it, so there's
something distinct in your .config or hardware.  

Whose idea was it to make it a monolithic kernel??

> From an untainted kernel:
> 
> Sep 28 21:51:59 turing-police kernel: [  526.046000] BUG: unable to handle kernel paging request at virtual address 00100104
> Sep 28 21:51:59 turing-police kernel: [  526.046000]  printing eip:
> Sep 28 21:51:59 turing-police kernel: [  526.046000] c0150c43
> Sep 28 21:51:59 turing-police kernel: [  526.046000] *pde = 00000000
> 
> as far as it got logging it to disk - at that point the machine locked up
> hard, even alt-sysrq was dead, had to power-cycle. Long time since that
> happened.  Admittedly, that's not much to go on, but it shows that I'm having
> issues in cache_alloc_refill() even when untainted.  I'll probably get more
> complete untainted traces while playing  bisect-the-mm tomorrow....

bisecting would be good, thanks.  It might be quicker to strip down the .config
though.


^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: 2.6.18-mm2 - oops in cache_alloc_refill()
  2006-09-29  3:29   ` Andrew Morton
@ 2006-09-29  3:58     ` Valdis.Kletnieks
  2006-09-29 15:19     ` Valdis.Kletnieks
  1 sibling, 0 replies; 140+ messages in thread
From: Valdis.Kletnieks @ 2006-09-29  3:58 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel

[-- Attachment #1: Type: text/plain, Size: 988 bytes --]

On Thu, 28 Sep 2006 20:29:31 PDT, Andrew Morton said:
> On Thu, 28 Sep 2006 23:19:11 -0400
> Valdis.Kletnieks@vt.edu wrote:
> 
> > On Thu, 28 Sep 2006 01:46:23 PDT, Andrew Morton said:
> > > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.18/2.6.18-mm2/

> > Something is giving cache_alloc_refill() massive indigestion, I'm taking
> > lots of oopsen in it.  Usually within 5-10 minutes I'm dead in the water.
> 
> Could be anything I'm afraid.  But you're the first to report it, so there's
> something distinct in your .config or hardware.

Like *that* hasn't happened before. :)

> bisecting would be good, thanks.  It might be quicker to strip down the .config
> though.

On the other hand, this really smells like the kind of storage overlay that
changing the config can change what gets overlaid, scaring it into hiding.
The fact the system lives 5-10 minutes means that there's *something* that
happens that makes it manifest - and that could be almost anything.
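
One possible reading of the fault addresses quoted earlier in this thread
supports the overlay theory.  The tainted oops faults on the store marked
<89> 42 04 (apparently the "next->prev = prev" step of an inlined list_del())
with edx = 646c6176, and the untainted fault address 00100104 is exactly
LIST_POISON1 + 4.  The sketch below just does that arithmetic; the helper
names are invented, and the offset of ->prev is assumed to be 4 as on a
32-bit kernel:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LIST_POISON1 0x00100100u  /* written to ->next by list_del() */
#define LIST_POISON2 0x00200200u  /* written to ->prev by list_del() */
#define PREV_OFFSET  4u           /* assumed: offset of ->prev on 32-bit */

/* Recover the 'next' pointer from the address of a "next->prev = ..." fault. */
uint32_t next_from_fault(uint32_t fault_addr)
{
    return fault_addr - PREV_OFFSET;
}

/* Render the four little-endian bytes of a word as printable characters. */
void word_as_ascii(uint32_t w, char out[5])
{
    assert(out != NULL);
    for (int i = 0; i < 4; i++) {
        unsigned char c = (unsigned char)((w >> (8 * i)) & 0xff);
        out[i] = (c >= 0x20 && c < 0x7f) ? (char)c : '.';
    }
    out[4] = '\0';
}
```

Feeding in 0x00100104 yields LIST_POISON1, which would fit a list entry being
unlinked twice; feeding in 0x646c617a yields a word whose bytes are all
printable ASCII, which would fit text having been written over a slab list
pointer.  Neither observation identifies the culprit, so bisection still
looks necessary.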

[-- Attachment #2: Type: application/pgp-signature, Size: 226 bytes --]

^ permalink raw reply	[flat|nested] 140+ messages in thread

* md deadlock (was Re: 2.6.18-mm2)
  2006-09-28 11:54 ` 2.6.18-mm2 Michal Piotrowski
@ 2006-09-29 12:12   ` Peter Zijlstra
  2006-09-29 12:52     ` Neil Brown
  0 siblings, 1 reply; 140+ messages in thread
From: Peter Zijlstra @ 2006-09-29 12:12 UTC (permalink / raw)
  To: Michal Piotrowski
  Cc: Andrew Morton, Ingo Molnar, Neil Brown, linux-raid, linux-kernel

On Thu, 2006-09-28 at 13:54 +0200, Michal Piotrowski wrote:
> Hi,
> 
> On 28/09/06, Andrew Morton <akpm@osdl.org> wrote:
> >
> > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.18/2.6.18-mm2/
> >
> >
> 
> =======================================================
> [ INFO: possible circular locking dependency detected ]
> 2.6.18-mm2 #1
> -------------------------------------------------------
> nash/1264 is trying to acquire lock:
>  (&bdev_part_lock_key){--..}, at: [<c0310d4a>] mutex_lock+0x1c/0x1f
> 
> but task is already holding lock:
>  (&new->reconfig_mutex){--..}, at: [<c03108ff>] mutex_lock_interruptible+0x1c/0x1f
> 
> which lock already depends on the new lock.
> 
> 
> the existing dependency chain (in reverse order) is:
> 
> -> #2 (&new->reconfig_mutex){--..}:
>        [<c01390b8>] add_lock_to_list+0x5c/0x7a
>        [<c013b1dd>] __lock_acquire+0x9f3/0xaef
>        [<c013b643>] lock_acquire+0x71/0x91
>        [<c031068f>] __mutex_lock_interruptible_slowpath+0xd2/0x326
>        [<c03108ff>] mutex_lock_interruptible+0x1c/0x1f
>        [<c02ba4e3>] md_open+0x28/0x5d			-> mddev->reconfig_mutex
>        [<c0197853>] do_open+0x8b/0x377			-> bdev->bd_mutex (whole)
>        [<c0197cd5>] blkdev_open+0x1d/0x46
>        [<c0172f36>] __dentry_open+0x133/0x260
>        [<c01730d1>] nameidata_to_filp+0x1c/0x2e
>        [<c0173111>] do_filp_open+0x2e/0x35
>        [<c0173170>] do_sys_open+0x58/0xde
>        [<c0173222>] sys_open+0x16/0x18
>        [<c0103297>] syscall_call+0x7/0xb
>        [<ffffffff>] 0xffffffff
> 
> -> #1 (&bdev->bd_mutex){--..}:
>        [<c01390b8>] add_lock_to_list+0x5c/0x7a
>        [<c013b1dd>] __lock_acquire+0x9f3/0xaef
>        [<c013b643>] lock_acquire+0x71/0x91
>        [<c0310b0f>] __mutex_lock_slowpath+0xd2/0x2f1
>        [<c0310d4a>] mutex_lock+0x1c/0x1f
>        [<c0197824>] do_open+0x5c/0x377
>        [<c0197bab>] blkdev_get+0x6c/0x77
>        [<c01978d0>] do_open+0x108/0x377
>        [<c0197bab>] blkdev_get+0x6c/0x77
>        [<c0197eb1>] open_by_devnum+0x30/0x3c
>        [<c0147419>] swsusp_check+0x14/0xc5
>        [<c0145865>] software_resume+0x7e/0x100
>        [<c010049e>] init+0x121/0x29f
>        [<c0103f23>] kernel_thread_helper+0x7/0x10
>        [<c0109523>] save_stack_trace+0x17/0x30
>        [<c0138fb0>] save_trace+0x4f/0xfb
>        [<c01390b8>] add_lock_to_list+0x5c/0x7a
>        [<c013b1dd>] __lock_acquire+0x9f3/0xaef
>        [<c013b643>] lock_acquire+0x71/0x91
>        [<c0310b0f>] __mutex_lock_slowpath+0xd2/0x2f1
>        [<c0310d4a>] mutex_lock+0x1c/0x1f
>        [<c0197824>] do_open+0x5c/0x377			-> bdev->bd_mutex (whole)
>        [<c0197bab>] blkdev_get+0x6c/0x77
>        [<c01978d0>] do_open+0x108/0x377			-> bdev->bd_mutex (partition)
>        [<c0197bab>] blkdev_get+0x6c/0x77
>        [<c0197eb1>] open_by_devnum+0x30/0x3c
>        [<c0147419>] swsusp_check+0x14/0xc5
>        [<c0145865>] software_resume+0x7e/0x100
>        [<c010049e>] init+0x121/0x29f
>        [<c0103f23>] kernel_thread_helper+0x7/0x10
>        [<ffffffff>] 0xffffffff
> 
> -> #0 (&bdev_part_lock_key){--..}:
>        [<c013a7b6>] print_circular_bug_tail+0x30/0x64
>        [<c013b114>] __lock_acquire+0x92a/0xaef
>        [<c013b643>] lock_acquire+0x71/0x91
>        [<c0310b0f>] __mutex_lock_slowpath+0xd2/0x2f1
>        [<c0310d4a>] mutex_lock+0x1c/0x1f
>        [<c0197323>] bd_claim_by_disk+0x5f/0x18e		-> bdev->bd_mutex (partition)
>        [<c02b44ec>] bind_rdev_to_array+0x1f0/0x20e
>        [<c02b6453>] autostart_arrays+0x24b/0x322
>        [<c02b9158>] md_ioctl+0x91/0x13f4
>        [<c01ea5bc>] blkdev_driver_ioctl+0x49/0x5b
>        [<c01ead23>] blkdev_ioctl+0x755/0x7a2
>        [<c0196f9d>] block_ioctl+0x16/0x1b
>        [<c01801d2>] do_ioctl+0x22/0x67
>        [<c0180460>] vfs_ioctl+0x249/0x25c
>        [<c01804ba>] sys_ioctl+0x47/0x75
>        [<c0103297>] syscall_call+0x7/0xb
>        [<ffffffff>] 0xffffffff
> 
> other info that might help us debug this:
> 
> 1 lock held by nash/1264:
>  #0:  (&new->reconfig_mutex){--..}, at: [<c03108ff>] mutex_lock_interruptible+0x1c/0x1f
> stack backtrace:
>  [<c0104215>] dump_trace+0x64/0x1cd
>  [<c0104390>] show_trace_log_lvl+0x12/0x25
>  [<c01049e5>] show_trace+0xd/0x10
>  [<c0104aad>] dump_stack+0x19/0x1b
>  [<c013a7df>] print_circular_bug_tail+0x59/0x64
>  [<c013b114>] __lock_acquire+0x92a/0xaef
>  [<c013b643>] lock_acquire+0x71/0x91
>  [<c0310b0f>] __mutex_lock_slowpath+0xd2/0x2f1
>  [<c0310d4a>] mutex_lock+0x1c/0x1f
>  [<c0197323>] bd_claim_by_disk+0x5f/0x18e		-> bdev->bd_mutex (part)
>  [<c02b44ec>] bind_rdev_to_array+0x1f0/0x20e
                autorun_devices				-> mddev->reconfig_mutex
>  [<c02b6453>] autostart_arrays+0x24b/0x322
>  [<c02b9158>] md_ioctl+0x91/0x13f4
>  [<c01ea5bc>] blkdev_driver_ioctl+0x49/0x5b
>  [<c01ead23>] blkdev_ioctl+0x755/0x7a2
>  [<c0196f9d>] block_ioctl+0x16/0x1b
>  [<c01801d2>] do_ioctl+0x22/0x67
>  [<c0180460>] vfs_ioctl+0x249/0x25c
>  [<c01804ba>] sys_ioctl+0x47/0x75
>  [<c0103297>] syscall_call+0x7/0xb
> DWARF2 unwinder stuck at syscall_call+0x7/0xb
> 
> Leftover inexact backtrace:

Looks like a real deadlock here. It seems to me #2 is the easiest to
break.

static int md_open(struct inode *inode, struct file *file)
{
	/*
	 * Succeed if we can lock the mddev, which confirms that
	 * it isn't being stopped right now.
	 */
	mddev_t *mddev = inode->i_bdev->bd_disk->private_data;
	int err;

	if ((err = mddev_lock(mddev)))
		goto out;

	err = 0;
	mddev_get(mddev);
	mddev_unlock(mddev);

	check_disk_change(inode->i_bdev);
 out:
	return err;
}

mddev_get() is a simple atomic_inc(), and I fail to see how waiting for
the lock makes any difference.
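
For readers following the chain, here is a toy model of the three
dependencies annotated above.  It is only a sketch of the idea (record "B was
taken while A was held", then check whether a new acquisition closes a loop);
it is not how lockdep is implemented, and the enum and function names are
invented for this example:

```c
#include <assert.h>

enum lock_class { RECONFIG_MUTEX, BD_MUTEX_PART, BD_MUTEX_WHOLE, NR_CLASSES };

/* dep[a][b] != 0 means "b was acquired while a was already held". */
static const int dep[NR_CLASSES][NR_CLASSES] = {
    [BD_MUTEX_WHOLE][RECONFIG_MUTEX] = 1,  /* #2: do_open() -> md_open() */
    [BD_MUTEX_PART][BD_MUTEX_WHOLE]  = 1,  /* #1: do_open(part) -> do_open(whole) */
};

/* Depth-first search: is 'to' reachable from 'from' via recorded edges? */
static int reachable(int from, int to, int *seen)
{
    if (from == to)
        return 1;
    seen[from] = 1;
    for (int next = 0; next < NR_CLASSES; next++)
        if (dep[from][next] && !seen[next] && reachable(next, to, seen))
            return 1;
    return 0;
}

/* Would acquiring 'want' while holding 'held' create a circular dependency? */
int closes_cycle(enum lock_class held, enum lock_class want)
{
    int seen[NR_CLASSES] = { 0 };

    assert(held < NR_CLASSES && want < NR_CLASSES);
    return reachable(want, held, seen);
}
```

In this model, closes_cycle(RECONFIG_MUTEX, BD_MUTEX_PART) is true: that is
the #0 acquisition in bd_claim_by_disk(), which completes
reconfig_mutex -> bd_mutex (partition) -> bd_mutex (whole) -> reconfig_mutex.
Dropping the #2 edge, as suggested above, removes the only path back to
reconfig_mutex.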




^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: md deadlock (was Re: 2.6.18-mm2)
  2006-09-29 12:12   ` md deadlock (was Re: 2.6.18-mm2) Peter Zijlstra
@ 2006-09-29 12:52     ` Neil Brown
  2006-09-29 14:03       ` Peter Zijlstra
  0 siblings, 1 reply; 140+ messages in thread
From: Neil Brown @ 2006-09-29 12:52 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Michal Piotrowski, Andrew Morton, Ingo Molnar, linux-raid, linux-kernel

On Friday September 29, a.p.zijlstra@chello.nl wrote:
> On Thu, 2006-09-28 at 13:54 +0200, Michal Piotrowski wrote:
> 
> Looks like a real deadlock here. It seems to me #2 is the easiest to
> break.

I guess it could deadlock if you tried to add /dev/md0 as a component
of /dev/md0.  I should probably check for that somewhere.
In other cases the array->member ordering ensures there is no
deadlock.

> 
> static int md_open(struct inode *inode, struct file *file)
> {
> 	/*
> 	 * Succeed if we can lock the mddev, which confirms that
> 	 * it isn't being stopped right now.
> 	 */
> 	mddev_t *mddev = inode->i_bdev->bd_disk->private_data;
> 	int err;
> 
> 	if ((err = mddev_lock(mddev)))
> 		goto out;
> 
> 	err = 0;
> 	mddev_get(mddev);
> 	mddev_unlock(mddev);
> 
> 	check_disk_change(inode->i_bdev);
>  out:
> 	return err;
> }
> 
> mddev_get() is a simple atomic_inc(), and I fail to see how waiting for
> the lock makes any difference.

Hmm... I'm pretty sure I do want some sort of locking there - to make
sure that the
		if (atomic_read(&mddev->active)>2) {
test in do_md_stop actually means something.  However it does seem
that the locking I have doesn't really guarantee anything much.

But I really think that this locking order should be allowed.  md
should ensure that there are never any loops in the array->member
ordering, and somehow that needs to be communicated to lockdep.

One of the items on my todo list is to sort out the lifetime rules of
md devices (once accessed, they currently never disappear).  Getting
this locking right should be part of that.

NeilBrown

^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: 2.6.18-mm2
  2006-09-28  8:46 2.6.18-mm2 Andrew Morton
                   ` (4 preceding siblings ...)
  2006-09-29  3:19 ` 2.6.18-mm2 - oops in cache_alloc_refill() Valdis.Kletnieks
@ 2006-09-29 13:57 ` J.A. Magallón
  2006-09-29 14:39   ` 2.6.18-mm2 Matthew Wilcox
  2006-09-30  7:04 ` 2.6.18-mm2 - possible recursive locking detected Borislav Petkov
       [not found] ` <20060930133706.GA3291@melchior.yamamaya.is-a-geek.org>
  7 siblings, 1 reply; 140+ messages in thread
From: J.A. Magallón @ 2006-09-29 13:57 UTC (permalink / raw)
  To: Andrew Morton, Linux-Kernel, linux-scsi

On Thu, 28 Sep 2006 01:46:23 -0700, Andrew Morton <akpm@osdl.org> wrote:

> 
> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.18/2.6.18-mm2/
> 
> 

aic7xxx oopses on boot:

PCI: Setting latency timer of device 0000:00:0e.0 to 64
IRQ handler type mismatch for IRQ 0
 [<c013c697>] setup_irq+0xb7/0x1b0
 [<c0274770>] ahc_linux_isr+0x0/0x50
 [<c013c833>] request_irq+0xa3/0xc0
 [<c027605c>] ahc_pci_map_int+0x2c/0x50
 [<c027167a>] ahc_pci_config+0x5ea/0xcf0
 [<c0208c00>] pci_bus_write_config_byte+0x30/0x70
 [<c02761dc>] ahc_linux_pci_dev_probe+0xec/0x1e0
 [<c01983b5>] sysfs_dirent_exist+0x45/0x70
 [<c019927b>] sysfs_create_link+0x7b/0x180
 [<c020d643>] pci_match_device+0x13/0xd0
 [<c0202b2f>] kobject_get+0xf/0x20
 [<c020d776>] pci_device_probe+0x56/0x80
 [<c024ea7b>] really_probe+0x3b/0xe0
 [<c024eb5f>] driver_probe_device+0x3f/0xa0
 [<c030c7a3>] klist_next+0x53/0xa0
 [<c024ecba>] __driver_attach+0x7a/0x80
 [<c024e01a>] bus_for_each_dev+0x3a/0x60
 [<c024e986>] driver_attach+0x16/0x20
 [<c024ec40>] __driver_attach+0x0/0x80
 [<c024e39c>] bus_add_driver+0x7c/0x1a0
 [<c020d935>] __pci_register_driver+0x65/0x90
 [<c0405749>] ahc_linux_init+0x79/0x90
 [<c01004b0>] init+0x120/0x330
 [<c0102eca>] ret_from_fork+0x6/0x1c
 [<c0100390>] init+0x0/0x330
 [<c0100390>] init+0x0/0x330
 [<c0103b13>] kernel_thread_helper+0x7/0x14
 =======================
aic7xxx: probe of 0000:00:0e.0 failed with error -16

lspci:

leda:~# lspci
00:00.0 Host bridge: Intel Corporation 440BX/ZX/DX - 82443BX/ZX/DX Host bridge (rev 03)
00:01.0 PCI bridge: Intel Corporation 440BX/ZX/DX - 82443BX/ZX/DX AGP bridge (rev 03)
00:07.0 ISA bridge: Intel Corporation 82371AB/EB/MB PIIX4 ISA (rev 02)
00:07.1 IDE interface: Intel Corporation 82371AB/EB/MB PIIX4 IDE (rev 01)
00:07.2 USB Controller: Intel Corporation 82371AB/EB/MB PIIX4 USB (rev 01)
00:07.3 Bridge: Intel Corporation 82371AB/EB/MB PIIX4 ACPI (rev 02)
00:0d.0 SCSI storage controller: Adaptec AIC-7892A U160/m (rev 02)
00:0e.0 SCSI storage controller: Adaptec AHA-2940U2/U2W / 7890/7891 (rev 01)
00:0f.0 Ethernet controller: 3Com Corporation 3c905B 100BaseTX [Cyclone] (rev 64)
00:12.0 Multimedia audio controller: Creative Labs SB Live! EMU10k1 (rev 07)
00:12.1 Input device controller: Creative Labs SB Live! Game Port (rev 07)
01:00.0 VGA compatible controller: nVidia Corporation NV34 [GeForce FX 5200] (rev a1)

(the 2940 is onboard and the U160 is a PCI card).

Full dmesg follows:

Linux version 2.6.18-jam02 (root@rescue) (gcc version 4.1.1 20060724 (prerelease) (4.1.1-3mdk)) #1 SMP Fri Sep 29 12:31:45 CEST 2006
BIOS-provided physical RAM map:
sanitize start
sanitize end
copy_e820_map() start: 0000000000000000 size: 000000000009fc00 end: 000000000009fc00 type: 1
copy_e820_map() type is E820_RAM
copy_e820_map() start: 000000000009fc00 size: 0000000000000400 end: 00000000000a0000 type: 2
copy_e820_map() start: 00000000000e0000 size: 0000000000020000 end: 0000000000100000 type: 2
copy_e820_map() start: 0000000000100000 size: 000000001ff00000 end: 0000000020000000 type: 1
copy_e820_map() type is E820_RAM
copy_e820_map() start: 00000000fec00000 size: 0000000000001000 end: 00000000fec01000 type: 2
copy_e820_map() start: 00000000fee00000 size: 0000000000001000 end: 00000000fee01000 type: 2
copy_e820_map() start: 00000000fffc0000 size: 0000000000040000 end: 0000000100000000 type: 2
 BIOS-e820: 0000000000000000 - 000000000009fc00 (usable)
 BIOS-e820: 000000000009fc00 - 00000000000a0000 (reserved)
 BIOS-e820: 00000000000e0000 - 0000000000100000 (reserved)
 BIOS-e820: 0000000000100000 - 0000000020000000 (usable)
 BIOS-e820: 00000000fec00000 - 00000000fec01000 (reserved)
 BIOS-e820: 00000000fee00000 - 00000000fee01000 (reserved)
 BIOS-e820: 00000000fffc0000 - 0000000100000000 (reserved)
0MB HIGHMEM available.
512MB LOWMEM available.
found SMP MP-table at 000fb4c0
Entering add_active_range(0, 0, 131072) 0 entries of 256 used
Zone PFN ranges:
  DMA             0 ->     4096
  Normal       4096 ->   131072
  HighMem    131072 ->   131072
early_node_map[1] active PFN ranges
    0:        0 ->   131072
On node 0 totalpages: 131072
  DMA zone: 32 pages used for memmap
  DMA zone: 0 pages reserved
  DMA zone: 4064 pages, LIFO batch:0
  Normal zone: 992 pages used for memmap
  Normal zone: 125984 pages, LIFO batch:31
  HighMem zone: 0 pages used for memmap
DMI 2.1 present.
ACPI: Unable to locate RSDP
Intel MultiProcessor Specification v1.4
    Virtual Wire compatibility mode.
OEM ID: INTEL    Product ID: 440BX        APIC at: 0xFEE00000
Processor #0 6:7 APIC version 17
Processor #1 6:7 APIC version 17
I/O APIC #2 Version 17 at 0xFEC00000.
Enabling APIC mode:  Flat.  Using 1 I/O APICs
Processors: 2
Allocating PCI resources starting at 30000000 (gap: 20000000:dec00000)
Detected 501.164 MHz processor.
Built 1 zonelists.  Total pages: 130048
Kernel command line: vga=6 root=/dev/sda1
mapped APIC to ffffd000 (fee00000)
mapped IOAPIC to ffffc000 (fec00000)
Enabling fast FPU save and restore... done.
Enabling unmasked SIMD FPU exception support... done.
Initializing CPU#0
PID hash table entries: 2048 (order: 11, 8192 bytes)
Console: colour VGA+ 80x60
Dentry cache hash table entries: 65536 (order: 6, 262144 bytes)
Inode-cache hash table entries: 32768 (order: 5, 131072 bytes)
Memory: 515792k/524288k available (2111k kernel code, 8076k reserved, 845k data, 204k init, 0k highmem)
virtual kernel memory layout:
    fixmap  : 0xfff9d000 - 0xfffff000   ( 392 kB)
    pkmap   : 0xff800000 - 0xffc00000   (4096 kB)
    vmalloc : 0xe0800000 - 0xff7fe000   ( 495 MB)
    lowmem  : 0xc0000000 - 0xe0000000   ( 512 MB)
      .init : 0xc03ea000 - 0xc041d000   ( 204 kB)
      .data : 0xc030fe3a - 0xc03e33a4   ( 845 kB)
      .text : 0xc0100000 - 0xc030fe3a   (2111 kB)
Checking if this processor honours the WP bit even in supervisor mode... Ok.
Calibrating delay using timer specific routine.. 1003.12 BogoMIPS (lpj=5015604)
Mount-cache hash table entries: 512
CPU: After generic identify, caps: 0383fbff 00000000 00000000 00000000 00000000 00000000 00000000
CPU: L1 I cache: 16K, L1 D cache: 16K
CPU: L2 cache: 512K
CPU: After all inits, caps: 0383fbff 00000000 00000000 00000040 00000000 00000000 00000000
Checking 'hlt' instruction... OK.
Freeing SMP alternatives: 16k freed
CPU0: Intel Pentium III (Katmai) stepping 03
Booting processor 1/1 eip 2000
Initializing CPU#1
Calibrating delay using timer specific routine.. 1002.31 BogoMIPS (lpj=5011557)
CPU: After generic identify, caps: 0383fbff 00000000 00000000 00000000 00000000 00000000 00000000
CPU: L1 I cache: 16K, L1 D cache: 16K
CPU: L2 cache: 512K
CPU: After all inits, caps: 0383fbff 00000000 00000000 00000040 00000000 00000000 00000000
CPU1: Intel Pentium III (Katmai) stepping 03
Total of 2 processors activated (2005.43 BogoMIPS).
ExtINT not setup in hardware but reported by MP table
ENABLING IO-APIC IRQs
..TIMER: vector=0x31 apic1=0 pin1=2 apic2=0 pin2=0
checking TSC synchronization across 2 CPUs: passed.
Brought up 2 CPUs
migration_cost=2850
NET: Registered protocol family 16
PCI: PCI BIOS revision 2.10 entry at 0xfdb81, last bus=1
PCI: Using configuration type 1
Setting up standard PCI resources
ACPI: Interpreter disabled.
SCSI subsystem initialized
PCI: Probing PCI hardware
PCI: Probing PCI hardware (bus 00)
* Found PM-Timer Bug on the chipset. Due to workarounds for a bug,
* this clock source is slow. Consider trying other clock sources
PCI quirk: region 0400-043f claimed by PIIX4 ACPI
PCI quirk: region 0440-044f claimed by PIIX4 SMB
PIIX4 devres B PIO at 0290-0297
Boot video device is 0000:01:00.0
PCI: Cannot allocate resource region 0 of device 0000:00:0e.0
PCI: Bridge: 0000:00:01.0
  IO window: d000-dfff
  MEM window: fca00000-feafffff
  PREFETCH window: e4800000-f48fffff
NET: Registered protocol family 2
IP route cache hash table entries: 16384 (order: 4, 65536 bytes)
TCP established hash table entries: 65536 (order: 7, 524288 bytes)
TCP bind hash table entries: 32768 (order: 6, 262144 bytes)
TCP: Hash tables configured (established 65536 bind 32768)
TCP reno registered
Installing knfsd (copyright (C) 1996 okir@monad.swb.de).
io scheduler noop registered
io scheduler anticipatory registered (default)
io scheduler deadline registered
io scheduler cfq registered
Limiting direct PCI/PCI transfers.
Real Time Clock Driver v1.12ac
Non-volatile memory driver v1.2
Serial: 8250/16550 driver $Revision: 1.90 $ 4 ports, IRQ sharing disabled
serial8250: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
serial8250: ttyS1 at I/O 0x2f8 (irq = 3) is a 16550A
Floppy drive(s): fd0 is 1.44M
FDC 0 is a post-1991 82077
scsi0 : Adaptec AIC7XXX EISA/VLB/PCI SCSI HBA DRIVER, Rev 7.0
        <Adaptec 29160 Ultra160 SCSI adapter>
        aic7892: Ultra160 Wide Channel A, SCSI Id=7, 32/253 SCBs

scsi 0:0:0:0: Direct-Access     IBM      IC35L018UWD210-0 S5BS PQ: 0 ANSI: 3
scsi0:A:0:0: Tagged Queuing enabled.  Depth 32
 target0:0:0: Beginning Domain Validation
 target0:0:0: wide asynchronous
 target0:0:0: FAST-80 WIDE SCSI 160.0 MB/s DT (12.5 ns, offset 63)
 target0:0:0: Ending Domain Validation
scsi 0:0:5:0: CD-ROM            TOSHIBA  CD-ROM XM-6401TA 1015 PQ: 0 ANSI: 2
 target0:0:5: Beginning Domain Validation
 target0:0:5: FAST-20 SCSI 20.0 MB/s ST (50 ns, offset 16)
 target0:0:5: Domain Validation skipping write tests
 target0:0:5: Ending Domain Validation
PCI: Enabling device 0000:00:0e.0 (0000 -> 0003)
PCI: No IRQ known for interrupt pin A of device 0000:00:0e.0. Probably buggy MP table.
PCI: Setting latency timer of device 0000:00:0e.0 to 64
IRQ handler type mismatch for IRQ 0
 [<c013c697>] setup_irq+0xb7/0x1b0
 [<c0274770>] ahc_linux_isr+0x0/0x50
 [<c013c833>] request_irq+0xa3/0xc0
 [<c027605c>] ahc_pci_map_int+0x2c/0x50
 [<c027167a>] ahc_pci_config+0x5ea/0xcf0
 [<c0208c00>] pci_bus_write_config_byte+0x30/0x70
 [<c02761dc>] ahc_linux_pci_dev_probe+0xec/0x1e0
 [<c01983b5>] sysfs_dirent_exist+0x45/0x70
 [<c019927b>] sysfs_create_link+0x7b/0x180
 [<c020d643>] pci_match_device+0x13/0xd0
 [<c0202b2f>] kobject_get+0xf/0x20
 [<c020d776>] pci_device_probe+0x56/0x80
 [<c024ea7b>] really_probe+0x3b/0xe0
 [<c024eb5f>] driver_probe_device+0x3f/0xa0
 [<c030c7a3>] klist_next+0x53/0xa0
 [<c024ecba>] __driver_attach+0x7a/0x80
 [<c024e01a>] bus_for_each_dev+0x3a/0x60
 [<c024e986>] driver_attach+0x16/0x20
 [<c024ec40>] __driver_attach+0x0/0x80
 [<c024e39c>] bus_add_driver+0x7c/0x1a0
 [<c020d935>] __pci_register_driver+0x65/0x90
 [<c0405749>] ahc_linux_init+0x79/0x90
 [<c01004b0>] init+0x120/0x330
 [<c0102eca>] ret_from_fork+0x6/0x1c
 [<c0100390>] init+0x0/0x330
 [<c0100390>] init+0x0/0x330
 [<c0103b13>] kernel_thread_helper+0x7/0x14
 =======================
aic7xxx: probe of 0000:00:0e.0 failed with error -16
SCSI device sda: 35843670 512-byte hdwr sectors (18352 MB)
sda: Write Protect is off
sda: Mode Sense: cb 00 00 08
SCSI device sda: drive cache: write through
SCSI device sda: 35843670 512-byte hdwr sectors (18352 MB)
sda: Write Protect is off
sda: Mode Sense: cb 00 00 08
SCSI device sda: drive cache: write through
 sda: sda1 sda2 < sda5 >
sd 0:0:0:0: Attached scsi disk sda
serio: i8042 AUX port at 0x60,0x64 irq 12
serio: i8042 KBD port at 0x60,0x64 irq 1
mice: PS/2 mouse device common for all mice
md: linear personality registered for level -1
md: raid0 personality registered for level 0
md: raid1 personality registered for level 1
md: raid10 personality registered for level 10
input: AT Translated Set 2 keyboard as /class/input/input0
raid6: int32x1     95 MB/s
logips2pp: Detected unknown logitech mouse model 1
raid6: int32x2     98 MB/s
raid6: int32x4    114 MB/s
raid6: int32x8    117 MB/s
input: PS/2 Logitech Mouse as /class/input/input1
raid6: mmxx1      217 MB/s
raid6: mmxx2      323 MB/s
raid6: sse1x1     245 MB/s
raid6: sse1x2     329 MB/s
raid6: using algorithm sse1x2 (329 MB/s)
md: raid6 personality registered for level 6
md: raid5 personality registered for level 5
md: raid4 personality registered for level 4
raid5: automatically using best checksumming function: pIII_sse
   pIII_sse  :  1014.400 MB/sec
raid5: using function: pIII_sse (1014.400 MB/sec)
md: md driver 0.90.3 MAX_MD_DEVS=256, MD_SB_DISKS=27
md: bitmap version 4.39
TCP cubic registered
NET: Registered protocol family 1
NET: Registered protocol family 17
Using IPI Shortcut mode
md: Autodetecting RAID arrays.
md: autorun ...
md: ... autorun DONE.
Time: tsc clocksource has been installed.
EXT3-fs: INFO: recovery required on readonly filesystem.
EXT3-fs: write access will be enabled during recovery.
kjournald starting.  Commit interval 5 seconds
EXT3-fs: recovery complete.
EXT3-fs: mounted filesystem with ordered data mode.
VFS: Mounted root (ext3 filesystem) readonly.
Freeing unused kernel memory: 204k freed
usbcore: registered new interface driver usbfs
usbcore: registered new interface driver hub
usbcore: registered new device driver usb
USB Universal Host Controller Interface driver v3.0
uhci_hcd 0000:00:07.2: UHCI Host Controller
uhci_hcd 0000:00:07.2: new USB bus registered, assigned bus number 1
uhci_hcd 0000:00:07.2: irq 9, io base 0x0000ef80
usb usb1: new device found, idVendor=0000, idProduct=0000
usb usb1: new device strings: Mfr=3, Product=2, SerialNumber=1
usb usb1: Product: UHCI Host Controller
usb usb1: Manufacturer: Linux 2.6.18-jam02 uhci_hcd
usb usb1: SerialNumber: 0000:00:07.2
usb usb1: configuration #1 chosen from 1 choice
hub 1-0:1.0: USB hub found
hub 1-0:1.0: 2 ports detected
sr0: scsi-1 drive
Uniform CD-ROM driver Revision: 3.20
sr 0:0:5:0: Attached scsi CD-ROM sr0
EXT3 FS on sda1, internal journal
libata version 2.00 loaded.
ata_piix 0000:00:07.1: version 2.00ac7
ata1: PATA max UDMA/33 cmd 0x1F0 ctl 0x3F6 bmdma 0xFFA0 irq 14
ata2: PATA max UDMA/33 cmd 0x170 ctl 0x376 bmdma 0xFFA8 irq 15
scsi1 : ata_piix
scsi2 : ata_piix
ATA: abnormal status 0x7F on port 0x177
ATA: abnormal status 0x7F on port 0x177
ata2.00: ATAPI, max MWDMA0, CDB intr
ata2.00: configured for PIO3
scsi 2:0:0:0: Direct-Access     IOMEGA   ZIP 250          51.G PQ: 0 ANSI: 5
sd 2:0:0:0: Attached scsi removable disk sdb
Adding 1148608k swap on /dev/sda5.  Priority:-1 extents:1 across:1148608k
loop: loaded (max 8 devices)
Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2
ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
3c59x: Donald Becker and others. www.scyld.com/network/vortex.html
0000:00:0f.0: 3Com PCI 3c905B Cyclone 100baseTx at e0804f80.
eth0:  setting full-duplex.
nfsd: last server has exited
nfsd: unexporting all filesystems
Linux agpgart interface v0.101 (c) Dave Jones
nvidia: module license 'NVIDIA' taints kernel.
NVRM: loading NVIDIA Linux x86 Kernel Module  1.0-9625  Thu Sep 14 15:33:21 PDT 2006

--
J.A. Magallon <jamagallon()ono!com>     \               Software is like sex:
                                         \         It's better when it's free
Mandriva Linux release 2007.0 (Cooker) for i586
Linux 2.6.18-jam02 (gcc 4.1.1 20060724 (prerelease) (4.1.1-3mdk)) #1 SMP PREEMPT

^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: md deadlock (was Re: 2.6.18-mm2)
  2006-09-29 12:52     ` Neil Brown
@ 2006-09-29 14:03       ` Peter Zijlstra
  2006-10-02 13:47         ` Peter Zijlstra
  0 siblings, 1 reply; 140+ messages in thread
From: Peter Zijlstra @ 2006-09-29 14:03 UTC (permalink / raw)
  To: Neil Brown
  Cc: Michal Piotrowski, Andrew Morton, Ingo Molnar, linux-raid, linux-kernel

On Fri, 2006-09-29 at 22:52 +1000, Neil Brown wrote:
> On Friday September 29, a.p.zijlstra@chello.nl wrote:
> > On Thu, 2006-09-28 at 13:54 +0200, Michal Piotrowski wrote:
> > 
> > Looks like a real deadlock here. It seems to me #2 is the easiest to
> > break.
> 
> I guess it could deadlock if you tried to add /dev/md0 as a component
> of /dev/md0.  I should probably check for that somewhere.
> In other cases the array->member ordering ensures there is no
> deadlock.
> 


	1					2

 open(/dev/md0)

					open(/dev/md0)
					- do_open() -> bdev->bd_mutex
 ioctl(/dev/md0, hotadd) 
 - md_ioctl() -> mddev->reconfig_mutex
 -- hot_add_disk()
 --- bind_rdev_to_array()
 ---- bd_claim_by_disk()
 ----- bd_claim_by_kobject()
					-- md_open()
					--- mddev_lock()
					---- mutex_lock(mddev->reconfig_mutex)
 ------ mutex_lock(bdev->bd_mutex)


looks like an AB-BA deadlock to me
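
The interleaving above can be restated as a lock-order check. This is a toy sketch in the spirit of what lockdep reports, not lockdep's actual algorithm; the enum names and record_nested_acquire() are made up for illustration:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical lock classes, named after the two mutexes in the trace above. */
enum lock_class { BD_MUTEX, RECONFIG_MUTEX, NR_LOCK_CLASSES };

/* order_seen[a][b] is true once some path took b while holding a. */
bool order_seen[NR_LOCK_CLASSES][NR_LOCK_CLASSES];

/*
 * Record that `next` is acquired while `held` is held.
 * Returns true if the opposite order is already on record,
 * i.e. the two paths together form an AB-BA inversion.
 */
bool record_nested_acquire(enum lock_class held, enum lock_class next)
{
	assert(held < NR_LOCK_CLASSES && next < NR_LOCK_CLASSES);
	order_seen[held][next] = true;
	return order_seen[next][held];
}
```

Recording path 1 (reconfig_mutex, then bd_mutex) returns false. Recording path 2 afterwards (bd_mutex, then reconfig_mutex) returns true, which is the inversion drawn in the diagram.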



^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: 2.6.18-mm2
  2006-09-29 13:57 ` 2.6.18-mm2 J.A. Magallón
@ 2006-09-29 14:39   ` Matthew Wilcox
  2006-09-29 17:15     ` 2.6.18-mm2 Alan Cox
  2006-09-29 23:15     ` 2.6.18-mm2 J.A. Magallón
  0 siblings, 2 replies; 140+ messages in thread
From: Matthew Wilcox @ 2006-09-29 14:39 UTC (permalink / raw)
  To: J.A. Magallón; +Cc: Andrew Morton, Linux-Kernel, linux-scsi

On Fri, Sep 29, 2006 at 03:57:38PM +0200, J.A. Magallón wrote:
> aic7xxx oopses on boot:
> 
> PCI: Setting latency timer of device 0000:00:0e.0 to 64
> IRQ handler type mismatch for IRQ 0

Of course, this isn't a scsi problem, it's a peecee hardware problem.
Or maybe a PCI subsystem problem.  But it's clearly not aic7xxx's fault.

> PCI: Cannot allocate resource region 0 of device 0000:00:0e.0

That's not good.  Might be part of the problem.

> PCI: Enabling device 0000:00:0e.0 (0000 -> 0003)
> PCI: No IRQ known for interrupt pin A of device 0000:00:0e.0. Probably buggy MP table.

This is the direct problem.  You've got no irq.


^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: 2.6.18-mm2 - oops in cache_alloc_refill()
  2006-09-29  3:29   ` Andrew Morton
  2006-09-29  3:58     ` Valdis.Kletnieks
@ 2006-09-29 15:19     ` Valdis.Kletnieks
  2006-09-29 19:45       ` Andrew Morton
  2006-09-29 19:47       ` Christoph Lameter
  1 sibling, 2 replies; 140+ messages in thread
From: Valdis.Kletnieks @ 2006-09-29 15:19 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel

On Thu, 28 Sep 2006 20:29:31 PDT, Andrew Morton said:

> bisecting would be good, thanks.  It might be quicker to strip down the .config
> though.

Well, I started with a clean 2.6.18 tree, and did a 'quilt push origin.patch'
to put just the stuff already in Linus's tree on.  Unfortunately, *that*
dies a *different* horrid death after 2 to 5 minutes or so of uptime (and
this one is also a locked-up-hard power-cycle hang, no alt-sysrq).  Of the
3 or 4 times I triggered it, it managed to scribble the oops down into
syslog before totally wedging:

BUG: unable to handle kernel paging request at virtual address 00100104
printing eip:
c014c8b3
*pde = 00000000
Oops: 0002 [#1]
PREEMPT
Modules linked in: xt_SECMARK xt_CONNSECMARK ip6table_mangle iptable_mangle nf_conntrack_ftp xt_pkttype ipt_REJECT nf_conntrack_ipv4 ipt_LOG iptable_filter ip_tables xt_tcpudp nf_conntrack_ipv6 xt_state nf_conntrack ip6t_LOG xt_limit ip6table_filter ip6_tables x_tables thermal processor fan button battery ac nfnetlink i8k floppy nvram orinoco_cs orinoco hermes pcmcia firmware_class ohci1394 intel_agp ieee1394 agpgart yenta_socket rsrc_nonstatic pcmcia_core rtc
CPU:    0
EIP:    0060:[<c014c8b3>]    Not tainted VLI
EFLAGS: 00010083   (2.6.18-test #1)
EIP is at drain_freelist+0x45/0x9b
eax: 00200200   ebx: e5ce0540   ecx: effe10c0   edx: 00100100
esi: effdf4c0   edi: 00000001   ebp: effd2f54   esp: effd2f40
ds: 007b   es: 007b   ss: 0068
Process events/0 (pid: 3, ti=effd2000 task=c56cf000 task.ti=effd2000)
Stack: 00000002 effe18c0 effdf4c0 effe18c0 efe006c0 effd2f64 c014d8ea 00000296
c053df60 effd2f80 c0120f91 c014d864 00000000 efe006d0 efe006c0 efe006c8
effd2fc4 c01214d6 00000001 00000000 00000001 00010000 00000000 00000000
Call Trace:
[<c014d8ea>] cache_reap+0x86/0xc4
[<c0120f91>] run_workqueue+0x8f/0xe0
[<c01214d6>] worker_thread+0xe1/0x113
[<c0123861>] kthread+0xb0/0xdf
[<c0103813>] kernel_thread_helper+0x7/0x10
DWARF2 unwinder stuck at kernel_thread_helper+0x7/0x10

Leftover inexact backtrace:

[<c0103c4d>] show_trace_log_lvl+0x12/0x25
[<c0103cec>] show_stack_log_lvl+0x8c/0x97
[<c0103e18>] show_registers+0x121/0x1b2
[<c0104041>] die+0x198/0x273
[<c034fce1>] do_page_fault+0x3f5/0x4c2
[<c034e819>] error_code+0x39/0x40
[<c014d8ea>] cache_reap+0x86/0xc4
[<c0120f91>] run_workqueue+0x8f/0xe0
[<c01214d6>] worker_thread+0xe1/0x113
[<c0123861>] kthread+0xb0/0xdf
[<c0103813>] kernel_thread_helper+0x7/0x10
=======================
Code: f0 ff ff ff 40 14 8b 5e 14 39 d3 75 19 fb 89 e0 25 00 f0 ff ff ff 48 14 8b 40 08 a8 08 74 59 e8 99 04 20 00 eb 52 8b 13 8b 43 04 <89> 42 04 89 10 c7 03 00 01 10 00 c7 43 04 00 02 20 00 8b 46 18
EIP: [<c014c8b3>] drain_freelist+0x45/0x9b SS:ESP 0068:effd2f40
<6>note: events/0[3] exited with preempt_count 1

Now the question arises - is this the same bug I was seeing under the full -mm2,
and all the other patches just move the manifestation around, or is this fixed
by another -mm2 patch, and my original bug report is something else?

I may have to learn how to use 'git bisect' to shoot this one, it appears.


^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: 2.6.18-mm2
  2006-09-29 14:39   ` 2.6.18-mm2 Matthew Wilcox
@ 2006-09-29 17:15     ` Alan Cox
  2006-09-29 23:50       ` 2.6.18-mm2 Frederik Deweerdt
  2006-09-29 23:15     ` 2.6.18-mm2 J.A. Magallón
  1 sibling, 1 reply; 140+ messages in thread
From: Alan Cox @ 2006-09-29 17:15 UTC (permalink / raw)
  To: Matthew Wilcox; +Cc: J.A. Magallón, Andrew Morton, Linux-Kernel, linux-scsi

On Fri, 2006-09-29 at 08:39 -0600, Matthew Wilcox wrote:
> On Fri, Sep 29, 2006 at 03:57:38PM +0200, J.A. Magallón wrote:
> > aic7xxx oopses on boot:
> > 
> > PCI: Setting latency timer of device 0000:00:0e.0 to 64
> > IRQ handler type mismatch for IRQ 0
> 
> Of course, this isn't a scsi problem, it's a peecee hardware problem.
> Or maybe a PCI subsystem problem.  But it's clearly not aic7xxx's fault.

AIC7xxx finding it has no IRQ configured is valid (annoying, stupid and
valid) so the driver should check before requesting "no IRQ"
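
A minimal sketch of such a check, assuming that an IRQ number of 0 means "none assigned" and that -ENODEV is an acceptable error to return. Both are assumptions for illustration; struct fake_pci_dev and check_irq_before_request() are hypothetical and not part of aic7xxx or the PCI core:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the device state a probe routine would see. */
struct fake_pci_dev {
	unsigned int irq;	/* 0 is treated as "no IRQ assigned" in this sketch */
};

/*
 * Hypothetical probe-time guard: refuse early instead of asking for IRQ 0.
 * Returns 0 if the IRQ looks usable, -ENODEV otherwise.
 */
int check_irq_before_request(const struct fake_pci_dev *dev)
{
	assert(dev != NULL);
	if (dev->irq == 0)
		return -ENODEV;
	return 0;
}
```

In the log above the probe instead got as far as request_irq() and failed with error -16 (-EBUSY) after the "type mismatch" trace; a guard like this would bail out before that point.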


^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: 2.6.18-mm2 - oops in cache_alloc_refill()
  2006-09-29 15:19     ` Valdis.Kletnieks
@ 2006-09-29 19:45       ` Andrew Morton
  2006-09-30  0:01         ` Valdis.Kletnieks
  2006-09-29 19:47       ` Christoph Lameter
  1 sibling, 1 reply; 140+ messages in thread
From: Andrew Morton @ 2006-09-29 19:45 UTC (permalink / raw)
  To: Valdis.Kletnieks; +Cc: linux-kernel

On Fri, 29 Sep 2006 11:19:41 -0400
Valdis.Kletnieks@vt.edu wrote:

> On Thu, 28 Sep 2006 20:29:31 PDT, Andrew Morton said:
> 
> > bisecting would be good, thanks.  It might be quicker to strip down the .config
> > though.
> 
> Well, I started with a clean 2.6.18 tree, and did a 'quilt push origin.patch'
> to put just the stuff already in Linus's tree on.  Unfortunately, *that*
> dies a *different* horrid death after 2 to 5 minutes or so of uptime (and
> this one is also a locked-up-hard power-cycle hang, no alt-sysrq).  Of the
> 3 or 4 times I triggered it, it managed to scribble the oops down into
> syslog before totally wedging:
> 
> BUG: unable to handle kernel paging request at virtual address 00100104
> printing eip:
> c014c8b3
> *pde = 00000000
> Oops: 0002 [#1]
> PREEMPT
> Modules linked in: xt_SECMARK xt_CONNSECMARK ip6table_mangle iptable_mangle nf_conntrack_ftp xt_pkttype ipt_REJECT nf_conntrack_ipv4 ipt_LOG iptable_filter ip_tables xt_tcpudp nf_conntrack_ipv6 xt_state nf_conntrack ip6t_LOG xt_limit ip6table_filter ip6_tables x_tables thermal processor fan button battery ac nfnetlink i8k floppy nvram orinoco_cs orinoco hermes pcmcia firmware_class ohci1394 intel_agp ieee1394 agpgart yenta_socket rsrc_nonstatic pcmcia_core rtc
> CPU:    0
> EIP:    0060:[<c014c8b3>]    Not tainted VLI
> EFLAGS: 00010083   (2.6.18-test #1)
> EIP is at drain_freelist+0x45/0x9b
> eax: 00200200   ebx: e5ce0540   ecx: effe10c0   edx: 00100100
> esi: effdf4c0   edi: 00000001   ebp: effd2f54   esp: effd2f40
> ds: 007b   es: 007b   ss: 0068
> Process events/0 (pid: 3, ti=effd2000 task=c56cf000 task.ti=effd2000)
> Stack: 00000002 effe18c0 effdf4c0 effe18c0 efe006c0 effd2f64 c014d8ea 00000296
> c053df60 effd2f80 c0120f91 c014d864 00000000 efe006d0 efe006c0 efe006c8
> effd2fc4 c01214d6 00000001 00000000 00000001 00010000 00000000 00000000
> Call Trace:
> [<c014d8ea>] cache_reap+0x86/0xc4
> [<c0120f91>] run_workqueue+0x8f/0xe0
> [<c01214d6>] worker_thread+0xe1/0x113
> [<c0123861>] kthread+0xb0/0xdf
> [<c0103813>] kernel_thread_helper+0x7/0x10
> DWARF2 unwinder stuck at kernel_thread_helper+0x7/0x10
> 
> Leftover inexact backtrace:
> 
> [<c0103c4d>] show_trace_log_lvl+0x12/0x25
> [<c0103cec>] show_stack_log_lvl+0x8c/0x97
> [<c0103e18>] show_registers+0x121/0x1b2
> [<c0104041>] die+0x198/0x273
> [<c034fce1>] do_page_fault+0x3f5/0x4c2
> [<c034e819>] error_code+0x39/0x40
> [<c014d8ea>] cache_reap+0x86/0xc4
> [<c0120f91>] run_workqueue+0x8f/0xe0
> [<c01214d6>] worker_thread+0xe1/0x113
> [<c0123861>] kthread+0xb0/0xdf
> [<c0103813>] kernel_thread_helper+0x7/0x10
> =======================
> Code: f0 ff ff ff 40 14 8b 5e 14 39 d3 75 19 fb 89 e0 25 00 f0 ff ff ff 48 14 8b 40 08 a8 08 74 59 e8 99 04 20 00 eb 52 8b 13 8b 43 04 <89> 42 04 89 10 c7 03 00 01 10 00 c7 43 04 00 02 20 00 8b 46 18
> EIP: [<c014c8b3>] drain_freelist+0x45/0x9b SS:ESP 0068:effd2f40
> <6>note: events/0[3] exited with preempt_count 1
> 
> Now the question arises - is this the same bug I was seeing under the full -mm2,
> and all the other patches just move the manifestation around, or is this fixed
> by another -mm2 patch, and my original bug report is something else?

I'd expect it's the same bug - slab data structures have gone bad.
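
The register dump is consistent with a list_del() on an entry that had already been deleted: edx and eax hold the kernel's list poison values, and the faulting address is edx + 4. A small sketch of that arithmetic, assuming the i386 layout of struct list_head (two 4-byte pointers); the helper names are made up for illustration:

```c
#include <assert.h>
#include <stdint.h>

/* Values the kernel's list_del() writes into a deleted entry. */
#define LIST_POISON1 0x00100100u	/* stored in entry->next */
#define LIST_POISON2 0x00200200u	/* stored in entry->prev */

/* Offset of ->prev in struct list_head on i386 (after one 4-byte pointer). */
#define I386_OFF_PREV 4u

/*
 * __list_del(prev, next) starts with "next->prev = prev". Given the value
 * found in entry->next, return the address that store would touch on i386.
 */
uint32_t list_del_first_store_addr(uint32_t entry_next)
{
	assert(entry_next <= UINT32_MAX - I386_OFF_PREV);
	return entry_next + I386_OFF_PREV;
}

/* True if the next/prev pair looks like an entry that was already list_del()'d. */
int looks_already_deleted(uint32_t entry_next, uint32_t entry_prev)
{
	return entry_next == LIST_POISON1 && entry_prev == LIST_POISON2;
}
```

Feeding in edx (00100100) as entry->next gives 00100104, the address in the BUG line, and the edx/eax pair matches the poison pattern.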

> I may have to learn how to use 'git bisect' to shoot this one, it appears.

That's one way.

Again: how come nobody else is hitting this?  Something's different.

What device drivers are being used?


^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: 2.6.18-mm2 - oops in cache_alloc_refill()
  2006-09-29 15:19     ` Valdis.Kletnieks
  2006-09-29 19:45       ` Andrew Morton
@ 2006-09-29 19:47       ` Christoph Lameter
  1 sibling, 0 replies; 140+ messages in thread
From: Christoph Lameter @ 2006-09-29 19:47 UTC (permalink / raw)
  To: Valdis.Kletnieks; +Cc: Andrew Morton, linux-kernel

On Fri, 29 Sep 2006, Valdis.Kletnieks@vt.edu wrote:

> I may have to learn how to use 'git bisect' to shoot this one, it appears.

Or enable SLAB_DEBUG?


^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: 2.6.18-mm2
  2006-09-28 23:08   ` 2.6.18-mm2 Andi Kleen
@ 2006-09-29 20:14     ` Ingo Molnar
  2006-09-29 20:36       ` 2.6.18-mm2 Andi Kleen
  0 siblings, 1 reply; 140+ messages in thread
From: Ingo Molnar @ 2006-09-29 20:14 UTC (permalink / raw)
  To: Andi Kleen; +Cc: Jim Cromie, Andrew Morton, linux-kernel


* Andi Kleen <ak@suse.de> wrote:

> BTW I was planning to make LOCAL_APIC unconditional on i386 too like 
> on x86-64.

please don't - embedded doesn't need it most of the time. At most make it 
default y and dependent on EMBEDDED.

	Ingo

^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: 2.6.18-mm2
  2006-09-29 20:36       ` 2.6.18-mm2 Andi Kleen
@ 2006-09-29 20:32         ` Ingo Molnar
  2006-09-29 20:58           ` 2.6.18-mm2 Andi Kleen
  2006-09-29 21:36         ` 2.6.18-mm2 Dave Jones
  1 sibling, 1 reply; 140+ messages in thread
From: Ingo Molnar @ 2006-09-29 20:32 UTC (permalink / raw)
  To: Andi Kleen; +Cc: Jim Cromie, Andrew Morton, linux-kernel


* Andi Kleen <ak@suse.de> wrote:

> On Friday 29 September 2006 22:14, Ingo Molnar wrote:
> > 
> > * Andi Kleen <ak@suse.de> wrote:
> > 
> > > BTW I was planning to make LOCAL_APIC unconditional on i386 too like 
> > > on x86-64.
> > 
> > please dont - embedded doesnt need it most of the time.
> 
> What do you mean with not need?  Local APIC is an infinitely better 
> interface than PIC and faster. On embedded too this makes a lot of 
> sense.

it's just not present or hardware-disabled.

> And a lot of modern systems don't even work anymore without APIC 
> enabled because Windows uses it and the BIOS haven't been tested 
> without it (e.g. you often find totally broken code paths in the AML 
> for PIC mode)
> 
> The code size also isn't a good argument because the delta
> isn't that big:
> 
>    text    data     bss     dec     hex filename
> 3303894  694980  436420 4435294  43ad5e obj32-up/vmlinux
> 3266532  665732  402372 4334636  42242c obj32-up-noapic/vmlinux
> 
> ~63K.

63K???? You've got to be kidding. That's huge. That's ~10% of the 
minconfig kernel. Even 1K would be bad. We did config hacks for half a K 
win. Please ... don't cripple the i686 kernel.

	Ingo

^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: 2.6.18-mm2
  2006-09-29 20:14     ` 2.6.18-mm2 Ingo Molnar
@ 2006-09-29 20:36       ` Andi Kleen
  2006-09-29 20:32         ` 2.6.18-mm2 Ingo Molnar
  2006-09-29 21:36         ` 2.6.18-mm2 Dave Jones
  0 siblings, 2 replies; 140+ messages in thread
From: Andi Kleen @ 2006-09-29 20:36 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: Jim Cromie, Andrew Morton, linux-kernel

On Friday 29 September 2006 22:14, Ingo Molnar wrote:
> 
> * Andi Kleen <ak@suse.de> wrote:
> 
> > BTW I was planning to make LOCAL_APIC unconditional on i386 too like 
> > on x86-64.
> 
> please dont - embedded doesnt need it most of the time.

What do you mean by "not need"?  Local APIC is an infinitely better
interface than PIC and faster. On embedded too this makes a lot of sense.
And a lot of modern systems don't even work anymore without
APIC enabled because Windows uses it and the BIOSes haven't been
tested without it (e.g. you often find totally broken code paths
in the AML for PIC mode).

The code size also isn't a good argument because the delta
isn't that big:

   text    data     bss     dec     hex filename
3303894  694980  436420 4435294  43ad5e obj32-up/vmlinux
3266532  665732  402372 4334636  42242c obj32-up-noapic/vmlinux

~63K. I don't think such a small difference is worth the maintenance
overhead of the many ifdefs and hairy code paths. If someone really
cared about that memory they could save much more by just optimizing
some dynamic memory allocations instead, which waste much more.
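
For anyone who wants to redo the arithmetic on the two `size` rows above, here is a small sketch that computes the per-column differences. struct size_row and the helpers are made up for illustration, and the sketch does not try to reproduce the rounded figure quoted above:

```c
#include <assert.h>

/* One row of `size` output: text, data and bss in bytes. */
struct size_row {
	long text;
	long data;
	long bss;
};

/* Sum of the three columns, i.e. the "dec" column. */
long size_total(struct size_row r)
{
	return r.text + r.data + r.bss;
}

/* Per-column difference a - b. */
struct size_row size_delta(struct size_row a, struct size_row b)
{
	struct size_row d = { a.text - b.text, a.data - b.data, a.bss - b.bss };

	/* The column deltas must add up to the delta of the totals. */
	assert(size_total(d) == size_total(a) - size_total(b));
	return d;
}
```

For the obj32-up versus obj32-up-noapic rows this gives deltas of 37362 bytes of text, 29248 of data and 34048 of bss.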

The only reasons not to use it are old broken BIOSes or old CPUs
without a local APIC, but those can all be handled at runtime, as
the 64-bit kernel does.

The SUSE kernel has an imho good default heuristic based on the
DMI date, the DMI number of processors and of course trusting the ACPI tables
(don't use it if it is disabled there).

> At most make it  
> default y and dependent on EMBEDDED.

The whole point is to get rid of the many ifdefs and the frequent
compile breakage they cause. This would defeat that.

-Andi

^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: 2.6.18-mm2
  2006-09-29 20:32         ` 2.6.18-mm2 Ingo Molnar
@ 2006-09-29 20:58           ` Andi Kleen
  2006-09-29 21:14             ` [patch] fix !apic build breakage Ingo Molnar
  2006-09-29 21:44             ` 2.6.18-mm2 Alan Cox
  0 siblings, 2 replies; 140+ messages in thread
From: Andi Kleen @ 2006-09-29 20:58 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: Jim Cromie, Andrew Morton, linux-kernel

On Friday 29 September 2006 22:32, Ingo Molnar wrote:
> 
> * Andi Kleen <ak@suse.de> wrote:
> 
> > On Friday 29 September 2006 22:14, Ingo Molnar wrote:
> > > 
> > > * Andi Kleen <ak@suse.de> wrote:
> > > 
> > > > BTW I was planning to make LOCAL_APIC unconditional on i386 too like 
> > > > on x86-64.
> > > 
> > > please dont - embedded doesnt need it most of the time.
> > 
> > What do you mean with not need?  Local APIC is an infinitely better 
> > interface than PIC and faster. On embedded too this makes a lot of 
> > sense.
> 
> it's just not present or hardware-disabled.

The kernel won't use it then. Also, on next year's embedded systems
this will likely change.

> 
> > And a lot of modern systems don't even work anymore without APIC 
> > enabled because Windows uses it and the BIOS haven't been tested 
> > without it (e.g. you often find totally broken code paths in the AML 
> > for PIC mode)
> > 
> > The code size also isn't a good argument because the delta
> > isn't that big:
> > 
> >    text    data     bss     dec     hex filename
> > 3303894  694980  436420 4435294  43ad5e obj32-up/vmlinux
> > 3266532  665732  402372 4334636  42242c obj32-up-noapic/vmlinux
> > 
> > ~63K.
> 
> 63K???? You've got to be kidding. That's huge. That's ~10% of the 
> minconfig kernel. 

A large part of it is the ACPI support. Without that it's smaller:

   text    data     bss     dec     hex filename
2978333  640752  416100 4035185  3d9271 obj32-up-noacpi/vmlinux
2947808  612088  400292 3960188  3c6d7c obj32-up-noacpi-noapic/vmlinux

~30k

You might be able to do without ACPI on your embedded system.
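
The deltas in the two tables can be recomputed section by section. The
following is a minimal sketch in C, not part of any posted patch; the
struct and function names are made up for illustration, and the figures
are copied from the tables in this thread. With these inputs the text
delta alone is 37362 bytes with ACPI and 30525 bytes without it (the
latter matching the ~30k above), while data and bss add further bytes
in both cases.

```c
#include <assert.h>
#include <stdio.h>

/* One row of `size vmlinux` output, in bytes. */
struct image_size {
	long text;
	long data;
	long bss;
};

/* Figures copied from the two tables in this thread. */
static const struct image_size up_apic       = { 3303894, 694980, 436420 };
static const struct image_size up_noapic     = { 3266532, 665732, 402372 };
static const struct image_size noacpi_apic   = { 2978333, 640752, 416100 };
static const struct image_size noacpi_noapic = { 2947808, 612088, 400292 };

/* Sum of all three sections: the "dec" column of size(1). */
static long size_total(struct image_size s)
{
	assert(s.text >= 0 && s.data >= 0 && s.bss >= 0);
	return s.text + s.data + s.bss;
}

/* Per-section growth of image `with` over image `without`. */
static struct image_size size_delta(struct image_size with,
				    struct image_size without)
{
	struct image_size d;

	d.text = with.text - without.text;
	d.data = with.data - without.data;
	d.bss  = with.bss - without.bss;
	return d;
}

/* Print one line of deltas and return the total growth in bytes. */
static long report_delta(const char *label, struct image_size with,
			 struct image_size without)
{
	struct image_size d = size_delta(with, without);
	long total = size_total(with) - size_total(without);

	printf("%-8s text %+ld data %+ld bss %+ld total %+ld\n",
	       label, d.text, d.data, d.bss, total);
	return total;
}
```

Calling report_delta("acpi", up_apic, up_noapic) and
report_delta("noacpi", noacpi_apic, noacpi_noapic) from a small main()
prints both comparisons. Which sections should count as "bloat" (text
only, or text plus data and bss) is left to the reader.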

> Even 1K would be bad. We did config hacks for half a K  
> win. 

<rant>

Sorry, but that's silly. I did some measurements and just tweaking a 
few dynamic allocation pigs saves you much more memory without 
uglifying the code. In fact in most configurations you can find dynamic 
users who need more than the complete kernel text - this means 
even if you got the kernel text down to 0 bytes you wouldn't save as 
much as simple tweaks in the dynamic pig.

I know it's easy to run size vmlinux and complain about bloat there,
but that is really not where the real bloat is. Finding the
real bloat takes more effort, of course.

And maintainability is much more important. Too many CONFIGs
just waste developer time and this one is particularly nasty
because it tends to break all the time.

And if you really want to make vmlinux smaller anyway, you usually
get a much better payoff by concentrating on inline functions than by
uglifying the code with more CONFIGs. A few people did excellent
work on that recently and the kernel actually shrank for most people, not
just for some extreme config. But CONFIGs just cause everybody more work
for usually very little payoff.

</rant>

-andi


^ permalink raw reply	[flat|nested] 140+ messages in thread

* [patch] fix !apic build breakage
  2006-09-29 20:58           ` 2.6.18-mm2 Andi Kleen
@ 2006-09-29 21:14             ` Ingo Molnar
  2006-09-29 21:44               ` Andi Kleen
  2006-09-29 21:44             ` 2.6.18-mm2 Alan Cox
  1 sibling, 1 reply; 140+ messages in thread
From: Ingo Molnar @ 2006-09-29 21:14 UTC (permalink / raw)
  To: Andi Kleen; +Cc: Jim Cromie, Andrew Morton, linux-kernel


* Andi Kleen <ak@suse.de> wrote:

> > 63K???? You've got to be kidding. That's huge. That's ~10% of the 
> > minconfig kernel. 
> 
> A large part of it is the ACPI support. Without that it's smaller:
> 
>    text    data     bss     dec     hex filename
> 2978333  640752  416100 4035185  3d9271 obj32-up-noacpi/vmlinux
> 2947808  612088  400292 3960188  3c6d7c obj32-up-noacpi-noapic/vmlinux
> 
> ~30k

that's still huge! The patch below fixes the panic_on_unrecovered_nmi 
thing ...

> You might be able to do without ACPI on your embedded system.

of course many people do.

> > Even 1K would be bad. We did config hacks for half a K  
> > win. 
> 
> <rant>
> 
> Sorry, but that's silly. I did some measurements and just tweaking a 
> few dynamic allocation pigs saves you much more memory without 
> uglifying the code. In fact in most configurations you can find 
> dynamic users who need more than the complete kernel text - this means 
> even if you got the kernel text down to 0 bytes you wouldn't save as 
> much as simple tweaks in the dynamic pig.

so please do it. The fact that there are /other/ reductions possible 
doesn't mean we can be lax. It's like: "oh, the buddy allocator scales 
better now, so we can slow down the SLAB allocator". No, kernel size is 
like scalability: we need a million small steps.

the panic_on_unrecovered_nmi thing is gross anyway: it has no place in 
kernel.h, it should go into include/[asm-i386|x86_64]/nmi.h and not the 
generic headers. There the prototype can be made #ifdef APIC, hence 
eliminating the #ifdefs from traps.c. (that's all we care about anyway)
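
The idea in the paragraph above (keep the conditional in a per-arch
header so the callers in traps.c need no #ifdef) can be sketched as
follows. This is a userspace mock written for illustration only: it is
not the patch below, nor the nmi-sysctl-cleanup patch, and nmi_tail() is
a hypothetical stand-in for the tail of unknown_nmi_error(). Only
CONFIG_X86_LOCAL_APIC, panic_on_unrecovered_nmi and the two message
strings are taken from the thread.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Stand-in for a per-arch nmi.h. When the local APIC code is not
 * configured, the flag becomes the constant 0, so every test of it
 * is dead code that the compiler can drop.
 */
#ifdef CONFIG_X86_LOCAL_APIC
int panic_on_unrecovered_nmi;	/* in a real header: an extern declaration */
#else
#define panic_on_unrecovered_nmi 0
#endif

/*
 * Mock caller: returns the message the real code would pass to panic()
 * or printk(). Note that no #ifdef is needed at the call site.
 */
static const char *nmi_tail(void)
{
	if (panic_on_unrecovered_nmi)
		return "NMI: Not continuing";
	return "Dazed and confused, but trying to continue";
}
```

Built without -DCONFIG_X86_LOCAL_APIC, nmi_tail() always takes the
second branch; built with it, the outcome depends on the runtime value
of the flag. The trade-off is that the conditional still exists, it has
only moved from the .c file into the header.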

please don't throw away a perfectly fine config option.

	Ingo

---------------->
From: Ingo Molnar <mingo@elte.hu>
Subject: fix !apic build breakage

fix !apic build breakage.

Signed-off-by: Ingo Molnar <mingo@elte.hu>

Index: linux-hrt-mm.q/arch/i386/kernel/traps.c
===================================================================
--- linux-hrt-mm.q.orig/arch/i386/kernel/traps.c
+++ linux-hrt-mm.q/arch/i386/kernel/traps.c
@@ -709,8 +709,10 @@ mem_parity_error(unsigned char reason, s
 		"CPU %d.\n", reason, smp_processor_id());
 	printk(KERN_EMERG "You probably have a hardware problem with your RAM "
 			"chips\n");
+#ifdef CONFIG_X86_LOCAL_APIC
 	if (panic_on_unrecovered_nmi)
                 panic("NMI: Not continuing");
+#endif
 
 	printk(KERN_EMERG "Dazed and confused, but trying to continue\n");
 
@@ -749,8 +751,10 @@ unknown_nmi_error(unsigned char reason, 
 	printk(KERN_EMERG "Uhhuh. NMI received for unknown reason %02x on "
 		"CPU %d.\n", reason, smp_processor_id());
 	printk(KERN_EMERG "Do you have a strange power saving mode enabled?\n");
+#ifdef CONFIG_X86_LOCAL_APIC
 	if (panic_on_unrecovered_nmi)
                 panic("NMI: Not continuing");
+#endif
 
 	printk(KERN_EMERG "Dazed and confused, but trying to continue\n");
 }

^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: 2.6.18-mm2
  2006-09-29 20:36       ` 2.6.18-mm2 Andi Kleen
  2006-09-29 20:32         ` 2.6.18-mm2 Ingo Molnar
@ 2006-09-29 21:36         ` Dave Jones
  2006-09-29 21:46           ` 2.6.18-mm2 Andi Kleen
  1 sibling, 1 reply; 140+ messages in thread
From: Dave Jones @ 2006-09-29 21:36 UTC (permalink / raw)
  To: Andi Kleen; +Cc: Ingo Molnar, Jim Cromie, Andrew Morton, linux-kernel

On Fri, Sep 29, 2006 at 10:36:15PM +0200, Andi Kleen wrote:

 > The only reason to not use it are old broken BIOS or old CPUs 
 > without local APIC, but those can be all handled at runtime like
 > the 64bit kernel does.
 > 
 > The SUSE kernel has a imho good default heuristic based on 
 > DMI date, DMI number of processors and of course trusting the ACPI tables
 > (don't use if disabled there) 
 
Any plans to push those heuristics to mainline too ?

	Dave

^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: [patch] fix !apic build breakage
  2006-09-29 21:44               ` Andi Kleen
@ 2006-09-29 21:41                 ` Ingo Molnar
  0 siblings, 0 replies; 140+ messages in thread
From: Ingo Molnar @ 2006-09-29 21:41 UTC (permalink / raw)
  To: Andi Kleen; +Cc: Jim Cromie, Andrew Morton, linux-kernel


* Andi Kleen <ak@suse.de> wrote:

> > please dont throw away a perfectly fine config option.
> 
> I can't count how many that silly option already got broken by changes 
> in the APIC code. I definitely wouldn't describe it as "perfectly 
> fine", more as "fragile and tends to fall over when you even look at 
> it".

i disagree. I frequently (daily) boot with apic-on and apic-off configs. 
Very rarely does it break. Today it did, took me 30 seconds and 531 
milliseconds to fix. Spent much more time writing these silly emails ...

	Ingo

^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: 2.6.18-mm2
  2006-09-29 20:58           ` 2.6.18-mm2 Andi Kleen
  2006-09-29 21:14             ` [patch] fix !apic build breakage Ingo Molnar
@ 2006-09-29 21:44             ` Alan Cox
  1 sibling, 0 replies; 140+ messages in thread
From: Alan Cox @ 2006-09-29 21:44 UTC (permalink / raw)
  To: Andi Kleen; +Cc: Ingo Molnar, Jim Cromie, Andrew Morton, linux-kernel

Ar Gwe, 2006-09-29 am 22:58 +0200, ysgrifennodd Andi Kleen:
> 2978333  640752  416100 4035185  3d9271 obj32-up-noacpi/vmlinux
> 2947808  612088  400292 3960188  3c6d7c obj32-up-noacpi-noapic/vmlinux
> 
> ~30k

30K is a lot on an embedded x86 box.

> You might be able to do without ACPI on your embedded system.

Most embedded people don't use ACPI for some strange reason related to
the fact it's bloated, hard to get right in the firmware and sucks. That
is one that makes sense to keep.

Alan


^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: [patch] fix !apic build breakage
  2006-09-29 21:14             ` [patch] fix !apic build breakage Ingo Molnar
@ 2006-09-29 21:44               ` Andi Kleen
  2006-09-29 21:41                 ` Ingo Molnar
  0 siblings, 1 reply; 140+ messages in thread
From: Andi Kleen @ 2006-09-29 21:44 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: Jim Cromie, Andrew Morton, linux-kernel


> so please do it. The fact that there are /other/ reductions possible 
> doesnt mean we can be lax. 

Well, with that argument we would put ifdefs nearly everywhere
because most subsystems have some code that you don't need in some
obscure configuration.

Do we do that? No. Clean and maintainable code is more important.

This particular case of APIC CONFIG is just a historical wart.
I eliminated it on 64bit a long time ago (and it undoubtedly
saved me hours of fixing compilation issues and it made the code
cleaner too) and i386 is definitely ripe for that soon too.

> It's like: "oh, the buddy allocator scales  
> better now, so we can slow down the SLAB allocator". No, kernel size is 
> like scalability: we need a million small steps.

Sure you could do a million steps. But for each step you need
to look at the ratio of maintainability impact to usefulness.
IMHO microconfig usually loses badly there.

[As terminology, I call anything that requires ifdefs inside .c or .h
files a microconfig. CONFIGs that only appear in Makefiles are usually
not a problem.]

There are lots of other steps to less bloat that make sense, but please don't
advocate that microCONFIG disease.

> the panic_on_unrecovered_nmi thing is gross anyway: it has no place in 
> kernel.h, it should go into include/[asm-i386|x86_64]/nmi.h and not the 
> generic headers. There the prototype can be made #ifdef APIC, hence 
> eliminating the #ifdefs from traps.c. (that's all we care about anyway)

Yes I fixed it already in a cleaner way (without ugly ifdefs) 

ftp://ftp.firstfloor.org/pub/ak/x86_64/quilt/patches/nmi-sysctl-cleanup
 
> please dont throw away a perfectly fine config option.

I can't count how many times that silly option already got broken by 
changes in the APIC code. I definitely wouldn't describe it as "perfectly fine",
more as "fragile and tends to fall over when you even look at it".

-Andi

^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: 2.6.18-mm2
  2006-09-29 21:36         ` 2.6.18-mm2 Dave Jones
@ 2006-09-29 21:46           ` Andi Kleen
  0 siblings, 0 replies; 140+ messages in thread
From: Andi Kleen @ 2006-09-29 21:46 UTC (permalink / raw)
  To: Dave Jones; +Cc: Ingo Molnar, Jim Cromie, Andrew Morton, linux-kernel

On Friday 29 September 2006 23:36, Dave Jones wrote:
> On Fri, Sep 29, 2006 at 10:36:15PM +0200, Andi Kleen wrote:
> 
>  > The only reason to not use it are old broken BIOS or old CPUs 
>  > without local APIC, but those can be all handled at runtime like
>  > the 64bit kernel does.
>  > 
>  > The SUSE kernel has a imho good default heuristic based on 
>  > DMI date, DMI number of processors and of course trusting the ACPI tables
>  > (don't use if disabled there) 
>  
> Any plans to push those heuristics to mainline too ?

Yes, probably not for .19 though. I wanted to do it together 
with the removal of the APIC CONFIGs and a lot of cleanup in this
area that will come from that.

-Andi


^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: 2.6.18-mm2
  2006-09-29 14:39   ` 2.6.18-mm2 Matthew Wilcox
  2006-09-29 17:15     ` 2.6.18-mm2 Alan Cox
@ 2006-09-29 23:15     ` J.A. Magallón
  1 sibling, 0 replies; 140+ messages in thread
From: J.A. Magallón @ 2006-09-29 23:15 UTC (permalink / raw)
  To: Matthew Wilcox, Linux-Kernel, Andrew Morton, linux-scsi

On Fri, 29 Sep 2006 08:39:49 -0600, Matthew Wilcox <matthew@wil.cx> wrote:

> On Fri, Sep 29, 2006 at 03:57:38PM +0200, J.A. Magallón wrote:
> > aic7xxx oopses on boot:
> > 
> > PCI: Setting latency timer of device 0000:00:0e.0 to 64
> > IRQ handler type mismatch for IRQ 0
> 
> Of course, this isn't a scsi problem, it's a peecee hardware problem.
> Or maybe a PCI subsystem problem.  But it's clearly not aic7xxx's fault.
> 
> > PCI: Cannot allocate resource region 0 of device 0000:00:0e.0
> 
> That's not good.  Might be part of the problem.
> 
> > PCI: Enabling device 0000:00:0e.0 (0000 -> 0003)
> > PCI: No IRQ known for interrupt pin A of device 0000:00:0e.0. Probably buggy MP table.
> 
> This is the direct problem.  You've got no irq.
> 

Thanks...

Now I have just realized this:

00:0d.0 SCSI storage controller: Adaptec AIC-7892A U160/m (rev 02)
00:0e.0 SCSI storage controller: Adaptec AHA-2940U2/U2W / 7890/7891 (rev 01)

leda:~# lsscsi -Hv
[0]    aic7xxx     
  dir: /sys/class/scsi_host/host0
  device dir: /sys/devices/pci0000:00/0000:00:0d.0/host0
[1]    ata_piix    
  dir: /sys/class/scsi_host/host1
  device dir: /sys/devices/pci0000:00/0000:00:07.1/host1
[2]    ata_piix    
  dir: /sys/class/scsi_host/host2
  device dir: /sys/devices/pci0000:00/0000:00:07.1/host2

leda:~# lsscsi
[0:0:0:0]    disk    IBM      IC35L018UWD210-0 S5BS  /dev/sda
[0:0:5:0]    cd/dvd  TOSHIBA  CD-ROM XM-6401TA 1015  /dev/sr0
[2:0:0:0]    disk    IOMEGA   ZIP 250          51.G  /dev/sdb

Device 00:0e.0 is the 2940, which has nothing attached to it.
Who's to blame? The BIOS, because it assigns no interrupts when no devices
are connected to the bus? Or the kernel, which should understand something
like 'this device is disabled'?

I can try to change the cdrom to the 2940 and see what happens...

Thanks, I will try the patch that was posted. It looks like it does
what I described above: disable the device.

--
J.A. Magallon <jamagallon()ono!com>     \               Software is like sex:
                                         \         It's better when it's free
Mandriva Linux release 2007.0 (Cooker) for i586
Linux 2.6.18-jam02 (gcc 4.1.1 20060724 (prerelease) (4.1.1-3mdk)) #1 SMP PREEMPT

^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: 2.6.18-mm2
  2006-09-29 23:50       ` 2.6.18-mm2 Frederik Deweerdt
@ 2006-09-29 23:43         ` Alan Cox
  2006-09-30 14:09           ` [-mm patch] aic7xxx: check irq validity (was Re: 2.6.18-mm2) Frederik Deweerdt
  2006-09-30 15:26         ` 2.6.18-mm2 James Bottomley
  1 sibling, 1 reply; 140+ messages in thread
From: Alan Cox @ 2006-09-29 23:43 UTC (permalink / raw)
  To: Frederik Deweerdt
  Cc: Matthew Wilcox, J.A. Magallón, Andrew Morton, Linux-Kernel, linux-scsi

Ar Gwe, 2006-09-29 am 23:50 +0000, ysgrifennodd Frederik Deweerdt:
> Does this patch makes sense in that case? If yes, I'll put up a patch
> for the remaining cases in the drivers/scsi/aic7xxx/ directory.
> Also, aic7xxx's coding style would put parenthesis around the returned
> value, should I follow it?

Yes - but perhaps with a warning message so users know why ?

As to coding style - kernel style is unbracketed so I wouldn't worry
about either.
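
A minimal sketch of the check with the suggested warning, written as a
userspace mock rather than as a driver patch. struct mock_pci_dev and
mock_probe_check_irq() are hypothetical names standing in for struct
pci_dev and the top of the probe routine, and the message text is only
an example; the real driver would use printk and its own wording.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/* Hypothetical stand-in for the two struct pci_dev fields used here. */
struct mock_pci_dev {
	const char *name;	/* bus address, e.g. "0000:00:0e.0" */
	unsigned int irq;	/* 0 means no IRQ was assigned */
};

/*
 * Mock of the start of a probe routine: refuse the device, and say why,
 * when no IRQ was assigned. Returns 0 to go on probing, -ENODEV to skip.
 */
static int mock_probe_check_irq(const struct mock_pci_dev *pdev)
{
	assert(pdev != NULL);

	if (!pdev->irq) {
		fprintf(stderr, "aic7xxx: %s: no IRQ assigned, skipping device\n",
			pdev->name);
		return -ENODEV;
	}
	return 0;
}
```

The return value is left unbracketed, following the kernel style
mentioned above.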



^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: 2.6.18-mm2
  2006-09-29 17:15     ` 2.6.18-mm2 Alan Cox
@ 2006-09-29 23:50       ` Frederik Deweerdt
  2006-09-29 23:43         ` 2.6.18-mm2 Alan Cox
  2006-09-30 15:26         ` 2.6.18-mm2 James Bottomley
  0 siblings, 2 replies; 140+ messages in thread
From: Frederik Deweerdt @ 2006-09-29 23:50 UTC (permalink / raw)
  To: Alan Cox
  Cc: Matthew Wilcox, J.A. Magallón, Andrew Morton, Linux-Kernel, linux-scsi

On Fri, Sep 29, 2006 at 06:15:42PM +0100, Alan Cox wrote:
> Ar Gwe, 2006-09-29 am 08:39 -0600, ysgrifennodd Matthew Wilcox:
> > On Fri, Sep 29, 2006 at 03:57:38PM +0200, J.A. Magallón wrote:
> > > aic7xxx oopses on boot:
> > > 
> > > PCI: Setting latency timer of device 0000:00:0e.0 to 64
> > > IRQ handler type mismatch for IRQ 0
> > 
> > Of course, this isn't a scsi problem, it's a peecee hardware problem.
> > Or maybe a PCI subsystem problem.  But it's clearly not aic7xxx's fault.
> 
> AIC7xxx finding it has no IRQ configured is valid (annoying, stupid and
> valid) so the driver should check before requesting "no IRQ"
> 
Alan,

Does this patch make sense in that case? If yes, I'll put up a patch
for the remaining cases in the drivers/scsi/aic7xxx/ directory.
Also, aic7xxx's coding style would put parentheses around the returned
value; should I follow it?

Regards,
Frederik

diff --git a/drivers/scsi/aic7xxx/aic7xxx_osm_pci.c b/drivers/scsi/aic7xxx/aic7xxx_osm_pci.c
index ea5687d..38f5ca7 100644
--- a/drivers/scsi/aic7xxx/aic7xxx_osm_pci.c
+++ b/drivers/scsi/aic7xxx/aic7xxx_osm_pci.c
@@ -185,6 +185,9 @@ ahc_linux_pci_dev_probe(struct pci_dev *
 	int		 error;
 	struct device	*dev = &pdev->dev;
 
+	if (!pdev->irq)
+		return -ENODEV;
+
 	pci = pdev;
 	entry = ahc_find_pci_device(pci);
 	if (entry == NULL)

^ permalink raw reply related	[flat|nested] 140+ messages in thread

* Re: 2.6.18-mm2 - oops in cache_alloc_refill()
  2006-09-29 19:45       ` Andrew Morton
@ 2006-09-30  0:01         ` Valdis.Kletnieks
  2006-09-30  1:20           ` Andrew Morton
  0 siblings, 1 reply; 140+ messages in thread
From: Valdis.Kletnieks @ 2006-09-30  0:01 UTC (permalink / raw)
  To: Andrew Morton, Jean Tourrilhes, John W. Linville; +Cc: linux-kernel, netdev

[-- Attachment #1: Type: text/plain, Size: 3192 bytes --]

On Fri, 29 Sep 2006 12:45:58 PDT, Andrew Morton said:

(Adding a bunch of people to the cc: list now that I have a clue what is
going on....)

> I'd expect it's the same bug - slab data structures have gone bad.

*bing*! We have a winner.  A quick check showed the kernel wasn't built with
slab debugging enabled, so I turned on the more obvious options, and got
rewarded with a traceback..

> Again: how come nobody else is hitting this?  Something's different.

gkrellm and wireless (specifically, gkrellm-wifi-0.9.12-3.fc6 from Fedora
Core extras-development).  Kernel is still a 2.6.18 with *only* the
origin.patch from -mm2 applied. Note that the gkrellm plugin hasn't had
a change in the code since 01/03/2004 - hopefully there's been no unintentional
API change on the kernel side since then...

Here's the traceback I got:

slab error in verify_redzone_free(): cache `size-32': memory outside object was overwritten
[<c0103ad2>] dump_trace+0x64/0x1cd
[<c0103c4d>] show_trace_log_lvl+0x12/0x25
[<c010415f>] show_trace+0xd/0x10
[<c01041fc>] dump_stack+0x19/0x1b
[<c014c796>] __slab_error+0x17/0x1c
[<c014cdac>] cache_free_debugcheck+0xaf/0x230
[<c014d43e>] kfree+0x59/0x8c
[<c02dc04a>] ioctl_standard_call+0x1da/0x218
[<c02dc275>] wireless_process_ioctl+0x55/0x312
[<c02d3750>] dev_ioctl+0x45f/0x49a
[<c02c92aa>] sock_ioctl+0x1b3/0x1c6
[<c0160322>] do_ioctl+0x22/0x67
[<c01605a5>] vfs_ioctl+0x23e/0x251
[<c01605ff>] sys_ioctl+0x47/0x64
[<c0102cd3>] syscall_call+0x7/0xb
DWARF2 unwinder stuck at syscall_call+0x7/0xb

Leftover inexact backtrace:

=======================
de57e16c: redzone 1:0x170fc2a5, redzone 2:0x170fc200.

Repeated, over and over, just about once a second.

A quick strace of gkrellm finds these likely ioctl's causing the problem:

% grep ioctl /tmp/foo2 | sort -u | more
ioctl(13, SIOCGIWESSID, 0xbfbcdb9c)     = 0
ioctl(13, SIOCGIWRANGE, 0xbfbcdbdc)     = 0
ioctl(13, SIOCGIWRATE, 0xbfbcdbbc)      = 0

Since I'm using an orinoco-based card, the 2 commits below look like the
most likely candidates.  WE-21 was merged between -mm1 and -mm2, which is
why -mm1 was stable for me. I'll let somebody else argue over what path
these took such that I never tripped over them in an earlier -mm before
they hit Linus's tree...

commit baef186519c69b11cf7e48c26e75feb1e6173baa
Author: John W. Linville <linville@tuxdriver.com>
Date:   Fri Sep 8 16:04:05 2006 -0400

    [PATCH] WE-21 support (core API)

    This is version 21 of the Wireless Extensions. Changelog :
        o finishes migrating the ESSID API (remove the +1)
        o netdev->get_wireless_stats is no more
        o long/short retry

    This is a redacted version of a patch originally submitted by Jean
    Tourrilhes.  I removed most of the additions, in order to minimize
    future support requirements for nl80211 (or other WE successor).

    CC: Jean Tourrilhes <jt@hpl.hp.com>
    Signed-off-by: John W. Linville <linville@tuxdriver.com>

commit eeec9f1a931262d69811135092c8447d6dccc3e6
Author: Jean Tourrilhes <jt@hpl.hp.com>
Date:   Tue Aug 29 18:02:31 2006 -0700

    [PATCH] WE-21 for orinoco

    Signed-off-by: Jean Tourrilhes <jt@hpl.hp.com>
    Signed-off-by: John W. Linville <linville@tuxdriver.com>




[-- Attachment #2: Type: application/pgp-signature, Size: 226 bytes --]

^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: 2.6.18-mm2 - oops in cache_alloc_refill()
  2006-09-30  0:01         ` Valdis.Kletnieks
@ 2006-09-30  1:20           ` Andrew Morton
  2006-09-30  1:33             ` Jean Tourrilhes
                               ` (4 more replies)
  0 siblings, 5 replies; 140+ messages in thread
From: Andrew Morton @ 2006-09-30  1:20 UTC (permalink / raw)
  To: Valdis.Kletnieks; +Cc: Jean Tourrilhes, John W. Linville, linux-kernel, netdev

On Fri, 29 Sep 2006 20:01:54 -0400
Valdis.Kletnieks@vt.edu wrote:

> On Fri, 29 Sep 2006 12:45:58 PDT, Andrew Morton said:
> 
> (Adding a bunch of people to the cc: list now that I have a clue what is
> going on....)
> 
> > I'd expect it's the same bug - slab data structures have gone bad.
> 
> *bing*! We have a winner.  A quick check showed the kernel wasn't built with
> slab debugging enabled, so I turned on the more obvious options, and got
> rewarded with a traceback..

doh.  I'd assumed that CONFIG_DEBUG_SLAB was enabled :(

> > Again: how come nobody else is hitting this?  Something's different.
> 
> gkrellm and wireless (specifically, gkrellm-wifi-0.9.12-3.fc6 from Fedora
> Core extras-development).  Kernel is still a 2.6.18 with *only* the
> origin.patch from -mm2 applied. Note that the gkrellm plugin hasn't had
> a change in the code since 01/03/2004 - hopefully there's been no unintentional
> API change on the kernel side since then...
> 
> Here's the traceback I got:
> 
> slab error in verify_redzone_free(): cache `size-32': memory outside object was overwritten
> [<c0103ad2>] dump_trace+0x64/0x1cd
> [<c0103c4d>] show_trace_log_lvl+0x12/0x25
> [<c010415f>] show_trace+0xd/0x10
> [<c01041fc>] dump_stack+0x19/0x1b
> [<c014c796>] __slab_error+0x17/0x1c
> [<c014cdac>] cache_free_debugcheck+0xaf/0x230
> [<c014d43e>] kfree+0x59/0x8c
> [<c02dc04a>] ioctl_standard_call+0x1da/0x218
> [<c02dc275>] wireless_process_ioctl+0x55/0x312
> [<c02d3750>] dev_ioctl+0x45f/0x49a
> [<c02c92aa>] sock_ioctl+0x1b3/0x1c6
> [<c0160322>] do_ioctl+0x22/0x67
> [<c01605a5>] vfs_ioctl+0x23e/0x251
> [<c01605ff>] sys_ioctl+0x47/0x64
> [<c0102cd3>] syscall_call+0x7/0xb
> DWARF2 unwinder stuck at syscall_call+0x7/0xb
> 
> Leftover inexact backtrace:
> 
> =======================
> de57e16c: redzone 1:0x170fc2a5, redzone 2:0x170fc200.
> 
> Repeated, over and over, just about once a second.
> 
> A quick strace of gkrellm finds these likely ioctl's causing the problem:
> 
> % grep ioctl /tmp/foo2 | sort -u | more
> ioctl(13, SIOCGIWESSID, 0xbfbcdb9c)     = 0
> ioctl(13, SIOCGIWRANGE, 0xbfbcdbdc)     = 0
> ioctl(13, SIOCGIWRATE, 0xbfbcdbbc)      = 0

Yes.  The main thing which those WE-21 patches do is to shorten the size of
various buffers which are used in wireless ioctls.

> Since I'm using an orinoco-based card, these 2 look like the most likely
> candidates.  WE-21 was merged between -mm1 and -mm2, which is why -mm1 was
> stable for me.

The WE-21 patches weren't in Jeff's tree for -mm1 or for -mm2.  They
appeared there transiently then quickly went mainline.  They _might_ have
been in the wireless git tree, although I often drop that due to git woes. 
But that hasn't happened recently....

> I'll let somebody else argue over what path these took that
> I never tripped over them in an earlier -mm before they hit Linus's tree...
> 
> commit baef186519c69b11cf7e48c26e75feb1e6173baa
> Author: John W. Linville <linville@tuxdriver.com>
> Date:   Fri Sep 8 16:04:05 2006 -0400
> 
>     [PATCH] WE-21 support (core API)
> 
>     This is version 21 of the Wireless Extensions. Changelog :
>         o finishes migrating the ESSID API (remove the +1)
>         o netdev->get_wireless_stats is no more
>         o long/short retry
> 
>     This is a redacted version of a patch originally submitted by Jean
>     Tourrilhes.  I removed most of the additions, in order to minimize
>     future support requirements for nl80211 (or other WE successor).
> 
>     CC: Jean Tourrilhes <jt@hpl.hp.com>
>     Signed-off-by: John W. Linville <linville@tuxdriver.com>
> 
> commit eeec9f1a931262d69811135092c8447d6dccc3e6
> Author: Jean Tourrilhes <jt@hpl.hp.com>
> Date:   Tue Aug 29 18:02:31 2006 -0700
> 
>     [PATCH] WE-21 for orinoco
> 
>     Signed-off-by: Jean Tourrilhes <jt@hpl.hp.com>
>     Signed-off-by: John W. Linville <linville@tuxdriver.com>
> 

Try reverting those?

^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: 2.6.18-mm2 - oops in cache_alloc_refill()
  2006-09-30  1:20           ` Andrew Morton
@ 2006-09-30  1:33             ` Jean Tourrilhes
  2006-09-30  3:31               ` Valdis.Kletnieks
  2006-09-30  1:40             ` Jean Tourrilhes
                               ` (3 subsequent siblings)
  4 siblings, 1 reply; 140+ messages in thread
From: Jean Tourrilhes @ 2006-09-30  1:33 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Valdis.Kletnieks, John W. Linville, linux-kernel, netdev

On Fri, Sep 29, 2006 at 06:20:08PM -0700, Andrew Morton wrote:
> On Fri, 29 Sep 2006 20:01:54 -0400
> > 
> > Here's the traceback I got:
> > 
> > slab error in verify_redzone_free(): cache `size-32': memory outside object was overwritten
> > [<c0103ad2>] dump_trace+0x64/0x1cd
> > [<c0103c4d>] show_trace_log_lvl+0x12/0x25
> > [<c010415f>] show_trace+0xd/0x10
> > [<c01041fc>] dump_stack+0x19/0x1b
> > [<c014c796>] __slab_error+0x17/0x1c
> > [<c014cdac>] cache_free_debugcheck+0xaf/0x230
> > [<c014d43e>] kfree+0x59/0x8c
> > [<c02dc04a>] ioctl_standard_call+0x1da/0x218
> > [<c02dc275>] wireless_process_ioctl+0x55/0x312
> > [<c02d3750>] dev_ioctl+0x45f/0x49a
> > [<c02c92aa>] sock_ioctl+0x1b3/0x1c6
> > [<c0160322>] do_ioctl+0x22/0x67
> > [<c01605a5>] vfs_ioctl+0x23e/0x251
> > [<c01605ff>] sys_ioctl+0x47/0x64
> > [<c0102cd3>] syscall_call+0x7/0xb
> > DWARF2 unwinder stuck at syscall_call+0x7/0xb

	Hum... Not clear what's happening. I'll look more into it on
monday.

> > A quick strace of gkrellm finds these likely ioctl's causing the problem:
> > 
> > % grep ioctl /tmp/foo2 | sort -u | more
> > ioctl(13, SIOCGIWESSID, 0xbfbcdb9c)     = 0

	That's most likely the one. I need to check the source code.

> Yes.  The main thing which those WE-21 patches do is to shorten the size of
> various buffers which are used in wireless ioctls.

	Only for ESSID: it reduces the size by one char and removes the
final '\0'. But, kernel-wise, it should not matter.
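
	In the report quoted above, redzone 2 (0x170fc200) differs from
redzone 1 (0x170fc2a5) only in its low byte, which is zero. On a
little-endian machine that is the first byte after the object, which is
what a single '\0' written one byte past a full 32-byte buffer would
produce. The following userspace sketch only illustrates that mechanism;
it assumes (without having checked the source) that some path still
appends a terminator after a maximum-length ESSID, and every mock_* name
is made up.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define RED_ACTIVE	0x170FC2A5u	/* guard value, as printed for redzone 1 */
#define OBJ_SIZE	32		/* payload bytes in a `size-32` object */
#define RED_SIZE	sizeof(uint32_t)

/* Mock slab object: [redzone 1][payload][redzone 2], kept as raw bytes. */
struct mock_obj {
	unsigned char raw[RED_SIZE + OBJ_SIZE + RED_SIZE];
};

/* "Allocate": clear the payload and arm both guard words. */
static void mock_alloc(struct mock_obj *o)
{
	const uint32_t red = RED_ACTIVE;

	memset(o->raw, 0, sizeof(o->raw));
	memcpy(o->raw, &red, RED_SIZE);
	memcpy(o->raw + RED_SIZE + OBJ_SIZE, &red, RED_SIZE);
}

/* Read back the guard word that follows the payload. */
static uint32_t mock_redzone2(const struct mock_obj *o)
{
	uint32_t v;

	memcpy(&v, o->raw + RED_SIZE + OBJ_SIZE, RED_SIZE);
	return v;
}

/* Copy `len` ESSID bytes into the payload, then append a '\0'. */
static void mock_copy_essid(struct mock_obj *o, const char *essid, size_t len)
{
	unsigned char *payload = o->raw + RED_SIZE;

	assert(len <= OBJ_SIZE);	/* the '\0' may still land past the payload */
	memcpy(payload, essid, len);
	payload[len] = '\0';
}

/* "Free"-time check: 1 if the trailing guard word is intact, 0 if not. */
static int mock_redzone_ok(const struct mock_obj *o)
{
	return mock_redzone2(o) == RED_ACTIVE;
}
```

	With an 8-byte ESSID the guard word survives; with a 32-byte
ESSID the terminator lands in redzone 2 and the check fails.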

> > Since I'm using an orinoco-based card, these 2 look like the most likely
> > candidates.  WE-21 was merged between -mm1 and -mm2, which is why -mm1 was
> > stable for me.

	I'm using Orinoco, I've not seen that with iwconfig.
	I'll look into that...

	Jean



^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: 2.6.18-mm2 - oops in cache_alloc_refill()
  2006-09-30  1:20           ` Andrew Morton
  2006-09-30  1:33             ` Jean Tourrilhes
@ 2006-09-30  1:40             ` Jean Tourrilhes
  2006-09-30  3:31               ` Valdis.Kletnieks
  2006-09-30  1:57             ` Makefile for linux modules x z
                               ` (2 subsequent siblings)
  4 siblings, 1 reply; 140+ messages in thread
From: Jean Tourrilhes @ 2006-09-30  1:40 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Valdis.Kletnieks, John W. Linville, linux-kernel, netdev

On Fri, Sep 29, 2006 at 06:20:08PM -0700, Andrew Morton wrote:
> On Fri, 29 Sep 2006 20:01:54 -0400
> > 
> > A quick strace of gkrellm finds these likely ioctl's causing the problem:
> > 
> > % grep ioctl /tmp/foo2 | sort -u | more
> > ioctl(13, SIOCGIWESSID, 0xbfbcdb9c)     = 0
> > ioctl(13, SIOCGIWRANGE, 0xbfbcdbdc)     = 0
> > ioctl(13, SIOCGIWRATE, 0xbfbcdbbc)      = 0

	Excuse me, can you point out which version of gkrellm you use
and where to find it? The only version that is listed on my page does
not use the ESSID ioctl. I want to be sure I'm looking at the same
thing as you are...

	Jean

^ permalink raw reply	[flat|nested] 140+ messages in thread

* Makefile for linux modules
  2006-09-30  1:20           ` Andrew Morton
  2006-09-30  1:33             ` Jean Tourrilhes
  2006-09-30  1:40             ` Jean Tourrilhes
@ 2006-09-30  1:57             ` x z
  2006-09-30  8:55               ` Sam Ravnborg
  2006-09-30  1:59             ` x z
  2006-10-02 17:52             ` 2.6.18-mm2 - oops in cache_alloc_refill() Jean Tourrilhes
  4 siblings, 1 reply; 140+ messages in thread
From: x z @ 2006-09-30  1:57 UTC (permalink / raw)
  To: linux-kernel, netdev

Hi
   I have a makefile to make several driver modules:
obj-$(CONFIG_FUSION_SPI)	+= mptbase.o mptscsih.o mptspi.o
obj-$(CONFIG_FUSION_FC)		+= mptbase.o mptscsih.o mptfc.o
obj-m				+= mptbase.o mptscsih.o mptsas.o
obj-$(CONFIG_FUSION_LAN)	+= mptlan.o
obj-m				+= mptctl.o
obj-m				+= mptcfg.o
obj-m				+= mptstm.o


this will compile and modules can be installed
successfully.

I need to have a comfunc.c file, which contains all the
common functions that these module files can use.
I added the line below just below the mptstm.o line
(I also tried adding it just above mptlan):
mptbase-objs             := comfunc.o

All modules compile successfully. I can install
mptbase.ko. However, when I try to install mptctl.ko
(or other modules), I get errors like: mptctl: Unknown
symbol mpt_register; mpt_deregister. These functions
are implemented in mptbase.c.

How do I fix this problem?

thanks
Robert

__________________________________________________
Do You Yahoo!?
Tired of spam?  Yahoo! Mail has the best spam protection around 
http://mail.yahoo.com 

^ permalink raw reply	[flat|nested] 140+ messages in thread

* Makefile for linux modules
  2006-09-30  1:20           ` Andrew Morton
                               ` (2 preceding siblings ...)
  2006-09-30  1:57             ` Makefile for linux modules x z
@ 2006-09-30  1:59             ` x z
  2006-10-02 17:52             ` 2.6.18-mm2 - oops in cache_alloc_refill() Jean Tourrilhes
  4 siblings, 0 replies; 140+ messages in thread
From: x z @ 2006-09-30  1:59 UTC (permalink / raw)
  To: linux-kernel, netdev

Hi
   I have a makefile to make several driver modules:
obj-$(CONFIG_FUSION_SPI)	+= mptbase.o mptscsih.o mptspi.o
obj-$(CONFIG_FUSION_FC)		+= mptbase.o mptscsih.o mptfc.o
obj-m				+= mptbase.o mptscsih.o mptsas.o
obj-$(CONFIG_FUSION_LAN)	+= mptlan.o
obj-m				+= mptctl.o
obj-m				+= mptcfg.o
obj-m				+= mptstm.o


this will compile all modules and the modules can be
installed successfully.

I need to have a comfunc.c file, which contains all the
common functions that these module files can use.
I added the line below just below the mptstm.o line
(I also tried adding it just above mptlan):
mptbase-objs             := comfunc.o

All modules compile successfully. I can install
mptbase.ko. However, when I try to install mptctl.ko
(or other modules), I get errors like: mptctl: Unknown
symbol mpt_register; mpt_deregister. These functions
are implemented in mptbase.c.

How do I fix this problem?

thanks
Robert



^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: 2.6.18-mm2 - oops in cache_alloc_refill()
  2006-09-30  1:33             ` Jean Tourrilhes
@ 2006-09-30  3:31               ` Valdis.Kletnieks
  2006-09-30  7:50                 ` Valdis.Kletnieks
  0 siblings, 1 reply; 140+ messages in thread
From: Valdis.Kletnieks @ 2006-09-30  3:31 UTC (permalink / raw)
  To: jt; +Cc: Andrew Morton, John W. Linville, linux-kernel, netdev

[-- Attachment #1: Type: text/plain, Size: 654 bytes --]

On Fri, 29 Sep 2006 18:33:48 PDT, Jean Tourrilhes said:
> On Fri, Sep 29, 2006 at 06:20:08PM -0700, Andrew Morton wrote:
> > On Fri, 29 Sep 2006 20:01:54 -0400
> > > 
> > > Here's the traceback I got:
> > > 
> > > slab error in verify_redzone_free(): cache `size-32': memory outside object was overwritten

> 	Hum... Not clear what's happening. I'll look more into it on
> monday.

Fair enough,  I'm going to try reverting the 2 commits and see if things
behave better.

> 	I'm using Orinoco, I've not seen that with iwconfig.
> 	I'll look into that...

I'll bet it's the difference between a modern iwconfig and a 3-year-old
stone-age gkrellm plugin :)

[-- Attachment #2: Type: application/pgp-signature, Size: 226 bytes --]

^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: 2.6.18-mm2 - oops in cache_alloc_refill()
  2006-09-30  1:40             ` Jean Tourrilhes
@ 2006-09-30  3:31               ` Valdis.Kletnieks
  0 siblings, 0 replies; 140+ messages in thread
From: Valdis.Kletnieks @ 2006-09-30  3:31 UTC (permalink / raw)
  To: jt; +Cc: Andrew Morton, John W. Linville, linux-kernel, netdev

[-- Attachment #1: Type: text/plain, Size: 1093 bytes --]

On Fri, 29 Sep 2006 18:40:43 PDT, Jean Tourrilhes said:
> On Fri, Sep 29, 2006 at 06:20:08PM -0700, Andrew Morton wrote:
> > On Fri, 29 Sep 2006 20:01:54 -0400
> > > 
> > > A quick strace of gkrellm finds these likely ioctl's causing the problem:
> > > 
> > > % grep ioctl /tmp/foo2 | sort -u | more
> > > ioctl(13, SIOCGIWESSID, 0xbfbcdb9c)     = 0
> > > ioctl(13, SIOCGIWRANGE, 0xbfbcdbdc)     = 0
> > > ioctl(13, SIOCGIWRATE, 0xbfbcdbbc)      = 0
> 
> 	Excuse me, can you point out which version of gkrellm you use
> and where to find it? The only version that is listed on my page does
> not use the ESSID ioctl. I want to be sure I'm looking at the same
> thing as you are...

All the pieces:
http://download.fedora.redhat.com/pub/fedora/linux/extras/development/SRPMS/

The particular plugin causing the trouble:
http://download.fedora.redhat.com/pub/fedora/linux/extras/development/SRPMS/gkrellm-wifi-0.9.12-3.fc6.src.rpm

If you're not on a box that has rpm2cpio or similar, yell and I'll
break that .src.rpm up for you - there's basically just an 18K .tar.gz and
a 14K patch in there.

[-- Attachment #2: Type: application/pgp-signature, Size: 226 bytes --]

^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: 2.6.18-mm2 - possible recursive locking detected
  2006-09-28  8:46 2.6.18-mm2 Andrew Morton
                   ` (5 preceding siblings ...)
  2006-09-29 13:57 ` 2.6.18-mm2 J.A. Magallón
@ 2006-09-30  7:04 ` Borislav Petkov
  2006-09-30  8:28   ` Andrew Morton
       [not found] ` <20060930133706.GA3291@melchior.yamamaya.is-a-geek.org>
  7 siblings, 1 reply; 140+ messages in thread
From: Borislav Petkov @ 2006-09-30  7:04 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel

On Thu, Sep 28, 2006 at 01:46:23AM -0700, Andrew Morton wrote:
Hi,

    .config is at http://tim.dnsalias.org/2.6.18-mm2.cfg.

Sep 30 08:38:17 zmei kernel: [  285.197902] 
Sep 30 08:38:19 zmei kernel: [  285.197905] =============================================
Sep 30 08:38:19 zmei kernel: [  285.204776] [ INFO: possible recursive locking detected ]
Sep 30 08:38:19 zmei kernel: [  285.210163] 2.6.18-mm2 #1
Sep 30 08:38:19 zmei kernel: [  285.212782] ---------------------------------------------
Sep 30 08:38:19 zmei kernel: [  285.218168] swapper/0 is trying to acquire lock:
Sep 30 08:38:19 zmei kernel: [  285.222777]  (&q->lock){++..}, at: [<c0112f70>] __wake_up+0x15/0x3b
Sep 30 08:38:19 zmei kernel: [  285.229114] 
Sep 30 08:38:19 zmei kernel: [  285.229115] but task is already holding lock:
Sep 30 08:38:19 zmei kernel: [  285.234952]  (&q->lock){++..}, at: [<c0112f70>] __wake_up+0x15/0x3b
Sep 30 08:38:19 zmei kernel: [  285.241290] 
Sep 30 08:38:19 zmei kernel: [  285.241291] other info that might help us debug this:
Sep 30 08:38:19 zmei kernel: [  285.247817] 4 locks held by swapper/0:
Sep 30 08:38:19 zmei kernel: [  285.251561]  #0:  (&tp->rx_lock){-+..}, at: [<c020f350>] rtl8139_poll+0x42/0x405
Sep 30 08:38:19 zmei kernel: [  285.259041]  #1:  (slock-AF_INET/1){-+..}, at: [<c02aa753>] tcp_v4_rcv+0x3fa/0x8eb
Sep 30 08:38:19 zmei kernel: [  285.266700]  #2:  (af_callback_keys + sk->sk_family#3){-.-?}, at: [<c0278d83>] sock_def_readable+0x15/0x69
Sep 30 08:38:19 zmei kernel: [  285.276454]  #3:  (&q->lock){++..}, at: [<c0112f70>] __wake_up+0x15/0x3b
Sep 30 08:38:19 zmei kernel: [  285.283241] 
Sep 30 08:38:19 zmei kernel: [  285.283242] stack backtrace:
Sep 30 08:38:19 zmei kernel: [  285.287688]  [<c0103b65>] dump_trace+0x64/0x1cd
Sep 30 08:38:19 zmei kernel: [  285.292243]  [<c0103ce0>] show_trace_log_lvl+0x12/0x25
Sep 30 08:38:19 zmei kernel: [  285.297405]  [<c010431c>] show_trace+0xd/0x10
Sep 30 08:38:19 zmei kernel: [  285.301780]  [<c01043e4>] dump_stack+0x19/0x1b
Sep 30 08:38:19 zmei kernel: [  285.306250]  [<c013022d>] __lock_acquire+0x750/0x96c
Sep 30 08:38:19 zmei kernel: [  285.311304]  [<c013098c>] lock_acquire+0x4b/0x6b
Sep 30 08:38:19 zmei kernel: [  285.316005]  [<c02ca474>] _spin_lock_irqsave+0x2c/0x3c
Sep 30 08:38:19 zmei kernel: [  285.321233]  [<c0112f70>] __wake_up+0x15/0x3b
Sep 30 08:38:19 zmei kernel: [  285.325638]  [<c0178dd4>] ep_poll_safewake+0x91/0xc3
Sep 30 08:38:19 zmei kernel: [  285.330760]  [<c0179c69>] ep_poll_callback+0x83/0x8e
Sep 30 08:38:19 zmei kernel: [  285.335888]  [<c01122e5>] __wake_up_common+0x2f/0x53
Sep 30 08:38:19 zmei kernel: [  285.340898]  [<c0112f83>] __wake_up+0x28/0x3b
Sep 30 08:38:19 zmei kernel: [  285.345312]  [<c0278da8>] sock_def_readable+0x3a/0x69
Sep 30 08:38:20 zmei kernel: [  285.350778]  [<c02a1892>] tcp_data_queue+0x50f/0xa53
Sep 30 08:38:20 zmei kernel: [  285.356232]  [<c02a34c3>] tcp_rcv_established+0x5aa/0x64f
Sep 30 08:38:20 zmei kernel: [  285.362077]  [<c02a86f6>] tcp_v4_do_rcv+0x26/0x2f2
Sep 30 08:38:20 zmei kernel: [  285.367322]  [<c02aabd4>] tcp_v4_rcv+0x87b/0x8eb
Sep 30 08:38:20 zmei kernel: [  285.372432]  [<c02928e3>] ip_local_deliver+0x19c/0x265
Sep 30 08:38:20 zmei kernel: [  285.378033]  [<c029270b>] ip_rcv+0x453/0x48f
Sep 30 08:38:20 zmei kernel: [  285.382769]  [<c027e51a>] netif_receive_skb+0x1a6/0x239
Sep 30 08:38:20 zmei kernel: [  285.388440]  [<c020f5a5>] rtl8139_poll+0x297/0x405
Sep 30 08:38:20 zmei kernel: [  285.393553]  [<c027ff20>] net_rx_action+0x76/0x109
Sep 30 08:38:20 zmei kernel: [  285.398782]  [<c011dad0>] __do_softirq+0x70/0xf0
Sep 30 08:38:20 zmei kernel: [  285.403459]  [<c011db89>] do_softirq+0x39/0x55
Sep 30 08:38:20 zmei kernel: [  285.407963]  [<c011dcd5>] irq_exit+0x49/0x56
Sep 30 08:38:20 zmei kernel: [  285.412295]  [<c010537f>] do_IRQ+0x8f/0x9c
Sep 30 08:38:20 zmei kernel: [  285.416408]  [<c01035e1>] common_interrupt+0x25/0x2c
Sep 30 08:38:20 zmei kernel: [  285.421393] DWARF2 unwinder stuck at common_interrupt+0x25/0x2c
Sep 30 08:38:20 zmei kernel: [  285.427298] 
Sep 30 08:38:20 zmei kernel: [  285.428786] Leftover inexact backtrace:
Sep 30 08:38:20 zmei kernel: [  285.428787] 
Sep 30 08:38:20 zmei kernel: [  285.434103]  [<c010168b>] cpu_idle+0x72/0x9b
Sep 30 08:38:20 zmei kernel: [  285.438383]  [<c010064e>] rest_init+0x37/0x39
Sep 30 08:38:20 zmei kernel: [  285.442742]  [<c043d73b>] start_kernel+0x356/0x35e
Sep 30 08:38:20 zmei kernel: [  285.447549]  [<00000000>] 0x0
Sep 30 08:38:20 zmei kernel: [  285.450541]  =======================

-- 
Regards/Gruß,
    Boris.

	

	
		

^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: 2.6.18-mm2 - oops in cache_alloc_refill()
  2006-09-30  3:31               ` Valdis.Kletnieks
@ 2006-09-30  7:50                 ` Valdis.Kletnieks
  2006-09-30  8:33                   ` Andrew Morton
  0 siblings, 1 reply; 140+ messages in thread
From: Valdis.Kletnieks @ 2006-09-30  7:50 UTC (permalink / raw)
  To: jt, Andrew Morton, John W. Linville; +Cc: linux-kernel, netdev

[-- Attachment #1: Type: text/plain, Size: 1564 bytes --]

On Fri, 29 Sep 2006 23:31:07 EDT, Valdis.Kletnieks@vt.edu said:
> Fair enough,  I'm going to try reverting the 2 commits and see if things
> behave better.

OK, it's definitely something in those 2 commits - I reverted them and the
resulting 2.6.18-mm2 kernel has been up and stable for 4 hours, even with
the problem gkrellm updating once a second the whole time.

I'm not *seeing* how those changes can cause trouble - unless it's this:

diff --git a/drivers/net/wireless/orinoco.c b/drivers/net/wireless/orinoco.c
index 1840b69..9e19a96 100644
--- a/drivers/net/wireless/orinoco.c
+++ b/drivers/net/wireless/orinoco.c
@@ -3037,7 +3037,7 @@ static int orinoco_ioctl_getessid(struct
        }
 
        erq->flags = 1;
-       erq->length = strlen(essidbuf) + 1;
+       erq->length = strlen(essidbuf);

Does some other code go batshit if length == 0?  My current config doesn't
try to actually ifup the wireless if I also have connectivity via copper (in
order to avoid chewing up a DHCP lease in crowded address space if not needed).

% iwconfig eth5
eth5      IEEE 802.11b  ESSID:""  Nickname:"HERMES I"
          Mode:Managed  Frequency:2.457 GHz  Access Point: Not-Associated   
          Bit Rate:11 Mb/s   Sensitivity:1/3  
          Retry limit:4   RTS thr:off   Fragment thr:off
          Power Management:off
          Link Quality=0/92  Signal level=134/153  Noise level=134/153
          Rx invalid nwid:0  Rx invalid crypt:0  Rx invalid frag:0
          Tx excessive retries:0  Invalid misc:0   Missed beacon:0

That ESSID the source of the trouble?
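
For reference, the effect of that one-line change on the reported length can be sketched in userspace C. This is only an illustration of the arithmetic, not the driver code: essid_len_old(), essid_len_new() and essid_len_selfcheck() are hypothetical names, "mynet" is a made-up ESSID, and the sketch says nothing about which consumer, if any, mishandles a zero length.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helpers mirroring the two sides of the diff above. */

/* Before the change: the reported length counts the trailing NUL. */
size_t essid_len_old(const char *essidbuf)
{
	return strlen(essidbuf) + 1;
}

/* After the change: the reported length excludes the trailing NUL. */
size_t essid_len_new(const char *essidbuf)
{
	return strlen(essidbuf);
}

/* Checks the unassociated case (ESSID:"") and a made-up non-empty ESSID. */
void essid_len_selfcheck(void)
{
	assert(essid_len_old("") == 1);
	assert(essid_len_new("") == 0);	/* the new code can report zero */
	assert(essid_len_old("mynet") == 6);
	assert(essid_len_new("mynet") == 5);
}
```

With an empty ESSID the old code reported a length of 1 and the new code reports 0, which is the only case where the two conventions differ in kind rather than by one byte.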


[-- Attachment #2: Type: application/pgp-signature, Size: 226 bytes --]

^ permalink raw reply related	[flat|nested] 140+ messages in thread

* Re: 2.6.18-mm2 - possible recursive locking detected
  2006-09-30  7:04 ` 2.6.18-mm2 - possible recursive locking detected Borislav Petkov
@ 2006-09-30  8:28   ` Andrew Morton
  2006-09-30 18:19     ` Davide Libenzi
  0 siblings, 1 reply; 140+ messages in thread
From: Andrew Morton @ 2006-09-30  8:28 UTC (permalink / raw)
  To: petkov; +Cc: Borislav Petkov, linux-kernel, Davide Libenzi, Ingo Molnar

On Sat, 30 Sep 2006 09:04:06 +0200
Borislav Petkov <bbpetkov@yahoo.de> wrote:

> On Thu, Sep 28, 2006 at 01:46:23AM -0700, Andrew Morton wrote:
> Hi,
> 
>     .config is at http://tim.dnsalias.org/2.6.18-mm2.cfg.
> 
> Sep 30 08:38:17 zmei kernel: [  285.197902] 
> Sep 30 08:38:19 zmei kernel: [  285.197905] =============================================
> Sep 30 08:38:19 zmei kernel: [  285.204776] [ INFO: possible recursive locking detected ]
> Sep 30 08:38:19 zmei kernel: [  285.210163] 2.6.18-mm2 #1
> Sep 30 08:38:19 zmei kernel: [  285.212782] ---------------------------------------------
> Sep 30 08:38:19 zmei kernel: [  285.218168] swapper/0 is trying to acquire lock:
> Sep 30 08:38:19 zmei kernel: [  285.222777]  (&q->lock){++..}, at: [<c0112f70>] __wake_up+0x15/0x3b
> Sep 30 08:38:19 zmei kernel: [  285.229114] 
> Sep 30 08:38:19 zmei kernel: [  285.229115] but task is already holding lock:
> Sep 30 08:38:19 zmei kernel: [  285.234952]  (&q->lock){++..}, at: [<c0112f70>] __wake_up+0x15/0x3b
> Sep 30 08:38:19 zmei kernel: [  285.241290] 
> Sep 30 08:38:19 zmei kernel: [  285.241291] other info that might help us debug this:
> Sep 30 08:38:19 zmei kernel: [  285.247817] 4 locks held by swapper/0:
> Sep 30 08:38:19 zmei kernel: [  285.251561]  #0:  (&tp->rx_lock){-+..}, at: [<c020f350>] rtl8139_poll+0x42/0x405
> Sep 30 08:38:19 zmei kernel: [  285.259041]  #1:  (slock-AF_INET/1){-+..}, at: [<c02aa753>] tcp_v4_rcv+0x3fa/0x8eb
> Sep 30 08:38:19 zmei kernel: [  285.266700]  #2:  (af_callback_keys + sk->sk_family#3){-.-?}, at: [<c0278d83>] sock_def_readable+0x15/0x69
> Sep 30 08:38:19 zmei kernel: [  285.276454]  #3:  (&q->lock){++..}, at: [<c0112f70>] __wake_up+0x15/0x3b
> Sep 30 08:38:19 zmei kernel: [  285.283241] 
> Sep 30 08:38:19 zmei kernel: [  285.283242] stack backtrace:
> Sep 30 08:38:19 zmei kernel: [  285.287688]  [<c0103b65>] dump_trace+0x64/0x1cd
> Sep 30 08:38:19 zmei kernel: [  285.292243]  [<c0103ce0>] show_trace_log_lvl+0x12/0x25
> Sep 30 08:38:19 zmei kernel: [  285.297405]  [<c010431c>] show_trace+0xd/0x10
> Sep 30 08:38:19 zmei kernel: [  285.301780]  [<c01043e4>] dump_stack+0x19/0x1b
> Sep 30 08:38:19 zmei kernel: [  285.306250]  [<c013022d>] __lock_acquire+0x750/0x96c
> Sep 30 08:38:19 zmei kernel: [  285.311304]  [<c013098c>] lock_acquire+0x4b/0x6b
> Sep 30 08:38:19 zmei kernel: [  285.316005]  [<c02ca474>] _spin_lock_irqsave+0x2c/0x3c
> Sep 30 08:38:19 zmei kernel: [  285.321233]  [<c0112f70>] __wake_up+0x15/0x3b
> Sep 30 08:38:19 zmei kernel: [  285.325638]  [<c0178dd4>] ep_poll_safewake+0x91/0xc3
> Sep 30 08:38:19 zmei kernel: [  285.330760]  [<c0179c69>] ep_poll_callback+0x83/0x8e
> Sep 30 08:38:19 zmei kernel: [  285.335888]  [<c01122e5>] __wake_up_common+0x2f/0x53
> Sep 30 08:38:19 zmei kernel: [  285.340898]  [<c0112f83>] __wake_up+0x28/0x3b
> Sep 30 08:38:19 zmei kernel: [  285.345312]  [<c0278da8>] sock_def_readable+0x3a/0x69
> Sep 30 08:38:20 zmei kernel: [  285.350778]  [<c02a1892>] tcp_data_queue+0x50f/0xa53
> Sep 30 08:38:20 zmei kernel: [  285.356232]  [<c02a34c3>] tcp_rcv_established+0x5aa/0x64f
> Sep 30 08:38:20 zmei kernel: [  285.362077]  [<c02a86f6>] tcp_v4_do_rcv+0x26/0x2f2
> Sep 30 08:38:20 zmei kernel: [  285.367322]  [<c02aabd4>] tcp_v4_rcv+0x87b/0x8eb

<looks at ep_poll_safewake>

<falls out of chair>

We'll need to teach lockdep about that one, but I don't have a clue how.

Is it not vulnerable to ab/ba deadlocking?



^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: 2.6.18-mm2 - oops in cache_alloc_refill()
  2006-09-30  7:50                 ` Valdis.Kletnieks
@ 2006-09-30  8:33                   ` Andrew Morton
  0 siblings, 0 replies; 140+ messages in thread
From: Andrew Morton @ 2006-09-30  8:33 UTC (permalink / raw)
  To: Valdis.Kletnieks; +Cc: jt, John W. Linville, linux-kernel, netdev

On Sat, 30 Sep 2006 03:50:43 -0400
Valdis.Kletnieks@vt.edu wrote:

> On Fri, 29 Sep 2006 23:31:07 EDT, Valdis.Kletnieks@vt.edu said:
> > Fair enough,  I'm going to try reverting the 2 commits and see if things
> > behave better.
> 
> OK, it's definitely something in those 2 commits - I reverted them and the
> resulting 2.6.18-mm2 kernel has been up and stable for 4 hours, even with
> the problem gkrellm updating once a second the whole time.
> 
> I'm not *seeing* how those changes can cause trouble - unless it's this:
> 
> diff --git a/drivers/net/wireless/orinoco.c b/drivers/net/wireless/orinoco.c
> index 1840b69..9e19a96 100644
> --- a/drivers/net/wireless/orinoco.c
> +++ b/drivers/net/wireless/orinoco.c
> @@ -3037,7 +3037,7 @@ static int orinoco_ioctl_getessid(struct
>         }
>  
>         erq->flags = 1;
> -       erq->length = strlen(essidbuf) + 1;
> +       erq->length = strlen(essidbuf);

You know what the next question is ;)

Did reverting just that line fix it?

> Does some other code go batshit if length == 0? My current config doesn't
> try to actually ifup the wireless if I also have connectivity via copper (in
> order to avoid chewing up a DHCP lease in crowded address space if not needed).
> 
> % iwconfig eth5
> eth5      IEEE 802.11b  ESSID:""  Nickname:"HERMES I"
>           Mode:Managed  Frequency:2.457 GHz  Access Point: Not-Associated   
>           Bit Rate:11 Mb/s   Sensitivity:1/3  
>           Retry limit:4   RTS thr:off   Fragment thr:off
>           Power Management:off
>           Link Quality=0/92  Signal level=134/153  Noise level=134/153
>           Rx invalid nwid:0  Rx invalid crypt:0  Rx invalid frag:0
>           Tx excessive retries:0  Invalid misc:0   Missed beacon:0
> 
> That ESSID the source of the trouble?
> 

Might be.  I can't immediately spot a problem with it, but perhaps
length==0 causes the driver to not allocate a buffer and to then write to
the not-allocated buffer.  Not sure..

^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: Makefile for linux modules
  2006-09-30  1:57             ` Makefile for linux modules x z
@ 2006-09-30  8:55               ` Sam Ravnborg
  0 siblings, 0 replies; 140+ messages in thread
From: Sam Ravnborg @ 2006-09-30  8:55 UTC (permalink / raw)
  To: x z; +Cc: linux-kernel, netdev

Hi Robert.

>    I have a makefile to make several driver modules:
> obj-$(CONFIG_FUSION_SPI)	+= mptbase.o mptscsih.o mptspi.o
> obj-$(CONFIG_FUSION_FC)		+= mptbase.o mptscsih.o mptfc.o
> obj-m				+= mptbase.o mptscsih.o mptsas.o
> obj-$(CONFIG_FUSION_LAN)	+= mptlan.o
> obj-m				+= mptctl.o
> obj-m				+= mptcfg.o
> obj-m				+= mptstm.o

The above kbuild file snippet tells us that you are creating
a number of modules:
mptbase.ko mptscsih.ko mptsas.ko mptlan.ko mptctl.ko mptcfg.ko and mptstm.ko
Each of them is built from a single .c file.

> mptbase-objs             := comfunc.o

Now you try to include comfunc.o in every module.
To do so you need to tell kbuild that you are dealing with a module
based on composite .o files.
That would look like:
obj-$(CONFIG_FUSION_PCI) += mptbase-foo.o
mptbase-foo-y := comfunc.o mptbase.o

This will result in a module named mptbase-foo.ko, which is hardly what
you are trying to achieve. Likewise you will have duplicate symbols in the
modules due to comfunc.o being included more than once.

The only sane approach here is to compile comfunc.o as an independent
module and let modutils pull in the comfunc (deserves a more
specific name) module as needed.

So what you need to do is simply:
obj-m += comfunc.o

And accept that this is a module, so all symbols that you need must be properly
exported using EXPORT_SYMBOL*

	Sam

^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: [-mm patch] aic7xxx: check irq validity (was Re: 2.6.18-mm2)
  2006-09-30 14:19             ` Alan Cox
@ 2006-09-30 13:51               ` Willy Tarreau
  0 siblings, 0 replies; 140+ messages in thread
From: Willy Tarreau @ 2006-09-30 13:51 UTC (permalink / raw)
  To: Alan Cox
  Cc: Frederik Deweerdt, Matthew Wilcox, J.A. Magallón, Andrew Morton,
	Linux-Kernel, linux-scsi

On Sat, Sep 30, 2006 at 03:19:14PM +0100, Alan Cox wrote:
> On Sat, 2006-09-30 at 14:09 +0000, Frederik Deweerdt wrote:
> > Signed-off-by: Frederik Deweerdt <frederik.deweerdt@gmail.com>
> 
> Acked-by: Alan Cox <alan@redhat.com>

It seems to me that it's also valid for 2.4. Does anyone have any objection?

Willy


^ permalink raw reply	[flat|nested] 140+ messages in thread

* [-mm patch] aic7xxx: check irq validity (was Re: 2.6.18-mm2)
  2006-09-29 23:43         ` 2.6.18-mm2 Alan Cox
@ 2006-09-30 14:09           ` Frederik Deweerdt
  2006-09-30 14:19             ` Alan Cox
  2006-09-30 23:58             ` Jeff Garzik
  0 siblings, 2 replies; 140+ messages in thread
From: Frederik Deweerdt @ 2006-09-30 14:09 UTC (permalink / raw)
  To: Alan Cox
  Cc: Matthew Wilcox, J.A. Magallón, Andrew Morton, Linux-Kernel, linux-scsi

On Sat, Sep 30, 2006 at 12:43:24AM +0100, Alan Cox wrote:
> On Fri, 2006-09-29 at 23:50 +0000, Frederik Deweerdt wrote:
> > Does this patch make sense in that case? If yes, I'll put up a patch
> > for the remaining cases in the drivers/scsi/aic7xxx/ directory.
> > Also, aic7xxx's coding style would put parentheses around the returned
> > value, should I follow it?
> 
> Yes - but perhaps with a warning message so users know why ?
> 
> As to coding style - kernel style is unbracketed so I wouldn't worry
> about either.
> 
Thanks for the advice.

The following patch checks whether the irq is valid before issuing a
request_irq() for AIC7XXX and AIC79XX. A warning message is displayed to
let the user know what went wrong.

Regards,
Frederik

Signed-off-by: Frederik Deweerdt <frederik.deweerdt@gmail.com>

diff --git a/drivers/scsi/aic7xxx/aic79xx_osm_pci.c b/drivers/scsi/aic7xxx/aic79xx_osm_pci.c
index 2001fe8..8279122 100644
--- a/drivers/scsi/aic7xxx/aic79xx_osm_pci.c
+++ b/drivers/scsi/aic7xxx/aic79xx_osm_pci.c
@@ -132,6 +132,11 @@ ahd_linux_pci_dev_probe(struct pci_dev *
 	char		*name;
 	int		 error;
 
+	if (!pdev->irq) {
+		printk(KERN_WARNING "aic79xx: No irq line set\n");
+		return -ENODEV;
+	}
+
 	pci = pdev;
 	entry = ahd_find_pci_device(pci);
 	if (entry == NULL)
diff --git a/drivers/scsi/aic7xxx/aic7xxx_osm_pci.c b/drivers/scsi/aic7xxx/aic7xxx_osm_pci.c
index ea5687d..ca61cdb 100644
--- a/drivers/scsi/aic7xxx/aic7xxx_osm_pci.c
+++ b/drivers/scsi/aic7xxx/aic7xxx_osm_pci.c
@@ -185,6 +185,11 @@ ahc_linux_pci_dev_probe(struct pci_dev *
 	int		 error;
 	struct device	*dev = &pdev->dev;
 
+	if (!pdev->irq) {
+		printk(KERN_WARNING "aic7xxx: No irq line set\n");
+		return -ENODEV;
+	}
+
 	pci = pdev;
 	entry = ahc_find_pci_device(pci);
 	if (entry == NULL)

^ permalink raw reply related	[flat|nested] 140+ messages in thread

* Re: [-mm patch] aic7xxx: check irq validity (was Re: 2.6.18-mm2)
  2006-09-30 14:09           ` [-mm patch] aic7xxx: check irq validity (was Re: 2.6.18-mm2) Frederik Deweerdt
@ 2006-09-30 14:19             ` Alan Cox
  2006-09-30 13:51               ` Willy Tarreau
  2006-09-30 23:58             ` Jeff Garzik
  1 sibling, 1 reply; 140+ messages in thread
From: Alan Cox @ 2006-09-30 14:19 UTC (permalink / raw)
  To: Frederik Deweerdt
  Cc: Matthew Wilcox, J.A. Magallón, Andrew Morton, Linux-Kernel, linux-scsi

On Sat, 2006-09-30 at 14:09 +0000, Frederik Deweerdt wrote:
> Signed-off-by: Frederik Deweerdt <frederik.deweerdt@gmail.com>

Acked-by: Alan Cox <alan@redhat.com>


^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: 2.6.18-mm2
  2006-09-29 23:50       ` 2.6.18-mm2 Frederik Deweerdt
  2006-09-29 23:43         ` 2.6.18-mm2 Alan Cox
@ 2006-09-30 15:26         ` James Bottomley
  2006-09-30 16:21           ` 2.6.18-mm2 Matthew Wilcox
  2006-09-30 20:54           ` 2.6.18-mm2 Alan Cox
  1 sibling, 2 replies; 140+ messages in thread
From: James Bottomley @ 2006-09-30 15:26 UTC (permalink / raw)
  To: Frederik Deweerdt
  Cc: Alan Cox, Matthew Wilcox, J.A. Magallón, Andrew Morton,
	Linux-Kernel, linux-scsi

On Fri, 2006-09-29 at 23:50 +0000, Frederik Deweerdt wrote:
> +       if (!pdev->irq)
> +               return -ENODEV;
> +

Don't I remember that 0 is a valid IRQ on some platforms?

i.e. shouldn't this be

if (pdev->irq == NO_IRQ)
	return -ENODEV;

?

I think this won't quite work because only the platforms that actually
have a valid zero irq define it, but there must be something else that
works.

James
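
The two conventions being compared here can be sketched in userspace C. Everything in this sketch is an assumption made for illustration: struct fake_pci_dev stands in for the irq field of struct pci_dev, and FAKE_NO_IRQ is an invented sentinel value, not a definition taken from any platform header.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for the irq field of struct pci_dev. */
struct fake_pci_dev {
	unsigned int irq;
};

/* Assumed sentinel, for illustration only. */
#define FAKE_NO_IRQ ((unsigned int)-1)

/* Convention used by the patch: irq 0 means "no IRQ assigned". */
int probe_zero_is_invalid(const struct fake_pci_dev *pdev)
{
	if (!pdev->irq)
		return -ENODEV;
	return 0;
}

/* Alternative convention: a dedicated sentinel, so irq 0 stays usable. */
int probe_sentinel_is_invalid(const struct fake_pci_dev *pdev)
{
	if (pdev->irq == FAKE_NO_IRQ)
		return -ENODEV;
	return 0;
}

/* Shows where the two conventions disagree: a device reporting irq 0. */
void irq_convention_selfcheck(void)
{
	struct fake_pci_dev zero = { .irq = 0 };
	struct fake_pci_dev eleven = { .irq = 11 };

	assert(probe_zero_is_invalid(&zero) == -ENODEV);
	assert(probe_sentinel_is_invalid(&zero) == 0);
	assert(probe_zero_is_invalid(&eleven) == 0);
	assert(probe_sentinel_is_invalid(&eleven) == 0);
}
```

The only input on which the two checks disagree is irq 0 itself, which is why the question reduces to whether any supported platform treats 0 as a real interrupt line.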



^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: 2.6.18-mm2
  2006-09-30 15:26         ` 2.6.18-mm2 James Bottomley
@ 2006-09-30 16:21           ` Matthew Wilcox
  2006-09-30 17:20             ` 2.6.18-mm2 Mark Rustad
  2006-09-30 20:54           ` 2.6.18-mm2 Alan Cox
  1 sibling, 1 reply; 140+ messages in thread
From: Matthew Wilcox @ 2006-09-30 16:21 UTC (permalink / raw)
  To: James Bottomley
  Cc: Frederik Deweerdt, Alan Cox, J.A. Magallón, Andrew Morton,
	Linux-Kernel, linux-scsi

On Sat, Sep 30, 2006 at 10:26:22AM -0500, James Bottomley wrote:
> On Fri, 2006-09-29 at 23:50 +0000, Frederik Deweerdt wrote:
> > +       if (!pdev->irq)
> > +               return -ENODEV;
> > +
> 
> Don't I remember that 0 is a valid IRQ on some platforms?
> 
> i.e. shouldn't this be
> 
> if (pdev->irq == NO_IRQ)
> 	return -ENODEV;
> 
> ?
> 
> I think this won't quite work because only the platforms that actually
> have a valid zero irq define it, but there must be something else that
> works.

Linus threw a hissy fit and declared that platforms which use 0 as a
valid IRQ are broken and wrong.  Despite PCI using 255 to mean no IRQ
and 0 as a valid IRQ ;-)

^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: 2.6.18-mm2
  2006-09-30 16:21           ` 2.6.18-mm2 Matthew Wilcox
@ 2006-09-30 17:20             ` Mark Rustad
  0 siblings, 0 replies; 140+ messages in thread
From: Mark Rustad @ 2006-09-30 17:20 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: James Bottomley, Frederik Deweerdt, Alan Cox, J.A. Magallón,
	Andrew Morton, Linux-Kernel, linux-scsi

On Sep 30, 2006, at 11:21 AM, Matthew Wilcox wrote:

> On Sat, Sep 30, 2006 at 10:26:22AM -0500, James Bottomley wrote:
>> On Fri, 2006-09-29 at 23:50 +0000, Frederik Deweerdt wrote:
>>> +       if (!pdev->irq)
>>> +               return -ENODEV;
>>> +
>>
>> Don't I remember that 0 is a valid IRQ on some platforms?
>>
>> i.e. shouldn't this be
>>
>> if (pdev->irq == NO_IRQ)
>> 	return -ENODEV;
>>
>> ?
>>
>> I think this won't quite work because only the platforms that  
>> actually
>> have a valid zero irq define it, but there must be something else  
>> that
>> works.
>
> Linus threw a hissy fit and declared that platforms which use 0 as a
> valid IRQ are broken and wrong.  Despite PCI using 255 to mean no IRQ
> and 0 as a valid IRQ ;-)

Having gone down the path of creating a platform that had IRQ 0 as a  
valid interrupt some time ago with the 2.4 kernel, all I can say is  
that while it can be made to work, things go much more smoothly if  
you don't use IRQ 0. Every driver added to the environment pretty  
much had to be tweaked. Of course that mainly meant adding to the  
#ifdef's that were already there for other architectures that had  
also made that mistake.

The biggest pain is admitting the mistake (of using IRQ 0) and  
changing it. Making a clear statement on the issue will help prevent  
others from making the same mistake again. I know that I wish that I  
had known not to do that from the beginning. Having been there and  
done that, I don't need any convincing.

-- 
Mark Rustad, MRustad@mac.com


^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: 2.6.18-mm2 - possible recursive locking detected
  2006-09-30  8:28   ` Andrew Morton
@ 2006-09-30 18:19     ` Davide Libenzi
  0 siblings, 0 replies; 140+ messages in thread
From: Davide Libenzi @ 2006-09-30 18:19 UTC (permalink / raw)
  To: Andrew Morton
  Cc: petkov, Borislav Petkov, Linux Kernel Mailing List, Ingo Molnar

On Sat, 30 Sep 2006, Andrew Morton wrote:

> On Sat, 30 Sep 2006 09:04:06 +0200
> Borislav Petkov <bbpetkov@yahoo.de> wrote:
> 
> > On Thu, Sep 28, 2006 at 01:46:23AM -0700, Andrew Morton wrote:
> > Hi,
> > 
> >     .config is at http://tim.dnsalias.org/2.6.18-mm2.cfg.
> > 
> > Sep 30 08:38:17 zmei kernel: [  285.197902] 
> > Sep 30 08:38:19 zmei kernel: [  285.197905] =============================================
> > Sep 30 08:38:19 zmei kernel: [  285.204776] [ INFO: possible recursive locking detected ]
> > Sep 30 08:38:19 zmei kernel: [  285.210163] 2.6.18-mm2 #1
> > Sep 30 08:38:19 zmei kernel: [  285.212782] ---------------------------------------------
> > Sep 30 08:38:19 zmei kernel: [  285.218168] swapper/0 is trying to acquire lock:
> > Sep 30 08:38:19 zmei kernel: [  285.222777]  (&q->lock){++..}, at: [<c0112f70>] __wake_up+0x15/0x3b
> > Sep 30 08:38:19 zmei kernel: [  285.229114] 
> > Sep 30 08:38:19 zmei kernel: [  285.229115] but task is already holding lock:
> > Sep 30 08:38:19 zmei kernel: [  285.234952]  (&q->lock){++..}, at: [<c0112f70>] __wake_up+0x15/0x3b
> > Sep 30 08:38:19 zmei kernel: [  285.241290] 
> > Sep 30 08:38:19 zmei kernel: [  285.241291] other info that might help us debug this:
> > Sep 30 08:38:19 zmei kernel: [  285.247817] 4 locks held by swapper/0:
> > Sep 30 08:38:19 zmei kernel: [  285.251561]  #0:  (&tp->rx_lock){-+..}, at: [<c020f350>] rtl8139_poll+0x42/0x405
> > Sep 30 08:38:19 zmei kernel: [  285.259041]  #1:  (slock-AF_INET/1){-+..}, at: [<c02aa753>] tcp_v4_rcv+0x3fa/0x8eb
> > Sep 30 08:38:19 zmei kernel: [  285.266700]  #2:  (af_callback_keys + sk->sk_family#3){-.-?}, at: [<c0278d83>] sock_def_readable+0x15/0x69
> > Sep 30 08:38:19 zmei kernel: [  285.276454]  #3:  (&q->lock){++..}, at: [<c0112f70>] __wake_up+0x15/0x3b
> > Sep 30 08:38:19 zmei kernel: [  285.283241] 
> > Sep 30 08:38:19 zmei kernel: [  285.283242] stack backtrace:
> > Sep 30 08:38:19 zmei kernel: [  285.287688]  [<c0103b65>] dump_trace+0x64/0x1cd
> > Sep 30 08:38:19 zmei kernel: [  285.292243]  [<c0103ce0>] show_trace_log_lvl+0x12/0x25
> > Sep 30 08:38:19 zmei kernel: [  285.297405]  [<c010431c>] show_trace+0xd/0x10
> > Sep 30 08:38:19 zmei kernel: [  285.301780]  [<c01043e4>] dump_stack+0x19/0x1b
> > Sep 30 08:38:19 zmei kernel: [  285.306250]  [<c013022d>] __lock_acquire+0x750/0x96c
> > Sep 30 08:38:19 zmei kernel: [  285.311304]  [<c013098c>] lock_acquire+0x4b/0x6b
> > Sep 30 08:38:19 zmei kernel: [  285.316005]  [<c02ca474>] _spin_lock_irqsave+0x2c/0x3c
> > Sep 30 08:38:19 zmei kernel: [  285.321233]  [<c0112f70>] __wake_up+0x15/0x3b
> > Sep 30 08:38:19 zmei kernel: [  285.325638]  [<c0178dd4>] ep_poll_safewake+0x91/0xc3
> > Sep 30 08:38:19 zmei kernel: [  285.330760]  [<c0179c69>] ep_poll_callback+0x83/0x8e
> > Sep 30 08:38:19 zmei kernel: [  285.335888]  [<c01122e5>] __wake_up_common+0x2f/0x53
> > Sep 30 08:38:19 zmei kernel: [  285.340898]  [<c0112f83>] __wake_up+0x28/0x3b
> > Sep 30 08:38:19 zmei kernel: [  285.345312]  [<c0278da8>] sock_def_readable+0x3a/0x69
> > Sep 30 08:38:20 zmei kernel: [  285.350778]  [<c02a1892>] tcp_data_queue+0x50f/0xa53
> > Sep 30 08:38:20 zmei kernel: [  285.356232]  [<c02a34c3>] tcp_rcv_established+0x5aa/0x64f
> > Sep 30 08:38:20 zmei kernel: [  285.362077]  [<c02a86f6>] tcp_v4_do_rcv+0x26/0x2f2
> > Sep 30 08:38:20 zmei kernel: [  285.367322]  [<c02aabd4>] tcp_v4_rcv+0x87b/0x8eb
> 
> <looks at ep_poll_safewake>
> 
> <falls out of chair>

Haha :)
I hope the comment describes the nastiness of the potential problems 
that can happen when adding epoll descriptors inside epoll descriptors 
(non-trivial loops, looong chains, etc).



> We'll need to teach lockdep about that one, but I don't have a clue how.
> 
> Is it not vulnerable to ab/ba deadlocking?

The two locks are different. One looks like the netcard ->poll one, and one is 
the epoll file ->poll one. I don't know lockdep, so I wouldn't know how to 
make it quiet in this case (w/out losing the ability to detect other 
legitimate wait_queue_head_t-based x-locks).
Ingo?




- Davide
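
For readers unfamiliar with the nesting being discussed, here is a minimal userspace sketch of an epoll descriptor placed inside another epoll descriptor. It is an illustration under stated assumptions, not the kernel code path: nested_epoll_wakeup() is a hypothetical name, and the sketch only shows that a wakeup on a pipe travels through the inner epoll descriptor to the outer one, which is the kind of chained wakeup the trace above passes through.

```c
#include <assert.h>
#include <sys/epoll.h>
#include <unistd.h>

/*
 * Returns 1 if readiness on a pipe propagates through an inner epoll
 * descriptor to an outer one, 0 if it does not, -1 on setup failure.
 */
int nested_epoll_wakeup(void)
{
	int pipefd[2], inner = -1, outer = -1, n, ret = -1;
	struct epoll_event ev = { 0 }, out = { 0 };

	if (pipe(pipefd) < 0)
		return -1;

	inner = epoll_create(1);
	outer = epoll_create(1);
	if (inner < 0 || outer < 0)
		goto done;

	/* The inner epoll set watches the read end of the pipe. */
	ev.events = EPOLLIN;
	ev.data.fd = pipefd[0];
	if (epoll_ctl(inner, EPOLL_CTL_ADD, pipefd[0], &ev) < 0)
		goto done;

	/* The outer epoll set watches the inner epoll descriptor. */
	ev.events = EPOLLIN;
	ev.data.fd = inner;
	if (epoll_ctl(outer, EPOLL_CTL_ADD, inner, &ev) < 0)
		goto done;

	/* Make the pipe readable; this should wake inner, then outer. */
	if (write(pipefd[1], "x", 1) != 1)
		goto done;

	n = epoll_wait(outer, &out, 1, 1000);
	ret = (n == 1 && out.data.fd == inner && (out.events & EPOLLIN)) ? 1 : 0;

done:
	if (inner >= 0)
		close(inner);
	if (outer >= 0)
		close(outer);
	close(pipefd[0]);
	close(pipefd[1]);
	return ret;
}
```

Calling nested_epoll_wakeup() on a Linux system exercises two wait queues in sequence, one per epoll descriptor, which matches the description above of two different locks being taken.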



^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: 2.6.18-mm2
       [not found] ` <20060930133706.GA3291@melchior.yamamaya.is-a-geek.org>
@ 2006-09-30 19:53   ` Andrew Morton
  0 siblings, 0 replies; 140+ messages in thread
From: Andrew Morton @ 2006-09-30 19:53 UTC (permalink / raw)
  To: Tobias Diedrich; +Cc: linux-kernel, netdev

On Sat, 30 Sep 2006 15:37:06 +0200
Tobias Diedrich <ranma@tdiedrich.de> wrote:

> Andrew Morton wrote:
> 
> > - More updates to the MSI code.  If your machine has Message Signalled
> >   Interrupts, please enable it and give it a try.
> 
> I'm happy to report, that with 2.6.18-mm2 suspend to disk works for
> me without additional patches, tested both with MSI interrupts
> disabled and enabled (forcedeth driver).

Thanks.

Which kernel version(s) didn't work?  -mm1?  Mainline?

^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: 2.6.18-mm2
  2006-09-30 15:26         ` 2.6.18-mm2 James Bottomley
  2006-09-30 16:21           ` 2.6.18-mm2 Matthew Wilcox
@ 2006-09-30 20:54           ` Alan Cox
  1 sibling, 0 replies; 140+ messages in thread
From: Alan Cox @ 2006-09-30 20:54 UTC (permalink / raw)
  To: James Bottomley
  Cc: Frederik Deweerdt, Matthew Wilcox, J.A. Magallón, Andrew Morton,
	Linux-Kernel, linux-scsi

Ar Sad, 2006-09-30 am 10:26 -0500, ysgrifennodd James Bottomley:
> On Fri, 2006-09-29 at 23:50 +0000, Frederik Deweerdt wrote:
> > +       if (!pdev->irq)
> > +               return -ENODEV;
> > +
> 
> Don't I remember that 0 is a valid IRQ on some platforms?
> 
> i.e. shouldn't this be
> 
> if (pdev->irq == NO_IRQ)
> 	return -ENODEV;

NO_IRQ is gone. Everyone uses zero and Linus has declared that is how it
shall be.


Alan


^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: [-mm patch] aic7xxx: check irq validity (was Re: 2.6.18-mm2)
  2006-09-30 14:09           ` [-mm patch] aic7xxx: check irq validity (was Re: 2.6.18-mm2) Frederik Deweerdt
  2006-09-30 14:19             ` Alan Cox
@ 2006-09-30 23:58             ` Jeff Garzik
  2006-10-01 14:28               ` Matthew Wilcox
  2006-10-01 21:31               ` [-mm patch] aic7xxx: check irq validity (was Re: 2.6.18-mm2) Frederik Deweerdt
  1 sibling, 2 replies; 140+ messages in thread
From: Jeff Garzik @ 2006-09-30 23:58 UTC (permalink / raw)
  To: Frederik Deweerdt, Andrew Morton
  Cc: Alan Cox, Matthew Wilcox, J.A. Magallón, Linux-Kernel, linux-scsi

[-- Attachment #1: Type: text/plain, Size: 1121 bytes --]

Frederik Deweerdt wrote:
> On Sat, Sep 30, 2006 at 12:43:24AM +0100, Alan Cox wrote:
>> Ar Gwe, 2006-09-29 am 23:50 +0000, ysgrifennodd Frederik Deweerdt:
>>> Does this patch makes sense in that case? If yes, I'll put up a patch
>>> for the remaining cases in the drivers/scsi/aic7xxx/ directory.
>>> Also, aic7xxx's coding style would put parenthesis around the returned
>>> value, should I follow it?
>> Yes - but perhaps with a warning message so users know why ?
>>
>> As to coding style - kernel style is unbracketed so I wouldnt worry
>> about either.
>>
> Thanks for the advices. 
> 
> The following patch checks whenever the irq is valid before issuing a
> request_irq() for AIC7XXX and AIC79XX. An error message is displayed to
> let the user know what went wrong.
> 
> Regards,
> Frederik
> 
> Signed-off-by: Frederik Deweerdt <frederik.deweerdt@gmail.com>

Actually, rather than adding this check to every driver, I would rather
do something like the attached patch:  create a pci_request_irq(), and
pass a struct pci_dev to it.  Then the driver author doesn't have to
worry about such details.

	Jeff



[-- Attachment #2: patch --]
[-- Type: text/plain, Size: 2025 bytes --]

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index a544997..9743471 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -809,6 +809,40 @@ err_out:
 	return -EBUSY;
 }
 
+#ifndef ARCH_VALIDATE_PCI_IRQ
+int pci_valid_irq(struct pci_dev *pdev)
+{
+	if (pdev->irq == 0)
+		return -EINVAL;
+	
+	return 0;
+}
+EXPORT_SYMBOL(pci_valid_irq);
+#endif /* ARCH_VALIDATE_PCI_IRQ */
+
+int pci_request_irq(struct pci_dev *pdev,
+		    irqreturn_t (*handler)(int, void *, struct pt_regs *),
+		    unsigned long flags, const char *name, void *userdata)
+{
+	int rc;
+
+	rc = pci_valid_irq(pdev);
+	if (rc) {
+		dev_printk(KERN_ERR, &pdev->dev, "invalid irq\n");
+		return rc;
+	}
+
+	return request_irq(pdev->irq, handler, flags | IRQF_SHARED,
+			   name, userdata);
+}
+EXPORT_SYMBOL(pci_request_irq);
+
+void pci_release_irq(struct pci_dev *pdev, void *userdata)
+{
+	free_irq(pdev->irq, userdata);
+}
+EXPORT_SYMBOL(pci_release_irq);
+
 /**
  * pci_set_master - enables bus-mastering for device dev
  * @dev: the PCI device to enable
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 5c3a417..5e254fc 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -52,6 +52,7 @@ #include <linux/list.h>
 #include <linux/compiler.h>
 #include <linux/errno.h>
 #include <linux/device.h>
+#include <linux/interrupt.h>
 
 /* File state for mmap()s on /proc/bus/pci/X/Y */
 enum pci_mmap_state {
@@ -537,6 +538,12 @@ void pci_release_regions(struct pci_dev 
 int __must_check pci_request_region(struct pci_dev *, int, const char *);
 void pci_release_region(struct pci_dev *, int);
 
+int __must_check pci_valid_irq(struct pci_dev *pdev);
+int __must_check pci_request_irq(struct pci_dev *pdev,
+		    irqreturn_t (*handler)(int, void *, struct pt_regs *),
+		    unsigned long flags, const char *name, void *userdata);
+void pci_release_irq(struct pci_dev *pdev, void *userdata);
+
 /* drivers/pci/bus.c */
 int __must_check pci_bus_alloc_resource(struct pci_bus *bus,
 			struct resource *res, resource_size_t size,

^ permalink raw reply related	[flat|nested] 140+ messages in thread

* Re: [-mm patch] aic7xxx: check irq validity (was Re: 2.6.18-mm2)
  2006-09-30 23:58             ` Jeff Garzik
@ 2006-10-01 14:28               ` Matthew Wilcox
  2006-10-01 19:05                 ` Arjan van de Ven
  2006-10-01 21:31               ` [-mm patch] aic7xxx: check irq validity (was Re: 2.6.18-mm2) Frederik Deweerdt
  1 sibling, 1 reply; 140+ messages in thread
From: Matthew Wilcox @ 2006-10-01 14:28 UTC (permalink / raw)
  To: Jeff Garzik
  Cc: Frederik Deweerdt, Andrew Morton, Alan Cox, J.A. Magallón,
	Linux-Kernel,
	linux-scsi

On Sat, Sep 30, 2006 at 07:58:18PM -0400, Jeff Garzik wrote:
> Actually, rather than adding this check to every driver, I would rather 
> do something like the attached patch:  create a pci_request_irq(), and 
> pass a struct pci_device to it.  Then the driver author doesn't have to 
> worry about such details.

I like pci_request_irq(), but pci_valid_irq is bad.

> +#ifndef ARCH_VALIDATE_PCI_IRQ
> +int pci_valid_irq(struct pci_dev *pdev)
> +{
> +	if (pdev->irq == 0)
> +		return -EINVAL;
> +	
> +	return 0;
> +}
> +EXPORT_SYMBOL(pci_valid_irq);
> +#endif /* ARCH_VALIDATE_PCI_IRQ */

Better would be:

#ifndef ARCH_VALIDATE_IRQ
static inline int valid_irq(unsigned int irq)
{
	return irq ? 1 : 0;
}
#endif

in linux/interrupt.h (around request_irq).

And it doesn't need to be a __must_check.  There's no point -- it has
no side-effects.  The only reason to call it is if you want the answer
to the question.  You had the sense of the return code wrong too; you
want to use it as:

int pci_request_irq(struct pci_dev *pdev, irq_handler_t handler,
			unsigned long flags, const char *name, void *data)
{
	if (!valid_irq(pdev->irq)) {
		dev_printk(KERN_ERR, &pdev->dev, "invalid irq\n");
		return -EINVAL;
	}

	return request_irq(pdev->irq, handler, flags | IRQF_SHARED, name, data);
}
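
A minimal userspace sketch of the shape proposed above, assuming stand-in
types: struct pci_dev_stub, irq_handler_stub_t, request_irq_stub() and
pci_request_irq_sketch() are hypothetical names invented for illustration,
not kernel API. It only shows the division of labour: valid_irq() answers a
yes/no question, and the wrapper turns "no" into -EINVAL before anything is
requested.

```c
#include <errno.h>
#include <stdio.h>

/* Hypothetical stand-in for struct pci_dev; only ->irq matters here. */
struct pci_dev_stub {
	unsigned int irq;
};

/* Hypothetical stand-in for the kernel's handler type. */
typedef int (*irq_handler_stub_t)(int irq, void *data);

/* Predicate form from the thread: non-zero means the irq is usable. */
static inline int valid_irq(unsigned int irq)
{
	return irq ? 1 : 0;
}

/* Mock of request_irq(): remembers the irq it was given and succeeds. */
static unsigned int last_requested_irq;

static int request_irq_stub(unsigned int irq, irq_handler_stub_t handler,
			    unsigned long flags, const char *name, void *data)
{
	(void)handler;
	(void)flags;
	(void)name;
	(void)data;
	last_requested_irq = irq;
	return 0;
}

/* Wrapper: reject irq 0 up front, otherwise forward to the mock. */
static int pci_request_irq_sketch(struct pci_dev_stub *pdev,
				  irq_handler_stub_t handler,
				  unsigned long flags, const char *name,
				  void *data)
{
	if (!valid_irq(pdev->irq)) {
		fprintf(stderr, "%s: invalid irq\n", name);
		return -EINVAL;
	}

	return request_irq_stub(pdev->irq, handler, flags, name, data);
}
```

With this split a driver has no need for its own "if (!pdev->irq)" test, and
an architecture that wants a different rule only has to replace the predicate.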


^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: [-mm patch] aic7xxx: check irq validity (was Re: 2.6.18-mm2)
  2006-10-01 14:28               ` Matthew Wilcox
@ 2006-10-01 19:05                 ` Arjan van de Ven
  2006-10-01 19:19                   ` Jeff Garzik
  2006-10-01 19:36                   ` Matthew Wilcox
  0 siblings, 2 replies; 140+ messages in thread
From: Arjan van de Ven @ 2006-10-01 19:05 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: linux-scsi, Linux-Kernel,
	J.A. Magallón, Alan Cox, Andrew Morton, Frederik Deweerdt,
	Jeff Garzik

> .
> 
> And it doesn't need to be a __must_check.  There's no point -- it has
> no side-effects.  The only reason to call it is if you want the answer
> to the question.  You had the sense of the return code wrong too; you
> want to use it as:
> 
> int pci_request_irq(struct pci_dev *pdev, irq_handler_t handler,
> 			unsigned long flags, const char *name, void *data)
> {
> 	if (!valid_irq(pdev->irq)) {
> 		dev_printk(KERN_ERR, &pdev->dev, "invalid irq\n");
> 		return -EINVAL;
> 	}
> 
> 	return request_irq(pdev->irq, handler, flags | IRQF_SHARED, name, data);
> }


well... why not go one step further and eliminate the flags argument
entirely? And use pci_name() for the name (so eliminate the argument ;)
and always pass pdev as data, so that that argument can go away too....

that'll cover 99% of the request_irq() users for pci devices.. and makes
it really nicely simple and consistent.

-- 
if you want to mail me at work (you don't), use arjan (at) linux.intel.com


^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: [-mm patch] aic7xxx: check irq validity (was Re: 2.6.18-mm2)
  2006-10-01 19:05                 ` Arjan van de Ven
@ 2006-10-01 19:19                   ` Jeff Garzik
  2006-10-01 19:34                     ` Arjan van de Ven
  2006-10-01 19:36                   ` Matthew Wilcox
  1 sibling, 1 reply; 140+ messages in thread
From: Jeff Garzik @ 2006-10-01 19:19 UTC (permalink / raw)
  To: Arjan van de Ven
  Cc: Matthew Wilcox, linux-scsi, Linux-Kernel,
	J.A. Magallón, Alan Cox, Andrew Morton, Frederik Deweerdt

Arjan van de Ven wrote:
> well... why not go one step further and eliminate the flags argument
> entirely? And use pci_name() for the name (so eliminate the argument ;)
> and always pass pdev as data, so that that argument can go away too....
> 
> that'll cover 99% of the request_irq() users for pci devices.. and makes
> it really nicely simple and consistent.

Disagree.  That would involve rewriting a lot of drivers.

flags: may or may not need sample-random flag.

name: is always the ethernet interface, for net drivers, or did you 
forget from your irqbalance days?  ;-)

data: in practice, is _rarely_ struct pci_dev.  It's usually a 
driver-private structure which is the structure most frequently 
accessed.  struct pci_dev* is rarely accessed inside the interrupt 
handler, except maybe somewhere deep in an error handling path.

	Jeff


^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: [-mm patch] aic7xxx: check irq validity (was Re: 2.6.18-mm2)
  2006-10-01 19:19                   ` Jeff Garzik
@ 2006-10-01 19:34                     ` Arjan van de Ven
  0 siblings, 0 replies; 140+ messages in thread
From: Arjan van de Ven @ 2006-10-01 19:34 UTC (permalink / raw)
  To: Jeff Garzik
  Cc: Matthew Wilcox, linux-scsi, Linux-Kernel,
	J.A. Magallón, Alan Cox, Andrew Morton, Frederik Deweerdt

On Sun, 2006-10-01 at 15:19 -0400, Jeff Garzik wrote:
> Arjan van de Ven wrote:
> > well... why not go one step further and eliminate the flags argument
> > entirely? And use pci_name() for the name (so eliminate the argument ;)
> > and always pass pdev as data, so that that argument can go away too....
> > 
> > that'll cover 99% of the request_irq() users for pci devices.. and makes
> > it really nicely simple and consistent.
> 
> Disagree.  That would involve rewriting a lot of drivers.
> 
> flags: may or may not need sample-random flag.

ok fair.. but I'd then almost call it "samplerandom" not "flags"...


> 
> name: is always the ethernet interface, for net drivers, or did you 
> forget from your irqbalance days?  ;-)

I'd say the "always" isn't quite true .. I remember that well.
If it's always the pci device at least irqbalance can look up the device
type in sysfs ;)


> data: in practice, is _rarely_ struct pci_dev.  It's usually a 
> driver-private structure which is the structure most frequently 
> accessed.  struct pci_dev* is rarely accessed inside the interrupt 
> handler, except maybe somewhere deep in an error handling path.

hmmm could put a pointer to the private data in the pci_dev at least...
that'd be generally useful, and then this can either just pass that,
or have the isr get to it that way (whichever makes more sense)


^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: [-mm patch] aic7xxx: check irq validity (was Re: 2.6.18-mm2)
  2006-10-01 19:05                 ` Arjan van de Ven
  2006-10-01 19:19                   ` Jeff Garzik
@ 2006-10-01 19:36                   ` Matthew Wilcox
  2006-10-01 19:42                     ` Jeff Garzik
  2006-10-02  2:12                     ` Arjan van de Ven
  1 sibling, 2 replies; 140+ messages in thread
From: Matthew Wilcox @ 2006-10-01 19:36 UTC (permalink / raw)
  To: Arjan van de Ven
  Cc: linux-scsi, Linux-Kernel,
	J.A. Magallón, Alan Cox, Andrew Morton, Frederik Deweerdt,
	Jeff Garzik

On Sun, Oct 01, 2006 at 09:05:23PM +0200, Arjan van de Ven wrote:
> > int pci_request_irq(struct pci_dev *pdev, irq_handler_t handler,
> > 			unsigned long flags, const char *name, void *data)
> > {
> > 	if (!valid_irq(pdev->irq)) {
> > 		dev_printk(KERN_ERR, &pdev->dev, "invalid irq\n");
> > 		return -EINVAL;
> > 	}
> > 
> > 	return request_irq(pdev->irq, handler, flags | IRQF_SHARED, name, data);
> > }
> 
> well... why not go one step further and eliminate the flags argument
> entirely? And use pci_name() for the name (so eliminate the argument ;)
> and always pass pdev as data, so that that argument can go away too....
> 
> that'll cover 99% of the request_irq() users for pci devices.. and makes
> it really nicely simple and consistent.

hmm.  $ echo `cut -c34- /proc/interrupts`
timer i8042 cascade acpi yenta, ehci_hcd:usb1, Intel 82801DB-ICH4 yenta,
uhci_hcd:usb2 uhci_hcd:usb4, eth0 ide0 uhci_hcd:usb3, eth1

Network drivers use their eth%d name.  USB drivers use [eu]hci_hcd:usb%d.
Others tend to use the driver name.  Changing them all to be 0000:00:1d.2
isn't really an improvement in the readability of /proc/interrupts, IMO.

Passing pdev as the data is a good idea for practically no device driver.
It's rare to actually want the pci_device down in the interrupt handler;
normally you want the device private data.  Using pci_get_drvdata(pdev)
as the data would make sense for both sym2 and tg3.  I don't feel like
auditing other drivers to see if it'd make sense for them too.

So, current proposal:

int pci_request_irq(struct pci_dev *pdev, irq_handler_t handler,
			const char *name)
{
	if (!valid_irq(pdev->irq)) {
		dev_printk(KERN_ERR, &pdev->dev, "invalid irq\n");
		return -EINVAL;
	}

	return request_irq(pdev->irq, handler, IRQF_SHARED, name,
				pci_get_drvdata(pdev));
}

But what about IRQF_SAMPLE_RANDOM?

^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: [-mm patch] aic7xxx: check irq validity (was Re: 2.6.18-mm2)
  2006-10-01 19:36                   ` Matthew Wilcox
@ 2006-10-01 19:42                     ` Jeff Garzik
  2006-10-02  2:12                     ` Arjan van de Ven
  1 sibling, 0 replies; 140+ messages in thread
From: Jeff Garzik @ 2006-10-01 19:42 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Arjan van de Ven, linux-scsi, Linux-Kernel,
	J.A. Magallón, Alan Cox, Andrew Morton, Frederik Deweerdt

Matthew Wilcox wrote:
> Others tend to use the driver name.  Changing them all to be 0000:00:1d.2
> isn't really an improvement in the readability of /proc/interrupts, IMO.

agreed


> Passing pdev as the data is a good idea for practically no device driver.

agreed


> It's rare to actually want the pci_device down in the interrupt handler;
> normally you want the device private data.  Using pci_get_drvdata(pdev)
> as the data would make sense for both sym2 and tg3.  I don't feel like

Using pci_get_drvdata() is a pretty good idea


> int pci_request_irq(struct pci_dev *pdev, irq_handler_t handler,
> 			const char *name)
> {
> 	if (!valid_irq(pdev->irq)) {
> 		dev_printk(KERN_ERR, &pdev->dev, "invalid irq\n");
> 		return -EINVAL;
> 	}
> 
> 	return request_irq(pdev->irq, handler, IRQF_SHARED, name,
> 				pci_get_drvdata(pdev));
> }
> 
> But what about IRQF_SAMPLE_RANDOM?

I still like having a flags argument though.  It's enough of an open 
question, and I bet there will be a new flag or two in the future that 
PCI drivers will want to use.

	Jeff



^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: [-mm patch] aic7xxx: check irq validity (was Re: 2.6.18-mm2)
  2006-09-30 23:58             ` Jeff Garzik
  2006-10-01 14:28               ` Matthew Wilcox
@ 2006-10-01 21:31               ` Frederik Deweerdt
  1 sibling, 0 replies; 140+ messages in thread
From: Frederik Deweerdt @ 2006-10-01 21:31 UTC (permalink / raw)
  To: Jeff Garzik
  Cc: Andrew Morton, Alan Cox, Matthew Wilcox, J.A. Magallón,
	Linux-Kernel,
	linux-scsi

On Sat, Sep 30, 2006 at 07:58:18PM -0400, Jeff Garzik wrote:
> Frederik Deweerdt wrote:
> >On Sat, Sep 30, 2006 at 12:43:24AM +0100, Alan Cox wrote:
> >>Ar Gwe, 2006-09-29 am 23:50 +0000, ysgrifennodd Frederik Deweerdt:
> >>>Does this patch makes sense in that case? If yes, I'll put up a patch
> >>>for the remaining cases in the drivers/scsi/aic7xxx/ directory.
> >>>Also, aic7xxx's coding style would put parenthesis around the returned
> >>>value, should I follow it?
> >>Yes - but perhaps with a warning message so users know why ?
> >>
> >>As to coding style - kernel style is unbracketed so I wouldnt worry
> >>about either.
> >>
> >Thanks for the advices. The following patch checks whenever the irq is valid before issuing a
> >request_irq() for AIC7XXX and AIC79XX. An error message is displayed to
> >let the user know what went wrong.
> >Regards,
> >Frederik
> >Signed-off-by: Frederik Deweerdt <frederik.deweerdt@gmail.com>
> 
> Actually, rather than adding this check to every driver, I would rather do something like the attached patch:  create a 
> pci_request_irq(), and pass a struct pci_device to it.  Then the driver author doesn't have to worry about such details.
> 
That's better, indeed. 
[...]
> +#ifndef ARCH_VALIDATE_PCI_IRQ
> +int pci_valid_irq(struct pci_dev *pdev)
> +{
> +	if (pdev->irq == 0)
> +		return -EINVAL;
                        ^^^^^^
Wouldn't this rather be ENODEV? Admittedly, from pci_valid_irq() (or
is_irq_valid()) point of view, it _has_ been passed an invalid value. But
from userspace's point of view, it's like the device was not present.

Regards,
Frederik

^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: [-mm patch] aic7xxx: check irq validity (was Re: 2.6.18-mm2)
  2006-10-01 19:36                   ` Matthew Wilcox
  2006-10-01 19:42                     ` Jeff Garzik
@ 2006-10-02  2:12                     ` Arjan van de Ven
  2006-10-02 20:00                       ` [RFC PATCH] pci_request_irq (was [-mm patch] aic7xxx: check irq validity) Frederik Deweerdt
  1 sibling, 1 reply; 140+ messages in thread
From: Arjan van de Ven @ 2006-10-02  2:12 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: linux-scsi, Linux-Kernel,
	J.A. Magallón, Alan Cox, Andrew Morton, Frederik Deweerdt,
	Jeff Garzik


> Network drivers use their eth%d name.  USB drivers use [eu]hci_hcd:usb%d.
> Others tend to use the driver name.  Changing them all to be 0000:00:1d.2
> isn't really an improvement in the readability of /proc/interrupts, IMO.

hmm ok; how about allowing name to be NULL, and if it's NULL, use the
pci name?

> 
> So, current proposal:
> 
> int pci_request_irq(struct pci_dev *pdev, irq_handler_t handler,
> 			const char *name)
> {
> 	if (!valid_irq(pdev->irq)) {
> 		dev_printk(KERN_ERR, &pdev->dev, "invalid irq\n");
> 		return -EINVAL;
> 	}
> 
> 	return request_irq(pdev->irq, handler, IRQF_SHARED, name,
> 				pci_get_drvdata(pdev));
> }
> 
> But what about IRQF_SAMPLE_RANDOM?

that's a tough question. I'd almost suggest making such things
properties of the pdev, but sample-random is so far away from PCI
related that it makes no sense I suppose ;(

(others do I think)

One other interesting question is if this function can/should be used to
use MSI transparently (after pci_enable_msi() obviously)

-- 
if you want to mail me at work (you don't), use arjan (at) linux.intel.com


^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: md deadlock (was Re: 2.6.18-mm2)
  2006-09-29 14:03       ` Peter Zijlstra
@ 2006-10-02 13:47         ` Peter Zijlstra
  2006-10-10  3:53           ` Neil Brown
  0 siblings, 1 reply; 140+ messages in thread
From: Peter Zijlstra @ 2006-10-02 13:47 UTC (permalink / raw)
  To: Neil Brown
  Cc: Michal Piotrowski, Andrew Morton, Ingo Molnar, linux-raid, linux-kernel

On Fri, 2006-09-29 at 16:03 +0200, Peter Zijlstra wrote:
> On Fri, 2006-09-29 at 22:52 +1000, Neil Brown wrote:
> > On Friday September 29, a.p.zijlstra@chello.nl wrote:
> > > On Thu, 2006-09-28 at 13:54 +0200, Michal Piotrowski wrote:
> > > 
> > > Looks like a real deadlock here. It seems to me #2 is the easiest to
> > > break.
> > 
> > I guess it could deadlock if you tried to add /dev/md0 as a component
> > of /dev/md0.  I should probably check for that somewhere.
> > In other cases the array->member ordering ensures there is no
> > deadlock.
> > 
> 
> 
> 	1					2
> 
>  open(/dev/md0)
> 
> 					open(/dev/md0)
> 					- do_open() -> bdev->bd_mutex
>  ioctl(/dev/md0, hotadd) 
>  - md_ioctl() -> mddev->reconfig_mutex
>  -- hot_add_disk()
>  --- bind_rdev_to_array()
>  ---- bd_claim_by_disk()
>  ----- bd_claim_by_kobject()
> 					-- md_open()
> 					--- mddev_lock()
> 					---- mutex_lock(mddev->reconfig_mutex)
>  ------ mutex_lock(bdev->bd_mutex)
> 

D'oh, 1:bdev->bd_mutex is of course rdev->bd_mutex; the slave device's
mutex.

So mddev->bd_mutex wants to be another class altogether.
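
The false positive can be pictured with a toy order tracker that, like
lockdep, reasons about lock classes rather than lock instances. This is a
sketch under that assumption only: order_record(), order_reset() and the
class constants are hypothetical, and real lockdep is far more involved.

```c
#include <stdbool.h>
#include <string.h>

enum lock_class {
	CLASS_BD_MUTEX,		/* bd_mutex, one class shared by all bdevs */
	CLASS_RECONFIG_MUTEX,	/* mddev->reconfig_mutex */
	CLASS_MD_BD_MUTEX,	/* hypothetical own class for the array's bd_mutex */
	NR_CLASSES
};

/* seen[a][b] is true once some path wanted class b while holding class a. */
static bool seen[NR_CLASSES][NR_CLASSES];

static void order_reset(void)
{
	memset(seen, 0, sizeof(seen));
}

/*
 * Record "held, then wanted". Returns true if the reverse order was
 * recorded earlier, i.e. the tracker would report an AB/BA inversion.
 */
static bool order_record(enum lock_class held, enum lock_class wanted)
{
	bool inversion = seen[wanted][held];

	seen[held][wanted] = true;
	return inversion;
}
```

Feeding it the two columns of the diagram with a single class for every
bd_mutex (open: bd_mutex then reconfig_mutex; hotadd: reconfig_mutex then the
slave's bd_mutex) reports an inversion on the second call. Recording the open
path with CLASS_MD_BD_MUTEX instead makes both calls return false.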


^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: 2.6.18-mm2 - oops in cache_alloc_refill()
  2006-09-30  1:20           ` Andrew Morton
                               ` (3 preceding siblings ...)
  2006-09-30  1:59             ` x z
@ 2006-10-02 17:52             ` Jean Tourrilhes
  2006-10-02 19:57               ` Valdis.Kletnieks
  2006-10-03 15:58               ` Samuel Tardieu
  4 siblings, 2 replies; 140+ messages in thread
From: Jean Tourrilhes @ 2006-10-02 17:52 UTC (permalink / raw)
  To: Andrew Morton, Pavel Roskin
  Cc: Valdis.Kletnieks, John W. Linville, linux-kernel, netdev

On Fri, Sep 29, 2006 at 06:20:08PM -0700, Andrew Morton wrote:
> On Fri, 29 Sep 2006 20:01:54 -0400
> > 
> > % grep ioctl /tmp/foo2 | sort -u | more
> > ioctl(13, SIOCGIWESSID, 0xbfbcdb9c)     = 0
> > ioctl(13, SIOCGIWRANGE, 0xbfbcdbdc)     = 0
> > ioctl(13, SIOCGIWRATE, 0xbfbcdbbc)      = 0
> 
> Yes.  The main thing which those WE-21 patches do is to shorten the size of
> various buffers which are used in wireless ioctls.

	Ok, I've found it. Actually, I feel ashamed, as it is a fairly
classical buffer overflow: we put one extra char in a buffer. Now, I
don't understand why it did not blow up on my box ;-)
	New patch. I think it is right, but I would not mind Pavel
having a look at it. On my box it does not make things worse.
	Valdis : would you mind checking whether this patch fixes the problem
you are seeing with WE-21 ? If it fixes it, I'll send it to John...
	Have fun...

	Jean

P.S. : I'll audit the other wireless drivers for the same thing.

-------------------------------------------------

diff -u -p linux/drivers/net/wireless/orinoco.j1.c linux/drivers/net/wireless/orinoco.c
--- linux/drivers/net/wireless/orinoco.j1.c	2006-10-02 10:15:41.000000000 -0700
+++ linux/drivers/net/wireless/orinoco.c	2006-10-02 10:39:20.000000000 -0700
@@ -2456,6 +2456,7 @@ void free_orinocodev(struct net_device *
 /* Wireless extensions                                              */
 /********************************************************************/
 
+/* Return : < 0 -> error code ; >= 0 -> length */
 static int orinoco_hw_get_essid(struct orinoco_private *priv, int *active,
 				char buf[IW_ESSID_MAX_SIZE+1])
 {
@@ -2500,9 +2501,9 @@ static int orinoco_hw_get_essid(struct o
 	len = le16_to_cpu(essidbuf.len);
 	BUG_ON(len > IW_ESSID_MAX_SIZE);
 
-	memset(buf, 0, IW_ESSID_MAX_SIZE+1);
+	memset(buf, 0, IW_ESSID_MAX_SIZE);
 	memcpy(buf, p, len);
-	buf[len] = '\0';
+	err = len;
 
  fail_unlock:
 	orinoco_unlock(priv, &flags);
@@ -3026,17 +3027,18 @@ static int orinoco_ioctl_getessid(struct
 
 	if (netif_running(dev)) {
 		err = orinoco_hw_get_essid(priv, &active, essidbuf);
-		if (err)
+		if (err < 0)
 			return err;
+		erq->length = err;
 	} else {
 		if (orinoco_lock(priv, &flags) != 0)
 			return -EBUSY;
-		memcpy(essidbuf, priv->desired_essid, IW_ESSID_MAX_SIZE + 1);
+		memcpy(essidbuf, priv->desired_essid, IW_ESSID_MAX_SIZE);
+		erq->length = strlen(priv->desired_essid);
 		orinoco_unlock(priv, &flags);
 	}
 
 	erq->flags = 1;
-	erq->length = strlen(essidbuf);
 
 	return 0;
 }
@@ -3074,10 +3076,10 @@ static int orinoco_ioctl_getnick(struct 
 	if (orinoco_lock(priv, &flags) != 0)
 		return -EBUSY;
 
-	memcpy(nickbuf, priv->nick, IW_ESSID_MAX_SIZE+1);
+	memcpy(nickbuf, priv->nick, IW_ESSID_MAX_SIZE);
 	orinoco_unlock(priv, &flags);
 
-	nrq->length = strlen(nickbuf);
+	nrq->length = strlen(priv->nick);
 
 	return 0;
 }
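
A userspace sketch of the pattern the patch moves to, assuming the
destination buffer is exactly IW_ESSID_MAX_SIZE bytes long: nothing is
written at index IW_ESSID_MAX_SIZE, and the length is returned by the helper
instead of being recovered with strlen() from a buffer that may have no
terminating NUL. ESSID_MAX and essid_copy_sketch() are hypothetical stand-ins,
not driver code.

```c
#include <stddef.h>
#include <string.h>

#define ESSID_MAX 32	/* stands in for IW_ESSID_MAX_SIZE; value assumed here */

/*
 * Copy src_len bytes of ESSID into a buffer of exactly ESSID_MAX bytes.
 * Returns the length, or -1 if it cannot fit. No terminating NUL is
 * written, so a maximum-length ESSID never touches buf[ESSID_MAX].
 */
static int essid_copy_sketch(char *buf, const char *src, size_t src_len)
{
	if (src_len > ESSID_MAX)
		return -1;

	memset(buf, 0, ESSID_MAX);	/* zero-pad, but only inside the buffer */
	memcpy(buf, src, src_len);
	return (int)src_len;
}
```

The caller then stores the returned value as the length, which is what the
"err < 0" test and "erq->length = err" lines in the diff above do.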

^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: [RFC PATCH] pci_request_irq (was [-mm patch] aic7xxx: check irq validity)
  2006-10-02 20:00                       ` [RFC PATCH] pci_request_irq (was [-mm patch] aic7xxx: check irq validity) Frederik Deweerdt
@ 2006-10-02 18:15                         ` Matthew Wilcox
  2006-10-02 21:09                           ` Frederik Deweerdt
  2006-10-02 20:07                         ` [RFC PATCH] move aic7xxx to pci_request_irq Frederik Deweerdt
                                           ` (3 subsequent siblings)
  4 siblings, 1 reply; 140+ messages in thread
From: Matthew Wilcox @ 2006-10-02 18:15 UTC (permalink / raw)
  To: Frederik Deweerdt
  Cc: Arjan van de Ven, linux-scsi, Linux-Kernel,
	J.A. Magallón, Alan Cox, Andrew Morton, Jeff Garzik

On Mon, Oct 02, 2006 at 08:00:48PM +0000, Frederik Deweerdt wrote:
>  /**
> + * pci_request_irq - Reserve an IRQ for a PCI device
> + * @pdev: The PCI device whose irq is to be reserved
> + * @handler: The interrupt handler function,

> + * pci_get_drvdata(pdev) shall be passed as an argument to that function

I don't think you can (or should) do this.  Move it to the body of the
comment below.

> + * @flags: The flags to be passed to request_irq()
> + * @name: The name of the device to be associated with the irq
> + *
> + * Returns 0 on success, or a negative value on error.  A warning
> + * message is also printed on failure.
> + */
> +int pci_request_irq(struct pci_dev *pdev,
> +		    irqreturn_t (*handler)(int, void *, struct pt_regs *),
> +		    unsigned long flags, const char *name)
> +{
> +	int rc;
> +	const char *actual_name = name;
> +
> +	rc = is_irq_valid(pdev->irq);
> +	if (!rc) {
> +		dev_printk(KERN_ERR, &pdev->dev, "invalid irq #%d\n", pdev->irq);
> +		return -EINVAL;
> +	}

Why is that more readable than

	if (!is_irq_valid(pdev->irq)) {
		dev_err(&pdev->dev, "invalid irq #%d\n", pdev->irq);
		return -EINVAL;
	}

> +	if (!actual_name)
> +		actual_name = pci_name(pdev);
> +
> +	return request_irq(pdev->irq, handler, flags | IRQF_SHARED,
> +			   actual_name, pci_get_drvdata(pdev));

The driver name is a far more common usage than the pci_name.

	return request_irq(pdev->irq, handler, flags | IRQF_SHARED,
			name ? name : pdev->driver->name,
			pci_get_drvdata(pdev));
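
A small sketch of that fallback, pulled out into a helper so the order of
preference is explicit. struct drv_stub, struct dev_stub and
irq_label_sketch() are hypothetical userspace stand-ins; the last fallback to
the bus name for a device with no bound driver is an extra assumption of this
sketch, not part of the suggestion above.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct pci_driver. */
struct drv_stub {
	const char *name;
};

/* Hypothetical stand-in for struct pci_dev. */
struct dev_stub {
	const char *bus_name;		/* what pci_name() would return */
	const struct drv_stub *driver;	/* NULL if no driver is bound */
};

/*
 * Pick the label shown in /proc/interrupts: the caller's name if given,
 * else the driver name, else (assumption) the bus name.
 */
static const char *irq_label_sketch(const struct dev_stub *pdev,
				    const char *name)
{
	if (name)
		return name;
	if (pdev->driver && pdev->driver->name)
		return pdev->driver->name;
	return pdev->bus_name;
}
```

A network driver would keep passing its eth%d name explicitly; a driver that
passes NULL would show up under its driver name.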


^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: [RFC PATCH] move aic7xxx to pci_request_irq
  2006-10-02 20:07                         ` [RFC PATCH] move aic7xxx to pci_request_irq Frederik Deweerdt
@ 2006-10-02 18:27                           ` Matthew Wilcox
  2006-10-02 21:02                             ` Frederik Deweerdt
  2006-10-03  3:45                           ` Arjan van de Ven
  1 sibling, 1 reply; 140+ messages in thread
From: Matthew Wilcox @ 2006-10-02 18:27 UTC (permalink / raw)
  To: Frederik Deweerdt
  Cc: Arjan van de Ven, linux-scsi, Linux-Kernel,
	J.A. Magallón, Alan Cox, Andrew Morton, Jeff Garzik

On Mon, Oct 02, 2006 at 08:07:03PM +0000, Frederik Deweerdt wrote:
> +++ b/drivers/scsi/aic7xxx/aic79xx_osm_pci.c
> @@ -341,12 +341,12 @@ ahd_pci_map_int(struct ahd_softc *ahd)
>  {
>  	int error;
>  
> -	error = request_irq(ahd->dev_softc->irq, ahd_linux_isr,
> -			    IRQF_SHARED, "aic79xx", ahd);
> +	error = pci_request_irq(ahd->dev_softc, ahd_linux_isr,
> +			    IRQF_SHARED, "aic79xx");
>  	if (!error)
>  		ahd->platform_data->irq = ahd->dev_softc->irq;
>  	
> -	return (-error);
> +	return error;

Seems unsafe to me.  Unless you want to trace through the whole driver
changing its internal conventions to use negative errnos like the rest
of the kernel.
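
To make the hazard concrete, a sketch assuming some caller in the driver
compares the result against a positive errno. The caller here is
hypothetical; whether aic7xxx has one like it is exactly what would need
tracing.

```c
#include <errno.h>

/* Old convention in the hunk above: flip request_irq()'s negative errno. */
static int map_int_positive(int request_irq_rc)
{
	return -request_irq_rc;
}

/* Convention after the patch: pass the negative errno straight through. */
static int map_int_negative(int request_irq_rc)
{
	return request_irq_rc;
}

/* Hypothetical caller written against the positive convention. */
static int caller_sees_busy(int error)
{
	return error == EBUSY;
}
```

Given -EBUSY from request_irq(), caller_sees_busy() is true with the old
helper and silently false with the new one. Success (0) looks the same either
way, which is why the change is easy to miss in testing.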

> -	
> -	return (-error);
> -}
>  
> +	return error;
> +}

Ditto.

^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: [RFC PATCH] move tg3 to pci_request_irq
  2006-10-02 20:11                         ` [RFC PATCH] move tg3 " Frederik Deweerdt
@ 2006-10-02 18:28                           ` Matthew Wilcox
  2006-10-02 21:04                             ` Frederik Deweerdt
  2006-10-03  7:18                           ` Arjan van de Ven
  1 sibling, 1 reply; 140+ messages in thread
From: Matthew Wilcox @ 2006-10-02 18:28 UTC (permalink / raw)
  To: Frederik Deweerdt
  Cc: Arjan van de Ven, linux-scsi, Linux-Kernel,
	J.A. Magallón, Alan Cox, Andrew Morton, Jeff Garzik

On Mon, Oct 02, 2006 at 08:11:34PM +0000, Frederik Deweerdt wrote:
> @@ -6838,9 +6838,9 @@ restart_timer:
>  
>  static int tg3_request_irq(struct tg3 *tp)
>  {
> +	struct net_device *dev = tp->dev;
>  	irqreturn_t (*fn)(int, void *, struct pt_regs *);
>  	unsigned long flags;
> -	struct net_device *dev = tp->dev;
>  
>  	if (tp->tg3_flags2 & TG3_FLG2_USING_MSI) {
>  		fn = tg3_msi;

Is there any reason for this noise?


^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: [RFC PATCH] move drm to pci_request_irq
  2006-10-02 20:12                         ` [RFC PATCH] move drm " Frederik Deweerdt
@ 2006-10-02 18:37                           ` Matthew Wilcox
  2006-10-02 21:07                             ` Frederik Deweerdt
  2006-10-02 20:36                           ` Alan Cox
  2006-10-02 23:54                           ` Dave Airlie
  2 siblings, 1 reply; 140+ messages in thread
From: Matthew Wilcox @ 2006-10-02 18:37 UTC (permalink / raw)
  To: Frederik Deweerdt
  Cc: Arjan van de Ven, linux-scsi, Linux-Kernel,
	J.A. Magallón, Alan Cox, Andrew Morton, Jeff Garzik

On Mon, Oct 02, 2006 at 08:12:29PM +0000, Frederik Deweerdt wrote:
>  
> +	pci_set_drvdata(dev, NULL);
> +
>  	DRM_DEBUG("lastclose completed\n");

Not necessary.  pci_devs are allocated initialised to 0.

> @@ -132,8 +132,10 @@ static int drm_irq_install(drm_device_t 
>  	if (drm_core_check_feature(dev, DRIVER_IRQ_SHARED))
>  		sh_flags = IRQF_SHARED;
>  
> -	ret = request_irq(dev->irq, dev->driver->irq_handler,
> -			  sh_flags, dev->devname, dev);
> +	pci_set_drvdata(dev->pdev, dev);
> +
> +	ret = pci_request_irq(dev->pdev, dev->driver->irq_handler,
> +			  sh_flags, dev->devname);

This seems like the wrong place to be setting the pci_drvdata.  It
should probably be done in each driver.  But then, requesting the IRQ
should also be done by each driver.  You've dragged us into the "wow,
what a mess DRI is" black hole here, I'm afraid.


^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: 2.6.18-mm2 - oops in cache_alloc_refill()
  2006-10-02 17:52             ` 2.6.18-mm2 - oops in cache_alloc_refill() Jean Tourrilhes
@ 2006-10-02 19:57               ` Valdis.Kletnieks
  2006-10-03 15:58               ` Samuel Tardieu
  1 sibling, 0 replies; 140+ messages in thread
From: Valdis.Kletnieks @ 2006-10-02 19:57 UTC (permalink / raw)
  To: jt; +Cc: Andrew Morton, Pavel Roskin, John W. Linville, linux-kernel, netdev

[-- Attachment #1: Type: text/plain, Size: 1138 bytes --]

On Mon, 02 Oct 2006 10:52:45 PDT, Jean Tourrilhes said:
> On Fri, Sep 29, 2006 at 06:20:08PM -0700, Andrew Morton wrote:
> > On Fri, 29 Sep 2006 20:01:54 -0400
> > > 
> > > % grep ioctl /tmp/foo2 | sort -u | more
> > > ioctl(13, SIOCGIWESSID, 0xbfbcdb9c)     = 0
> > > ioctl(13, SIOCGIWRANGE, 0xbfbcdbdc)     = 0
> > > ioctl(13, SIOCGIWRATE, 0xbfbcdbbc)      = 0
> > 
> > Yes.  The main thing which those WE-21 patches do is to shorten the size of
> > various buffers which are used in wireless ioctls.
> 
> 	Ok, I've found it. Actually, I feel ashamed, as it is a fairly
> classical buffer overflow, we put one extra char in a buffer. Now, I
> don't understand why it did not blow up on my box ;-)
> 	New patch. I think it is right, but I would not mind Pavel to
> have a look at it. On my box it does not make thing worse.
> 	Valdis : would you mind trying if this patch fix the problem
> you are seeing with WE-21 ? If it fixes it, I'll send it to John...

Been up and running with we-21 configured in, and gkrellm doing the monitoring
that gave it indigestion.  It was dying in 1-2 minutes, now been up for 30 mins
with no issues....
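
For readers following along, the kind of bug described above (one extra
char written into a buffer) can be sketched in user space. This is only
an illustration under assumptions: bounded_copy() is a hypothetical
helper, not code from the WE-21 patches, and the real fix may look quite
different.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not taken from the wireless extensions code:
 * copy at most dst_len - 1 bytes of src and always NUL-terminate,
 * so the terminator can never land one byte past the buffer.
 * Returns the number of bytes copied, excluding the terminator.
 */
static size_t bounded_copy(char *dst, size_t dst_len,
			   const char *src, size_t src_len)
{
	size_t n;

	if (dst_len == 0)
		return 0;

	/* The classic bug is to use dst_len here instead of dst_len - 1. */
	n = src_len < dst_len - 1 ? src_len : dst_len - 1;
	memcpy(dst, src, n);
	dst[n] = '\0';
	return n;
}
```

The point of the design is that the terminator is accounted for when the
length is clamped, rather than appended afterwards.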

[-- Attachment #2: Type: application/pgp-signature, Size: 226 bytes --]

^ permalink raw reply	[flat|nested] 140+ messages in thread

* [RFC PATCH] pci_request_irq (was [-mm patch] aic7xxx: check irq validity)
  2006-10-02  2:12                     ` Arjan van de Ven
@ 2006-10-02 20:00                       ` Frederik Deweerdt
  2006-10-02 18:15                         ` Matthew Wilcox
                                           ` (4 more replies)
  0 siblings, 5 replies; 140+ messages in thread
From: Frederik Deweerdt @ 2006-10-02 20:00 UTC (permalink / raw)
  To: Arjan van de Ven
  Cc: Matthew Wilcox, linux-scsi, Linux-Kernel,
	J.A. Magallón, Alan Cox, Andrew Morton, Jeff Garzik

Hi all,

I've tried to summarize the different proposals made by Jeff Garzik,
Matthew Wilcox and Arjan van de Ven in the "[-mm patch] aic7xxx: check
irq validity" thread. I've also added:
- some kerneldoc
- renamed valid_irq to is_irq_valid()
- added pci_free_irq().

I'll send a follow-up patch showing the implied modifications for the
following - semi-randomly chosen :) - drivers: aic7xxx, aic79xx, tg3
and drm.
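
One consequence of the interface is that the cookie passed to
request_irq() and later to free_irq() is whatever pci_get_drvdata()
returns at that moment, so the drvdata has to be set before the request
and left alone until the free. A minimal user-space sketch of that
pairing follows. Everything prefixed sketch_ is a hypothetical stand-in,
not a kernel structure or function; the NULL check models request_irq()
refusing a NULL dev_id when IRQF_SHARED is set.

```c
#include <assert.h>
#include <stddef.h>

/* User-space stand-ins; none of these are the real kernel structures. */
struct sketch_pci_dev {
	unsigned int irq;
	void *drvdata;
};

/* One registered cookie per irq line, enough for illustration. */
#define SKETCH_NR_IRQS 16
static void *sketch_registered[SKETCH_NR_IRQS];

/* Models a shared request: a NULL dev_id is rejected. */
static int sketch_request_irq(unsigned int irq, void *dev_id)
{
	if (irq >= SKETCH_NR_IRQS || !dev_id || sketch_registered[irq])
		return -1;
	sketch_registered[irq] = dev_id;
	return 0;
}

/* Only releases the line if dev_id matches the one used at request time. */
static int sketch_free_irq(unsigned int irq, void *dev_id)
{
	if (irq >= SKETCH_NR_IRQS || !dev_id ||
	    sketch_registered[irq] != dev_id)
		return -1;
	sketch_registered[irq] = NULL;
	return 0;
}

/* Shape of the proposed helpers: the cookie is always the drvdata. */
static int sketch_pci_request_irq(struct sketch_pci_dev *pdev)
{
	if (!pdev->irq)
		return -1;
	return sketch_request_irq(pdev->irq, pdev->drvdata);
}

static int sketch_pci_free_irq(struct sketch_pci_dev *pdev)
{
	return sketch_free_irq(pdev->irq, pdev->drvdata);
}
```

In this model a request made before the drvdata is set fails, and a free
made after the drvdata has changed does not release the line.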

Regards,
Frederik

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index a544997..ae20a3a 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -15,6 +15,7 @@ #include <linux/init.h>
 #include <linux/pci.h>
 #include <linux/module.h>
 #include <linux/spinlock.h>
+#include <linux/interrupt.h>
 #include <linux/string.h>
 #include <asm/dma.h>	/* isa_dma_bridge_buggy */
 #include "pci.h"
@@ -810,6 +811,49 @@ err_out:
 }
 
 /**
+ * pci_request_irq - Reserve an IRQ for a PCI device
+ * @pdev: The PCI device whose irq is to be reserved
+ * handler: The interrupt handler function,
+ * pci_get_drvdata(pdev) shall be passed as an argument to that function
+ * @flags: The flags to be passed to request_irq()
+ * @name: The name of the device to be associated with the irq
+ *
+ * Returns 0 on success, or a negative value on error.  A warning
+ * message is also printed on failure.
+ */
+int pci_request_irq(struct pci_dev *pdev,
+		    irqreturn_t (*handler)(int, void *, struct pt_regs *),
+		    unsigned long flags, const char *name)
+{
+	int rc;
+	const char *actual_name = name;
+
+	rc = is_irq_valid(pdev->irq);
+	if (!rc) {
+		dev_printk(KERN_ERR, &pdev->dev, "invalid irq #%d\n", pdev->irq);
+		return -EINVAL;
+	}
+
+	if (!actual_name)
+		actual_name = pci_name(pdev);
+
+	return request_irq(pdev->irq, handler, flags | IRQF_SHARED,
+			   actual_name, pci_get_drvdata(pdev));
+}
+EXPORT_SYMBOL(pci_request_irq);
+
+/**
+ * pci_free_irq - releases the interrupt line reserved to the PCI
+ * device pointed by @pdev 
+ * @pdev: the PCI device whose interrupt is to be freed
+ */
+void pci_free_irq(struct pci_dev *pdev)
+{
+	free_irq(pdev->irq, pci_get_drvdata(pdev));
+}
+EXPORT_SYMBOL(pci_free_irq);
+
+/**
  * pci_set_master - enables bus-mastering for device dev
  * @dev: the PCI device to enable
  *
diff --git a/include/linux/interrupt.h b/include/linux/interrupt.h
index 1f97e3d..c320b50 100644
--- a/include/linux/interrupt.h
+++ b/include/linux/interrupt.h
@@ -75,6 +75,13 @@ struct irqaction {
 	struct proc_dir_entry *dir;
 };
 
+#ifndef ARCH_VALIDATE_PCI_IRQ
+static inline int is_irq_valid(unsigned int irq)
+{
+	return irq ? 1 : 0;
+}
+#endif /* ARCH_VALIDATE_PCI_IRQ */
+
 extern irqreturn_t no_action(int cpl, void *dev_id, struct pt_regs *regs);
 extern int request_irq(unsigned int,
 		       irqreturn_t (*handler)(int, void *, struct pt_regs *),
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 5bc4659..5e0f07a 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -52,6 +52,7 @@ #include <linux/list.h>
 #include <linux/compiler.h>
 #include <linux/errno.h>
 #include <linux/device.h>
+#include <linux/interrupt.h>
 
 /* File state for mmap()s on /proc/bus/pci/X/Y */
 enum pci_mmap_state {
@@ -531,6 +532,11 @@ void pci_release_regions(struct pci_dev 
 int __must_check pci_request_region(struct pci_dev *, int, const char *);
 void pci_release_region(struct pci_dev *, int);
 
+int __must_check pci_request_irq(struct pci_dev *pdev,
+		    irqreturn_t (*handler)(int, void *, struct pt_regs *),
+		    unsigned long flags, const char *name);
+void pci_free_irq(struct pci_dev *pdev);
+
 /* drivers/pci/bus.c */
 int __must_check pci_bus_alloc_resource(struct pci_bus *bus,
 			struct resource *res, resource_size_t size,

^ permalink raw reply related	[flat|nested] 140+ messages in thread

* [RFC PATCH] move aic7xxx to pci_request_irq
  2006-10-02 20:00                       ` [RFC PATCH] pci_request_irq (was [-mm patch] aic7xxx: check irq validity) Frederik Deweerdt
  2006-10-02 18:15                         ` Matthew Wilcox
@ 2006-10-02 20:07                         ` Frederik Deweerdt
  2006-10-02 18:27                           ` Matthew Wilcox
  2006-10-03  3:45                           ` Arjan van de Ven
  2006-10-02 20:11                         ` [RFC PATCH] move tg3 " Frederik Deweerdt
                                           ` (2 subsequent siblings)
  4 siblings, 2 replies; 140+ messages in thread
From: Frederik Deweerdt @ 2006-10-02 20:07 UTC (permalink / raw)
  To: Arjan van de Ven
  Cc: Matthew Wilcox, linux-scsi, Linux-Kernel,
	J.A. Magallón, Alan Cox, Andrew Morton, Jeff Garzik

Hi,

This proof-of-concept patch converts the aic7xxx drivers to use the
pci_request_irq() function.

Regards,
Frederik


diff --git a/drivers/scsi/aic7xxx/aic79xx_osm_pci.c b/drivers/scsi/aic7xxx/aic79xx_osm_pci.c
index 2001fe8..c934f30 100644
--- a/drivers/scsi/aic7xxx/aic79xx_osm_pci.c
+++ b/drivers/scsi/aic7xxx/aic79xx_osm_pci.c
@@ -341,12 +341,12 @@ ahd_pci_map_int(struct ahd_softc *ahd)
 {
 	int error;
 
-	error = request_irq(ahd->dev_softc->irq, ahd_linux_isr,
-			    IRQF_SHARED, "aic79xx", ahd);
+	error = pci_request_irq(ahd->dev_softc, ahd_linux_isr,
+			    IRQF_SHARED, "aic79xx");
 	if (!error)
 		ahd->platform_data->irq = ahd->dev_softc->irq;
 	
-	return (-error);
+	return error;
 }
 
 void
diff --git a/drivers/scsi/aic7xxx/aic7xxx_osm_pci.c b/drivers/scsi/aic7xxx/aic7xxx_osm_pci.c
index ea5687d..d5c402e 100644
--- a/drivers/scsi/aic7xxx/aic7xxx_osm_pci.c
+++ b/drivers/scsi/aic7xxx/aic7xxx_osm_pci.c
@@ -368,16 +368,14 @@ ahc_pci_map_registers(struct ahc_softc *
 	return (error);
 }
 
-int
-ahc_pci_map_int(struct ahc_softc *ahc)
+int ahc_pci_map_int(struct ahc_softc *ahc)
 {
 	int error;
 
-	error = request_irq(ahc->dev_softc->irq, ahc_linux_isr,
-			    IRQF_SHARED, "aic7xxx", ahc);
+	error = pci_request_irq(ahc->dev_softc, ahc_linux_isr, IRQF_SHARED,
+			    	"aic7xxx");
 	if (error == 0)
 		ahc->platform_data->irq = ahc->dev_softc->irq;
-	
-	return (-error);
-}
 
+	return error;
+}

^ permalink raw reply related	[flat|nested] 140+ messages in thread

* [RFC PATCH] move tg3 to pci_request_irq
  2006-10-02 20:00                       ` [RFC PATCH] pci_request_irq (was [-mm patch] aic7xxx: check irq validity) Frederik Deweerdt
  2006-10-02 18:15                         ` Matthew Wilcox
  2006-10-02 20:07                         ` [RFC PATCH] move aic7xxx to pci_request_irq Frederik Deweerdt
@ 2006-10-02 20:11                         ` Frederik Deweerdt
  2006-10-02 18:28                           ` Matthew Wilcox
  2006-10-03  7:18                           ` Arjan van de Ven
  2006-10-02 20:12                         ` [RFC PATCH] move drm " Frederik Deweerdt
  2006-10-03  3:58                         ` [RFC PATCH] pci_request_irq (was [-mm patch] aic7xxx: check irq validity) Randy Dunlap
  4 siblings, 2 replies; 140+ messages in thread
From: Frederik Deweerdt @ 2006-10-02 20:11 UTC (permalink / raw)
  To: Arjan van de Ven
  Cc: Matthew Wilcox, linux-scsi, Linux-Kernel,
	J.A. Magallón, Alan Cox, Andrew Morton, Jeff Garzik

Hi,

This proof-of-concept patch converts the tg3 driver to use the
pci_request_irq() function.

Regards,
Frederik


diff --git a/drivers/net/tg3.c b/drivers/net/tg3.c
index c25ba27..23660c6 100644
--- a/drivers/net/tg3.c
+++ b/drivers/net/tg3.c
@@ -6838,9 +6838,9 @@ restart_timer:
 
 static int tg3_request_irq(struct tg3 *tp)
 {
+	struct net_device *dev = tp->dev;
 	irqreturn_t (*fn)(int, void *, struct pt_regs *);
 	unsigned long flags;
-	struct net_device *dev = tp->dev;
 
 	if (tp->tg3_flags2 & TG3_FLG2_USING_MSI) {
 		fn = tg3_msi;
@@ -6853,7 +6853,7 @@ static int tg3_request_irq(struct tg3 *t
 			fn = tg3_interrupt_tagged;
 		flags = IRQF_SHARED | IRQF_SAMPLE_RANDOM;
 	}
-	return (request_irq(tp->pdev->irq, fn, flags, dev->name, dev));
+	return pci_request_irq(tp->pdev, fn, flags, dev->name);
 }
 
 static int tg3_test_interrupt(struct tg3 *tp)
@@ -6866,10 +6866,10 @@ static int tg3_test_interrupt(struct tg3
 
 	tg3_disable_ints(tp);
 
-	free_irq(tp->pdev->irq, dev);
+	pci_free_irq(tp->pdev);
 
-	err = request_irq(tp->pdev->irq, tg3_test_isr,
-			  IRQF_SHARED | IRQF_SAMPLE_RANDOM, dev->name, dev);
+	err = pci_request_irq(tp->pdev, tg3_test_isr, 
+			      IRQF_SHARED | IRQF_SAMPLE_RANDOM, dev->name);
 	if (err)
 		return err;
 
@@ -6897,7 +6897,7 @@ static int tg3_test_interrupt(struct tg3
 
 	tg3_disable_ints(tp);
 
-	free_irq(tp->pdev->irq, dev);
+	pci_free_irq(tp->pdev);
 
 	err = tg3_request_irq(tp);
 
@@ -6915,7 +6915,6 @@ static int tg3_test_interrupt(struct tg3
  */
 static int tg3_test_msi(struct tg3 *tp)
 {
-	struct net_device *dev = tp->dev;
 	int err;
 	u16 pci_cmd;
 
@@ -6946,7 +6945,7 @@ static int tg3_test_msi(struct tg3 *tp)
 	       "the PCI maintainer and include system chipset information.\n",
 		       tp->dev->name);
 
-	free_irq(tp->pdev->irq, dev);
+	pci_free_irq(tp->pdev);
 	pci_disable_msi(tp->pdev);
 
 	tp->tg3_flags2 &= ~TG3_FLG2_USING_MSI;
@@ -6966,7 +6965,7 @@ static int tg3_test_msi(struct tg3 *tp)
 	tg3_full_unlock(tp);
 
 	if (err)
-		free_irq(tp->pdev->irq, dev);
+		pci_free_irq(tp->pdev);
 
 	return err;
 }
@@ -7051,7 +7050,7 @@ static int tg3_open(struct net_device *d
 	tg3_full_unlock(tp);
 
 	if (err) {
-		free_irq(tp->pdev->irq, dev);
+		pci_free_irq(tp->pdev);
 		if (tp->tg3_flags2 & TG3_FLG2_USING_MSI) {
 			pci_disable_msi(tp->pdev);
 			tp->tg3_flags2 &= ~TG3_FLG2_USING_MSI;
@@ -7363,7 +7362,7 @@ #endif
 
 	tg3_full_unlock(tp);
 
-	free_irq(tp->pdev->irq, dev);
+	pci_free_irq(tp->pdev);
 	if (tp->tg3_flags2 & TG3_FLG2_USING_MSI) {
 		pci_disable_msi(tp->pdev);
 		tp->tg3_flags2 &= ~TG3_FLG2_USING_MSI;

^ permalink raw reply related	[flat|nested] 140+ messages in thread

* [RFC PATCH] move drm to pci_request_irq
  2006-10-02 20:00                       ` [RFC PATCH] pci_request_irq (was [-mm patch] aic7xxx: check irq validity) Frederik Deweerdt
                                           ` (2 preceding siblings ...)
  2006-10-02 20:11                         ` [RFC PATCH] move tg3 " Frederik Deweerdt
@ 2006-10-02 20:12                         ` Frederik Deweerdt
  2006-10-02 18:37                           ` Matthew Wilcox
                                             ` (2 more replies)
  2006-10-03  3:58                         ` [RFC PATCH] pci_request_irq (was [-mm patch] aic7xxx: check irq validity) Randy Dunlap
  4 siblings, 3 replies; 140+ messages in thread
From: Frederik Deweerdt @ 2006-10-02 20:12 UTC (permalink / raw)
  To: Arjan van de Ven
  Cc: Matthew Wilcox, linux-scsi, Linux-Kernel,
	J.A. Magallón, Alan Cox, Andrew Morton, Jeff Garzik

Hi,

This proof-of-concept patch converts the drm driver to use the
pci_request_irq() function.

Regards,
Frederik



diff --git a/drivers/char/drm/drm_drv.c b/drivers/char/drm/drm_drv.c
index b366c5b..5b000cd 100644
--- a/drivers/char/drm/drm_drv.c
+++ b/drivers/char/drm/drm_drv.c
@@ -234,6 +234,8 @@ int drm_lastclose(drm_device_t * dev)
 	}
 	mutex_unlock(&dev->struct_mutex);
 
+	pci_set_drvdata(dev, NULL);
+
 	DRM_DEBUG("lastclose completed\n");
 	return 0;
 }
diff --git a/drivers/char/drm/drm_irq.c b/drivers/char/drm/drm_irq.c
index 4553a3a..5dd12cb 100644
--- a/drivers/char/drm/drm_irq.c
+++ b/drivers/char/drm/drm_irq.c
@@ -132,8 +132,10 @@ static int drm_irq_install(drm_device_t 
 	if (drm_core_check_feature(dev, DRIVER_IRQ_SHARED))
 		sh_flags = IRQF_SHARED;
 
-	ret = request_irq(dev->irq, dev->driver->irq_handler,
-			  sh_flags, dev->devname, dev);
+	pci_set_drvdata(dev->pdev, dev);
+
+	ret = pci_request_irq(dev->pdev, dev->driver->irq_handler,
+			  sh_flags, dev->devname);
 	if (ret < 0) {
 		mutex_lock(&dev->struct_mutex);
 		dev->irq_enabled = 0;
@@ -173,7 +175,7 @@ int drm_irq_uninstall(drm_device_t * dev
 
 	dev->driver->irq_uninstall(dev);
 
-	free_irq(dev->irq, dev);
+	pci_free_irq(dev->pdev);
 
 	return 0;
 }

^ permalink raw reply related	[flat|nested] 140+ messages in thread

* Re: [RFC PATCH] move drm to pci_request_irq
  2006-10-02 20:12                         ` [RFC PATCH] move drm " Frederik Deweerdt
  2006-10-02 18:37                           ` Matthew Wilcox
@ 2006-10-02 20:36                           ` Alan Cox
  2006-10-02 22:26                             ` Frederik Deweerdt
  2006-10-02 23:54                           ` Dave Airlie
  2 siblings, 1 reply; 140+ messages in thread
From: Alan Cox @ 2006-10-02 20:36 UTC (permalink / raw)
  To: Frederik Deweerdt
  Cc: Arjan van de Ven, Matthew Wilcox, linux-scsi, Linux-Kernel,
	J.A. Magallón, Andrew Morton, Jeff Garzik

Ar Llu, 2006-10-02 am 20:12 +0000, ysgrifennodd Frederik Deweerdt:
> Hi,
> 
> This proof-of-concept patch converts the drm driver to use the
> pci_request_irq() function.

0 isn't invalid - it means no IRQ was assigned so wants a different
message.


^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: [RFC PATCH] move aic7xxx to pci_request_irq
  2006-10-02 18:27                           ` Matthew Wilcox
@ 2006-10-02 21:02                             ` Frederik Deweerdt
  0 siblings, 0 replies; 140+ messages in thread
From: Frederik Deweerdt @ 2006-10-02 21:02 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Arjan van de Ven, linux-scsi, Linux-Kernel,
	J.A. Magallón, Alan Cox, Andrew Morton, Jeff Garzik

On Mon, Oct 02, 2006 at 12:27:44PM -0600, Matthew Wilcox wrote:
> On Mon, Oct 02, 2006 at 08:07:03PM +0000, Frederik Deweerdt wrote:
> > +++ b/drivers/scsi/aic7xxx/aic79xx_osm_pci.c
> > @@ -341,12 +341,12 @@ ahd_pci_map_int(struct ahd_softc *ahd)
> >  {
> >  	int error;
> >  
> > -	error = request_irq(ahd->dev_softc->irq, ahd_linux_isr,
> > -			    IRQF_SHARED, "aic79xx", ahd);
> > +	error = pci_request_irq(ahd->dev_softc, ahd_linux_isr,
> > +			    IRQF_SHARED, "aic79xx");
> >  	if (!error)
> >  		ahd->platform_data->irq = ahd->dev_softc->irq;
> >  	
> > -	return (-error);
> > +	return error;
> 
> Seems unsafe to me.
It is. That change slipped into the patch; I didn't mean to send it to
the list :(. Please ignore it.
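
For anyone wondering why the sign matters, here is a small user-space
sketch. It assumes, as the negation in the original code suggests, that
the layers above expect a positive errno and negate it again before
returning to the kernel. All demo_ names are hypothetical and do not
come from the aic7xxx sources.

```c
#include <assert.h>
#include <errno.h>

/* Stand-in for request_irq(): 0 on success, negative errno on failure. */
static int demo_request_irq(int should_fail)
{
	return should_fail ? -EBUSY : 0;
}

/* Shape before the hunk: negate, so the caller gets a positive errno. */
static int demo_map_int_old(int should_fail)
{
	return -demo_request_irq(should_fail);
}

/* Shape after the hunk: the negative errno is passed straight through. */
static int demo_map_int_new(int should_fail)
{
	return demo_request_irq(should_fail);
}

/*
 * Hypothetical outer layer that still assumes positive errno values
 * and negates once more before returning to the kernel.
 */
static int demo_probe_result(int error)
{
	return -error;
}
```

Under that assumption the old shape ends up as -EBUSY at the boundary,
while the new shape ends up as +EBUSY, which a "less than zero" check
would not treat as a failure.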
> 
> > -	
> > -	return (-error);
> > -}
> >  
> > +	return error;
> > +}
> 
> Ditto.
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
> 

^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: [RFC PATCH] move tg3 to pci_request_irq
  2006-10-02 18:28                           ` Matthew Wilcox
@ 2006-10-02 21:04                             ` Frederik Deweerdt
  0 siblings, 0 replies; 140+ messages in thread
From: Frederik Deweerdt @ 2006-10-02 21:04 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Arjan van de Ven, linux-scsi, Linux-Kernel,
	J.A. Magallón, Alan Cox, Andrew Morton, Jeff Garzik

On Mon, Oct 02, 2006 at 12:28:47PM -0600, Matthew Wilcox wrote:
> On Mon, Oct 02, 2006 at 08:11:34PM +0000, Frederik Deweerdt wrote:
> > @@ -6838,9 +6838,9 @@ restart_timer:
> >  
> >  static int tg3_request_irq(struct tg3 *tp)
> >  {
> > +	struct net_device *dev = tp->dev;
> >  	irqreturn_t (*fn)(int, void *, struct pt_regs *);
> >  	unsigned long flags;
> > -	struct net_device *dev = tp->dev;
> >  
> >  	if (tp->tg3_flags2 & TG3_FLG2_USING_MSI) {
> >  		fn = tg3_msi;
> 
> Is there any reason for this noise?
You mean, besides my awkwardness ? ;)
> 

^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: [RFC PATCH] move drm to pci_request_irq
  2006-10-02 18:37                           ` Matthew Wilcox
@ 2006-10-02 21:07                             ` Frederik Deweerdt
  0 siblings, 0 replies; 140+ messages in thread
From: Frederik Deweerdt @ 2006-10-02 21:07 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Arjan van de Ven, linux-scsi, Linux-Kernel,
	J.A. Magallón, Alan Cox, Andrew Morton, Jeff Garzik

On Mon, Oct 02, 2006 at 12:37:49PM -0600, Matthew Wilcox wrote:
> On Mon, Oct 02, 2006 at 08:12:29PM +0000, Frederik Deweerdt wrote:
> >  
> > +	pci_set_drvdata(dev, NULL);
> > +
> >  	DRM_DEBUG("lastclose completed\n");
> 
> Not necessary.  pci_devs are allocated initialised to 0.
Actually, this is the exit path; I felt it would be safer to set it to
NULL before freeing it.
> 
> > @@ -132,8 +132,10 @@ static int drm_irq_install(drm_device_t 
> >  	if (drm_core_check_feature(dev, DRIVER_IRQ_SHARED))
> >  		sh_flags = IRQF_SHARED;
> >  
> > -	ret = request_irq(dev->irq, dev->driver->irq_handler,
> > -			  sh_flags, dev->devname, dev);
> > +	pci_set_drvdata(dev->pdev, dev);
> > +
> > +	ret = pci_request_irq(dev->pdev, dev->driver->irq_handler,
> > +			  sh_flags, dev->devname);
> 
> This seems like the wrong place to be setting the pci_drvdata.  It
> should probably be done in each driver.  But then, requesting the IRQ
> should also be done by each driver.  You've dragged us into the "wow,
> what a mess DRI is" black hole here, I'm afraid.
I must admit that I had no idea where to initialize it. Do you have a
better place in mind?
> 

^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: [RFC PATCH] pci_request_irq (was [-mm patch] aic7xxx: check irq validity)
  2006-10-02 18:15                         ` Matthew Wilcox
@ 2006-10-02 21:09                           ` Frederik Deweerdt
  0 siblings, 0 replies; 140+ messages in thread
From: Frederik Deweerdt @ 2006-10-02 21:09 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Arjan van de Ven, linux-scsi, Linux-Kernel,
	J.A. Magallón, Alan Cox, Andrew Morton, Jeff Garzik

On Mon, Oct 02, 2006 at 12:15:22PM -0600, Matthew Wilcox wrote:
> On Mon, Oct 02, 2006 at 08:00:48PM +0000, Frederik Deweerdt wrote:
> >  /**
> > + * pci_request_irq - Reserve an IRQ for a PCI device
> > + * @pdev: The PCI device whose irq is to be reserved
> > + * handler: The interrupt handler function,
> 
> > + * pci_get_drvdata(pdev) shall be passed as an argument to that function
> 
> I don't think you can (or should) do this.  Move it to the body of the
> comment below.
OK, thanks for pointing this out, will do.
> 
> > + * @flags: The flags to be passed to request_irq()
> > + * @name: The name of the device to be associated with the irq
> > + *
> > + * Returns 0 on success, or a negative value on error.  A warning
> > + * message is also printed on failure.
> > + */
> > +int pci_request_irq(struct pci_dev *pdev,
> > +		    irqreturn_t (*handler)(int, void *, struct pt_regs *),
> > +		    unsigned long flags, const char *name)
> > +{
> > +	int rc;
> > +	const char *actual_name = name;
> > +
> > +	rc = is_irq_valid(pdev->irq);
> > +	if (!rc) {
> > +		dev_printk(KERN_ERR, &pdev->dev, "invalid irq #%d\n", pdev->irq);
> > +		return -EINVAL;
> > +	}
> 
> Why is that more readable than
> 
> 	if (!is_irq_valid(pdev->irq)) {
> 		dev_err(&pdev->dev, "invalid irq #%d\n", pdev->irq);
> 		return -EINVAL;
> 	}
> 
Better too.
> > +	if (!actual_name)
> > +		actual_name = pci_name(pdev);
> > +
> > +	return request_irq(pdev->irq, handler, flags | IRQF_SHARED,
> > +			   actual_name, pci_get_drvdata(pdev));
> 
> The driver name is a far more common usage than the pci_name.
> 
> 	return request_irq(pdev->irq, handler, flags | IRQF_SHARED,
> 			name ? name : pdev->driver->name,
> 			pci_get_drvdata(pdev));
OK, thanks for the feedback,
Frederik
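
Put together, the two suggestions above (the condensed validity check
and the fallback to the driver name) could be sketched in user space
roughly as follows. The demo_ types and functions are hypothetical
stand-ins for struct pci_dev, struct pci_driver and pci_request_irq();
the sketch only records which name would be handed to request_irq().

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the kernel structures. */
struct demo_pci_driver {
	const char *name;
};

struct demo_pci_dev {
	unsigned int irq;
	struct demo_pci_driver *driver;
};

/* Generic default from the patch: only irq 0 is rejected. */
static int demo_is_irq_valid(unsigned int irq)
{
	return irq != 0;
}

/* Stores in *used_name the name that would be passed to request_irq(). */
static int demo_pci_request_irq(struct demo_pci_dev *pdev, const char *name,
				const char **used_name)
{
	if (!demo_is_irq_valid(pdev->irq))
		return -EINVAL;

	/* Fall back to the driver name when the caller passes NULL. */
	*used_name = name ? name : pdev->driver->name;
	return 0;
}
```

This assumes pdev->driver is already set when the function is called,
which the sketch does not check.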
> 

^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: [RFC PATCH] move drm to pci_request_irq
  2006-10-02 20:36                           ` Alan Cox
@ 2006-10-02 22:26                             ` Frederik Deweerdt
  0 siblings, 0 replies; 140+ messages in thread
From: Frederik Deweerdt @ 2006-10-02 22:26 UTC (permalink / raw)
  To: Alan Cox
  Cc: Arjan van de Ven, Matthew Wilcox, linux-scsi, Linux-Kernel,
	J.A. Magallón, Andrew Morton, Jeff Garzik

On Mon, Oct 02, 2006 at 09:36:38PM +0100, Alan Cox wrote:
> Ar Llu, 2006-10-02 am 20:12 +0000, ysgrifennodd Frederik Deweerdt:
> > Hi,
> > 
> > This proof-of-concept patch converts the drm driver to use the
> > pci_request_irq() function.
> 
> 0 isn't invalid - it means no IRQ was assigned so wants a different
> message.
> 
I understand, what about:

("No usable irq line was found (got #%d)\n", irqno)

This is generic enough, so that if on some arches a given irq (other
than 0) is invalid, the message still makes sense.
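
As a rough user-space sketch of how the check and the message would fit
together (demo_check_irq() is a hypothetical helper; in the kernel the
text would go through dev_err() rather than into a buffer):

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Generic default from the patch: irq 0 is the only value rejected. */
static int demo_is_irq_valid(unsigned int irq)
{
	return irq ? 1 : 0;
}

/*
 * Hypothetical helper: writes the proposed message into buf when the
 * irq is unusable and returns -EINVAL; otherwise returns 0 and leaves
 * buf as an empty string.
 */
static int demo_check_irq(unsigned int irq, char *buf, size_t len)
{
	if (len)
		buf[0] = '\0';
	if (demo_is_irq_valid(irq))
		return 0;
	snprintf(buf, len, "No usable irq line was found (got #%u)\n", irq);
	return -EINVAL;
}
```

An arch that overrides is_irq_valid() would get the same wording for
whatever irq numbers it rejects.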

^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: [RFC PATCH] move drm to pci_request_irq
  2006-10-02 20:12                         ` [RFC PATCH] move drm " Frederik Deweerdt
  2006-10-02 18:37                           ` Matthew Wilcox
  2006-10-02 20:36                           ` Alan Cox
@ 2006-10-02 23:54                           ` Dave Airlie
  2006-10-03  7:17                             ` Frederik Deweerdt
  2 siblings, 1 reply; 140+ messages in thread
From: Dave Airlie @ 2006-10-02 23:54 UTC (permalink / raw)
  To: Frederik Deweerdt
  Cc: Arjan van de Ven, Matthew Wilcox, linux-scsi, Linux-Kernel,
	J.A. Magallón, Alan Cox, Andrew Morton, Jeff Garzik

On 10/3/06, Frederik Deweerdt <deweerdt@free.fr> wrote:
> Hi,
>
> This proof-of-concept patch converts the drm driver to use the
> pci_request_irq() function.

NAK.
Wow nice CC'list and no DRM maintainer in sight :-)

This will break framebuffer drivers. The DRM is not a proper PCI
device driver, as we don't have PCI device sharing. Take a look at the
gpu-2.6.git tree on kernel.org for the "correct" solution, which needs
more attention before merging.

Dave.
>
> Regards,
> Frederik
>
>
>
> diff --git a/drivers/char/drm/drm_drv.c b/drivers/char/drm/drm_drv.c
> index b366c5b..5b000cd 100644
> --- a/drivers/char/drm/drm_drv.c
> +++ b/drivers/char/drm/drm_drv.c
> @@ -234,6 +234,8 @@ int drm_lastclose(drm_device_t * dev)
>         }
>         mutex_unlock(&dev->struct_mutex);
>
> +       pci_set_drvdata(dev, NULL);
> +
>         DRM_DEBUG("lastclose completed\n");
>         return 0;
>  }
> diff --git a/drivers/char/drm/drm_irq.c b/drivers/char/drm/drm_irq.c
> index 4553a3a..5dd12cb 100644
> --- a/drivers/char/drm/drm_irq.c
> +++ b/drivers/char/drm/drm_irq.c
> @@ -132,8 +132,10 @@ static int drm_irq_install(drm_device_t
>         if (drm_core_check_feature(dev, DRIVER_IRQ_SHARED))
>                 sh_flags = IRQF_SHARED;
>
> -       ret = request_irq(dev->irq, dev->driver->irq_handler,
> -                         sh_flags, dev->devname, dev);
> +       pci_set_drvdata(dev->pdev, dev);
> +
> +       ret = pci_request_irq(dev->pdev, dev->driver->irq_handler,
> +                         sh_flags, dev->devname);
>         if (ret < 0) {
>                 mutex_lock(&dev->struct_mutex);
>                 dev->irq_enabled = 0;
> @@ -173,7 +175,7 @@ int drm_irq_uninstall(drm_device_t * dev
>
>         dev->driver->irq_uninstall(dev);
>
> -       free_irq(dev->irq, dev);
> +       pci_free_irq(dev->pdev);
>
>         return 0;
>  }

^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: [RFC PATCH] move aic7xxx to pci_request_irq
  2006-10-02 20:07                         ` [RFC PATCH] move aic7xxx to pci_request_irq Frederik Deweerdt
  2006-10-02 18:27                           ` Matthew Wilcox
@ 2006-10-03  3:45                           ` Arjan van de Ven
  1 sibling, 0 replies; 140+ messages in thread
From: Arjan van de Ven @ 2006-10-03  3:45 UTC (permalink / raw)
  To: Frederik Deweerdt
  Cc: Matthew Wilcox, linux-scsi, Linux-Kernel,
	J.A. Magallón, Alan Cox, Andrew Morton, Jeff Garzik

On Mon, 2006-10-02 at 20:07 +0000, Frederik Deweerdt wrote:
> Hi,
> 
> This proof-of-concept patch converts the aic7xxx drivers to use the
> pci_request_irq() function.
> 
> Regards,
> Frederik
> 
> 
> diff --git a/drivers/scsi/aic7xxx/aic79xx_osm_pci.c b/drivers/scsi/aic7xxx/aic79xx_osm_pci.c
> index 2001fe8..c934f30 100644
> --- a/drivers/scsi/aic7xxx/aic79xx_osm_pci.c
> +++ b/drivers/scsi/aic7xxx/aic79xx_osm_pci.c
> @@ -341,12 +341,12 @@ ahd_pci_map_int(struct ahd_softc *ahd)
>  {
>  	int error;
>  
> -	error = request_irq(ahd->dev_softc->irq, ahd_linux_isr,
> -			    IRQF_SHARED, "aic79xx", ahd);
> +	error = pci_request_irq(ahd->dev_softc, ahd_linux_isr,
> +			    IRQF_SHARED, "aic79xx");
>  	if (!error)
>  		ahd->platform_data->irq = ahd->dev_softc->irq;
>  	
> -	return (-error);
> +	return error;
>  }

might as well kill this entire wrapper...



^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: [RFC PATCH] pci_request_irq (was [-mm patch] aic7xxx: check irq validity)
  2006-10-02 20:00                       ` [RFC PATCH] pci_request_irq (was [-mm patch] aic7xxx: check irq validity) Frederik Deweerdt
                                           ` (3 preceding siblings ...)
  2006-10-02 20:12                         ` [RFC PATCH] move drm " Frederik Deweerdt
@ 2006-10-03  3:58                         ` Randy Dunlap
  4 siblings, 0 replies; 140+ messages in thread
From: Randy Dunlap @ 2006-10-03  3:58 UTC (permalink / raw)
  To: Frederik Deweerdt
  Cc: Arjan van de Ven, Matthew Wilcox, linux-scsi, Linux-Kernel,
	J.A. Magallón, Alan Cox, Andrew Morton, Jeff Garzik

On Mon, 2 Oct 2006 20:00:48 +0000 Frederik Deweerdt wrote:

> Hi all,
> 
> I've tried to summarize the different proposals made by Jeff Garzik,
> Matthew Wilcox and Arjan van de Ven in the "[-mm patch] aic7xxx: check
> irq validity" thread. I've also added:
> - some kerneldoc

The kernel-doc needs some repair -- see below.

> - renamed valid_irq to is_irq_valid()
> - added pci_free_irq().
> 
> I'll send a follow-up patch showing the implied modifications for the
> following - semi-randomly chosen :) - drivers: aic7xxx, aic79xx, tg3
> and drm.
> 
> Regards,
> Frederik
> 
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index a544997..ae20a3a 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -15,6 +15,7 @@ #include <linux/init.h>
>  #include <linux/pci.h>
>  #include <linux/module.h>
>  #include <linux/spinlock.h>
> +#include <linux/interrupt.h>
>  #include <linux/string.h>
>  #include <asm/dma.h>	/* isa_dma_bridge_buggy */
>  #include "pci.h"
> @@ -810,6 +811,49 @@ err_out:
>  }
>  
>  /**
> + * pci_request_irq - Reserve an IRQ for a PCI device
> + * @pdev: The PCI device whose irq is to be reserved
> + * handler: The interrupt handler function,

 * @handler: ...

> + * pci_get_drvdata(pdev) shall be passed as an argument to that function
> + * @flags: The flags to be passed to request_irq()
> + * @name: The name of the device to be associated with the irq
> + *
> + * Returns 0 on success, or a negative value on error.  A warning
> + * message is also printed on failure.
> + */
> +int pci_request_irq(struct pci_dev *pdev,
> +		    irqreturn_t (*handler)(int, void *, struct pt_regs *),
> +		    unsigned long flags, const char *name)
> +{
> +	int rc;
> +	const char *actual_name = name;
> +
> +	rc = is_irq_valid(pdev->irq);
> +	if (!rc) {
> +		dev_printk(KERN_ERR, &pdev->dev, "invalid irq #%d\n", pdev->irq);
> +		return -EINVAL;
> +	}
> +
> +	if (!actual_name)
> +		actual_name = pci_name(pdev);
> +
> +	return request_irq(pdev->irq, handler, flags | IRQF_SHARED,
> +			   actual_name, pci_get_drvdata(pdev));
> +}
> +EXPORT_SYMBOL(pci_request_irq);
> +
> +/**
> + * pci_free_irq - releases the interrupt line reserved to the PCI
> + * device pointed by @pdev 

The first line is function name and <<short>> function description.
It cannot extend more than one line (combined).
If you want to use more text for function description,
you can do so after the list of parameters.  See example below.

> + * @pdev: the PCI device whose interrupt is to be freed
 *
 * This froofroo_irq function only does this on odd phases of
 * the moon.

> + */
> +void pci_free_irq(struct pci_dev *pdev)
> +{
> +	free_irq(pdev->irq, pci_get_drvdata(pdev));
> +}
> +EXPORT_SYMBOL(pci_free_irq);
> +
> +/**
>   * pci_set_master - enables bus-mastering for device dev
>   * @dev: the PCI device to enable
>   *

---
~Randy
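
To make the layout described above concrete, here is a small
self-contained sketch of a kernel-doc comment in that shape: one summary
line, then the parameters, then a blank comment line and the longer
description. demo_dev and demo_release_irq() are hypothetical user-space
stand-ins, there only so the comment has something to document.

```c
#include <assert.h>

/* User-space stand-in so the comment has something to document. */
struct demo_dev {
	int irq_held;
};

/**
 * demo_release_irq - release the interrupt line held by a device
 * @dev: the device whose interrupt is to be released
 *
 * Longer description goes here, after the parameter list.  It may
 * span as many lines as needed, unlike the summary line above.
 */
static void demo_release_irq(struct demo_dev *dev)
{
	dev->irq_held = 0;
}
```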

^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: [RFC PATCH] move drm to pci_request_irq
  2006-10-02 23:54                           ` Dave Airlie
@ 2006-10-03  7:17                             ` Frederik Deweerdt
  0 siblings, 0 replies; 140+ messages in thread
From: Frederik Deweerdt @ 2006-10-03  7:17 UTC (permalink / raw)
  To: Dave Airlie
  Cc: Arjan van de Ven, Matthew Wilcox, linux-scsi, Linux-Kernel,
	J.A. Magallón, Alan Cox, Andrew Morton, Jeff Garzik

On Tue, Oct 03, 2006 at 09:54:07AM +1000, Dave Airlie wrote:
> On 10/3/06, Frederik Deweerdt <deweerdt@free.fr> wrote:
> >Hi,
> >
> >This proof-of-concept patch converts the drm driver to use the
> >pci_request_irq() function.
> 
> NAK.
> Wow nice CC'list and no DRM maintainer in sight :-)
:), this was just meant as an illustration of the needed modifications
to use pci_request_irq.
> 
> This will break framebuffer drivers. The DRM is not a proper PCI
> device driver, as we don't have PCI device sharing. Take a look at the
> gpu-2.6.git tree on kernel.org for the "correct" solution, which needs
> more attention before merging.
I'll look, thanks,
Frederik
> 
> Dave.
> >
> >Regards,
> >Frederik
> >
> >
> >
> >diff --git a/drivers/char/drm/drm_drv.c b/drivers/char/drm/drm_drv.c
> >index b366c5b..5b000cd 100644
> >--- a/drivers/char/drm/drm_drv.c
> >+++ b/drivers/char/drm/drm_drv.c
> >@@ -234,6 +234,8 @@ int drm_lastclose(drm_device_t * dev)
> >        }
> >        mutex_unlock(&dev->struct_mutex);
> >
> >+       pci_set_drvdata(dev, NULL);
> >+
> >        DRM_DEBUG("lastclose completed\n");
> >        return 0;
> > }
> >diff --git a/drivers/char/drm/drm_irq.c b/drivers/char/drm/drm_irq.c
> >index 4553a3a..5dd12cb 100644
> >--- a/drivers/char/drm/drm_irq.c
> >+++ b/drivers/char/drm/drm_irq.c
> >@@ -132,8 +132,10 @@ static int drm_irq_install(drm_device_t
> >        if (drm_core_check_feature(dev, DRIVER_IRQ_SHARED))
> >                sh_flags = IRQF_SHARED;
> >
> >-       ret = request_irq(dev->irq, dev->driver->irq_handler,
> >-                         sh_flags, dev->devname, dev);
> >+       pci_set_drvdata(dev->pdev, dev);
> >+
> >+       ret = pci_request_irq(dev->pdev, dev->driver->irq_handler,
> >+                         sh_flags, dev->devname);
> >        if (ret < 0) {
> >                mutex_lock(&dev->struct_mutex);
> >                dev->irq_enabled = 0;
> >@@ -173,7 +175,7 @@ int drm_irq_uninstall(drm_device_t * dev
> >
> >        dev->driver->irq_uninstall(dev);
> >
> >-       free_irq(dev->irq, dev);
> >+       pci_free_irq(dev->pdev);
> >
> >        return 0;
> > }
> >
> 


* Re: [RFC PATCH] move tg3 to pci_request_irq
  2006-10-02 20:11                         ` [RFC PATCH] move tg3 " Frederik Deweerdt
  2006-10-02 18:28                           ` Matthew Wilcox
@ 2006-10-03  7:18                           ` Arjan van de Ven
  1 sibling, 0 replies; 140+ messages in thread
From: Arjan van de Ven @ 2006-10-03  7:18 UTC (permalink / raw)
  To: Frederik Deweerdt
  Cc: Matthew Wilcox, linux-scsi, Linux-Kernel,
	J.A. Magallón, Alan Cox, Andrew Morton, Jeff Garzik

On Mon, 2006-10-02 at 20:11 +0000, Frederik Deweerdt wrote:
> Hi,
> 
> This proof-of-concept patch converts the tg3 driver to use the
> pci_request_irq() function.
> 
> Regards,
> Frederik
> 
> 
> diff --git a/drivers/net/tg3.c b/drivers/net/tg3.c
> index c25ba27..23660c6 100644
> --- a/drivers/net/tg3.c
> +++ b/drivers/net/tg3.c
> @@ -6838,9 +6838,9 @@ restart_timer:
>  
>  static int tg3_request_irq(struct tg3 *tp)
>  {
> +	struct net_device *dev = tp->dev;
>  	irqreturn_t (*fn)(int, void *, struct pt_regs *);
>  	unsigned long flags;
> -	struct net_device *dev = tp->dev;
>  
>  	if (tp->tg3_flags2 & TG3_FLG2_USING_MSI) {
>  		fn = tg3_msi;
> @@ -6853,7 +6853,7 @@ static int tg3_request_irq(struct tg3 *t
>  			fn = tg3_interrupt_tagged;
>  		flags = IRQF_SHARED | IRQF_SAMPLE_RANDOM;
>  	}
> -	return (request_irq(tp->pdev->irq, fn, flags, dev->name, dev));
> +	return pci_request_irq(tp->pdev, fn, flags, dev->name);

Since pci_request_irq() sets IRQF_SHARED anyway, you might as well drop
it from the flags above.
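
The remark above can be illustrated with a user-space mock. Everything
in it is an assumption made for illustration: the thread only shows the
converted call sites, so the body of pci_request_irq(), the recording
request_irq(), the simplified struct pci_dev and the handler type are
hypothetical stand-ins, not the RFC's implementation. The sketch assumes
the helper ORs in IRQF_SHARED itself and passes the PCI driver data as
dev_id, which is what the call sites and the remark suggest.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative flag values for this mock only. */
#define IRQF_SAMPLE_RANDOM 0x00000040UL
#define IRQF_SHARED        0x00000080UL

/* Simplified handler type; the real one of that era took more arguments. */
typedef int (*irq_handler_fn)(int irq, void *dev_id);

/* Hypothetical, stripped-down stand-in for the kernel's struct pci_dev. */
struct pci_dev {
	unsigned int irq;
	void *driver_data;
};

/* Records what the most recent mock request_irq() call received. */
struct irq_record {
	unsigned int irq;
	unsigned long flags;
	void *dev_id;
};
struct irq_record last_request;

void pci_set_drvdata(struct pci_dev *pdev, void *data)
{
	pdev->driver_data = data;
}

void *pci_get_drvdata(struct pci_dev *pdev)
{
	return pdev->driver_data;
}

/* Mock: stores its arguments and reports success. */
int request_irq(unsigned int irq, irq_handler_fn handler,
		unsigned long flags, const char *name, void *dev_id)
{
	(void)handler;
	(void)name;
	last_request.irq = irq;
	last_request.flags = flags;
	last_request.dev_id = dev_id;
	return 0;
}

/* Assumed shape of the helper: always shared, dev_id taken from drvdata. */
int pci_request_irq(struct pci_dev *pdev, irq_handler_fn handler,
		    unsigned long flags, const char *name)
{
	return request_irq(pdev->irq, handler, flags | IRQF_SHARED, name,
			   pci_get_drvdata(pdev));
}
```

Under that assumption, a caller that passes IRQF_SAMPLE_RANDOM alone ends
up with the same effective flags as one that also passes IRQF_SHARED,
which is why the explicit flag in tg3_request_irq() would be redundant.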




* Re: 2.6.18-mm2 - oops in cache_alloc_refill()
  2006-10-02 17:52             ` 2.6.18-mm2 - oops in cache_alloc_refill() Jean Tourrilhes
  2006-10-02 19:57               ` Valdis.Kletnieks
@ 2006-10-03 15:58               ` Samuel Tardieu
  2006-10-03 16:34                 ` Jean Tourrilhes
  1 sibling, 1 reply; 140+ messages in thread
From: Samuel Tardieu @ 2006-10-03 15:58 UTC (permalink / raw)
  To: jt; +Cc: Valdis.Kletnieks, John W. Linville, linux-kernel, netdev

>>>>> "Jean" == Jean Tourrilhes <jt@hpl.hp.com> writes:

Jean> @@ -2500,9 +2501,9 @@ static int orinoco_hw_get_essid(struct o
Jean>  	len = le16_to_cpu(essidbuf.len);
Jean>  	BUG_ON(len > IW_ESSID_MAX_SIZE);
Jean>  
Jean> -	memset(buf, 0, IW_ESSID_MAX_SIZE+1);
Jean> +	memset(buf, 0, IW_ESSID_MAX_SIZE);
Jean>  	memcpy(buf, p, len);
Jean> -	buf[len] = '\0';
Jean> +	err = len;

Jean,

Something bugs me here:

  - either buf is supposed to be a nul-terminated string, in which
    case if p is IW_ESSID_MAX_SIZE long there may be a bug (no '\0' at
    the end of buf)

  - or buf is not supposed to be nul-terminated and the length
    value will always be used, in which case the memset() looks
    useless

I suggest that you revert the memset() to IW_ESSID_MAX_SIZE+1 so that
the last byte is cleared as well. Or am I missing something?

 Sam
-- 
Samuel Tardieu -- sam@rfc1149.net -- http://www.rfc1149.net/



* Re: 2.6.18-mm2 - oops in cache_alloc_refill()
  2006-10-03 15:58               ` Samuel Tardieu
@ 2006-10-03 16:34                 ` Jean Tourrilhes
  2006-10-03 16:45                   ` Samuel Tardieu
  2006-10-05 22:37                   ` Pavel Roskin
  0 siblings, 2 replies; 140+ messages in thread
From: Jean Tourrilhes @ 2006-10-03 16:34 UTC (permalink / raw)
  To: Samuel Tardieu; +Cc: Pavel Roskin, John W. Linville, linux-kernel, netdev

On Tue, Oct 03, 2006 at 05:58:31PM +0200, Samuel Tardieu wrote:
> >>>>> "Jean" == Jean Tourrilhes <jt@hpl.hp.com> writes:
> 
> Jean> @@ -2500,9 +2501,9 @@ static int orinoco_hw_get_essid(struct o
> Jean>  	len = le16_to_cpu(essidbuf.len);
> Jean>  	BUG_ON(len > IW_ESSID_MAX_SIZE);
> Jean>  
> Jean> -	memset(buf, 0, IW_ESSID_MAX_SIZE+1);
> Jean> +	memset(buf, 0, IW_ESSID_MAX_SIZE);
> Jean>  	memcpy(buf, p, len);
> Jean> -	buf[len] = '\0';
> Jean> +	err = len;
> 
> Jean,
> 
> something bugs me here:
> 
>   - either buf is supposed to be a nul-terminated string, in which
>     case if p is IW_ESSID_MAX_SIZE long there may be a bug (no '\0' at
>     the end of buf)

	The ESSID can be up to 32 chars long, so we need the full
buffer size.

>   - or buf is not supposed to be nul-terminated and the length
>     value will always be used, in which case the memset() looks
>     useless

	Yes, it is entirely useless, but not incorrect. Note that the
code was not very efficient to start with: the last char of the string
was set to NUL twice.
	I don't really want to overstep my authority here; my goal
was to minimise the changes. Pavel will have to clean up my mess, so I
don't want to change things too much.

> I suggest that you revert the memset() to IW_ESSID_MAX_SIZE+1 so that
> the last byte is cleared as well. Or am I missing something?

	No, that would bring back the slab/memory overflow we are
trying to get rid of.

>  Sam
> -- 
> Samuel Tardieu -- sam@rfc1149.net -- http://www.rfc1149.net/

	Strange, this name reminds me of someone. Must be a previous life ;-)

	A+

	Jean
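
The constraint described in this reply can be sketched in user space.
This is a simplified stand-in, not the orinoco driver code: copy_essid()
and its parameters are hypothetical, and IW_ESSID_MAX_SIZE is assumed to
be 32, as stated above. The destination holds exactly IW_ESSID_MAX_SIZE
bytes, so the function never writes a terminator and returns the length
instead.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define IW_ESSID_MAX_SIZE 32	/* assumed, per "up to 32 chars" above */

/*
 * Copy an ESSID of `len` bytes into a buffer of exactly
 * IW_ESSID_MAX_SIZE bytes and return the length. The result is NOT
 * NUL-terminated when len == IW_ESSID_MAX_SIZE, so callers have to
 * carry the returned length around.
 */
int copy_essid(char buf[IW_ESSID_MAX_SIZE], const char *p, size_t len)
{
	assert(len <= IW_ESSID_MAX_SIZE);	/* stands in for BUG_ON() */

	/*
	 * Clearing IW_ESSID_MAX_SIZE + 1 bytes here would write one byte
	 * past the end of buf, which is the overflow being removed.
	 */
	memset(buf, 0, IW_ESSID_MAX_SIZE);
	memcpy(buf, p, len);
	return (int)len;
}
```

The memset() is redundant for callers that always honour the returned
length, as noted above, but it stays within bounds and keeps the unused
tail of the buffer zeroed.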


* Re: 2.6.18-mm2 - oops in cache_alloc_refill()
  2006-10-03 16:34                 ` Jean Tourrilhes
@ 2006-10-03 16:45                   ` Samuel Tardieu
  2006-10-03 17:07                     ` Jean Tourrilhes
  2006-10-05 22:37                   ` Pavel Roskin
  1 sibling, 1 reply; 140+ messages in thread
From: Samuel Tardieu @ 2006-10-03 16:45 UTC (permalink / raw)
  To: Jean Tourrilhes; +Cc: Pavel Roskin, John W. Linville, linux-kernel, netdev

On  3/10, Jean Tourrilhes wrote:

| > I suggest that you revert the memset() to IW_ESSID_MAX_SIZE+1 so that
| > the last byte is cleared as well. Or am I missing something?
| 
| No, that would bring back the slab/memory overflow we are
| trying to get rid of.

Then I am puzzled by the function declaration:

static int orinoco_hw_get_essid(struct orinoco_private *priv, int *active,
                                char buf[IW_ESSID_MAX_SIZE+1])

Do you mean that this function is called with a buf parameter which
doesn't have the expected size? (as far as the function declaration is
concerned) Shouldn't the declaration be changed to

static int orinoco_hw_get_essid(struct orinoco_private *priv, int *active,
                                char buf[IW_ESSID_MAX_SIZE])

then to reflect the reality? (it won't change the code but would be
clearer from a documentation point of view)

 Sam



* Re: 2.6.18-mm2 - oops in cache_alloc_refill()
  2006-10-03 16:45                   ` Samuel Tardieu
@ 2006-10-03 17:07                     ` Jean Tourrilhes
  0 siblings, 0 replies; 140+ messages in thread
From: Jean Tourrilhes @ 2006-10-03 17:07 UTC (permalink / raw)
  To: Samuel Tardieu; +Cc: Pavel Roskin, John W. Linville, linux-kernel, netdev

On Tue, Oct 03, 2006 at 06:45:35PM +0200, Samuel Tardieu wrote:
> On  3/10, Jean Tourrilhes wrote:
> 
> | > I suggest that you revert the memset() to IW_ESSID_MAX_SIZE+1 so that
> | > the last byte is cleared as well. Or am I missing something?
> | 
> | No, that would bring back the slab/memory overflow we are
> | trying to get rid of.
> 
> Then I am puzzled by the function declaration:
> 
> static int orinoco_hw_get_essid(struct orinoco_private *priv, int *active,
>                                 char buf[IW_ESSID_MAX_SIZE+1])
> 
> Do you mean that this function is called with a buf parameter which
> doesn't have the expected size? (as far as the function declaration is
> concerned) Shouldn't the declaration be changed to
> 
> static int orinoco_hw_get_essid(struct orinoco_private *priv, int *active,
>                                 char buf[IW_ESSID_MAX_SIZE])
> 
> then to reflect the reality? (it won't change the code but would be
> clearer from a documentation point of view)

	Yep, that one is a bug.
	Thanks !

>  Sam

	Jean
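
The quoted remark that changing the bound "won't change the code" follows
from how C treats array parameters: the declared size is adjusted away
and the parameter is a plain pointer. A minimal sketch, with hypothetical
one-argument functions standing in for orinoco_hw_get_essid() and
IW_ESSID_MAX_SIZE assumed to be 32:

```c
#include <assert.h>
#include <stddef.h>

#define IW_ESSID_MAX_SIZE 32	/* assumed value */

/*
 * Both functions have the same type, int (char *): the bound written in
 * the parameter is documentation only and is not checked by the compiler
 * as part of the type.
 */
int get_essid_padded(char buf[IW_ESSID_MAX_SIZE + 1])
{
	buf[0] = 'x';
	return 1;
}

int get_essid_exact(char buf[IW_ESSID_MAX_SIZE])
{
	buf[0] = 'y';
	return 1;
}

/* A single pointer-to-function type accepts either declaration. */
typedef int (*essid_fn)(char *);
essid_fn padded = get_essid_padded;
essid_fn exact = get_essid_exact;
```

Because the two declarations are interchangeable to the compiler, the
[IW_ESSID_MAX_SIZE+1] spelling misleads the reader about how much space
the function may touch without changing the generated code.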


* Re: 2.6.18-mm2 boot failure on x86-64
  2006-09-28 21:01   ` 2.6.18-mm2 Andrew Morton
  2006-09-28 22:45     ` 2.6.18-mm2 Stephen Hemminger
@ 2006-10-04 13:42     ` Steve Fox
  2006-10-04 15:45       ` Andrew Morton
  1 sibling, 1 reply; 140+ messages in thread
From: Steve Fox @ 2006-10-04 13:42 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel, netdev

On Thu, 2006-09-28 at 14:01 -0700, Andrew Morton wrote:
> On Thu, 28 Sep 2006 17:50:31 +0000 (UTC)
> "Steve Fox" <drfickle@us.ibm.com> wrote:
> 
> > On Thu, 28 Sep 2006 01:46:23 -0700, Andrew Morton wrote:
> > 
> > > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.18/2.6.18-mm2/
> > 
> > Panic on boot. This machine booted 2.6.18-mm1 fine. em64t machine.
> > 
> > TCP bic registered
> > TCP westwood registered
> > TCP htcp registered
> > NET: Registered protocol family 1
> > NET: Registered protocol family 17
> > Unable to handle kernel paging request at ffffffffffffffff RIP: 
> >  [<ffffffff8047ef93>] packet_notifier+0x163/0x1a0
> > PGD 203027 PUD 2b031067 PMD 0 
> > Oops: 0000 [1] SMP 
> > last sysfs file: 
> > CPU 0 
> > Modules linked in:
> > Pid: 1, comm: swapper Not tainted 2.6.18-mm2-autokern1 #1
> > RIP: 0010:[<ffffffff8047ef93>]  [<ffffffff8047ef93>] packet_notifier+0x163/0x1a0
> > RSP: 0000:ffff810bffcbde90  EFLAGS: 00010286
> > RAX: 0000000000000000 RBX: ffff810bff4a1000 RCX: 2222222222222222
> > RDX: ffff810bff4a1000 RSI: 0000000000000005 RDI: ffffffff8055f5e0
> > RBP: ffffffffffffffff R08: 0000000000007616 R09: 000000000000000e
> > R10: 0000000000000006 R11: ffffffff803373f0 R12: 0000000000000000
> > R13: 0000000000000005 R14: ffff810bff4a1000 R15: 0000000000000000
> > FS:  0000000000000000(0000) GS:ffffffff805d8000(0000) knlGS:0000000000000000
> > CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
> > CR2: ffffffffffffffff CR3: 0000000000201000 CR4: 00000000000006e0
> > Process swapper (pid: 1, threadinfo ffff810bffcbc000, task ffff810bffcbb510)
> > Stack:  ffff810bff4a1000 ffffffff8055f4c0 0000000000000000 ffff810bffcbdef0
> >  0000000000000000 ffffffff8042736e 0000000000000000 0000000000000000
> >  0000000000000000 ffffffff8061c68d ffffffff806260f0 ffffffff80207182
> > Call Trace:
> >  [<ffffffff8042736e>] register_netdevice_notifier+0x3e/0x70
> >  [<ffffffff8061c68d>] packet_init+0x2d/0x53
> >  [<ffffffff80207182>] init+0x162/0x330
> >  [<ffffffff8020a9d8>] child_rip+0xa/0x12
> >  [<ffffffff8033c2a2>] acpi_ds_init_one_object+0x0/0x82
> >  [<ffffffff80207020>] init+0x0/0x330
> >  [<ffffffff8020a9ce>] child_rip+0x0/0x12
> > 
> > 
> > Code: 48 8b 45 00 0f 18 08 49 83 fd 02 4c 8d 65 f8 0f 84 f8 fe ff 
> > RIP  [<ffffffff8047ef93>] packet_notifier+0x163/0x1a0
> >  RSP <ffff810bffcbde90>
> > CR2: ffffffffffffffff
> >  <0>Kernel panic - not syncing: Attempted to kill init!
> > 
> 
> I'm really struggling to work out what went wrong there.  Comparing your
> miserable 20 bytes of code to my object code makes me think that this:
> 
> 		struct packet_sock *po = pkt_sk(sk);
> 
> returned -1, perhaps in %ebp.  But it's all very crude.
> 
> Perhaps you could compile that kernel with CONFIG_DEBUG_INFO, rerun it (the
> addresses might change) then have a poke around with `gdb vmlinux' (or
> maybe just addr2line) to work out where it's really oopsing?
> 
> I don't see much which has changed in that area recently.

Sorry for the delay. I was finally able to perform a bisect on this. It
turns out the patch that causes this is
x86_64-mm-re-positioning-the-bss-segment.patch, which seems like a
strange candidate, but sure enough I can boot to login: right up until
that patch is applied.

P.S. I had to comment usb-hubc-build-fix.patch out of the series file
because it would not apply cleanly and caused quilt (0.45) to simply
abort its 'push' operation.

-- 

Steve Fox
IBM Linux Technology Center


* Re: 2.6.18-mm2 boot failure on x86-64
  2006-10-04 13:42     ` 2.6.18-mm2 boot failure on x86-64 Steve Fox
@ 2006-10-04 15:45       ` Andrew Morton
  2006-10-04 15:55         ` Vivek Goyal
                           ` (2 more replies)
  0 siblings, 3 replies; 140+ messages in thread
From: Andrew Morton @ 2006-10-04 15:45 UTC (permalink / raw)
  To: Steve Fox; +Cc: linux-kernel, netdev, Andi Kleen, Vivek Goyal

On Wed, 04 Oct 2006 08:42:28 -0500
Steve Fox <drfickle@us.ibm.com> wrote:

> Sorry for the delay. I was finally able to perform a bisect on this. It
> turns out the patch that causes this is
> x86_64-mm-re-positioning-the-bss-segment.patch, which seems like a
> strange candidate, but sure enough I can boot to login: right up until
> that patch is applied.

hm, that patch was merged into mainline September 29.  Does mainline work?

> P.S. I had to comment usb-hubc-build-fix.patch out of the series file
> because it would not apply cleanly and caused quilt (0.45) to simply
> abort its 'push' operation.

Sorry about that.

If mainline _does_ work then perhaps it's an interaction between that patch
and something else in the -mm2 lineup (and at that point in the bisection,
it'll be one of the git trees or something else in the x86_64 tree).  Could
be that the problem remains in -mm3.


* Re: 2.6.18-mm2 boot failure on x86-64
  2006-10-04 15:45       ` Andrew Morton
@ 2006-10-04 15:55         ` Vivek Goyal
  2006-10-04 15:56         ` Andi Kleen
  2006-10-04 16:41         ` Steve Fox
  2 siblings, 0 replies; 140+ messages in thread
From: Vivek Goyal @ 2006-10-04 15:55 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Steve Fox, linux-kernel, netdev, Andi Kleen

On Wed, Oct 04, 2006 at 08:45:40AM -0700, Andrew Morton wrote:
> On Wed, 04 Oct 2006 08:42:28 -0500
> Steve Fox <drfickle@us.ibm.com> wrote:
> 
> > Sorry for the delay. I was finally able to perform a bisect on this. It
> > turns out the patch that causes this is
> > x86_64-mm-re-positioning-the-bss-segment.patch, which seems like a
> > strange candidate, but sure enough I can boot to login: right up until
> > that patch is applied.
> 
> hm, that patch was merged into mainline September 29.  Does mainline work?
> 

I thought the above patch had been dropped because Keith ran into boot
issues on one of the machines. There seemed to be nothing wrong with the
patch as such, but it might have triggered some existing bug. I looked
into the issue at the time, but nothing was conclusive.

So it looks like this patch has come back. I am not sure how.

Thanks
Vivek


* Re: 2.6.18-mm2 boot failure on x86-64
  2006-10-04 15:45       ` Andrew Morton
  2006-10-04 15:55         ` Vivek Goyal
@ 2006-10-04 15:56         ` Andi Kleen
  2006-10-05  1:57           ` Keith Mannthey
  2006-10-04 16:41         ` Steve Fox
  2 siblings, 1 reply; 140+ messages in thread
From: Andi Kleen @ 2006-10-04 15:56 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Steve Fox, linux-kernel, netdev, Vivek Goyal, Ian Campbell

On Wednesday 04 October 2006 17:45, Andrew Morton wrote:
> On Wed, 04 Oct 2006 08:42:28 -0500
> Steve Fox <drfickle@us.ibm.com> wrote:
> 
> > Sorry for the delay. I was finally able to perform a bisect on this. It
> > turns out the patch that causes this is
> > x86_64-mm-re-positioning-the-bss-segment.patch, which seems like a
> > strange candidate, but sure enough I can boot to login: right up until
> > that patch is applied.
> 
> hm, that patch was merged into mainline September 29.  Does mainline work?

Yes, we had this earlier already. But without this patch it doesn't
compile for some people, so it was re-added.

And nobody knows why the reposition-bss patch actually breaks things :/

In theory the repositioning is OK, so it must be some marginal code
somewhere else that just ends up falling over.

-Andi



* Re: 2.6.18-mm2 boot failure on x86-64
  2006-10-04 15:45       ` Andrew Morton
  2006-10-04 15:55         ` Vivek Goyal
  2006-10-04 15:56         ` Andi Kleen
@ 2006-10-04 16:41         ` Steve Fox
  2006-10-05  0:06           ` Andrew Morton
  2 siblings, 1 reply; 140+ messages in thread
From: Steve Fox @ 2006-10-04 16:41 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel, netdev, Andi Kleen, Vivek Goyal

On Wed, 2006-10-04 at 08:45 -0700, Andrew Morton wrote:
> On Wed, 04 Oct 2006 08:42:28 -0500
> Steve Fox <drfickle@us.ibm.com> wrote:
> > Sorry for the delay. I was finally able to perform a bisect on this. It
> > turns out the patch that causes this is
> > x86_64-mm-re-positioning-the-bss-segment.patch, which seems like a
> > strange candidate, but sure enough I can boot to login: right up until
> > that patch is applied.
> 
> hm, that patch was merged into mainline September 29.  Does mainline work?

-git21 also fails with this same error.

-- 

Steve Fox
IBM Linux Technology Center


* Re: 2.6.18-mm2 boot failure on x86-64
  2006-10-04 16:41         ` Steve Fox
@ 2006-10-05  0:06           ` Andrew Morton
  2006-10-05  0:51             ` Vivek Goyal
  0 siblings, 1 reply; 140+ messages in thread
From: Andrew Morton @ 2006-10-05  0:06 UTC (permalink / raw)
  To: Steve Fox; +Cc: linux-kernel, netdev, Andi Kleen, Vivek Goyal

On Wed, 04 Oct 2006 11:41:59 -0500
Steve Fox <drfickle@us.ibm.com> wrote:

> On Wed, 2006-10-04 at 08:45 -0700, Andrew Morton wrote:
> > On Wed, 04 Oct 2006 08:42:28 -0500
> > Steve Fox <drfickle@us.ibm.com> wrote:
> > > Sorry for the delay. I was finally able to perform a bisect on this. It
> > > turns out the patch that causes this is
> > > x86_64-mm-re-positioning-the-bss-segment.patch, which seems like a
> > > strange candidate, but sure enough I can boot to login: right up until
> > > that patch is applied.
> > 
> > hm, that patch was merged into mainline September 29.  Does mainline work?
> 
> -git21 also fails with this same error.
> 

OK, thanks.  And we know that
x86_64-mm-re-positioning-the-bss-segment.patch triggered this failure.  And
that patch is non-buggy, and the xfrm code is probably non-buggy.  So we don't
know squat, and we're going to need to debug this crash.

Well.  There is one trick we could use: apply
x86_64-mm-re-positioning-the-bss-segment.patch to 2.6.18 base and see if it
crashes.  If it doesn't, then we can theorise that the bug is some buggy
post 2.6.18 patch which is being exposed by
x86_64-mm-re-positioning-the-bss-segment.patch.  A technique I've used
before for identifying the buggy patch is to do a git-bisect, but apply
x86_64-mm-re-positioning-the-bss-segment.patch by hand at each bisection
step.  It's pretty straightforward as long as the patch roughly applies at
each step.  

Or we could debug it.  Can you send the .config?  Let's see if it happens
with my toolchain+machine first.

Thanks.
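
The search described above is a binary search over the post-2.6.18
history, with the exposing patch applied by hand before every test build.
A toy model follows. The size of the range, the index of the culprit and
the function names are all invented for illustration, and
boots_with_bss_patch() stands in for a real rebuild and reboot.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical: index of the commit that introduced the latent bug. */
#define CULPRIT 37

/* Counts how many "rebuild and reboot" cycles the search needed. */
int tests_run;

/*
 * Stand-in for: check out commit n, apply the re-positioning-bss patch
 * by hand, rebuild, and see whether the machine boots.
 */
bool boots_with_bss_patch(int n)
{
	tests_run++;
	return n < CULPRIT;	/* trees before the culprit still boot */
}

/*
 * Classic bisection. `good` is known to boot with the patch applied,
 * `bad` is known to crash. Returns the first commit that crashes.
 */
int bisect(int good, int bad)
{
	while (bad - good > 1) {
		int mid = good + (bad - good) / 2;

		if (boots_with_bss_patch(mid))
			good = mid;
		else
			bad = mid;
	}
	return bad;
}
```

The search only makes sense if the starting point is genuinely good, that
is, if 2.6.18 plus the patch boots; that is the precondition stated above.
For a range of about a thousand commits it needs about ten build-and-boot
cycles.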


* Re: 2.6.18-mm2 boot failure on x86-64
  2006-10-05  0:06           ` Andrew Morton
@ 2006-10-05  0:51             ` Vivek Goyal
  2006-10-05  0:57               ` Andi Kleen
  0 siblings, 1 reply; 140+ messages in thread
From: Vivek Goyal @ 2006-10-05  0:51 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Steve Fox, linux-kernel, netdev, Andi Kleen, kmannth

On Wed, Oct 04, 2006 at 05:06:59PM -0700, Andrew Morton wrote:
> On Wed, 04 Oct 2006 11:41:59 -0500
> Steve Fox <drfickle@us.ibm.com> wrote:
> 
> > On Wed, 2006-10-04 at 08:45 -0700, Andrew Morton wrote:
> > > On Wed, 04 Oct 2006 08:42:28 -0500
> > > Steve Fox <drfickle@us.ibm.com> wrote:
> > > > Sorry for the delay. I was finally able to perform a bisect on this. It
> > > > turns out the patch that causes this is
> > > > x86_64-mm-re-positioning-the-bss-segment.patch, which seems like a
> > > > strange candidate, but sure enough I can boot to login: right up until
> > > > that patch is applied.
> > > 
> > > hm, that patch was merged into mainline September 29.  Does mainline work?
> > 
> > -git21 also fails with this same error.
> > 
> 
> OK, thanks.  And we know that
> x86_64-mm-re-positioning-the-bss-segment.patch triggered this failure.  And
> that patch is non-buggy, and the xfrm code is probably non-buggy.  So we don't
> know squat, and we're going to need to debug this crash.
> 
> Well.  There is one trick we could use: apply
> x86_64-mm-re-positioning-the-bss-segment.patch to 2.6.18 base and see if it
> crashes.  If it doesn't, then we can theorise that the bug is some buggy
> post 2.6.18 patch which is being exposed by

I think it would most likely crash on 2.6.18 as well. Keith Mannthey
reported a different crash on 2.6.18-rc4-mm2 when this patch was first
introduced. Here is the link to the thread:

http://marc.theaimsgroup.com/?l=linux-kernel&m=115629369729911&w=2

This is the backtrace he reported:

 Unable to handle kernel NULL pointer dereference at 0000000000000007 RIP:
  [<ffffffff803d45b0>] __unix_insert_socket+0x49/0x5a
 PGD 115c934067 PUD 115c935067 PMD 0
 Oops: 0002 [1] SMP
 last sysfs file:
 CPU 14
 Modules linked in:
 Pid: 1, comm: init Not tainted 2.6.18-rc4-mm2-smp #3
 RIP: 0010:[<ffffffff803d45b0>]  [<ffffffff803d45b0>] __unix_insert_socket+0x49/0x5a
 RSP: 0018:ffff810460605eb8  EFLAGS: 00010286
 RAX: ffffffffffffffff RBX: ffff81115c171c80 RCX: 0000000000000000
 RDX: ffff81115c171c88 RSI: ffff81115c171c80 RDI: ffffffff806656e0
 RBP: ffffffff806656e0 R08: ffff81115c069200 R09: ffff8110700b4000
 R10: 0000000000000000 R11: 0000000000000002 R12: ffff81115c170d00
 R13: 0000000000000001 R14: 0000000000000001 R15: 0000000000000000
 FS:  00002b793a4fd6d0(0000) GS:ffff81115c910e40(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
 CR2: 0000000000000007 CR3: 000000115c92d000 CR4: 00000000000006e0
 Process init (pid: 1, threadinfo ffff810460604000, task ffff81115cb10040)
 Stack:  0000000100000001 00000000ffffffff ffff81115c171c80 ffffffff803d58e9
  ffffffff8045bb30 0000000180298f61 ffffffff80498080 0000000000000001
  ffff81115c170d00 ffffffff803d595d 0000000000000004 ffffffff80376061
 Call Trace:
  [<ffffffff803d58e9>] unix_create1+0xf3/0x107
  [<ffffffff803d595d>] unix_create+0x60/0x6b
  [<ffffffff80376061>] __sock_create+0x12f/0x227
  [<ffffffff80376429>] sys_socket+0xf/0x37
  [<ffffffff8020968e>] system_call+0x7e/0x83


 Code: 48 89 50 08 48 89 55 00 48 89 6a 08 41 58 5b 5d c3 c7 47 08
 RIP  [<ffffffff803d45b0>] __unix_insert_socket+0x49/0x5a
  RSP <ffff810460605eb8>
 CR2: 0000000000000007
  <0>Kernel panic - not syncing: Attempted to kill init!

Thanks
Vivek

* Re: 2.6.18-mm2 boot failure on x86-64
  2006-10-05  0:51             ` Vivek Goyal
@ 2006-10-05  0:57               ` Andi Kleen
  2006-10-05  1:08                 ` Martin Bligh
  0 siblings, 1 reply; 140+ messages in thread
From: Andi Kleen @ 2006-10-05  0:57 UTC (permalink / raw)
  To: vgoyal; +Cc: Andrew Morton, Steve Fox, linux-kernel, netdev, kmannth


> I think most likely it would crash on 2.6.18. Keith mannthey had reported
> a different crash on 2.6.18-rc4-mm2 when this patch was introduced first
> time. Following is the link to the thread.

Then maybe trying 2.6.17 + the patch and then bisect between that and -rc4?

-Andi

* Re: 2.6.18-mm2 boot failure on x86-64
  2006-10-05  0:57               ` Andi Kleen
@ 2006-10-05  1:08                 ` Martin Bligh
  2006-10-05  2:05                   ` Keith Mannthey
  2006-10-05 14:53                   ` Steve Fox
  0 siblings, 2 replies; 140+ messages in thread
From: Martin Bligh @ 2006-10-05  1:08 UTC (permalink / raw)
  To: Andi Kleen
  Cc: vgoyal, Andrew Morton, Steve Fox, linux-kernel, netdev, kmannth,
	Andy Whitcroft

Andi Kleen wrote:
>>I think most likely it would crash on 2.6.18. Keith mannthey had reported
>>a different crash on 2.6.18-rc4-mm2 when this patch was introduced first
>>time. Following is the link to the thread.
> 
> 
> Then maybe trying 2.6.17 + the patch and then bisect between that and -rc4?

I think it's fixed already in -git22, or at least it is for the IBM box
reporting to test.kernel.org. You might want to try that one ...

M.

* Re: 2.6.18-mm2 boot failure on x86-64
  2006-10-04 15:56         ` Andi Kleen
@ 2006-10-05  1:57           ` Keith Mannthey
  0 siblings, 0 replies; 140+ messages in thread
From: Keith Mannthey @ 2006-10-05  1:57 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Andrew Morton, Steve Fox, linux-kernel, netdev, Vivek Goyal,
	Ian Campbell

On 10/4/06, Andi Kleen <ak@suse.de> wrote:
> On Wednesday 04 October 2006 17:45, Andrew Morton wrote:
> > On Wed, 04 Oct 2006 08:42:28 -0500
> > Steve Fox <drfickle@us.ibm.com> wrote:
> >
> > > On Thu, 2006-09-28 at 14:01 -0700, Andrew Morton wrote:
> > > > On Thu, 28 Sep 2006 17:50:31 +0000 (UTC)
> > > > "Steve Fox" <drfickle@us.ibm.com> wrote:
> > > >
> > > > > On Thu, 28 Sep 2006 01:46:23 -0700, Andrew Morton wrote:
> > > > >
> > > > > > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.18/2.6.18-mm2/
> > > > >
> > > > > Panic on boot. This machine booted 2.6.18-mm1 fine. em64t machine.
> > > > >
> > > > > TCP bic registered
> > > > > TCP westwood registered
> > > > > TCP htcp registered
> > > > > NET: Registered protocol family 1
> > > > > NET: Registered protocol family 17
> > > > > Unable to handle kernel paging request at ffffffffffffffff RIP:
> > > > >  [<ffffffff8047ef93>] packet_notifier+0x163/0x1a0
> > > > > PGD 203027 PUD 2b031067 PMD 0
> > > > > Oops: 0000 [1] SMP
> > > > > last sysfs file:
> > > > > CPU 0
> > > > > Modules linked in:
> > > > > Pid: 1, comm: swapper Not tainted 2.6.18-mm2-autokern1 #1
> > > > > RIP: 0010:[<ffffffff8047ef93>]  [<ffffffff8047ef93>] packet_notifier+0x163/0x1a0
> > > > > RSP: 0000:ffff810bffcbde90  EFLAGS: 00010286
> > > > > RAX: 0000000000000000 RBX: ffff810bff4a1000 RCX: 2222222222222222
> > > > > RDX: ffff810bff4a1000 RSI: 0000000000000005 RDI: ffffffff8055f5e0
> > > > > RBP: ffffffffffffffff R08: 0000000000007616 R09: 000000000000000e
> > > > > R10: 0000000000000006 R11: ffffffff803373f0 R12: 0000000000000000
> > > > > R13: 0000000000000005 R14: ffff810bff4a1000 R15: 0000000000000000
> > > > > FS:  0000000000000000(0000) GS:ffffffff805d8000(0000) knlGS:0000000000000000
> > > > > CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
> > > > > CR2: ffffffffffffffff CR3: 0000000000201000 CR4: 00000000000006e0
> > > > > Process swapper (pid: 1, threadinfo ffff810bffcbc000, task ffff810bffcbb510)
> > > > > Stack:  ffff810bff4a1000 ffffffff8055f4c0 0000000000000000 ffff810bffcbdef0
> > > > >  0000000000000000 ffffffff8042736e 0000000000000000 0000000000000000
> > > > >  0000000000000000 ffffffff8061c68d ffffffff806260f0 ffffffff80207182
> > > > > Call Trace:
> > > > >  [<ffffffff8042736e>] register_netdevice_notifier+0x3e/0x70
> > > > >  [<ffffffff8061c68d>] packet_init+0x2d/0x53
> > > > >  [<ffffffff80207182>] init+0x162/0x330
> > > > >  [<ffffffff8020a9d8>] child_rip+0xa/0x12
> > > > >  [<ffffffff8033c2a2>] acpi_ds_init_one_object+0x0/0x82
> > > > >  [<ffffffff80207020>] init+0x0/0x330
> > > > >  [<ffffffff8020a9ce>] child_rip+0x0/0x12
> > > > >
> > > > >
> > > > > Code: 48 8b 45 00 0f 18 08 49 83 fd 02 4c 8d 65 f8 0f 84 f8 fe ff
> > > > > RIP  [<ffffffff8047ef93>] packet_notifier+0x163/0x1a0
> > > > >  RSP <ffff810bffcbde90>
> > > > > CR2: ffffffffffffffff
> > > > >  <0>Kernel panic - not syncing: Attempted to kill init!
> > > > >
> > > >
> > > > I'm really struggling to work out what went wrong there.  Comparing your
> > > > miserable 20 bytes of code to my object code makes me think that this:
> > > >
> > > >           struct packet_sock *po = pkt_sk(sk);
> > > >
> > > > returned -1, perhaps in %ebp.  But it's all very crude.
> > > >
> > > > Perhaps you could compile that kernel with CONFIG_DEBUG_INFO, rerun it (the
> > > > addresses might change) then have a poke around with `gdb vmlinux' (or
> > > > maybe just addr2line) to work out where it's really oopsing?
> > > >
> > > > I don't see much which has changed in that area recently.
> > >
> > > Sorry for the delay. I was finally able to perform a bisect on this. It
> > > turns out the patch that causes this is
> > > x86_64-mm-re-positioning-the-bss-segment.patch, which seems like a
> > > strange candidate, but sure enough I can boot to login: right up until
> > > that patch is applied.
> >
> > hm, that patch was merged into mainline September 29.  Does mainline work?
>
> Yes we had this earlier already. But without this patch it doesn't
> compile for some people. So it was readded.
>
> And nobody knows why the reposition-bss patch actually breaks things :/

I just wanted to add that when I changed my config file the problem went
away. It was not at all clear what was causing it.


Thanks,
  Keith

* Re: 2.6.18-mm2 boot failure on x86-64
  2006-10-05  1:08                 ` Martin Bligh
@ 2006-10-05  2:05                   ` Keith Mannthey
  2006-10-05 14:53                   ` Steve Fox
  1 sibling, 0 replies; 140+ messages in thread
From: Keith Mannthey @ 2006-10-05  2:05 UTC (permalink / raw)
  To: Martin Bligh
  Cc: Andi Kleen, vgoyal, Andrew Morton, Steve Fox, linux-kernel,
	netdev, kmannth, Andy Whitcroft

On 10/4/06, Martin Bligh <mbligh@mbligh.org> wrote:
> Andi Kleen wrote:
> >>I think most likely it would crash on 2.6.18. Keith mannthey had reported
> >>a different crash on 2.6.18-rc4-mm2 when this patch was introduced first
> >>time. Following is the link to the thread.
> >
> >
> > Then maybe trying 2.6.17 + the patch and then bisect between that and -rc4?
>
> I think it's fixed already in -git22, or at least it is for the IBM box
> reporting to test.kernel.org. You might want to try that one ...

Fixed or hidden... hard to say at this point.  I think it could be a
weird interaction between patches and/or config options.  I will see
tomorrow if I can recreate it again.

Thanks,
  Keith

* Re: 2.6.18-mm2 boot failure on x86-64
  2006-10-05  1:08                 ` Martin Bligh
  2006-10-05  2:05                   ` Keith Mannthey
@ 2006-10-05 14:53                   ` Steve Fox
  2006-10-05 15:12                     ` Badari Pulavarty
  1 sibling, 1 reply; 140+ messages in thread
From: Steve Fox @ 2006-10-05 14:53 UTC (permalink / raw)
  To: Martin Bligh
  Cc: Andi Kleen, vgoyal, Andrew Morton, linux-kernel, netdev, kmannth,
	Andy Whitcroft

On Wed, 2006-10-04 at 18:08 -0700, Martin Bligh wrote:
> Andi Kleen wrote:
> >>I think most likely it would crash on 2.6.18. Keith mannthey had reported
> >>a different crash on 2.6.18-rc4-mm2 when this patch was introduced first
> >>time. Following is the link to the thread.
> > 
> > 
> > Then maybe trying 2.6.17 + the patch and then bisect between that and -rc4?
> 
> I think it's fixed already in -git22, or at least it is for the IBM box
> reporting to test.kernel.org. You might want to try that one ...

-git22 also panics for me.

-- 

Steve Fox
IBM Linux Technology Center

* Re: 2.6.18-mm2 boot failure on x86-64
  2006-10-05 14:53                   ` Steve Fox
@ 2006-10-05 15:12                     ` Badari Pulavarty
  2006-10-05 15:32                       ` Steve Fox
  0 siblings, 1 reply; 140+ messages in thread
From: Badari Pulavarty @ 2006-10-05 15:12 UTC (permalink / raw)
  To: Steve Fox
  Cc: Martin Bligh, Andi Kleen, vgoyal, Andrew Morton, lkml, netdev,
	kmannth, Andy Whitcroft

On Thu, 2006-10-05 at 09:53 -0500, Steve Fox wrote:
> On Wed, 2006-10-04 at 18:08 -0700, Martin Bligh wrote:
> > Andi Kleen wrote:
> > >>I think most likely it would crash on 2.6.18. Keith mannthey had reported
> > >>a different crash on 2.6.18-rc4-mm2 when this patch was introduced first
> > >>time. Following is the link to the thread.
> > > 
> > > 
> > > Then maybe trying 2.6.17 + the patch and then bisect between that and -rc4?
> > 
> > I think it's fixed already in -git22, or at least it is for the IBM box
> > reporting to test.kernel.org. You might want to try that one ...
> 
> -git22 also panics for me.
> 

Steve,

Can you post the latest panic stack again (with CONFIG_DEBUG_KERNEL)?
Last time I couldn't match your instruction dump to any code segment
in the routine. Also, can you post your .config file? I have
an amd64 and an em64t machine and both work fine...

Thanks,
Badari


* Re: 2.6.18-mm2 boot failure on x86-64
  2006-10-05 15:12                     ` Badari Pulavarty
@ 2006-10-05 15:32                       ` Steve Fox
  2006-10-05 15:40                         ` Andi Kleen
  0 siblings, 1 reply; 140+ messages in thread
From: Steve Fox @ 2006-10-05 15:32 UTC (permalink / raw)
  To: Badari Pulavarty
  Cc: Martin Bligh, Andi Kleen, vgoyal, Andrew Morton, lkml, netdev,
	kmannth, Andy Whitcroft

On Thu, 2006-10-05 at 08:12 -0700, Badari Pulavarty wrote:

> Can you post the latest panic stack again (with CONFIG_DEBUG_KERNEL) ? 

CONFIG_DEBUG_KERNEL should be on

> Last time I couldn't match your instruction dump to any code segment
> in the routine. And also, can you post your .config file. I have
> an amd64 and em64t machine and both work fine...

Unable to handle kernel NULL pointer dereference at 0000000000000827 RIP:
 [<ffffffff804705e6>] xfrm_register_mode+0x36/0x60
PGD 0
Oops: 0000 [1] SMP
CPU 0
Modules linked in:
Pid: 1, comm: swapper Not tainted 2.6.18-git22 #1
RIP: 0010:[<ffffffff804705e6>]  [<ffffffff804705e6>] xfrm_register_mode+0x36/0x60
RSP: 0000:ffff810bffcbded0  EFLAGS: 00010286
RAX: 000000000000081f RBX: ffffffff805588a0 RCX: 0000000000000000
RDX: ffffffffffffffff RSI: 0000000000000002 RDI: ffffffff80559550
RBP: 00000000ffffffef R08: 000000003f924371 R09: 0000000000000000
R10: ffff810bffcbdcb0 R11: 0000000000000154 R12: 0000000000000000
R13: ffff810bffcbdef0 R14: 0000000000000000 R15: 0000000000000000
FS:  0000000000000000(0000) GS:ffffffff805d2000(0000) knlGS:0000000000000000
CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 0000000000000827 CR3: 0000000000201000 CR4: 00000000000006e0
Process swapper (pid: 1, threadinfo ffff810bffcbc000, task ffff810bffcbb4e0)
Stack:  0000000000000000 ffffffff8061fb48 0000000000000000 ffffffff80207182
 0000000000000000 0000000000000000 0000000000000000 0000000000000000
 0000000000000000 0000000000000000 0000000000000000 0000000000090000

The base config file I'm using is at
http://flooterbu.net/kernel/elm3b239-2.6.17.config

-- 

Steve Fox
IBM Linux Technology Center

* Re: 2.6.18-mm2 boot failure on x86-64
  2006-10-05 15:32                       ` Steve Fox
@ 2006-10-05 15:40                         ` Andi Kleen
  2006-10-05 17:57                           ` Steve Fox
  0 siblings, 1 reply; 140+ messages in thread
From: Andi Kleen @ 2006-10-05 15:40 UTC (permalink / raw)
  To: Steve Fox
  Cc: Badari Pulavarty, Martin Bligh, vgoyal, Andrew Morton, lkml,
	netdev, kmannth, Andy Whitcroft

On Thursday 05 October 2006 17:32, Steve Fox wrote:
> On Thu, 2006-10-05 at 08:12 -0700, Badari Pulavarty wrote:
> 
> > Can you post the latest panic stack again (with CONFIG_DEBUG_KERNEL) ? 
> 
> CONFIG_DEBUG_KERNEL should be on
> 
> > Last time I couldn't match your instruction dump to any code segment
> > in the routine. And also, can you post your .config file. I have
> > an amd64 and em64t machine and both work fine...
> 
> Unable to handle kernel NULL pointer dereference at 0000000000000827 RIP:
>  [<ffffffff804705e6>] xfrm_register_mode+0x36/0x60
> PGD 0
> Oops: 0000 [1] SMP
> CPU 0
> Modules linked in:
> Pid: 1, comm: swapper Not tainted 2.6.18-git22 #1
> RIP: 0010:[<ffffffff804705e6>]  [<ffffffff804705e6>] xfrm_register_mode+0x36/0x60
> RSP: 0000:ffff810bffcbded0  EFLAGS: 00010286
> RAX: 000000000000081f RBX: ffffffff805588a0 RCX: 0000000000000000
> RDX: ffffffffffffffff RSI: 0000000000000002 RDI: ffffffff80559550
> RBP: 00000000ffffffef R08: 000000003f924371 R09: 0000000000000000
> R10: ffff810bffcbdcb0 R11: 0000000000000154 R12: 0000000000000000
> R13: ffff810bffcbdef0 R14: 0000000000000000 R15: 0000000000000000
> FS:  0000000000000000(0000) GS:ffffffff805d2000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
> CR2: 0000000000000827 CR3: 0000000000201000 CR4: 00000000000006e0
> Process swapper (pid: 1, threadinfo ffff810bffcbc000, task ffff810bffcbb4e0)
> Stack:  0000000000000000 ffffffff8061fb48 0000000000000000 ffffffff80207182
>  0000000000000000 0000000000000000 0000000000000000 0000000000000000
>  0000000000000000 0000000000000000 0000000000000000 0000000000090000

Please don't snip the Code: line. It is fairly important.

> 
> The base config file I'm using is at
> http://flooterbu.net/kernel/elm3b239-2.6.17.config

My guess is that something is wrong with the global variable it is accessing.
Can you post the output of grep -5 xfrm_policy_afinfo on your System.map?

I wonder if that variable overlaps something else.

And please add a
printk("global %p\n", xfrm_policy_afinfo[family]);
at the beginning of net/xfrm/xfrm_policy.c:xfrm_policy_lock_afinfo
and post the output.

If it does not overlap, then it's possible that some nearby variable is
overflowing or similar. Adding some padding around xfrm_policy_afinfo
would show that.

Another way, if that global is proven to be corrupted, would be to add
checks all over the boot process to track down where it gets corrupted.
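
The padding idea can be sketched in userspace as follows. This is only an
illustration, not kernel code: struct guarded_table, NSLOTS, GUARD_WORDS and
GUARD_PATTERN are hypothetical names, and the table merely stands in for an
array of pointers such as xfrm_policy_afinfo. If a neighbouring variable
overflows into the table, a guard word changes first; if a slot is bad while
both guards are intact, a stray direct write is the more likely cause.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define NSLOTS        32                      /* hypothetical table size */
#define GUARD_WORDS   8                       /* padding on each side */
#define GUARD_PATTERN 0x5a5a5a5a5a5a5a5aULL   /* arbitrary known value */

/* A pointer table with known-value padding before and after it. */
struct guarded_table {
	uint64_t guard_before[GUARD_WORDS];
	void *slots[NSLOTS];
	uint64_t guard_after[GUARD_WORDS];
};

enum damage {
	DAMAGE_NONE   = 0,
	DAMAGE_BEFORE = 1,   /* padding in front of the table was overwritten */
	DAMAGE_AFTER  = 2,   /* padding behind the table was overwritten */
	DAMAGE_INSIDE = 4    /* a slot holds the all-ones value seen in the oops */
};

/* Clear the table and fill both guard areas with the known pattern. */
void guarded_table_init(struct guarded_table *t)
{
	assert(t != NULL);
	memset(t->slots, 0, sizeof(t->slots));
	for (size_t i = 0; i < GUARD_WORDS; i++) {
		t->guard_before[i] = GUARD_PATTERN;
		t->guard_after[i] = GUARD_PATTERN;
	}
}

/* Return a bitmask of enum damage values describing what was clobbered. */
int guarded_table_check(const struct guarded_table *t)
{
	int damage = DAMAGE_NONE;

	assert(t != NULL);
	for (size_t i = 0; i < GUARD_WORDS; i++) {
		if (t->guard_before[i] != GUARD_PATTERN)
			damage |= DAMAGE_BEFORE;
		if (t->guard_after[i] != GUARD_PATTERN)
			damage |= DAMAGE_AFTER;
	}
	for (size_t i = 0; i < NSLOTS; i++)
		if (t->slots[i] == (void *)UINTPTR_MAX)
			damage |= DAMAGE_INSIDE;
	return damage;
}
```

Calling guarded_table_check() at several points during startup narrows down
when the damage happens; which guard trips hints at the direction of the
overflow.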

-Andi

* Re: 2.6.18-mm2 boot failure on x86-64
  2006-10-05 15:40                         ` Andi Kleen
@ 2006-10-05 17:57                           ` Steve Fox
  2006-10-05 18:27                             ` Andi Kleen
  0 siblings, 1 reply; 140+ messages in thread
From: Steve Fox @ 2006-10-05 17:57 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Badari Pulavarty, Martin Bligh, vgoyal, Andrew Morton, lkml,
	netdev, kmannth, Andy Whitcroft

On Thu, 2006-10-05 at 17:40 +0200, Andi Kleen wrote:

> Please don't snip the Code: line. It is fairly important.

Sorry about that. The remote console I was using appears to overwrite
some text after I force the reboot. Here's a clean one.

global ffffffffffffffff
Unable to handle kernel NULL pointer dereference at 0000000000000827 RIP:
 [<ffffffff80470766>] xfrm_register_mode+0x36/0x60
PGD 0
Oops: 0000 [1] SMP
CPU 0
Modules linked in:
Pid: 1, comm: swapper Not tainted 2.6.18-git22 #3
RIP: 0010:[<ffffffff80470766>]  [<ffffffff80470766>] xfrm_register_mode+0x36/0x60
RSP: 0000:ffff810bffcbded0  EFLAGS: 00010286
RAX: 000000000000081f RBX: ffffffff805588a0 RCX: 0000000000000000
RDX: ffffffffffffffff RSI: 0000000000000046 RDI: ffffffff80559550
RBP: 00000000ffffffef R08: 0000000000007a02 R09: 000000000000000e
R10: 0000000000000006 R11: ffffffff80334660 R12: 0000000000000000
R13: ffff810bffcbdef0 R14: 0000000000000000 R15: 0000000000000000
FS:  0000000000000000(0000) GS:ffffffff805d2000(0000) knlGS:0000000000000000
CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 0000000000000827 CR3: 0000000000201000 CR4: 00000000000006e0
Process swapper (pid: 1, threadinfo ffff810bffcbc000, task ffff810bffcbb4e0)
Stack:  0000000000000000 ffffffff8061fb48 0000000000000000 ffffffff80207182
 0000000000000000 0000000000000000 0000000000000000 0000000000000000
 0000000000000000 0000000000000000 0000000000000000 0000000000090000
Call Trace:
 [<ffffffff80207182>] init+0x162/0x330
 [<ffffffff8020a9a8>] child_rip+0xa/0x12
 [<ffffffff803394c2>] acpi_ds_init_one_object+0x0/0x82
 [<ffffffff80207020>] init+0x0/0x330
 [<ffffffff8020a99e>] child_rip+0x0/0x12


Code: 48 83 78 08 00 75 06 48 89 58 08 31 ed 48 89 d7 e8 65 fd ff
RIP  [<ffffffff80470766>] xfrm_register_mode+0x36/0x60
 RSP <ffff810bffcbded0>
CR2: 0000000000000827
 <0>Kernel panic - not syncing: Aiee, killing interrupt handler!
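
As a cross-check on the trace above, and assuming the Code: bytes start at
the faulting RIP, the first instruction 48 83 78 08 00 decodes to
cmpq $0x0,0x8(%rax). With RAX = 0x81f that gives a fault address of
0x81f + 8 = 0x827, which matches CR2. The sketch below does that arithmetic
in userspace C. It is deliberately minimal: decode_base_disp8() and
effective_address() are hypothetical helper names, and only the
"optional REX, one-byte opcode, ModRM with mod == 01, disp8" form is handled
(no legacy prefixes, no SIB byte).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Decode "[REX] opcode ModRM disp8" where ModRM.mod == 01 and there is
 * no SIB byte, i.e. a memory operand of the form disp8(%base).
 * On success returns 0 and stores the base register number
 * (0 = rax, 1 = rcx, 2 = rdx, 3 = rbx, 5 = rbp, ...) and the displacement.
 * Returns -1 for any encoding this sketch does not handle.
 */
int decode_base_disp8(const uint8_t *code, size_t len,
		      unsigned *base_reg, int8_t *disp)
{
	size_t i = 0;
	unsigned rex_b = 0;

	assert(code != NULL && base_reg != NULL && disp != NULL);

	if (len > 0 && (code[0] & 0xf0) == 0x40) { /* REX prefix 0x40..0x4f */
		rex_b = code[0] & 0x01;            /* REX.B extends ModRM.rm */
		i++;
	}
	if (len < i + 3)            /* need opcode, ModRM and disp8 */
		return -1;
	if (code[i] == 0x0f)        /* two-byte opcodes are out of scope */
		return -1;
	i++;                        /* skip the one-byte opcode */

	uint8_t modrm = code[i++];
	if ((modrm >> 6) != 0x01)   /* only mod == 01 (8-bit displacement) */
		return -1;
	if ((modrm & 0x07) == 0x04) /* rm == 100 means a SIB byte follows */
		return -1;

	*base_reg = (modrm & 0x07) | (rex_b << 3);
	*disp = (int8_t)code[i];
	return 0;
}

/* Effective address of disp8(%base); wraps modulo 2^64 like the CPU does. */
uint64_t effective_address(uint64_t base, int8_t disp)
{
	return base + (uint64_t)(int64_t)disp;
}
```

Feeding it the five bytes above yields base register 0 (rax) and
displacement 8, so the access is at RAX + 8.
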

> My guess is that something is wrong with the global variable it is accessing.
> Can you post the output of grep -5 xfrm_policy_afinfo ? 

elm3b239:/boot # grep -5 xfrm_policy_afinfo System.map-2.6.18-git22
ffffffff805594c0 d xfrm4_state_afinfo
ffffffff80559500 D xfrm_cfg_mutex
ffffffff80559530 d xfrm_dev_notifier
ffffffff80559548 d xfrm_policy_lock
ffffffff8055954c d xfrm_policy_gc_lock
ffffffff80559550 d xfrm_policy_afinfo_lock
ffffffff80559560 d xfrm_hash_work
ffffffff805595c0 d hash_resize_mutex
ffffffff80559600 D sysctl_xfrm_aevent_etime
ffffffff80559604 D sysctl_xfrm_aevent_rseqth
ffffffff80559610 D km_waitq
--
ffffffff8075bfd8 b idiagnl
ffffffff8075bfe0 B xfrm_policy_count
ffffffff8075bff8 b xfrm_policy_gc_list
ffffffff8075c000 b dummy.28400
ffffffff8075c038 b idx_generator.27450
ffffffff8075c040 b xfrm_policy_afinfo
ffffffff8075c140 b xfrm_policy_gc_work
ffffffff8075c1a0 b xfrm_policy_inexact
ffffffff8075c1e0 B xfrm_nl
ffffffff8075c1e8 b xfrm_state_gc_list
ffffffff8075c1f0 b acqseq.27386
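
A note on reading the map output above: the type letter 'b' marks
xfrm_policy_afinfo as a local bss symbol, and the gap to the next entry
(0x8075c140 - 0x8075c040 = 0x100 bytes) leaves room for 32 eight-byte
pointers. The gap is only an upper bound on the object size, since alignment
padding may be included. A minimal userspace sketch of that calculation
follows; struct map_sym, parse_map_line() and span_to_next() are hypothetical
names, and the input is assumed to be sorted by address the way System.map is.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* One "address type name" entry from a System.map style listing. */
struct map_sym {
	uint64_t addr;
	char type;
	char name[64];
};

/* Parse one line; returns 0 on success, -1 if it is not a map entry. */
int parse_map_line(const char *line, struct map_sym *out)
{
	assert(line != NULL && out != NULL);
	if (sscanf(line, "%" SCNx64 " %c %63s",
		   &out->addr, &out->type, out->name) != 3)
		return -1;
	return 0;
}

/*
 * Distance in bytes from the named symbol to the entry that follows it.
 * Returns 0 if the symbol is missing, is the last entry, or the
 * following line does not parse. Lines must be sorted by address.
 */
uint64_t span_to_next(const char *const *lines, size_t n, const char *name)
{
	struct map_sym cur, next;

	for (size_t i = 0; i + 1 < n; i++) {
		if (parse_map_line(lines[i], &cur) != 0)
			continue;
		if (strcmp(cur.name, name) != 0)
			continue;
		if (parse_map_line(lines[i + 1], &next) != 0)
			return 0;
		return next.addr - cur.addr;
	}
	return 0;
}
```

Applied to the three lines around xfrm_policy_afinfo, span_to_next() gives
0x100, which divided by the pointer size of 8 is 32 slots.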

> And please add a 
> printk("global %p\n",  xfrm_policy_afinfo[family]);
> at the beginning of net/xfrm/xfrm_policy.c:xfrm_policy_lock_afinfo
> and post the output.

Included above.

-- 

Steve Fox
IBM Linux Technology Center

* Re: 2.6.18-mm2 boot failure on x86-64
  2006-10-05 17:57                           ` Steve Fox
@ 2006-10-05 18:27                             ` Andi Kleen
  2006-10-05 18:51                               ` Steve Fox
  2006-10-05 18:52                               ` Vivek Goyal
  0 siblings, 2 replies; 140+ messages in thread
From: Andi Kleen @ 2006-10-05 18:27 UTC (permalink / raw)
  To: Steve Fox
  Cc: Badari Pulavarty, Martin Bligh, vgoyal, Andrew Morton, lkml,
	netdev, kmannth, Andy Whitcroft

On Thursday 05 October 2006 19:57, Steve Fox wrote:
> On Thu, 2006-10-05 at 17:40 +0200, Andi Kleen wrote:
> 
> > Please don't snip the Code: line. It is fairly important.
> 
> Sorry about that. The remote console I was using appears to overwrite
> some text after I force the reboot. Here's a clean one.
> 
> global ffffffffffffffff

Ok that definitely shouldn't be in there.

I guess we need to track when it gets corrupted. Can you send the full
boot log with this patch applied?


-Andi

Index: linux-2.6.19-rc1-hack/init/main.c
===================================================================
--- linux-2.6.19-rc1-hack.orig/init/main.c
+++ linux-2.6.19-rc1-hack/init/main.c
@@ -75,6 +75,9 @@
 
 static int init(void *);
 
+extern void bugcheck(char *, int);
+#define CHECK bugcheck(__FILE__, __LINE__)
+
 extern void init_IRQ(void);
 extern void fork_init(unsigned long);
 extern void mca_init(void);
@@ -480,6 +483,8 @@ asmlinkage void __init start_kernel(void
 	char * command_line;
 	extern struct kernel_param __start___param[], __stop___param[];
 
+	CHECK;
+
 	smp_setup_processor_id();
 
 	/*
@@ -502,7 +507,9 @@ asmlinkage void __init start_kernel(void
 	page_address_init();
 	printk(KERN_NOTICE);
 	printk(linux_banner);
+	CHECK;
 	setup_arch(&command_line);
+	CHECK;
 	setup_per_cpu_areas();
 	smp_prepare_boot_cpu();	/* arch-specific boot-cpu hooks */
 
@@ -517,6 +524,7 @@ asmlinkage void __init start_kernel(void
 	 * fragile until we cpu_idle() for the first time.
 	 */
 	preempt_disable();
+	CHECK;
 	build_all_zonelists();
 	page_alloc_init();
 	printk(KERN_NOTICE "Kernel command line: %s\n", saved_command_line);
@@ -525,6 +533,7 @@ asmlinkage void __init start_kernel(void
 		   __stop___param - __start___param,
 		   &unknown_bootoption);
 	sort_main_extable();
+	CHECK;
 	trap_init();
 	rcu_init();
 	init_IRQ();
@@ -533,8 +542,10 @@ asmlinkage void __init start_kernel(void
 	hrtimers_init();
 	softirq_init();
 	timekeeping_init();
+	CHECK;
 	time_init();
 	profile_init();
+	CHECK;
 	if (!irqs_disabled())
 		printk("start_kernel(): bug: interrupts were enabled early\n");
 	early_boot_irqs_on();
@@ -568,7 +579,9 @@ asmlinkage void __init start_kernel(void
 #endif
 	vfs_caches_init_early();
 	cpuset_init_early();
+	CHECK;
 	mem_init();
+	CHECK;
 	kmem_cache_init();
 	setup_per_cpu_pageset();
 	numa_policy_init();
@@ -577,6 +590,7 @@ asmlinkage void __init start_kernel(void
 	calibrate_delay();
 	pidmap_init();
 	pgtable_cache_init();
+	CHECK;
 	prio_tree_init();
 	anon_vma_init();
 #ifdef CONFIG_X86
@@ -586,12 +600,14 @@ asmlinkage void __init start_kernel(void
 	fork_init(num_physpages);
 	proc_caches_init();
 	buffer_init();
+	CHECK;
 	unnamed_dev_init();
 	key_init();
 	security_init();
 	vfs_caches_init(num_physpages);
 	radix_tree_init();
 	signals_init();
+	CHECK;
 	/* rootfs populating might need page-writeback */
 	page_writeback_init();
 #ifdef CONFIG_PROC_FS
@@ -599,6 +615,7 @@ asmlinkage void __init start_kernel(void
 #endif
 	cpuset_init();
 	taskstats_init_early();
+	CHECK;
 	delayacct_init();
 
 	check_bugs();
@@ -609,7 +626,7 @@ asmlinkage void __init start_kernel(void
 	rest_init();
 }
 
-static int __initdata initcall_debug;
+static int __initdata initcall_debug = 1;
 
 static int __init initcall_debug_setup(char *str)
 {
@@ -639,7 +656,11 @@ static void __init do_initcalls(void)
 			printk("\n");
 		}
 
+		CHECK;
+
 		result = (*call)();
+		
+		CHECK;
 
 		if (result && result != -ENODEV && initcall_debug) {
 			sprintf(msgbuf, "error code %d", result);
@@ -725,21 +746,32 @@ static int init(void * unused)
 
 	smp_prepare_cpus(max_cpus);
 
+	CHECK;
+
 	do_pre_smp_initcalls();
 
 	smp_init();
+
+	CHECK;
+
 	sched_init_smp();
 
 	cpuset_init_smp();
 
+	CHECK;
+
 	/*
 	 * Do this before initcalls, because some drivers want to access
 	 * firmware files.
 	 */
 	populate_rootfs();
 
+	CHECK;
+
 	do_basic_setup();
 
+	CHECK;
+
 	/*
 	 * check if there is an early userspace init.  If yes, let it do all
 	 * the work
Index: linux-2.6.19-rc1-hack/net/xfrm/xfrm_policy.c
===================================================================
--- linux-2.6.19-rc1-hack.orig/net/xfrm/xfrm_policy.c
+++ linux-2.6.19-rc1-hack/net/xfrm/xfrm_policy.c
@@ -39,6 +39,16 @@ EXPORT_SYMBOL(xfrm_policy_count);
 static DEFINE_RWLOCK(xfrm_policy_afinfo_lock);
 static struct xfrm_policy_afinfo *xfrm_policy_afinfo[NPROTO];
 
+void bugcheck(char *where, int line)
+{
+	int i;
+	for (i = 0; i < NPROTO; i++)
+		if (xfrm_policy_afinfo[i] == (void *)-1UL) {
+			printk("afinfo corrupted at %s:%d\n",where,line);
+			return;
+		}
+}
+
 static kmem_cache_t *xfrm_dst_cache __read_mostly;
 
 static struct work_struct xfrm_policy_gc_work;

* Re: 2.6.18-mm2 boot failure on x86-64
  2006-10-05 18:27                             ` Andi Kleen
@ 2006-10-05 18:51                               ` Steve Fox
  2006-10-05 19:05                                 ` Andi Kleen
  2006-10-05 18:52                               ` Vivek Goyal
  1 sibling, 1 reply; 140+ messages in thread
From: Steve Fox @ 2006-10-05 18:51 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Badari Pulavarty, Martin Bligh, vgoyal, Andrew Morton, lkml,
	netdev, kmannth, Andy Whitcroft

On Thu, 2006-10-05 at 20:27 +0200, Andi Kleen wrote:

> I guess we need to track when it gets corrupted. Can you send the full
> boot log with this patch applied?

Here she blows!

root (hd0,0)
 Filesystem type is reiserfs, partition type 0x83
kernel /boot/vmlinuz-autobench root=/dev/sda1 vga=791 ip=9.47.67.239:9.47.67.50:9.47.67.1:255.255.255.0 resume=/dev/sdb1 showopts console=tty0 console=ttyS0,57600 autobench_args: root=/dev/sda1 ABAT:1160073474
   [Linux-bzImage, setup=0x1400, size=0x1dd755]
initrd /boot/initrd-autobench.img
   [Linux-initrd @ 0x37ceb000, 0x304c57 bytes]

Linux version 2.6.18-git22 (root@elm3b239) (gcc version 4.1.0 (SUSE Linux)) #4 SMP Thu Oct 5 11:36:21 PDT 2006
Command line: root=/dev/sda1 vga=791 ip=9.47.67.239:9.47.67.50:9.47.67.1:255.255.255.0 resume=/dev/sdb1 showopts console=tty0 console=ttyS0,57600 autobench_args: root=/dev/sda1 ABAT:1160073474
BIOS-provided physical RAM map:
 BIOS-e820: 0000000000000000 - 000000000009ac00 (usable)
 BIOS-e820: 000000000009ac00 - 00000000000a0000 (reserved)
 BIOS-e820: 00000000000e0000 - 0000000000100000 (reserved)
 BIOS-e820: 0000000000100000 - 00000000bff764c0 (usable)
 BIOS-e820: 00000000bff764c0 - 00000000bff98880 (ACPI data)
 BIOS-e820: 00000000bff98880 - 00000000c0000000 (reserved)
 BIOS-e820: 00000000fec00000 - 0000000100000000 (reserved)
 BIOS-e820: 0000000100000000 - 0000000c00000000 (usable)
end_pfn_map = 12582912
DMI 2.3 present.
Zone PFN ranges:
  DMA             0 ->     4096
  DMA32        4096 ->  1048576
  Normal    1048576 -> 12582912
early_node_map[3] active PFN ranges
    0:        0 ->      154
    0:      256 ->   786294
    0:  1048576 -> 12582912
ACPI: PM-Timer IO Port: 0x9c
ACPI: LAPIC (acpi_id[0x00] lapic_id[0x00] enabled)
Processor #0 (Bootup-CPU)
ACPI: LAPIC (acpi_id[0x01] lapic_id[0x01] enabled)
Processor #1
ACPI: LAPIC (acpi_id[0x02] lapic_id[0x06] enabled)
Processor #6
ACPI: LAPIC (acpi_id[0x03] lapic_id[0x07] enabled)
Processor #7
ACPI: LAPIC (acpi_id[0x04] lapic_id[0x10] enabled)
Processor #16
ACPI: LAPIC (acpi_id[0x05] lapic_id[0x11] enabled)
Processor #17
ACPI: LAPIC (acpi_id[0x06] lapic_id[0x16] enabled)
Processor #22
ACPI: LAPIC (acpi_id[0x07] lapic_id[0x17] enabled)
Processor #23
ACPI: LAPIC (acpi_id[0x10] lapic_id[0x20] enabled)
Processor #32
ACPI: LAPIC (acpi_id[0x11] lapic_id[0x21] enabled)
Processor #33
ACPI: LAPIC (acpi_id[0x12] lapic_id[0x26] enabled)
Processor #38
ACPI: LAPIC (acpi_id[0x13] lapic_id[0x27] enabled)
Processor #39
ACPI: LAPIC (acpi_id[0x14] lapic_id[0x30] enabled)
Processor #48
ACPI: LAPIC (acpi_id[0x15] lapic_id[0x31] enabled)
Processor #49
ACPI: LAPIC (acpi_id[0x16] lapic_id[0x36] enabled)
Processor #54
ACPI: LAPIC (acpi_id[0x17] lapic_id[0x37] enabled)
Processor #55
ACPI: LAPIC (acpi_id[0x20] lapic_id[0x40] enabled)
Processor #64
WARNING: NR_CPUS limit of 16 reached. Processor ignored.
ACPI: LAPIC (acpi_id[0x21] lapic_id[0x41] enabled)
Processor #65
WARNING: NR_CPUS limit of 16 reached. Processor ignored.
ACPI: LAPIC (acpi_id[0x22] lapic_id[0x46] enabled)
Processor #70
WARNING: NR_CPUS limit of 16 reached. Processor ignored.
ACPI: LAPIC (acpi_id[0x23] lapic_id[0x47] enabled)
Processor #71
WARNING: NR_CPUS limit of 16 reached. Processor ignored.
ACPI: LAPIC (acpi_id[0x24] lapic_id[0x50] enabled)
Processor #80
WARNING: NR_CPUS limit of 16 reached. Processor ignored.
ACPI: LAPIC (acpi_id[0x25] lapic_id[0x51] enabled)
Processor #81
WARNING: NR_CPUS limit of 16 reached. Processor ignored.
ACPI: LAPIC (acpi_id[0x26] lapic_id[0x56] enabled)
Processor #86
WARNING: NR_CPUS limit of 16 reached. Processor ignored.
ACPI: LAPIC (acpi_id[0x27] lapic_id[0x57] enabled)
Processor #87
WARNING: NR_CPUS limit of 16 reached. Processor ignored.
ACPI: LAPIC (acpi_id[0x30] lapic_id[0x60] enabled)
Processor #96
WARNING: NR_CPUS limit of 16 reached. Processor ignored.
ACPI: LAPIC (acpi_id[0x31] lapic_id[0x61] enabled)
Processor #97
WARNING: NR_CPUS limit of 16 reached. Processor ignored.
ACPI: LAPIC (acpi_id[0x32] lapic_id[0x66] enabled)
Processor #102
WARNING: NR_CPUS limit of 16 reached. Processor ignored.
ACPI: LAPIC (acpi_id[0x33] lapic_id[0x67] enabled)
Processor #103
WARNING: NR_CPUS limit of 16 reached. Processor ignored.
ACPI: LAPIC (acpi_id[0x34] lapic_id[0x70] enabled)
Processor #112
WARNING: NR_CPUS limit of 16 reached. Processor ignored.
ACPI: LAPIC (acpi_id[0x35] lapic_id[0x71] enabled)
Processor #113
WARNING: NR_CPUS limit of 16 reached. Processor ignored.
ACPI: LAPIC (acpi_id[0x36] lapic_id[0x76] enabled)
Processor #118
WARNING: NR_CPUS limit of 16 reached. Processor ignored.
ACPI: LAPIC (acpi_id[0x37] lapic_id[0x77] enabled)
Processor #119
WARNING: NR_CPUS limit of 16 reached. Processor ignored.
ACPI: LAPIC_NMI (acpi_id[0x00] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x01] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x02] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x03] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x04] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x05] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x06] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x07] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x10] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x11] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x12] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x13] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x14] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x15] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x16] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x17] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x20] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x21] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x22] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x23] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x24] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x25] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x26] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x27] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x30] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x31] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x32] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x33] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x34] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x35] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x36] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x37] dfl dfl lint[0x1])
ACPI: IOAPIC (id[0x0f] address[0xfec00000] gsi_base[0])
IOAPIC[0]: apic_id 15, address 0xfec00000, GSI 0-35
ACPI: IOAPIC (id[0x0e] address[0xfec01000] gsi_base[36])
IOAPIC[1]: apic_id 14, address 0xfec01000, GSI 36-71
ACPI: IOAPIC (id[0x0d] address[0xfec02000] gsi_base[72])
IOAPIC[2]: apic_id 13, address 0xfec02000, GSI 72-107
ACPI: IOAPIC (id[0x0c] address[0xfec03000] gsi_base[108])
IOAPIC[3]: apic_id 12, address 0xfec03000, GSI 108-143
ACPI: IOAPIC (id[0x0b] address[0xfec04000] gsi_base[144])
IOAPIC[4]: apic_id 11, address 0xfec04000, GSI 144-179
ACPI: IOAPIC (id[0x0a] address[0xfec05000] gsi_base[180])
IOAPIC[5]: apic_id 10, address 0xfec05000, GSI 180-215
ACPI: IOAPIC (id[0x09] address[0xfec06000] gsi_base[216])
IOAPIC[6]: apic_id 9, address 0xfec06000, GSI 216-251
ACPI: IOAPIC (id[0x08] address[0xfec07000] gsi_base[252])
IOAPIC[7]: apic_id 8, address 0xfec07000, GSI 252-287
ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
ACPI: INT_SRC_OVR (bus 0 bus_irq 8 global_irq 8 low edge)
ACPI: INT_SRC_OVR (bus 0 bus_irq 14 global_irq 14 low edge)
Setting APIC routing to clustered
ACPI: HPET id: 0x10142201 base: 0xfde84000
Using ACPI (MADT) for SMP configuration information
Nosave address range: 000000000009a000 - 000000000009b000
Nosave address range: 000000000009b000 - 00000000000a0000
Nosave address range: 00000000000a0000 - 00000000000e0000
Nosave address range: 00000000000e0000 - 0000000000100000
Nosave address range: 00000000bff76000 - 00000000bff77000
Nosave address range: 00000000bff77000 - 00000000bff98000
Nosave address range: 00000000bff98000 - 00000000bff99000
Nosave address range: 00000000bff99000 - 00000000c0000000
Nosave address range: 00000000c0000000 - 00000000fec00000
Nosave address range: 00000000fec00000 - 0000000100000000
Allocating PCI resources starting at c4000000 (gap: c0000000:3ec00000)
afinfo corrupted at init/main.c:512
SMP: Allowing 16 CPUs, 0 hotplug CPUs
PERCPU: Allocating 33920 bytes of per cpu data
afinfo corrupted at init/main.c:527
Built 1 zonelists.  Total pages: 12147064
Kernel command line: root=/dev/sda1 vga=791 ip=9.47.67.239:9.47.67.50:9.47.67.1:255.255.255.0 resume=/dev/sdb1 showopts console=tty0 console=ttyS0,57600 autobench_args: root=/dev/sda1 ABAT:1160073474
afinfo corrupted at init/main.c:536
Initializing CPU#0
PID hash table entries: 4096 (order: 12, 32768 bytes)
afinfo corrupted at init/main.c:545
afinfo corrupted at init/main.c:548
Console: colour VGA+ 80x25
Dentry cache hash table entries: 8388608 (order: 14, 67108864 bytes)
Inode-cache hash table entries: 4194304 (order: 13, 33554432 bytes)
afinfo corrupted at init/main.c:582
Checking aperture...
PCI-DMA: Using software bounce buffering for IO (SWIOTLB)
Placing software IO TLB between 0x310c2000 - 0x350c2000
Memory: 48422908k/50331648k available (2566k kernel code, 858868k reserved, 1345k data, 184k init)
afinfo corrupted at init/main.c:584
Calibrating delay using timer specific routine.. 5677.94 BogoMIPS (lpj=11355895)
afinfo corrupted at init/main.c:593
afinfo corrupted at init/main.c:603
Mount-cache hash table entries: 256
afinfo corrupted at init/main.c:610
afinfo corrupted at init/main.c:618
CPU: Trace cache: 12K uops, L1 D cache: 16K
CPU: L2 cache: 1024K
CPU: L3 cache: 4096K
using mwait in idle threads.
CPU: Physical Processor ID: 0
CPU: Processor Core ID: 0
CPU0: Thermal monitoring enabled (TM1)
SMP alternatives: switching to UP code
ACPI: Core revision 20060707
..MP-BIOS bug: 8254 timer not connected to IO-APIC
Using local APIC timer interrupts.
result 10425802
Detected 10.425 MHz APIC timer.
afinfo corrupted at init/main.c:749
SMP alternatives: switching to SMP code
Booting processor 1/16 APIC 0x1
Initializing CPU#1
Calibrating delay using timer specific routine.. 5671.84 BogoMIPS (lpj=11343680)
CPU: Trace cache: 12K uops, L1 D cache: 16K
CPU: L2 cache: 1024K
CPU: L3 cache: 4096K
CPU: Physical Processor ID: 0
CPU: Processor Core ID: 0
CPU1: Thermal monitoring enabled (TM1)
               Intel(R) Xeon(TM) MP CPU 2.83GHz stepping 01
CPU 1: Syncing TSC to CPU 0.
CPU 1: synchronized TSC with CPU 0 (last diff -4 cycles, maxerr 799 cycles)
SMP alternatives: switching to SMP code
Booting processor 2/16 APIC 0x6
Initializing CPU#2
Calibrating delay using timer specific routine.. 5671.99 BogoMIPS (lpj=11343984)
CPU: Trace cache: 12K uops, L1 D cache: 16K
CPU: L2 cache: 1024K
CPU: L3 cache: 4096K
CPU: Physical Processor ID: 3
CPU: Processor Core ID: 0
CPU2: Thermal monitoring enabled (TM1)
               Intel(R) Xeon(TM) MP CPU 2.83GHz stepping 01
CPU 2: Syncing TSC to CPU 0.
CPU 2: synchronized TSC with CPU 0 (last diff -13 cycles, maxerr 3341 cycles)
SMP alternatives: switching to SMP code
Booting processor 3/16 APIC 0x7
Initializing CPU#3
Calibrating delay using timer specific routine.. 5672.06 BogoMIPS (lpj=11344129)
CPU: Trace cache: 12K uops, L1 D cache: 16K
CPU: L2 cache: 1024K
CPU: L3 cache: 4096K
CPU: Physical Processor ID: 3
CPU: Processor Core ID: 0
CPU3: Thermal monitoring enabled (TM1)
               Intel(R) Xeon(TM) MP CPU 2.83GHz stepping 01
CPU 3: Syncing TSC to CPU 0.
CPU 3: synchronized TSC with CPU 0 (last diff 178 cycles, maxerr 3171 cycles)
SMP alternatives: switching to SMP code
Booting processor 4/16 APIC 0x10
Initializing CPU#4
Calibrating delay using timer specific routine.. 5672.04 BogoMIPS (lpj=11344087)
CPU: Trace cache: 12K uops, L1 D cache: 16K
CPU: L2 cache: 1024K
CPU: L3 cache: 4096K
CPU: Physical Processor ID: 8
CPU: Processor Core ID: 0
CPU4: Thermal monitoring enabled (TM1)
               Intel(R) Xeon(TM) MP CPU 2.83GHz stepping 01
CPU 4: Syncing TSC to CPU 0.
CPU 4: synchronized TSC with CPU 0 (last diff -420 cycles, maxerr 3510 cycles)
SMP alternatives: switching to SMP code
Booting processor 5/16 APIC 0x11
Initializing CPU#5
Calibrating delay using timer specific routine.. 5672.04 BogoMIPS (lpj=11344081)
CPU: Trace cache: 12K uops, L1 D cache: 16K
CPU: L2 cache: 1024K
CPU: L3 cache: 4096K
CPU: Physical Processor ID: 8
CPU: Processor Core ID: 0
CPU5: Thermal monitoring enabled (TM1)
               Intel(R) Xeon(TM) MP CPU 2.83GHz stepping 01
CPU 5: Syncing TSC to CPU 0.
CPU 5: synchronized TSC with CPU 0 (last diff -801 cycles, maxerr 3315 cycles)
SMP alternatives: switching to SMP code
Booting processor 6/16 APIC 0x16
Initializing CPU#6
Calibrating delay using timer specific routine.. 5672.02 BogoMIPS (lpj=11344046)
CPU: Trace cache: 12K uops, L1 D cache: 16K
CPU: L2 cache: 1024K
CPU: L3 cache: 4096K
CPU: Physical Processor ID: 11
CPU: Processor Core ID: 0
CPU6: Thermal monitoring enabled (TM1)
               Intel(R) Xeon(TM) MP CPU 2.83GHz stepping 01
CPU 6: Syncing TSC to CPU 0.
CPU 6: synchronized TSC with CPU 0 (last diff -287 cycles, maxerr 3281 cycles)
SMP alternatives: switching to SMP code
Booting processor 7/16 APIC 0x17
Initializing CPU#7
Calibrating delay using timer specific routine.. 5672.01 BogoMIPS (lpj=11344028)
CPU: Trace cache: 12K uops, L1 D cache: 16K
CPU: L2 cache: 1024K
CPU: L3 cache: 4096K
CPU: Physical Processor ID: 11
CPU: Processor Core ID: 0
CPU7: Thermal monitoring enabled (TM1)
               Intel(R) Xeon(TM) MP CPU 2.83GHz stepping 01
CPU 7: Syncing TSC to CPU 0.
CPU 7: synchronized TSC with CPU 0 (last diff 238 cycles, maxerr 3391 cycles)
SMP alternatives: switching to SMP code
Booting processor 8/16 APIC 0x20
Initializing CPU#8
Calibrating delay using timer specific routine.. 5672.42 BogoMIPS (lpj=11344847)
CPU: Trace cache: 12K uops, L1 D cache: 16K
CPU: L2 cache: 1024K
CPU: L3 cache: 4096K
CPU: Physical Processor ID: 16
CPU: Processor Core ID: 0
CPU8: Thermal monitoring enabled (TM1)
               Intel(R) Xeon(TM) MP CPU 2.83GHz stepping 01
CPU 8: Syncing TSC to CPU 0.
CPU 8: synchronized TSC with CPU 0 (last diff 101 cycles, maxerr 8577 cycles)
SMP alternatives: switching to SMP code
Booting processor 9/16 APIC 0x21
Initializing CPU#9
Calibrating delay using timer specific routine.. 5672.28 BogoMIPS (lpj=11344576)
CPU: Trace cache: 12K uops, L1 D cache: 16K
CPU: L2 cache: 1024K
CPU: L3 cache: 4096K
CPU: Physical Processor ID: 16
CPU: Processor Core ID: 0
CPU9: Thermal monitoring enabled (TM1)
               Intel(R) Xeon(TM) MP CPU 2.83GHz stepping 01
CPU 9: Syncing TSC to CPU 0.
CPU 9: synchronized TSC with CPU 0 (last diff 200 cycles, maxerr 8109 cycles)
SMP alternatives: switching to SMP code
Booting processor 10/16 APIC 0x26
Initializing CPU#10
Calibrating delay using timer specific routine.. 5672.50 BogoMIPS (lpj=11345012)
CPU: Trace cache: 12K uops, L1 D cache: 16K
CPU: L2 cache: 1024K
CPU: L3 cache: 4096K
CPU: Physical Processor ID: 19
CPU: Processor Core ID: 0
CPU10: Thermal monitoring enabled (TM1)
               Intel(R) Xeon(TM) MP CPU 2.83GHz stepping 01
CPU 10: Syncing TSC to CPU 0.
CPU 10: synchronized TSC with CPU 0 (last diff 72 cycles, maxerr 8551 cycles)
SMP alternatives: switching to SMP code
Booting processor 11/16 APIC 0x27
Initializing CPU#11
Calibrating delay using timer specific routine.. 5672.90 BogoMIPS (lpj=11345804)
CPU: Trace cache: 12K uops, L1 D cache: 16K
CPU: L2 cache: 1024K
CPU: L3 cache: 4096K
CPU: Physical Processor ID: 19
CPU: Processor Core ID: 0
CPU11: Thermal monitoring enabled (TM1)
               Intel(R) Xeon(TM) MP CPU 2.83GHz stepping 01
CPU 11: Syncing TSC to CPU 0.
CPU 11: synchronized TSC with CPU 0 (last diff -548 cycles, maxerr 8526 cycles)
SMP alternatives: switching to SMP code
Booting processor 12/16 APIC 0x30
Initializing CPU#12
Calibrating delay using timer specific routine.. 5672.75 BogoMIPS (lpj=11345516)
CPU: Trace cache: 12K uops, L1 D cache: 16K
CPU: L2 cache: 1024K
CPU: L3 cache: 4096K
CPU: Physical Processor ID: 24
CPU: Processor Core ID: 0
CPU12: Thermal monitoring enabled (TM1)
               Intel(R) Xeon(TM) MP CPU 2.83GHz stepping 01
CPU 12: Syncing TSC to CPU 0.
CPU 12: synchronized TSC with CPU 0 (last diff 35 cycles, maxerr 8636 cycles)
SMP alternatives: switching to SMP code
Booting processor 13/16 APIC 0x31
Initializing CPU#13
Calibrating delay using timer specific routine.. 5672.55 BogoMIPS (lpj=11345119)
CPU: Trace cache: 12K uops, L1 D cache: 16K
CPU: L2 cache: 1024K
CPU: L3 cache: 4096K
CPU: Physical Processor ID: 24
CPU: Processor Core ID: 0
CPU13: Thermal monitoring enabled (TM1)
               Intel(R) Xeon(TM) MP CPU 2.83GHz stepping 01
CPU 13: Syncing TSC to CPU 0.
CPU 13: synchronized TSC with CPU 0 (last diff -1125 cycles, maxerr 7829 cycles)
SMP alternatives: switching to SMP code
Booting processor 14/16 APIC 0x36
Initializing CPU#14
Calibrating delay using timer specific routine.. 5672.25 BogoMIPS (lpj=11344507)
CPU: Trace cache: 12K uops, L1 D cache: 16K
CPU: L2 cache: 1024K
CPU: L3 cache: 4096K
CPU: Physical Processor ID: 27
CPU: Processor Core ID: 0
CPU14: Thermal monitoring enabled (TM1)
               Intel(R) Xeon(TM) MP CPU 2.83GHz stepping 01
CPU 14: Syncing TSC to CPU 0.
CPU 14: synchronized TSC with CPU 0 (last diff -796 cycles, maxerr 8568 cycles)
SMP alternatives: switching to SMP code
Booting processor 15/16 APIC 0x37
Initializing CPU#15
Calibrating delay using timer specific routine.. 5672.24 BogoMIPS (lpj=11344495)
CPU: Trace cache: 12K uops, L1 D cache: 16K
CPU: L2 cache: 1024K
CPU: L3 cache: 4096K
CPU: Physical Processor ID: 27
CPU: Processor Core ID: 0
CPU15: Thermal monitoring enabled (TM1)
               Intel(R) Xeon(TM) MP CPU 2.83GHz stepping 01
CPU 15: Syncing TSC to CPU 0.
CPU 15: synchronized TSC with CPU 0 (last diff -3 cycles, maxerr 7531 cycles)
Brought up 16 CPUs
testing NMI watchdog ... OK.
time.c: Using 333.333333 MHz WALL PIT GTOD PIT/HPET timer.
time.c: Detected 2835.836 MHz processor.
afinfo corrupted at init/main.c:755
migration_cost=29,1007
afinfo corrupted at init/main.c:761
afinfo corrupted at init/main.c:769
Calling initcall 0xffffffff802166c0: init_smp_flush+0x0/0x60()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff806077b0: helper_init+0x0/0x40()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80607b40: pm_init+0x0/0x40()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80607bc0: ksysfs_init+0x0/0x40()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060a490: filelock_init+0x0/0x40()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060afa0: init_script_binfmt+0x0/0x10()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060afb0: init_elf_binfmt+0x0/0x10()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80614400: sock_init+0x0/0x60()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80614ba0: netlink_proto_init+0x0/0x1a0()
afinfo corrupted at init/main.c:659
NET: Registered protocol family 16
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060c080: kobject_uevent_init+0x0/0x40()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060c210: pcibus_class_init+0x0/0x10()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060c7e0: pci_driver_init+0x0/0x10()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060eca0: tty_class_init+0x0/0x30()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060f790: vtconsole_class_init+0x0/0xc0()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060c920: acpi_pci_init+0x0/0x40()
afinfo corrupted at init/main.c:659
ACPI: bus type pci registered
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060d65f: init_acpi_device_notify+0x0/0x4b()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80613810: pci_access_init+0x0/0x30()
afinfo corrupted at init/main.c:659
PCI: Using configuration type 1
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff806054d0: topology_init+0x0/0x40()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff806074e0: param_sysfs_init+0x0/0x200()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80249d00: pm_sysrq_init+0x0/0x20()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060ac50: init_bio+0x0/0x110()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060bf40: genhd_device_init+0x0/0x60()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060d472: acpi_init+0x0/0x1ed()
afinfo corrupted at init/main.c:659
ACPI: Interpreter enabled
ACPI: Using IOAPIC for interrupt routing
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060d945: acpi_ec_init+0x0/0x62()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060dd5e: acpi_pci_root_init+0x0/0x28()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060dda6: acpi_pci_link_init+0x0/0x48()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060df2c: acpi_power_init+0x0/0x77()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060dfa3: acpi_system_init+0x0/0xc6()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060e069: acpi_event_init+0x0/0x3f()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060e0a8: acpi_scan_init+0x0/0x1ac()
afinfo corrupted at init/main.c:659
ACPI: PCI Root Bridge [VP00] (0000:00)
PCI: Ignoring BAR0-3 of IDE controller 0000:00:0f.1
ACPI: PCI Root Bridge [VP01] (0000:01)
ACPI: PCI Root Bridge [VP02] (0000:02)
ACPI: PCI Root Bridge [VP03] (0000:04)
ACPI: PCI Root Bridge [VP04] (0000:06)
ACPI: PCI Root Bridge [VP05] (0000:08)
ACPI: PCI Root Bridge [VP06] (0000:0a)
ACPI: PCI Root Bridge [VP07] (0000:0c)
ACPI: PCI Root Bridge [VP10] (0000:0e)
ACPI: PCI Root Bridge [VP11] (0000:0f)
ACPI: PCI Root Bridge [VP12] (0000:10)
ACPI: PCI Root Bridge [VP13] (0000:12)
ACPI: PCI Root Bridge [VP14] (0000:14)
ACPI: PCI Root Bridge [VP15] (0000:16)
ACPI: PCI Root Bridge [VP16] (0000:18)
ACPI: PCI Root Bridge [VP17] (0000:1a)
ACPI: PCI Root Bridge [VP20] (0000:1c)
ACPI: PCI Root Bridge [VP21] (0000:1d)
ACPI: PCI Root Bridge [VP22] (0000:1e)
ACPI: PCI Root Bridge [VP23] (0000:20)
ACPI: PCI Root Bridge [VP24] (0000:22)
ACPI: PCI Root Bridge [VP25] (0000:24)
ACPI: PCI Root Bridge [VP26] (0000:26)
ACPI: PCI Root Bridge [VP27] (0000:28)
ACPI: PCI Root Bridge [VP30] (0000:2a)
ACPI: PCI Root Bridge [VP31] (0000:2b)
ACPI: PCI Root Bridge [VP32] (0000:2c)
ACPI: PCI Root Bridge [VP33] (0000:2e)
ACPI: PCI Root Bridge [VP34] (0000:30)
ACPI: PCI Root Bridge [VP35] (0000:32)
ACPI: PCI Root Bridge [VP36] (0000:34)
ACPI: PCI Root Bridge [VP37] (0000:36)
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060e3c4: acpi_cm_sbs_init+0x0/0xc()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060e3d0: pnp_init+0x0/0x30()
afinfo corrupted at init/main.c:659
Linux Plug and Play Support v0.97 (c) Adam Belay
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060e660: pnpacpi_init+0x0/0x70()
afinfo corrupted at init/main.c:659
pnp: PnP ACPI init
pnp: PnP ACPI: found 47 devices
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060f200: misc_init+0x0/0x90()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80375670: cn_init+0x0/0xe0()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80611560: init_scsi+0x0/0x90()
afinfo corrupted at init/main.c:659
SCSI subsystem initialized
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80612240: serio_init+0x0/0xd0()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80612660: input_init+0x0/0x120()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80612a70: rtc_init+0x0/0x50()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80612ac0: rtc_sysfs_init+0x0/0x10()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80612ad0: rtc_proc_init+0x0/0x10()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80612ae0: rtc_dev_init+0x0/0xb0()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80613840: pci_acpi_init+0x0/0xb0()
afinfo corrupted at init/main.c:659
PCI: Using ACPI for IRQ routing
PCI: If a device doesn't work, try "pci=routeirq".  If it helps, post a report
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff806138f0: pci_legacy_init+0x0/0x120()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80613ea0: pcibios_irq_init+0x0/0x4f0()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80614390: pcibios_init+0x0/0x70()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff806144c0: proto_init+0x0/0x40()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80614660: net_dev_init+0x0/0x210()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80614d40: genl_init+0x0/0xb0()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff805fdfc0: late_hpet_init+0x0/0xb0()
afinfo corrupted at init/main.c:659
hpet0: at MMIO 0xfde84000, IRQs 2, 8, 0
hpet0: 3 64-bit timers, 3707069 Hz
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff805ffe20: pci_iommu_init+0x0/0x20()
afinfo corrupted at init/main.c:659
PCI-GART: No AMD northbridge found.
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060a410: init_pipe_fs+0x0/0x50()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060e294: acpi_motherboard_init+0x0/0x130()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060e500: pnp_system_init+0x0/0x10()
afinfo corrupted at init/main.c:659
pnp: 00:0a: ioport range 0x400-0x47f has been reserved
pnp: 00:0a: ioport range 0x480-0x4ff could not be reserved
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060e9e0: chr_dev_init+0x0/0x80()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff806107b0: firmware_class_init+0x0/0x80()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80613220: pcibios_assign_resources+0x0/0x90()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80615750: inet_init+0x0/0x400()
afinfo corrupted at init/main.c:659
NET: Registered protocol family 2
IP route cache hash table entries: 524288 (order: 10, 4194304 bytes)
TCP established hash table entries: 262144 (order: 10, 4194304 bytes)
TCP bind hash table entries: 65536 (order: 8, 1048576 bytes)
TCP: Hash tables configured (established 262144 bind 65536)
TCP reno registered
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8020db10: time_init_device+0x0/0x30()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff805fe760: i8259A_init_sysfs+0x0/0x30()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff805fe730: init_timer_sysfs+0x0/0x30()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff805fed80: vsyscall_init+0x0/0xb0()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff805ff010: sbf_init+0x0/0xe0()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff805ffdf0: i8237A_init_sysfs+0x0/0x30()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80600270: periodic_mcheck_init+0x0/0x30()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff806002a0: mce_init_device+0x0/0x80()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff806003e0: thermal_throttle_init_device+0x0/0x70()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80600450: threshold_init_device+0x0/0x50()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80601c50: init_lapic_sysfs+0x0/0x40()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff806027f0: ioapic_init_sysfs+0x0/0xf0()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8021d1f0: cache_sysfs_init+0x0/0x60()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff806055e0: x8664_sysctl_init+0x0/0x20()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80606aa0: create_proc_profile+0x0/0x280()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80606ee0: ioresources_init+0x0/0x50()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80607050: timekeeping_init_device+0x0/0x30()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80607170: uid_cache_init+0x0/0x90()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff806076e0: init_posix_timers+0x0/0xd0()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff806077f0: init_posix_cpu_timers+0x0/0xf0()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80607910: latency_init+0x0/0x30()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80607a00: init_clocksource_sysfs+0x0/0x60()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80607a60: init_jiffies_clocksource+0x0/0x10()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80607a70: init+0x0/0x70()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80607ae0: proc_dma_init+0x0/0x30()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80245840: percpu_modinit+0x0/0x80()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80607b10: kallsyms_init+0x0/0x30()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80607b80: ikconfig_init+0x0/0x40()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80608cd0: init_per_zone_pages_min+0x0/0x60()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80609c40: pdflush_init+0x0/0x20()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80609c90: kswapd_init+0x0/0x30()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80609cc0: setup_vmstat+0x0/0x20()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80609d30: procswaps_init+0x0/0x30()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80609da0: hugetlb_init+0x0/0x70()
afinfo corrupted at init/main.c:659
Total HugeTLB memory allocated, 0
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80609e10: init_tmpfs+0x0/0xe0()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80609ef0: cpucache_init+0x0/0x40()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060a460: fasync_init+0x0/0x30()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060ab70: aio_setup+0x0/0x70()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060adf0: inotify_setup+0x0/0x10()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060ae00: inotify_user_setup+0x0/0xc0()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060aec0: eventpoll_init+0x0/0xe0()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060afc0: init_mbcache+0x0/0x30()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060aff0: dnotify_init+0x0/0x30()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060b4b0: init_devpts_fs+0x0/0x40()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060b4f0: init_reiserfs_fs+0x0/0x80()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060b570: init_ext3_fs+0x0/0x70()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060b6a0: journal_init+0x0/0xe0()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060b780: init_ext2_fs+0x0/0x70()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060b840: init_ramfs_fs+0x0/0x10()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060b850: init_hugetlbfs_fs+0x0/0x80()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060b910: init_fat_fs+0x0/0x50()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060b960: init_vfat_fs+0x0/0x10()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060b970: init_nls_cp437+0x0/0x10()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060b980: init_nls_iso8859_1+0x0/0x10()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060b990: init_autofs_fs+0x0/0x10()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060b9a0: init_autofs4_fs+0x0/0x10()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
initcall at 0xffffffff8060b9a0: init_autofs4_fs+0x0/0x10(): returned with error code -16
Calling initcall 0xffffffff8060b9b0: ipc_init+0x0/0x20()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060bc80: init_mqueue_fs+0x0/0xe0()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060bd60: crypto_algapi_init+0x0/0x10()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060bda0: init+0x0/0x10()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060bdb0: init+0x0/0x10()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060bfa0: noop_init+0x0/0x10()
afinfo corrupted at init/main.c:659
io scheduler noop registered
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060bfb0: as_init+0x0/0x10()
afinfo corrupted at init/main.c:659
io scheduler anticipatory registered (default)
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060bfc0: deadline_init+0x0/0x10()
afinfo corrupted at init/main.c:659
io scheduler deadline registered
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060bfd0: cfq_init+0x0/0xb0()
afinfo corrupted at init/main.c:659
io scheduler cfq registered
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8032c1d0: pci_init+0x0/0x30()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060c7f0: pci_sysfs_init+0x0/0x40()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060c830: pci_proc_init+0x0/0x70()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060d6aa: acpi_ac_init+0x0/0x45()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060d6ef: acpi_battery_init+0x0/0x45()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060dd00: acpi_video_init+0x0/0x5e()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060ddee: irqrouter_init_sysfs+0x0/0x38()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060ea80: rand_initialize+0x0/0x30()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060eab0: tty_init+0x0/0x1f0()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060ed10: pty_init+0x0/0x4f0()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060f850: hpet_init+0x0/0x70()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060f8c0: agp_init+0x0/0x30()
afinfo corrupted at init/main.c:659
Linux agpgart interface v0.101 (c) Dave Jones
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060fa20: cn_proc_init+0x0/0x40()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060fe60: serial8250_init+0x0/0x150()
afinfo corrupted at init/main.c:659
Serial: 8250/16550 driver $Revision: 1.90 $ 4 ports, IRQ sharing disabled
serial8250: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
serial8250: ttyS1 at I/O 0x2f8 (irq = 3) is a 16550A
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80610090: serial8250_pnp_init+0x0/0x10()
afinfo corrupted at init/main.c:659
00:03: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
00:04: ttyS1 at I/O 0x2f8 (irq = 3) is a 16550A
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff806100a0: serial8250_pci_init+0x0/0x10()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80384c90: topology_sysfs_init+0x0/0x50()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80610830: e1000_init_module+0x0/0x50()
afinfo corrupted at init/main.c:659
Intel(R) PRO/1000 Network Driver - version 7.2.9-k2
Copyright (c) 1999-2006 Intel Corporation.
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80610880: tg3_init+0x0/0x10()
afinfo corrupted at init/main.c:659
tg3.c:v3.66 (September 23, 2006)
ACPI: PCI Interrupt 0000:01:01.0[A] -> GSI 24 (level, low) -> IRQ 24
eth0: Tigon3 [partno(BCM95704A6) rev 2100 PHY(5704)] (PCIX:66MHz:64-bit) 10/100/1000BaseT Ethernet 00:0d:60:98:63:54
eth0: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[1] Split[0] WireSpeed[1] TSOcap[0]
eth0: dma_rwctrl[769f0000] dma_mask[64-bit]
ACPI: PCI Interrupt 0000:01:01.1[B] -> GSI 28 (level, low) -> IRQ 28
eth1: Tigon3 [partno(BCM95704A6) rev 2100 PHY(5704)] (PCIX:66MHz:64-bit) 10/100/1000BaseT Ethernet 00:0d:60:98:63:55
eth1: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[0] Split[0] WireSpeed[1] TSOcap[1]
eth1: dma_rwctrl[769f0000] dma_mask[64-bit]
ACPI: PCI Interrupt 0000:0f:01.0[A] -> GSI 96 (level, low) -> IRQ 96
eth2: Tigon3 [partno(BCM95704A6) rev 2100 PHY(5704)] (PCIX:66MHz:64-bit) 10/100/1000BaseT Ethernet 00:14:5e:1c:45:0c
eth2: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[1] Split[0] WireSpeed[1] TSOcap[0]
eth2: dma_rwctrl[769f0000] dma_mask[64-bit]
ACPI: PCI Interrupt 0000:0f:01.1[B] -> GSI 100 (level, low) -> IRQ 100
eth3: Tigon3 [partno(BCM95704A6) rev 2100 PHY(5704)] (PCIX:66MHz:64-bit) 10/100/1000BaseT Ethernet 00:14:5e:1c:45:0d
eth3: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[0] Split[0] WireSpeed[1] TSOcap[1]
eth3: dma_rwctrl[769f0000] dma_mask[64-bit]
ACPI: PCI Interrupt 0000:1d:01.0[A] -> GSI 168 (level, low) -> IRQ 168
eth4: Tigon3 [partno(BCM95704A6) rev 2100 PHY(5704)] (PCIX:66MHz:64-bit) 10/100/1000BaseT Ethernet 00:14:5e:1c:45:6c
eth4: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[1] Split[0] WireSpeed[1] TSOcap[0]
eth4: dma_rwctrl[769f0000] dma_mask[64-bit]
ACPI: PCI Interrupt 0000:1d:01.1[B] -> GSI 172 (level, low) -> IRQ 172
eth5: Tigon3 [partno(BCM95704A6) rev 2100 PHY(5704)] (PCIX:66MHz:64-bit) 10/100/1000BaseT Ethernet 00:14:5e:1c:45:6d
eth5: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[0] Split[0] WireSpeed[1] TSOcap[1]
eth5: dma_rwctrl[769f0000] dma_mask[64-bit]
ACPI: PCI Interrupt 0000:2b:01.0[A] -> GSI 240 (level, low) -> IRQ 240
eth6: Tigon3 [partno(BCM95704A6) rev 2100 PHY(5704)] (PCIX:66MHz:64-bit) 10/100/1000BaseT Ethernet 00:14:5e:1c:43:82
eth6: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[1] Split[0] WireSpeed[1] TSOcap[0]
eth6: dma_rwctrl[769f0000] dma_mask[64-bit]
ACPI: PCI Interrupt 0000:2b:01.1[B] -> GSI 244 (level, low) -> IRQ 244
eth7: Tigon3 [partno(BCM95704A6) rev 2100 PHY(5704)] (PCIX:66MHz:64-bit) 10/100/1000BaseT Ethernet 00:14:5e:1c:43:83
eth7: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[0] Split[0] WireSpeed[1] TSOcap[1]
eth7: dma_rwctrl[769f0000] dma_mask[64-bit]
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80610910: net_olddevs_init+0x0/0xc0()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff803a8630: init_netconsole+0x0/0x80()
afinfo corrupted at init/main.c:659
netconsole: not configured, aborting
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff803a8710: cmd64x_ide_init+0x0/0x10()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff806109e0: piix_ide_init+0x0/0xd0()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff803aa810: svwks_ide_init+0x0/0x10()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff803ab480: generic_ide_init+0x0/0x10()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80610b20: ide_init+0x0/0x90()
afinfo corrupted at init/main.c:659
Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2
ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
SvrWks CSB6: IDE controller at PCI slot 0000:00:0f.1
SvrWks CSB6: chipset revision 160
SvrWks CSB6: not 100% native mode: will probe irqs later
    ide0: BM-DMA at 0x0700-0x0707, BIOS settings: hda:DMA, hdb:DMA
SvrWks CSB6: simplex device: DMA disabled
ide1: SvrWks CSB6 Bus-Master DMA disabled (BIOS)
hda: MATSHITADVD-ROM SR-8178, ATAPI CD/DVD-ROM drive
ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff806114f0: ide_generic_init+0x0/0x20()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80611510: idedisk_init+0x0/0x10()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80611520: ide_cdrom_init+0x0/0x10()
afinfo corrupted at init/main.c:659
hda: ATAPI 24X DVD-ROM drive, 256kB Cache, UDMA(66)
Uniform CD-ROM driver Revision: 3.20
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80611530: idefloppy_init+0x0/0x30()
afinfo corrupted at init/main.c:659
ide-floppy driver 0.99.newide
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80611800: raid_init+0x0/0x10()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80611810: spi_transport_init+0x0/0x40()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80611850: fc_transport_init+0x0/0x50()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff806118a0: iscsi_transport_init+0x0/0x120()
afinfo corrupted at init/main.c:659
Loading iSCSI transport class v2.0-685.afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff806119c0: sas_transport_init+0x0/0xc0()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80611a80: iscsi_tcp_init+0x0/0x50()
afinfo corrupted at init/main.c:659
iscsi: registered transport (tcp)
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80611ad0: aac_init+0x0/0x70()
afinfo corrupted at init/main.c:659
Adaptec aacraid driver (1.1-5[2409]-mh2)
ACPI: PCI Interrupt 0000:01:02.0[A] -> GSI 25 (level, low) -> IRQ 25
AAC0: kernel 5.0-2[8264]
AAC0: monitor 5.0-2[8264]
AAC0: bios 5.0-2[8264]
AAC0: serial 162348
AAC0: 64bit support enabled.
AAC0: 64 Bit DAC enabled
scsi0 : ServeRAID
scsi 0:0:0:0: Direct-Access     IBM      Drive 1          V1.0 PQ: 0 ANSI: 2
scsi 0:0:1:0: Direct-Access     IBM      Drive 2          V1.0 PQ: 0 ANSI: 2
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80611b40: qla1280_init+0x0/0x10()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80611d10: sym2_init+0x0/0x110()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80611e20: init_sd+0x0/0x60()
afinfo corrupted at init/main.c:659
SCSI device sda: 143132672 512-byte hdwr sectors (73284 MB)
sda: assuming Write Enabled
sda: assuming drive cache: write through
SCSI device sda: 143132672 512-byte hdwr sectors (73284 MB)
sda: assuming Write Enabled
sda: assuming drive cache: write through
 sda: sda1 sda2 sda3
sd 0:0:0:0: Attached scsi removable disk sda
SCSI device sdb: 143132672 512-byte hdwr sectors (73284 MB)
sdb: assuming Write Enabled
sdb: assuming drive cache: write through
SCSI device sdb: 143132672 512-byte hdwr sectors (73284 MB)
sdb: assuming Write Enabled
sdb: assuming drive cache: write through
 sdb: sdb1 sdb2 sdb3
sd 0:0:1:0: Attached scsi removable disk sdb
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80611e80: fusion_init+0x0/0x100()
afinfo corrupted at init/main.c:659
Fusion MPT base driver 3.04.01
Copyright (c) 1999-2005 LSI Logic Corporation
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80611f80: mptspi_init+0x0/0xc0()
afinfo corrupted at init/main.c:659
Fusion MPT SPI Host driver 3.04.01
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80612040: mptfc_init+0x0/0xf0()
afinfo corrupted at init/main.c:659
Fusion MPT FC Host driver 3.04.01
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80612130: mptctl_init+0x0/0x100()
afinfo corrupted at init/main.c:659
Fusion MPT misc device (ioctl) driver 3.04.01
mptctl: Registered with Fusion MPT base driver
mptctl: /dev/mptctl @ (major,minor=10,220)
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80612230: cdrom_init+0x0/0x10()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80612310: i8042_init+0x0/0x350()
afinfo corrupted at init/main.c:659
PNP: PS/2 Controller [PNP0303:PS2K,PNP0f13:PS2M] at 0x60,0x64 irq 1,12
serio: i8042 KBD port at 0x60,0x64 irq 1
serio: i8042 AUX port at 0x60,0x64 irq 12
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80612780: mousedev_init+0x0/0x100()
afinfo corrupted at init/main.c:659
mice: PS/2 mouse device common for all mice
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80612880: atkbd_init+0x0/0x20()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80612b90: hwmon_init+0x0/0x40()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff806149d0: flow_cache_init+0x0/0x1d0()
afinfo corrupted at init/main.c:659
input: AT Translated Set 2 keyboard as /class/input/input0
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80615e60: init_syncookies+0x0/0x20()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80615e80: xfrm4_beet_init+0x0/0x20()
afinfo corrupted at init/main.c:659
Unable to handle kernel NULL pointer dereference at 0000000000000827
RIP:
 [<ffffffff80470666>] xfrm_register_mode+0x36/0x60
PGD 0
Oops: 0000 [1] SMP
CPU 0
Modules linked in:
Pid: 1, comm: swapper Not tainted 2.6.18-git22 #4
RIP: 0010:[<ffffffff80470666>]  [<ffffffff80470666>] xfrm_register_mode+0x36/0x60
RSP: 0000:ffff810bffcbded0  EFLAGS: 00010286
RAX: 000000000000081f RBX: ffffffff805588a0 RCX: 0000000000100000
RDX: ffffffffffffffff RSI: 0000000000000002 RDI: ffffffff80559550
RBP: 00000000ffffffef R08: 0000000000000002 R09: fffffffffffffffd
R10: 0000000000000002 R11: 0000000000000000 R12: 0000000000000000
R13: ffff810bffcbdef0 R14: 0000000000000000 R15: 0000000000000000
FS:  0000000000000000(0000) GS:ffffffff805d2000(0000) knlGS:0000000000000000
CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 0000000000000827 CR3: 0000000000201000 CR4: 00000000000006e0
Process swapper (pid: 1, threadinfo ffff810bffcbc000, task ffff810bffcbb4e0)
Stack:  0000000000000000 0000000000000000 ffffffff8061fc48 ffffffff802071d6
 6f6320726f727265 000036312d206564 0000000000000000 0000000000000000
 0000000000000000 0000000000000000 0000000000000000 0000000000090000
Call Trace:
 [<ffffffff802071d6>] init+0x1b6/0x3b0
 [<ffffffff8020aa28>] child_rip+0xa/0x12
 [<ffffffff80339542>] acpi_ds_init_one_object+0x0/0x82
 [<ffffffff80207020>] init+0x0/0x3b0
 [<ffffffff8020aa1e>] child_rip+0x0/0x12


Code: 48 83 78 08 00 75 06 48 89 58 08 31 ed 48 89 d7 e8 e5 fe ff
RIP  [<ffffffff80470666>] xfrm_register_mode+0x36/0x60
 RSP <ffff810bffcbded0>
CR2: 0000000000000827
 <0>Kernel panic - not syncing: Aiee, killing interrupt handler!

-- 

Steve Fox
IBM Linux Technology Center

^ permalink raw reply	[flat|nested] 140+ messages in thread
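
Two details of the oops above can be cross-checked by hand, and the sketch below does so in C. It assumes that the Code: line starts at the faulting RIP (an assumption about how this kernel prints it), in which case the first bytes 48 83 78 08 00 would decode as cmpq $0x0,0x8(%rax). The function names are hypothetical helpers and are not part of the kernel. The second helper unpacks the two non-zero stack words as little-endian bytes, which read as text that looks like a leftover from the sprintf(msgbuf, "error code %d", result) call in do_initcalls().

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: compute the effective address of an instruction
 * encoded as REX.W 83 /7 ib with a disp8 operand based on RAX, that is
 * "cmpq $imm8, disp8(%rax)". Any other encoding is rejected.
 * Returns 0 on success and -1 if the bytes do not match this form.
 */
int cmpq_disp8_rax_address(const uint8_t *code, size_t len,
                           uint64_t rax, uint64_t *addr)
{
	if (len < 5 || code[0] != 0x48 || code[1] != 0x83)
		return -1;

	uint8_t modrm = code[2];
	uint8_t mod = modrm >> 6;        /* 1 means an 8-bit displacement */
	uint8_t reg = (modrm >> 3) & 7;  /* 7 selects CMP in opcode group 83 */
	uint8_t rm = modrm & 7;          /* 0 means the base register is RAX */

	if (mod != 1 || reg != 7 || rm != 0)
		return -1;

	/* The displacement is a signed byte added to the base register. */
	*addr = rax + (uint64_t)(int64_t)(int8_t)code[3];
	return 0;
}

/*
 * Hypothetical helper: unpack 64-bit words from an x86-64 stack dump
 * into bytes, lowest byte first, and NUL-terminate the result.
 * Returns the number of bytes written, not counting the terminator.
 */
size_t stack_words_to_text(const uint64_t *words, size_t nwords,
                           char *out, size_t outlen)
{
	size_t n = 0;

	if (outlen == 0)
		return 0;
	for (size_t w = 0; w < nwords; w++)
		for (int b = 0; b < 8 && n + 1 < outlen; b++)
			out[n++] = (char)((words[w] >> (8 * b)) & 0xff);
	out[n] = '\0';
	return n;
}
```

With RAX = 0x81f from the register dump, the first helper yields 0x827, which matches CR2. Feeding the words 6f6320726f727265 and 000036312d206564 to the second helper gives the string "error code -16".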

* Re: 2.6.18-mm2 boot failure on x86-64
  2006-10-05 18:27                             ` Andi Kleen
  2006-10-05 18:51                               ` Steve Fox
@ 2006-10-05 18:52                               ` Vivek Goyal
  2006-10-05 19:08                                 ` Andi Kleen
  1 sibling, 1 reply; 140+ messages in thread
From: Vivek Goyal @ 2006-10-05 18:52 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Steve Fox, Badari Pulavarty, Martin Bligh, Andrew Morton, lkml,
	netdev, kmannth, Andy Whitcroft

On Thu, Oct 05, 2006 at 08:27:02PM +0200, Andi Kleen wrote:
> On Thursday 05 October 2006 19:57, Steve Fox wrote:
> > On Thu, 2006-10-05 at 17:40 +0200, Andi Kleen wrote:
> > 
> > > Please don't snip the Code: line. It is fairly important.
> > 
> > Sorry about that. The remote console I was using appears to overwrite
> > some text after I force the reboot. Here's a clean one.
> > 
> > global ffffffffffffffff
> 
> Ok that definitely shouldn't be in there.
> 
> I guess we need to track when it gets corrupted. Can you send the full
> boot log with this patch applied?
> 

I just recalled one more observation about the problem from when Keith
last reported it. When I moved .bss before .data_nosave, instead of
leaving it at the end, Keith's problem disappeared.

Thanks
Vivek

^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: 2.6.18-mm2 boot failure on x86-64
  2006-10-05 18:51                               ` Steve Fox
@ 2006-10-05 19:05                                 ` Andi Kleen
  2006-10-05 20:42                                   ` Steve Fox
  0 siblings, 1 reply; 140+ messages in thread
From: Andi Kleen @ 2006-10-05 19:05 UTC (permalink / raw)
  To: Steve Fox
  Cc: Badari Pulavarty, Martin Bligh, vgoyal, Andrew Morton, lkml,
	netdev, kmannth, Andy Whitcroft

On Thursday 05 October 2006 20:51, Steve Fox wrote:
> On Thu, 2006-10-05 at 20:27 +0200, Andi Kleen wrote:
> 
> > I guess we need to track when it gets corrupted. Can you send the full
> > boot log with this patch applied?
> 
> Here she blows!

Can you please try it again with this patch to narrow it down further?

-Andi

Index: linux-2.6.19-rc1-hack/init/main.c
===================================================================
--- linux-2.6.19-rc1-hack.orig/init/main.c
+++ linux-2.6.19-rc1-hack/init/main.c
@@ -75,6 +75,9 @@
 
 static int init(void *);
 
+extern void bugcheck(char *, int);
+#define CHECK bugcheck(__FILE__, __LINE__)
+
 extern void init_IRQ(void);
 extern void fork_init(unsigned long);
 extern void mca_init(void);
@@ -480,6 +483,8 @@ asmlinkage void __init start_kernel(void
 	char * command_line;
 	extern struct kernel_param __start___param[], __stop___param[];
 
+	CHECK;
+
 	smp_setup_processor_id();
 
 	/*
@@ -502,7 +507,9 @@ asmlinkage void __init start_kernel(void
 	page_address_init();
 	printk(KERN_NOTICE);
 	printk(linux_banner);
+	CHECK;
 	setup_arch(&command_line);
+	CHECK;
 	setup_per_cpu_areas();
 	smp_prepare_boot_cpu();	/* arch-specific boot-cpu hooks */
 
@@ -517,6 +524,7 @@ asmlinkage void __init start_kernel(void
 	 * fragile until we cpu_idle() for the first time.
 	 */
 	preempt_disable();
+	CHECK;
 	build_all_zonelists();
 	page_alloc_init();
 	printk(KERN_NOTICE "Kernel command line: %s\n", saved_command_line);
@@ -525,6 +533,7 @@ asmlinkage void __init start_kernel(void
 		   __stop___param - __start___param,
 		   &unknown_bootoption);
 	sort_main_extable();
+	CHECK;
 	trap_init();
 	rcu_init();
 	init_IRQ();
@@ -533,8 +542,10 @@ asmlinkage void __init start_kernel(void
 	hrtimers_init();
 	softirq_init();
 	timekeeping_init();
+	CHECK;
 	time_init();
 	profile_init();
+	CHECK;
 	if (!irqs_disabled())
 		printk("start_kernel(): bug: interrupts were enabled early\n");
 	early_boot_irqs_on();
@@ -568,7 +579,9 @@ asmlinkage void __init start_kernel(void
 #endif
 	vfs_caches_init_early();
 	cpuset_init_early();
+	CHECK;
 	mem_init();
+	CHECK;
 	kmem_cache_init();
 	setup_per_cpu_pageset();
 	numa_policy_init();
@@ -577,6 +590,7 @@ asmlinkage void __init start_kernel(void
 	calibrate_delay();
 	pidmap_init();
 	pgtable_cache_init();
+	CHECK;
 	prio_tree_init();
 	anon_vma_init();
 #ifdef CONFIG_X86
@@ -586,12 +600,14 @@ asmlinkage void __init start_kernel(void
 	fork_init(num_physpages);
 	proc_caches_init();
 	buffer_init();
+	CHECK;
 	unnamed_dev_init();
 	key_init();
 	security_init();
 	vfs_caches_init(num_physpages);
 	radix_tree_init();
 	signals_init();
+	CHECK;
 	/* rootfs populating might need page-writeback */
 	page_writeback_init();
 #ifdef CONFIG_PROC_FS
@@ -599,6 +615,7 @@ asmlinkage void __init start_kernel(void
 #endif
 	cpuset_init();
 	taskstats_init_early();
+	CHECK;
 	delayacct_init();
 
 	check_bugs();
@@ -609,7 +626,7 @@ asmlinkage void __init start_kernel(void
 	rest_init();
 }
 
-static int __initdata initcall_debug;
+static int __initdata initcall_debug = 1;
 
 static int __init initcall_debug_setup(char *str)
 {
@@ -639,7 +656,11 @@ static void __init do_initcalls(void)
 			printk("\n");
 		}
 
+		CHECK;
+
 		result = (*call)();
+		
+		CHECK;
 
 		if (result && result != -ENODEV && initcall_debug) {
 			sprintf(msgbuf, "error code %d", result);
@@ -725,21 +746,32 @@ static int init(void * unused)
 
 	smp_prepare_cpus(max_cpus);
 
+	CHECK;
+
 	do_pre_smp_initcalls();
 
 	smp_init();
+
+	CHECK;
+
 	sched_init_smp();
 
 	cpuset_init_smp();
 
+	CHECK;
+
 	/*
 	 * Do this before initcalls, because some drivers want to access
 	 * firmware files.
 	 */
 	populate_rootfs();
 
+	CHECK;
+
 	do_basic_setup();
 
+	CHECK;
+
 	/*
 	 * check if there is an early userspace init.  If yes, let it do all
 	 * the work
Index: linux-2.6.19-rc1-hack/net/xfrm/xfrm_policy.c
===================================================================
--- linux-2.6.19-rc1-hack.orig/net/xfrm/xfrm_policy.c
+++ linux-2.6.19-rc1-hack/net/xfrm/xfrm_policy.c
@@ -39,6 +39,16 @@ EXPORT_SYMBOL(xfrm_policy_count);
 static DEFINE_RWLOCK(xfrm_policy_afinfo_lock);
 static struct xfrm_policy_afinfo *xfrm_policy_afinfo[NPROTO];
 
+void bugcheck(char *where, int line)
+{
+	int i;
+	for (i = 0; i < NPROTO; i++)
+		if (xfrm_policy_afinfo[i] == (void *)-1UL) {
+			panic("afinfo corrupted at %s:%d\n",where,line);
+			return;
+		}
+}
+
 static kmem_cache_t *xfrm_dst_cache __read_mostly;
 
 static struct work_struct xfrm_policy_gc_work;
Index: linux-2.6.19-rc1-hack/arch/x86_64/kernel/setup.c
===================================================================
--- linux-2.6.19-rc1-hack.orig/arch/x86_64/kernel/setup.c
+++ linux-2.6.19-rc1-hack/arch/x86_64/kernel/setup.c
@@ -65,6 +65,12 @@
 #include <asm/sections.h>
 #include <asm/dmi.h>
 
+
+
+extern void bugcheck(char *, int);
+#define CHECK bugcheck(__FILE__, __LINE__)
+
+
 /*
  * Machine setup..
  */
@@ -351,14 +357,22 @@ void __init setup_arch(char **cmdline_p)
 	saved_video_mode = SAVED_VIDEO_MODE;
 	bootloader_type = LOADER_TYPE;
 
+	CHECK;
+
 #ifdef CONFIG_BLK_DEV_RAM
 	rd_image_start = RAMDISK_FLAGS & RAMDISK_IMAGE_START_MASK;
 	rd_prompt = ((RAMDISK_FLAGS & RAMDISK_PROMPT_FLAG) != 0);
 	rd_doload = ((RAMDISK_FLAGS & RAMDISK_LOAD_FLAG) != 0);
 #endif
+
+	CHECK;
+
 	setup_memory_region();
+	CHECK;
 	copy_edd();
 
+	CHECK;
+
 	if (!MOUNT_ROOT_RDONLY)
 		root_mountflags &= ~MS_RDONLY;
 	init_mm.start_code = (unsigned long) &_text;
@@ -373,14 +387,25 @@ void __init setup_arch(char **cmdline_p)
 
 	early_identify_cpu(&boot_cpu_data);
 
+	CHECK;
+
+
 	strlcpy(command_line, saved_command_line, COMMAND_LINE_SIZE);
 	*cmdline_p = command_line;
 
+	CHECK;
+
+
 	parse_early_param();
 
+	CHECK;
+
 	finish_e820_parsing();
+	CHECK;
 
 	e820_register_active_regions(0, 0, -1UL);
+	CHECK;
+
 	/*
 	 * partially used pages are not usable - thus
 	 * we are rounding upwards:
@@ -389,14 +414,19 @@ void __init setup_arch(char **cmdline_p)
 	num_physpages = end_pfn;
 
 	check_efer();
+	CHECK;
 
 	discover_ebda();
+	CHECK;
 
 	init_memory_mapping(0, (end_pfn_map << PAGE_SHIFT));
+	CHECK;
 
 	dmi_scan_machine();
+	CHECK;
 
 	zap_low_mappings(0);
+	CHECK;
 
 #ifdef CONFIG_ACPI
 	/*
@@ -405,6 +435,7 @@ void __init setup_arch(char **cmdline_p)
 	 */
 	acpi_boot_table_init();
 #endif
+	CHECK;
 
 	/* How many end-of-memory variables you have, grandma! */
 	max_low_pfn = end_pfn;
@@ -413,6 +444,7 @@ void __init setup_arch(char **cmdline_p)
 
 	/* Remove active ranges so rediscovery with NUMA-awareness happens */
 	remove_all_active_ranges();
+	CHECK;
 
 #ifdef CONFIG_ACPI_NUMA
 	/*
@@ -420,20 +452,24 @@ void __init setup_arch(char **cmdline_p)
 	 */
 	acpi_numa_init();
 #endif
+	CHECK;
 
 #ifdef CONFIG_NUMA
 	numa_initmem_init(0, end_pfn); 
 #else
 	contig_initmem_init(0, end_pfn);
 #endif
+	CHECK;
 
 	/* Reserve direct mapping */
 	reserve_bootmem_generic(table_start << PAGE_SHIFT, 
 				(table_end - table_start) << PAGE_SHIFT);
+	CHECK;
 
 	/* reserve kernel */
 	reserve_bootmem_generic(__pa_symbol(&_text),
 				__pa_symbol(&_end) - __pa_symbol(&_text));
+	CHECK;
 
 	/*
 	 * reserve physical page 0 - it's a special BIOS page on many boxes,
@@ -444,6 +480,7 @@ void __init setup_arch(char **cmdline_p)
 	/* reserve ebda region */
 	if (ebda_addr)
 		reserve_bootmem_generic(ebda_addr, ebda_size);
+	CHECK;
 
 #ifdef CONFIG_SMP
 	/*
@@ -456,6 +493,7 @@ void __init setup_arch(char **cmdline_p)
 	/* Reserve SMP trampoline */
 	reserve_bootmem_generic(SMP_TRAMPOLINE_BASE, PAGE_SIZE);
 #endif
+	CHECK;
 
 #ifdef CONFIG_ACPI_SLEEP
        /*
@@ -463,10 +501,14 @@ void __init setup_arch(char **cmdline_p)
         */
        acpi_reserve_bootmem();
 #endif
+	CHECK;
+
 	/*
 	 * Find and reserve possible boot-time SMP configuration:
 	 */
 	find_smp_config();
+	CHECK;
+
 #ifdef CONFIG_BLK_DEV_INITRD
 	if (LOADER_TYPE && INITRD_START) {
 		if (INITRD_START + INITRD_SIZE <= (end_pfn << PAGE_SHIFT)) {
@@ -484,18 +526,23 @@ void __init setup_arch(char **cmdline_p)
 		}
 	}
 #endif
+	CHECK;
+
 #ifdef CONFIG_KEXEC
 	if (crashk_res.start != crashk_res.end) {
 		reserve_bootmem_generic(crashk_res.start,
 			crashk_res.end - crashk_res.start + 1);
 	}
 #endif
+	CHECK;
 
 	paging_init();
+	CHECK;
 
 #ifdef CONFIG_PCI
 	early_quirks();
 #endif
+	CHECK;
 
 	/*
 	 * set this early, so we dont allocate cpu0
@@ -509,25 +556,36 @@ void __init setup_arch(char **cmdline_p)
 	 */
 	acpi_boot_init();
 #endif
+	CHECK;
 
 	init_cpu_to_node();
+	CHECK;
 
 	/*
 	 * get boot-time SMP configuration:
 	 */
 	if (smp_found_config)
 		get_smp_config();
+	CHECK;
+
 	init_apic_mappings();
+	CHECK;
 
 	/*
 	 * Request address space for all standard RAM and ROM resources
 	 * and also for regions reported as reserved by the e820.
 	 */
 	probe_roms();
+	CHECK;
+
 	e820_reserve_resources(); 
+	CHECK;
+
 	e820_mark_nosave_regions();
+	CHECK;
 
 	request_resource(&iomem_resource, &video_ram_resource);
+	CHECK;
 
 	{
 	unsigned i;
@@ -535,8 +593,10 @@ void __init setup_arch(char **cmdline_p)
 	for (i = 0; i < ARRAY_SIZE(standard_io_resources); i++)
 		request_resource(&ioport_resource, &standard_io_resources[i]);
 	}
+	CHECK;
 
 	e820_setup_gap();
+	CHECK;
 
 #ifdef CONFIG_VT
 #if defined(CONFIG_VGA_CONSOLE)

^ permalink raw reply	[flat|nested] 140+ messages in thread
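
The patch above narrows down a stray write by calling a cheap consistency check between initialisation steps and reporting the first place where a known-bad value shows up. Below is a minimal userspace sketch of the same idea. It is not the kernel code: slots[], NSLOTS, the step functions and first_corrupting_step() are hypothetical stand-ins, and this bugcheck() returns a status instead of calling panic().

```c
#include <stddef.h>
#include <stdio.h>

#define NSLOTS 8                  /* hypothetical stand-in for NPROTO */
#define POISON ((void *)-1UL)     /* the all-ones value the patch looks for */

void *slots[NSLOTS];              /* stand-in for xfrm_policy_afinfo[] */

/* Report and return 1 if any slot holds the poison value, else 0. */
int bugcheck(const char *file, int line)
{
	for (size_t i = 0; i < NSLOTS; i++) {
		if (slots[i] == POISON) {
			fprintf(stderr, "slot %zu corrupted at %s:%d\n",
				i, file, line);
			return 1;
		}
	}
	return 0;
}

#define CHECK bugcheck(__FILE__, __LINE__)

typedef void (*step_fn)(void);

/*
 * Run the steps in order with a check after each one. Returns the index
 * of the first step after which the table is corrupted, or -1 if every
 * check passes.
 */
int first_corrupting_step(const step_fn *steps, size_t nsteps)
{
	for (size_t i = 0; i < nsteps; i++) {
		steps[i]();
		if (CHECK)
			return (int)i;
	}
	return -1;
}

/* Example steps; the middle one simulates the unknown stray store. */
static int dummy;
void step_setup(void)       { slots[0] = &dummy; }
void step_stray_write(void) { slots[3] = POISON; }
void step_late(void)        { slots[1] = &dummy; }
```

Running first_corrupting_step() over { step_setup, step_stray_write, step_late } returns 1, the index of the simulated stray write. The check only detects one specific bad value, so a clean run does not prove the table is intact.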

* Re: 2.6.18-mm2 boot failure on x86-64
  2006-10-05 18:52                               ` Vivek Goyal
@ 2006-10-05 19:08                                 ` Andi Kleen
  2006-10-05 20:25                                   ` Steve Fox
  2006-10-05 20:39                                   ` Mel Gorman
  0 siblings, 2 replies; 140+ messages in thread
From: Andi Kleen @ 2006-10-05 19:08 UTC (permalink / raw)
  To: vgoyal
  Cc: Steve Fox, Badari Pulavarty, Martin Bligh, Andrew Morton, lkml,
	netdev, kmannth, Andy Whitcroft, Mel Gorman

On Thursday 05 October 2006 20:52, Vivek Goyal wrote:
> On Thu, Oct 05, 2006 at 08:27:02PM +0200, Andi Kleen wrote:
> > On Thursday 05 October 2006 19:57, Steve Fox wrote:
> > > On Thu, 2006-10-05 at 17:40 +0200, Andi Kleen wrote:
> > > 
> > > > Please don't snip the Code: line. It is fairly important.
> > > 
> > > Sorry about that. The remote console I was using appears to overwrite
> > > some text after I force the reboot. Here's a clean one.
> > > 
> > > global ffffffffffffffff
> > 
> > Ok that definitely shouldn't be in there.
> > 
> > I guess we need to track when it gets corrupted. Can you send the full
> > boot log with this patch applied?
> > 
> 
> Just recalled one more observation about the problem when keith had
> reported it last. If I just move .bss before .data_nosave instead
> of it being at the end, keith's problem had disappeared.

Yes, it could well be something in the new bootmap
management.  Steve's box failed at

Using ACPI (MADT) for SMP configuration information
Nosave address range: 000000000009a000 - 000000000009b000
Nosave address range: 000000000009b000 - 00000000000a0000
Nosave address range: 00000000000a0000 - 00000000000e0000
Nosave address range: 00000000000e0000 - 0000000000100000
Nosave address range: 00000000bff76000 - 00000000bff77000
Nosave address range: 00000000bff77000 - 00000000bff98000
Nosave address range: 00000000bff98000 - 00000000bff99000
Nosave address range: 00000000bff99000 - 00000000c0000000
Nosave address range: 00000000c0000000 - 00000000fec00000
Nosave address range: 00000000fec00000 - 0000000100000000
Allocating PCI resources starting at c4000000 (gap: c0000000:3ec00000)
afinfo corrupted at init/main.c:512

which comes directly after that code has done a lot of work.

Mel might want to take a look (and perhaps
also cut down a little on the ugly printks ...) 

BTW, I have now also found one of my test systems that prints a lot of
the messages below. I'm about to leave for vacation, so I won't have
time to track it down any time soon, but here it is for reference.

-Andi

Please enable the IOMMU option in the BIOS setup
This costs you 64 MB of RAM
Mapping aperture over 65536 KB of RAM @ 8000000
Bad page state in process 'swapper'
page:ffff810003ee5480 flags:0x0000000000000000 mapping:0000000000000000 mapcount:1 count:0
Trying to fix it up, but a reboot is needed
Backtrace:   

Call Trace:  
 [<ffffffff8020ac84>] show_trace+0x34/0x47
 [<ffffffff8020aca9>] dump_stack+0x12/0x17
 [<ffffffff802586a7>] bad_page+0x57/0x81
 [<ffffffff80258791>] __free_pages_ok+0x64/0x247
 [<ffffffff807cca72>] free_all_bootmem_core+0xcc/0x1a9
 [<ffffffff807ca08b>] numa_free_all_bootmem+0x3b/0x77
 [<ffffffff807c915e>] mem_init+0x44/0x186
 [<ffffffff807bc5f0>] start_kernel+0x17b/0x207
 [<ffffffff807bc168>] _sinittext+0x168/0x16c

Bad page state in process 'swapper'
page:ffff810003ee54b8 flags:0x0000000000000000 mapping:0000000000000000 mapcount:1 count:0
Trying to fix it up, but a reboot is needed
Backtrace:   

Call Trace:  
 [<ffffffff8020ac84>] show_trace+0x34/0x47
 [<ffffffff8020aca9>] dump_stack+0x12/0x17
 [<ffffffff802586a7>] bad_page+0x57/0x81
 [<ffffffff80258791>] __free_pages_ok+0x64/0x247
 [<ffffffff807cca72>] free_all_bootmem_core+0xcc/0x1a9
 [<ffffffff807ca08b>] numa_free_all_bootmem+0x3b/0x77
 [<ffffffff807c915e>] mem_init+0x44/0x186
 [<ffffffff807bc5f0>] start_kernel+0x17b/0x207
 [<ffffffff807bc168>] _sinittext+0x168/0x16c


... lots more of those ...

^ permalink raw reply	[flat|nested] 140+ messages in thread
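
A report such as "afinfo corrupted at init/main.c:512" names a line in the patched file, so it can be mapped back to a CHECK in the debugging patch by counting through the relevant hunk. The sketch below does that for the hunk headed "@@ -502,7 +507,9 @@" from the init/main.c part of the patch posted earlier in this thread. It assumes Steve's kernel was built from a tree with the same init/main.c line numbering as that patch; added_line_number() and setup_arch_hunk are hypothetical names used only for illustration.

```c
#include <stddef.h>

/* Body of the "@@ -502,7 +507,9 @@" hunk, one string per diff line. */
const char *const setup_arch_hunk[] = {
	" \tpage_address_init();",
	" \tprintk(KERN_NOTICE);",
	" \tprintk(linux_banner);",
	"+\tCHECK;",
	" \tsetup_arch(&command_line);",
	"+\tCHECK;",
	" \tsetup_per_cpu_areas();",
	" \tsmp_prepare_boot_cpu();\t/* arch-specific boot-cpu hooks */",
	" ",
};

/*
 * Return the new-file line number of the k-th (1-based) added line in a
 * unified diff hunk whose new-file range starts at new_start, or -1 if
 * the hunk has fewer than k added lines.
 */
int added_line_number(const char *const *hunk, size_t nlines,
                      int new_start, int k)
{
	int line = new_start;
	int seen = 0;

	for (size_t i = 0; i < nlines; i++) {
		char tag = hunk[i][0];

		if (tag == '-')
			continue;       /* removed lines exist only in the old file */
		if (tag == '+' && ++seen == k)
			return line;
		line++;                 /* context and added lines advance the new file */
	}
	return -1;
}
```

For this hunk the two added CHECK lines land on lines 510 and 512, so line 512 would be the check placed immediately after setup_arch(&command_line). The same counting applied to the do_initcalls() hunk, which starts at new-file line 656, puts its two CHECK lines on 659 and 663.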

* Re: 2.6.18-mm2 boot failure on x86-64
  2006-10-05 19:08                                 ` Andi Kleen
@ 2006-10-05 20:25                                   ` Steve Fox
  2006-10-05 20:39                                   ` Mel Gorman
  1 sibling, 0 replies; 140+ messages in thread
From: Steve Fox @ 2006-10-05 20:25 UTC (permalink / raw)
  To: Andi Kleen
  Cc: vgoyal, Badari Pulavarty, Martin Bligh, Andrew Morton, lkml,
	netdev, kmannth, Andy Whitcroft, Mel Gorman

On Thu, 2006-10-05 at 21:08 +0200, Andi Kleen wrote:

> Mel might want to take a look (and perhaps
> also cut down a little on the ugly printks ...) 

I tested a patch from Mel which backs out the arch-independent zone
sizing and got the same results (to my inexperienced eye). I've sent him
the boot log so he can verify that they really are the same as without
this back-out.

-- 

Steve Fox
IBM Linux Technology Center

^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: 2.6.18-mm2 boot failure on x86-64
  2006-10-05 19:08                                 ` Andi Kleen
  2006-10-05 20:25                                   ` Steve Fox
@ 2006-10-05 20:39                                   ` Mel Gorman
  2006-10-05 20:51                                     ` Andi Kleen
  1 sibling, 1 reply; 140+ messages in thread
From: Mel Gorman @ 2006-10-05 20:39 UTC (permalink / raw)
  To: Andi Kleen
  Cc: vgoyal, Steve Fox, Badari Pulavarty, Martin Bligh, Andrew Morton,
	lkml, netdev, kmannth, Andy Whitcroft

On Thu, 5 Oct 2006, Andi Kleen wrote:

> On Thursday 05 October 2006 20:52, Vivek Goyal wrote:
>> On Thu, Oct 05, 2006 at 08:27:02PM +0200, Andi Kleen wrote:
>>> On Thursday 05 October 2006 19:57, Steve Fox wrote:
>>>> On Thu, 2006-10-05 at 17:40 +0200, Andi Kleen wrote:
>>>>
>>>>> Please don't snip the Code: line. It is fairly important.
>>>>
>>>> Sorry about that. The remote console I was using appears to overwrite
>>>> some text after I force the reboot. Here's a clean one.
>>>>
>>>> global ffffffffffffffff
>>>
>>> Ok that definitely shouldn't be in there.
>>>
>>> I guess we need to track when it gets corrupted. Can you send the full
>>> boot log with this patch applied?
>>>
>>
>> Just recalled one more observation about the problem when keith had
>> reported it last. If I just move .bss before .data_nosave instead
>> of it being at the end, keith's problem had disappeared.
>
> Yes, that could well be that it's something in the new bootmap
> management.  Steve's box failed at
>
> Using ACPI (MADT) for SMP configuration information
> Nosave address range: 000000000009a000 - 000000000009b000
> Nosave address range: 000000000009b000 - 00000000000a0000
> Nosave address range: 00000000000a0000 - 00000000000e0000
> Nosave address range: 00000000000e0000 - 0000000000100000
> Nosave address range: 00000000bff76000 - 00000000bff77000
> Nosave address range: 00000000bff77000 - 00000000bff98000
> Nosave address range: 00000000bff98000 - 00000000bff99000
> Nosave address range: 00000000bff99000 - 00000000c0000000
> Nosave address range: 00000000c0000000 - 00000000fec00000
> Nosave address range: 00000000fec00000 - 0000000100000000
> Allocating PCI resources starting at c4000000 (gap: c0000000:3ec00000)
> afinfo corrupted at init/main.c:512
>
> which is directly after that code does lots of stuff.
>
> Mel might want to take a look (and perhaps
> also cut down a little on the ugly printks ...)
>

Steve tested a patch with arch-independent zone-sizing backed out for
x86_64 and things looked ok, but that is no guarantee it is not a
contributory factor. The "Nosave address range:" printks are related to a
suspend problem that was reported around the end of June, I believe.

I'll pick this up in the morning because I should have access to the same 
machine Steve does and see what I can come up with.

> BTW I found one of my test systems too now which does a lot of:
> I'm about to leave for vacation so i won't have time to track it down
> any time soon. But here is it for reference.
>

Hmm, rather than bugging you with patches now, I'll see what I can find
with the x86_64 machines I have access to and see whether I can
reproduce it.

> -Andi
>
> Please enable the IOMMU option in the BIOS setup
> This costs you 64 MB of RAM
> Mapping aperture over 65536 KB of RAM @ 8000000
> Bad page state in process 'swapper'
> page:ffff810003ee5480 flags:0x0000000000000000 mapping:0000000000000000 mapcount:1 count:0
> Trying to fix it up, but a reboot is needed
> Backtrace:
>
> Call Trace:
> [<ffffffff8020ac84>] show_trace+0x34/0x47
> [<ffffffff8020aca9>] dump_stack+0x12/0x17
> [<ffffffff802586a7>] bad_page+0x57/0x81
> [<ffffffff80258791>] __free_pages_ok+0x64/0x247
> [<ffffffff807cca72>] free_all_bootmem_core+0xcc/0x1a9
> [<ffffffff807ca08b>] numa_free_all_bootmem+0x3b/0x77
> [<ffffffff807c915e>] mem_init+0x44/0x186
> [<ffffffff807bc5f0>] start_kernel+0x17b/0x207
> [<ffffffff807bc168>] _sinittext+0x168/0x16c
>
> Bad page state in process 'swapper'
> page:ffff810003ee54b8 flags:0x0000000000000000 mapping:0000000000000000 mapcount:1 count:0
> Trying to fix it up, but a reboot is needed
> Backtrace:
>
> Call Trace:
> [<ffffffff8020ac84>] show_trace+0x34/0x47
> [<ffffffff8020aca9>] dump_stack+0x12/0x17
> [<ffffffff802586a7>] bad_page+0x57/0x81
> [<ffffffff80258791>] __free_pages_ok+0x64/0x247
> [<ffffffff807cca72>] free_all_bootmem_core+0xcc/0x1a9
> [<ffffffff807ca08b>] numa_free_all_bootmem+0x3b/0x77
> [<ffffffff807c915e>] mem_init+0x44/0x186
> [<ffffffff807bc5f0>] start_kernel+0x17b/0x207
> [<ffffffff807bc168>] _sinittext+0x168/0x16c
>
>
> ... lots more of those ...
>

-- 
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab

^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: 2.6.18-mm2 boot failure on x86-64
  2006-10-05 19:05                                 ` Andi Kleen
@ 2006-10-05 20:42                                   ` Steve Fox
  2006-10-05 20:50                                     ` Andi Kleen
  0 siblings, 1 reply; 140+ messages in thread
From: Steve Fox @ 2006-10-05 20:42 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Badari Pulavarty, Martin Bligh, vgoyal, Andrew Morton, lkml,
	netdev, kmannth, Andy Whitcroft

On Thu, 2006-10-05 at 21:05 +0200, Andi Kleen wrote:

> Can you please try it again with this patch to narrow it down further?

Unfortunately this is as far as it got before it hung.

root (hd0,0)
 Filesystem type is reiserfs, partition type 0x83
kernel /boot/vmlinuz-autobench root=/dev/sda1 vga=791  ip=9.47.67.239:9.47.67.50:9.47.67.1:255.255.255.0 resume=/dev/sdb1 showopts console=tty0 console=ttyS0,57600 autobench_args: root=/dev/sda1 ABAT:1160080320
   [Linux-bzImage, setup=0x1400, size=0x1dd871]
initrd /boot/initrd-autobench.img
   [Linux-initrd @ 0x37ceb000, 0x304c57 bytes]


-- 

Steve Fox
IBM Linux Technology Center

^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: 2.6.18-mm2 boot failure on x86-64
  2006-10-05 20:42                                   ` Steve Fox
@ 2006-10-05 20:50                                     ` Andi Kleen
  2006-10-06  2:23                                       ` Steve Fox
  0 siblings, 1 reply; 140+ messages in thread
From: Andi Kleen @ 2006-10-05 20:50 UTC (permalink / raw)
  To: Steve Fox
  Cc: Badari Pulavarty, Martin Bligh, vgoyal, Andrew Morton, lkml,
	netdev, kmannth, Andy Whitcroft

On Thursday 05 October 2006 22:42, Steve Fox wrote:
> On Thu, 2006-10-05 at 21:05 +0200, Andi Kleen wrote:
> 
> > Can you please try it again with this patch to narrow it down further?
> 
> Unfortunately this is as far as it got before it hung.

Boot with earlyprintk=serial,ttyS0,57600
(or change the panic in the checkfunction back to a printk) 
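
As a sketch of the first option, the helper below adds the earlyprintk
argument to a kernel command line when it is missing. The function name and
the choice to place the argument before the first console= entry are
assumptions made for illustration; the kernel does not require that order.

```python
def with_earlyprintk(cmdline, spec="serial,ttyS0,57600"):
    """Return cmdline with earlyprintk=<spec> added if not already present."""
    args = cmdline.split()
    # Leave the command line untouched if earlyprintk is already configured.
    if any(a == "earlyprintk" or a.startswith("earlyprintk=") for a in args):
        return cmdline
    new_arg = "earlyprintk=" + spec
    for i, arg in enumerate(args):
        if arg.startswith("console="):
            args.insert(i, new_arg)  # keep it next to the console settings
            break
    else:
        args.append(new_arg)         # no console= entry: append at the end
    return " ".join(args)

print(with_earlyprintk(
    "root=/dev/sda1 vga=791 resume=/dev/sdb1 showopts "
    "console=tty0 console=ttyS0,57600"
))
```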

-Andi


^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: 2.6.18-mm2 boot failure on x86-64
  2006-10-05 20:39                                   ` Mel Gorman
@ 2006-10-05 20:51                                     ` Andi Kleen
  2006-10-05 23:14                                       ` 2.6.18-mm2 boot failure on x86-64 II Andi Kleen
  0 siblings, 1 reply; 140+ messages in thread
From: Andi Kleen @ 2006-10-05 20:51 UTC (permalink / raw)
  To: Mel Gorman
  Cc: vgoyal, Steve Fox, Badari Pulavarty, Martin Bligh, Andrew Morton,
	lkml, netdev, kmannth, Andy Whitcroft


> hmm, rather than bugging you with patches now, I'll see what I can find 
> with the x86_64 machines I have access to and see can I reproduce it.

I started the bisect, should finish soon.

-Andi

^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: 2.6.18-mm2 - oops in cache_alloc_refill()
  2006-10-03 16:34                 ` Jean Tourrilhes
  2006-10-03 16:45                   ` Samuel Tardieu
@ 2006-10-05 22:37                   ` Pavel Roskin
  2006-10-05 22:42                     ` Jean Tourrilhes
  1 sibling, 1 reply; 140+ messages in thread
From: Pavel Roskin @ 2006-10-05 22:37 UTC (permalink / raw)
  To: jt; +Cc: Samuel Tardieu, John W. Linville, linux-kernel, netdev

Hello!

On Tue, 2006-10-03 at 09:34 -0700, Jean Tourrilhes wrote:
> 	I don't really want to overstep my authority there, my goal
> was to minimise the changes. Pavel will have to clean up my mess, so I
> don't want to change things too much.

Sorry for a long delay.

I'm actually not very interested in the Wireless Extension interface of
the driver.  The less I touch that code, the better I feel.  I won't add
to the criticism for the latest changes; enough has been said.

It's fine with me that you are changing the orinoco driver to update
Wireless Extensions compatibility.

I'm trying to maintain a Subversion repository with the driver modified
to be compatible with a few latest kernels.  But it looks like it's an
uphill battle that I'm not going to win.

-- 
Regards,
Pavel Roskin


^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: 2.6.18-mm2 - oops in cache_alloc_refill()
  2006-10-05 22:37                   ` Pavel Roskin
@ 2006-10-05 22:42                     ` Jean Tourrilhes
  0 siblings, 0 replies; 140+ messages in thread
From: Jean Tourrilhes @ 2006-10-05 22:42 UTC (permalink / raw)
  To: Pavel Roskin; +Cc: Samuel Tardieu, John W. Linville, linux-kernel, netdev

On Thu, Oct 05, 2006 at 06:37:53PM -0400, Pavel Roskin wrote:
> Hello!
> 
> On Tue, 2006-10-03 at 09:34 -0700, Jean Tourrilhes wrote:
> > 	I don't really want to overstep my authority there, my goal
> > was to minimise the changes. Pavel will have to clean up my mess, so I
> > don't want to change things too much.
> 
> Sorry for a long delay.

	That's ok, we all have a real life ;-)

> I'm actually not very interested in the Wireless Extension interface of
> the driver.  The less I touch that code, the better I feel.  I won't add
> to the criticism for the latest changes; enough has been said.
> 
> It's fine with me that you are changing the orinoco driver to update
> Wireless Extensions compatibility.
> 
> I'm trying to maintain a Subversion repository with the driver modified
> to be compatible with a few latest kernels.  But it looks like it's an
> uphill battle that I'm not going to win.

	I'll try to come up with a patch for you. It's not as bad as
it looks. It will resemble the patch for the external ipw
drivers I sent to the list.

> Pavel Roskin

	Have fun...

	Jean

^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: 2.6.18-mm2 boot failure on x86-64 II
  2006-10-05 20:51                                     ` Andi Kleen
@ 2006-10-05 23:14                                       ` Andi Kleen
  2006-10-05 23:32                                         ` keith mannthey
  0 siblings, 1 reply; 140+ messages in thread
From: Andi Kleen @ 2006-10-05 23:14 UTC (permalink / raw)
  To: Mel Gorman
  Cc: vgoyal, Steve Fox, Badari Pulavarty, Martin Bligh, Andrew Morton,
	lkml, netdev, kmannth, Andy Whitcroft

On Thursday 05 October 2006 22:51, Andi Kleen wrote:
> 
> > hmm, rather than bugging you with patches now, I'll see what I can find 
> > with the x86_64 machines I have access to and see can I reproduce it.
> 
> I started the bisect, should finish soon.

It ended at 

diff-tree d5cdb67236dba94496de052c9f9f431e1fc658f4 (from 0dad3510ee82bcf8a380b81a2184a664a911ef9c)
Author: Satoru Takeuchi <takeuchi_satoru@jp.fujitsu.com>
Date:   Tue Sep 12 10:19:00 2006 -0700

    acpiphp: disable bridges
    
    Currently acpiphp calls pci_enable_device() against all
    hot-added bridges, but acpiphp does not call pci_disable_device()
    against them in hot-remove. So ioapic hot-remove would fail.
    This patch fixes this issue.

I'm not sure that is really it; it is possible I made a mistake during the
bisect (the symptoms changed from bad page to just networking not working
somewhere at 4cfee88ad30acc47f02b8b7ba3db8556262dce1e).

Unfortunately I don't have time to rerun it for some time. It would be
useful if anyone else could look.
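
For anyone picking this up, the search itself can be sketched as a binary
search over the ordered patch list. This is a minimal sketch, not git's
implementation; `patches` and `is_bad` are hypothetical stand-ins for the
series and for a build-and-boot test. It assumes that once the bug appears
it stays, which is the assumption that breaks if the symptoms change
partway through the series.

```python
def first_bad(patches, is_bad):
    """Return (index of the first bad patch, number of tests run).

    Assumes the tree with no patches applied is good and the tree with all
    patches applied is bad. is_bad(n) tests the tree with patches[:n] applied.
    """
    lo, hi = 0, len(patches)  # lo patches applied: good; hi applied: bad
    tests = 0
    while hi - lo > 1:
        mid = (lo + hi) // 2
        tests += 1
        if is_bad(mid):
            hi = mid   # bug already present with the first mid patches
        else:
            lo = mid   # bug introduced somewhere after patch mid
    return hi - 1, tests

# Toy example: 1000 patches, the bug is introduced by the patch at index 617.
patches = [f"patch-{i:04d}" for i in range(1000)]
culprit, tests = first_bad(patches, lambda n: n > 617)
print(patches[culprit], "found after", tests, "tests")
```

The number of tests grows with log2 of the series length, so a series of
about a thousand patches needs about ten builds and boots.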

-Andi


^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: 2.6.18-mm2 boot failure on x86-64 II
  2006-10-05 23:14                                       ` 2.6.18-mm2 boot failure on x86-64 II Andi Kleen
@ 2006-10-05 23:32                                         ` keith mannthey
  2006-10-05 23:35                                           ` Andi Kleen
  0 siblings, 1 reply; 140+ messages in thread
From: keith mannthey @ 2006-10-05 23:32 UTC (permalink / raw)
  To: Andi Kleen
  Cc: mel gorman, Vivek goyal, Steve Fox, Badari Pulavarty,
	Martin Bligh, Andrew Morton, lkml, netdev, Andy Whitcroft

On Fri, 2006-10-06 at 01:14 +0200, Andi Kleen wrote:
> On Thursday 05 October 2006 22:51, Andi Kleen wrote:
> > 
> > > hmm, rather than bugging you with patches now, I'll see what I can find 
> > > with the x86_64 machines I have access to and see can I reproduce it.
> > 
> > I started the bisect, should finish soon.
> 
> It ended at 
> 
> diff-tree d5cdb67236dba94496de052c9f9f431e1fc658f4 (from 0dad3510ee82bcf8a380b81a2184a664a911ef9c)
> Author: Satoru Takeuchi <takeuchi_satoru@jp.fujitsu.com>
> Date:   Tue Sep 12 10:19:00 2006 -0700
> 
>     acpiphp: disable bridges
>     
>     Currently acpiphp calls pci_enable_device() against all
>     hot-added bridges, but acpiphp does not call pci_disable_device()
>     against them in hot-remove. So ioapic hot-remove would fail.
>     This patch fixes this issue.
> 
> Not sure that is it really, it is possible i made a mistake during bisect
> (the symptoms changed from bad page to just networking doesn't work
> somewhere at 4cfee88ad30acc47f02b8b7ba3db8556262dce1e) 
> 
> I don't have time to rerun unfortunately
> for some time. Anyone else looking would be useful.

As of yet I haven't been able to recreate the hang.  I am running
similar HW to Steve. 

Thanks,
  Keith 


^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: 2.6.18-mm2 boot failure on x86-64 II
  2006-10-05 23:32                                         ` keith mannthey
@ 2006-10-05 23:35                                           ` Andi Kleen
  2006-10-05 23:58                                             ` keith mannthey
  0 siblings, 1 reply; 140+ messages in thread
From: Andi Kleen @ 2006-10-05 23:35 UTC (permalink / raw)
  To: kmannth
  Cc: mel gorman, Vivek goyal, Steve Fox, Badari Pulavarty,
	Martin Bligh, Andrew Morton, lkml, netdev, Andy Whitcroft


> As of yet I haven't been able to recreate the hang.  I am running
> similar HW to Steve. 

That was on a 4 core Opteron with Tyan board  (S2881) and AMD-8111 
chipset.

-Andi

^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: 2.6.18-mm2 boot failure on x86-64 II
  2006-10-05 23:35                                           ` Andi Kleen
@ 2006-10-05 23:58                                             ` keith mannthey
  2006-10-06  0:02                                               ` Badari Pulavarty
  0 siblings, 1 reply; 140+ messages in thread
From: keith mannthey @ 2006-10-05 23:58 UTC (permalink / raw)
  To: Andi Kleen
  Cc: mel gorman, Vivek goyal, Steve Fox, Badari Pulavarty,
	Martin Bligh, Andrew Morton, lkml, netdev, Andy Whitcroft

On Fri, 2006-10-06 at 01:35 +0200, Andi Kleen wrote:
> > As of yet I haven't been able to recreate the hang.  I am running
> > similar HW to Steve. 

I ran into this with -mm3

Memory: 24150368k/26738688k available (1933k kernel code, 490260k reserved, 978k data, 308k init)
------------[ cut here ]------------
kernel BUG in init_list at mm/slab.c:1334!
invalid opcode: 0000 [1] SMP
last sysfs file:
CPU 0
Modules linked in:
Pid: 0, comm: swapper Not tainted 2.6.18-mm3-smp #1
RIP: 0010:[<ffffffff8027f8fa>]  [<ffffffff8027f8fa>] init_list+0x1d/0xfd
RSP: 0018:ffffffff80577f48  EFLAGS: 00010212
RAX: 0000000000000040 RBX: 0000000000000001 RCX: 0000000000000000
RDX: 0000000000000001 RSI: ffffffff805ba848 RDI: ffff810460700040
RBP: 0000000000000001 R08: 0000000000000001 R09: 0000000000000003
R10: 0000000000000000 R11: ffffffff805bc268 R12: ffff810460700040
R13: ffffffff805ba848 R14: 0000000000000000 R15: 0000000000000000
FS:  0000000000000000(0000) GS:ffffffff804d8000(0000) knlGS:0000000000000000
CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 0000000000000000 CR3: 0000000000201000 CR4: 00000000000006a0
Process swapper (pid: 0, threadinfo ffffffff80576000, task ffffffff80455840)
Stack:  0000000000000000 0000000000000000 0000000100000000 0000000000000001
 ffffffff805ba848 0000000000000000 0000000000000000 ffffffff80593aa8
 00000000000002c0 0000000100000001 000000000008ef00 000000000008c000
Call Trace:
 [<ffffffff80593aa8>] kmem_cache_init+0x344/0x406
 [<ffffffff805805ef>] start_kernel+0x180/0x21b
 [<ffffffff8058016a>] _sinittext+0x16a/0x16e


Code: 0f 0b 48 8b 3d 15 ab 1e 00 be d0 00 00 00 e8 c0 f5 ff ff 48
RIP  [<ffffffff8027f8fa>] init_list+0x1d/0xfd
 RSP <ffffffff80577f48>
 <0>Kernel panic - not syncing: Attempted to kill the idle task!


I am going to revert the patch and see if it works.  I ran -git22 just
fine. 
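
The trace entries above use the form [<address>] symbol+0xoffset/0xsize,
where the offset is the distance into the function and the size is the
function's length. As a minimal sketch (the regular expression and helper
name are assumptions for illustration, not part of any kernel tool), a line
can be split into its parts and the function's start address recovered
like this:

```python
import re

# Matches e.g. " [<ffffffff80593aa8>] kmem_cache_init+0x344/0x406"
TRACE_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]+)>\]\s+(?P<sym>\w+)"
    r"\+0x(?P<off>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
)

def parse_trace_line(line):
    """Return a dict for one call-trace entry, or None if it does not match."""
    m = TRACE_RE.search(line)
    if m is None:
        return None
    addr = int(m.group("addr"), 16)
    off = int(m.group("off"), 16)
    return {
        "symbol": m.group("sym"),
        "address": addr,
        "offset": off,
        "size": int(m.group("size"), 16),
        "start": addr - off,  # address of the function's first byte
    }

entry = parse_trace_line(" [<ffffffff80593aa8>] kmem_cache_init+0x344/0x406")
print(entry["symbol"], hex(entry["start"]))
```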

Thanks,
  Keith 


^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: 2.6.18-mm2 boot failure on x86-64 II
  2006-10-05 23:58                                             ` keith mannthey
@ 2006-10-06  0:02                                               ` Badari Pulavarty
  2006-10-06  0:12                                                 ` Andrew Morton
  0 siblings, 1 reply; 140+ messages in thread
From: Badari Pulavarty @ 2006-10-06  0:02 UTC (permalink / raw)
  To: kmannth
  Cc: Andi Kleen, mel gorman, Vivek goyal, Steve Fox, Martin Bligh,
	Andrew Morton, lkml, netdev, Andy Whitcroft

keith mannthey wrote:
> On Fri, 2006-10-06 at 01:35 +0200, Andi Kleen wrote:
>   
>>> As of yet I haven't been able to recreate the hang.  I am running
>>> similar HW to Steve. 
>>>       
>
> I ran into this with -mm3
>
> Memory: 24150368k/26738688k available (1933k kernel code, 490260k
> reserved, 978k data, 308k init)
> ------------[ cut here ]------------
> kernel BUG in init_list at mm/slab.c:1334!
> invalid opcode: 0000 [1] SMP
> last sysfs file:
> CPU 0
> Modules linked in:
> Pid: 0, comm: swapper Not tainted 2.6.18-mm3-smp #1
> RIP: 0010:[<ffffffff8027f8fa>]  [<ffffffff8027f8fa>] init_list+0x1d/0xfd
> RSP: 0018:ffffffff80577f48  EFLAGS: 00010212
> RAX: 0000000000000040 RBX: 0000000000000001 RCX: 0000000000000000
> RDX: 0000000000000001 RSI: ffffffff805ba848 RDI: ffff810460700040
> RBP: 0000000000000001 R08: 0000000000000001 R09: 0000000000000003
> R10: 0000000000000000 R11: ffffffff805bc268 R12: ffff810460700040
> R13: ffffffff805ba848 R14: 0000000000000000 R15: 0000000000000000
> FS:  0000000000000000(0000) GS:ffffffff804d8000(0000)
> knlGS:0000000000000000
> CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
> CR2: 0000000000000000 CR3: 0000000000201000 CR4: 00000000000006a0
> Process swapper (pid: 0, threadinfo ffffffff80576000, task
> ffffffff80455840)
> Stack:  0000000000000000 0000000000000000 0000000100000000
> 0000000000000001
>  ffffffff805ba848 0000000000000000 0000000000000000 ffffffff80593aa8
>  00000000000002c0 0000000100000001 000000000008ef00 000000000008c000
> Call Trace:
>  [<ffffffff80593aa8>] kmem_cache_init+0x344/0x406
>  [<ffffffff805805ef>] start_kernel+0x180/0x21b
>  [<ffffffff8058016a>] _sinittext+0x16a/0x16e
>
>
> Code: 0f 0b 48 8b 3d 15 ab 1e 00 be d0 00 00 00 e8 c0 f5 ff ff 48
> RIP  [<ffffffff8027f8fa>] init_list+0x1d/0xfd
>  RSP <ffffffff80577f48>
>  <0>Kernel panic - not syncing: Attempted to kill the idle task!
>
>
> I am going to revert the patch and see if it works.  I ran -git22 just
> fine. 
>
> Thanks,
>   Keith 
>
>   
Keith,

I fixed this already. Can you look for it on lkml (look for 2.6.18-mm3
in the subject line)? It is a single typo in mm/slab.c.

Thanks,
Badari


^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: 2.6.18-mm2 boot failure on x86-64 II
  2006-10-06  0:02                                               ` Badari Pulavarty
@ 2006-10-06  0:12                                                 ` Andrew Morton
  0 siblings, 0 replies; 140+ messages in thread
From: Andrew Morton @ 2006-10-06  0:12 UTC (permalink / raw)
  To: Badari Pulavarty
  Cc: kmannth, Andi Kleen, mel gorman, Vivek goyal, Steve Fox,
	Martin Bligh, lkml, netdev, Andy Whitcroft

On Thu, 05 Oct 2006 17:02:54 -0700
Badari Pulavarty <pbadari@us.ibm.com> wrote:

> > Code: 0f 0b 48 8b 3d 15 ab 1e 00 be d0 00 00 00 e8 c0 f5 ff ff 48
> > RIP  [<ffffffff8027f8fa>] init_list+0x1d/0xfd
> >  RSP <ffffffff80577f48>
> >  <0>Kernel panic - not syncing: Attempted to kill the idle task!
> >
> >
> > I am going to revert the patch and see if it works.  I ran -git22 just
> > fine. 
> >
> > Thanks,
> >   Keith 
> >
> >   
> Keith,
> 
> I fixed this already. Can you look for it on lkml (look for 2.6.18-mm3 
> in the subject line).
> one typo in mm/slab.c

ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.18/2.6.18-mm3/hot-fixes

^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: 2.6.18-mm2 boot failure on x86-64
  2006-10-05 20:50                                     ` Andi Kleen
@ 2006-10-06  2:23                                       ` Steve Fox
  2006-10-06 14:33                                         ` Mel Gorman
  0 siblings, 1 reply; 140+ messages in thread
From: Steve Fox @ 2006-10-06  2:23 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Badari Pulavarty, Martin Bligh, vgoyal, Andrew Morton, lkml,
	netdev, kmannth, Andy Whitcroft

On Thu, 2006-10-05 at 22:50 +0200, Andi Kleen wrote:
> On Thursday 05 October 2006 22:42, Steve Fox wrote:
> > On Thu, 2006-10-05 at 21:05 +0200, Andi Kleen wrote:
> > 
> > > Can you please try it again with this patch to narrow it down further?
> > 
> > Unfortunately this is as far as it got before it hung.
> 
> Boot with earlyprintk=serial,ttyS0,57600
> (or change the panic in the checkfunction back to a printk) 

root (hd0,0)
 Filesystem type is reiserfs, partition type 0x83
kernel /boot/vmlinuz-autobench root=/dev/sda1 vga=791  ip=9.47.67.239:9.47.67.50:9.47.67.1:255.255.255.0 resume=/dev/sdb1 showopts earlyprintk=serial,ttyS0,57600 console=tty0 console=ttyS0,57600 autobench_args: root=/dev/sda1 ABAT:1160100417
   [Linux-bzImage, setup=0x1400, size=0x1dd855]
initrd /boot/initrd-autobench.img
   [Linux-initrd @ 0x37cec000, 0x303f80 bytes]

Linux version 2.6.18-git22 (root@elm3b239) (gcc version 4.1.0 (SUSE Linux)) #2 SMP Thu Oct 5 19:05:36 PDT 2006
Command line: root=/dev/sda1 vga=791  ip=9.47.67.239:9.47.67.50:9.47.67.1:255.255.255.0 resume=/dev/sdb1 showopts earlyprintk=serial,ttyS0,57600 console=tty0 console=ttyS0,57600 autobench_args: root=/dev/sda1 ABAT:1160100417
BIOS-provided physical RAM map:
 BIOS-e820: 0000000000000000 - 000000000009ac00 (usable)
 BIOS-e820: 000000000009ac00 - 00000000000a0000 (reserved)
 BIOS-e820: 00000000000e0000 - 0000000000100000 (reserved)
 BIOS-e820: 0000000000100000 - 00000000bff764c0 (usable)
 BIOS-e820: 00000000bff764c0 - 00000000bff98880 (ACPI data)
 BIOS-e820: 00000000bff98880 - 00000000c0000000 (reserved)
 BIOS-e820: 00000000fec00000 - 0000000100000000 (reserved)
 BIOS-e820: 0000000100000000 - 0000000c00000000 (usable)
end_pfn_map = 12582912
kernel direct mapping tables up to c00000000 @ 8000-39000
DMI 2.3 present.
afinfo corrupted at arch/x86_64/kernel/setup.c:462
afinfo corrupted at arch/x86_64/kernel/setup.c:467
afinfo corrupted at arch/x86_64/kernel/setup.c:472
afinfo corrupted at arch/x86_64/kernel/setup.c:483
afinfo corrupted at arch/x86_64/kernel/setup.c:496
afinfo corrupted at arch/x86_64/kernel/setup.c:504
afinfo corrupted at arch/x86_64/kernel/setup.c:510
afinfo corrupted at arch/x86_64/kernel/setup.c:529
afinfo corrupted at arch/x86_64/kernel/setup.c:537
Zone PFN ranges:
  DMA             0 ->     4096
  DMA32        4096 ->  1048576
  Normal    1048576 -> 12582912
early_node_map[3] active PFN ranges
    0:        0 ->      154
    0:      256 ->   786294
    0:  1048576 -> 12582912
afinfo corrupted at arch/x86_64/kernel/setup.c:540
afinfo corrupted at arch/x86_64/kernel/setup.c:545
ACPI: PM-Timer IO Port: 0x9c
ACPI: LAPIC (acpi_id[0x00] lapic_id[0x00] enabled)
Processor #0 (Bootup-CPU)
ACPI: LAPIC (acpi_id[0x01] lapic_id[0x01] enabled)
Processor #1
ACPI: LAPIC (acpi_id[0x02] lapic_id[0x06] enabled)
Processor #6
ACPI: LAPIC (acpi_id[0x03] lapic_id[0x07] enabled)
Processor #7
ACPI: LAPIC (acpi_id[0x04] lapic_id[0x10] enabled)
Processor #16
ACPI: LAPIC (acpi_id[0x05] lapic_id[0x11] enabled)
Processor #17
ACPI: LAPIC (acpi_id[0x06] lapic_id[0x16] enabled)
Processor #22
ACPI: LAPIC (acpi_id[0x07] lapic_id[0x17] enabled)
Processor #23
ACPI: LAPIC (acpi_id[0x10] lapic_id[0x20] enabled)
Processor #32
ACPI: LAPIC (acpi_id[0x11] lapic_id[0x21] enabled)
Processor #33
ACPI: LAPIC (acpi_id[0x12] lapic_id[0x26] enabled)
Processor #38
ACPI: LAPIC (acpi_id[0x13] lapic_id[0x27] enabled)
Processor #39
ACPI: LAPIC (acpi_id[0x14] lapic_id[0x30] enabled)
Processor #48
ACPI: LAPIC (acpi_id[0x15] lapic_id[0x31] enabled)
Processor #49
ACPI: LAPIC (acpi_id[0x16] lapic_id[0x36] enabled)
Processor #54
ACPI: LAPIC (acpi_id[0x17] lapic_id[0x37] enabled)
Processor #55
ACPI: LAPIC (acpi_id[0x20] lapic_id[0x40] enabled)
Processor #64
WARNING: NR_CPUS limit of 16 reached. Processor ignored.
ACPI: LAPIC (acpi_id[0x21] lapic_id[0x41] enabled)
Processor #65
WARNING: NR_CPUS limit of 16 reached. Processor ignored.
ACPI: LAPIC (acpi_id[0x22] lapic_id[0x46] enabled)
Processor #70
WARNING: NR_CPUS limit of 16 reached. Processor ignored.
ACPI: LAPIC (acpi_id[0x23] lapic_id[0x47] enabled)
Processor #71
WARNING: NR_CPUS limit of 16 reached. Processor ignored.
ACPI: LAPIC (acpi_id[0x24] lapic_id[0x50] enabled)
Processor #80
WARNING: NR_CPUS limit of 16 reached. Processor ignored.
ACPI: LAPIC (acpi_id[0x25] lapic_id[0x51] enabled)
Processor #81
WARNING: NR_CPUS limit of 16 reached. Processor ignored.
ACPI: LAPIC (acpi_id[0x26] lapic_id[0x56] enabled)
Processor #86
WARNING: NR_CPUS limit of 16 reached. Processor ignored.
ACPI: LAPIC (acpi_id[0x27] lapic_id[0x57] enabled)
Processor #87
WARNING: NR_CPUS limit of 16 reached. Processor ignored.
ACPI: LAPIC (acpi_id[0x30] lapic_id[0x60] enabled)
Processor #96
WARNING: NR_CPUS limit of 16 reached. Processor ignored.
ACPI: LAPIC (acpi_id[0x31] lapic_id[0x61] enabled)
Processor #97
WARNING: NR_CPUS limit of 16 reached. Processor ignored.
ACPI: LAPIC (acpi_id[0x32] lapic_id[0x66] enabled)
Processor #102
WARNING: NR_CPUS limit of 16 reached. Processor ignored.
ACPI: LAPIC (acpi_id[0x33] lapic_id[0x67] enabled)
Processor #103
WARNING: NR_CPUS limit of 16 reached. Processor ignored.
ACPI: LAPIC (acpi_id[0x34] lapic_id[0x70] enabled)
Processor #112
WARNING: NR_CPUS limit of 16 reached. Processor ignored.
ACPI: LAPIC (acpi_id[0x35] lapic_id[0x71] enabled)
Processor #113
WARNING: NR_CPUS limit of 16 reached. Processor ignored.
ACPI: LAPIC (acpi_id[0x36] lapic_id[0x76] enabled)
Processor #118
WARNING: NR_CPUS limit of 16 reached. Processor ignored.
ACPI: LAPIC (acpi_id[0x37] lapic_id[0x77] enabled)
Processor #119
WARNING: NR_CPUS limit of 16 reached. Processor ignored.
ACPI: LAPIC_NMI (acpi_id[0x00] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x01] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x02] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x03] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x04] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x05] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x06] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x07] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x10] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x11] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x12] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x13] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x14] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x15] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x16] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x17] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x20] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x21] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x22] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x23] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x24] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x25] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x26] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x27] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x30] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x31] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x32] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x33] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x34] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x35] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x36] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x37] dfl dfl lint[0x1])
ACPI: IOAPIC (id[0x0f] address[0xfec00000] gsi_base[0])
IOAPIC[0]: apic_id 15, address 0xfec00000, GSI 0-35
ACPI: IOAPIC (id[0x0e] address[0xfec01000] gsi_base[36])
IOAPIC[1]: apic_id 14, address 0xfec01000, GSI 36-71
ACPI: IOAPIC (id[0x0d] address[0xfec02000] gsi_base[72])
IOAPIC[2]: apic_id 13, address 0xfec02000, GSI 72-107
ACPI: IOAPIC (id[0x0c] address[0xfec03000] gsi_base[108])
IOAPIC[3]: apic_id 12, address 0xfec03000, GSI 108-143
ACPI: IOAPIC (id[0x0b] address[0xfec04000] gsi_base[144])
IOAPIC[4]: apic_id 11, address 0xfec04000, GSI 144-179
ACPI: IOAPIC (id[0x0a] address[0xfec05000] gsi_base[180])
IOAPIC[5]: apic_id 10, address 0xfec05000, GSI 180-215
ACPI: IOAPIC (id[0x09] address[0xfec06000] gsi_base[216])
IOAPIC[6]: apic_id 9, address 0xfec06000, GSI 216-251
ACPI: IOAPIC (id[0x08] address[0xfec07000] gsi_base[252])
IOAPIC[7]: apic_id 8, address 0xfec07000, GSI 252-287
ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
ACPI: INT_SRC_OVR (bus 0 bus_irq 8 global_irq 8 low edge)
ACPI: INT_SRC_OVR (bus 0 bus_irq 14 global_irq 14 low edge)
Setting APIC routing to clustered
ACPI: HPET id: 0x10142201 base: 0xfde84000
afinfo corrupted at arch/x86_64/kernel/setup.c:559
afinfo corrupted at arch/x86_64/kernel/setup.c:562
Using ACPI (MADT) for SMP configuration information
afinfo corrupted at arch/x86_64/kernel/setup.c:569
afinfo corrupted at arch/x86_64/kernel/setup.c:572
afinfo corrupted at arch/x86_64/kernel/setup.c:579
afinfo corrupted at arch/x86_64/kernel/setup.c:582
Nosave address range: 000000000009a000 - 000000000009b000
Nosave address range: 000000000009b000 - 00000000000a0000
Nosave address range: 00000000000a0000 - 00000000000e0000
Nosave address range: 00000000000e0000 - 0000000000100000
Nosave address range: 00000000bff76000 - 00000000bff77000
Nosave address range: 00000000bff77000 - 00000000bff98000
Nosave address range: 00000000bff98000 - 00000000bff99000
Nosave address range: 00000000bff99000 - 00000000c0000000
Nosave address range: 00000000c0000000 - 00000000fec00000
Nosave address range: 00000000fec00000 - 0000000100000000
afinfo corrupted at arch/x86_64/kernel/setup.c:585
afinfo corrupted at arch/x86_64/kernel/setup.c:588
afinfo corrupted at arch/x86_64/kernel/setup.c:596
Allocating PCI resources starting at c4000000 (gap: c0000000:3ec00000)
afinfo corrupted at arch/x86_64/kernel/setup.c:599
afinfo corrupted at init/main.c:512
SMP: Allowing 16 CPUs, 0 hotplug CPUs
PERCPU: Allocating 33920 bytes of per cpu data
afinfo corrupted at init/main.c:527
Built 1 zonelists.  Total pages: 12147064
Kernel command line: root=/dev/sda1 vga=791  ip=9.47.67.239:9.47.67.50:9.47.67.1:255.255.255.0 resume=/dev/sdb1 showopts earlyprintk=serial,ttyS0,57600 console=tty0 console=ttyS0,57600 autobench_args: root=/dev/sda1 ABAT:1160100417
afinfo corrupted at init/main.c:536
Initializing CPU#0
PID hash table entries: 4096 (order: 12, 32768 bytes)
afinfo corrupted at init/main.c:545
afinfo corrupted at init/main.c:548
disabling early console
Linux version 2.6.18-git22 (root@elm3b239) (gcc version 4.1.0 (SUSE Linux)) #2 SMP Thu Oct 5 19:05:36 PDT 2006
Command line: root=/dev/sda1 vga=791  ip=9.47.67.239:9.47.67.50:9.47.67.1:255.255.255.0 resume=/dev/sdb1 showopts earlyprintk=serial,ttyS0,57600 console=tty0 console=ttyS0,57600 autobench_args: root=/dev/sda1 ABAT:1160100417
BIOS-provided physical RAM map:
 BIOS-e820: 0000000000000000 - 000000000009ac00 (usable)
 BIOS-e820: 000000000009ac00 - 00000000000a0000 (reserved)
 BIOS-e820: 00000000000e0000 - 0000000000100000 (reserved)
 BIOS-e820: 0000000000100000 - 00000000bff764c0 (usable)
 BIOS-e820: 00000000bff764c0 - 00000000bff98880 (ACPI data)
 BIOS-e820: 00000000bff98880 - 00000000c0000000 (reserved)
 BIOS-e820: 00000000fec00000 - 0000000100000000 (reserved)
 BIOS-e820: 0000000100000000 - 0000000c00000000 (usable)
end_pfn_map = 12582912
DMI 2.3 present.
afinfo corrupted at arch/x86_64/kernel/setup.c:462
afinfo corrupted at arch/x86_64/kernel/setup.c:467
afinfo corrupted at arch/x86_64/kernel/setup.c:472
afinfo corrupted at arch/x86_64/kernel/setup.c:483
afinfo corrupted at arch/x86_64/kernel/setup.c:496
afinfo corrupted at arch/x86_64/kernel/setup.c:504
afinfo corrupted at arch/x86_64/kernel/setup.c:510
afinfo corrupted at arch/x86_64/kernel/setup.c:529
afinfo corrupted at arch/x86_64/kernel/setup.c:537
Zone PFN ranges:
  DMA             0 ->     4096
  DMA32        4096 ->  1048576
  Normal    1048576 -> 12582912
early_node_map[3] active PFN ranges
    0:        0 ->      154
    0:      256 ->   786294
    0:  1048576 -> 12582912
afinfo corrupted at arch/x86_64/kernel/setup.c:540
afinfo corrupted at arch/x86_64/kernel/setup.c:545
ACPI: PM-Timer IO Port: 0x9c
ACPI: LAPIC (acpi_id[0x00] lapic_id[0x00] enabled)
Processor #0 (Bootup-CPU)
ACPI: LAPIC (acpi_id[0x01] lapic_id[0x01] enabled)
Processor #1
ACPI: LAPIC (acpi_id[0x02] lapic_id[0x06] enabled)
Processor #6
ACPI: LAPIC (acpi_id[0x03] lapic_id[0x07] enabled)
Processor #7
ACPI: LAPIC (acpi_id[0x04] lapic_id[0x10] enabled)
Processor #16
ACPI: LAPIC (acpi_id[0x05] lapic_id[0x11] enabled)
Processor #17
ACPI: LAPIC (acpi_id[0x06] lapic_id[0x16] enabled)
Processor #22
ACPI: LAPIC (acpi_id[0x07] lapic_id[0x17] enabled)
Processor #23
ACPI: LAPIC (acpi_id[0x10] lapic_id[0x20] enabled)
Processor #32
ACPI: LAPIC (acpi_id[0x11] lapic_id[0x21] enabled)
Processor #33
ACPI: LAPIC (acpi_id[0x12] lapic_id[0x26] enabled)
Processor #38
ACPI: LAPIC (acpi_id[0x13] lapic_id[0x27] enabled)
Processor #39
ACPI: LAPIC (acpi_id[0x14] lapic_id[0x30] enabled)
Processor #48
ACPI: LAPIC (acpi_id[0x15] lapic_id[0x31] enabled)
Processor #49
ACPI: LAPIC (acpi_id[0x16] lapic_id[0x36] enabled)
Processor #54
ACPI: LAPIC (acpi_id[0x17] lapic_id[0x37] enabled)
Processor #55
ACPI: LAPIC (acpi_id[0x20] lapic_id[0x40] enabled)
Processor #64
WARNING: NR_CPUS limit of 16 reached. Processor ignored.
ACPI: LAPIC (acpi_id[0x21] lapic_id[0x41] enabled)
Processor #65
WARNING: NR_CPUS limit of 16 reached. Processor ignored.
ACPI: LAPIC (acpi_id[0x22] lapic_id[0x46] enabled)
Processor #70
WARNING: NR_CPUS limit of 16 reached. Processor ignored.
ACPI: LAPIC (acpi_id[0x23] lapic_id[0x47] enabled)
Processor #71
WARNING: NR_CPUS limit of 16 reached. Processor ignored.
ACPI: LAPIC (acpi_id[0x24] lapic_id[0x50] enabled)
Processor #80
WARNING: NR_CPUS limit of 16 reached. Processor ignored.
ACPI: LAPIC (acpi_id[0x25] lapic_id[0x51] enabled)
Processor #81
WARNING: NR_CPUS limit of 16 reached. Processor ignored.
ACPI: LAPIC (acpi_id[0x26] lapic_id[0x56] enabled)
Processor #86
WARNING: NR_CPUS limit of 16 reached. Processor ignored.
ACPI: LAPIC (acpi_id[0x27] lapic_id[0x57] enabled)
Processor #87
WARNING: NR_CPUS limit of 16 reached. Processor ignored.
ACPI: LAPIC (acpi_id[0x30] lapic_id[0x60] enabled)
Processor #96
WARNING: NR_CPUS limit of 16 reached. Processor ignored.
ACPI: LAPIC (acpi_id[0x31] lapic_id[0x61] enabled)
Processor #97
WARNING: NR_CPUS limit of 16 reached. Processor ignored.
ACPI: LAPIC (acpi_id[0x32] lapic_id[0x66] enabled)
Processor #102
WARNING: NR_CPUS limit of 16 reached. Processor ignored.
ACPI: LAPIC (acpi_id[0x33] lapic_id[0x67] enabled)
Processor #103
WARNING: NR_CPUS limit of 16 reached. Processor ignored.
ACPI: LAPIC (acpi_id[0x34] lapic_id[0x70] enabled)
Processor #112
WARNING: NR_CPUS limit of 16 reached. Processor ignored.
ACPI: LAPIC (acpi_id[0x35] lapic_id[0x71] enabled)
Processor #113
WARNING: NR_CPUS limit of 16 reached. Processor ignored.
ACPI: LAPIC (acpi_id[0x36] lapic_id[0x76] enabled)
Processor #118
WARNING: NR_CPUS limit of 16 reached. Processor ignored.
ACPI: LAPIC (acpi_id[0x37] lapic_id[0x77] enabled)
Processor #119
WARNING: NR_CPUS limit of 16 reached. Processor ignored.
ACPI: LAPIC_NMI (acpi_id[0x00] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x01] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x02] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x03] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x04] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x05] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x06] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x07] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x10] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x11] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x12] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x13] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x14] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x15] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x16] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x17] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x20] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x21] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x22] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x23] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x24] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x25] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x26] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x27] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x30] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x31] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x32] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x33] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x34] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x35] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x36] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x37] dfl dfl lint[0x1])
ACPI: IOAPIC (id[0x0f] address[0xfec00000] gsi_base[0])
IOAPIC[0]: apic_id 15, address 0xfec00000, GSI 0-35
ACPI: IOAPIC (id[0x0e] address[0xfec01000] gsi_base[36])
IOAPIC[1]: apic_id 14, address 0xfec01000, GSI 36-71
ACPI: IOAPIC (id[0x0d] address[0xfec02000] gsi_base[72])
IOAPIC[2]: apic_id 13, address 0xfec02000, GSI 72-107
ACPI: IOAPIC (id[0x0c] address[0xfec03000] gsi_base[108])
IOAPIC[3]: apic_id 12, address 0xfec03000, GSI 108-143
ACPI: IOAPIC (id[0x0b] address[0xfec04000] gsi_base[144])
IOAPIC[4]: apic_id 11, address 0xfec04000, GSI 144-179
ACPI: IOAPIC (id[0x0a] address[0xfec05000] gsi_base[180])
IOAPIC[5]: apic_id 10, address 0xfec05000, GSI 180-215
ACPI: IOAPIC (id[0x09] address[0xfec06000] gsi_base[216])
IOAPIC[6]: apic_id 9, address 0xfec06000, GSI 216-251
ACPI: IOAPIC (id[0x08] address[0xfec07000] gsi_base[252])
IOAPIC[7]: apic_id 8, address 0xfec07000, GSI 252-287
ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
ACPI: INT_SRC_OVR (bus 0 bus_irq 8 global_irq 8 low edge)
ACPI: INT_SRC_OVR (bus 0 bus_irq 14 global_irq 14 low edge)
Setting APIC routing to clustered
ACPI: HPET id: 0x10142201 base: 0xfde84000
afinfo corrupted at arch/x86_64/kernel/setup.c:559
afinfo corrupted at arch/x86_64/kernel/setup.c:562
Using ACPI (MADT) for SMP configuration information
afinfo corrupted at arch/x86_64/kernel/setup.c:569
afinfo corrupted at arch/x86_64/kernel/setup.c:572
afinfo corrupted at arch/x86_64/kernel/setup.c:579
afinfo corrupted at arch/x86_64/kernel/setup.c:582
Nosave address range: 000000000009a000 - 000000000009b000
Nosave address range: 000000000009b000 - 00000000000a0000
Nosave address range: 00000000000a0000 - 00000000000e0000
Nosave address range: 00000000000e0000 - 0000000000100000
Nosave address range: 00000000bff76000 - 00000000bff77000
Nosave address range: 00000000bff77000 - 00000000bff98000
Nosave address range: 00000000bff98000 - 00000000bff99000
Nosave address range: 00000000bff99000 - 00000000c0000000
Nosave address range: 00000000c0000000 - 00000000fec00000
Nosave address range: 00000000fec00000 - 0000000100000000
afinfo corrupted at arch/x86_64/kernel/setup.c:585
afinfo corrupted at arch/x86_64/kernel/setup.c:588
afinfo corrupted at arch/x86_64/kernel/setup.c:596
Allocating PCI resources starting at c4000000 (gap: c0000000:3ec00000)
afinfo corrupted at arch/x86_64/kernel/setup.c:599
afinfo corrupted at init/main.c:512
SMP: Allowing 16 CPUs, 0 hotplug CPUs
PERCPU: Allocating 33920 bytes of per cpu data
afinfo corrupted at init/main.c:527
Built 1 zonelists.  Total pages: 12147064
Kernel command line: root=/dev/sda1 vga=791  ip=9.47.67.239:9.47.67.50:9.47.67.1:255.255.255.0 resume=/dev/sdb1 showopts earlyprintk=serial,ttyS0,57600 console=tty0 console=ttyS0,57600 autobench_args: root=/dev/sda1 ABAT:1160100417
afinfo corrupted at init/main.c:536
Initializing CPU#0
PID hash table entries: 4096 (order: 12, 32768 bytes)
afinfo corrupted at init/main.c:545
afinfo corrupted at init/main.c:548
disabling early console
Console: colour VGA+ 80x25
Dentry cache hash table entries: 8388608 (order: 14, 67108864 bytes)
Inode-cache hash table entries: 4194304 (order: 13, 33554432 bytes)
afinfo corrupted at init/main.c:582
Checking aperture...
PCI-DMA: Using software bounce buffering for IO (SWIOTLB)
Placing software IO TLB between 0x310c2000 - 0x350c2000
Memory: 48422908k/50331648k available (2566k kernel code, 858868k reserved, 1345k data, 184k init)
afinfo corrupted at init/main.c:584
Calibrating delay using timer specific routine.. 5678.09 BogoMIPS (lpj=11356196)
afinfo corrupted at init/main.c:593
afinfo corrupted at init/main.c:603
Mount-cache hash table entries: 256
afinfo corrupted at init/main.c:610
afinfo corrupted at init/main.c:618
CPU: Trace cache: 12K uops, L1 D cache: 16K
CPU: L2 cache: 1024K
CPU: L3 cache: 4096K
using mwait in idle threads.
CPU: Physical Processor ID: 0
CPU: Processor Core ID: 0
CPU0: Thermal monitoring enabled (TM1)
SMP alternatives: switching to UP code
ACPI: Core revision 20060707
..MP-BIOS bug: 8254 timer not connected to IO-APIC
Using local APIC timer interrupts.
result 10425595
Detected 10.425 MHz APIC timer.
afinfo corrupted at init/main.c:749
SMP alternatives: switching to SMP code
Booting processor 1/16 APIC 0x1
Initializing CPU#1
Calibrating delay using timer specific routine.. 5671.84 BogoMIPS (lpj=11343696)
CPU: Trace cache: 12K uops, L1 D cache: 16K
CPU: L2 cache: 1024K
CPU: L3 cache: 4096K
CPU: Physical Processor ID: 0
CPU: Processor Core ID: 0
CPU1: Thermal monitoring enabled (TM1)
               Intel(R) Xeon(TM) MP CPU 2.83GHz stepping 01
CPU 1: Syncing TSC to CPU 0.
CPU 1: synchronized TSC with CPU 0 (last diff -2 cycles, maxerr 799 cycles)
SMP alternatives: switching to SMP code
Booting processor 2/16 APIC 0x6
Initializing CPU#2
Calibrating delay using timer specific routine.. 5671.98 BogoMIPS (lpj=11343971)
CPU: Trace cache: 12K uops, L1 D cache: 16K
CPU: L2 cache: 1024K
CPU: L3 cache: 4096K
CPU: Physical Processor ID: 3
CPU: Processor Core ID: 0
CPU2: Thermal monitoring enabled (TM1)
               Intel(R) Xeon(TM) MP CPU 2.83GHz stepping 01
CPU 2: Syncing TSC to CPU 0.
CPU 2: synchronized TSC with CPU 0 (last diff -184 cycles, maxerr 3349 cycles)
SMP alternatives: switching to SMP code
Booting processor 3/16 APIC 0x7
Initializing CPU#3
Calibrating delay using timer specific routine.. 5672.02 BogoMIPS (lpj=11344041)
CPU: Trace cache: 12K uops, L1 D cache: 16K
CPU: L2 cache: 1024K
CPU: L3 cache: 4096K
CPU: Physical Processor ID: 3
CPU: Processor Core ID: 0
CPU3: Thermal monitoring enabled (TM1)
               Intel(R) Xeon(TM) MP CPU 2.83GHz stepping 01
CPU 3: Syncing TSC to CPU 0.
CPU 3: synchronized TSC with CPU 0 (last diff -100 cycles, maxerr 1989 cycles)
SMP alternatives: switching to SMP code
Booting processor 4/16 APIC 0x10
Initializing CPU#4
Calibrating delay using timer specific routine.. 5672.07 BogoMIPS (lpj=11344144)
CPU: Trace cache: 12K uops, L1 D cache: 16K
CPU: L2 cache: 1024K
CPU: L3 cache: 4096K
CPU: Physical Processor ID: 8
CPU: Processor Core ID: 0
CPU4: Thermal monitoring enabled (TM1)
               Intel(R) Xeon(TM) MP CPU 2.83GHz stepping 01
CPU 4: Syncing TSC to CPU 0.
CPU 4: synchronized TSC with CPU 0 (last diff 43 cycles, maxerr 3247 cycles)
SMP alternatives: switching to SMP code
Booting processor 5/16 APIC 0x11
Initializing CPU#5
Calibrating delay using timer specific routine.. 5672.01 BogoMIPS (lpj=11344024)
CPU: Trace cache: 12K uops, L1 D cache: 16K
CPU: L2 cache: 1024K
CPU: L3 cache: 4096K
CPU: Physical Processor ID: 8
CPU: Processor Core ID: 0
CPU5: Thermal monitoring enabled (TM1)
               Intel(R) Xeon(TM) MP CPU 2.83GHz stepping 01
CPU 5: Syncing TSC to CPU 0.
CPU 5: synchronized TSC with CPU 0 (last diff 21 cycles, maxerr 3349 cycles)
SMP alternatives: switching to SMP code
Booting processor 6/16 APIC 0x16
Initializing CPU#6
Calibrating delay using timer specific routine.. 5672.02 BogoMIPS (lpj=11344042)
CPU: Trace cache: 12K uops, L1 D cache: 16K
CPU: L2 cache: 1024K
CPU: L3 cache: 4096K
CPU: Physical Processor ID: 11
CPU: Processor Core ID: 0
CPU6: Thermal monitoring enabled (TM1)
               Intel(R) Xeon(TM) MP CPU 2.83GHz stepping 01
CPU 6: Syncing TSC to CPU 0.
CPU 6: synchronized TSC with CPU 0 (last diff 257 cycles, maxerr 3383 cycles)
SMP alternatives: switching to SMP code
Booting processor 7/16 APIC 0x17
Initializing CPU#7
Calibrating delay using timer specific routine.. 5672.10 BogoMIPS (lpj=11344218)
CPU: Trace cache: 12K uops, L1 D cache: 16K
CPU: L2 cache: 1024K
CPU: L3 cache: 4096K
CPU: Physical Processor ID: 11
CPU: Processor Core ID: 0
CPU7: Thermal monitoring enabled (TM1)
               Intel(R) Xeon(TM) MP CPU 2.83GHz stepping 01
CPU 7: Syncing TSC to CPU 0.
CPU 7: synchronized TSC with CPU 0 (last diff 233 cycles, maxerr 3357 cycles)
SMP alternatives: switching to SMP code
Booting processor 8/16 APIC 0x20
Initializing CPU#8
Calibrating delay using timer specific routine.. 5672.35 BogoMIPS (lpj=11344712)
CPU: Trace cache: 12K uops, L1 D cache: 16K
CPU: L2 cache: 1024K
CPU: L3 cache: 4096K
CPU: Physical Processor ID: 16
CPU: Processor Core ID: 0
CPU8: Thermal monitoring enabled (TM1)
               Intel(R) Xeon(TM) MP CPU 2.83GHz stepping 01
CPU 8: Syncing TSC to CPU 0.
CPU 8: synchronized TSC with CPU 0 (last diff 140 cycles, maxerr 8509 cycles)
SMP alternatives: switching to SMP code
Booting processor 9/16 APIC 0x21
Initializing CPU#9
Calibrating delay using timer specific routine.. 5672.25 BogoMIPS (lpj=11344515)
CPU: Trace cache: 12K uops, L1 D cache: 16K
CPU: L2 cache: 1024K
CPU: L3 cache: 4096K
CPU: Physical Processor ID: 16
CPU: Processor Core ID: 0
CPU9: Thermal monitoring enabled (TM1)
               Intel(R) Xeon(TM) MP CPU 2.83GHz stepping 01
CPU 9: Syncing TSC to CPU 0.
CPU 9: synchronized TSC with CPU 0 (last diff -100 cycles, maxerr 7556 cycles)
SMP alternatives: switching to SMP code
Booting processor 10/16 APIC 0x26
Initializing CPU#10
Calibrating delay using timer specific routine.. 5672.33 BogoMIPS (lpj=11344676)
CPU: Trace cache: 12K uops, L1 D cache: 16K
CPU: L2 cache: 1024K
CPU: L3 cache: 4096K
CPU: Physical Processor ID: 19
CPU: Processor Core ID: 0
CPU10: Thermal monitoring enabled (TM1)
               Intel(R) Xeon(TM) MP CPU 2.83GHz stepping 01
CPU 10: Syncing TSC to CPU 0.
CPU 10: synchronized TSC with CPU 0 (last diff 405 cycles, maxerr 8126 cycles)
SMP alternatives: switching to SMP code
Booting processor 11/16 APIC 0x27
Initializing CPU#11
Calibrating delay using timer specific routine.. 5672.46 BogoMIPS (lpj=11344939)
CPU: Trace cache: 12K uops, L1 D cache: 16K
CPU: L2 cache: 1024K
CPU: L3 cache: 4096K
CPU: Physical Processor ID: 19
CPU: Processor Core ID: 0
CPU11: Thermal monitoring enabled (TM1)
               Intel(R) Xeon(TM) MP CPU 2.83GHz stepping 01
CPU 11: Syncing TSC to CPU 0.
CPU 11: synchronized TSC with CPU 0 (last diff -145 cycles, maxerr 8568 cycles)
SMP alternatives: switching to SMP code
Booting processor 12/16 APIC 0x30
Initializing CPU#12
Calibrating delay using timer specific routine.. 5672.23 BogoMIPS (lpj=11344472)
CPU: Trace cache: 12K uops, L1 D cache: 16K
CPU: L2 cache: 1024K
CPU: L3 cache: 4096K
CPU: Physical Processor ID: 24
CPU: Processor Core ID: 0
CPU12: Thermal monitoring enabled (TM1)
               Intel(R) Xeon(TM) MP CPU 2.83GHz stepping 01
CPU 12: Syncing TSC to CPU 0.
CPU 12: synchronized TSC with CPU 0 (last diff 419 cycles, maxerr 8602 cycles)
SMP alternatives: switching to SMP code
Booting processor 13/16 APIC 0x31
Initializing CPU#13
Calibrating delay using timer specific routine.. 5672.34 BogoMIPS (lpj=11344689)
CPU: Trace cache: 12K uops, L1 D cache: 16K
CPU: L2 cache: 1024K
CPU: L3 cache: 4096K
CPU: Physical Processor ID: 24
CPU: Processor Core ID: 0
CPU13: Thermal monitoring enabled (TM1)
               Intel(R) Xeon(TM) MP CPU 2.83GHz stepping 01
CPU 13: Syncing TSC to CPU 0.
CPU 13: synchronized TSC with CPU 0 (last diff 242 cycles, maxerr 8636 cycles)
SMP alternatives: switching to SMP code
Booting processor 14/16 APIC 0x36
Initializing CPU#14
Calibrating delay using timer specific routine.. 5672.32 BogoMIPS (lpj=11344644)
CPU: Trace cache: 12K uops, L1 D cache: 16K
CPU: L2 cache: 1024K
CPU: L3 cache: 4096K
CPU: Physical Processor ID: 27
CPU: Processor Core ID: 0
CPU14: Thermal monitoring enabled (TM1)
               Intel(R) Xeon(TM) MP CPU 2.83GHz stepping 01
CPU 14: Syncing TSC to CPU 0.
CPU 14: synchronized TSC with CPU 0 (last diff -272 cycles, maxerr 8109 cycles)
SMP alternatives: switching to SMP code
Booting processor 15/16 APIC 0x37
Initializing CPU#15
Calibrating delay using timer specific routine.. 5672.21 BogoMIPS (lpj=11344423)
CPU: Trace cache: 12K uops, L1 D cache: 16K
CPU: L2 cache: 1024K
CPU: L3 cache: 4096K
CPU: Physical Processor ID: 27
CPU: Processor Core ID: 0
CPU15: Thermal monitoring enabled (TM1)
               Intel(R) Xeon(TM) MP CPU 2.83GHz stepping 01
CPU 15: Syncing TSC to CPU 0.
CPU 15: synchronized TSC with CPU 0 (last diff -21 cycles, maxerr 8560 cycles)
Brought up 16 CPUs
testing NMI watchdog ... OK.
time.c: Using 333.333333 MHz WALL PIT GTOD PIT/HPET timer.
time.c: Detected 2835.773 MHz processor.
afinfo corrupted at init/main.c:755
migration_cost=19,988
afinfo corrupted at init/main.c:761
afinfo corrupted at init/main.c:769
Calling initcall 0xffffffff802166c0: init_smp_flush+0x0/0x60()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80607a40: helper_init+0x0/0x40()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80607dd0: pm_init+0x0/0x40()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80607e50: ksysfs_init+0x0/0x40()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060a720: filelock_init+0x0/0x40()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060b230: init_script_binfmt+0x0/0x10()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060b240: init_elf_binfmt+0x0/0x10()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80614690: sock_init+0x0/0x60()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80614e30: netlink_proto_init+0x0/0x1a0()
afinfo corrupted at init/main.c:659
NET: Registered protocol family 16
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060c310: kobject_uevent_init+0x0/0x40()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060c4a0: pcibus_class_init+0x0/0x10()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060ca70: pci_driver_init+0x0/0x10()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060ef30: tty_class_init+0x0/0x30()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060fa20: vtconsole_class_init+0x0/0xc0()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060cbb0: acpi_pci_init+0x0/0x40()
afinfo corrupted at init/main.c:659
ACPI: bus type pci registered
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060d8ef: init_acpi_device_notify+0x0/0x4b()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80613aa0: pci_access_init+0x0/0x30()
afinfo corrupted at init/main.c:659
PCI: Using configuration type 1
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80605760: topology_init+0x0/0x40()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80607770: param_sysfs_init+0x0/0x200()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80249d00: pm_sysrq_init+0x0/0x20()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060aee0: init_bio+0x0/0x110()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060c1d0: genhd_device_init+0x0/0x60()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060d702: acpi_init+0x0/0x1ed()
afinfo corrupted at init/main.c:659
ACPI: Interpreter enabled
ACPI: Using IOAPIC for interrupt routing
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060dbd5: acpi_ec_init+0x0/0x62()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060dfee: acpi_pci_root_init+0x0/0x28()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060e036: acpi_pci_link_init+0x0/0x48()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060e1bc: acpi_power_init+0x0/0x77()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060e233: acpi_system_init+0x0/0xc6()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060e2f9: acpi_event_init+0x0/0x3f()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060e338: acpi_scan_init+0x0/0x1ac()
afinfo corrupted at init/main.c:659
ACPI: PCI Root Bridge [VP00] (0000:00)
PCI: Ignoring BAR0-3 of IDE controller 0000:00:0f.1
ACPI: PCI Root Bridge [VP01] (0000:01)
ACPI: PCI Root Bridge [VP02] (0000:02)
ACPI: PCI Root Bridge [VP03] (0000:04)
ACPI: PCI Root Bridge [VP04] (0000:06)
ACPI: PCI Root Bridge [VP05] (0000:08)
ACPI: PCI Root Bridge [VP06] (0000:0a)
ACPI: PCI Root Bridge [VP07] (0000:0c)
ACPI: PCI Root Bridge [VP10] (0000:0e)
ACPI: PCI Root Bridge [VP11] (0000:0f)
ACPI: PCI Root Bridge [VP12] (0000:10)
ACPI: PCI Root Bridge [VP13] (0000:12)
ACPI: PCI Root Bridge [VP14] (0000:14)
ACPI: PCI Root Bridge [VP15] (0000:16)
ACPI: PCI Root Bridge [VP16] (0000:18)
ACPI: PCI Root Bridge [VP17] (0000:1a)
ACPI: PCI Root Bridge [VP20] (0000:1c)
ACPI: PCI Root Bridge [VP21] (0000:1d)
ACPI: PCI Root Bridge [VP22] (0000:1e)
ACPI: PCI Root Bridge [VP23] (0000:20)
ACPI: PCI Root Bridge [VP24] (0000:22)
ACPI: PCI Root Bridge [VP25] (0000:24)
ACPI: PCI Root Bridge [VP26] (0000:26)
ACPI: PCI Root Bridge [VP27] (0000:28)
ACPI: PCI Root Bridge [VP30] (0000:2a)
ACPI: PCI Root Bridge [VP31] (0000:2b)
ACPI: PCI Root Bridge [VP32] (0000:2c)
ACPI: PCI Root Bridge [VP33] (0000:2e)
ACPI: PCI Root Bridge [VP34] (0000:30)
ACPI: PCI Root Bridge [VP35] (0000:32)
ACPI: PCI Root Bridge [VP36] (0000:34)
ACPI: PCI Root Bridge [VP37] (0000:36)
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060e654: acpi_cm_sbs_init+0x0/0xc()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060e660: pnp_init+0x0/0x30()
afinfo corrupted at init/main.c:659
Linux Plug and Play Support v0.97 (c) Adam Belay
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060e8f0: pnpacpi_init+0x0/0x70()
afinfo corrupted at init/main.c:659
pnp: PnP ACPI init
pnp: PnP ACPI: found 47 devices
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060f490: misc_init+0x0/0x90()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80375670: cn_init+0x0/0xe0()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff806117f0: init_scsi+0x0/0x90()
afinfo corrupted at init/main.c:659
SCSI subsystem initialized
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff806124d0: serio_init+0x0/0xd0()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff806128f0: input_init+0x0/0x120()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80612d00: rtc_init+0x0/0x50()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80612d50: rtc_sysfs_init+0x0/0x10()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80612d60: rtc_proc_init+0x0/0x10()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80612d70: rtc_dev_init+0x0/0xb0()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80613ad0: pci_acpi_init+0x0/0xb0()
afinfo corrupted at init/main.c:659
PCI: Using ACPI for IRQ routing
PCI: If a device doesn't work, try "pci=routeirq".  If it helps, post a report
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80613b80: pci_legacy_init+0x0/0x120()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80614130: pcibios_irq_init+0x0/0x4f0()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80614620: pcibios_init+0x0/0x70()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80614750: proto_init+0x0/0x40()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff806148f0: net_dev_init+0x0/0x210()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80614fd0: genl_init+0x0/0xb0()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff805fdfc0: late_hpet_init+0x0/0xb0()
afinfo corrupted at init/main.c:659
hpet0: at MMIO 0xfde84000, IRQs 2, 8, 0
hpet0: 3 64-bit timers, 3707069 Hz
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff806000b0: pci_iommu_init+0x0/0x20()
afinfo corrupted at init/main.c:659
PCI-GART: No AMD northbridge found.
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060a6a0: init_pipe_fs+0x0/0x50()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060e524: acpi_motherboard_init+0x0/0x130()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060e790: pnp_system_init+0x0/0x10()
afinfo corrupted at init/main.c:659
pnp: 00:0a: ioport range 0x400-0x47f has been reserved
pnp: 00:0a: ioport range 0x480-0x4ff could not be reserved
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060ec70: chr_dev_init+0x0/0x80()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80610a40: firmware_class_init+0x0/0x80()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff806134b0: pcibios_assign_resources+0x0/0x90()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff806159e0: inet_init+0x0/0x400()
afinfo corrupted at init/main.c:659
NET: Registered protocol family 2
IP route cache hash table entries: 524288 (order: 10, 4194304 bytes)
TCP established hash table entries: 262144 (order: 10, 4194304 bytes)
TCP bind hash table entries: 65536 (order: 8, 1048576 bytes)
TCP: Hash tables configured (established 262144 bind 65536)
TCP reno registered
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8020db10: time_init_device+0x0/0x30()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff805fe9f0: i8259A_init_sysfs+0x0/0x30()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff805fe9c0: init_timer_sysfs+0x0/0x30()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff805ff010: vsyscall_init+0x0/0xb0()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff805ff2a0: sbf_init+0x0/0xe0()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80600080: i8237A_init_sysfs+0x0/0x30()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80600500: periodic_mcheck_init+0x0/0x30()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80600530: mce_init_device+0x0/0x80()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80600670: thermal_throttle_init_device+0x0/0x70()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff806006e0: threshold_init_device+0x0/0x50()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80601ee0: init_lapic_sysfs+0x0/0x40()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80602a80: ioapic_init_sysfs+0x0/0xf0()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8021d1f0: cache_sysfs_init+0x0/0x60()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80605870: x8664_sysctl_init+0x0/0x20()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80606d30: create_proc_profile+0x0/0x280()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80607170: ioresources_init+0x0/0x50()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff806072e0: timekeeping_init_device+0x0/0x30()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80607400: uid_cache_init+0x0/0x90()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80607970: init_posix_timers+0x0/0xd0()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80607a80: init_posix_cpu_timers+0x0/0xf0()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80607ba0: latency_init+0x0/0x30()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80607c90: init_clocksource_sysfs+0x0/0x60()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80607cf0: init_jiffies_clocksource+0x0/0x10()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80607d00: init+0x0/0x70()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80607d70: proc_dma_init+0x0/0x30()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80245840: percpu_modinit+0x0/0x80()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80607da0: kallsyms_init+0x0/0x30()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80607e10: ikconfig_init+0x0/0x40()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80608f60: init_per_zone_pages_min+0x0/0x60()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80609ed0: pdflush_init+0x0/0x20()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80609f20: kswapd_init+0x0/0x30()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80609f50: setup_vmstat+0x0/0x20()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80609fc0: procswaps_init+0x0/0x30()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060a030: hugetlb_init+0x0/0x70()
afinfo corrupted at init/main.c:659
Total HugeTLB memory allocated, 0
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060a0a0: init_tmpfs+0x0/0xe0()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060a180: cpucache_init+0x0/0x40()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060a6f0: fasync_init+0x0/0x30()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060ae00: aio_setup+0x0/0x70()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060b080: inotify_setup+0x0/0x10()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060b090: inotify_user_setup+0x0/0xc0()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060b150: eventpoll_init+0x0/0xe0()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060b250: init_mbcache+0x0/0x30()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060b280: dnotify_init+0x0/0x30()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060b740: init_devpts_fs+0x0/0x40()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060b780: init_reiserfs_fs+0x0/0x80()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060b800: init_ext3_fs+0x0/0x70()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060b930: journal_init+0x0/0xe0()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060ba10: init_ext2_fs+0x0/0x70()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060bad0: init_ramfs_fs+0x0/0x10()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060bae0: init_hugetlbfs_fs+0x0/0x80()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060bba0: init_fat_fs+0x0/0x50()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060bbf0: init_vfat_fs+0x0/0x10()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060bc00: init_nls_cp437+0x0/0x10()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060bc10: init_nls_iso8859_1+0x0/0x10()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060bc20: init_autofs_fs+0x0/0x10()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060bc30: init_autofs4_fs+0x0/0x10()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
initcall at 0xffffffff8060bc30: init_autofs4_fs+0x0/0x10(): returned with error code -16
Calling initcall 0xffffffff8060bc40: ipc_init+0x0/0x20()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060bf10: init_mqueue_fs+0x0/0xe0()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060bff0: crypto_algapi_init+0x0/0x10()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060c030: init+0x0/0x10()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060c040: init+0x0/0x10()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060c230: noop_init+0x0/0x10()
afinfo corrupted at init/main.c:659
io scheduler noop registered
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060c240: as_init+0x0/0x10()
afinfo corrupted at init/main.c:659
io scheduler anticipatory registered (default)
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060c250: deadline_init+0x0/0x10()
afinfo corrupted at init/main.c:659
io scheduler deadline registered
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060c260: cfq_init+0x0/0xb0()
afinfo corrupted at init/main.c:659
io scheduler cfq registered
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8032c1d0: pci_init+0x0/0x30()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060ca80: pci_sysfs_init+0x0/0x40()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060cac0: pci_proc_init+0x0/0x70()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060d93a: acpi_ac_init+0x0/0x45()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060d97f: acpi_battery_init+0x0/0x45()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060df90: acpi_video_init+0x0/0x5e()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060e07e: irqrouter_init_sysfs+0x0/0x38()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060ed10: rand_initialize+0x0/0x30()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060ed40: tty_init+0x0/0x1f0()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060efa0: pty_init+0x0/0x4f0()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060fae0: hpet_init+0x0/0x70()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060fb50: agp_init+0x0/0x30()
afinfo corrupted at init/main.c:659
Linux agpgart interface v0.101 (c) Dave Jones
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060fcb0: cn_proc_init+0x0/0x40()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff806100f0: serial8250_init+0x0/0x150()
afinfo corrupted at init/main.c:659
Serial: 8250/16550 driver $Revision: 1.90 $ 4 ports, IRQ sharing disabled
serial8250: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
serial8250: ttyS1 at I/O 0x2f8 (irq = 3) is a 16550A
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80610320: serial8250_pnp_init+0x0/0x10()
afinfo corrupted at init/main.c:659
00:03: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
00:04: ttyS1 at I/O 0x2f8 (irq = 3) is a 16550A
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80610330: serial8250_pci_init+0x0/0x10()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80384c90: topology_sysfs_init+0x0/0x50()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80610ac0: e1000_init_module+0x0/0x50()
afinfo corrupted at init/main.c:659
Intel(R) PRO/1000 Network Driver - version 7.2.9-k2
Copyright (c) 1999-2006 Intel Corporation.
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80610b10: tg3_init+0x0/0x10()
afinfo corrupted at init/main.c:659
tg3.c:v3.66 (September 23, 2006)
ACPI: PCI Interrupt 0000:01:01.0[A] -> GSI 24 (level, low) -> IRQ 24
eth0: Tigon3 [partno(BCM95704A6) rev 2100 PHY(5704)] (PCIX:66MHz:64-bit) 10/100/1000BaseT Ethernet 00:0d:60:98:63:54
eth0: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[1] Split[0] WireSpeed[1] TSOcap[0]
eth0: dma_rwctrl[769f0000] dma_mask[64-bit]
ACPI: PCI Interrupt 0000:01:01.1[B] -> GSI 28 (level, low) -> IRQ 28
eth1: Tigon3 [partno(BCM95704A6) rev 2100 PHY(5704)] (PCIX:66MHz:64-bit) 10/100/1000BaseT Ethernet 00:0d:60:98:63:55
eth1: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[0] Split[0] WireSpeed[1] TSOcap[1]
eth1: dma_rwctrl[769f0000] dma_mask[64-bit]
ACPI: PCI Interrupt 0000:0f:01.0[A] -> GSI 96 (level, low) -> IRQ 96
eth2: Tigon3 [partno(BCM95704A6) rev 2100 PHY(5704)] (PCIX:66MHz:64-bit) 10/100/1000BaseT Ethernet 00:14:5e:1c:45:0c
eth2: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[1] Split[0] WireSpeed[1] TSOcap[0]
eth2: dma_rwctrl[769f0000] dma_mask[64-bit]
ACPI: PCI Interrupt 0000:0f:01.1[B] -> GSI 100 (level, low) -> IRQ 100
eth3: Tigon3 [partno(BCM95704A6) rev 2100 PHY(5704)] (PCIX:66MHz:64-bit) 10/100/1000BaseT Ethernet 00:14:5e:1c:45:0d
eth3: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[0] Split[0] WireSpeed[1] TSOcap[1]
eth3: dma_rwctrl[769f0000] dma_mask[64-bit]
ACPI: PCI Interrupt 0000:1d:01.0[A] -> GSI 168 (level, low) -> IRQ 168
eth4: Tigon3 [partno(BCM95704A6) rev 2100 PHY(5704)] (PCIX:66MHz:64-bit) 10/100/1000BaseT Ethernet 00:14:5e:1c:45:6c
eth4: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[1] Split[0] WireSpeed[1] TSOcap[0]
eth4: dma_rwctrl[769f0000] dma_mask[64-bit]
ACPI: PCI Interrupt 0000:1d:01.1[B] -> GSI 172 (level, low) -> IRQ 172
eth5: Tigon3 [partno(BCM95704A6) rev 2100 PHY(5704)] (PCIX:66MHz:64-bit) 10/100/1000BaseT Ethernet 00:14:5e:1c:45:6d
eth5: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[0] Split[0] WireSpeed[1] TSOcap[1]
eth5: dma_rwctrl[769f0000] dma_mask[64-bit]
ACPI: PCI Interrupt 0000:2b:01.0[A] -> GSI 240 (level, low) -> IRQ 240
eth6: Tigon3 [partno(BCM95704A6) rev 2100 PHY(5704)] (PCIX:66MHz:64-bit) 10/100/1000BaseT Ethernet 00:14:5e:1c:43:82
eth6: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[1] Split[0] WireSpeed[1] TSOcap[0]
eth6: dma_rwctrl[769f0000] dma_mask[64-bit]
ACPI: PCI Interrupt 0000:2b:01.1[B] -> GSI 244 (level, low) -> IRQ 244
eth7: Tigon3 [partno(BCM95704A6) rev 2100 PHY(5704)] (PCIX:66MHz:64-bit) 10/100/1000BaseT Ethernet 00:14:5e:1c:43:83
eth7: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[0] Split[0] WireSpeed[1] TSOcap[1]
eth7: dma_rwctrl[769f0000] dma_mask[64-bit]
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80610ba0: net_olddevs_init+0x0/0xc0()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff803a8630: init_netconsole+0x0/0x80()
afinfo corrupted at init/main.c:659
netconsole: not configured, aborting
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff803a8710: cmd64x_ide_init+0x0/0x10()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80610c70: piix_ide_init+0x0/0xd0()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff803aa810: svwks_ide_init+0x0/0x10()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff803ab480: generic_ide_init+0x0/0x10()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80610db0: ide_init+0x0/0x90()
afinfo corrupted at init/main.c:659
Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2
ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
SvrWks CSB6: IDE controller at PCI slot 0000:00:0f.1
SvrWks CSB6: chipset revision 160
SvrWks CSB6: not 100% native mode: will probe irqs later
    ide0: BM-DMA at 0x0700-0x0707, BIOS settings: hda:DMA, hdb:DMA
SvrWks CSB6: simplex device: DMA disabled
ide1: SvrWks CSB6 Bus-Master DMA disabled (BIOS)
hda: MATSHITADVD-ROM SR-8178, ATAPI CD/DVD-ROM drive
ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80611780: ide_generic_init+0x0/0x20()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff806117a0: idedisk_init+0x0/0x10()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff806117b0: ide_cdrom_init+0x0/0x10()
afinfo corrupted at init/main.c:659
hda: ATAPI 24X DVD-ROM drive, 256kB Cache, UDMA(66)
Uniform CD-ROM driver Revision: 3.20
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff806117c0: idefloppy_init+0x0/0x30()
afinfo corrupted at init/main.c:659
ide-floppy driver 0.99.newide
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80611a90: raid_init+0x0/0x10()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80611aa0: spi_transport_init+0x0/0x40()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80611ae0: fc_transport_init+0x0/0x50()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80611b30: iscsi_transport_init+0x0/0x120()
afinfo corrupted at init/main.c:659
Loading iSCSI transport class v2.0-685.afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80611c50: sas_transport_init+0x0/0xc0()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80611d10: iscsi_tcp_init+0x0/0x50()
afinfo corrupted at init/main.c:659
iscsi: registered transport (tcp)
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80611d60: aac_init+0x0/0x70()
afinfo corrupted at init/main.c:659
Adaptec aacraid driver (1.1-5[2409]-mh2)
ACPI: PCI Interrupt 0000:01:02.0[A] -> GSI 25 (level, low) -> IRQ 25
AAC0: kernel 5.0-2[8264]
AAC0: monitor 5.0-2[8264]
AAC0: bios 5.0-2[8264]
AAC0: serial 162348
AAC0: 64bit support enabled.
AAC0: 64 Bit DAC enabled
scsi0 : ServeRAID
scsi 0:0:0:0: Direct-Access     IBM      Drive 1          V1.0 PQ: 0 ANSI: 2
scsi 0:0:1:0: Direct-Access     IBM      Drive 2          V1.0 PQ: 0 ANSI: 2
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80611dd0: qla1280_init+0x0/0x10()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80611fa0: sym2_init+0x0/0x110()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff806120b0: init_sd+0x0/0x60()
afinfo corrupted at init/main.c:659
SCSI device sda: 143132672 512-byte hdwr sectors (73284 MB)
sda: assuming Write Enabled
sda: assuming drive cache: write through
SCSI device sda: 143132672 512-byte hdwr sectors (73284 MB)
sda: assuming Write Enabled
sda: assuming drive cache: write through
 sda: sda1 sda2 sda3
sd 0:0:0:0: Attached scsi removable disk sda
SCSI device sdb: 143132672 512-byte hdwr sectors (73284 MB)
sdb: assuming Write Enabled
sdb: assuming drive cache: write through
SCSI device sdb: 143132672 512-byte hdwr sectors (73284 MB)
sdb: assuming Write Enabled
sdb: assuming drive cache: write through
 sdb: sdb1 sdb2 sdb3
sd 0:0:1:0: Attached scsi removable disk sdb
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80612110: fusion_init+0x0/0x100()
afinfo corrupted at init/main.c:659
Fusion MPT base driver 3.04.01
Copyright (c) 1999-2005 LSI Logic Corporation
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80612210: mptspi_init+0x0/0xc0()
afinfo corrupted at init/main.c:659
Fusion MPT SPI Host driver 3.04.01
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff806122d0: mptfc_init+0x0/0xf0()
afinfo corrupted at init/main.c:659
Fusion MPT FC Host driver 3.04.01
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff806123c0: mptctl_init+0x0/0x100()
afinfo corrupted at init/main.c:659
Fusion MPT misc device (ioctl) driver 3.04.01
mptctl: Registered with Fusion MPT base driver
mptctl: /dev/mptctl @ (major,minor=10,220)
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff806124c0: cdrom_init+0x0/0x10()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff806125a0: i8042_init+0x0/0x350()
afinfo corrupted at init/main.c:659
PNP: PS/2 Controller [PNP0303:PS2K,PNP0f13:PS2M] at 0x60,0x64 irq 1,12
serio: i8042 KBD port at 0x60,0x64 irq 1
serio: i8042 AUX port at 0x60,0x64 irq 12
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80612a10: mousedev_init+0x0/0x100()
afinfo corrupted at init/main.c:659
mice: PS/2 mouse device common for all mice
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80612b10: atkbd_init+0x0/0x20()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80612e20: hwmon_init+0x0/0x40()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80614c60: flow_cache_init+0x0/0x1d0()
afinfo corrupted at init/main.c:659
input: AT Translated Set 2 keyboard as /class/input/input0
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff806160f0: init_syncookies+0x0/0x20()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80616110: xfrm4_beet_init+0x0/0x20()
afinfo corrupted at init/main.c:659
Unable to handle kernel NULL pointer dereference at 0000000000000827 RIP:
 [<ffffffff80470666>] xfrm_register_mode+0x36/0x60
PGD 0
Oops: 0000 [1] SMP
CPU 0
Modules linked in:
Pid: 1, comm: swapper Not tainted 2.6.18-git22 #2
RIP: 0010:[<ffffffff80470666>]  [<ffffffff80470666>] xfrm_register_mode+0x36/0x60
RSP: 0000:ffff810bffcbded0  EFLAGS: 00010286
RAX: 000000000000081f RBX: ffffffff805588a0 RCX: 0000000000100000
RDX: ffffffffffffffff RSI: 0000000000000002 RDI: ffffffff80559550
RBP: 00000000ffffffef R08: 0000000000000002 R09: fffffffffffffffd
R10: 0000000000000002 R11: 0000000000000000 R12: 0000000000000000
R13: ffff810bffcbdef0 R14: 0000000000000000 R15: 0000000000000000
FS:  0000000000000000(0000) GS:ffffffff805d2000(0000) knlGS:0000000000000000
CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 0000000000000827 CR3: 0000000000201000 CR4: 00000000000006e0
Process swapper (pid: 1, threadinfo ffff810bffcbc000, task ffff810bffcbb4e0)
Stack:  0000000000000000 0000000000000000 ffffffff8061fee8 ffffffff802071d6
 6f6320726f727265 000036312d206564 0000000000000000 0000000000000000
 0000000000000000 0000000000000000 0000000000000000 0000000000090000
Call Trace:
 [<ffffffff802071d6>] init+0x1b6/0x3b0
 [<ffffffff8020aa28>] child_rip+0xa/0x12
 [<ffffffff80339542>] acpi_ds_init_one_object+0x0/0x82
 [<ffffffff80207020>] init+0x0/0x3b0
 [<ffffffff8020aa1e>] child_rip+0x0/0x12


Code: 48 83 78 08 00 75 06 48 89 58 08 31 ed 48 89 d7 e8 e5 fe ff
RIP  [<ffffffff80470666>] xfrm_register_mode+0x36/0x60
 RSP <ffff810bffcbded0>
CR2: 0000000000000827
 <0>Kernel panic - not syncing: Aiee, killing interrupt handler!


-- 

Steve Fox
IBM Linux Technology Center


* Re: 2.6.18-mm2 boot failure on x86-64
  2006-10-06  2:23                                       ` Steve Fox
@ 2006-10-06 14:33                                         ` Mel Gorman
  2006-10-06 15:36                                           ` Vivek Goyal
  0 siblings, 1 reply; 140+ messages in thread
From: Mel Gorman @ 2006-10-06 14:33 UTC (permalink / raw)
  To: Steve Fox
  Cc: Andi Kleen, Badari Pulavarty, Martin Bligh, vgoyal,
	Andrew Morton, lkml, netdev, kmannth, Andy Whitcroft

On (05/10/06 21:23), Steve Fox didst pronounce:
> On Thu, 2006-10-05 at 22:50 +0200, Andi Kleen wrote:
> > On Thursday 05 October 2006 22:42, Steve Fox wrote:
> > > On Thu, 2006-10-05 at 21:05 +0200, Andi Kleen wrote:
> > > 
> > > > Can you please try it again with this patch to narrow it down further?
> > > 
> > > Unfortunately this is as far as it got before it hung.
> > 
> > Boot with earlyprintk=serial,ttyS0,57600
> > (or change the panic in the checkfunction back to a printk) 
> 
> root (hd0,0)
>  Filesystem type is reiserfs, partition type 0x83
> kernel /boot/vmlinuz-autobench root=/dev/sda1 vga=791  ip=9.47.67.239:9.47.67.5
> 0:9.47.67.1:255.255.255.0 resume=/dev/sdb1 showopts earlyprintk=serial,ttyS0,57
> 600 console=tty0 console=ttyS0,57600 autobench_args: root=/dev/sda1 ABAT:116010
> 0417
>    [Linux-bzImage, setup=0x1400, size=0x1dd855]
> initrd /boot/initrd-autobench.img
>    [Linux-initrd @ 0x37cec000, 0x303f80 bytes]
> 
> Linux version 2.6.18-git22 (root@elm3b239) (gcc version 4.1.0 (SUSE Linux)) #2 SMP Thu Oct 5 19:05:36 PDT 2006
> Command line: root=/dev/sda1 vga=791  ip=9.47.67.239:9.47.67.50:9.47.67.1:255.255.255.0 resume=/dev/sdb1 showopts earlyprintk=serial,ttyS0,57600 console=tty0 console=ttyS0,57600 autobench_args: root=/dev/sda1 ABAT:1160100417
> BIOS-provided physical RAM map:
>  BIOS-e820: 0000000000000000 - 000000000009ac00 (usable)
>  BIOS-e820: 000000000009ac00 - 00000000000a0000 (reserved)
>  BIOS-e820: 00000000000e0000 - 0000000000100000 (reserved)
>  BIOS-e820: 0000000000100000 - 00000000bff764c0 (usable)
>  BIOS-e820: 00000000bff764c0 - 00000000bff98880 (ACPI data)
>  BIOS-e820: 00000000bff98880 - 00000000c0000000 (reserved)
>  BIOS-e820: 00000000fec00000 - 0000000100000000 (reserved)
>  BIOS-e820: 0000000100000000 - 0000000c00000000 (usable)

I continued what Steve was doing this morning to see whether this could be
pinned down. After placing 'CHECK;' in a few places, using the check from
Andi's patch, the problem code was identified as the following in
mm/bootmem.c#init_bootmem_core():

        mapsize = get_mapsize(bdata);
        memset(bdata->node_bootmem_map, 0xff, mapsize);

That at least explains the value in the array. A few more printk calls around
this point produced the following in the boot log:

init_bootmem_core(0, 1909, 0, 12582912)
init_bootmem_core: Calling memset(0xFFFF810000775000, 1572864)
AAGH: afinfo corrupted at mm/bootmem.c:121

where:

1909 == mapstart
0 == start
12582912 == end
1572864 == mapsize

mapstart, start and end are the parameters passed to
init_bootmem_core(). This means we are calling memset on the physical
range 0x775000 -> 0x8F5000 (mapstart 1909 is PFN 0x775, and the mapsize of
1572864 bytes is 0x180000), which appears to be in a usable range according
to the BIOS-e820 map.

However, with 2.6.18-git22, backing out the patch
x86_64-mm-re-positioning-the-bss-segment.patch from 2.6.18-mm2 allowed the
machine to boot. As this patch moves the BSS past the end of the init section,
it seems that an unintentional side-effect of the patch is that the BSS ends
up in a place where init_bootmem clobbers it.

> end_pfn_map = 12582912
> kernel direct mapping tables up to c00000000 @ 8000-39000
> DMI 2.3 present.
> afinfo corrupted at arch/x86_64/kernel/setup.c:462
> afinfo corrupted at arch/x86_64/kernel/setup.c:467
> afinfo corrupted at arch/x86_64/kernel/setup.c:472
> afinfo corrupted at arch/x86_64/kernel/setup.c:483
> afinfo corrupted at arch/x86_64/kernel/setup.c:496
> afinfo corrupted at arch/x86_64/kernel/setup.c:504
> afinfo corrupted at arch/x86_64/kernel/setup.c:510
> afinfo corrupted at arch/x86_64/kernel/setup.c:529
> afinfo corrupted at arch/x86_64/kernel/setup.c:537
> Zone PFN ranges:
>   DMA             0 ->     4096
>   DMA32        4096 ->  1048576
>   Normal    1048576 -> 12582912
> early_node_map[3] active PFN ranges
>     0:        0 ->      154
>     0:      256 ->   786294
>     0:  1048576 -> 12582912
> afinfo corrupted at arch/x86_64/kernel/setup.c:540
> afinfo corrupted at arch/x86_64/kernel/setup.c:545
> ACPI: PM-Timer IO Port: 0x9c
> ACPI: LAPIC (acpi_id[0x00] lapic_id[0x00] enabled)
> Processor #0 (Bootup-CPU)
> ACPI: LAPIC (acpi_id[0x01] lapic_id[0x01] enabled)
> Processor #1
> ACPI: LAPIC (acpi_id[0x02] lapic_id[0x06] enabled)
> Processor #6
> ACPI: LAPIC (acpi_id[0x03] lapic_id[0x07] enabled)
> Processor #7
> ACPI: LAPIC (acpi_id[0x04] lapic_id[0x10] enabled)
> Processor #16
> ACPI: LAPIC (acpi_id[0x05] lapic_id[0x11] enabled)
> Processor #17
> ACPI: LAPIC (acpi_id[0x06] lapic_id[0x16] enabled)
> Processor #22
> ACPI: LAPIC (acpi_id[0x07] lapic_id[0x17] enabled)
> Processor #23
> ACPI: LAPIC (acpi_id[0x10] lapic_id[0x20] enabled)
> Processor #32
> ACPI: LAPIC (acpi_id[0x11] lapic_id[0x21] enabled)
> Processor #33
> ACPI: LAPIC (acpi_id[0x12] lapic_id[0x26] enabled)
> Processor #38
> ACPI: LAPIC (acpi_id[0x13] lapic_id[0x27] enabled)
> Processor #39
> ACPI: LAPIC (acpi_id[0x14] lapic_id[0x30] enabled)
> Processor #48
> ACPI: LAPIC (acpi_id[0x15] lapic_id[0x31] enabled)
> Processor #49
> ACPI: LAPIC (acpi_id[0x16] lapic_id[0x36] enabled)
> Processor #54
> ACPI: LAPIC (acpi_id[0x17] lapic_id[0x37] enabled)
> Processor #55
> ACPI: LAPIC (acpi_id[0x20] lapic_id[0x40] enabled)
> Processor #64
> WARNING: NR_CPUS limit of 16 reached. Processor ignored.
> ACPI: LAPIC (acpi_id[0x21] lapic_id[0x41] enabled)
> Processor #65
> WARNING: NR_CPUS limit of 16 reached. Processor ignored.
> ACPI: LAPIC (acpi_id[0x22] lapic_id[0x46] enabled)
> Processor #70
> WARNING: NR_CPUS limit of 16 reached. Processor ignored.
> ACPI: LAPIC (acpi_id[0x23] lapic_id[0x47] enabled)
> Processor #71
> WARNING: NR_CPUS limit of 16 reached. Processor ignored.
> ACPI: LAPIC (acpi_id[0x24] lapic_id[0x50] enabled)
> Processor #80
> WARNING: NR_CPUS limit of 16 reached. Processor ignored.
> ACPI: LAPIC (acpi_id[0x25] lapic_id[0x51] enabled)
> Processor #81
> WARNING: NR_CPUS limit of 16 reached. Processor ignored.
> ACPI: LAPIC (acpi_id[0x26] lapic_id[0x56] enabled)
> Processor #86
> WARNING: NR_CPUS limit of 16 reached. Processor ignored.
> ACPI: LAPIC (acpi_id[0x27] lapic_id[0x57] enabled)
> Processor #87
> WARNING: NR_CPUS limit of 16 reached. Processor ignored.
> ACPI: LAPIC (acpi_id[0x30] lapic_id[0x60] enabled)
> Processor #96
> WARNING: NR_CPUS limit of 16 reached. Processor ignored.
> ACPI: LAPIC (acpi_id[0x31] lapic_id[0x61] enabled)
> Processor #97
> WARNING: NR_CPUS limit of 16 reached. Processor ignored.
> ACPI: LAPIC (acpi_id[0x32] lapic_id[0x66] enabled)
> Processor #102
> WARNING: NR_CPUS limit of 16 reached. Processor ignored.
> ACPI: LAPIC (acpi_id[0x33] lapic_id[0x67] enabled)
> Processor #103
> WARNING: NR_CPUS limit of 16 reached. Processor ignored.
> ACPI: LAPIC (acpi_id[0x34] lapic_id[0x70] enabled)
> Processor #112
> WARNING: NR_CPUS limit of 16 reached. Processor ignored.
> ACPI: LAPIC (acpi_id[0x35] lapic_id[0x71] enabled)
> Processor #113
> WARNING: NR_CPUS limit of 16 reached. Processor ignored.
> ACPI: LAPIC (acpi_id[0x36] lapic_id[0x76] enabled)
> Processor #118
> WARNING: NR_CPUS limit of 16 reached. Processor ignored.
> ACPI: LAPIC (acpi_id[0x37] lapic_id[0x77] enabled)
> Processor #119
> WARNING: NR_CPUS limit of 16 reached. Processor ignored.
> ACPI: LAPIC_NMI (acpi_id[0x00] dfl dfl lint[0x1])
> ACPI: LAPIC_NMI (acpi_id[0x01] dfl dfl lint[0x1])
> ACPI: LAPIC_NMI (acpi_id[0x02] dfl dfl lint[0x1])
> ACPI: LAPIC_NMI (acpi_id[0x03] dfl dfl lint[0x1])
> ACPI: LAPIC_NMI (acpi_id[0x04] dfl dfl lint[0x1])
> ACPI: LAPIC_NMI (acpi_id[0x05] dfl dfl lint[0x1])
> ACPI: LAPIC_NMI (acpi_id[0x06] dfl dfl lint[0x1])
> ACPI: LAPIC_NMI (acpi_id[0x07] dfl dfl lint[0x1])
> ACPI: LAPIC_NMI (acpi_id[0x10] dfl dfl lint[0x1])
> ACPI: LAPIC_NMI (acpi_id[0x11] dfl dfl lint[0x1])
> ACPI: LAPIC_NMI (acpi_id[0x12] dfl dfl lint[0x1])
> ACPI: LAPIC_NMI (acpi_id[0x13] dfl dfl lint[0x1])
> ACPI: LAPIC_NMI (acpi_id[0x14] dfl dfl lint[0x1])
> ACPI: LAPIC_NMI (acpi_id[0x15] dfl dfl lint[0x1])
> ACPI: LAPIC_NMI (acpi_id[0x16] dfl dfl lint[0x1])
> ACPI: LAPIC_NMI (acpi_id[0x17] dfl dfl lint[0x1])
> ACPI: LAPIC_NMI (acpi_id[0x20] dfl dfl lint[0x1])
> ACPI: LAPIC_NMI (acpi_id[0x21] dfl dfl lint[0x1])
> ACPI: LAPIC_NMI (acpi_id[0x22] dfl dfl lint[0x1])
> ACPI: LAPIC_NMI (acpi_id[0x23] dfl dfl lint[0x1])
> ACPI: LAPIC_NMI (acpi_id[0x24] dfl dfl lint[0x1])
> ACPI: LAPIC_NMI (acpi_id[0x25] dfl dfl lint[0x1])
> ACPI: LAPIC_NMI (acpi_id[0x26] dfl dfl lint[0x1])
> ACPI: LAPIC_NMI (acpi_id[0x27] dfl dfl lint[0x1])
> ACPI: LAPIC_NMI (acpi_id[0x30] dfl dfl lint[0x1])
> ACPI: LAPIC_NMI (acpi_id[0x31] dfl dfl lint[0x1])
> ACPI: LAPIC_NMI (acpi_id[0x32] dfl dfl lint[0x1])
> ACPI: LAPIC_NMI (acpi_id[0x33] dfl dfl lint[0x1])
> ACPI: LAPIC_NMI (acpi_id[0x34] dfl dfl lint[0x1])
> ACPI: LAPIC_NMI (acpi_id[0x35] dfl dfl lint[0x1])
> ACPI: LAPIC_NMI (acpi_id[0x36] dfl dfl lint[0x1])
> ACPI: LAPIC_NMI (acpi_id[0x37] dfl dfl lint[0x1])
> ACPI: IOAPIC (id[0x0f] address[0xfec00000] gsi_base[0])
> IOAPIC[0]: apic_id 15, address 0xfec00000, GSI 0-35
> ACPI: IOAPIC (id[0x0e] address[0xfec01000] gsi_base[36])
> IOAPIC[1]: apic_id 14, address 0xfec01000, GSI 36-71
> ACPI: IOAPIC (id[0x0d] address[0xfec02000] gsi_base[72])
> IOAPIC[2]: apic_id 13, address 0xfec02000, GSI 72-107
> ACPI: IOAPIC (id[0x0c] address[0xfec03000] gsi_base[108])
> IOAPIC[3]: apic_id 12, address 0xfec03000, GSI 108-143
> ACPI: IOAPIC (id[0x0b] address[0xfec04000] gsi_base[144])
> IOAPIC[4]: apic_id 11, address 0xfec04000, GSI 144-179
> ACPI: IOAPIC (id[0x0a] address[0xfec05000] gsi_base[180])
> IOAPIC[5]: apic_id 10, address 0xfec05000, GSI 180-215
> ACPI: IOAPIC (id[0x09] address[0xfec06000] gsi_base[216])
> IOAPIC[6]: apic_id 9, address 0xfec06000, GSI 216-251
> ACPI: IOAPIC (id[0x08] address[0xfec07000] gsi_base[252])
> IOAPIC[7]: apic_id 8, address 0xfec07000, GSI 252-287
> ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
> ACPI: INT_SRC_OVR (bus 0 bus_irq 8 global_irq 8 low edge)
> ACPI: INT_SRC_OVR (bus 0 bus_irq 14 global_irq 14 low edge)
> Setting APIC routing to clustered
> ACPI: HPET id: 0x10142201 base: 0xfde84000
> afinfo corrupted at arch/x86_64/kernel/setup.c:559
> afinfo corrupted at arch/x86_64/kernel/setup.c:562
> Using ACPI (MADT) for SMP configuration information
> afinfo corrupted at arch/x86_64/kernel/setup.c:569
> afinfo corrupted at arch/x86_64/kernel/setup.c:572
> afinfo corrupted at arch/x86_64/kernel/setup.c:579
> afinfo corrupted at arch/x86_64/kernel/setup.c:582
> Nosave address range: 000000000009a000 - 000000000009b000
> Nosave address range: 000000000009b000 - 00000000000a0000
> Nosave address range: 00000000000a0000 - 00000000000e0000
> Nosave address range: 00000000000e0000 - 0000000000100000
> Nosave address range: 00000000bff76000 - 00000000bff77000
> Nosave address range: 00000000bff77000 - 00000000bff98000
> Nosave address range: 00000000bff98000 - 00000000bff99000
> Nosave address range: 00000000bff99000 - 00000000c0000000
> Nosave address range: 00000000c0000000 - 00000000fec00000
> Nosave address range: 00000000fec00000 - 0000000100000000
> afinfo corrupted at arch/x86_64/kernel/setup.c:585
> afinfo corrupted at arch/x86_64/kernel/setup.c:588
> afinfo corrupted at arch/x86_64/kernel/setup.c:596
> Allocating PCI resources starting at c4000000 (gap: c0000000:3ec00000)
> afinfo corrupted at arch/x86_64/kernel/setup.c:599
> afinfo corrupted at init/main.c:512
> SMP: Allowing 16 CPUs, 0 hotplug CPUs
> PERCPU: Allocating 33920 bytes of per cpu data
> afinfo corrupted at init/main.c:527
> Built 1 zonelists.  Total pages: 12147064
> Kernel command line: root=/dev/sda1 vga=791  ip=9.47.67.239:9.47.67.50:9.47.67.1:255.255.255.0 resume=/dev/sdb1 showopts earlyprintk=serial,ttyS0,57600 console=tty0 console=ttyS0,57600 autobench_args: root=/dev/sda1 ABAT:1160100417
> afinfo corrupted at init/main.c:536
> Initializing CPU#0
> PID hash table entries: 4096 (order: 12, 32768 bytes)
> afinfo corrupted at init/main.c:545
> afinfo corrupted at init/main.c:548
> disabling early console
> Linux version 2.6.18-git22 (root@elm3b239) (gcc version 4.1.0 (SUSE Linux)) #2 SMP Thu Oct 5 19:05:36 PDT 2006
> Command line: root=/dev/sda1 vga=791  ip=9.47.67.239:9.47.67.50:9.47.67.1:255.255.255.0 resume=/dev/sdb1 showopts earlyprintk=serial,ttyS0,57600 console=tty0 console=ttyS0,57600 autobench_args: root=/dev/sda1 ABAT:1160100417
> BIOS-provided physical RAM map:
>  BIOS-e820: 0000000000000000 - 000000000009ac00 (usable)
>  BIOS-e820: 000000000009ac00 - 00000000000a0000 (reserved)
>  BIOS-e820: 00000000000e0000 - 0000000000100000 (reserved)
>  BIOS-e820: 0000000000100000 - 00000000bff764c0 (usable)
>  BIOS-e820: 00000000bff764c0 - 00000000bff98880 (ACPI data)
>  BIOS-e820: 00000000bff98880 - 00000000c0000000 (reserved)
>  BIOS-e820: 00000000fec00000 - 0000000100000000 (reserved)
>  BIOS-e820: 0000000100000000 - 0000000c00000000 (usable)
> end_pfn_map = 12582912
> DMI 2.3 present.
> afinfo corrupted at arch/x86_64/kernel/setup.c:462
> afinfo corrupted at arch/x86_64/kernel/setup.c:467
> afinfo corrupted at arch/x86_64/kernel/setup.c:472
> afinfo corrupted at arch/x86_64/kernel/setup.c:483
> afinfo corrupted at arch/x86_64/kernel/setup.c:496
> afinfo corrupted at arch/x86_64/kernel/setup.c:504
> afinfo corrupted at arch/x86_64/kernel/setup.c:510
> afinfo corrupted at arch/x86_64/kernel/setup.c:529
> afinfo corrupted at arch/x86_64/kernel/setup.c:537
> Zone PFN ranges:
>   DMA             0 ->     4096
>   DMA32        4096 ->  1048576
>   Normal    1048576 -> 12582912
> early_node_map[3] active PFN ranges
>     0:        0 ->      154
>     0:      256 ->   786294
>     0:  1048576 -> 12582912
> afinfo corrupted at arch/x86_64/kernel/setup.c:540
> afinfo corrupted at arch/x86_64/kernel/setup.c:545
> ACPI: PM-Timer IO Port: 0x9c
> ACPI: LAPIC (acpi_id[0x00] lapic_id[0x00] enabled)
> Processor #0 (Bootup-CPU)
> ACPI: LAPIC (acpi_id[0x01] lapic_id[0x01] enabled)
> Processor #1
> ACPI: LAPIC (acpi_id[0x02] lapic_id[0x06] enabled)
> Processor #6
> ACPI: LAPIC (acpi_id[0x03] lapic_id[0x07] enabled)
> Processor #7
> ACPI: LAPIC (acpi_id[0x04] lapic_id[0x10] enabled)
> Processor #16
> ACPI: LAPIC (acpi_id[0x05] lapic_id[0x11] enabled)
> Processor #17
> ACPI: LAPIC (acpi_id[0x06] lapic_id[0x16] enabled)
> Processor #22
> ACPI: LAPIC (acpi_id[0x07] lapic_id[0x17] enabled)
> Processor #23
> ACPI: LAPIC (acpi_id[0x10] lapic_id[0x20] enabled)
> Processor #32
> ACPI: LAPIC (acpi_id[0x11] lapic_id[0x21] enabled)
> Processor #33
> ACPI: LAPIC (acpi_id[0x12] lapic_id[0x26] enabled)
> Processor #38
> ACPI: LAPIC (acpi_id[0x13] lapic_id[0x27] enabled)
> Processor #39
> ACPI: LAPIC (acpi_id[0x14] lapic_id[0x30] enabled)
> Processor #48
> ACPI: LAPIC (acpi_id[0x15] lapic_id[0x31] enabled)
> Processor #49
> ACPI: LAPIC (acpi_id[0x16] lapic_id[0x36] enabled)
> Processor #54
> ACPI: LAPIC (acpi_id[0x17] lapic_id[0x37] enabled)
> Processor #55
> ACPI: LAPIC (acpi_id[0x20] lapic_id[0x40] enabled)
> Processor #64
> WARNING: NR_CPUS limit of 16 reached. Processor ignored.
> ACPI: LAPIC (acpi_id[0x21] lapic_id[0x41] enabled)
> Processor #65
> WARNING: NR_CPUS limit of 16 reached. Processor ignored.
> ACPI: LAPIC (acpi_id[0x22] lapic_id[0x46] enabled)
> Processor #70
> WARNING: NR_CPUS limit of 16 reached. Processor ignored.
> ACPI: LAPIC (acpi_id[0x23] lapic_id[0x47] enabled)
> Processor #71
> WARNING: NR_CPUS limit of 16 reached. Processor ignored.
> ACPI: LAPIC (acpi_id[0x24] lapic_id[0x50] enabled)
> Processor #80
> WARNING: NR_CPUS limit of 16 reached. Processor ignored.
> ACPI: LAPIC (acpi_id[0x25] lapic_id[0x51] enabled)
> Processor #81
> WARNING: NR_CPUS limit of 16 reached. Processor ignored.
> ACPI: LAPIC (acpi_id[0x26] lapic_id[0x56] enabled)
> Processor #86
> WARNING: NR_CPUS limit of 16 reached. Processor ignored.
> ACPI: LAPIC (acpi_id[0x27] lapic_id[0x57] enabled)
> Processor #87
> WARNING: NR_CPUS limit of 16 reached. Processor ignored.
> ACPI: LAPIC (acpi_id[0x30] lapic_id[0x60] enabled)
> Processor #96
> WARNING: NR_CPUS limit of 16 reached. Processor ignored.
> ACPI: LAPIC (acpi_id[0x31] lapic_id[0x61] enabled)
> Processor #97
> WARNING: NR_CPUS limit of 16 reached. Processor ignored.
> ACPI: LAPIC (acpi_id[0x32] lapic_id[0x66] enabled)
> Processor #102
> WARNING: NR_CPUS limit of 16 reached. Processor ignored.
> ACPI: LAPIC (acpi_id[0x33] lapic_id[0x67] enabled)
> Processor #103
> WARNING: NR_CPUS limit of 16 reached. Processor ignored.
> ACPI: LAPIC (acpi_id[0x34] lapic_id[0x70] enabled)
> Processor #112
> WARNING: NR_CPUS limit of 16 reached. Processor ignored.
> ACPI: LAPIC (acpi_id[0x35] lapic_id[0x71] enabled)
> Processor #113
> WARNING: NR_CPUS limit of 16 reached. Processor ignored.
> ACPI: LAPIC (acpi_id[0x36] lapic_id[0x76] enabled)
> Processor #118
> WARNING: NR_CPUS limit of 16 reached. Processor ignored.
> ACPI: LAPIC (acpi_id[0x37] lapic_id[0x77] enabled)
> Processor #119
> WARNING: NR_CPUS limit of 16 reached. Processor ignored.
> ACPI: LAPIC_NMI (acpi_id[0x00] dfl dfl lint[0x1])
> ACPI: LAPIC_NMI (acpi_id[0x01] dfl dfl lint[0x1])
> ACPI: LAPIC_NMI (acpi_id[0x02] dfl dfl lint[0x1])
> ACPI: LAPIC_NMI (acpi_id[0x03] dfl dfl lint[0x1])
> ACPI: LAPIC_NMI (acpi_id[0x04] dfl dfl lint[0x1])
> ACPI: LAPIC_NMI (acpi_id[0x05] dfl dfl lint[0x1])
> ACPI: LAPIC_NMI (acpi_id[0x06] dfl dfl lint[0x1])
> ACPI: LAPIC_NMI (acpi_id[0x07] dfl dfl lint[0x1])
> ACPI: LAPIC_NMI (acpi_id[0x10] dfl dfl lint[0x1])
> ACPI: LAPIC_NMI (acpi_id[0x11] dfl dfl lint[0x1])
> ACPI: LAPIC_NMI (acpi_id[0x12] dfl dfl lint[0x1])
> ACPI: LAPIC_NMI (acpi_id[0x13] dfl dfl lint[0x1])
> ACPI: LAPIC_NMI (acpi_id[0x14] dfl dfl lint[0x1])
> ACPI: LAPIC_NMI (acpi_id[0x15] dfl dfl lint[0x1])
> ACPI: LAPIC_NMI (acpi_id[0x16] dfl dfl lint[0x1])
> ACPI: LAPIC_NMI (acpi_id[0x17] dfl dfl lint[0x1])
> ACPI: LAPIC_NMI (acpi_id[0x20] dfl dfl lint[0x1])
> ACPI: LAPIC_NMI (acpi_id[0x21] dfl dfl lint[0x1])
> ACPI: LAPIC_NMI (acpi_id[0x22] dfl dfl lint[0x1])
> ACPI: LAPIC_NMI (acpi_id[0x23] dfl dfl lint[0x1])
> ACPI: LAPIC_NMI (acpi_id[0x24] dfl dfl lint[0x1])
> ACPI: LAPIC_NMI (acpi_id[0x25] dfl dfl lint[0x1])
> ACPI: LAPIC_NMI (acpi_id[0x26] dfl dfl lint[0x1])
> ACPI: LAPIC_NMI (acpi_id[0x27] dfl dfl lint[0x1])
> ACPI: LAPIC_NMI (acpi_id[0x30] dfl dfl lint[0x1])
> ACPI: LAPIC_NMI (acpi_id[0x31] dfl dfl lint[0x1])
> ACPI: LAPIC_NMI (acpi_id[0x32] dfl dfl lint[0x1])
> ACPI: LAPIC_NMI (acpi_id[0x33] dfl dfl lint[0x1])
> ACPI: LAPIC_NMI (acpi_id[0x34] dfl dfl lint[0x1])
> ACPI: LAPIC_NMI (acpi_id[0x35] dfl dfl lint[0x1])
> ACPI: LAPIC_NMI (acpi_id[0x36] dfl dfl lint[0x1])
> ACPI: LAPIC_NMI (acpi_id[0x37] dfl dfl lint[0x1])
> ACPI: IOAPIC (id[0x0f] address[0xfec00000] gsi_base[0])
> IOAPIC[0]: apic_id 15, address 0xfec00000, GSI 0-35
> ACPI: IOAPIC (id[0x0e] address[0xfec01000] gsi_base[36])
> IOAPIC[1]: apic_id 14, address 0xfec01000, GSI 36-71
> ACPI: IOAPIC (id[0x0d] address[0xfec02000] gsi_base[72])
> IOAPIC[2]: apic_id 13, address 0xfec02000, GSI 72-107
> ACPI: IOAPIC (id[0x0c] address[0xfec03000] gsi_base[108])
> IOAPIC[3]: apic_id 12, address 0xfec03000, GSI 108-143
> ACPI: IOAPIC (id[0x0b] address[0xfec04000] gsi_base[144])
> IOAPIC[4]: apic_id 11, address 0xfec04000, GSI 144-179
> ACPI: IOAPIC (id[0x0a] address[0xfec05000] gsi_base[180])
> IOAPIC[5]: apic_id 10, address 0xfec05000, GSI 180-215
> ACPI: IOAPIC (id[0x09] address[0xfec06000] gsi_base[216])
> IOAPIC[6]: apic_id 9, address 0xfec06000, GSI 216-251
> ACPI: IOAPIC (id[0x08] address[0xfec07000] gsi_base[252])
> IOAPIC[7]: apic_id 8, address 0xfec07000, GSI 252-287
> ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
> ACPI: INT_SRC_OVR (bus 0 bus_irq 8 global_irq 8 low edge)
> ACPI: INT_SRC_OVR (bus 0 bus_irq 14 global_irq 14 low edge)
> Setting APIC routing to clustered
> ACPI: HPET id: 0x10142201 base: 0xfde84000
> afinfo corrupted at arch/x86_64/kernel/setup.c:559
> afinfo corrupted at arch/x86_64/kernel/setup.c:562
> Using ACPI (MADT) for SMP configuration information
> afinfo corrupted at arch/x86_64/kernel/setup.c:569
> afinfo corrupted at arch/x86_64/kernel/setup.c:572
> afinfo corrupted at arch/x86_64/kernel/setup.c:579
> afinfo corrupted at arch/x86_64/kernel/setup.c:582
> Nosave address range: 000000000009a000 - 000000000009b000
> Nosave address range: 000000000009b000 - 00000000000a0000
> Nosave address range: 00000000000a0000 - 00000000000e0000
> Nosave address range: 00000000000e0000 - 0000000000100000
> Nosave address range: 00000000bff76000 - 00000000bff77000
> Nosave address range: 00000000bff77000 - 00000000bff98000
> Nosave address range: 00000000bff98000 - 00000000bff99000
> Nosave address range: 00000000bff99000 - 00000000c0000000
> Nosave address range: 00000000c0000000 - 00000000fec00000
> Nosave address range: 00000000fec00000 - 0000000100000000
> afinfo corrupted at arch/x86_64/kernel/setup.c:585
> afinfo corrupted at arch/x86_64/kernel/setup.c:588
> afinfo corrupted at arch/x86_64/kernel/setup.c:596
> Allocating PCI resources starting at c4000000 (gap: c0000000:3ec00000)
> afinfo corrupted at arch/x86_64/kernel/setup.c:599
> afinfo corrupted at init/main.c:512
> SMP: Allowing 16 CPUs, 0 hotplug CPUs
> PERCPU: Allocating 33920 bytes of per cpu data
> afinfo corrupted at init/main.c:527
> Built 1 zonelists.  Total pages: 12147064
> Kernel command line: root=/dev/sda1 vga=791  ip=9.47.67.239:9.47.67.50:9.47.67.1:255.255.255.0 resume=/dev/sdb1 showopts earlyprintk=serial,ttyS0,57600 console=tty0 console=ttyS0,57600 autobench_args: root=/dev/sda1 ABAT:1160100417
> afinfo corrupted at init/main.c:536
> Initializing CPU#0
> PID hash table entries: 4096 (order: 12, 32768 bytes)
> afinfo corrupted at init/main.c:545
> afinfo corrupted at init/main.c:548
> disabling early console
> Console: colour VGA+ 80x25
> Dentry cache hash table entries: 8388608 (order: 14, 67108864 bytes)
> Inode-cache hash table entries: 4194304 (order: 13, 33554432 bytes)
> afinfo corrupted at init/main.c:582
> Checking aperture...
> PCI-DMA: Using software bounce buffering for IO (SWIOTLB)
> Placing software IO TLB between 0x310c2000 - 0x350c2000
> Memory: 48422908k/50331648k available (2566k kernel code, 858868k reserved, 1345k data, 184k init)
> afinfo corrupted at init/main.c:584
> Calibrating delay using timer specific routine.. 5678.09 BogoMIPS (lpj=11356196)
> afinfo corrupted at init/main.c:593
> afinfo corrupted at init/main.c:603
> Mount-cache hash table entries: 256
> afinfo corrupted at init/main.c:610
> afinfo corrupted at init/main.c:618
> CPU: Trace cache: 12K uops, L1 D cache: 16K
> CPU: L2 cache: 1024K
> CPU: L3 cache: 4096K
> using mwait in idle threads.
> CPU: Physical Processor ID: 0
> CPU: Processor Core ID: 0
> CPU0: Thermal monitoring enabled (TM1)
> SMP alternatives: switching to UP code
> ACPI: Core revision 20060707
> ..MP-BIOS bug: 8254 timer not connected to IO-APIC
> Using local APIC timer interrupts.
> result 10425595
> Detected 10.425 MHz APIC timer.
> afinfo corrupted at init/main.c:749
> SMP alternatives: switching to SMP code
> Booting processor 1/16 APIC 0x1
> Initializing CPU#1
> Calibrating delay using timer specific routine.. 5671.84 BogoMIPS (lpj=11343696)
> CPU: Trace cache: 12K uops, L1 D cache: 16K
> CPU: L2 cache: 1024K
> CPU: L3 cache: 4096K
> CPU: Physical Processor ID: 0
> CPU: Processor Core ID: 0
> CPU1: Thermal monitoring enabled (TM1)
>                Intel(R) Xeon(TM) MP CPU 2.83GHz stepping 01
> CPU 1: Syncing TSC to CPU 0.
> CPU 1: synchronized TSC with CPU 0 (last diff -2 cycles, maxerr 799 cycles)
> SMP alternatives: switching to SMP code
> Booting processor 2/16 APIC 0x6
> Initializing CPU#2
> Calibrating delay using timer specific routine.. 5671.98 BogoMIPS (lpj=11343971)
> CPU: Trace cache: 12K uops, L1 D cache: 16K
> CPU: L2 cache: 1024K
> CPU: L3 cache: 4096K
> CPU: Physical Processor ID: 3
> CPU: Processor Core ID: 0
> CPU2: Thermal monitoring enabled (TM1)
>                Intel(R) Xeon(TM) MP CPU 2.83GHz stepping 01
> CPU 2: Syncing TSC to CPU 0.
> CPU 2: synchronized TSC with CPU 0 (last diff -184 cycles, maxerr 3349 cycles)
> SMP alternatives: switching to SMP code
> Booting processor 3/16 APIC 0x7
> Initializing CPU#3
> Calibrating delay using timer specific routine.. 5672.02 BogoMIPS (lpj=11344041)
> CPU: Trace cache: 12K uops, L1 D cache: 16K
> CPU: L2 cache: 1024K
> CPU: L3 cache: 4096K
> CPU: Physical Processor ID: 3
> CPU: Processor Core ID: 0
> CPU3: Thermal monitoring enabled (TM1)
>                Intel(R) Xeon(TM) MP CPU 2.83GHz stepping 01
> CPU 3: Syncing TSC to CPU 0.
> CPU 3: synchronized TSC with CPU 0 (last diff -100 cycles, maxerr 1989 cycles)
> SMP alternatives: switching to SMP code
> Booting processor 4/16 APIC 0x10
> Initializing CPU#4
> Calibrating delay using timer specific routine.. 5672.07 BogoMIPS (lpj=11344144)
> CPU: Trace cache: 12K uops, L1 D cache: 16K
> CPU: L2 cache: 1024K
> CPU: L3 cache: 4096K
> CPU: Physical Processor ID: 8
> CPU: Processor Core ID: 0
> CPU4: Thermal monitoring enabled (TM1)
>                Intel(R) Xeon(TM) MP CPU 2.83GHz stepping 01
> CPU 4: Syncing TSC to CPU 0.
> CPU 4: synchronized TSC with CPU 0 (last diff 43 cycles, maxerr 3247 cycles)
> SMP alternatives: switching to SMP code
> Booting processor 5/16 APIC 0x11
> Initializing CPU#5
> Calibrating delay using timer specific routine.. 5672.01 BogoMIPS (lpj=11344024)
> CPU: Trace cache: 12K uops, L1 D cache: 16K
> CPU: L2 cache: 1024K
> CPU: L3 cache: 4096K
> CPU: Physical Processor ID: 8
> CPU: Processor Core ID: 0
> CPU5: Thermal monitoring enabled (TM1)
>                Intel(R) Xeon(TM) MP CPU 2.83GHz stepping 01
> CPU 5: Syncing TSC to CPU 0.
> CPU 5: synchronized TSC with CPU 0 (last diff 21 cycles, maxerr 3349 cycles)
> SMP alternatives: switching to SMP code
> Booting processor 6/16 APIC 0x16
> Initializing CPU#6
> Calibrating delay using timer specific routine.. 5672.02 BogoMIPS (lpj=11344042)
> CPU: Trace cache: 12K uops, L1 D cache: 16K
> CPU: L2 cache: 1024K
> CPU: L3 cache: 4096K
> CPU: Physical Processor ID: 11
> CPU: Processor Core ID: 0
> CPU6: Thermal monitoring enabled (TM1)
>                Intel(R) Xeon(TM) MP CPU 2.83GHz stepping 01
> CPU 6: Syncing TSC to CPU 0.
> CPU 6: synchronized TSC with CPU 0 (last diff 257 cycles, maxerr 3383 cycles)
> SMP alternatives: switching to SMP code
> Booting processor 7/16 APIC 0x17
> Initializing CPU#7
> Calibrating delay using timer specific routine.. 5672.10 BogoMIPS (lpj=11344218)
> CPU: Trace cache: 12K uops, L1 D cache: 16K
> CPU: L2 cache: 1024K
> CPU: L3 cache: 4096K
> CPU: Physical Processor ID: 11
> CPU: Processor Core ID: 0
> CPU7: Thermal monitoring enabled (TM1)
>                Intel(R) Xeon(TM) MP CPU 2.83GHz stepping 01
> CPU 7: Syncing TSC to CPU 0.
> CPU 7: synchronized TSC with CPU 0 (last diff 233 cycles, maxerr 3357 cycles)
> SMP alternatives: switching to SMP code
> Booting processor 8/16 APIC 0x20
> Initializing CPU#8
> Calibrating delay using timer specific routine.. 5672.35 BogoMIPS (lpj=11344712)
> CPU: Trace cache: 12K uops, L1 D cache: 16K
> CPU: L2 cache: 1024K
> CPU: L3 cache: 4096K
> CPU: Physical Processor ID: 16
> CPU: Processor Core ID: 0
> CPU8: Thermal monitoring enabled (TM1)
>                Intel(R) Xeon(TM) MP CPU 2.83GHz stepping 01
> CPU 8: Syncing TSC to CPU 0.
> CPU 8: synchronized TSC with CPU 0 (last diff 140 cycles, maxerr 8509 cycles)
> SMP alternatives: switching to SMP code
> Booting processor 9/16 APIC 0x21
> Initializing CPU#9
> Calibrating delay using timer specific routine.. 5672.25 BogoMIPS (lpj=11344515)
> CPU: Trace cache: 12K uops, L1 D cache: 16K
> CPU: L2 cache: 1024K
> CPU: L3 cache: 4096K
> CPU: Physical Processor ID: 16
> CPU: Processor Core ID: 0
> CPU9: Thermal monitoring enabled (TM1)
>                Intel(R) Xeon(TM) MP CPU 2.83GHz stepping 01
> CPU 9: Syncing TSC to CPU 0.
> CPU 9: synchronized TSC with CPU 0 (last diff -100 cycles, maxerr 7556 cycles)
> SMP alternatives: switching to SMP code
> Booting processor 10/16 APIC 0x26
> Initializing CPU#10
> Calibrating delay using timer specific routine.. 5672.33 BogoMIPS (lpj=11344676)
> CPU: Trace cache: 12K uops, L1 D cache: 16K
> CPU: L2 cache: 1024K
> CPU: L3 cache: 4096K
> CPU: Physical Processor ID: 19
> CPU: Processor Core ID: 0
> CPU10: Thermal monitoring enabled (TM1)
>                Intel(R) Xeon(TM) MP CPU 2.83GHz stepping 01
> CPU 10: Syncing TSC to CPU 0.
> CPU 10: synchronized TSC with CPU 0 (last diff 405 cycles, maxerr 8126 cycles)
> SMP alternatives: switching to SMP code
> Booting processor 11/16 APIC 0x27
> Initializing CPU#11
> Calibrating delay using timer specific routine.. 5672.46 BogoMIPS (lpj=11344939)
> CPU: Trace cache: 12K uops, L1 D cache: 16K
> CPU: L2 cache: 1024K
> CPU: L3 cache: 4096K
> CPU: Physical Processor ID: 19
> CPU: Processor Core ID: 0
> CPU11: Thermal monitoring enabled (TM1)
>                Intel(R) Xeon(TM) MP CPU 2.83GHz stepping 01
> CPU 11: Syncing TSC to CPU 0.
> CPU 11: synchronized TSC with CPU 0 (last diff -145 cycles, maxerr 8568 cycles)
> SMP alternatives: switching to SMP code
> Booting processor 12/16 APIC 0x30
> Initializing CPU#12
> Calibrating delay using timer specific routine.. 5672.23 BogoMIPS (lpj=11344472)
> CPU: Trace cache: 12K uops, L1 D cache: 16K
> CPU: L2 cache: 1024K
> CPU: L3 cache: 4096K
> CPU: Physical Processor ID: 24
> CPU: Processor Core ID: 0
> CPU12: Thermal monitoring enabled (TM1)
>                Intel(R) Xeon(TM) MP CPU 2.83GHz stepping 01
> CPU 12: Syncing TSC to CPU 0.
> CPU 12: synchronized TSC with CPU 0 (last diff 419 cycles, maxerr 8602 cycles)
> SMP alternatives: switching to SMP code
> Booting processor 13/16 APIC 0x31
> Initializing CPU#13
> Calibrating delay using timer specific routine.. 5672.34 BogoMIPS (lpj=11344689)
> CPU: Trace cache: 12K uops, L1 D cache: 16K
> CPU: L2 cache: 1024K
> CPU: L3 cache: 4096K
> CPU: Physical Processor ID: 24
> CPU: Processor Core ID: 0
> CPU13: Thermal monitoring enabled (TM1)
>                Intel(R) Xeon(TM) MP CPU 2.83GHz stepping 01
> CPU 13: Syncing TSC to CPU 0.
> CPU 13: synchronized TSC with CPU 0 (last diff 242 cycles, maxerr 8636 cycles)
> SMP alternatives: switching to SMP code
> Booting processor 14/16 APIC 0x36
> Initializing CPU#14
> Calibrating delay using timer specific routine.. 5672.32 BogoMIPS (lpj=11344644)
> CPU: Trace cache: 12K uops, L1 D cache: 16K
> CPU: L2 cache: 1024K
> CPU: L3 cache: 4096K
> CPU: Physical Processor ID: 27
> CPU: Processor Core ID: 0
> CPU14: Thermal monitoring enabled (TM1)
>                Intel(R) Xeon(TM) MP CPU 2.83GHz stepping 01
> CPU 14: Syncing TSC to CPU 0.
> CPU 14: synchronized TSC with CPU 0 (last diff -272 cycles, maxerr 8109 cycles)
> SMP alternatives: switching to SMP code
> Booting processor 15/16 APIC 0x37
> Initializing CPU#15
> Calibrating delay using timer specific routine.. 5672.21 BogoMIPS (lpj=11344423)
> CPU: Trace cache: 12K uops, L1 D cache: 16K
> CPU: L2 cache: 1024K
> CPU: L3 cache: 4096K
> CPU: Physical Processor ID: 27
> CPU: Processor Core ID: 0
> CPU15: Thermal monitoring enabled (TM1)
>                Intel(R) Xeon(TM) MP CPU 2.83GHz stepping 01
> CPU 15: Syncing TSC to CPU 0.
> CPU 15: synchronized TSC with CPU 0 (last diff -21 cycles, maxerr 8560 cycles)
> Brought up 16 CPUs
> testing NMI watchdog ... OK.
> time.c: Using 333.333333 MHz WALL PIT GTOD PIT/HPET timer.
> time.c: Detected 2835.773 MHz processor.
> afinfo corrupted at init/main.c:755
> migration_cost=19,988
> afinfo corrupted at init/main.c:761
> afinfo corrupted at init/main.c:769
> Calling initcall 0xffffffff802166c0: init_smp_flush+0x0/0x60()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff80607a40: helper_init+0x0/0x40()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff80607dd0: pm_init+0x0/0x40()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff80607e50: ksysfs_init+0x0/0x40()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff8060a720: filelock_init+0x0/0x40()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff8060b230: init_script_binfmt+0x0/0x10()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff8060b240: init_elf_binfmt+0x0/0x10()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff80614690: sock_init+0x0/0x60()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff80614e30: netlink_proto_init+0x0/0x1a0()
> afinfo corrupted at init/main.c:659
> NET: Registered protocol family 16
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff8060c310: kobject_uevent_init+0x0/0x40()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff8060c4a0: pcibus_class_init+0x0/0x10()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff8060ca70: pci_driver_init+0x0/0x10()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff8060ef30: tty_class_init+0x0/0x30()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff8060fa20: vtconsole_class_init+0x0/0xc0()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff8060cbb0: acpi_pci_init+0x0/0x40()
> afinfo corrupted at init/main.c:659
> ACPI: bus type pci registered
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff8060d8ef: init_acpi_device_notify+0x0/0x4b()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff80613aa0: pci_access_init+0x0/0x30()
> afinfo corrupted at init/main.c:659
> PCI: Using configuration type 1
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff80605760: topology_init+0x0/0x40()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff80607770: param_sysfs_init+0x0/0x200()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff80249d00: pm_sysrq_init+0x0/0x20()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff8060aee0: init_bio+0x0/0x110()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff8060c1d0: genhd_device_init+0x0/0x60()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff8060d702: acpi_init+0x0/0x1ed()
> afinfo corrupted at init/main.c:659
> ACPI: Interpreter enabled
> ACPI: Using IOAPIC for interrupt routing
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff8060dbd5: acpi_ec_init+0x0/0x62()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff8060dfee: acpi_pci_root_init+0x0/0x28()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff8060e036: acpi_pci_link_init+0x0/0x48()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff8060e1bc: acpi_power_init+0x0/0x77()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff8060e233: acpi_system_init+0x0/0xc6()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff8060e2f9: acpi_event_init+0x0/0x3f()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff8060e338: acpi_scan_init+0x0/0x1ac()
> afinfo corrupted at init/main.c:659
> ACPI: PCI Root Bridge [VP00] (0000:00)
> PCI: Ignoring BAR0-3 of IDE controller 0000:00:0f.1
> ACPI: PCI Root Bridge [VP01] (0000:01)
> ACPI: PCI Root Bridge [VP02] (0000:02)
> ACPI: PCI Root Bridge [VP03] (0000:04)
> ACPI: PCI Root Bridge [VP04] (0000:06)
> ACPI: PCI Root Bridge [VP05] (0000:08)
> ACPI: PCI Root Bridge [VP06] (0000:0a)
> ACPI: PCI Root Bridge [VP07] (0000:0c)
> ACPI: PCI Root Bridge [VP10] (0000:0e)
> ACPI: PCI Root Bridge [VP11] (0000:0f)
> ACPI: PCI Root Bridge [VP12] (0000:10)
> ACPI: PCI Root Bridge [VP13] (0000:12)
> ACPI: PCI Root Bridge [VP14] (0000:14)
> ACPI: PCI Root Bridge [VP15] (0000:16)
> ACPI: PCI Root Bridge [VP16] (0000:18)
> ACPI: PCI Root Bridge [VP17] (0000:1a)
> ACPI: PCI Root Bridge [VP20] (0000:1c)
> ACPI: PCI Root Bridge [VP21] (0000:1d)
> ACPI: PCI Root Bridge [VP22] (0000:1e)
> ACPI: PCI Root Bridge [VP23] (0000:20)
> ACPI: PCI Root Bridge [VP24] (0000:22)
> ACPI: PCI Root Bridge [VP25] (0000:24)
> ACPI: PCI Root Bridge [VP26] (0000:26)
> ACPI: PCI Root Bridge [VP27] (0000:28)
> ACPI: PCI Root Bridge [VP30] (0000:2a)
> ACPI: PCI Root Bridge [VP31] (0000:2b)
> ACPI: PCI Root Bridge [VP32] (0000:2c)
> ACPI: PCI Root Bridge [VP33] (0000:2e)
> ACPI: PCI Root Bridge [VP34] (0000:30)
> ACPI: PCI Root Bridge [VP35] (0000:32)
> ACPI: PCI Root Bridge [VP36] (0000:34)
> ACPI: PCI Root Bridge [VP37] (0000:36)
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff8060e654: acpi_cm_sbs_init+0x0/0xc()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff8060e660: pnp_init+0x0/0x30()
> afinfo corrupted at init/main.c:659
> Linux Plug and Play Support v0.97 (c) Adam Belay
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff8060e8f0: pnpacpi_init+0x0/0x70()
> afinfo corrupted at init/main.c:659
> pnp: PnP ACPI init
> pnp: PnP ACPI: found 47 devices
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff8060f490: misc_init+0x0/0x90()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff80375670: cn_init+0x0/0xe0()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff806117f0: init_scsi+0x0/0x90()
> afinfo corrupted at init/main.c:659
> SCSI subsystem initialized
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff806124d0: serio_init+0x0/0xd0()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff806128f0: input_init+0x0/0x120()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff80612d00: rtc_init+0x0/0x50()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff80612d50: rtc_sysfs_init+0x0/0x10()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff80612d60: rtc_proc_init+0x0/0x10()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff80612d70: rtc_dev_init+0x0/0xb0()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff80613ad0: pci_acpi_init+0x0/0xb0()
> afinfo corrupted at init/main.c:659
> PCI: Using ACPI for IRQ routing
> PCI: If a device doesn't work, try "pci=routeirq".  If it helps, post a report
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff80613b80: pci_legacy_init+0x0/0x120()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff80614130: pcibios_irq_init+0x0/0x4f0()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff80614620: pcibios_init+0x0/0x70()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff80614750: proto_init+0x0/0x40()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff806148f0: net_dev_init+0x0/0x210()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff80614fd0: genl_init+0x0/0xb0()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff805fdfc0: late_hpet_init+0x0/0xb0()
> afinfo corrupted at init/main.c:659
> hpet0: at MMIO 0xfde84000, IRQs 2, 8, 0
> hpet0: 3 64-bit timers, 3707069 Hz
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff806000b0: pci_iommu_init+0x0/0x20()
> afinfo corrupted at init/main.c:659
> PCI-GART: No AMD northbridge found.
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff8060a6a0: init_pipe_fs+0x0/0x50()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff8060e524: acpi_motherboard_init+0x0/0x130()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff8060e790: pnp_system_init+0x0/0x10()
> afinfo corrupted at init/main.c:659
> pnp: 00:0a: ioport range 0x400-0x47f has been reserved
> pnp: 00:0a: ioport range 0x480-0x4ff could not be reserved
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff8060ec70: chr_dev_init+0x0/0x80()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff80610a40: firmware_class_init+0x0/0x80()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff806134b0: pcibios_assign_resources+0x0/0x90()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff806159e0: inet_init+0x0/0x400()
> afinfo corrupted at init/main.c:659
> NET: Registered protocol family 2
> IP route cache hash table entries: 524288 (order: 10, 4194304 bytes)
> TCP established hash table entries: 262144 (order: 10, 4194304 bytes)
> TCP bind hash table entries: 65536 (order: 8, 1048576 bytes)
> TCP: Hash tables configured (established 262144 bind 65536)
> TCP reno registered
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff8020db10: time_init_device+0x0/0x30()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff805fe9f0: i8259A_init_sysfs+0x0/0x30()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff805fe9c0: init_timer_sysfs+0x0/0x30()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff805ff010: vsyscall_init+0x0/0xb0()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff805ff2a0: sbf_init+0x0/0xe0()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff80600080: i8237A_init_sysfs+0x0/0x30()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff80600500: periodic_mcheck_init+0x0/0x30()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff80600530: mce_init_device+0x0/0x80()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff80600670: thermal_throttle_init_device+0x0/0x70()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff806006e0: threshold_init_device+0x0/0x50()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff80601ee0: init_lapic_sysfs+0x0/0x40()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff80602a80: ioapic_init_sysfs+0x0/0xf0()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff8021d1f0: cache_sysfs_init+0x0/0x60()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff80605870: x8664_sysctl_init+0x0/0x20()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff80606d30: create_proc_profile+0x0/0x280()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff80607170: ioresources_init+0x0/0x50()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff806072e0: timekeeping_init_device+0x0/0x30()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff80607400: uid_cache_init+0x0/0x90()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff80607970: init_posix_timers+0x0/0xd0()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff80607a80: init_posix_cpu_timers+0x0/0xf0()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff80607ba0: latency_init+0x0/0x30()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff80607c90: init_clocksource_sysfs+0x0/0x60()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff80607cf0: init_jiffies_clocksource+0x0/0x10()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff80607d00: init+0x0/0x70()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff80607d70: proc_dma_init+0x0/0x30()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff80245840: percpu_modinit+0x0/0x80()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff80607da0: kallsyms_init+0x0/0x30()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff80607e10: ikconfig_init+0x0/0x40()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff80608f60: init_per_zone_pages_min+0x0/0x60()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff80609ed0: pdflush_init+0x0/0x20()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff80609f20: kswapd_init+0x0/0x30()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff80609f50: setup_vmstat+0x0/0x20()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff80609fc0: procswaps_init+0x0/0x30()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff8060a030: hugetlb_init+0x0/0x70()
> afinfo corrupted at init/main.c:659
> Total HugeTLB memory allocated, 0
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff8060a0a0: init_tmpfs+0x0/0xe0()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff8060a180: cpucache_init+0x0/0x40()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff8060a6f0: fasync_init+0x0/0x30()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff8060ae00: aio_setup+0x0/0x70()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff8060b080: inotify_setup+0x0/0x10()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff8060b090: inotify_user_setup+0x0/0xc0()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff8060b150: eventpoll_init+0x0/0xe0()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff8060b250: init_mbcache+0x0/0x30()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff8060b280: dnotify_init+0x0/0x30()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff8060b740: init_devpts_fs+0x0/0x40()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff8060b780: init_reiserfs_fs+0x0/0x80()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff8060b800: init_ext3_fs+0x0/0x70()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff8060b930: journal_init+0x0/0xe0()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff8060ba10: init_ext2_fs+0x0/0x70()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff8060bad0: init_ramfs_fs+0x0/0x10()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff8060bae0: init_hugetlbfs_fs+0x0/0x80()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff8060bba0: init_fat_fs+0x0/0x50()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff8060bbf0: init_vfat_fs+0x0/0x10()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff8060bc00: init_nls_cp437+0x0/0x10()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff8060bc10: init_nls_iso8859_1+0x0/0x10()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff8060bc20: init_autofs_fs+0x0/0x10()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff8060bc30: init_autofs4_fs+0x0/0x10()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> initcall at 0xffffffff8060bc30: init_autofs4_fs+0x0/0x10(): returned with error code -16
> Calling initcall 0xffffffff8060bc40: ipc_init+0x0/0x20()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff8060bf10: init_mqueue_fs+0x0/0xe0()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff8060bff0: crypto_algapi_init+0x0/0x10()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff8060c030: init+0x0/0x10()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff8060c040: init+0x0/0x10()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff8060c230: noop_init+0x0/0x10()
> afinfo corrupted at init/main.c:659
> io scheduler noop registered
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff8060c240: as_init+0x0/0x10()
> afinfo corrupted at init/main.c:659
> io scheduler anticipatory registered (default)
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff8060c250: deadline_init+0x0/0x10()
> afinfo corrupted at init/main.c:659
> io scheduler deadline registered
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff8060c260: cfq_init+0x0/0xb0()
> afinfo corrupted at init/main.c:659
> io scheduler cfq registered
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff8032c1d0: pci_init+0x0/0x30()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff8060ca80: pci_sysfs_init+0x0/0x40()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff8060cac0: pci_proc_init+0x0/0x70()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff8060d93a: acpi_ac_init+0x0/0x45()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff8060d97f: acpi_battery_init+0x0/0x45()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff8060df90: acpi_video_init+0x0/0x5e()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff8060e07e: irqrouter_init_sysfs+0x0/0x38()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff8060ed10: rand_initialize+0x0/0x30()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff8060ed40: tty_init+0x0/0x1f0()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff8060efa0: pty_init+0x0/0x4f0()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff8060fae0: hpet_init+0x0/0x70()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff8060fb50: agp_init+0x0/0x30()
> afinfo corrupted at init/main.c:659
> Linux agpgart interface v0.101 (c) Dave Jones
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff8060fcb0: cn_proc_init+0x0/0x40()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff806100f0: serial8250_init+0x0/0x150()
> afinfo corrupted at init/main.c:659
> Serial: 8250/16550 driver $Revision: 1.90 $ 4 ports, IRQ sharing disabled
> serial8250: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
> serial8250: ttyS1 at I/O 0x2f8 (irq = 3) is a 16550A
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff80610320: serial8250_pnp_init+0x0/0x10()
> afinfo corrupted at init/main.c:659
> 00:03: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
> 00:04: ttyS1 at I/O 0x2f8 (irq = 3) is a 16550A
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff80610330: serial8250_pci_init+0x0/0x10()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff80384c90: topology_sysfs_init+0x0/0x50()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff80610ac0: e1000_init_module+0x0/0x50()
> afinfo corrupted at init/main.c:659
> Intel(R) PRO/1000 Network Driver - version 7.2.9-k2
> Copyright (c) 1999-2006 Intel Corporation.
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff80610b10: tg3_init+0x0/0x10()
> afinfo corrupted at init/main.c:659
> tg3.c:v3.66 (September 23, 2006)
> ACPI: PCI Interrupt 0000:01:01.0[A] -> GSI 24 (level, low) -> IRQ 24
> eth0: Tigon3 [partno(BCM95704A6) rev 2100 PHY(5704)] (PCIX:66MHz:64-bit) 10/100/1000BaseT Ethernet 00:0d:60:98:63:54
> eth0: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[1] Split[0] WireSpeed[1] TSOcap[0]
> eth0: dma_rwctrl[769f0000] dma_mask[64-bit]
> ACPI: PCI Interrupt 0000:01:01.1[B] -> GSI 28 (level, low) -> IRQ 28
> eth1: Tigon3 [partno(BCM95704A6) rev 2100 PHY(5704)] (PCIX:66MHz:64-bit) 10/100/1000BaseT Ethernet 00:0d:60:98:63:55
> eth1: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[0] Split[0] WireSpeed[1] TSOcap[1]
> eth1: dma_rwctrl[769f0000] dma_mask[64-bit]
> ACPI: PCI Interrupt 0000:0f:01.0[A] -> GSI 96 (level, low) -> IRQ 96
> eth2: Tigon3 [partno(BCM95704A6) rev 2100 PHY(5704)] (PCIX:66MHz:64-bit) 10/100/1000BaseT Ethernet 00:14:5e:1c:45:0c
> eth2: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[1] Split[0] WireSpeed[1] TSOcap[0]
> eth2: dma_rwctrl[769f0000] dma_mask[64-bit]
> ACPI: PCI Interrupt 0000:0f:01.1[B] -> GSI 100 (level, low) -> IRQ 100
> eth3: Tigon3 [partno(BCM95704A6) rev 2100 PHY(5704)] (PCIX:66MHz:64-bit) 10/100/1000BaseT Ethernet 00:14:5e:1c:45:0d
> eth3: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[0] Split[0] WireSpeed[1] TSOcap[1]
> eth3: dma_rwctrl[769f0000] dma_mask[64-bit]
> ACPI: PCI Interrupt 0000:1d:01.0[A] -> GSI 168 (level, low) -> IRQ 168
> eth4: Tigon3 [partno(BCM95704A6) rev 2100 PHY(5704)] (PCIX:66MHz:64-bit) 10/100/1000BaseT Ethernet 00:14:5e:1c:45:6c
> eth4: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[1] Split[0] WireSpeed[1] TSOcap[0]
> eth4: dma_rwctrl[769f0000] dma_mask[64-bit]
> ACPI: PCI Interrupt 0000:1d:01.1[B] -> GSI 172 (level, low) -> IRQ 172
> eth5: Tigon3 [partno(BCM95704A6) rev 2100 PHY(5704)] (PCIX:66MHz:64-bit) 10/100/1000BaseT Ethernet 00:14:5e:1c:45:6d
> eth5: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[0] Split[0] WireSpeed[1] TSOcap[1]
> eth5: dma_rwctrl[769f0000] dma_mask[64-bit]
> ACPI: PCI Interrupt 0000:2b:01.0[A] -> GSI 240 (level, low) -> IRQ 240
> eth6: Tigon3 [partno(BCM95704A6) rev 2100 PHY(5704)] (PCIX:66MHz:64-bit) 10/100/1000BaseT Ethernet 00:14:5e:1c:43:82
> eth6: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[1] Split[0] WireSpeed[1] TSOcap[0]
> eth6: dma_rwctrl[769f0000] dma_mask[64-bit]
> ACPI: PCI Interrupt 0000:2b:01.1[B] -> GSI 244 (level, low) -> IRQ 244
> eth7: Tigon3 [partno(BCM95704A6) rev 2100 PHY(5704)] (PCIX:66MHz:64-bit) 10/100/1000BaseT Ethernet 00:14:5e:1c:43:83
> eth7: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[0] Split[0] WireSpeed[1] TSOcap[1]
> eth7: dma_rwctrl[769f0000] dma_mask[64-bit]
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff80610ba0: net_olddevs_init+0x0/0xc0()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff803a8630: init_netconsole+0x0/0x80()
> afinfo corrupted at init/main.c:659
> netconsole: not configured, aborting
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff803a8710: cmd64x_ide_init+0x0/0x10()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff80610c70: piix_ide_init+0x0/0xd0()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff803aa810: svwks_ide_init+0x0/0x10()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff803ab480: generic_ide_init+0x0/0x10()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff80610db0: ide_init+0x0/0x90()
> afinfo corrupted at init/main.c:659
> Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2
> ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
> SvrWks CSB6: IDE controller at PCI slot 0000:00:0f.1
> SvrWks CSB6: chipset revision 160
> SvrWks CSB6: not 100% native mode: will probe irqs later
>     ide0: BM-DMA at 0x0700-0x0707, BIOS settings: hda:DMA, hdb:DMA
> SvrWks CSB6: simplex device: DMA disabled
> ide1: SvrWks CSB6 Bus-Master DMA disabled (BIOS)
> hda: MATSHITADVD-ROM SR-8178, ATAPI CD/DVD-ROM drive
> ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff80611780: ide_generic_init+0x0/0x20()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff806117a0: idedisk_init+0x0/0x10()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff806117b0: ide_cdrom_init+0x0/0x10()
> afinfo corrupted at init/main.c:659
> hda: ATAPI 24X DVD-ROM drive, 256kB Cache, UDMA(66)
> Uniform CD-ROM driver Revision: 3.20
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff806117c0: idefloppy_init+0x0/0x30()
> afinfo corrupted at init/main.c:659
> ide-floppy driver 0.99.newide
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff80611a90: raid_init+0x0/0x10()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff80611aa0: spi_transport_init+0x0/0x40()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff80611ae0: fc_transport_init+0x0/0x50()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff80611b30: iscsi_transport_init+0x0/0x120()
> afinfo corrupted at init/main.c:659
> Loading iSCSI transport class v2.0-685.afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff80611c50: sas_transport_init+0x0/0xc0()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff80611d10: iscsi_tcp_init+0x0/0x50()
> afinfo corrupted at init/main.c:659
> iscsi: registered transport (tcp)
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff80611d60: aac_init+0x0/0x70()
> afinfo corrupted at init/main.c:659
> Adaptec aacraid driver (1.1-5[2409]-mh2)
> ACPI: PCI Interrupt 0000:01:02.0[A] -> GSI 25 (level, low) -> IRQ 25
> AAC0: kernel 5.0-2[8264]
> AAC0: monitor 5.0-2[8264]
> AAC0: bios 5.0-2[8264]
> AAC0: serial 162348
> AAC0: 64bit support enabled.
> AAC0: 64 Bit DAC enabled
> scsi0 : ServeRAID
> scsi 0:0:0:0: Direct-Access     IBM      Drive 1          V1.0 PQ: 0 ANSI: 2
> scsi 0:0:1:0: Direct-Access     IBM      Drive 2          V1.0 PQ: 0 ANSI: 2
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff80611dd0: qla1280_init+0x0/0x10()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff80611fa0: sym2_init+0x0/0x110()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff806120b0: init_sd+0x0/0x60()
> afinfo corrupted at init/main.c:659
> SCSI device sda: 143132672 512-byte hdwr sectors (73284 MB)
> sda: assuming Write Enabled
> sda: assuming drive cache: write through
> SCSI device sda: 143132672 512-byte hdwr sectors (73284 MB)
> sda: assuming Write Enabled
> sda: assuming drive cache: write through
>  sda: sda1 sda2 sda3
> sd 0:0:0:0: Attached scsi removable disk sda
> SCSI device sdb: 143132672 512-byte hdwr sectors (73284 MB)
> sdb: assuming Write Enabled
> sdb: assuming drive cache: write through
> SCSI device sdb: 143132672 512-byte hdwr sectors (73284 MB)
> sdb: assuming Write Enabled
> sdb: assuming drive cache: write through
>  sdb: sdb1 sdb2 sdb3
> sd 0:0:1:0: Attached scsi removable disk sdb
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff80612110: fusion_init+0x0/0x100()
> afinfo corrupted at init/main.c:659
> Fusion MPT base driver 3.04.01
> Copyright (c) 1999-2005 LSI Logic Corporation
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff80612210: mptspi_init+0x0/0xc0()
> afinfo corrupted at init/main.c:659
> Fusion MPT SPI Host driver 3.04.01
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff806122d0: mptfc_init+0x0/0xf0()
> afinfo corrupted at init/main.c:659
> Fusion MPT FC Host driver 3.04.01
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff806123c0: mptctl_init+0x0/0x100()
> afinfo corrupted at init/main.c:659
> Fusion MPT misc device (ioctl) driver 3.04.01
> mptctl: Registered with Fusion MPT base driver
> mptctl: /dev/mptctl @ (major,minor=10,220)
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff806124c0: cdrom_init+0x0/0x10()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff806125a0: i8042_init+0x0/0x350()
> afinfo corrupted at init/main.c:659
> PNP: PS/2 Controller [PNP0303:PS2K,PNP0f13:PS2M] at 0x60,0x64 irq 1,12
> serio: i8042 KBD port at 0x60,0x64 irq 1
> serio: i8042 AUX port at 0x60,0x64 irq 12
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff80612a10: mousedev_init+0x0/0x100()
> afinfo corrupted at init/main.c:659
> mice: PS/2 mouse device common for all mice
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff80612b10: atkbd_init+0x0/0x20()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff80612e20: hwmon_init+0x0/0x40()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff80614c60: flow_cache_init+0x0/0x1d0()
> afinfo corrupted at init/main.c:659
> input: AT Translated Set 2 keyboard as /class/input/input0
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff806160f0: init_syncookies+0x0/0x20()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff80616110: xfrm4_beet_init+0x0/0x20()
> afinfo corrupted at init/main.c:659
> Unable to handle kernel NULL pointer dereference at 0000000000000827 RIP:
>  [<ffffffff80470666>] xfrm_register_mode+0x36/0x60
> PGD 0
> Oops: 0000 [1] SMP
> CPU 0
> Modules linked in:
> Pid: 1, comm: swapper Not tainted 2.6.18-git22 #2
> RIP: 0010:[<ffffffff80470666>]  [<ffffffff80470666>] xfrm_register_mode+0x36/0x60
> RSP: 0000:ffff810bffcbded0  EFLAGS: 00010286
> RAX: 000000000000081f RBX: ffffffff805588a0 RCX: 0000000000100000
> RDX: ffffffffffffffff RSI: 0000000000000002 RDI: ffffffff80559550
> RBP: 00000000ffffffef R08: 0000000000000002 R09: fffffffffffffffd
> R10: 0000000000000002 R11: 0000000000000000 R12: 0000000000000000
> R13: ffff810bffcbdef0 R14: 0000000000000000 R15: 0000000000000000
> FS:  0000000000000000(0000) GS:ffffffff805d2000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
> CR2: 0000000000000827 CR3: 0000000000201000 CR4: 00000000000006e0
> Process swapper (pid: 1, threadinfo ffff810bffcbc000, task ffff810bffcbb4e0)
> Stack:  0000000000000000 0000000000000000 ffffffff8061fee8 ffffffff802071d6
>  6f6320726f727265 000036312d206564 0000000000000000 0000000000000000
>  0000000000000000 0000000000000000 0000000000000000 0000000000090000
> Call Trace:
>  [<ffffffff802071d6>] init+0x1b6/0x3b0
>  [<ffffffff8020aa28>] child_rip+0xa/0x12
>  [<ffffffff80339542>] acpi_ds_init_one_object+0x0/0x82
>  [<ffffffff80207020>] init+0x0/0x3b0
>  [<ffffffff8020aa1e>] child_rip+0x0/0x12
> 
> 
> Code: 48 83 78 08 00 75 06 48 89 58 08 31 ed 48 89 d7 e8 e5 fe ff
> RIP  [<ffffffff80470666>] xfrm_register_mode+0x36/0x60
>  RSP <ffff810bffcbded0>
> CR2: 0000000000000827
>  <0>Kernel panic - not syncing: Aiee, killing interrupt handler!
> 
> 
> -- 
> 
> Steve Fox
> IBM Linux Technology Center
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/

-- 
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab

^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: 2.6.18-mm2 boot failure on x86-64
  2006-10-06 14:33                                         ` Mel Gorman
@ 2006-10-06 15:36                                           ` Vivek Goyal
  2006-10-06 17:11                                             ` Mel Gorman
  0 siblings, 1 reply; 140+ messages in thread
From: Vivek Goyal @ 2006-10-06 15:36 UTC (permalink / raw)
  To: Mel Gorman
  Cc: Steve Fox, Andi Kleen, Badari Pulavarty, Martin Bligh,
	Andrew Morton, lkml, netdev, kmannth, Andy Whitcroft

On Fri, Oct 06, 2006 at 03:33:12PM +0100, Mel Gorman wrote:
> > Linux version 2.6.18-git22 (root@elm3b239) (gcc version 4.1.0 (SUSE Linux)) #2 SMP Thu Oct 5 19:05:36 PDT 2006
> > Command line: root=/dev/sda1 vga=791  ip=9.47.67.239:9.47.67.50:9.47.67.1:255.255.255.0 resume=/dev/sdb1 showopts earlyprintk=serial,ttyS0,57600 console=tty0 console=ttyS0,57600 autobench_args: root=/dev/sda1 ABAT:1160100417
> > BIOS-provided physical RAM map:
> >  BIOS-e820: 0000000000000000 - 000000000009ac00 (usable)
> >  BIOS-e820: 000000000009ac00 - 00000000000a0000 (reserved)
> >  BIOS-e820: 00000000000e0000 - 0000000000100000 (reserved)
> >  BIOS-e820: 0000000000100000 - 00000000bff764c0 (usable)
> >  BIOS-e820: 00000000bff764c0 - 00000000bff98880 (ACPI data)
> >  BIOS-e820: 00000000bff98880 - 00000000c0000000 (reserved)
> >  BIOS-e820: 00000000fec00000 - 0000000100000000 (reserved)
> >  BIOS-e820: 0000000100000000 - 0000000c00000000 (usable)
> 
> I continued what Steve was doing this morning to see could this be
> pinned down. After placing 'CHECK;' in a few places as suggested by
> Andi's check, the problem code was identified as that following in
> mm/bootmem.c#init_bootmem_core()
> 
>         mapsize = get_mapsize(bdata);
>         memset(bdata->node_bootmem_map, 0xff, mapsize);
> 
> That explains the value in the array at least. A few more printfs around
> this point printed out the following in the boot log
> 
> init_bootmem_core(0, 1909, 0, 12582912)
> init_bootmem_core: Calling memset(0xFFFF810000775000, 1572864)
> AAGH: afinfo corrupted at mm/bootmem.c:121
> 
> where;
> 
> 1909 == mapstart
> 0 == start
> 12582912 == end
> 1572864 == mapsize
> 
> mapstart, start and end being the parameters being passed to
> init_bootmem_core(). This means we are calling memset for the physical
> range 0x775000 -> 0x8F5000 which is in a usable range according to the
> BIOS-e820 map it appears.
> 

Hi Mel,

Where is bss placed in physical memory? I guess __bss_start and __bss_stop
from System.map will tell us. That will confirm whether the memset step above
is stomping over bss. Then we just have to find where we allocated the wrong
physical memory area for the bootmem allocator map.
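
The physical range quoted above can be cross-checked from the logged
parameters. This is only a minimal sketch, assuming 4 KiB pages (a PAGE_SHIFT
of 12) and one bitmap bit per page frame, rounded up to a multiple of
sizeof(long). The helper names are made up for illustration and are not the
kernel's:

```c
#include <assert.h>

#define PAGE_SHIFT 12UL	/* assumption: 4 KiB pages */

/* Bytes needed for a bitmap with one bit per page frame in [start, end). */
static unsigned long bootmap_bytes(unsigned long start_pfn, unsigned long end_pfn)
{
	unsigned long bytes = (end_pfn - start_pfn + 7) / 8;
	unsigned long align = sizeof(long);

	return (bytes + align - 1) & ~(align - 1);
}

/* Physical address of the first byte of a page frame. */
static unsigned long pfn_to_phys(unsigned long pfn)
{
	return pfn << PAGE_SHIFT;
}

/* Re-derive the numbers from the boot log quoted in this thread. */
static void check_logged_bootmap(void)
{
	unsigned long map_start = pfn_to_phys(1909);		/* mapstart */
	unsigned long map_size = bootmap_bytes(0, 12582912);	/* start, end */

	assert(map_start == 0x775000UL);
	assert(map_size == 1572864UL);				/* 0x180000 */
	assert(map_start + map_size == 0x8F5000UL);
}
```

Calling check_logged_bootmap() from a test program passes silently if the
logged mapstart and mapsize are consistent with the quoted physical range.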

Thanks
Vivek


^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: 2.6.18-mm2 boot failure on x86-64
  2006-10-06 15:36                                           ` Vivek Goyal
@ 2006-10-06 17:11                                             ` Mel Gorman
  2006-10-06 17:34                                               ` Vivek Goyal
                                                                 ` (2 more replies)
  0 siblings, 3 replies; 140+ messages in thread
From: Mel Gorman @ 2006-10-06 17:11 UTC (permalink / raw)
  To: Vivek Goyal
  Cc: Steve Fox, Andi Kleen, Badari Pulavarty, Martin Bligh,
	Andrew Morton, lkml, netdev, kmannth, Andy Whitcroft

On (06/10/06 11:36), Vivek Goyal didst pronounce:
> On Fri, Oct 06, 2006 at 03:33:12PM +0100, Mel Gorman wrote:
> > > Linux version 2.6.18-git22 (root@elm3b239) (gcc version 4.1.0 (SUSE Linux)) #2 SMP Thu Oct 5 19:05:36 PDT 2006
> > > Command line: root=/dev/sda1 vga=791  ip=9.47.67.239:9.47.67.50:9.47.67.1:255.255.255.0 resume=/dev/sdb1 showopts earlyprintk=serial,ttyS0,57600 console=tty0 console=ttyS0,57600 autobench_args: root=/dev/sda1 ABAT:1160100417
> > > BIOS-provided physical RAM map:
> > >  BIOS-e820: 0000000000000000 - 000000000009ac00 (usable)
> > >  BIOS-e820: 000000000009ac00 - 00000000000a0000 (reserved)
> > >  BIOS-e820: 00000000000e0000 - 0000000000100000 (reserved)
> > >  BIOS-e820: 0000000000100000 - 00000000bff764c0 (usable)
> > >  BIOS-e820: 00000000bff764c0 - 00000000bff98880 (ACPI data)
> > >  BIOS-e820: 00000000bff98880 - 00000000c0000000 (reserved)
> > >  BIOS-e820: 00000000fec00000 - 0000000100000000 (reserved)
> > >  BIOS-e820: 0000000100000000 - 0000000c00000000 (usable)
> > 
> > I continued what Steve was doing this morning to see could this be
> > pinned down. After placing 'CHECK;' in a few places as suggested by
> > Andi's check, the problem code was identified as that following in
> > mm/bootmem.c#init_bootmem_core()
> > 
> >         mapsize = get_mapsize(bdata);
> >         memset(bdata->node_bootmem_map, 0xff, mapsize);
> > 
> > That explains the value in the array at least. A few more printfs around
> > this point printed out the following in the boot log
> > 
> > init_bootmem_core(0, 1909, 0, 12582912)
> > init_bootmem_core: Calling memset(0xFFFF810000775000, 1572864)
> > AAGH: afinfo corrupted at mm/bootmem.c:121
> > 
> > where;
> > 
> > 1909 == mapstart
> > 0 == start
> > 12582912 == end
> > 1572864 == mapsize
> > 
> > mapstart, start and end being the parameters being passed to
> > init_bootmem_core(). This means we are calling memset for the physical
> > range 0x775000 -> 0x8F5000 which is in a usable range according to the
> > BIOS-e820 map it appears.
> > 
> 
> Hi Mel,
> 

Hi.

> Where is bss placed in physical memory? I guess bss_start and bss_stop
> from System.map will tell us. That will confirm that above memset step is
> stomping over bss. Then we have to just find that somewhere probably
> we allocated wrong physical memory area for bootmem allocator map.
> 

BSS is at 0x643000 -> 0x777BC4
init_bootmem wipes from 0x777000 -> 0x8F7000

So the BSS bytes from 0x777000 -> 0x777BC4 (which looks very suspiciously
like a page alignment of addr & PAGE_MASK) get set to 0xFF. One possible
fix is below. It adds a check in bad_addr() to see if the BSS section is
about to be used for bootmap. It Seems To Work For Me (tm) and illustrates
the source of the problem even if it's not the 100% correct fix.

diff -rup -X /usr/src/patchset-0.6/bin//dontdiff linux-2.6.18-git22-clean/arch/x86_64/kernel/e820.c linux-2.6.18-git22-bss_relocate_fix/arch/x86_64/kernel/e820.c
--- linux-2.6.18-git22-clean/arch/x86_64/kernel/e820.c	2006-10-05 20:42:07.000000000 +0100
+++ linux-2.6.18-git22-bss_relocate_fix/arch/x86_64/kernel/e820.c	2006-10-06 17:39:51.000000000 +0100
@@ -51,6 +51,7 @@ extern struct resource code_resource, da
 static inline int bad_addr(unsigned long *addrp, unsigned long size)
 { 
 	unsigned long addr = *addrp, last = addr + size; 
+	unsigned long bss_start, bss_end;
 
 	/* various gunk below that needed for SMP startup */
 	if (addr < 0x8000) { 
@@ -77,6 +78,14 @@ static inline int bad_addr(unsigned long
 		*addrp = __pa_symbol(&_end);
 		return 1;
 	}
+	
+	/* bss section */
+	bss_start = __pa_symbol(&__bss_start);
+	bss_end = PAGE_ALIGN(__pa_symbol(&__bss_stop));
+	if (addr >= bss_start && addr < bss_end) {
+		*addrp = bss_end;
+		return 1;
+	}
 
 	if (last >= ebda_addr && addr < ebda_addr + ebda_size) {
 		*addrp = ebda_addr + ebda_size;
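
To see what the added check does with the addresses reported in this mail,
here is a minimal userspace sketch. BSS_START and BSS_STOP are hard-coded from
the numbers above in place of __pa_symbol(&__bss_start) and
__pa_symbol(&__bss_stop), PAGE_SIZE is assumed to be 4096, and skip_bss() is a
made-up name that mirrors only the added hunk, not the whole of bad_addr():

```c
#include <assert.h>

#define PAGE_SIZE	4096UL	/* assumption: 4 KiB pages */
#define PAGE_ALIGN(x)	(((x) + PAGE_SIZE - 1) & ~(PAGE_SIZE - 1))

#define BSS_START	0x643000UL	/* from the numbers in this mail */
#define BSS_STOP	0x777BC4UL

/* Returns 1 and moves *addrp past the BSS if the address falls inside it. */
static int skip_bss(unsigned long *addrp)
{
	unsigned long addr = *addrp;
	unsigned long bss_end = PAGE_ALIGN(BSS_STOP);

	if (addr >= BSS_START && addr < bss_end) {
		*addrp = bss_end;
		return 1;
	}
	return 0;
}

/* Walk the reported candidate address through the check. */
static void check_skip_bss(void)
{
	unsigned long addr = 0x777000UL;

	assert(skip_bss(&addr) == 1);	/* inside the BSS: rejected */
	assert(addr == 0x778000UL);	/* moved to the next page boundary */
	assert(skip_bss(&addr) == 0);	/* now clear of the BSS */
}
```

With the reported values, the candidate address 0x777000 is pushed to
0x778000, the first page boundary after the end of the BSS.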

-- 
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab

^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: 2.6.18-mm2 boot failure on x86-64
  2006-10-06 17:11                                             ` Mel Gorman
@ 2006-10-06 17:34                                               ` Vivek Goyal
  2006-10-06 17:59                                               ` Vivek Goyal
  2006-10-06 18:03                                               ` Steve Fox
  2 siblings, 0 replies; 140+ messages in thread
From: Vivek Goyal @ 2006-10-06 17:34 UTC (permalink / raw)
  To: Mel Gorman
  Cc: Steve Fox, Andi Kleen, Badari Pulavarty, Martin Bligh,
	Andrew Morton, lkml, netdev, kmannth, Andy Whitcroft

On Fri, Oct 06, 2006 at 06:11:05PM +0100, Mel Gorman wrote:
> On (06/10/06 11:36), Vivek Goyal didst pronounce:
> > On Fri, Oct 06, 2006 at 03:33:12PM +0100, Mel Gorman wrote:
> > > > Linux version 2.6.18-git22 (root@elm3b239) (gcc version 4.1.0 (SUSE Linux)) #2 SMP Thu Oct 5 19:05:36 PDT 2006
> > > > Command line: root=/dev/sda1 vga=791  ip=9.47.67.239:9.47.67.50:9.47.67.1:255.255.255.0 resume=/dev/sdb1 showopts earlyprintk=serial,ttyS0,57600 console=tty0 console=ttyS0,57600 autobench_args: root=/dev/sda1 ABAT:1160100417
> > > > BIOS-provided physical RAM map:
> > > >  BIOS-e820: 0000000000000000 - 000000000009ac00 (usable)
> > > >  BIOS-e820: 000000000009ac00 - 00000000000a0000 (reserved)
> > > >  BIOS-e820: 00000000000e0000 - 0000000000100000 (reserved)
> > > >  BIOS-e820: 0000000000100000 - 00000000bff764c0 (usable)
> > > >  BIOS-e820: 00000000bff764c0 - 00000000bff98880 (ACPI data)
> > > >  BIOS-e820: 00000000bff98880 - 00000000c0000000 (reserved)
> > > >  BIOS-e820: 00000000fec00000 - 0000000100000000 (reserved)
> > > >  BIOS-e820: 0000000100000000 - 0000000c00000000 (usable)
> > > 
> > > I continued what Steve was doing this morning to see could this be
> > > pinned down. After placing 'CHECK;' in a few places as suggested by
> > > Andi's check, the problem code was identified as that following in
> > > mm/bootmem.c#init_bootmem_core()
> > > 
> > >         mapsize = get_mapsize(bdata);
> > >         memset(bdata->node_bootmem_map, 0xff, mapsize);
> > > 
> > > That explains the value in the array at least. A few more printfs around
> > > this point printed out the following in the boot log
> > > 
> > > init_bootmem_core(0, 1909, 0, 12582912)
> > > init_bootmem_core: Calling memset(0xFFFF810000775000, 1572864)
> > > AAGH: afinfo corrupted at mm/bootmem.c:121
> > > 
> > > where;
> > > 
> > > 1909 == mapstart
> > > 0 == start
> > > 12582912 == end
> > > 1572864 == mapsize
> > > 
> > > mapstart, start and end being the parameters being passed to
> > > init_bootmem_core(). This means we are calling memset for the physical
> > > range 0x775000 -> 0x8F5000 which is in a usable range according to the
> > > BIOS-e820 map it appears.
> > > 
> > 
> > Hi Mel,
> > 
> 
> Hi.
> 
> > Where is bss placed in physical memory? I guess bss_start and bss_stop
> > from System.map will tell us. That will confirm that above memset step is
> > stomping over bss. Then we have to just find that somewhere probably
> > we allocated wrong physical memory area for bootmem allocator map.
> > 
> 
> BSS is at 0x643000 -> 0x777BC4
> init_bootmem wipes from 0x777000 -> 0x8F7000
> 
> So the BSS bytes from 0x777000 -> 0x777BC4 (which looks very suspiciously
> like a page alignment of addr & PAGE_MASK) get set to 0xFF. One possible
> fix is below. It adds a check in bad_addr() to see if the BSS section is
> about to be used for bootmap. It Seems To Work For Me (tm) and illustrates
> the source of the problem even if it's not the 100% correct fix.
> 
> diff -rup -X /usr/src/patchset-0.6/bin//dontdiff linux-2.6.18-git22-clean/arch/x86_64/kernel/e820.c linux-2.6.18-git22-bss_relocate_fix/arch/x86_64/kernel/e820.c
> --- linux-2.6.18-git22-clean/arch/x86_64/kernel/e820.c	2006-10-05 20:42:07.000000000 +0100
> +++ linux-2.6.18-git22-bss_relocate_fix/arch/x86_64/kernel/e820.c	2006-10-06 17:39:51.000000000 +0100
> @@ -51,6 +51,7 @@ extern struct resource code_resource, da
>  static inline int bad_addr(unsigned long *addrp, unsigned long size)
>  { 
>  	unsigned long addr = *addrp, last = addr + size; 
> +	unsigned long bss_start, bss_end;
>  
>  	/* various gunk below that needed for SMP startup */
>  	if (addr < 0x8000) { 
> @@ -77,6 +78,14 @@ static inline int bad_addr(unsigned long
>  		*addrp = __pa_symbol(&_end);
>  		return 1;
>  	}
> +	
> +	/* bss section */
> +	bss_start = __pa_symbol(&__bss_start);
> +	bss_end = PAGE_ALIGN(__pa_symbol(&__bss_stop));
> +	if (addr >= bss_start && addr < bss_end) {
> +		*addrp = bss_end;
> +		return 1;
> +	}
>  

Surprising, the kernel code check just before this should have taken care
of it.

	/* kernel code */
	if (last >= __pa_symbol(&_text) && last < __pa_symbol(&_end)) {
		*addrp = __pa_symbol(&_end);
		return 1;
	}

Maybe it can be changed to

	if (last >= __pa_symbol(&_text) && last < PAGE_ALIGN(__pa_symbol(&_end))) {

But all of this seems to be a stopgap fix. The real puzzle is still exactly
where it slipped through, and it should be fixed there.
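
One way to reason about where a range could slip through is to compare the
kernel code test quoted above with a generic interval-overlap test. This is
only a sketch: TEXT_START is a hypothetical value, KERNEL_END assumes that _end
coincides with the end of the BSS Mel reported, and neither function is claimed
to be the right fix:

```c
#include <assert.h>

#define TEXT_START	0x200000UL	/* hypothetical physical address of _text */
#define KERNEL_END	0x777BC4UL	/* assumption: _end == reported end of BSS */

/* Same shape as the "kernel code" test quoted above. */
static int last_inside_kernel(unsigned long addr, unsigned long size)
{
	unsigned long last = addr + size;

	return last >= TEXT_START && last < KERNEL_END;
}

/* Generic test: do [addr, addr + size) and [TEXT_START, KERNEL_END) overlap? */
static int overlaps_kernel(unsigned long addr, unsigned long size)
{
	unsigned long last = addr + size;

	return addr < KERNEL_END && last > TEXT_START;
}

/* Apply both tests to the bootmap range 0x777000 -> 0x8F7000 from this thread. */
static void check_reported_range(void)
{
	unsigned long addr = 0x777000UL, size = 0x180000UL;

	assert(!last_inside_kernel(addr, size));	/* range ends beyond KERNEL_END */
	assert(overlaps_kernel(addr, size));		/* yet it starts below KERNEL_END */
}
```

Under these assumptions, a range that starts just below _end but finishes well
past it is not rejected by a test that looks only at last.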

Maybe some more printks will help us.

Thanks
Vivek

^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: 2.6.18-mm2 boot failure on x86-64
  2006-10-06 17:11                                             ` Mel Gorman
  2006-10-06 17:34                                               ` Vivek Goyal
@ 2006-10-06 17:59                                               ` Vivek Goyal
  2006-10-06 18:03                                               ` Steve Fox
  2 siblings, 0 replies; 140+ messages in thread
From: Vivek Goyal @ 2006-10-06 17:59 UTC (permalink / raw)
  To: Mel Gorman
  Cc: Steve Fox, Andi Kleen, Badari Pulavarty, Martin Bligh,
	Andrew Morton, lkml, netdev, kmannth, Andy Whitcroft

On Fri, Oct 06, 2006 at 06:11:05PM +0100, Mel Gorman wrote:
> On (06/10/06 11:36), Vivek Goyal didst pronounce:
> > On Fri, Oct 06, 2006 at 03:33:12PM +0100, Mel Gorman wrote:
> > > > Linux version 2.6.18-git22 (root@elm3b239) (gcc version 4.1.0 (SUSE Linux)) #2 SMP Thu Oct 5 19:05:36 PDT 2006
> > > > Command line: root=/dev/sda1 vga=791  ip=9.47.67.239:9.47.67.50:9.47.67.1:255.255.255.0 resume=/dev/sdb1 showopts earlyprintk=serial,ttyS0,57600 console=tty0 console=ttyS0,57600 autobench_args: root=/dev/sda1 ABAT:1160100417
> > > > BIOS-provided physical RAM map:
> > > >  BIOS-e820: 0000000000000000 - 000000000009ac00 (usable)
> > > >  BIOS-e820: 000000000009ac00 - 00000000000a0000 (reserved)
> > > >  BIOS-e820: 00000000000e0000 - 0000000000100000 (reserved)
> > > >  BIOS-e820: 0000000000100000 - 00000000bff764c0 (usable)
> > > >  BIOS-e820: 00000000bff764c0 - 00000000bff98880 (ACPI data)
> > > >  BIOS-e820: 00000000bff98880 - 00000000c0000000 (reserved)
> > > >  BIOS-e820: 00000000fec00000 - 0000000100000000 (reserved)
> > > >  BIOS-e820: 0000000100000000 - 0000000c00000000 (usable)
> > > 
> > > I continued what Steve was doing this morning to see could this be
> > > pinned down. After placing 'CHECK;' in a few places as suggested by
> > > Andi's check, the problem code was identified as that following in
> > > mm/bootmem.c#init_bootmem_core()
> > > 
> > >         mapsize = get_mapsize(bdata);
> > >         memset(bdata->node_bootmem_map, 0xff, mapsize);
> > > 
> > > That explains the value in the array at least. A few more printfs around
> > > this point printed out the following in the boot log
> > > 
> > > init_bootmem_core(0, 1909, 0, 12582912)
> > > init_bootmem_core: Calling memset(0xFFFF810000775000, 1572864)
> > > AAGH: afinfo corrupted at mm/bootmem.c:121
> > > 
> > > where;
> > > 
> > > 1909 == mapstart
> > > 0 == start
> > > 12582912 == end
> > > 1572864 == mapsize
> > > 
> > > mapstart, start and end being the parameters being passed to
> > > init_bootmem_core(). This means we are calling memset for the physical
> > > range 0x775000 -> 0x8F5000 which is in a usable range according to the
> > > BIOS-e820 map it appears.
> > > 
> > 
> > Hi Mel,
> > 
> 
> Hi.
> 
> > Where is bss placed in physical memory? I guess bss_start and bss_stop
> > from System.map will tell us. That will confirm that above memset step is
> > stomping over bss. Then we have to just find that somewhere probably
> > we allocated wrong physical memory area for bootmem allocator map.
> > 
> 
> BSS is at 0x643000 -> 0x777BC4
> init_bootmem wipes from 0x777000 -> 0x8F7000
> 
> So the BSS bytes from 0x777000 ->0x777BC4 (which looks very suspiciously
> like a page alignment of addr & PAGE_MASK) gets set to 0xFF. One possible
> fix is below. It adds a check in bad_addr() to see if the BSS section is
> about to be used for bootmap. It Seems To Work For Me (tm) and illustrates
> the source of the problem even if it's not the 100% correct fix.
> 

OK, it looks like the code assumes that the memory area returned by
find_e820_area() is page aligned. I found two such instances, and that is
what leads to the problem.

        bootmap_size = init_bootmem_node(NODE_DATA(nodeid),
                                         bootmap_start >> PAGE_SHIFT,
                                         start_pfn, end_pfn);

Here bootmap_start is not page aligned; I guess it currently contains the
value 0x777BC4 (just beyond _end). But the moment I do
bootmap_start >> PAGE_SHIFT, I start stomping on bss.

The case here is similar.

        bootmap = find_e820_area(0, end_pfn<<PAGE_SHIFT, bootmap_size);
        if (bootmap == -1L)
                panic("Cannot find bootmem map of size %ld\n",bootmap_size);
        bootmap_size = init_bootmem(bootmap >> PAGE_SHIFT, end_pfn);

So maybe we should return a page-aligned address from find_e820_area().
Maybe we can change bad_addr() to set *addrp to the next page-aligned
boundary for every check?

 	*addrp = PAGE_ALIGN(__pa_symbol(&_end));
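
To make the arithmetic concrete, here is a minimal userspace sketch of what
happens when an unaligned address is shifted down to a page frame number.
The macros are local stand-ins that assume 4 KiB pages, and the helper names
(pfn_to_phys, bytes_clobbered, check_thread_example) are made up for this
illustration; only the address 0x777BC4 comes from this thread.

```c
#include <assert.h>

/* Local stand-ins for the kernel macros, assuming 4 KiB pages. */
#define PAGE_SHIFT	12
#define PAGE_SIZE	(1UL << PAGE_SHIFT)
#define PAGE_MASK	(~(PAGE_SIZE - 1))
#define PAGE_ALIGN(addr) (((addr) + PAGE_SIZE - 1) & PAGE_MASK)

/* Hypothetical helper: physical address at which page frame 'pfn' starts. */
unsigned long pfn_to_phys(unsigned long pfn)
{
	return pfn << PAGE_SHIFT;
}

/*
 * Hypothetical helper: number of bytes below 'addr' that get overwritten
 * when 'addr' is truncated to a page frame number and the page is then
 * written from its first byte.
 */
unsigned long bytes_clobbered(unsigned long addr)
{
	return addr - pfn_to_phys(addr >> PAGE_SHIFT);
}

/* Self-check using the address quoted in this thread (just past _end). */
void check_thread_example(void)
{
	unsigned long bootmap_start = 0x777BC4UL;

	/* Truncation lands on 0x777000, so 0xBC4 bytes of bss are hit. */
	assert(pfn_to_phys(bootmap_start >> PAGE_SHIFT) == 0x777000UL);
	assert(bytes_clobbered(bootmap_start) == 0xBC4UL);

	/* Rounding up first moves the bitmap to the next whole page. */
	assert(PAGE_ALIGN(bootmap_start) == 0x778000UL);
	assert(bytes_clobbered(PAGE_ALIGN(bootmap_start)) == 0);
}
```

Rounding up costs at most one page minus one byte per allocation, which is
the memory trade-off of aligning inside bad_addr().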

Thanks
Vivek

^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: 2.6.18-mm2 boot failure on x86-64
  2006-10-06 17:11                                             ` Mel Gorman
  2006-10-06 17:34                                               ` Vivek Goyal
  2006-10-06 17:59                                               ` Vivek Goyal
@ 2006-10-06 18:03                                               ` Steve Fox
  2006-10-06 20:04                                                 ` Vivek Goyal
  2 siblings, 1 reply; 140+ messages in thread
From: Steve Fox @ 2006-10-06 18:03 UTC (permalink / raw)
  To: Mel Gorman
  Cc: Vivek Goyal, Andi Kleen, Badari Pulavarty, Martin Bligh,
	Andrew Morton, lkml, netdev, kmannth, Andy Whitcroft

On Fri, 2006-10-06 at 18:11 +0100, Mel Gorman wrote:
> On (06/10/06 11:36), Vivek Goyal didst pronounce:
> > Where is bss placed in physical memory? I guess bss_start and bss_stop
> > from System.map will tell us. That will confirm that above memset step is
> > stomping over bss. Then we have to just find that somewhere probably
> > we allocated wrong physical memory area for bootmem allocator map.
> > 
> 
> BSS is at 0x643000 -> 0x777BC4
> init_bootmem wipes from 0x777000 -> 0x8F7000
> 
> So the BSS bytes from 0x777000 ->0x777BC4 (which looks very suspiciously
> like a page alignment of addr & PAGE_MASK) gets set to 0xFF. One possible
> fix is below. It adds a check in bad_addr() to see if the BSS section is
> about to be used for bootmap. It Seems To Work For Me (tm) and illustrates
> the source of the problem even if it's not the 100% correct fix.

I was able to boot the machine with Mel's patch applied on top of
-git22.

-- 

Steve Fox
IBM Linux Technology Center

^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: 2.6.18-mm2 boot failure on x86-64
  2006-10-06 18:03                                               ` Steve Fox
@ 2006-10-06 20:04                                                 ` Vivek Goyal
  2006-10-09  9:53                                                   ` Mel Gorman
  0 siblings, 1 reply; 140+ messages in thread
From: Vivek Goyal @ 2006-10-06 20:04 UTC (permalink / raw)
  To: Steve Fox, mel
  Cc: Andi Kleen, Badari Pulavarty, Martin Bligh, Andrew Morton, lkml,
	netdev, kmannth, Andy Whitcroft

On Fri, Oct 06, 2006 at 01:03:50PM -0500, Steve Fox wrote:
> On Fri, 2006-10-06 at 18:11 +0100, Mel Gorman wrote:
> > On (06/10/06 11:36), Vivek Goyal didst pronounce:
> > > Where is bss placed in physical memory? I guess bss_start and bss_stop
> > > from System.map will tell us. That will confirm that above memset step is
> > > stomping over bss. Then we have to just find that somewhere probably
> > > we allocated wrong physical memory area for bootmem allocator map.
> > > 
> > 
> > BSS is at 0x643000 -> 0x777BC4
> > init_bootmem wipes from 0x777000 -> 0x8F7000
> > 
> > So the BSS bytes from 0x777000 ->0x777BC4 (which looks very suspiciously
> > > like a page alignment of addr & PAGE_MASK) gets set to 0xFF. One possible
> > fix is below. It adds a check in bad_addr() to see if the BSS section is
> > about to be used for bootmap. It Seems To Work For Me (tm) and illustrates
> > the source of the problem even if it's not the 100% correct fix.
> 
> I was able to boot the machine with Mel's patch applied on top of
> -git22.


Please have a look at the attached patch. Does it make sense?

Steve, can you please give this patch a try and see if it fixes the problem?

Thanks
Vivek




o Currently some pieces of code assume that the address returned by
  find_e820_area() is page aligned. But it looks like find_e820_area() had
  no such intention, and hence one might end up stomping over some data.
  One such case is the bootmem allocator initialization code, which stomped
  over bss.

o This patch modifies find_e820_area() to return a page-aligned address.
  This might be a little wasteful of memory, but page-aligned memory is
  probably easier to handle.

Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
---

 arch/x86_64/kernel/e820.c |   14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff -puN arch/x86_64/kernel/e820.c~x86_64-return-page-aligned-phy-addr-from-find-e820-area arch/x86_64/kernel/e820.c
--- linux-2.6.19-rc1-1M/arch/x86_64/kernel/e820.c~x86_64-return-page-aligned-phy-addr-from-find-e820-area	2006-10-06 15:28:13.000000000 -0400
+++ linux-2.6.19-rc1-1M-root/arch/x86_64/kernel/e820.c	2006-10-06 15:44:45.000000000 -0400
@@ -54,13 +54,13 @@ static inline int bad_addr(unsigned long
 
 	/* various gunk below that needed for SMP startup */
 	if (addr < 0x8000) { 
-		*addrp = 0x8000;
+		*addrp = PAGE_ALIGN(0x8000);
 		return 1; 
 	}
 
 	/* direct mapping tables of the kernel */
 	if (last >= table_start<<PAGE_SHIFT && addr < table_end<<PAGE_SHIFT) { 
-		*addrp = table_end << PAGE_SHIFT; 
+		*addrp = PAGE_ALIGN(table_end << PAGE_SHIFT);
 		return 1;
 	} 
 
@@ -68,18 +68,18 @@ static inline int bad_addr(unsigned long
 #ifdef CONFIG_BLK_DEV_INITRD
 	if (LOADER_TYPE && INITRD_START && last >= INITRD_START && 
 	    addr < INITRD_START+INITRD_SIZE) { 
-		*addrp = INITRD_START + INITRD_SIZE; 
+		*addrp = PAGE_ALIGN(INITRD_START + INITRD_SIZE);
 		return 1;
 	} 
 #endif
 	/* kernel code */
-	if (last >= __pa_symbol(&_text) && last < __pa_symbol(&_end)) {
-		*addrp = __pa_symbol(&_end);
+	if (last >= __pa_symbol(&_text) && addr < __pa_symbol(&_end)) {
+		*addrp = PAGE_ALIGN(__pa_symbol(&_end));
 		return 1;
 	}
 
 	if (last >= ebda_addr && addr < ebda_addr + ebda_size) {
-		*addrp = ebda_addr + ebda_size;
+		*addrp = PAGE_ALIGN(ebda_addr + ebda_size);
 		return 1;
 	}
 
@@ -152,7 +152,7 @@ unsigned long __init find_e820_area(unsi
 			continue; 
 		while (bad_addr(&addr, size) && addr+size <= ei->addr+ei->size)
 			;
-		last = addr + size;
+		last = PAGE_ALIGN(addr) + size;
 		if (last > ei->addr + ei->size)
 			continue;
 		if (last > end) 
_
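
The change to the "kernel code" test in the patch above swaps a check on
`last` alone for a proper interval-overlap test. A small sketch of the
difference, written as standalone userspace C: the function names are
invented for this illustration, and the region bounds used in the usage note
are partly made up (0x200000 as the start is an assumption; 0x777BC4 is the
_end value discussed in this thread).

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Old form of the test: only notices a candidate [addr, last) whose END
 * falls inside the reserved region [start, end).
 */
bool conflicts_old(unsigned long addr, unsigned long last,
		   unsigned long start, unsigned long end)
{
	(void)addr;	/* the old test never looked at the start address */
	return last >= start && last < end;
}

/*
 * Patched form: the candidate conflicts whenever it ends at or after
 * 'start' and begins before 'end', wherever its end happens to fall.
 */
bool conflicts_new(unsigned long addr, unsigned long last,
		   unsigned long start, unsigned long end)
{
	assert(addr <= last && start <= end);
	return last >= start && addr < end;
}
```

With a reserved region of 0x200000 to 0x777BC4, a candidate running from
0x700000 to 0x880000 begins inside the region but ends beyond it: the old
form reports no conflict, the patched form reports one. Candidates lying
wholly inside or wholly above the region are classified the same way by both.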

^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: 2.6.18-mm2 boot failure on x86-64
  2006-10-06 20:04                                                 ` Vivek Goyal
@ 2006-10-09  9:53                                                   ` Mel Gorman
  2006-10-16 18:16                                                     ` Vivek Goyal
  0 siblings, 1 reply; 140+ messages in thread
From: Mel Gorman @ 2006-10-09  9:53 UTC (permalink / raw)
  To: Vivek Goyal
  Cc: Steve Fox, Andi Kleen, Badari Pulavarty, Martin Bligh,
	Andrew Morton, lkml, netdev, kmannth, Andy Whitcroft

On Fri, 6 Oct 2006, Vivek Goyal wrote:

> On Fri, Oct 06, 2006 at 01:03:50PM -0500, Steve Fox wrote:
>> On Fri, 2006-10-06 at 18:11 +0100, Mel Gorman wrote:
>>> On (06/10/06 11:36), Vivek Goyal didst pronounce:
>>>> Where is bss placed in physical memory? I guess bss_start and bss_stop
>>>> from System.map will tell us. That will confirm that above memset step is
>>>> stomping over bss. Then we have to just find that somewhere probably
>>>> we allocated wrong physical memory area for bootmem allocator map.
>>>>
>>>
>>> BSS is at 0x643000 -> 0x777BC4
>>> init_bootmem wipes from 0x777000 -> 0x8F7000
>>>
>>> So the BSS bytes from 0x777000 ->0x777BC4 (which looks very suspiciously
>>> like a page alignment of addr & PAGE_MASK) gets set to 0xFF. One possible
>>> fix is below. It adds a check in bad_addr() to see if the BSS section is
>>> about to be used for bootmap. It Seems To Work For Me (tm) and illustrates
>>> the source of the problem even if it's not the 100% correct fix.
>>
>> I was able to boot the machine with Mel's patch applied on top of
>> -git22.
>
>
> Please have a look at the attached patch. Does it make some sense.
>

It makes some sense. As you state, it wastes memory but that is better 
than breaking.

> Steve, can you please give this patch a try if it fixes the problem?
>

I boot-tested the patch on the same machine Steve was using and it
completed successfully.

> Thanks
> Vivek
>
>
>
>
> o Currently some code pieces assume that address returned by find_e820_area()
>  are page aligned. But looks like find_e820_area() had no such intention
>  and hence one might end up stomping over some of the data. One such
>  case is bootmem allocator initialization code stomped over bss.
>
> o This patch modified find_e820_area() to return page aligned address. This
>  might be little wasteful of memory but at the same time probably it is
>  easier to handle page aligned memory.
>
> Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
> ---
>
> arch/x86_64/kernel/e820.c |   14 +++++++-------
> 1 file changed, 7 insertions(+), 7 deletions(-)
>
> diff -puN arch/x86_64/kernel/e820.c~x86_64-return-page-aligned-phy-addr-from-find-e820-area arch/x86_64/kernel/e820.c
> --- linux-2.6.19-rc1-1M/arch/x86_64/kernel/e820.c~x86_64-return-page-aligned-phy-addr-from-find-e820-area	2006-10-06 15:28:13.000000000 -0400
> +++ linux-2.6.19-rc1-1M-root/arch/x86_64/kernel/e820.c	2006-10-06 15:44:45.000000000 -0400
> @@ -54,13 +54,13 @@ static inline int bad_addr(unsigned long
>
> 	/* various gunk below that needed for SMP startup */
> 	if (addr < 0x8000) {
> -		*addrp = 0x8000;
> +		*addrp = PAGE_ALIGN(0x8000);
> 		return 1;
> 	}
>
> 	/* direct mapping tables of the kernel */
> 	if (last >= table_start<<PAGE_SHIFT && addr < table_end<<PAGE_SHIFT) {
> -		*addrp = table_end << PAGE_SHIFT;
> +		*addrp = PAGE_ALIGN(table_end << PAGE_SHIFT);
> 		return 1;
> 	}
>
> @@ -68,18 +68,18 @@ static inline int bad_addr(unsigned long
> #ifdef CONFIG_BLK_DEV_INITRD
> 	if (LOADER_TYPE && INITRD_START && last >= INITRD_START &&
> 	    addr < INITRD_START+INITRD_SIZE) {
> -		*addrp = INITRD_START + INITRD_SIZE;
> +		*addrp = PAGE_ALIGN(INITRD_START + INITRD_SIZE);
> 		return 1;
> 	}
> #endif
> 	/* kernel code */
> -	if (last >= __pa_symbol(&_text) && last < __pa_symbol(&_end)) {
> -		*addrp = __pa_symbol(&_end);
> +	if (last >= __pa_symbol(&_text) && addr < __pa_symbol(&_end)) {
> +		*addrp = PAGE_ALIGN(__pa_symbol(&_end));
> 		return 1;
> 	}
>
> 	if (last >= ebda_addr && addr < ebda_addr + ebda_size) {
> -		*addrp = ebda_addr + ebda_size;
> +		*addrp = PAGE_ALIGN(ebda_addr + ebda_size);
> 		return 1;
> 	}
>
> @@ -152,7 +152,7 @@ unsigned long __init find_e820_area(unsi
> 			continue;
> 		while (bad_addr(&addr, size) && addr+size <= ei->addr+ei->size)
> 			;
> -		last = addr + size;
> +		last = PAGE_ALIGN(addr) + size;
> 		if (last > ei->addr + ei->size)
> 			continue;
> 		if (last > end)
> _
>

-- 
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab

^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: md deadlock (was Re: 2.6.18-mm2)
  2006-10-02 13:47         ` Peter Zijlstra
@ 2006-10-10  3:53           ` Neil Brown
  2006-10-10  6:42             ` Ingo Molnar
  0 siblings, 1 reply; 140+ messages in thread
From: Neil Brown @ 2006-10-10  3:53 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Michal Piotrowski, Andrew Morton, Ingo Molnar, linux-raid, linux-kernel


Hi,
 would this be an appropriate fix for the warning lockdep gives about
 possible deadlocks in md?

 The warning is currently easily triggered with
   mdadm -C /dev/md1 -l1 -n1 /dev/sdc missing

 (assuming /dev/sdc is a device that you are happy to be scribbled on).

 This will take ->reconfig_mutex on md1 while holding bd_mutex,
 then will take bd_mutex on sdc while holding reconfig_mutex on md1.

 This superficial deadlock isn't a real problem because the bd_mutexes
 are on different devices and there is a hierarchical relationship
 which avoids the loop necessary for a deadlock.
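
 As a rough illustration of why a nesting annotation silences the warning,
 here is a toy model of lock-order tracking. It is NOT how lockdep is
 implemented; it only records "B was taken while holding A" per lock class
 and flags the reverse order. The enum, struct and function names are all
 invented for this sketch. The one real property it relies on is that a
 lock taken with a non-zero subclass is tracked as a class of its own.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Lock classes as a validator might see them. RECONFIG_NESTED stands for
 * reconfig_mutex taken with a non-zero subclass.
 */
enum lock_class { BD_MUTEX, RECONFIG, RECONFIG_NESTED, NR_CLASSES };

/* seen[a][b] is true once some path has taken class b while holding a. */
struct dep_graph {
	bool seen[NR_CLASSES][NR_CLASSES];
};

/*
 * Record "acquired while held". Returns true if the opposite order was
 * already recorded, i.e. a possible AB-BA deadlock between the classes.
 * Only direct two-lock inversions are detected in this toy model.
 */
bool record_dependency(struct dep_graph *g, enum lock_class held,
		       enum lock_class acquired)
{
	assert((int)held < NR_CLASSES && (int)acquired < NR_CLASSES);
	g->seen[held][acquired] = true;
	return g->seen[acquired][held];
}
```

 Without the annotation, the open path records BD_MUTEX -> RECONFIG and the
 array-assembly path records RECONFIG -> BD_MUTEX, so the second call
 reports an inversion. With the open path recorded as BD_MUTEX ->
 RECONFIG_NESTED instead, the two orders no longer involve the same pair of
 classes and nothing is reported.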

-----------------------
Avoid lockdep warning in md.

md_open takes ->reconfig_mutex which causes lockdep to complain.
This (normally) doesn't have deadlock potential as the possible
conflict is with a reconfig_mutex in a different device.

I say "normally" because if a loop were created in the array->member
hierarchy a deadlock could happen.  However that causes bigger
problems than a deadlock and should be fixed independently.

So we flag the lock in md_open as a nested lock.  This requires
defining mutex_lock_interruptible_nested.

Signed-off-by: Neil Brown <neilb@suse.de>

### Diffstat output
 ./drivers/md/md.c       |    2 +-
 ./include/linux/mutex.h |    2 ++
 ./kernel/mutex.c        |    8 ++++++++
 3 files changed, 11 insertions(+), 1 deletion(-)

diff .prev/drivers/md/md.c ./drivers/md/md.c
--- .prev/drivers/md/md.c	2006-10-09 14:25:11.000000000 +1000
+++ ./drivers/md/md.c	2006-10-10 12:28:35.000000000 +1000
@@ -4422,7 +4422,7 @@ static int md_open(struct inode *inode, 
 	mddev_t *mddev = inode->i_bdev->bd_disk->private_data;
 	int err;
 
-	if ((err = mddev_lock(mddev)))
+	if ((err = mutex_lock_interruptible_nested(&mddev->reconfig_mutex, 1)))
 		goto out;
 
 	err = 0;

diff .prev/include/linux/mutex.h ./include/linux/mutex.h
--- .prev/include/linux/mutex.h	2006-10-10 12:37:04.000000000 +1000
+++ ./include/linux/mutex.h	2006-10-10 12:40:20.000000000 +1000
@@ -125,8 +125,10 @@ extern int fastcall mutex_lock_interrupt
 
 #ifdef CONFIG_DEBUG_LOCK_ALLOC
 extern void mutex_lock_nested(struct mutex *lock, unsigned int subclass);
+extern int mutex_lock_interruptible_nested(struct mutex *lock, unsigned int subclass);
 #else
 # define mutex_lock_nested(lock, subclass) mutex_lock(lock)
+# define mutex_lock_interruptible_nested(lock, subclass) mutex_lock_interruptible(lock)
 #endif
 
 /*

diff .prev/kernel/mutex.c ./kernel/mutex.c
--- .prev/kernel/mutex.c	2006-10-10 12:35:54.000000000 +1000
+++ ./kernel/mutex.c	2006-10-10 13:20:04.000000000 +1000
@@ -206,6 +206,14 @@ mutex_lock_nested(struct mutex *lock, un
 }
 
 EXPORT_SYMBOL_GPL(mutex_lock_nested);
+int __sched
+mutex_lock_interruptible_nested(struct mutex *lock, unsigned int subclass)
+{
+	might_sleep();
+	return __mutex_lock_common(lock, TASK_INTERRUPTIBLE, subclass);
+}
+
+EXPORT_SYMBOL_GPL(mutex_lock_interruptible_nested);
 #endif
 
 /*

^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: md deadlock (was Re: 2.6.18-mm2)
  2006-10-10  3:53           ` Neil Brown
@ 2006-10-10  6:42             ` Ingo Molnar
  0 siblings, 0 replies; 140+ messages in thread
From: Ingo Molnar @ 2006-10-10  6:42 UTC (permalink / raw)
  To: Neil Brown
  Cc: Peter Zijlstra, Michal Piotrowski, Andrew Morton, linux-raid,
	linux-kernel


* Neil Brown <neilb@suse.de> wrote:

> --- .prev/include/linux/mutex.h	2006-10-10 12:37:04.000000000 +1000
> +++ ./include/linux/mutex.h	2006-10-10 12:40:20.000000000 +1000
> @@ -125,8 +125,10 @@ extern int fastcall mutex_lock_interrupt
>  
>  #ifdef CONFIG_DEBUG_LOCK_ALLOC
>  extern void mutex_lock_nested(struct mutex *lock, unsigned int subclass);
> +extern int mutex_lock_interruptible_nested(struct mutex *lock, unsigned int subclass);
>  #else
>  # define mutex_lock_nested(lock, subclass) mutex_lock(lock)
> +# define mutex_lock_interruptible_nested(lock, subclass) mutex_lock_interruptible(lock)
>  #endif

>  EXPORT_SYMBOL_GPL(mutex_lock_nested);
> +int __sched
> +mutex_lock_interruptible_nested(struct mutex *lock, unsigned int subclass)
> +{
> +	might_sleep();
> +	return __mutex_lock_common(lock, TASK_INTERRUPTIBLE, subclass);
> +}
> +
> +EXPORT_SYMBOL_GPL(mutex_lock_interruptible_nested);

looks good to me. (small style nit: maybe insert a newline after the 
first EXPORT_SYMBOL_GPL line)

Acked-by: Ingo Molnar <mingo@elte.hu>

	Ingo

^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: 2.6.18-mm2 boot failure on x86-64
  2006-10-09  9:53                                                   ` Mel Gorman
@ 2006-10-16 18:16                                                     ` Vivek Goyal
  2006-10-16 23:58                                                       ` Andrew Morton
  0 siblings, 1 reply; 140+ messages in thread
From: Vivek Goyal @ 2006-10-16 18:16 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Steve Fox, Andi Kleen, Badari Pulavarty, Martin Bligh, lkml,
	netdev, kmannth, Andy Whitcroft, Adrian Bunk, Mel Gorman

On Mon, Oct 09, 2006 at 10:53:58AM +0100, Mel Gorman wrote:
> On Fri, 6 Oct 2006, Vivek Goyal wrote:
> 
> >On Fri, Oct 06, 2006 at 01:03:50PM -0500, Steve Fox wrote:
> >>On Fri, 2006-10-06 at 18:11 +0100, Mel Gorman wrote:
> >>>On (06/10/06 11:36), Vivek Goyal didst pronounce:
> >>>>Where is bss placed in physical memory? I guess bss_start and bss_stop
> >>>>from System.map will tell us. That will confirm that above memset step 
> >>>>is
> >>>>stomping over bss. Then we have to just find that somewhere probably
> >>>>we allocated wrong physical memory area for bootmem allocator map.
> >>>>
> >>>
> >>>BSS is at 0x643000 -> 0x777BC4
> >>>init_bootmem wipes from 0x777000 -> 0x8F7000
> >>>
> >>>So the BSS bytes from 0x777000 ->0x777BC4 (which looks very suspiciously
> >>>like a page alignment of addr & PAGE_MASK) gets set to 0xFF. One possible
> >>>fix is below. It adds a check in bad_addr() to see if the BSS section is
> >>>about to be used for bootmap. It Seems To Work For Me (tm) and 
> >>>illustrates
> >>>the source of the problem even if it's not the 100% correct fix.
> >>
> >>I was able to boot the machine with Mel's patch applied on top of
> >>-git22.
> >
> >
> >Please have a look at the attached patch. Does it make some sense.
> >
> 
> It makes some sense. As you state, it wastes memory but that is better 
> than breaking.
> 
> >Steve, can you please give this patch a try if it fixes the problem?
> >
> 
> I boottested the patch on the same machine as Steve was using and it 
> completed successfully.
>

Hi Andrew,

Can you please have a look at the attached patch and include it in -mm?
This fixes the issue for Steve. It also figures in Adrian Bunk's list
of known regressions.

Subject    : oops in xfrm_register_mode
References : http://lkml.org/lkml/2006/10/4/170
Submitter  : Steve Fox <drfickle@us.ibm.com>
Handled-By : Vivek Goyal <vgoyal@in.ibm.com>
Status     : patch available



o Currently some pieces of code assume that the address returned by
  find_e820_area() is page aligned. But it looks like find_e820_area() had
  no such intention, and hence one might end up stomping over some data.
  One such case is the bootmem allocator initialization code, which stomped
  over bss.

o This patch modifies find_e820_area() to return a page-aligned address.
  This might be a little wasteful of memory, but page-aligned memory is
  probably easier to handle.

Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
---

 arch/x86_64/kernel/e820.c |   14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff -puN arch/x86_64/kernel/e820.c~x86_64-return-page-aligned-phy-addr-from-find-e820-area arch/x86_64/kernel/e820.c
--- linux-2.6.19-rc1-1M/arch/x86_64/kernel/e820.c~x86_64-return-page-aligned-phy-addr-from-find-e820-area	2006-10-06 15:28:13.000000000 -0400
+++ linux-2.6.19-rc1-1M-root/arch/x86_64/kernel/e820.c	2006-10-06 15:44:45.000000000 -0400
@@ -54,13 +54,13 @@ static inline int bad_addr(unsigned long
 
 	/* various gunk below that needed for SMP startup */
 	if (addr < 0x8000) { 
-		*addrp = 0x8000;
+		*addrp = PAGE_ALIGN(0x8000);
 		return 1; 
 	}
 
 	/* direct mapping tables of the kernel */
 	if (last >= table_start<<PAGE_SHIFT && addr < table_end<<PAGE_SHIFT) { 
-		*addrp = table_end << PAGE_SHIFT; 
+		*addrp = PAGE_ALIGN(table_end << PAGE_SHIFT);
 		return 1;
 	} 
 
@@ -68,18 +68,18 @@ static inline int bad_addr(unsigned long
 #ifdef CONFIG_BLK_DEV_INITRD
 	if (LOADER_TYPE && INITRD_START && last >= INITRD_START && 
 	    addr < INITRD_START+INITRD_SIZE) { 
-		*addrp = INITRD_START + INITRD_SIZE; 
+		*addrp = PAGE_ALIGN(INITRD_START + INITRD_SIZE);
 		return 1;
 	} 
 #endif
 	/* kernel code */
-	if (last >= __pa_symbol(&_text) && last < __pa_symbol(&_end)) {
-		*addrp = __pa_symbol(&_end);
+	if (last >= __pa_symbol(&_text) && addr < __pa_symbol(&_end)) {
+		*addrp = PAGE_ALIGN(__pa_symbol(&_end));
 		return 1;
 	}
 
 	if (last >= ebda_addr && addr < ebda_addr + ebda_size) {
-		*addrp = ebda_addr + ebda_size;
+		*addrp = PAGE_ALIGN(ebda_addr + ebda_size);
 		return 1;
 	}
 
@@ -152,7 +152,7 @@ unsigned long __init find_e820_area(unsi
 			continue; 
 		while (bad_addr(&addr, size) && addr+size <= ei->addr+ei->size)
 			;
-		last = addr + size;
+		last = PAGE_ALIGN(addr) + size;
 		if (last > ei->addr + ei->size)
 			continue;
 		if (last > end) 
_

^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: 2.6.18-mm2 boot failure on x86-64
  2006-10-16 18:16                                                     ` Vivek Goyal
@ 2006-10-16 23:58                                                       ` Andrew Morton
  2006-10-17 12:18                                                         ` Adrian Bunk
  0 siblings, 1 reply; 140+ messages in thread
From: Andrew Morton @ 2006-10-16 23:58 UTC (permalink / raw)
  To: vgoyal
  Cc: Steve Fox, Andi Kleen, Badari Pulavarty, Martin Bligh, lkml,
	netdev, kmannth, Andy Whitcroft, Adrian Bunk, Mel Gorman

On Mon, 16 Oct 2006 14:16:13 -0400
Vivek Goyal <vgoyal@in.ibm.com> wrote:

> 
> Can you please have a look at the attached patch

Looks like a fine patch to me, although it could benefit from a comment
explaining why all those PAGE_ALIGN()s are in there.

> and include it in -mm.

Does it fix a patch in -mm or is it needed in mainline?



^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: 2.6.18-mm2 boot failure on x86-64
  2006-10-16 23:58                                                       ` Andrew Morton
@ 2006-10-17 12:18                                                         ` Adrian Bunk
  2006-10-17 17:32                                                           ` Mel Gorman
  0 siblings, 1 reply; 140+ messages in thread
From: Adrian Bunk @ 2006-10-17 12:18 UTC (permalink / raw)
  To: Andrew Morton
  Cc: vgoyal, Steve Fox, Andi Kleen, Badari Pulavarty, Martin Bligh,
	lkml, netdev, kmannth, Andy Whitcroft, Mel Gorman

On Mon, Oct 16, 2006 at 04:58:14PM -0700, Andrew Morton wrote:
> On Mon, 16 Oct 2006 14:16:13 -0400
> Vivek Goyal <vgoyal@in.ibm.com> wrote:
> 
> > 
> > Can you please have a look at the attached patch
> 
> Looks like a fine patch to me, although it could benefit from a comment
> explaining why all those PAGE_ALIGN()s are in there.
> 
> > and include it in -mm.
> 
> Does it fix a patch in -mm or is it needed in mainline?

The bug in my list was reported to be present in mainline [1].

cu
Adrian

[1] http://lkml.org/lkml/2006/10/4/394

-- 

       "Is there not promise of rain?" Ling Tan asked suddenly out
        of the darkness. There had been need of rain for many days.
       "Only a promise," Lao Er said.
                                       Pearl S. Buck - Dragon Seed


^ permalink raw reply	[flat|nested] 140+ messages in thread

* Re: 2.6.18-mm2 boot failure on x86-64
  2006-10-17 12:18                                                         ` Adrian Bunk
@ 2006-10-17 17:32                                                           ` Mel Gorman
  0 siblings, 0 replies; 140+ messages in thread
From: Mel Gorman @ 2006-10-17 17:32 UTC (permalink / raw)
  To: Adrian Bunk
  Cc: Andrew Morton, vgoyal, Steve Fox, Andi Kleen, Badari Pulavarty,
	Martin Bligh, lkml, netdev, kmannth, Andy Whitcroft

On Tue, 17 Oct 2006, Adrian Bunk wrote:

> On Mon, Oct 16, 2006 at 04:58:14PM -0700, Andrew Morton wrote:
>> On Mon, 16 Oct 2006 14:16:13 -0400
>> Vivek Goyal <vgoyal@in.ibm.com> wrote:
>>
>>>
>>> Can you please have a look at the attached patch
>>
>> Looks like a fine patch to me, although it could benefit from a comment
>> explaining why all those PAGE_ALIGN()s are in there.
>>
>>> and include it in -mm.
>>
>> Does it fix a patch in -mm or is it needed in mainline?
>
> The bug in my list was reported to be present in mainline [1].
>

Confirmed. This bug is present in 2.6.19-rc2

-- 
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab

^ permalink raw reply	[flat|nested] 140+ messages in thread

2006-10-02  2:12                     ` Arjan van de Ven
2006-10-02 20:00                       ` [RFC PATCH] pci_request_irq (was [-mm patch] aic7xxx: check irq validity) Frederik Deweerdt
2006-10-02 18:15                         ` Matthew Wilcox
2006-10-02 21:09                           ` Frederik Deweerdt
2006-10-02 20:07                         ` [RFC PATCH] move aic7xxx to pci_request_irq Frederik Deweerdt
2006-10-02 18:27                           ` Matthew Wilcox
2006-10-02 21:02                             ` Frederik Deweerdt
2006-10-03  3:45                           ` Arjan van de Ven
2006-10-02 20:11                         ` [RFC PATCH] move tg3 " Frederik Deweerdt
2006-10-02 18:28                           ` Matthew Wilcox
2006-10-02 21:04                             ` Frederik Deweerdt
2006-10-03  7:18                           ` Arjan van de Ven
2006-10-02 20:12                         ` [RFC PATCH] move drm " Frederik Deweerdt
2006-10-02 18:37                           ` Matthew Wilcox
2006-10-02 21:07                             ` Frederik Deweerdt
2006-10-02 20:36                           ` Alan Cox
2006-10-02 22:26                             ` Frederik Deweerdt
2006-10-02 23:54                           ` Dave Airlie
2006-10-03  7:17                             ` Frederik Deweerdt
2006-10-03  3:58                         ` [RFC PATCH] pci_request_irq (was [-mm patch] aic7xxx: check irq validity) Randy Dunlap
2006-10-01 21:31               ` [-mm patch] aic7xxx: check irq validity (was Re: 2.6.18-mm2) Frederik Deweerdt
2006-09-30 15:26         ` 2.6.18-mm2 James Bottomley
2006-09-30 16:21           ` 2.6.18-mm2 Matthew Wilcox
2006-09-30 17:20             ` 2.6.18-mm2 Mark Rustad
2006-09-30 20:54           ` 2.6.18-mm2 Alan Cox
2006-09-29 23:15     ` 2.6.18-mm2 J.A. Magallón
2006-09-30  7:04 ` 2.6.18-mm2 - possible recursive locking detected Borislav Petkov
2006-09-30  8:28   ` Andrew Morton
2006-09-30 18:19     ` Davide Libenzi
     [not found] ` <20060930133706.GA3291@melchior.yamamaya.is-a-geek.org>
2006-09-30 19:53   ` 2.6.18-mm2 Andrew Morton
