* [GIT PULL] core block bits for 2.6.37-rc1
@ 2010-10-22  7:57 Jens Axboe
  2010-10-23 15:29 ` [origin tree boot failure] " Ingo Molnar
  0 siblings, 1 reply; 11+ messages in thread
From: Jens Axboe @ 2010-10-22  7:57 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-kernel

Hi Linus,

This first pull request is the core bits, meaning general
block layer changes or core support. Should be clean this time,
only 'weird bit' is the seemingly duplicate entry from Malahal.
This is caused by the first patch being buggy (and later
reverted), second patch used the same single line description.

Nothing really exciting in here. A good collection of fixes, some of
which are marked for stable as well.

The biggest addition this time around is the block IO throttling support
from Vivek.

Please pull.


  git://git.kernel.dk/linux-2.6-block.git for-2.6.37/core

Christof Schmitt (1):
      zfcp: Report scatter gather limit for DIX protection information

Corrado Zoccolo (1):
      cfq: improve fsync performance for small files

Geert Uytterhoeven (1):
      block: Turn bvec_k{un,}map_irq() into static inline functions

Jens Axboe (3):
      core: match_dev_by_uuid() should not be marked __init
      do_mounts: only enable PARTUUID for CONFIG_BLOCK
      block: revert bad fix for memory hotplug causing bounces

Malahal Naineni (2):
      block: set the bounce_pfn to the actual DMA limit rather than to max memory
      block: set the bounce_pfn to the actual DMA limit rather than to max memory

Mark Lord (2):
      block: Prevent hang_check firing during long I/O
      Fix compile error in blk-exec.c for !CONFIG_DETECT_HUNG_TASK

Martin K. Petersen (5):
      Consolidate min_not_zero
      block/scsi: Provide a limit on the number of integrity segments
      block: Ensure physical block size is unsigned int
      block: Fix double free in blk_integrity_unregister
      block: Make the integrity mapped property a bio flag

Namhyung Kim (2):
      block: fix an address space warning in blk-map.c
      sg: fix a warning in blk_rq_aligned() call

San Mehat (1):
      block: block_dump: Add number of sectors to debug output

Signed-off-by: Jan Kara (1):
      block: Fix race during disk initialization

Vivek Goyal (16):
      blk-cgroup: Kill the header printed at the start of blkio.weight_device file
      blk-cgroup: Prepare the base for supporting more than one IO control policies
      blk-cgroup: Introduce cgroup changes for throttling policy
      blkio: Core implementation of throttle policy
      blk-cgroup: cgroup changes for IOPS limit support
      blkio: Implementation of IOPS limit logic
      blkio: Documentation Update
      blkio: Do not export throttle files if CONFIG_BLK_DEV_THROTTLING=n
      blkio: deletion of a cgroup was causes oops
      blkio: Add root group to td->tg_list
      blkio: Recalculate the throttled bio dispatch time upon throttle limit change
      blkio-throttle: Fix link failure failure on i386
      blkio-throttle: There is no need to convert jiffies to milli seconds
      blkio-throttle: limit max iops value to UINT_MAX
      blkio-throttle: Fix possible multiplication overflow in iops calculations
      cfq-iosched: Fix a gcc 4.5 warning and put some comments

Will Drewry (3):
      block, partition: add partition_meta_info to hd_struct
      genhd, efi: add efi partition metadata to hd_structs
      init: add support for root devices specified by partition UUID

Yasuaki Ishimatsu (1):
      block: fix accounting bug on cross partition merges

 Documentation/cgroups/blkio-controller.txt |  106 +++-
 block/Kconfig                              |   12 +
 block/Makefile                             |    1 +
 block/blk-cgroup.c                         |  804 ++++++++++++++++----
 block/blk-cgroup.h                         |   87 ++-
 block/blk-core.c                           |   53 ++-
 block/blk-exec.c                           |    9 +-
 block/blk-integrity.c                      |   94 ++-
 block/blk-map.c                            |    5 +-
 block/blk-merge.c                          |   25 +-
 block/blk-settings.c                       |   12 +-
 block/blk-sysfs.c                          |   11 +
 block/blk-throttle.c                       | 1123 ++++++++++++++++++++++++++++
 block/blk.h                                |   12 -
 block/cfq-iosched.c                        |   39 +-
 block/cfq.h                                |    2 +-
 block/genhd.c                              |   30 +-
 block/ioctl.c                              |    2 +-
 drivers/block/drbd/drbd_receiver.c         |    1 -
 drivers/md/dm-snap.c                       |    2 -
 drivers/md/dm-table.c                      |    5 -
 drivers/s390/scsi/zfcp_scsi.c              |    1 +
 drivers/scsi/hosts.c                       |    1 +
 drivers/scsi/scsi_lib.c                    |   26 +-
 drivers/scsi/scsi_sysfs.c                  |    2 +
 drivers/scsi/sd_dif.c                      |   11 +-
 drivers/scsi/sg.c                          |    2 +-
 fs/jbd/commit.c                            |    2 +-
 fs/jbd2/commit.c                           |    2 +-
 fs/partitions/check.c                      |   35 +-
 fs/partitions/check.h                      |    3 +
 fs/partitions/efi.c                        |   25 +
 include/linux/bio.h                        |   15 +-
 include/linux/blk_types.h                  |    6 +-
 include/linux/blkdev.h                     |   66 ++-
 include/linux/elevator.h                   |    2 +
 include/linux/genhd.h                      |   54 ++-
 include/linux/kernel.h                     |   10 +
 include/linux/sched.h                      |    3 +
 include/scsi/scsi.h                        |    6 +
 include/scsi/scsi_host.h                   |    7 +
 init/Kconfig                               |    9 +-
 init/do_mounts.c                           |   70 ++
 43 files changed, 2494 insertions(+), 299 deletions(-)
 create mode 100644 block/blk-throttle.c

-- 
Jens Axboe



* [origin tree boot failure] Re: [GIT PULL] core block bits for 2.6.37-rc1
  2010-10-22  7:57 [GIT PULL] core block bits for 2.6.37-rc1 Jens Axboe
@ 2010-10-23 15:29 ` Ingo Molnar
  2010-10-23 15:42   ` Linus Torvalds
                     ` (2 more replies)
  0 siblings, 3 replies; 11+ messages in thread
From: Ingo Molnar @ 2010-10-23 15:29 UTC (permalink / raw)
  To: Jens Axboe, Tejun Heo; +Cc: Linus Torvalds, linux-kernel


Hi,

* Jens Axboe <jaxboe@fusionio.com> wrote:

> Hi Linus,
> 
> This first pull request is the core bits, meaning general
> block layer changes or core support. Should be clean this time,
> only 'weird bit' is the seemingly duplicate entry from Malahal.
> This is caused by the first patch being buggy (and later
> reverted), second patch used the same single line description.
> 
> Nothing really exciting in here. A good collection of fixes, some of
> which are marked for stable as well.
> 
> The biggest addition this time around is the block IO throttling support
> from Vivek.

The upstream block bits pulled in this merge window (or maybe the workqueue bits) 
are possibly the cause of a boot crash on today's -tip, using a trivial x86 bootup test
(64-bit allyesconfig):

[  116.064281] calling  hd_init+0x0/0x302 @ 1
[  116.068529] hd: no drives specified - use hd=cyl,head,sectors on kernel command line
[  116.076334] general protection fault: 0000 [#1] SMP DEBUG_PAGEALLOC
[  116.080274] last sysfs file: 
[  116.080274] CPU 0 
[  116.080274] Modules linked in:
[  116.080274] 
[  116.080274] Pid: 1, comm: swapper Tainted: G        W   2.6.36-tip-03555-g825d9ec-dirty #51843 A8N-E/System Product Name
[  116.080274] RIP: 0010:[<ffffffff81064380>]  [<ffffffff81064380>] __ticket_spin_trylock+0x4/0x21
[  116.080274] RSP: 0018:ffff88003c417c10  EFLAGS: 00010082
[  116.080274] RAX: ffff88003c418000 RBX: 6b6b6b6b6b6b6b6a RCX: 0000000000000000
[  116.080274] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 6b6b6b6b6b6b6b6a
[  116.080274] RBP: ffff88003c417c10 R08: 0000000000000002 R09: 0000000000000001
[  116.080274] R10: 0000000000000286 R11: ffff880032498738 R12: 6b6b6b6b6b6b6b82
[  116.080274] R13: 0000000000000286 R14: 6b6b6b6b6b6b6b6b R15: 0000000000000001
[  116.080274] FS:  0000000000000000(0000) GS:ffff88003e200000(0000) knlGS:0000000000000000
[  116.080274] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[  116.080274] CR2: 0000000000000000 CR3: 0000000004071000 CR4: 00000000000006f0
[  116.080274] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  116.080274] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[  116.080274] Process swapper (pid: 1, threadinfo ffff88003c416000, task ffff88003c418000)
[  116.080274] Stack:
[  116.080274]  ffff88003c417c30 ffffffff8168c6ee 6b6b6b6b6b6b6b6b 6b6b6b6b6b6b6b6a
[  116.080274] <0> ffff88003c417c70 ffffffff82d37a20 ffffffff810a1b65 ffff88003c418000
[  116.080274] <0> ffffffff82d3836b 6b6b6b6b6b6b6b6a ffff8800330fcc20 ffff88003c417cb8
[  116.080274] Call Trace:
[  116.080274]  [<ffffffff8168c6ee>] do_raw_spin_trylock+0x1f/0x41
[  116.080274]  [<ffffffff82d37a20>] _raw_spin_lock_irqsave+0x72/0xa4
[  116.080274]  [<ffffffff810a1b65>] ? lock_timer_base+0x2c/0x52
[  116.080274]  [<ffffffff82d3836b>] ? _raw_spin_unlock_irqrestore+0x55/0x72
[  116.080274]  [<ffffffff810a1b65>] lock_timer_base+0x2c/0x52
[  116.080274]  [<ffffffff810a1c43>] del_timer+0x2f/0x82
[  116.080274]  [<ffffffff810ac906>] ? wait_on_work+0x0/0xdb
[  116.080274]  [<ffffffff810aca18>] __cancel_work_timer+0x37/0x130
[  116.080274]  [<ffffffff810acb23>] cancel_delayed_work_sync+0x12/0x14
[  116.080274]  [<ffffffff8166974a>] throtl_shutdown_timer_wq+0x1c/0x1e
[  116.080274]  [<ffffffff8165dbec>] blk_sync_queue+0x3d/0x41
[  116.080274]  [<ffffffff8165f872>] blk_release_queue+0x1e/0x6a
[  116.080274]  [<ffffffff81673ce3>] kobject_release+0xf4/0x122
[  116.080274]  [<ffffffff81673bef>] ? kobject_release+0x0/0x122
[  116.080274]  [<ffffffff81674e7e>] kref_put+0x43/0x4d
[  116.080274]  [<ffffffff81673b46>] kobject_put+0x47/0x4c
[  116.080274]  [<ffffffff8165dc53>] blk_cleanup_queue+0x63/0x68
[  116.080274]  [<ffffffff84883c80>] ? hd_init+0x0/0x302
[  116.080274]  [<ffffffff84883f54>] hd_init+0x2d4/0x302
[  116.080274]  [<ffffffff81910778>] ? device_pm_unlock+0x15/0x17
[  116.080274]  [<ffffffff84883c80>] ? hd_init+0x0/0x302
[  116.080274]  [<ffffffff81002062>] do_one_initcall+0x57/0x15a
[  116.080274]  [<ffffffff8482f78b>] kernel_init+0x194/0x222
[  116.080274]  [<ffffffff8103ad04>] kernel_thread_helper+0x4/0x10
[  116.080274]  [<ffffffff82d38710>] ? restore_args+0x0/0x30
[  116.080274]  [<ffffffff8482f5f7>] ? kernel_init+0x0/0x222
[  116.080274]  [<ffffffff8103ad00>] ? kernel_thread_helper+0x0/0x10
[  116.080274] Code: ff ff c9 c3 90 90 90 55 b8 00 00 01 00 48 89 e5 f0 0f c1 07 0f b7 d0 c1 e8 10 39 c2 74 07 f3 90 0f b7 17 eb f5 c9 c3 55 48 89 e5 <8b> 07 89 c2 c1 c0 10 39 c2 8d 90 00 00 01 00 75 04 f0 0f b1 17 
[  116.080274] RIP  [<ffffffff81064380>] __ticket_spin_trylock+0x4/0x21
[  116.080274]  RSP <ffff88003c417c10>
[  116.080274] ---[ end trace e8df42e772bf6fed ]---
[  116.080274] Kernel panic - not syncing: Fatal exception
[  116.080274] Pid: 1, comm: swapper Tainted: G      D W   2.6.36-tip-03555-g825d9ec-dirty #51843
[  116.080274] Call Trace:
[  116.080274]  [<ffffffff82d34d9c>] panic+0x91/0x1b7
[  116.080274]  [<ffffffff81094c93>] ? kmsg_dump+0x18d/0x1a7
[  116.080274]  [<ffffffff82d38364>] ? _raw_spin_unlock_irqrestore+0x4e/0x72
[  116.080274]  [<ffffffff82d396af>] oops_end+0xd8/0xe8
[  116.080274]  [<ffffffff8103d6fd>] die+0x5a/0x63
[  116.080274]  [<ffffffff82d3924f>] do_general_protection+0x12a/0x132
[  116.080274]  [<ffffffff82d38740>] ? irq_return+0x0/0x10
[  116.080274]  [<ffffffff82d38965>] general_protection+0x25/0x30
[  116.080274]  [<ffffffff81064380>] ? __ticket_spin_trylock+0x4/0x21
[  116.080274]  [<ffffffff8168c6ee>] do_raw_spin_trylock+0x1f/0x41
[  116.080274]  [<ffffffff82d37a20>] _raw_spin_lock_irqsave+0x72/0xa4
[  116.080274]  [<ffffffff810a1b65>] ? lock_timer_base+0x2c/0x52
[  116.080274]  [<ffffffff82d3836b>] ? _raw_spin_unlock_irqrestore+0x55/0x72
[  116.080274]  [<ffffffff810a1b65>] lock_timer_base+0x2c/0x52
[  116.080274]  [<ffffffff810a1c43>] del_timer+0x2f/0x82
[  116.080274]  [<ffffffff810ac906>] ? wait_on_work+0x0/0xdb
[  116.080274]  [<ffffffff810aca18>] __cancel_work_timer+0x37/0x130
[  116.080274]  [<ffffffff810acb23>] cancel_delayed_work_sync+0x12/0x14
[  116.080274]  [<ffffffff8166974a>] throtl_shutdown_timer_wq+0x1c/0x1e
[  116.080274]  [<ffffffff8165dbec>] blk_sync_queue+0x3d/0x41
[  116.080274]  [<ffffffff8165f872>] blk_release_queue+0x1e/0x6a
[  116.080274]  [<ffffffff81673ce3>] kobject_release+0xf4/0x122
[  116.080274]  [<ffffffff81673bef>] ? kobject_release+0x0/0x122
[  116.080274]  [<ffffffff81674e7e>] kref_put+0x43/0x4d
[  116.080274]  [<ffffffff81673b46>] kobject_put+0x47/0x4c
[  116.080274]  [<ffffffff8165dc53>] blk_cleanup_queue+0x63/0x68
[  116.080274]  [<ffffffff84883c80>] ? hd_init+0x0/0x302
[  116.080274]  [<ffffffff84883f54>] hd_init+0x2d4/0x302
[  116.080274]  [<ffffffff81910778>] ? device_pm_unlock+0x15/0x17
[  116.080274]  [<ffffffff84883c80>] ? hd_init+0x0/0x302
[  116.080274]  [<ffffffff81002062>] do_one_initcall+0x57/0x15a
[  116.080274]  [<ffffffff8482f78b>] kernel_init+0x194/0x222
[  116.080274]  [<ffffffff8103ad04>] kernel_thread_helper+0x4/0x10
[  116.080274]  [<ffffffff82d38710>] ? restore_args+0x0/0x30
[  116.080274]  [<ffffffff8482f5f7>] ? kernel_init+0x0/0x222
[  116.080274]  [<ffffffff8103ad00>] ? kernel_thread_helper+0x0/0x10

(Note, the taint is there because there are a few other (unrelated and harmless) 
warnings in the bootup.)

Previous -tip testing narrows the regression down to between d4429f6 and ab34c02.

Going back to d4429f6 it boots fine.

I've also Cc:-ed Tejun as workqueue bits were pulled in that commit range as well 
and the crash is also in the workqueue code.

Thanks,

	Ingo


* Re: [origin tree boot failure] Re: [GIT PULL] core block bits for 2.6.37-rc1
  2010-10-23 15:29 ` [origin tree boot failure] " Ingo Molnar
@ 2010-10-23 15:42   ` Linus Torvalds
  2010-10-23 15:52   ` Ingo Molnar
  2010-10-23 16:51   ` Jens Axboe
  2 siblings, 0 replies; 11+ messages in thread
From: Linus Torvalds @ 2010-10-23 15:42 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: Jens Axboe, Tejun Heo, linux-kernel

On Sat, Oct 23, 2010 at 8:29 AM, Ingo Molnar <mingo@elte.hu> wrote:
>
> [  116.080274]
> [  116.080274] Pid: 1, comm: swapper Tainted: G        W   2.6.36-tip-03555-g825d9ec-dirty #51843 A8N-E/System Product Name
> [  116.080274] RIP: 0010:[<ffffffff81064380>]  [<ffffffff81064380>] __ticket_spin_trylock+0x4/0x21
> [  116.080274] RSP: 0018:ffff88003c417c10  EFLAGS: 00010082
> [  116.080274] RAX: ffff88003c418000 RBX: 6b6b6b6b6b6b6b6a RCX: 0000000000000000
> [  116.080274] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 6b6b6b6b6b6b6b6a
> [  116.080274] RBP: ffff88003c417c10 R08: 0000000000000002 R09: 0000000000000001
> [  116.080274] R10: 0000000000000286 R11: ffff880032498738 R12: 6b6b6b6b6b6b6b82

And we obviously have that "6b" pattern for a use-after-free with slab
poisoning. Jens, have you tried with slab debugging?

                    Linus
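
[Editorial sketch, not part of the original message.] The "6b" pattern
mentioned above is the byte the slab allocator writes over freed objects
when poisoning is enabled (POISON_FREE, 0x6b, in include/linux/poison.h).
A rough way to spot it in a register dump is shown below. The helper names
are hypothetical, and the check is only a heuristic: it assumes that a
pointer loaded from freed memory still has 0x6b in its upper seven bytes
after any small offset has been applied.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Byte written over freed slab objects (POISON_FREE in include/linux/poison.h). */
#define POISON_FREE 0x6bu

/*
 * Hypothetical helper: returns true when the upper seven bytes of a
 * register value are all 0x6b. The lowest byte is ignored so that values
 * such as poison-1 or poison+0x17 still match.
 */
bool looks_like_slab_poison(uint64_t value)
{
    for (int byte = 1; byte < 8; byte++) {
        if (((value >> (8 * byte)) & 0xffu) != POISON_FREE)
            return false;
    }
    return true;
}

/* Register values copied from the oops quoted in this thread. */
void check_oops_registers(void)
{
    assert(looks_like_slab_poison(0x6b6b6b6b6b6b6b6aULL));  /* RBX and RDI */
    assert(looks_like_slab_poison(0x6b6b6b6b6b6b6b82ULL));  /* R12 */
    assert(!looks_like_slab_poison(0xffff88003c418000ULL)); /* RAX, the task pointer */
}
```

RDI holds the faulting address here, which is consistent with the lock
being looked up through a pointer read out of already-freed memory.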


* Re: [origin tree boot failure] Re: [GIT PULL] core block bits for 2.6.37-rc1
  2010-10-23 15:29 ` [origin tree boot failure] " Ingo Molnar
  2010-10-23 15:42   ` Linus Torvalds
@ 2010-10-23 15:52   ` Ingo Molnar
  2010-10-23 16:51   ` Jens Axboe
  2 siblings, 0 replies; 11+ messages in thread
From: Ingo Molnar @ 2010-10-23 15:52 UTC (permalink / raw)
  To: Jens Axboe, Tejun Heo; +Cc: Linus Torvalds, linux-kernel


* Ingo Molnar <mingo@elte.hu> wrote:

> [  116.080274]  [<ffffffff8165dc53>] blk_cleanup_queue+0x63/0x68
> [  116.080274]  [<ffffffff84883c80>] ? hd_init+0x0/0x302
> [  116.080274]  [<ffffffff84883f54>] hd_init+0x2d4/0x302
> [  116.080274]  [<ffffffff81910778>] ? device_pm_unlock+0x15/0x17
> [  116.080274]  [<ffffffff84883c80>] ? hd_init+0x0/0x302
> [  116.080274]  [<ffffffff81002062>] do_one_initcall+0x57/0x15a
> [  116.080274]  [<ffffffff8482f78b>] kernel_init+0x194/0x222

Btw., another data point, the crash goes away when the ancient XT-HD driver is 
turned off:

 # CONFIG_BLK_DEV_HD is not set

I'm not sure whether the bug is limited to this driver alone though.

	Ingo


* Re: [origin tree boot failure] Re: [GIT PULL] core block bits for 2.6.37-rc1
  2010-10-23 15:29 ` [origin tree boot failure] " Ingo Molnar
  2010-10-23 15:42   ` Linus Torvalds
  2010-10-23 15:52   ` Ingo Molnar
@ 2010-10-23 16:51   ` Jens Axboe
  2010-10-23 17:17     ` Jens Axboe
  2 siblings, 1 reply; 11+ messages in thread
From: Jens Axboe @ 2010-10-23 16:51 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: Tejun Heo, Linus Torvalds, linux-kernel, Vivek Goyal

On 2010-10-23 17:29, Ingo Molnar wrote:
> 
> Hi,
> 
> * Jens Axboe <jaxboe@fusionio.com> wrote:
> 
>> Hi Linus,
>>
>> This first pull request is the core bits, meaning general
>> block layer changes or core support. Should be clean this time,
>> only 'weird bit' is the seemingly duplicate entry from Malahal.
>> This is caused by the first patch being buggy (and later
>> reverted), second patch used the same single line description.
>>
>> Nothing really exciting in here. A good collection of fixes, some of
>> which are marked for stable as well.
>>
>> The biggest addition this time around is the block IO throttling support
>> from Vivek.
> 
> The upstream block bits pulled in this merge window (or maybe the workqueue bits) 
> are possibly the cause of a boot crash on today's -tip, using a trivial x86 bootup test
> (64-bit allyesconfig):
> 
> [  116.064281] calling  hd_init+0x0/0x302 @ 1
> [  116.068529] hd: no drives specified - use hd=cyl,head,sectors on kernel command line
> [  116.076334] general protection fault: 0000 [#1] SMP DEBUG_PAGEALLOC
> [  116.080274] last sysfs file: 
> [  116.080274] CPU 0 
> [  116.080274] Modules linked in:
> [  116.080274] 
> [  116.080274] Pid: 1, comm: swapper Tainted: G        W   2.6.36-tip-03555-g825d9ec-dirty #51843 A8N-E/System Product Name
> [  116.080274] RIP: 0010:[<ffffffff81064380>]  [<ffffffff81064380>] __ticket_spin_trylock+0x4/0x21
> [  116.080274] RSP: 0018:ffff88003c417c10  EFLAGS: 00010082
> [  116.080274] RAX: ffff88003c418000 RBX: 6b6b6b6b6b6b6b6a RCX: 0000000000000000
> [  116.080274] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 6b6b6b6b6b6b6b6a
> [  116.080274] RBP: ffff88003c417c10 R08: 0000000000000002 R09: 0000000000000001
> [  116.080274] R10: 0000000000000286 R11: ffff880032498738 R12: 6b6b6b6b6b6b6b82
> [  116.080274] R13: 0000000000000286 R14: 6b6b6b6b6b6b6b6b R15: 0000000000000001
> [  116.080274] FS:  0000000000000000(0000) GS:ffff88003e200000(0000) knlGS:0000000000000000
> [  116.080274] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> [  116.080274] CR2: 0000000000000000 CR3: 0000000004071000 CR4: 00000000000006f0
> [  116.080274] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [  116.080274] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> [  116.080274] Process swapper (pid: 1, threadinfo ffff88003c416000, task ffff88003c418000)
> [  116.080274] Stack:
> [  116.080274]  ffff88003c417c30 ffffffff8168c6ee 6b6b6b6b6b6b6b6b 6b6b6b6b6b6b6b6a
> [  116.080274] <0> ffff88003c417c70 ffffffff82d37a20 ffffffff810a1b65 ffff88003c418000
> [  116.080274] <0> ffffffff82d3836b 6b6b6b6b6b6b6b6a ffff8800330fcc20 ffff88003c417cb8
> [  116.080274] Call Trace:
> [  116.080274]  [<ffffffff8168c6ee>] do_raw_spin_trylock+0x1f/0x41
> [  116.080274]  [<ffffffff82d37a20>] _raw_spin_lock_irqsave+0x72/0xa4
> [  116.080274]  [<ffffffff810a1b65>] ? lock_timer_base+0x2c/0x52
> [  116.080274]  [<ffffffff82d3836b>] ? _raw_spin_unlock_irqrestore+0x55/0x72
> [  116.080274]  [<ffffffff810a1b65>] lock_timer_base+0x2c/0x52
> [  116.080274]  [<ffffffff810a1c43>] del_timer+0x2f/0x82
> [  116.080274]  [<ffffffff810ac906>] ? wait_on_work+0x0/0xdb
> [  116.080274]  [<ffffffff810aca18>] __cancel_work_timer+0x37/0x130
> [  116.080274]  [<ffffffff810acb23>] cancel_delayed_work_sync+0x12/0x14
> [  116.080274]  [<ffffffff8166974a>] throtl_shutdown_timer_wq+0x1c/0x1e
> [  116.080274]  [<ffffffff8165dbec>] blk_sync_queue+0x3d/0x41
> [  116.080274]  [<ffffffff8165f872>] blk_release_queue+0x1e/0x6a
> [  116.080274]  [<ffffffff81673ce3>] kobject_release+0xf4/0x122
> [  116.080274]  [<ffffffff81673bef>] ? kobject_release+0x0/0x122
> [  116.080274]  [<ffffffff81674e7e>] kref_put+0x43/0x4d
> [  116.080274]  [<ffffffff81673b46>] kobject_put+0x47/0x4c
> [  116.080274]  [<ffffffff8165dc53>] blk_cleanup_queue+0x63/0x68
> [  116.080274]  [<ffffffff84883c80>] ? hd_init+0x0/0x302
> [  116.080274]  [<ffffffff84883f54>] hd_init+0x2d4/0x302
> [  116.080274]  [<ffffffff81910778>] ? device_pm_unlock+0x15/0x17
> [  116.080274]  [<ffffffff84883c80>] ? hd_init+0x0/0x302
> [  116.080274]  [<ffffffff81002062>] do_one_initcall+0x57/0x15a
> [  116.080274]  [<ffffffff8482f78b>] kernel_init+0x194/0x222
> [  116.080274]  [<ffffffff8103ad04>] kernel_thread_helper+0x4/0x10
> [  116.080274]  [<ffffffff82d38710>] ? restore_args+0x0/0x30
> [  116.080274]  [<ffffffff8482f5f7>] ? kernel_init+0x0/0x222
> [  116.080274]  [<ffffffff8103ad00>] ? kernel_thread_helper+0x0/0x10
> [  116.080274] Code: ff ff c9 c3 90 90 90 55 b8 00 00 01 00 48 89 e5 f0 0f c1 07 0f b7 d0 c1 e8 10 39 c2 74 07 f3 90 0f b7 17 eb f5 c9 c3 55 48 89 e5 <8b> 07 89 c2 c1 c0 10 39 c2 8d 90 00 00 01 00 75 04 f0 0f b1 17 

Looks like a fairly straightforward case of uninitialized memory and
blk_sync_queue() -> throtl_shutdown_timer_wq() ->
cancel_delayed_work_sync().

Will get that fixed up.

-- 
Jens Axboe



* Re: [origin tree boot failure] Re: [GIT PULL] core block bits for 2.6.37-rc1
  2010-10-23 16:51   ` Jens Axboe
@ 2010-10-23 17:17     ` Jens Axboe
  2010-10-23 18:21       ` Ingo Molnar
  2010-10-24  5:48       ` [origin tree boot failure] Re: [GIT PULL] core block bits for 2.6.37-rc1 Vivek Goyal
  0 siblings, 2 replies; 11+ messages in thread
From: Jens Axboe @ 2010-10-23 17:17 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: Tejun Heo, Linus Torvalds, linux-kernel, Vivek Goyal

On 2010-10-23 18:51, Jens Axboe wrote:
> On 2010-10-23 17:29, Ingo Molnar wrote:
>>
>> Hi,
>>
>> * Jens Axboe <jaxboe@fusionio.com> wrote:
>>
>>> Hi Linus,
>>>
>>> This first pull request is the core bits, meaning general
>>> block layer changes or core support. Should be clean this time,
>>> only 'weird bit' is the seemingly duplicate entry from Malahal.
>>> This is caused by the first patch being buggy (and later
>>> reverted), second patch used the same single line description.
>>>
>>> Nothing really exciting in here. A good collection of fixes, some of
>>> which are marked for stable as well.
>>>
>>> The biggest addition this time around is the block IO throttling support
>>> from Vivek.
>>
>> The upstream block bits pulled in this merge window (or maybe the workqueue bits) 
>> are possibly the cause of a boot crash on today's -tip, using a trivial x86 bootup test
>> (64-bit allyesconfig):
>>
>> [  116.064281] calling  hd_init+0x0/0x302 @ 1
>> [  116.068529] hd: no drives specified - use hd=cyl,head,sectors on kernel command line
>> [  116.076334] general protection fault: 0000 [#1] SMP DEBUG_PAGEALLOC
>> [  116.080274] last sysfs file: 
>> [  116.080274] CPU 0 
>> [  116.080274] Modules linked in:
>> [  116.080274] 
>> [  116.080274] Pid: 1, comm: swapper Tainted: G        W   2.6.36-tip-03555-g825d9ec-dirty #51843 A8N-E/System Product Name
>> [  116.080274] RIP: 0010:[<ffffffff81064380>]  [<ffffffff81064380>] __ticket_spin_trylock+0x4/0x21
>> [  116.080274] RSP: 0018:ffff88003c417c10  EFLAGS: 00010082
>> [  116.080274] RAX: ffff88003c418000 RBX: 6b6b6b6b6b6b6b6a RCX: 0000000000000000
>> [  116.080274] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 6b6b6b6b6b6b6b6a
>> [  116.080274] RBP: ffff88003c417c10 R08: 0000000000000002 R09: 0000000000000001
>> [  116.080274] R10: 0000000000000286 R11: ffff880032498738 R12: 6b6b6b6b6b6b6b82
>> [  116.080274] R13: 0000000000000286 R14: 6b6b6b6b6b6b6b6b R15: 0000000000000001
>> [  116.080274] FS:  0000000000000000(0000) GS:ffff88003e200000(0000) knlGS:0000000000000000
>> [  116.080274] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
>> [  116.080274] CR2: 0000000000000000 CR3: 0000000004071000 CR4: 00000000000006f0
>> [  116.080274] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>> [  116.080274] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
>> [  116.080274] Process swapper (pid: 1, threadinfo ffff88003c416000, task ffff88003c418000)
>> [  116.080274] Stack:
>> [  116.080274]  ffff88003c417c30 ffffffff8168c6ee 6b6b6b6b6b6b6b6b 6b6b6b6b6b6b6b6a
>> [  116.080274] <0> ffff88003c417c70 ffffffff82d37a20 ffffffff810a1b65 ffff88003c418000
>> [  116.080274] <0> ffffffff82d3836b 6b6b6b6b6b6b6b6a ffff8800330fcc20 ffff88003c417cb8
>> [  116.080274] Call Trace:
>> [  116.080274]  [<ffffffff8168c6ee>] do_raw_spin_trylock+0x1f/0x41
>> [  116.080274]  [<ffffffff82d37a20>] _raw_spin_lock_irqsave+0x72/0xa4
>> [  116.080274]  [<ffffffff810a1b65>] ? lock_timer_base+0x2c/0x52
>> [  116.080274]  [<ffffffff82d3836b>] ? _raw_spin_unlock_irqrestore+0x55/0x72
>> [  116.080274]  [<ffffffff810a1b65>] lock_timer_base+0x2c/0x52
>> [  116.080274]  [<ffffffff810a1c43>] del_timer+0x2f/0x82
>> [  116.080274]  [<ffffffff810ac906>] ? wait_on_work+0x0/0xdb
>> [  116.080274]  [<ffffffff810aca18>] __cancel_work_timer+0x37/0x130
>> [  116.080274]  [<ffffffff810acb23>] cancel_delayed_work_sync+0x12/0x14
>> [  116.080274]  [<ffffffff8166974a>] throtl_shutdown_timer_wq+0x1c/0x1e
>> [  116.080274]  [<ffffffff8165dbec>] blk_sync_queue+0x3d/0x41
>> [  116.080274]  [<ffffffff8165f872>] blk_release_queue+0x1e/0x6a
>> [  116.080274]  [<ffffffff81673ce3>] kobject_release+0xf4/0x122
>> [  116.080274]  [<ffffffff81673bef>] ? kobject_release+0x0/0x122
>> [  116.080274]  [<ffffffff81674e7e>] kref_put+0x43/0x4d
>> [  116.080274]  [<ffffffff81673b46>] kobject_put+0x47/0x4c
>> [  116.080274]  [<ffffffff8165dc53>] blk_cleanup_queue+0x63/0x68
>> [  116.080274]  [<ffffffff84883c80>] ? hd_init+0x0/0x302
>> [  116.080274]  [<ffffffff84883f54>] hd_init+0x2d4/0x302
>> [  116.080274]  [<ffffffff81910778>] ? device_pm_unlock+0x15/0x17
>> [  116.080274]  [<ffffffff84883c80>] ? hd_init+0x0/0x302
>> [  116.080274]  [<ffffffff81002062>] do_one_initcall+0x57/0x15a
>> [  116.080274]  [<ffffffff8482f78b>] kernel_init+0x194/0x222
>> [  116.080274]  [<ffffffff8103ad04>] kernel_thread_helper+0x4/0x10
>> [  116.080274]  [<ffffffff82d38710>] ? restore_args+0x0/0x30
>> [  116.080274]  [<ffffffff8482f5f7>] ? kernel_init+0x0/0x222
>> [  116.080274]  [<ffffffff8103ad00>] ? kernel_thread_helper+0x0/0x10
>> [  116.080274] Code: ff ff c9 c3 90 90 90 55 b8 00 00 01 00 48 89 e5 f0 0f c1 07 0f b7 d0 c1 e8 10 39 c2 74 07 f3 90 0f b7 17 eb f5 c9 c3 55 48 89 e5 <8b> 07 89 c2 c1 c0 10 39 c2 8d 90 00 00 01 00 75 04 f0 0f b1 17 
> 
> Looks like a fairly straightforward case of uninitialized memory and
> blk_sync_queue() -> throtl_shutdown_timer_wq() ->
> cancel_delayed_work_sync().
> 
> Will get that fixed up.

It frees q->td in blk_cleanup_queue(), but doesn't clear q->td. When the
final put happens, blk_sync_queue() is called and then ends up doing the
cancel_delayed_work_sync() on freed memory.

Two possible fixes:

- Clear ->td when the queue is going dead. May require other ->td == NULL
  checks in the code, so I opted for:

- Move the free to when the queue is really going away, post doing the
  blk_sync_queue() call.

The below should fix it.

Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

diff --git a/block/blk-core.c b/block/blk-core.c
index 4514146..51efd83 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -462,8 +462,6 @@ void blk_cleanup_queue(struct request_queue *q)
 	if (q->elevator)
 		elevator_exit(q->elevator);
 
-	blk_throtl_exit(q);
-
 	blk_put_queue(q);
 }
 EXPORT_SYMBOL(blk_cleanup_queue);
diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c
index da8a8a4..013457f 100644
--- a/block/blk-sysfs.c
+++ b/block/blk-sysfs.c
@@ -471,6 +471,8 @@ static void blk_release_queue(struct kobject *kobj)
 
 	blk_sync_queue(q);
 
+	blk_throtl_exit(q);
+
 	if (rl->rq_pool)
 		mempool_destroy(rl->rq_pool);
 


-- 
Jens Axboe
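
[Editorial sketch, not part of the original message.] The ordering chosen
by the patch above can be modelled in plain userspace C. Every name below
is a simplified, hypothetical stand-in for the kernel function named in
its comment, and the reference counting is reduced to a bare integer. The
only point is that the throttle data now stays alive until the final put
has finished syncing the queue.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Simplified stand-ins; these are not the kernel's structures. */
struct throtl_data_model {
    bool work_pending;
};

struct queue_model {
    struct throtl_data_model *td;
    int refcount;
};

/* Allocates the throttle data and takes the initial reference. */
bool model_init_queue(struct queue_model *q)
{
    q->refcount = 1;
    q->td = calloc(1, sizeof(*q->td));
    if (q->td == NULL)
        return false;
    q->td->work_pending = true;
    return true;
}

/* Stands in for blk_sync_queue(): it dereferences td, so td must be alive. */
void model_sync_queue(struct queue_model *q)
{
    assert(q->td != NULL);
    q->td->work_pending = false;
}

/* Stands in for blk_release_queue() after the fix: sync first, free second. */
void model_release_queue(struct queue_model *q)
{
    model_sync_queue(q);
    free(q->td);
    q->td = NULL;
}

/* Stands in for blk_put_queue(): returns true when this was the final put. */
bool model_put_queue(struct queue_model *q)
{
    if (--q->refcount > 0)
        return false;
    model_release_queue(q);
    return true;
}

/* Stands in for blk_cleanup_queue() after the fix: td is no longer freed here. */
bool model_cleanup_queue(struct queue_model *q)
{
    return model_put_queue(q);
}
```

With a second reference held, model_cleanup_queue() leaves td untouched,
and only the later model_put_queue() syncs and frees it. Before the patch
the free happened in the cleanup step, so the sync in the release step
operated on freed memory.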



* Re: [origin tree boot failure] Re: [GIT PULL] core block bits for 2.6.37-rc1
  2010-10-23 17:17     ` Jens Axboe
@ 2010-10-23 18:21       ` Ingo Molnar
  2010-10-23 18:43         ` [GIT PULL] Throtl bug (was Re: [origin tree boot failure] Re: [GIT PULL] core block bits for 2.6.37-rc1) Jens Axboe
  2010-10-24  5:48       ` [origin tree boot failure] Re: [GIT PULL] core block bits for 2.6.37-rc1 Vivek Goyal
  1 sibling, 1 reply; 11+ messages in thread
From: Ingo Molnar @ 2010-10-23 18:21 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Tejun Heo, Linus Torvalds, linux-kernel, Vivek Goyal


* Jens Axboe <jaxboe@fusionio.com> wrote:

> > Looks like a fairly straightforward case of uninitialized memory and
> > blk_sync_queue() -> throtl_shutdown_timer_wq() -> cancel_delayed_work_sync().
> > 
> > Will get that fixed up.
> 
> It frees q->td in blk_cleanup_queue(), but doesn't clear q->td. When the final put 
> happens, blk_sync_queue() is called and then ends up doing the 
> cancel_delayed_work_sync() on freed memory.
> 
> Two possible fixes:
> 
> - Clear ->td when the queue is going dead. May require other ->td == NULL
>   checks in the code, so I opted for:
> 
> - Move the free to when the queue is really going away, post doing the
>   blk_sync_queue() call.
> 
> The below should fix it.
> 
> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

This did the trick, thanks Jens!

Tested-by: Ingo Molnar <mingo@elte.hu>

	Ingo


* [GIT PULL] Throtl bug (was Re: [origin tree boot failure] Re: [GIT PULL] core block bits for  2.6.37-rc1)
  2010-10-23 18:21       ` Ingo Molnar
@ 2010-10-23 18:43         ` Jens Axboe
  2010-10-23 20:33           ` Maxim Levitsky
  0 siblings, 1 reply; 11+ messages in thread
From: Jens Axboe @ 2010-10-23 18:43 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: Tejun Heo, Linus Torvalds, linux-kernel, Vivek Goyal

On 2010-10-23 20:21, Ingo Molnar wrote:
> 
> * Jens Axboe <jaxboe@fusionio.com> wrote:
> 
>>> Looks like a fairly straightforward case of uninitialized memory and
>>> blk_sync_queue() -> throtl_shutdown_timer_wq() -> cancel_delayed_work_sync().
>>>
>>> Will get that fixed up.
>>
>> It frees q->td in blk_cleanup_queue(), but doesn't clear q->td. When the final put 
>> happens, blk_sync_queue() is called and then ends up doing the 
>> cancel_delayed_work_sync() on freed memory.
>>
>> Two possible fixes:
>>
>> - Clear ->td when the queue is going dead. May require other ->td == NULL
>>   checks in the code, so I opted for:
>>
>> - Move the free to when the queue is really going away, post doing the
>>   blk_sync_queue() call.
>>
>> The below should fix it.
>>
>> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
> 
> This did the trick, thanks Jens!

Great, thanks for testing/reporting! I added your reported/tested-by.

Linus, please pull this single fix, better get this out the door since
I'll be travelling very shortly.


  git://git.kernel.dk/linux-2.6-block.git for-2.6.37/core

Jens Axboe (1):
      block: fix use-after-free bug in blk throttle code

 block/blk-core.c  |    2 --
 block/blk-sysfs.c |    2 ++
 2 files changed, 2 insertions(+), 2 deletions(-)

-- 
Jens Axboe



* Re: [GIT PULL] Throtl bug (was Re: [origin tree boot failure] Re: [GIT PULL] core block bits for  2.6.37-rc1)
  2010-10-23 18:43         ` [GIT PULL] Throtl bug (was Re: [origin tree boot failure] Re: [GIT PULL] core block bits for 2.6.37-rc1) Jens Axboe
@ 2010-10-23 20:33           ` Maxim Levitsky
  2010-10-24  6:15             ` Vivek Goyal
  0 siblings, 1 reply; 11+ messages in thread
From: Maxim Levitsky @ 2010-10-23 20:33 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Ingo Molnar, Tejun Heo, Linus Torvalds, linux-kernel, Vivek Goyal

On Sat, 2010-10-23 at 20:43 +0200, Jens Axboe wrote:
> On 2010-10-23 20:21, Ingo Molnar wrote:
> > 
> > * Jens Axboe <jaxboe@fusionio.com> wrote:
> > 
> >>> Looks like a fairly straight forward case of uninitialized memory and 
> >>> blk_sync_queue() -> throtl_shutdown_timer() -> cancel_delayed_work_sync().
> >>>
> >>> Will get that fixed up.
> >>
> >> It frees q->td in blk_cleanup_queue(), but doesn't clear q->td. When the final put 
> >> happens, blk_sync_queue() is called and then ends up doing the 
> >> cancel_delayed_work_sync() on freed memory.
> >>
> >> Two possible fixes:
> >>
> >> - Clear ->td when the queue is goin dead. May require other ->td == NULL
> >>   checks in the code, so I opted for:
> >>
> >> - Move the free to when the queue is really going away, post doing the
> >>   blk_sync_queue() call.
> >>
> >> The below should fix it.
> >>
> >> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
> > 
> > This did the trick, thanks Jens!
> 
> Great, thanks for testing/reporting! I added your reported/tested-by.
> 
> Linus, please pull this single fix, better get this out the door since
> I'll be travelling very shortly.
> 
> 
>   git://git.kernel.dk/linux-2.6-block.git for-2.6.37/core
> 
> Jens Axboe (1):
>       block: fix use-after-free bug in blk throttle code
> 
>  block/blk-core.c  |    2 --
>  block/blk-sysfs.c |    2 ++
>  2 files changed, 2 insertions(+), 2 deletions(-)
> 
I am seeing a very similar bug here.
It must have been caused by this patch series.
I pulled that tree, but it made no difference.

The system oopses/panics on removal of any hotpluggable device
(reproduced with xD, MemoryStick, and USB mass storage).

Here is the backtrace for a MemoryStick card:

<6>[   24.138665] r592: IRQ: card removed
<1>[   24.228293] BUG: unable to handle kernel NULL pointer dereference at 00000000000001f8
<1>[   24.228966] IP: [<00000000000001f8>] 0x1f8
<4>[   24.230739] PGD 0 
<0>[   24.231182] Oops: 0010 [#1] PREEMPT SMP 
<0>[   24.231182] last sysfs file: /sys/devices/pci0000:00/0000:00:1f.2/host0/target0:0:0/0:0:0:0/block/sda/sda3/alignment_offset
<4>[   24.231182] CPU 1 
<4>[   24.231182] Modules linked in: dm_crypt firewire_net usb_storage usb_libusual cpufreq_powersave cpufreq_conservative cpufreq_userspace uvcvideo videodev v4l2_compat_ioctl32 acpi_cpufreq iwl3945 iwlcore snd_hda_codec_realtek mac80211 mperf r852 iTCO_wdt coretemp uhci_hcd sm_common ir_lirc_codec mspro_block snd_hda_intel ms_block ehci_hcd sdhci_pci lirc_dev joydev sbp2 nand snd_hda_codec cfg80211 firewire_ohci sdhci ir_sony_decoder ieee1394 nand_ids usbcore r592 ir_jvc_decoder snd_hwdep mmc_core nand_ecc ir_rc6_decoder ene_ir snd_pcm tg3 ir_rc5_decoder firewire_core mtd battery memstick ac ir_nec_decoder psmouse snd_page_alloc libphy sunrpc ir_core sg evdev serio_raw dm_mirror dm_region_hash dm_log dm_mod nouveau ttm drm_kms_helper drm i2c_algo_bit thermal video
<4>[   32.881606] 
<4>[   32.881606] Pid: 543, comm: kworker/u:4 Not tainted 2.6.36+ #191 Nettiling/Aspire 5720     
<4>[   32.881606] RIP: 0010:[<00000000000001f8>]  [<00000000000001f8>] 0x1f8
<4>[   32.881606] RSP: 0018:ffff880037a03ab8  EFLAGS: 00010086
<4>[   32.881606] RAX: ffff88007c0ebc00 RBX: ffff880037af9470 RCX: 0000000000000000
<4>[   32.881606] RDX: 0000000000000019 RSI: 0000000000000001 RDI: ffff880037af9470
<4>[   32.881606] RBP: ffff880037a03ad0 R08: 0000000000000000 R09: 0000000000000001
<4>[   32.881606] R10: 00000000000002f0 R11: 0000000000000000 R12: ffff880037af9470
<4>[   32.881606] R13: ffff880075d6a870 R14: ffff880075bfb560 R15: 0000000000000282
<4>[   32.881606] FS:  0000000000000000(0000) GS:ffff88007f200000(0000) knlGS:0000000000000000
<4>[   32.881606] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
<4>[   32.881606] CR2: 00000000000001f8 CR3: 000000007a046000 CR4: 00000000000006e0
<4>[   32.881606] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
<4>[   32.881606] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
<4>[   32.881606] Process kworker/u:4 (pid: 543, threadinfo ffff880037a02000, task ffff88007c5b0000)
<0>[   32.881606] Stack:
<4>[   32.881606]  ffffffff811c42a2 ffff880037a03af0 ffff880037af9470 ffff880037a03af0
<4>[   32.881606] <0> ffffffff811c525a ffff880077250040 ffff880077250040 ffff880037a03b10
<4>[   32.881606] <0> ffffffff811cebb2 ffff880075d6a800 ffff880075d6a8a8 ffff880037a03b30
<0>[   32.881606] Call Trace:
<4>[   32.881606]  [<ffffffff811c42a2>] ? elv_drain_elevator+0x22/0x70
<4>[   32.881606]  [<ffffffff811c525a>] elv_quiesce_start+0x3a/0xc0
<4>[   32.881606]  [<ffffffff811cebb2>] disk_replace_part_tbl+0x42/0x70
<4>[   32.881606]  [<ffffffff811cec63>] disk_release+0x23/0x50
<4>[   32.881606]  [<ffffffff81273c42>] device_release+0x22/0x90
<4>[   32.881606]  [<ffffffff811daced>] kobject_release+0x8d/0x1a0
<4>[   32.881606]  [<ffffffff811dac60>] ? kobject_release+0x0/0x1a0
<4>[   32.881606]  [<ffffffff811dc257>] kref_put+0x37/0x70
<4>[   32.881606]  [<ffffffff811dab67>] kobject_put+0x27/0x60
<4>[   32.881606]  [<ffffffff811cef42>] put_disk+0x12/0x20
<4>[   32.881606]  [<ffffffffa0627663>] mspro_block_disk_release+0xa3/0xb0 [mspro_block]
<4>[   32.881606]  [<ffffffffa062773d>] mspro_block_remove+0xcd/0x140 [mspro_block]
<4>[   32.881606]  [<ffffffffa01d42b5>] memstick_device_remove+0x35/0x60 [memstick]
<4>[   32.881606]  [<ffffffff81277630>] __device_release_driver+0x70/0xe0
<4>[   32.881606]  [<ffffffff8127779a>] device_release_driver+0x2a/0x40
<4>[   32.881606]  [<ffffffff812769b5>] bus_remove_device+0xb5/0x120
<4>[   32.881606]  [<ffffffff81274817>] device_del+0x127/0x1d0
<4>[   32.881606]  [<ffffffff812748dd>] device_unregister+0x1d/0x60
<4>[   32.881606]  [<ffffffffa01d5071>] memstick_check+0x241/0x360 [memstick]
<4>[   32.881606]  [<ffffffff8105a740>] process_one_work+0x1c0/0x4d0
<4>[   32.881606]  [<ffffffff8105a6e2>] ? process_one_work+0x162/0x4d0
<4>[   32.881606]  [<ffffffffa01d4e30>] ? memstick_check+0x0/0x360 [memstick]
<4>[   32.881606]  [<ffffffff8105ae36>] worker_thread+0x156/0x410
<4>[   32.881606]  [<ffffffff8105ace0>] ? worker_thread+0x0/0x410
<4>[   32.881606]  [<ffffffff8105ed66>] kthread+0xb6/0xc0
<4>[   32.881606]  [<ffffffff81037fa6>] ? finish_task_switch+0x46/0xe0
<4>[   32.881606]  [<ffffffff81003c14>] kernel_thread_helper+0x4/0x10
<4>[   32.881606]  [<ffffffff8105ecb0>] ? kthread+0x0/0xc0
<4>[   32.881606]  [<ffffffff81003c10>] ? kernel_thread_helper+0x0/0x10
<0>[   32.881606] Code:  Bad RIP value.
<1>[   32.881606] RIP  [<00000000000001f8>] 0x1f8
<4>[   32.881606]  RSP <ffff880037a03ab8>
<0>[   32.881606] CR2: 00000000000001f8
<4>[   32.881606] ---[ end trace ca0206dec4457aff ]---

Best regards,
	Maxim Levitsky





* Re: [origin tree boot failure] Re: [GIT PULL] core block bits for 2.6.37-rc1
  2010-10-23 17:17     ` Jens Axboe
  2010-10-23 18:21       ` Ingo Molnar
@ 2010-10-24  5:48       ` Vivek Goyal
  1 sibling, 0 replies; 11+ messages in thread
From: Vivek Goyal @ 2010-10-24  5:48 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Ingo Molnar, Tejun Heo, Linus Torvalds, linux-kernel

On Sat, Oct 23, 2010 at 07:17:34PM +0200, Jens Axboe wrote:
> On 2010-10-23 18:51, Jens Axboe wrote:
> > On 2010-10-23 17:29, Ingo Molnar wrote:
> >>
> >> Hi,
> >>
> >> * Jens Axboe <jaxboe@fusionio.com> wrote:
> >>
> >>> Hi Linus,
> >>>
> >>> This first pull request is the core bits, meaning general
> >>> block layer changes or core support. Should be clean this time,
> >>> only 'weird bit' is the seemingly duplicate entry from Malahal.
> >>> This is caused by the first patch being buggy (and later
> >>> reverted), second patch used the same single line description.
> >>>
> >>> Nothing really exciting in here. A good collection of fixes, some of
> >>> which are marked for stable as well.
> >>>
> >>> The biggest addition this time around is the block IO throttling support
> >>> from Vivek.
> >>
> >> The upstream block bits pulled in this merge window (or maybe the workqueue bits) 
> >> are possibly the cause a boot crash on today's -tip, using a trivial x86 bootup test 
> >> (64-bit allyesconfig):
> >>
> >> [  116.064281] calling  hd_init+0x0/0x302 @ 1
> >> [  116.068529] hd: no drives specified - use hd=cyl,head,sectors on kernel command line
> >> [  116.076334] general protection fault: 0000 [#1] SMP DEBUG_PAGEALLOC
> >> [  116.080274] last sysfs file: 
> >> [  116.080274] CPU 0 
> >> [  116.080274] Modules linked in:
> >> [  116.080274] 
> >> [  116.080274] Pid: 1, comm: swapper Tainted: G        W   2.6.36-tip-03555-g825d9ec-dirty #51843 A8N-E/System Product Name
> >> [  116.080274] RIP: 0010:[<ffffffff81064380>]  [<ffffffff81064380>] __ticket_spin_trylock+0x4/0x21
> >> [  116.080274] RSP: 0018:ffff88003c417c10  EFLAGS: 00010082
> >> [  116.080274] RAX: ffff88003c418000 RBX: 6b6b6b6b6b6b6b6a RCX: 0000000000000000
> >> [  116.080274] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 6b6b6b6b6b6b6b6a
> >> [  116.080274] RBP: ffff88003c417c10 R08: 0000000000000002 R09: 0000000000000001
> >> [  116.080274] R10: 0000000000000286 R11: ffff880032498738 R12: 6b6b6b6b6b6b6b82
> >> [  116.080274] R13: 0000000000000286 R14: 6b6b6b6b6b6b6b6b R15: 0000000000000001
> >> [  116.080274] FS:  0000000000000000(0000) GS:ffff88003e200000(0000) knlGS:0000000000000000
> >> [  116.080274] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> >> [  116.080274] CR2: 0000000000000000 CR3: 0000000004071000 CR4: 00000000000006f0
> >> [  116.080274] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> >> [  116.080274] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> >> [  116.080274] Process swapper (pid: 1, threadinfo ffff88003c416000, task ffff88003c418000)
> >> [  116.080274] Stack:
> >> [  116.080274]  ffff88003c417c30 ffffffff8168c6ee 6b6b6b6b6b6b6b6b 6b6b6b6b6b6b6b6a
> >> [  116.080274] <0> ffff88003c417c70 ffffffff82d37a20 ffffffff810a1b65 ffff88003c418000
> >> [  116.080274] <0> ffffffff82d3836b 6b6b6b6b6b6b6b6a ffff8800330fcc20 ffff88003c417cb8
> >> [  116.080274] Call Trace:
> >> [  116.080274]  [<ffffffff8168c6ee>] do_raw_spin_trylock+0x1f/0x41
> >> [  116.080274]  [<ffffffff82d37a20>] _raw_spin_lock_irqsave+0x72/0xa4
> >> [  116.080274]  [<ffffffff810a1b65>] ? lock_timer_base+0x2c/0x52
> >> [  116.080274]  [<ffffffff82d3836b>] ? _raw_spin_unlock_irqrestore+0x55/0x72
> >> [  116.080274]  [<ffffffff810a1b65>] lock_timer_base+0x2c/0x52
> >> [  116.080274]  [<ffffffff810a1c43>] del_timer+0x2f/0x82
> >> [  116.080274]  [<ffffffff810ac906>] ? wait_on_work+0x0/0xdb
> >> [  116.080274]  [<ffffffff810aca18>] __cancel_work_timer+0x37/0x130
> >> [  116.080274]  [<ffffffff810acb23>] cancel_delayed_work_sync+0x12/0x14
> >> [  116.080274]  [<ffffffff8166974a>] throtl_shutdown_timer_wq+0x1c/0x1e
> >> [  116.080274]  [<ffffffff8165dbec>] blk_sync_queue+0x3d/0x41
> >> [  116.080274]  [<ffffffff8165f872>] blk_release_queue+0x1e/0x6a
> >> [  116.080274]  [<ffffffff81673ce3>] kobject_release+0xf4/0x122
> >> [  116.080274]  [<ffffffff81673bef>] ? kobject_release+0x0/0x122
> >> [  116.080274]  [<ffffffff81674e7e>] kref_put+0x43/0x4d
> >> [  116.080274]  [<ffffffff81673b46>] kobject_put+0x47/0x4c
> >> [  116.080274]  [<ffffffff8165dc53>] blk_cleanup_queue+0x63/0x68
> >> [  116.080274]  [<ffffffff84883c80>] ? hd_init+0x0/0x302
> >> [  116.080274]  [<ffffffff84883f54>] hd_init+0x2d4/0x302
> >> [  116.080274]  [<ffffffff81910778>] ? device_pm_unlock+0x15/0x17
> >> [  116.080274]  [<ffffffff84883c80>] ? hd_init+0x0/0x302
> >> [  116.080274]  [<ffffffff81002062>] do_one_initcall+0x57/0x15a
> >> [  116.080274]  [<ffffffff8482f78b>] kernel_init+0x194/0x222
> >> [  116.080274]  [<ffffffff8103ad04>] kernel_thread_helper+0x4/0x10
> >> [  116.080274]  [<ffffffff82d38710>] ? restore_args+0x0/0x30
> >> [  116.080274]  [<ffffffff8482f5f7>] ? kernel_init+0x0/0x222
> >> [  116.080274]  [<ffffffff8103ad00>] ? kernel_thread_helper+0x0/0x10
> >> [  116.080274] Code: ff ff c9 c3 90 90 90 55 b8 00 00 01 00 48 89 e5 f0 0f c1 07 0f b7 d0 c1 e8 10 39 c2 74 07 f3 90 0f b7 17 eb f5 c9 c3 55 48 89 e5 <8b> 07 89 c2 c1 c0 10 39 c2 8d 90 00 00 01 00 75 04 f0 0f b1 17 
> > 
> > Looks like a fairly straight forward case of uninitialized memory and
> > blk_sync_queue() -> throtl_shutdown_timer() ->
> > cancel_delayed_work_sync().
> > 
> > Will get that fixed up.
> 
> It frees q->td in blk_cleanup_queue(), but doesn't clear q->td. When the
> final put happens, blk_sync_queue() is called and then ends up doing the
> cancel_delayed_work_sync() on freed memory.
> 
> Two possible fixes:
> 
> - Clear ->td when the queue is goin dead. May require other ->td == NULL
>   checks in the code, so I opted for:
> 
> - Move the free to when the queue is really going away, post doing the
>   blk_sync_queue() call.
> 
> The below should fix it.
> 
> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
> 
> diff --git a/block/blk-core.c b/block/blk-core.c
> index 4514146..51efd83 100644
> --- a/block/blk-core.c
> +++ b/block/blk-core.c
> @@ -462,8 +462,6 @@ void blk_cleanup_queue(struct request_queue *q)
>  	if (q->elevator)
>  		elevator_exit(q->elevator);
>  
> -	blk_throtl_exit(q);
> -
>  	blk_put_queue(q);
>  }
>  EXPORT_SYMBOL(blk_cleanup_queue);
> diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c
> index da8a8a4..013457f 100644
> --- a/block/blk-sysfs.c
> +++ b/block/blk-sysfs.c
> @@ -471,6 +471,8 @@ static void blk_release_queue(struct kobject *kobj)
>  
>  	blk_sync_queue(q);
>  
> +	blk_throtl_exit(q);
> +
>  	if (rl->rq_pool)
>  		mempool_destroy(rl->rq_pool);

Thanks for the fix, Jens. I had tested pulling a USB key out of a running
system to exercise the hot-remove/blk_cleanup_queue() path, and I am not
sure why I didn't catch this.

I have one small concern here. blk_throtl_exit() takes the request queue
spin lock and relies on q->queue_lock still being around.

IIUC, in blk_release_queue() there is no guarantee that the driver has not
already freed the memory associated with the spin lock (if it is a
driver-provided spin lock).

Checking for q->td in throtl_shutdown_timer_wq() might be a fix, but it
has the potential to be racy: throtl_shutdown_timer_wq() does not take the
spin lock, and I guess it can't take the spin lock to check q->td, since it
is called in the blk_release_queue() -> blk_sync_queue() path, where it is
not guaranteed that the spin lock is still around.

So maybe we need to come up with a way to make sure the driver does not
release the queue lock until all users of the queue are gone, so that one
can safely assume q->queue_lock is valid in blk_release_queue().

Or maybe make q->td RCU protected. It is already spin lock protected, and
it will become somewhat messy to access it under the RCU lock in
throtl_shutdown_timer_wq() and under q->queue_lock everywhere else.
Of course, freeing q->td would then happen after waiting through call_rcu().
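
To make the ordering concrete, here is a minimal user-space sketch in C of
the teardown sequence under discussion. It is a toy model, not kernel code:
toy_queue, toy_td and every toy_* function are hypothetical stand-ins for
the request queue, q->td, blk_sync_queue(), blk_throtl_exit(),
blk_release_queue() and blk_cleanup_queue(), and all locking (the actual
concern above) is left out. It only shows that the sync step finds a live
td when the free is done in the release path, after the sync.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for the throttle data hanging off q->td. */
struct toy_td {
	bool work_pending;
};

/* Hypothetical stand-in for struct request_queue. */
struct toy_queue {
	struct toy_td *td;
	int refs;
};

/* Counts how often the sync step found a live td to work on. */
static int toy_syncs_on_live_td;

static struct toy_queue *toy_alloc_queue(void)
{
	struct toy_queue *q = calloc(1, sizeof(*q));

	if (!q)
		return NULL;
	q->td = calloc(1, sizeof(*q->td));
	if (!q->td) {
		free(q);
		return NULL;
	}
	q->td->work_pending = true;
	q->refs = 1;
	return q;
}

/* Models blk_sync_queue(): it dereferences q->td, so td must be alive. */
static void toy_sync_queue(struct toy_queue *q)
{
	if (q->td) {
		q->td->work_pending = false;
		toy_syncs_on_live_td++;
	}
}

/* Models blk_throtl_exit(); this toy also clears the pointer after freeing. */
static void toy_throtl_exit(struct toy_queue *q)
{
	free(q->td);
	q->td = NULL;
}

/* Models blk_release_queue() with the patch: sync first, then free td. */
static void toy_release_queue(struct toy_queue *q)
{
	toy_sync_queue(q);
	toy_throtl_exit(q);
	free(q);
}

/* Drops one reference; returns true if this was the final put. */
static bool toy_put_queue(struct toy_queue *q)
{
	if (--q->refs > 0)
		return false;
	toy_release_queue(q);
	return true;
}

/* Models blk_cleanup_queue() with the patch: no td free here, only a put. */
static bool toy_cleanup_queue(struct toy_queue *q)
{
	return toy_put_queue(q);
}
```

If another holder still has a reference when toy_cleanup_queue() runs, td
stays allocated until the final toy_put_queue(), which is the property the
patch relies on. The toy says nothing about whether q->queue_lock is still
valid at that point.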

Thanks
Vivek


* Re: [GIT PULL] Throtl bug (was Re: [origin tree boot failure] Re: [GIT PULL] core block bits for  2.6.37-rc1)
  2010-10-23 20:33           ` Maxim Levitsky
@ 2010-10-24  6:15             ` Vivek Goyal
  0 siblings, 0 replies; 11+ messages in thread
From: Vivek Goyal @ 2010-10-24  6:15 UTC (permalink / raw)
  To: Maxim Levitsky
  Cc: Jens Axboe, Ingo Molnar, Tejun Heo, Linus Torvalds, linux-kernel,
	Yasuaki Ishimatsu

On Sat, Oct 23, 2010 at 10:33:13PM +0200, Maxim Levitsky wrote:
> On Sat, 2010-10-23 at 20:43 +0200, Jens Axboe wrote:
> > On 2010-10-23 20:21, Ingo Molnar wrote:
> > > 
> > > * Jens Axboe <jaxboe@fusionio.com> wrote:
> > > 
> > >>> Looks like a fairly straight forward case of uninitialized memory and 
> > >>> blk_sync_queue() -> throtl_shutdown_timer() -> cancel_delayed_work_sync().
> > >>>
> > >>> Will get that fixed up.
> > >>
> > >> It frees q->td in blk_cleanup_queue(), but doesn't clear q->td. When the final put 
> > >> happens, blk_sync_queue() is called and then ends up doing the 
> > >> cancel_delayed_work_sync() on freed memory.
> > >>
> > >> Two possible fixes:
> > >>
> > >> - Clear ->td when the queue is goin dead. May require other ->td == NULL
> > >>   checks in the code, so I opted for:
> > >>
> > >> - Move the free to when the queue is really going away, post doing the
> > >>   blk_sync_queue() call.
> > >>
> > >> The below should fix it.
> > >>
> > >> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
> > > 
> > > This did the trick, thanks Jens!
> > 
> > Great, thanks for testing/reporting! I added your reported/tested-by.
> > 
> > Linus, please pull this single fix, better get this out the door since
> > I'll be travelling very shortly.
> > 
> > 
> >   git://git.kernel.dk/linux-2.6-block.git for-2.6.37/core
> > 
> > Jens Axboe (1):
> >       block: fix use-after-free bug in blk throttle code
> > 
> >  block/blk-core.c  |    2 --
> >  block/blk-sysfs.c |    2 ++
> >  2 files changed, 2 insertions(+), 2 deletions(-)
> > 
> I have here very similar bug.
> Must have been caused by this patch series.
> I pulled that tree, but that didn't affect anything.
> 
> System oopses/panics on removal of any hotplugable device.
> (reproduced with xD, MemoryStick, and USB mass storage).
> 
> Here is backtrace for MemoryStick card:
> 
> <6>[   24.138665] r592: IRQ: card removed
> <1>[   24.228293] BUG: unable to handle kernel NULL pointer dereference at 00000000000001f8
> <1>[   24.228966] IP: [<00000000000001f8>] 0x1f8
> <4>[   24.230739] PGD 0 
> <0>[   24.231182] Oops: 0010 [#1] PREEMPT SMP 
> <0>[   24.231182] last sysfs file: /sys/devices/pci0000:00/0000:00:1f.2/host0/target0:0:0/0:0:0:0/block/sda/sda3/alignment_offset
> <4>[   24.231182] CPU 1 
> <4>[   24.231182] Modules linked in: dm_crypt firewire_net usb_storage usb_libusual cpufreq_powersave cpufreq_conservative cpufreq_userspace uvcvideo videodev v4l2_compat_ioctl32 acpi_cpufreq iwl3945 iwlcore snd_hda_codec_realtek mac80211 mperf r852 iTCO_wdt coretemp uhci_hcd sm_common ir_lirc_codec mspro_block snd_hda_intel ms_block ehci_hcd sdhci_pci lirc_dev joydev sbp2 nand snd_hda_codec cfg80211 firewire_ohci sdhci ir_sony_decoder ieee1394 nand_ids usbcore r592 ir_jvc_decoder snd_hwdep mmc_core nand_ecc ir_rc6_decoder ene_ir snd_pcm tg3 ir_rc5_decoder firewire_core mtd battery memstick ac ir_nec_decoder psmouse snd_page_alloc libphy sunrpc ir_core sg evdev serio_raw dm_mirror dm_region_hash dm_log dm_mod nouveau ttm drm_kms_helper drm i2c_algo_bit thermal video
> <4>[   32.881606] 
> <4>[   32.881606] Pid: 543, comm: kworker/u:4 Not tainted 2.6.36+ #191 Nettiling/Aspire 5720     
> <4>[   32.881606] RIP: 0010:[<00000000000001f8>]  [<00000000000001f8>] 0x1f8
> <4>[   32.881606] RSP: 0018:ffff880037a03ab8  EFLAGS: 00010086
> <4>[   32.881606] RAX: ffff88007c0ebc00 RBX: ffff880037af9470 RCX: 0000000000000000
> <4>[   32.881606] RDX: 0000000000000019 RSI: 0000000000000001 RDI: ffff880037af9470
> <4>[   32.881606] RBP: ffff880037a03ad0 R08: 0000000000000000 R09: 0000000000000001
> <4>[   32.881606] R10: 00000000000002f0 R11: 0000000000000000 R12: ffff880037af9470
> <4>[   32.881606] R13: ffff880075d6a870 R14: ffff880075bfb560 R15: 0000000000000282
> <4>[   32.881606] FS:  0000000000000000(0000) GS:ffff88007f200000(0000) knlGS:0000000000000000
> <4>[   32.881606] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> <4>[   32.881606] CR2: 00000000000001f8 CR3: 000000007a046000 CR4: 00000000000006e0
> <4>[   32.881606] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> <4>[   32.881606] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> <4>[   32.881606] Process kworker/u:4 (pid: 543, threadinfo ffff880037a02000, task ffff88007c5b0000)
> <0>[   32.881606] Stack:
> <4>[   32.881606]  ffffffff811c42a2 ffff880037a03af0 ffff880037af9470 ffff880037a03af0
> <4>[   32.881606] <0> ffffffff811c525a ffff880077250040 ffff880077250040 ffff880037a03b10
> <4>[   32.881606] <0> ffffffff811cebb2 ffff880075d6a800 ffff880075d6a8a8 ffff880037a03b30
> <0>[   32.881606] Call Trace:
> <4>[   32.881606]  [<ffffffff811c42a2>] ? elv_drain_elevator+0x22/0x70
> <4>[   32.881606]  [<ffffffff811c525a>] elv_quiesce_start+0x3a/0xc0
> <4>[   32.881606]  [<ffffffff811cebb2>] disk_replace_part_tbl+0x42/0x70
> <4>[   32.881606]  [<ffffffff811cec63>] disk_release+0x23/0x50
> <4>[   32.881606]  [<ffffffff81273c42>] device_release+0x22/0x90
> <4>[   32.881606]  [<ffffffff811daced>] kobject_release+0x8d/0x1a0
> <4>[   32.881606]  [<ffffffff811dac60>] ? kobject_release+0x0/0x1a0
> <4>[   32.881606]  [<ffffffff811dc257>] kref_put+0x37/0x70
> <4>[   32.881606]  [<ffffffff811dab67>] kobject_put+0x27/0x60
> <4>[   32.881606]  [<ffffffff811cef42>] put_disk+0x12/0x20
> <4>[   32.881606]  [<ffffffffa0627663>] mspro_block_disk_release+0xa3/0xb0 [mspro_block]
> <4>[   32.881606]  [<ffffffffa062773d>] mspro_block_remove+0xcd/0x140 [mspro_block]
> <4>[   32.881606]  [<ffffffffa01d42b5>] memstick_device_remove+0x35/0x60 [memstick]
> <4>[   32.881606]  [<ffffffff81277630>] __device_release_driver+0x70/0xe0
> <4>[   32.881606]  [<ffffffff8127779a>] device_release_driver+0x2a/0x40
> <4>[   32.881606]  [<ffffffff812769b5>] bus_remove_device+0xb5/0x120
> <4>[   32.881606]  [<ffffffff81274817>] device_del+0x127/0x1d0
> <4>[   32.881606]  [<ffffffff812748dd>] device_unregister+0x1d/0x60
> <4>[   32.881606]  [<ffffffffa01d5071>] memstick_check+0x241/0x360 [memstick]
> <4>[   32.881606]  [<ffffffff8105a740>] process_one_work+0x1c0/0x4d0
> <4>[   32.881606]  [<ffffffff8105a6e2>] ? process_one_work+0x162/0x4d0
> <4>[   32.881606]  [<ffffffffa01d4e30>] ? memstick_check+0x0/0x360 [memstick]
> <4>[   32.881606]  [<ffffffff8105ae36>] worker_thread+0x156/0x410
> <4>[   32.881606]  [<ffffffff8105ace0>] ? worker_thread+0x0/0x410
> <4>[   32.881606]  [<ffffffff8105ed66>] kthread+0xb6/0xc0
> <4>[   32.881606]  [<ffffffff81037fa6>] ? finish_task_switch+0x46/0xe0
> <4>[   32.881606]  [<ffffffff81003c14>] kernel_thread_helper+0x4/0x10
> <4>[   32.881606]  [<ffffffff8105ecb0>] ? kthread+0x0/0xc0
> <4>[   32.881606]  [<ffffffff81003c10>] ? kernel_thread_helper+0x0/0x10
> <0>[   32.881606] Code:  Bad RIP value.
> <1>[   32.881606] RIP  [<00000000000001f8>] 0x1f8
> <4>[   32.881606]  RSP <ffff880037a03ab8>
> <0>[   32.881606] CR2: 00000000000001f8
> <4>[   32.881606] ---[ end trace ca0206dec4457aff ]---
> 

Looking at the backtrace and commit messages, it might be coming from the
following commit.

commit 7681bfeeccff5efa9eb29bf09249a3c400b15327
Author: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Date:   Tue Oct 19 09:05:00 2010 +0200

    block: fix accounting bug on cross partition merges


It looks like the request queue is freed in mspro_block_remove(), and
mspro_block_disk_release() is then called, which ends up accessing the
request queue in disk_replace_part_tbl(). So this is a use-after-free.
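
The hazard here is a general one: a release callback that dereferences
another object has to be sure that object is still alive. As a rough
user-space sketch in C (not the fix applied for this report; toy_rq,
toy_disk and all toy_* helpers are hypothetical names), one common remedy
is for the dependent object to hold its own reference on the queue and to
drop it only at the end of its release path:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a reference-counted request queue. */
struct toy_rq {
	int refs;
	int quiesce_calls;
};

/* Hypothetical stand-in for a gendisk that points at its queue. */
struct toy_disk {
	struct toy_rq *queue;
};

/* Counts how many queues have actually been freed. */
static int toy_rq_frees;

static struct toy_rq *toy_rq_alloc(void)
{
	struct toy_rq *q = calloc(1, sizeof(*q));

	if (q)
		q->refs = 1;
	return q;
}

static void toy_rq_get(struct toy_rq *q)
{
	q->refs++;
}

static void toy_rq_put(struct toy_rq *q)
{
	if (--q->refs == 0) {
		toy_rq_frees++;
		free(q);
	}
}

/* The disk pins the queue for as long as the disk itself exists. */
static struct toy_disk *toy_disk_alloc(struct toy_rq *q)
{
	struct toy_disk *d = malloc(sizeof(*d));

	if (!d)
		return NULL;
	toy_rq_get(q);
	d->queue = q;
	return d;
}

/* Release path that still needs the queue, then drops the disk's pin. */
static void toy_disk_release(struct toy_disk *d)
{
	d->queue->quiesce_calls++;	/* safe: the disk's reference keeps it alive */
	toy_rq_put(d->queue);
	free(d);
}
```

With this shape the driver can drop its own queue reference before the
disk is released, and the queue is only freed once toy_disk_release() has
finished using it.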
 
CCing Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>.

Thanks
Vivek

