* [Cluster-devel] [PATCH] dlm: Allow large nodeids
@ 2009-01-27 10:44 Chrissie Caulfield
  2009-01-27 11:33 ` Chrissie Caulfield
  0 siblings, 1 reply; 8+ messages in thread
From: Chrissie Caulfield @ 2009-01-27 10:44 UTC (permalink / raw)
  To: cluster-devel.redhat.com

This patch changes DLM to use its own hash table rather than the idr_
code. It should allow clusters with large nodeids to work correctly.

It also fixes the 1..max_nodeid loops when the DLM shuts down.

This is a slightly different patch to the one I posted to IRC yesterday,
but all I've changed is the use of list_for_each_entry() rather than just
list_for_each().
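
For readers without the attachment, the idea of a private hash table keyed by
nodeid can be sketched in userspace C as below. This is only an illustration,
not the code from the patch: the names conn, conn_hash, nodeid_hash, find_conn,
add_conn and count_conns, and the bucket count of 32, are all assumptions made
for the example. It shows that a lookup costs the same for any nodeid, however
large, and that walking the buckets replaces a 1..max_nodeid loop.

```c
#include <stdlib.h>

/* All names here are hypothetical; this is not the code from the patch. */
#define CONN_HASH_SIZE 32	/* assumed power-of-two bucket count */

struct conn {
	int nodeid;
	struct conn *next;	/* chain of entries sharing one bucket */
};

static struct conn *conn_hash[CONN_HASH_SIZE];

/* Map any nodeid, however large, onto a bucket index. */
static unsigned int nodeid_hash(int nodeid)
{
	return (unsigned int)nodeid & (CONN_HASH_SIZE - 1);
}

/* Return the connection for nodeid, or NULL if there is none. */
static struct conn *find_conn(int nodeid)
{
	struct conn *c;

	for (c = conn_hash[nodeid_hash(nodeid)]; c; c = c->next)
		if (c->nodeid == nodeid)
			return c;
	return NULL;
}

/* Return the existing connection for nodeid, or allocate and link a new one. */
static struct conn *add_conn(int nodeid)
{
	struct conn *c = find_conn(nodeid);
	unsigned int h = nodeid_hash(nodeid);

	if (c)
		return c;
	c = calloc(1, sizeof(*c));
	if (!c)
		return NULL;
	c->nodeid = nodeid;
	c->next = conn_hash[h];
	conn_hash[h] = c;
	return c;
}

/* Visit every connection by walking the buckets, not by counting nodeids. */
static int count_conns(void)
{
	int i, n = 0;
	struct conn *c;

	for (i = 0; i < CONN_HASH_SIZE; i++)
		for (c = conn_hash[i]; c; c = c->next)
			n++;
	return n;
}
```

With this layout, nodeids 5 and 1000000037 land in the same bucket and are told
apart by the nodeid comparison in find_conn().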

Chrissie
-------------- next part --------------
A non-text attachment was scrubbed...
Name: dlm_large_nodeids.patch
Type: text/x-patch
Size: 8079 bytes
Desc: not available
URL: <http://listman.redhat.com/archives/cluster-devel/attachments/20090127/93af0d18/attachment.bin>


* [Cluster-devel] [PATCH] dlm: Allow large nodeids
  2009-01-27 10:44 [Cluster-devel] [PATCH] dlm: Allow large nodeids Chrissie Caulfield
@ 2009-01-27 11:33 ` Chrissie Caulfield
  2009-01-27 20:06   ` David Teigland
  0 siblings, 1 reply; 8+ messages in thread
From: Chrissie Caulfield @ 2009-01-27 11:33 UTC (permalink / raw)
  To: cluster-devel.redhat.com

This is an updated patch that uses hlists rather than list_heads to save
memory in the connection structure.

Thanks to Steven Whitehouse for the suggestion.
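
As a rough illustration of where hlists save space, the sketch below copies the
kernel's three list types into userspace and compares their sizes. The bucket
count CONN_HASH_SIZE and the two helper functions are assumptions for the
example only. With these definitions an hlist_node is the same size as a
list_head, so the saving shown here comes from the one-pointer hlist_head used
for each hash bucket.

```c
#include <stddef.h>

/* Userspace copies of the kernel's list types, for size comparison only. */
struct list_head  { struct list_head *next, *prev; };
struct hlist_node { struct hlist_node *next, **pprev; };
struct hlist_head { struct hlist_node *first; };

#define CONN_HASH_SIZE 32	/* assumed bucket count, illustration only */

/* Bytes needed for the bucket array if each bucket is a list_head. */
static size_t table_bytes_list(void)
{
	return CONN_HASH_SIZE * sizeof(struct list_head);
}

/* Bytes needed for the bucket array if each bucket is an hlist_head. */
static size_t table_bytes_hlist(void)
{
	return CONN_HASH_SIZE * sizeof(struct hlist_head);
}
```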


Chrissie Caulfield wrote:
> This patch changes DLM to use its own hash table rather than the idr_
> code. It should allow clusters with large nodeids to work correctly.
> 
> It also fixes the 1..max_nodeid loops when the DLM shuts down.
> 
> This is a slightly different patch to the one I posted to IRC yesterday,
> but all I've changed is the use of list_for_each_entry() rather than just
> list_for_each().



Chrissie
-------------- next part --------------
A non-text attachment was scrubbed...
Name: dlm_large_nodeids.patch
Type: text/x-patch
Size: 8111 bytes
Desc: not available
URL: <http://listman.redhat.com/archives/cluster-devel/attachments/20090127/982eeda4/attachment.bin>


* [Cluster-devel] [PATCH] dlm: Allow large nodeids
  2009-01-27 11:33 ` Chrissie Caulfield
@ 2009-01-27 20:06   ` David Teigland
  2009-01-27 20:19     ` David Teigland
  0 siblings, 1 reply; 8+ messages in thread
From: David Teigland @ 2009-01-27 20:06 UTC (permalink / raw)
  To: cluster-devel.redhat.com

On Tue, Jan 27, 2009 at 11:33:30AM +0000, Chrissie Caulfield wrote:
> This is an updated patch that uses hlists rather than list_heads to save
> memory in the connection structure.
> 
> Thanks to Steven Whitehouse for the suggestion.

I fixed some checkpatch warnings, tested, and pushed into the "next" branch.




* [Cluster-devel] [PATCH] dlm: Allow large nodeids
  2009-01-27 20:06   ` David Teigland
@ 2009-01-27 20:19     ` David Teigland
  2009-01-28 11:27       ` Chrissie Caulfield
  0 siblings, 1 reply; 8+ messages in thread
From: David Teigland @ 2009-01-27 20:19 UTC (permalink / raw)
  To: cluster-devel.redhat.com

On Tue, Jan 27, 2009 at 02:06:30PM -0600, David Teigland wrote:
> On Tue, Jan 27, 2009 at 11:33:30AM +0000, Chrissie Caulfield wrote:
> > This is an updated patch that uses hlists rather than list_heads to save
> > memory in the connection structure.
> > 
> > Thanks to Steven Whitehouse for the suggestion.
> 
> I fixed some checkpatch warnings, tested, and pushed into the "next" branch.

I take that back after hitting the following on unmount,

Pid: 4484, comm: umount Not tainted 2.6.29-rc2 #1
RIP: 0010:[<ffffffffa04ecfb4>]  [<ffffffffa04ecfb4>] foreach_conn+0x20/0x46 [dlm]
RSP: 0018:ffff880072db5d38  EFLAGS: 00010202
RAX: 0000000000000001 RBX: 6b6b6b6b6b6b6b6b RCX: 0000000000000000
RDX: ffffffffa04ed0dc RSI: 000000000000006b RDI: ffff880057998de0
RBP: ffff880072db5d58 R08: 0000000000000000 R09: ffff880057998de8
R10: 0000000000000000 R11: ffff88007dd428d8 R12: 0000000000000000
R13: ffffffffa04ecede R14: 0000000000006000 R15: 0000000000000100
FS:  00007fbce8f0b720(0000) GS:ffffffff80a33080(0000) knlGS:00000000f7f7a6c0
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00007ff8aa8d38e8 CR3: 0000000138c4a000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process umount (pid: 4484, threadinfo ffff880072db4000, task ffff8800738d4740)
Stack:
 ffff88007d187000 0000000000000000 ffff88007d187000 ffff88007c145fa0
 ffff880072db5d68 ffffffffa04ed35a ffff880072db5d78 ffffffffa04eaf20
 ffff880072db5db8 ffffffffa04eb299 ffff880072db5da8 ffff88007e85e198
Call Trace:
 [<ffffffffa04ed35a>] dlm_lowcomms_stop+0x68/0x82 [dlm]
 [<ffffffffa04eaf20>] threads_stop+0xe/0x15 [dlm]
 [<ffffffffa04eb299>] dlm_release_lockspace+0x372/0x3a4 [dlm]
 [<ffffffffa02720e0>] gdlm_unmount+0x28/0x49 [lock_dlm]
 [<ffffffffa047270f>] gfs2_unmount_lockproto+0x2d/0x52 [gfs2]
 [<ffffffffa0476bcc>] gfs2_lm_unmount+0x16/0x18 [gfs2]
 [<ffffffffa047afb7>] gfs2_put_super+0x180/0x190 [gfs2]
 [<ffffffff802afadc>] generic_shutdown_super+0x73/0xe8
 [<ffffffff802afb73>] kill_block_super+0x22/0x3a
 [<ffffffffa0476953>] gfs2_kill_sb+0x63/0x78 [gfs2]
 [<ffffffff802afc5c>] deactivate_super+0x68/0x7d
 [<ffffffff802c2aaf>] mntput_no_expire+0x103/0x149
 [<ffffffff802c3094>] sys_umount+0x2e2/0x341
 [<ffffffff8020c05b>] system_call_fastpath+0x16/0x1b
Code: 23 fe df 48 89 d8 5b 41 5c c9 c3 55 48 89 e5 41 55 49 89 fd 41 54 45 31 e4 53 48 83 ec 08 4a 8b 1c e5 e0 79 50 a0 48 85 db 74 15 <48> 8b 03 48 8d bb d0 fe ff ff 0f 18 08 41 ff d5 48 8b 1b eb e6
RIP  [<ffffffffa04ecfb4>] foreach_conn+0x20/0x46 [dlm]
 RSP <ffff880072db5d38>
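
A note on reading the trace above: RBX holds 6b6b6b6b6b6b6b6b, and 0x6b is the
POISON_FREE byte that the kernel's slab debugging writes over freed objects.
That suggests, without proving it, that foreach_conn() picked up a pointer from
memory that had already been freed. The helper below is a small userspace
sketch of that check; the function name looks_like_freed_slab is made up for
the example.

```c
#include <stdint.h>

/* Byte written over freed objects by slab poisoning (include/linux/poison.h). */
#define POISON_FREE 0x6b

/* Return 1 if every byte of a 64-bit register value is the free-poison byte. */
static int looks_like_freed_slab(uint64_t v)
{
	int i;

	for (i = 0; i < 8; i++)
		if (((v >> (8 * i)) & 0xff) != POISON_FREE)
			return 0;
	return 1;
}
```

Applied to the registers above, the check matches RBX but not, for example, RSP.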




* [Cluster-devel] [PATCH] dlm: Allow large nodeids
  2009-01-27 20:19     ` David Teigland
@ 2009-01-28 11:27       ` Chrissie Caulfield
  2009-03-06 20:51         ` David Teigland
  0 siblings, 1 reply; 8+ messages in thread
From: Chrissie Caulfield @ 2009-01-28 11:27 UTC (permalink / raw)
  To: cluster-devel.redhat.com

David Teigland wrote:
> On Tue, Jan 27, 2009 at 02:06:30PM -0600, David Teigland wrote:
>> On Tue, Jan 27, 2009 at 11:33:30AM +0000, Chrissie Caulfield wrote:
>>> This is an updated patch that uses hlists rather than list_heads to save
>>> memory in the connection structure.
>>>
>>> Thanks to Steven Whitehouse for the suggestion.
>> I fixed some checkpatch warnings, tested, and pushed into the "next" branch.
> 
> I take that back after hitting the following on unmount,
> 
> Pid: 4484, comm: umount Not tainted 2.6.29-rc2 #1
> RIP: 0010:[<ffffffffa04ecfb4>]  [<ffffffffa04ecfb4>] foreach_conn+0x20/0x46 [dlm]
> RSP: 0018:ffff880072db5d38  EFLAGS: 00010202

Thanks,

The attached patch should, I hope, fix that.

-- 

Chrissie
-------------- next part --------------
A non-text attachment was scrubbed...
Name: dlm_large_nodeids.patch_safe_free.patch
Type: text/x-patch
Size: 846 bytes
Desc: not available
URL: <http://listman.redhat.com/archives/cluster-devel/attachments/20090128/93693cba/attachment.bin>


* [Cluster-devel] [PATCH] dlm: Allow large nodeids
  2009-01-28 11:27       ` Chrissie Caulfield
@ 2009-03-06 20:51         ` David Teigland
  2009-03-09 10:01           ` Chrissie Caulfield
  2009-03-11 16:02           ` Chrissie Caulfield
  0 siblings, 2 replies; 8+ messages in thread
From: David Teigland @ 2009-03-06 20:51 UTC (permalink / raw)
  To: cluster-devel.redhat.com

On Wed, Jan 28, 2009 at 11:27:35AM +0000, Chrissie Caulfield wrote:
> David Teigland wrote:
> > On Tue, Jan 27, 2009 at 02:06:30PM -0600, David Teigland wrote:
> >> On Tue, Jan 27, 2009 at 11:33:30AM +0000, Chrissie Caulfield wrote:
> >>> This is an updated patch that uses hlists rather than list_heads to save
> >>> memory in the connection structure.

This patch (with fix) seems to cause the following about half of the time when
killing dlm_controld:

dlm: x: leaving the lockspace group...
dlm: x: group event done 0 0
dlm: x: release_lockspace final free
dlm: closing connection to node 1
general protection fault: 0000 [#1] SMP
last sysfs file: /sys/kernel/dlm/x/event_done
CPU 1
Modules linked in: lock_dlm dlm gfs2 configfs autofs4 sunrpc ipv6 cpufreq_ondemand dm_multipath video output sbs sbshc battery ac parport_pc lp parport sg button serio_raw tg3 libphy i2c_nforce2 i2c_core pcspkr dm_snapshot dm_zero dm_mirror dm_region_hash dm_log dm_mod qla2xxx scsi_transport_fc shpchp mptspi mptscsih mptbase scsi_transport_spi sd_mod scsi_mod ext3 jbd uhci_hcd ohci_hcd ehci_hcd
Pid: 10416, comm: dlm_controld Not tainted 2.6.29-rc2 #1
RIP: 0010:[<ffffffffa045116a>]  [<ffffffffa045116a>] __find_con+0x17/0x35 [dlm]
RSP: 0018:ffff88007b189da8  EFLAGS: 00010202
RAX: ffff880078ccfde8 RBX: 0000000000000001 RCX: 6b6b6b6b6b6b6b6b
RDX: 6b6b6b6b6b6b6b6b RSI: 0000000000000022 RDI: 0000000000000001
RBP: ffff88007b189da8 R08: 0000000000000000 R09: ffff88007b189d48
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
R13: 0000000000000001 R14: ffffffffa0462960 R15: ffff88007dd52de0
FS:  00007f71554c06e0(0000) GS:ffff88007f682210(0000) knlGS:00000000f7ef76c0
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00007f111c3ce000 CR3: 000000007e92a000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process dlm_controld (pid: 10416, threadinfo ffff88007b188000, task ffff88007e4783c0)
Stack:
 ffff88007b189dd8 ffffffffa04514ea ffffffffa026d61f 0000000000000001
 ffff880078d12b50 ffffffffa04629d0 ffff88007b189df8 ffffffffa045169c
 ffff88007b1935f8 ffff880078d12b50 ffff88007b189e18 ffffffffa0446921
Call Trace:
 [<ffffffffa04514ea>] nodeid2con+0x29/0x1b7 [dlm]
 [<ffffffffa026d61f>] ? configfs_rmdir+0x203/0x277 [configfs]
 [<ffffffffa045169c>] dlm_lowcomms_close+0x24/0x48 [dlm]
 [<ffffffffa0446921>] drop_comm+0x29/0x55 [dlm]
 [<ffffffffa026be0c>] client_drop_item+0x25/0x31 [configfs]
 [<ffffffffa026d63d>] configfs_rmdir+0x221/0x277 [configfs]
 [<ffffffff804d0609>] ? _spin_unlock+0x26/0x2a
 [<ffffffff802b5ca9>] vfs_rmdir+0xc5/0x137
 [<ffffffff802b7c00>] do_rmdir+0xb5/0x107
 [<ffffffff8026f0a0>] ? audit_syscall_entry+0x16b/0x19e
 [<ffffffff802b7c89>] sys_rmdir+0x11/0x13
 [<ffffffff8020c05b>] system_call_fastpath+0x16/0x1b
Code: c7 80 34 46 a0 31 db e8 b1 d9 07 e0 48 89 d8 5b 41 5c c9 c3 48 89 f8 55 83 e0 1f 48 8b 14 c5 e0 bb 46 a0 48 89 e5 48 85 d2 74 1a <39> ba d8 fe ff ff 48 8b 0a 48 8d 82 d0 fe ff ff 0f 18 09 74 07
RIP  [<ffffffffa045116a>] __find_con+0x17/0x35 [dlm]
 RSP <ffff88007b189da8>




* [Cluster-devel] [PATCH] dlm: Allow large nodeids
  2009-03-06 20:51         ` David Teigland
@ 2009-03-09 10:01           ` Chrissie Caulfield
  2009-03-11 16:02           ` Chrissie Caulfield
  1 sibling, 0 replies; 8+ messages in thread
From: Chrissie Caulfield @ 2009-03-09 10:01 UTC (permalink / raw)
  To: cluster-devel.redhat.com

David Teigland wrote:
> On Wed, Jan 28, 2009 at 11:27:35AM +0000, Chrissie Caulfield wrote:
>> David Teigland wrote:
>>> On Tue, Jan 27, 2009 at 02:06:30PM -0600, David Teigland wrote:
>>>> On Tue, Jan 27, 2009 at 11:33:30AM +0000, Chrissie Caulfield wrote:
>>>>> This is an updated patch that uses hlists rather than list_heads to save
>>>>> memory in the connection structure.
> 
> This patch (with fix) seems to cause the following about half of the time when
> killing dlm_controld:


I thought you were going to change the iterator in foreach_conn to use
hlist_for_each_entry_safe()?

My guess is that the connection is being freed by free_conn and messing
up the list.
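
To make that guess concrete, here is a minimal userspace sketch of the two
things a walk-and-free loop needs: the successor must be read before the
callback runs (which is what hlist_for_each_entry_safe() does in the kernel),
and the entry must be unlinked from its chain before it is freed (the job of
hlist_del()). The struct and the functions push_conn, free_conn and
foreach_conn_safe are simplified stand-ins written for this example, not the
code from the patch.

```c
#include <stdlib.h>

/* Simplified stand-in for a connection on one hash chain. */
struct conn {
	int nodeid;
	struct conn *next;
};

static int freed_count;	/* bookkeeping for the demo only */

/* Link a new connection at the front of the chain. */
static struct conn *push_conn(struct conn **head, int nodeid)
{
	struct conn *c = malloc(sizeof(*c));

	if (!c)
		return NULL;
	c->nodeid = nodeid;
	c->next = *head;
	*head = c;
	return c;
}

/*
 * Unlink con from the chain, then free it. Skipping the unlink step would
 * leave a pointer to freed memory in the chain for the next lookup to follow.
 */
static void free_conn(struct conn **head, struct conn *con)
{
	struct conn **pp;

	for (pp = head; *pp; pp = &(*pp)->next) {
		if (*pp == con) {
			*pp = con->next;	/* unlink first ... */
			break;
		}
	}
	free(con);			/* ... then release the memory */
	freed_count++;
}

/* Call fn on every entry; fn is allowed to free the entry it is given. */
static void foreach_conn_safe(struct conn **head,
			      void (*fn)(struct conn **, struct conn *))
{
	struct conn *con = *head, *next;

	while (con) {
		next = con->next;	/* saved before fn() can free con */
		fn(head, con);
		con = next;
	}
}
```

Calling foreach_conn_safe(&head, free_conn) on a chain of three entries frees
all three and leaves head as NULL. A loop that read con->next after the
callback returned would instead be reading freed memory.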

Chrissie




* [Cluster-devel] [PATCH] dlm: Allow large nodeids
  2009-03-06 20:51         ` David Teigland
  2009-03-09 10:01           ` Chrissie Caulfield
@ 2009-03-11 16:02           ` Chrissie Caulfield
  1 sibling, 0 replies; 8+ messages in thread
From: Chrissie Caulfield @ 2009-03-11 16:02 UTC (permalink / raw)
  To: cluster-devel.redhat.com

David Teigland wrote:
> On Wed, Jan 28, 2009 at 11:27:35AM +0000, Chrissie Caulfield wrote:
>> David Teigland wrote:
>>> On Tue, Jan 27, 2009 at 02:06:30PM -0600, David Teigland wrote:
>>>> On Tue, Jan 27, 2009 at 11:33:30AM +0000, Chrissie Caulfield wrote:
>>>>> This is an updated patch that uses hlists rather than list_heads to save
>>>>> memory in the connection structure.
> 
> This patch (with fix) seems to cause the following about half of the time when
> killing dlm_controld:
> 

Oops, something slightly vital was missing from free_conn() ...


Chrissie
-------------- next part --------------
A non-text attachment was scrubbed...
Name: dlm_oops_fix.patch
Type: text/x-patch
Size: 360 bytes
Desc: not available
URL: <http://listman.redhat.com/archives/cluster-devel/attachments/20090311/e7a4206f/attachment.bin>


