* Fscache support for Ceph
@ 2013-05-23 21:48 Milosz Tanski

From: Milosz Tanski
To: ceph-devel, linux-cachefs

This is my first attempt at adding fscache support to the Ceph Linux module.

My motivation for doing this work was to speed up our distributed database, which uses the Ceph filesystem as a backing store. By far the largest part of our application's workload is read-only, and latency is our biggest challenge. Being able to cache frequently used blocks on our machines' SSD drives dramatically speeds up query setup time when we're fetching multiple compressed indexes and then navigating the block tree.

The branch containing the two patches is here:
https://bitbucket.org/adfin/linux-fs.git in the forceph branch.

If you want to review it in your browser, here is the bitbucket url:
https://bitbucket.org/adfin/linux-fs/commits/branch/forceph

I've tested this both against mainline and against the branch that contains the upcoming fscache changes. The patches are broken into two pieces:

01 - Sets up the fscache facility in its own independent files
02 - Enables fscache in the Ceph filesystem and adds a new configuration option

The patches will follow in the next few emails.

Future-wise, there's new work being done to add write-back caching to fscache & NFS. When that's done, I'd like to integrate it into the Ceph fscache implementation. From the author's benchmarks, it seems to have much the same benefit for writes to NFS as bcache does.

I'd like to get this into Ceph, and I'm looking for feedback.

Thanks,
- Milosz
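For context on how the "new configuration option" would be used: fscache-backed filesystems on Linux are fronted by the cachefiles backend (the cachefilesd daemon manages the on-disk cache) and are typically enabled per mount. Assuming the Ceph option follows the NFS convention of an `fsc` mount flag — an assumption, since the patch text defining the real option name is not quoted here — usage would look roughly like this (monitor address, client name, and secret file are placeholders):

```shell
# cachefilesd provides the on-disk (e.g. SSD) backing store for fscache.
sudo service cachefilesd start

# Mount CephFS with local caching enabled. The 'fsc' option name is an
# assumption borrowed from NFS; the addresses and paths are placeholders.
sudo mount -t ceph 192.168.1.100:6789:/ /mnt/ceph \
    -o name=admin,secretfile=/etc/ceph/admin.secret,fsc
```

Once mounted this way, repeated reads of the same blocks would be served from the local cache rather than going back to the OSDs.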
* Re: Fscache support for Ceph
@ 2013-05-29 13:35 Milosz Tanski

From: Milosz Tanski
To: Elso Andras
Cc: ceph-devel, linux-cachefs

Elbandi,

Thanks to your stack trace I can see the bug. I'll send you a fix as soon as I get back to my office. Apparently, I spent too much time testing it in UP vms and UML.

Thanks,
-- Milosz

On Wed, May 29, 2013 at 5:47 AM, Elso Andras <elso.andras@gmail.com> wrote:
> Hi,
>
> I tried your fscache patch on my test cluster. The client node is an
> ubuntu lucid (10.04) with a 3.8 kernel (*) + your patch.
> A little after I mounted the cephfs, I got this:
>
> [  316.303851] Pid: 1565, comm: lighttpd Not tainted 3.8.0-22-fscache #33 HP ProLiant DL160 G6
> [  316.303853] RIP: 0010:[<ffffffff81045c42>]  [<ffffffff81045c42>] __ticket_spin_lock+0x22/0x30
> [  316.303861] RSP: 0018:ffff8804180e79f8  EFLAGS: 00000297
> [  316.303863] RAX: 0000000000000004 RBX: ffffffffa0224e53 RCX: 0000000000000004
> [  316.303865] RDX: 0000000000000005 RSI: 00000000000000d0 RDI: ffff88041eb29a50
> [  316.303866] RBP: ffff8804180e79f8 R08: ffffe8ffffa40150 R09: 0000000000000000
> [  316.303868] R10: 0000000000000001 R11: 0000000000000001 R12: ffff88041da75050
> [  316.303869] R13: ffff880428ef0000 R14: ffffffff81702b86 R15: ffff8804180e7968
> [  316.303871] FS:  00007fbcca138700(0000) GS:ffff88042f240000(0000) knlGS:0000000000000000
> [  316.303873] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [  316.303875] CR2: 00007f5c96649f00 CR3: 00000004180c9000 CR4: 00000000000007e0
> [  316.303877] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [  316.303878] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> [  316.303880] Process lighttpd (pid: 1565, threadinfo ffff8804180e6000, task ffff88041cc22e80)
> [  316.303881] Stack:
> [  316.303883]  ffff8804180e7a08 ffffffff817047ae ffff8804180e7a58 ffffffffa02c816a
> [  316.303886]  ffff8804180e7a58 ffff88041eb29a50 0000000000000000 ffff88041eb29d50
> [  316.303889]  ffff88041eb29a50 ffff88041b29ed00 ffff88041eb29a40 0000000000000d01
> [  316.303892] Call Trace:
> [  316.303898]  [<ffffffff817047ae>] _raw_spin_lock+0xe/0x20
> [  316.303910]  [<ffffffffa02c816a>] ceph_init_file+0xca/0x1c0 [ceph]
> [  316.303917]  [<ffffffffa02c83e1>] ceph_open+0x181/0x3c0 [ceph]
> [  316.303925]  [<ffffffffa02c8260>] ? ceph_init_file+0x1c0/0x1c0 [ceph]
> [  316.303930]  [<ffffffff8119a62e>] do_dentry_open+0x21e/0x2a0
> [  316.303933]  [<ffffffff8119a6e5>] finish_open+0x35/0x50
> [  316.303940]  [<ffffffffa02c9304>] ceph_atomic_open+0x214/0x2f0 [ceph]
> [  316.303944]  [<ffffffff811b416f>] ? __d_alloc+0x5f/0x180
> [  316.303948]  [<ffffffff811a7fa1>] atomic_open+0xf1/0x460
> [  316.303951]  [<ffffffff811a86f4>] lookup_open+0x1a4/0x1d0
> [  316.303954]  [<ffffffff811a8fad>] do_last+0x30d/0x820
> [  316.303958]  [<ffffffff811ab413>] path_openat+0xb3/0x4d0
> [  316.303962]  [<ffffffff815da87d>] ? sock_aio_read+0x2d/0x40
> [  316.303965]  [<ffffffff8119c333>] ? do_sync_read+0xa3/0xe0
> [  316.303968]  [<ffffffff811ac232>] do_filp_open+0x42/0xa0
> [  316.303971]  [<ffffffff811b9eb5>] ? __alloc_fd+0xe5/0x170
> [  316.303974]  [<ffffffff8119be8a>] do_sys_open+0xfa/0x250
> [  316.303977]  [<ffffffff8119cacd>] ? vfs_read+0x10d/0x180
> [  316.303980]  [<ffffffff8119c001>] sys_open+0x21/0x30
> [  316.303983]  [<ffffffff8170d61d>] system_call_fastpath+0x1a/0x1f
>
> And the console prints these lines forever; the server is frozen:
> [  376.305754] BUG: soft lockup - CPU#2 stuck for 22s! [lighttpd:1565]
> [  404.294735] BUG: soft lockup - CPU#1 stuck for 22s! [kworker/1:1:39]
> [  404.306735] BUG: soft lockup - CPU#2 stuck for 22s! [lighttpd:1565]
> [  432.295716] BUG: soft lockup - CPU#1 stuck for 22s! [kworker/1:1:39]
>
> Do you have any idea?
>
> Elbandi
>
> * http://packages.ubuntu.com/raring/linux-image-3.8.0-19-generic
>
> 2013/5/23 Milosz Tanski <milosz@adfin.com>:
>> This is my first attempt at adding fscache support to the Ceph Linux module.
>>
>> My motivation for doing this work was to speed up our distributed database,
>> which uses the Ceph filesystem as a backing store. By far the largest part
>> of our application's workload is read-only, and latency is our biggest
>> challenge. Being able to cache frequently used blocks on our machines'
>> SSD drives dramatically speeds up query setup time when we're fetching
>> multiple compressed indexes and then navigating the block tree.
>>
>> The branch containing the two patches is here:
>> https://bitbucket.org/adfin/linux-fs.git in the forceph branch.
>>
>> If you want to review it in your browser, here is the bitbucket url:
>> https://bitbucket.org/adfin/linux-fs/commits/branch/forceph
>>
>> I've tested this both against mainline and against the branch that
>> contains the upcoming fscache changes. The patches are broken into two pieces:
>>
>> 01 - Sets up the fscache facility in its own independent files
>> 02 - Enables fscache in the Ceph filesystem and adds a new configuration option
>>
>> The patches will follow in the next few emails.
>>
>> Future-wise, there's new work being done to add write-back caching to
>> fscache & NFS. When that's done, I'd like to integrate it into the Ceph
>> fscache implementation. From the author's benchmarks, it seems to have
>> much the same benefit for writes to NFS as bcache does.
>>
>> I'd like to get this into Ceph, and I'm looking for feedback.
>>
>> Thanks,
>> - Milosz
>> --
>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at http://vger.kernel.org/majordomo-info.html
* Re: Fscache support for Ceph
@ 2013-05-29 17:46 Milosz Tanski

From: Milosz Tanski
To: Elso Andras
Cc: ceph-devel, linux-cachefs

Elso,

I have both good and bad news for you.

First, the good news: I fixed this particular issue. You can find the change needed here:
https://bitbucket.org/adfin/linux-fs/commits/339c82d37ec0223733778f83111f29599f220e35
As you can see, it's a simple fix. I also put another patch in my tree that makes fscache a mount option.

The bad news is that there is a sporadic crash when working with the ubuntu 3.8.0-22 kernel on LTS. This is due to a bug in the upstream kernel code. There is a fix for it in David Howells' tree:
http://git.kernel.org/cgit/linux/kernel/git/dhowells/linux-fs.git/commit/?h=fscache&id=82958c45e35963c93fc6cbe6a27752e2d97e9f9a

I can't reproduce this under normal conditions, but I can reproduce it by forcing the kernel to drop caches.

Best,
- Milosz

On Wed, May 29, 2013 at 9:35 AM, Milosz Tanski <milosz@adfin.com> wrote:
> Elbandi,
>
> Thanks to your stack trace I can see the bug. I'll send you a fix as soon
> as I get back to my office. Apparently, I spent too much time testing
> it in UP vms and UML.
>
> Thanks,
> -- Milosz
>
> On Wed, May 29, 2013 at 5:47 AM, Elso Andras <elso.andras@gmail.com> wrote:
>> Hi,
>>
>> I tried your fscache patch on my test cluster. The client node is an
>> ubuntu lucid (10.04) with a 3.8 kernel (*) + your patch.
>> A little after I mounted the cephfs, I got this:
>>
>> [snip: oops trace, soft-lockup messages, and the original announcement,
>> all quoted in full in the previous message]
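The drop-caches reproduction Milosz mentions uses the standard Linux VM knob. A sketch of the procedure (requires root; evicting the page cache forces subsequent reads back through fscache, which is presumably what exposes the upstream race):

```shell
# Flush dirty pages first so the drop is as complete as possible.
sync
# /proc/sys/vm/drop_caches accepts: 1 = page cache,
# 2 = dentries and inodes, 3 = both.
echo 3 | sudo tee /proc/sys/vm/drop_caches
```

This is a non-destructive operation (only clean, reclaimable objects are dropped), which makes it a convenient stress test for any caching layer sitting below the page cache.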