* Proposal for adding disable FileJournal option
@ 2014-01-09  8:13 Haomai Wang
  2014-01-09 17:28 ` Gregory Farnum
  0 siblings, 1 reply; 9+ messages in thread
From: Haomai Wang @ 2014-01-09  8:13 UTC (permalink / raw)
  To: ceph-devel

Hi all,

We know FileJournal plays an important role in the FileStore backend: it
can greatly reduce write latency and improve small-write performance.

But in practice there are exceptions, such as when we already use
FlashCache or a cache pool (although the latter is not ready yet).

If a cache pool is enabled, we may use a journal in cache_pool but may
not want one in base_pool. The main reason to drop the journal in
base_pool is that the journal takes over a whole physical device, which
wastes too much in base_pool.

Likewise, if I enable FlashCache or another cache, I would rather not
enable the journal in the OSD layer.

So is an option to disable the journal needed for such special (not
really special) cases?

Best regards,
Wheats





* Re: Proposal for adding disable FileJournal option
  2014-01-09  8:13 Proposal for adding disable FileJournal option Haomai Wang
@ 2014-01-09 17:28 ` Gregory Farnum
  2014-01-09 18:16   ` Stefan Priebe - Profihost AG
  2014-01-10  2:04   ` Haomai Wang
  0 siblings, 2 replies; 9+ messages in thread
From: Gregory Farnum @ 2014-01-09 17:28 UTC (permalink / raw)
  To: Haomai Wang; +Cc: ceph-devel

The FileJournal is also for data safety whenever we're using write
ahead. To disable it we need a backing store that we know can provide
us consistent checkpoints (i.e., we can use parallel journaling mode —
so for the FileJournal, we're using btrfs, or maybe zfs someday). But
for those systems you can already configure the system not to use a
journal.
-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com


--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


* Re: Proposal for adding disable FileJournal option
  2014-01-09 17:28 ` Gregory Farnum
@ 2014-01-09 18:16   ` Stefan Priebe - Profihost AG
  2014-01-10  2:04   ` Haomai Wang
  1 sibling, 0 replies; 9+ messages in thread
From: Stefan Priebe - Profihost AG @ 2014-01-09 18:16 UTC (permalink / raw)
  To: Gregory Farnum; +Cc: Haomai Wang, ceph-devel

I had the same question in the past, but it seems there is no way for the Ceph team to change it.

Stefan

This mail was sent with my iPhone.


* Re: Proposal for adding disable FileJournal option
  2014-01-09 17:28 ` Gregory Farnum
  2014-01-09 18:16   ` Stefan Priebe - Profihost AG
@ 2014-01-10  2:04   ` Haomai Wang
  2014-01-10  3:08     ` Dong Yuan
  1 sibling, 1 reply; 9+ messages in thread
From: Haomai Wang @ 2014-01-10  2:04 UTC (permalink / raw)
  To: Gregory Farnum; +Cc: ceph-devel

On Fri, Jan 10, 2014 at 1:28 AM, Gregory Farnum <greg@inktank.com> wrote:
>
> The FileJournal is also for data safety whenever we're using write
> ahead. To disable it we need a backing store that we know can provide
> us consistent checkpoints (i.e., we can use parallel journaling mode —
> so for the FileJournal, we're using btrfs, or maybe zfs someday). But
> for those systems you can already configure the system not to use a
> journal.

Yes, it depends on the backend. For example, FileStore can write an
object with sync to ensure consistency. If we add an option to disable
FileJournal, we need some work on FileStore to implement it.





-- 

Best Regards,

Wheat

* Re: Proposal for adding disable FileJournal option
  2014-01-10  2:04   ` Haomai Wang
@ 2014-01-10  3:08     ` Dong Yuan
  2014-01-10  3:13       ` Gregory Farnum
  0 siblings, 1 reply; 9+ messages in thread
From: Dong Yuan @ 2014-01-10  3:08 UTC (permalink / raw)
  To: Haomai Wang; +Cc: Gregory Farnum, ceph-devel

The journal is part of the implementation of the ObjectStore Transaction
interface, and the PG uses a transaction to write the pglog together
with the object data as one unit.
So I think that if the FileJournal could be disabled, something else
would have to implement the Transaction interface. But in my opinion
that seems hard, since no local file system provides such a function.





-- 
Dong Yuan
Email:yuandong1222@gmail.com

* Re: Proposal for adding disable FileJournal option
  2014-01-10  3:08     ` Dong Yuan
@ 2014-01-10  3:13       ` Gregory Farnum
  2014-01-11  5:24         ` Haomai Wang
  0 siblings, 1 reply; 9+ messages in thread
From: Gregory Farnum @ 2014-01-10  3:13 UTC (permalink / raw)
  To: Dong Yuan; +Cc: Haomai Wang, ceph-devel

Exactly. We can't do a safe update without a journal — what if power
goes out while the write is happening? When we boot back up, we don't
know what version the object is actually at. So if you're using btrfs,
you can run without a journal already (and depend on snapshots for
recovering after failures); if you are using xfs or ext4 a journal is
required for any safety at all, even when it's fronted by a cache
pool.
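
To make the failure mode above concrete, here is a minimal Python sketch. It is an illustration only, not Ceph code: it assumes a toy in-memory "disk" holding object data and a per-object pglog version, and the names `Crash`, `apply_without_journal`, and `disk` are hypothetical. The update is done as two separate writes, and a simulated power loss lands between them.

```python
class Crash(Exception):
    """Stands in for a power failure in this toy model."""


def apply_without_journal(disk, obj, data, version, crash_after_data=False):
    """Update object data, then its pglog version, as two separate writes."""
    # Write 1: overwrite the object data in place.
    disk["objects"][obj] = data
    if crash_after_data:
        # Power is lost before the second write reaches the disk.
        raise Crash("power lost between the two writes")
    # Write 2: record the new version in the pglog.
    disk["pglog"][obj] = version


# Toy "disk" (hypothetical layout): object foo is at version 1.
disk = {"objects": {"foo": "data-v1"}, "pglog": {"foo": 1}}

try:
    apply_without_journal(disk, "foo", "data-v2", 2, crash_after_data=True)
except Crash:
    pass  # the machine "reboots" with whatever reached the disk

# The data is new but the log still says version 1.
torn = disk["objects"]["foo"] == "data-v2" and disk["pglog"]["foo"] == 1
print("torn update:", torn)
```

In this model, syncing each write would make each one durable on its own, but it would not make the pair atomic: after the restart there is nothing that says which version the object data belongs to.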


* Re: Proposal for adding disable FileJournal option
  2014-01-10  3:13       ` Gregory Farnum
@ 2014-01-11  5:24         ` Haomai Wang
  2014-01-11 15:18           ` Dong Yuan
  0 siblings, 1 reply; 9+ messages in thread
From: Haomai Wang @ 2014-01-11  5:24 UTC (permalink / raw)
  To: Gregory Farnum; +Cc: Dong Yuan, ceph-devel

On Fri, Jan 10, 2014 at 11:13 AM, Gregory Farnum <greg@inktank.com> wrote:
> Exactly. We can't do a safe update without a journal — what if power
> goes out while the write is happening? When we boot back up, we don't
> know what version the object is actually at. So if you're using btrfs,
> you can run without a journal already (and depend on snapshots for
> recovering after failures); if you are using xfs or ext4 a journal is
> required for any safety at all, even when it's fronted by a cache
> pool.

I don't fully agree. Why can't we call "fdatasync()" during each
transaction to ensure consistency when there is a cache in front?




-- 
Best Regards,

Wheat

* Re: Proposal for adding disable FileJournal option
  2014-01-11  5:24         ` Haomai Wang
@ 2014-01-11 15:18           ` Dong Yuan
  2014-01-12 11:25             ` Haomai Wang
  0 siblings, 1 reply; 9+ messages in thread
From: Dong Yuan @ 2014-01-11 15:18 UTC (permalink / raw)
  To: Haomai Wang; +Cc: Gregory Farnum, ceph-devel

It is not only about consistency between memory and disk. The key point
is to implement the atomicity of a transaction.

That is, when a transaction needs to write an object and update the
pglog at the same time, we must make sure that either both I/Os happen
or neither does.

With the journal, when the OSD recovers from a failure, the replay
process can redo the transaction. I think that is why the journal
cannot be disabled.
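
The replay idea can be sketched in Python as follows. This is a toy model under stated assumptions, not FileStore's actual implementation: the class `ToyStore`, its fields, and the `seq` numbering are hypothetical, and plain Python lists and dicts stand in for durable storage. Each transaction is appended to the journal before it is applied, so a crash between the object write and the pglog write can be repaired by redoing the journal entry.

```python
class Crash(Exception):
    """Stands in for a power failure in this toy model."""


class ToyStore:
    """Hypothetical store: a write-ahead journal in front of two structures."""

    def __init__(self):
        self.journal = []      # append-only; assumed durable once appended
        self.objects = {}      # object name -> data
        self.pglog = {}        # object name -> version
        self.applied_seq = 0   # highest transaction fully applied (assumed durable)

    def submit(self, seq, obj, data, version, crash_mid_apply=False):
        # Write-ahead: the whole transaction goes to the journal first.
        entry = {"seq": seq, "obj": obj, "data": data, "version": version}
        self.journal.append(entry)
        self._apply(entry, crash_mid_apply)

    def _apply(self, entry, crash_mid_apply=False):
        # Both writes are plain overwrites, so redoing them is harmless.
        self.objects[entry["obj"]] = entry["data"]
        if crash_mid_apply:
            raise Crash("power lost between the two writes")
        self.pglog[entry["obj"]] = entry["version"]
        self.applied_seq = entry["seq"]

    def replay(self):
        # After a restart, redo every journaled transaction not fully applied.
        for entry in self.journal:
            if entry["seq"] > self.applied_seq:
                self._apply(entry)


store = ToyStore()
store.submit(1, "foo", "data-v1", 1)
try:
    store.submit(2, "foo", "data-v2", 2, crash_mid_apply=True)
except Crash:
    pass  # object data is new, pglog still says version 1

state_after_crash = (store.objects["foo"], store.pglog["foo"])
store.replay()
state_after_replay = (store.objects["foo"], store.pglog["foo"])
print(state_after_crash, "->", state_after_replay)
```

The sketch relies on the apply step being idempotent, which is what makes redoing a half-applied transaction safe here.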




-- 
Dong Yuan
Email:yuandong1222@gmail.com

* Re: Proposal for adding disable FileJournal option
  2014-01-11 15:18           ` Dong Yuan
@ 2014-01-12 11:25             ` Haomai Wang
  0 siblings, 0 replies; 9+ messages in thread
From: Haomai Wang @ 2014-01-12 11:25 UTC (permalink / raw)
  To: Dong Yuan; +Cc: Gregory Farnum, ceph-devel

On Sat, Jan 11, 2014 at 11:18 PM, Dong Yuan <yuandong1222@gmail.com> wrote:
> It is not only about consistency between memory and disk. The key point
> is to implement the atomicity of a transaction.
>
> That is, when a transaction needs to write an object and update the
> pglog at the same time, we must make sure that either both I/Os happen
> or neither does.
>
> With the journal, when the OSD recovers from a failure, the replay
> process can redo the transaction. I think that is why the journal
> cannot be disabled.

Hmm, I missed that. The journal is what guarantees the atomicity of a transaction.

Thanks!




-- 
Best Regards,

Wheat
