* Let's Not Destroy the World in 2038
From: Adam C. Emerson @ 2015-12-22 20:10 UTC (permalink / raw)
  To: The Sacred Order of the Squid Cybernetic

Comrades,

Ceph's victory is assured. It will be the storage system of The Future.
Matt Benjamin has reminded me that if we don't act fast¹ Ceph will be
responsible for destroying the world.

utime_t() uses a 32-bit second count internally. This isn't great, but it's
something we can fix. ceph::real_time currently uses a 64-bit count of
nanoseconds, which is better. And we can change it to something else without
having to rewrite much other code.
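
A minimal sketch of why the 32-bit count is the problem, assuming the count
is interpreted as signed seconds since the Unix epoch (as the 2038 in the
subject line implies). Every name below is hypothetical and is not Ceph's
actual code:

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Largest second count a signed 32-bit field can hold: 2^31 - 1.
// Counted from the Unix epoch this is 2038-01-19 03:14:07 UTC.
constexpr std::int64_t kLastSecond32 =
    std::numeric_limits<std::int32_t>::max();

// True if a 64-bit second count can be stored in a signed 32-bit field
// without losing information.
constexpr bool fits_signed_32(std::int64_t secs) {
  return secs >= std::numeric_limits<std::int32_t>::min() &&
         secs <= std::numeric_limits<std::int32_t>::max();
}

// Hypothetical narrowing helper: refuses (in debug builds) to truncate
// a timestamp that lies past the 32-bit horizon.
inline std::int32_t to_wire32(std::int64_t secs) {
  assert(fits_signed_32(secs));
  return static_cast<std::int32_t>(secs);
}
```

One second after kLastSecond32 no longer fits, which is the whole problem in
one line.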

The problem lies in our encode/decode functions for time (both utime_t
and ceph::real_time, since I didn't want to break compatibility): we
use a 32-bit second count. I would like to change the wire and disk
representation to a 64-bit second count and a 32-bit nanosecond count.
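
As a rough illustration of the proposed layout, here is a sketch that packs a
64-bit second count followed by a 32-bit nanosecond count into 12 bytes. The
type and function names are hypothetical, and the little-endian byte order is
an assumption made for the example rather than a statement about Ceph's
encoder:

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical in-memory time value with the proposed field widths.
struct WideTime {
  std::uint64_t sec;   // seconds
  std::uint32_t nsec;  // nanoseconds within the second
};

// 8 bytes of seconds followed by 4 bytes of nanoseconds.
using WideTimeBytes = std::array<std::uint8_t, 12>;

inline WideTimeBytes encode_wide(const WideTime& t) {
  assert(t.nsec < 1000000000u);  // sanity check: less than one second
  WideTimeBytes out{};
  for (std::size_t i = 0; i < 8; ++i)
    out[i] = static_cast<std::uint8_t>(t.sec >> (8 * i));
  for (std::size_t i = 0; i < 4; ++i)
    out[8 + i] = static_cast<std::uint8_t>(t.nsec >> (8 * i));
  return out;
}

inline WideTime decode_wide(const WideTimeBytes& in) {
  WideTime t{0, 0};
  for (std::size_t i = 0; i < 8; ++i)
    t.sec |= static_cast<std::uint64_t>(in[i]) << (8 * i);
  for (std::size_t i = 0; i < 4; ++i)
    t.nsec |= static_cast<std::uint32_t>(in[8 + i]) << (8 * i);
  return t;
}
```

A value such as 2^32 + 5 seconds round-trips intact here, whereas a 32-bit
second field would keep only the 5.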

Would there be resistance to a project to do this? I don't know if a
FEATURE bit would help. A FEATURE bit to toggle the width of the second
count would be ideal if it would work. Otherwise it looks like the best
way to do this would be to find all the structures currently ::encoded
that hold time values, bump the version number and have an 'old_utime'
that we use for everything pre-change.

Thank you!

¹ Within the next twenty-three years. But that's not really a long time in the
  larger scheme of things.

-- 
Senior Software Engineer           Red Hat Storage, Ann Arbor, MI, US
IRC: Aemerson@{RedHat, OFTC, Freenode}
0x80F7544B90EDBFB9 E707 86BA 0C1B 62CC 152C  7C12 80F7 544B 90ED BFB9


* Re: Let's Not Destroy the World in 2038
From: Gregory Farnum @ 2015-12-23  3:54 UTC (permalink / raw)
  To: The Sacred Order of the Squid Cybernetic

On Tue, Dec 22, 2015 at 12:10 PM, Adam C. Emerson <aemerson@redhat.com> wrote:
> Comrades,
>
> Ceph's victory is assured. It will be the storage system of The Future.
> Matt Benjamin has reminded me that if we don't act fast¹ Ceph will be
> responsible for destroying the world.
>
> utime_t() uses a 32-bit second count internally. This isn't great, but it's
> something we can fix. ceph::real_time currently uses a 64-bit count of
> nanoseconds, which is better. And we can change it to something else without
> having to rewrite much other code.
>
> The problem lies in our encode/decode functions for time (both utime_t
> and ceph::real_time, since I didn't want to break compatibility): we
> use a 32-bit second count. I would like to change the wire and disk
> representation to a 64-bit second count and a 32-bit nanosecond count.
>
> Would there be resistance to a project to do this? I don't know if a
> FEATURE bit would help. A FEATURE bit to toggle the width of the second
> count would be ideal if it would work. Otherwise it looks like the best
> way to do this would be to find all the structures currently ::encoded
> that hold time values, bump the version number and have an 'old_utime'
> that we use for everything pre-change.

Unfortunately, we include utimes in structures that are written to
disk. So I think we're stuck with creating a new utime_t and
incrementing the struct_v on everything that contains them. :/

Of course, we'll also then need the full feature bit system to make
sure we send the old encoding to clients which don't understand the
new one, and to prevent a mid-upgrade cluster from writing data on a
new node that gets moved to an old node which doesn't understand it.

Given that utime_t occurs in a lot of places, and really can't change
*again* after this, we probably shouldn't set up the new version with
versioned encoding?
-Greg


* Re: Let's Not Destroy the World in 2038
From: Adam C. Emerson @ 2015-12-23 19:16 UTC (permalink / raw)
  To: Gregory Farnum; +Cc: The Sacred Order of the Squid Cybernetic

On 22/12/2015, Gregory Farnum wrote:
[snip]
> So I think we're stuck with creating a new utime_t and incrementing
> the struct_v on everything that contains them. :/
[snip]
> We'll also then need the full feature bit system to make
> sure we send the old encoding to clients which don't understand the
> new one, and to prevent a mid-upgrade cluster from writing data on a
> new node that gets moved to an old node which doesn't understand it.

That is my understanding. I have the impression that network communication
gets feature bits for the other nodes and on-disk structures are explicitly
versioned. If I'm mistaken, please hurl corrections at me.

> Given that utime_t occurs in a lot of places, and really can't change
> *again* after this, we probably shouldn't set up the new version with
> versioned encoding?

You're overly pessimistic. I'm hoping our post-human descendants store
their unfathomably alien, reconstructed minds in some galaxy-spanning
descendant of Ceph and need more than a 64-bit second count.

However, I agree that the time value itself should not have an encoded
version tag.

To my intuition, the best way forward would be to:

(1) Add non-defaulted feature parameters on encode/decode of utime_t and
    ceph::real_time. This will break everything that uses them.

(2) Add explicit encode_old/encode_new functions. That way, when we KNOW which
    one we want at compile time, we don't have to pay for a runtime check.

(3) When we have feature bits, pass them in.

(4) When we have a version, bump it. For new versions, explicitly call
    encode_new. When we know we want old, call old.

(5) If there are classes that we encode that have neither feature bits nor
    versioning available, see what uses them and act accordingly. Hopefully the
    special cases will be few.
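
The steps above can be sketched roughly as follows. This is only an
illustration under stated assumptions: FEATURE_WIDE_TIME, TimeValue, Buffer,
Record and every function here are hypothetical stand-ins, not Ceph's real
feature bits, bufferlist, or encode machinery.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr std::uint64_t FEATURE_WIDE_TIME = 1ULL << 0;  // hypothetical bit

struct TimeValue { std::uint64_t sec; std::uint32_t nsec; };
using Buffer = std::vector<std::uint8_t>;

// Append / read an unsigned integer in little-endian order (assumed).
template <typename T>
void append_le(T value, Buffer& bl) {
  for (std::size_t i = 0; i < sizeof(T); ++i)
    bl.push_back(static_cast<std::uint8_t>(value >> (8 * i)));
}

template <typename T>
T read_le(const Buffer& bl, std::size_t& pos) {
  assert(pos + sizeof(T) <= bl.size());
  T value = 0;
  for (std::size_t i = 0; i < sizeof(T); ++i)
    value |= static_cast<T>(bl[pos + i]) << (8 * i);
  pos += sizeof(T);
  return value;
}

// Step (2): explicit variants, chosen at compile time when we know.
inline void encode_old(const TimeValue& t, Buffer& bl) {
  append_le<std::uint32_t>(static_cast<std::uint32_t>(t.sec), bl);  // truncates
  append_le<std::uint32_t>(t.nsec, bl);
}

inline void encode_new(const TimeValue& t, Buffer& bl) {
  append_le<std::uint64_t>(t.sec, bl);
  append_le<std::uint32_t>(t.nsec, bl);
}

// Steps (1) and (3): no default for `features`, so callers must decide.
inline void encode(const TimeValue& t, Buffer& bl, std::uint64_t features) {
  if (features & FEATURE_WIDE_TIME)
    encode_new(t, bl);
  else
    encode_old(t, bl);
}

// Step (4): a versioned container bumps struct_v and picks the variant.
struct Record { TimeValue mtime; };

inline void encode_record(const Record& r, Buffer& bl, std::uint64_t features) {
  const bool wide = (features & FEATURE_WIDE_TIME) != 0;
  bl.push_back(wide ? 2 : 1);  // struct_v
  if (wide) encode_new(r.mtime, bl); else encode_old(r.mtime, bl);
}

inline Record decode_record(const Buffer& bl) {
  std::size_t pos = 0;
  const std::uint8_t struct_v = read_le<std::uint8_t>(bl, pos);
  Record r{};
  if (struct_v >= 2) {
    r.mtime.sec = read_le<std::uint64_t>(bl, pos);
  } else {
    r.mtime.sec = read_le<std::uint32_t>(bl, pos);
  }
  r.mtime.nsec = read_le<std::uint32_t>(bl, pos);
  return r;
}
```

The time value itself carries no version tag in this sketch; the width is
decided either by the peer's feature bits or by the struct_v of whatever
contains it.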

Does that seem reasonable?

I thank you.

And all hypothetical post-human Ceph users thank you.

-- 
Senior Software Engineer           Red Hat Storage, Ann Arbor, MI, US
IRC: Aemerson@{RedHat, OFTC, Freenode}
0x80F7544B90EDBFB9 E707 86BA 0C1B 62CC 152C  7C12 80F7 544B 90ED BFB9
