* Memstore issue on v0.91
From: Blinick, Stephen L @ 2015-01-21  7:48 UTC
  To: Ceph Development

I moved to 0.91 yesterday and ran into an issue with a MemStore OSD of default size.  After a few hundred thousand operations the client gets a FULL message, and the available space looks like this:

[root@cephtestnode0 bmpa]# rados df
pool name                 KB      objects       clones     degraded     unfound           rd        rd KB           wr        wr KB
rbd                        0            0            0            0          0            0            0            0            0
testmemstore         1505052       376263            0            0          0            0            0       376263      1505052
  total used         1494247       376263
  total avail   18014398508987737
  total space        1000000

I was looking at pull request #2836 and found a case where there could be an unsigned int underflow in MemStore.cc:statfs:

   st->f_bfree = st->f_bavail = MAX((st->f_blocks - used_bytes / st->f_bsize), 0);

But fixing that did not resolve the problem.  I also put some asserts where used_bytes could underflow and didn't catch anything.  I will keep digging, but wanted to find out whether anyone else is seeing this issue.
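
The "total avail" figure in the rados df output is consistent with a 64-bit unsigned wrap. The sketch below reproduces it from the reported "total space" and "total used" values. It assumes 1 KB blocks and that available KB is derived as free blocks * block size / 1024; the helper name and constant are hypothetical and are not taken from MemStore.cc.

```cpp
#include <cstdint>

// Assumption: blocks are 1 KB, so block counts equal the KB figures that
// rados df printed. The subtraction and the multiplication are both done
// in 64-bit unsigned arithmetic and therefore wrap modulo 2^64.
constexpr std::uint64_t kBlockSize = 1024;  // assumed block size in bytes

// Hypothetical helper (not from MemStore.cc): available space in KB when
// free blocks are computed as total - used without a guard.
constexpr std::uint64_t wrapped_avail_kb(std::uint64_t total_kb,
                                         std::uint64_t used_kb) {
  return (total_kb - used_kb) * kBlockSize / 1024;
}

// total space = 1000000 KB, total used = 1494247 KB, as reported above.
static_assert(wrapped_avail_kb(1000000, 1494247) == 18014398508987737ULL,
              "matches the reported total avail");
```

Under these assumptions the reported value equals 2^54 - 494247, where 494247 KB is the amount by which usage exceeded the configured space.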

Thanks,

Stephen


 
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


* Re: Memstore issue on v0.91
From: Mark Nelson @ 2015-01-21 14:13 UTC
  To: Blinick, Stephen L, Ceph Development

FWIW, I hit this yesterday too.  I hadn't started debugging it yet 
though.  Looks like you are farther along than I am, thanks Stephen! :)


* RE: Memstore issue on v0.91
From: Chen, Xiaoxi @ 2015-01-22  8:38 UTC
  To: mnelson, Blinick, Stephen L, Ceph Development

This is due to an implicit type conversion by the compiler: when
                st->f_blocks     <    (used_bytes / st->f_bsize),
   the subtraction should yield a negative value, but the compiler treats the result as unsigned.


Fix is proposed in https://github.com/ceph/ceph/pull/3451
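
A minimal sketch of the conversion described above, assuming both operands are 64-bit unsigned values. The function names are hypothetical, and the guarded variant is only one possible way to clamp; it is not necessarily what the pull request does.

```cpp
#include <cstdint>

// Same shape as the quoted expression: the difference is computed in
// unsigned arithmetic, so it has already wrapped before it is compared
// with 0, and a max-with-zero can never clamp it.
constexpr std::uint64_t free_blocks_unguarded(std::uint64_t blocks,
                                              std::uint64_t used_blocks) {
  return (blocks - used_blocks) > 0 ? (blocks - used_blocks) : 0;
}

// Hypothetical guarded variant: compare first, subtract only when the
// result cannot go below zero.
constexpr std::uint64_t free_blocks_clamped(std::uint64_t blocks,
                                            std::uint64_t used_blocks) {
  return blocks > used_blocks ? blocks - used_blocks : 0;
}

// With used > total, the unguarded form yields 2^64 - 494247.
static_assert(free_blocks_unguarded(1000000, 1494247) == UINT64_MAX - 494246,
              "wraps instead of clamping");
static_assert(free_blocks_clamped(1000000, 1494247) == 0, "clamps to zero");
```

Comparing before subtracting avoids any signed cast, so the result stays correct for the full unsigned range.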

				Xiaoxi


* RE: Memstore issue on v0.91
From: Blinick, Stephen L @ 2015-01-22 23:25 UTC
  To: Chen, Xiaoxi, mnelson, Ceph Development

That fix looks good, and I verified that it works.  When I originally found that bug I fixed it, but the fix appeared not to work, so I debugged a bit further.  The reason was that rpmbuild untars the archive in /SOURCES and builds from that, so my change didn't get picked up.  Too much automation :)

Mark, if you're still doing the MemStore testing, you can use this fix, but you also need to specify a large enough memstore "device", i.e. over 4GB if you're going to be writing 4K objects at 20K IOPS for 60 seconds.
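
The sizing figure can be checked with a short calculation. This sketch takes 4K to mean 4096 bytes and 20K IOPS to mean 20000 writes per second (both assumptions), and it ignores any per-object overhead; the helper name is hypothetical.

```cpp
#include <cstdint>

// Hypothetical helper: payload bytes written by a fixed-rate workload of
// equally sized objects, with no allowance for metadata or overhead.
constexpr std::uint64_t bytes_written(std::uint64_t seconds,
                                      std::uint64_t ops_per_second,
                                      std::uint64_t object_bytes) {
  return seconds * ops_per_second * object_bytes;
}

// 60 seconds at 20000 writes/s with 4096-byte objects.
static_assert(bytes_written(60, 20000, 4096) == 4915200000ULL,
              "roughly 4.9 GB, about 4.6 GiB");
```

That total is above 4GB, which matches the suggested minimum for the memstore device.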

Thanks,

Stephen



* Re: RE: Memstore issue on v0.91
From: Chen, Xiaoxi @ 2015-01-23  0:41 UTC
  To: mnelson, ceph-devel, Blinick, Stephen L

I would prefer to just set memstore_device_size to a huge constant, say 1000G, so that it will not cause any trouble 😊



