* Latency Improvement Report for ShardedOpWQ
@ 2014-09-28  3:45 Dong Yuan
  2014-09-28  5:46 ` Somnath Roy
  0 siblings, 1 reply; 6+ messages in thread
From: Dong Yuan @ 2014-09-28  3:45 UTC (permalink / raw)
  To: ceph-devel

===== Test Purpose =====

Measure whether, and by how much, the Sharded OpWQ outperforms the
traditional OpWQ in a random-write scenario.

===== Test Case =====

4K object WriteFull operations, repeated 10,000 (1w) times.

===== Test Method =====

Insert the following static probes into the code when running the
tests, to measure the time span between enqueue and dequeue in the OpWQ.

Start: PG::enqueue_op, just before the osd->op_wq enqueue call
End: entry of OSD::dequeue_op
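
A minimal sketch of what such a probe pair could look like (illustrative
only, not the instrumentation actually used for these numbers; the struct
and helper names below are hypothetical):

#include <chrono>
#include <cstdio>

using Clock = std::chrono::steady_clock;

struct OpTiming {
  Clock::time_point wq_enqueue_stamp;  // taken just before the op_wq enqueue
};

// "Start" probe: called from PG::enqueue_op right before enqueueing.
inline void probe_enqueue(OpTiming &t) {
  t.wq_enqueue_stamp = Clock::now();
}

// "End" probe: called at the entry of OSD::dequeue_op.
inline void probe_dequeue(const OpTiming &t) {
  auto span = std::chrono::duration_cast<std::chrono::microseconds>(
      Clock::now() - t.wq_enqueue_stamp);
  std::printf("OpWQ queue latency: %lld us\n",
              static_cast<long long>(span.count()));
}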

===== Test Result =====

Traditional OpWQ: 109 us (avg), 40 us (min)
ShardedOpWQ: 97 us (avg), 32 us (min)

===== Test Conclusion =====

No remarkable improvement in latency.


-- 
Dong Yuan
Email:yuandong1222@gmail.com


* RE: Latency Improvement Report for ShardedOpWQ
  2014-09-28  3:45 Latency Improvement Report for ShardedOpWQ Dong Yuan
@ 2014-09-28  5:46 ` Somnath Roy
  2014-09-28  7:19   ` Dong Yuan
  0 siblings, 1 reply; 6+ messages in thread
From: Somnath Roy @ 2014-09-28  5:46 UTC (permalink / raw)
  To: Dong Yuan, ceph-devel

Hi Dong,
I don't think there is much benefit in a single-client scenario; a single client is itself a limitation. The benefit of the sharded TP is that a single OSD scales much better as the number of clients increases, because it increases parallelism (by reducing lock contention) at the filestore level. A quick check could look like this:

1. Create a single-node, single-OSD cluster and apply load with an increasing number of clients, e.g. 1, 3, 5, 8, 10. A small workload served from memory should be ideal.
2. Compare the code with the sharded TP against, say, firefly. You should see that firefly does not scale with an increasing number of clients.
3. Try top -H in the two cases; you should see more threads working in parallel with the sharded TP than with firefly.

Also, I am sure this latency result will not hold under high load; there you should see more contention and, as a result, more latency.

Thanks & Regards
Somnath

-----Original Message-----
From: ceph-devel-owner@vger.kernel.org [mailto:ceph-devel-owner@vger.kernel.org] On Behalf Of Dong Yuan
Sent: Saturday, September 27, 2014 8:45 PM
To: ceph-devel
Subject: Latency Improvement Report for ShardedOpWQ

===== Test Purpose =====

Measure whether, and by how much, the Sharded OpWQ outperforms the traditional OpWQ in a random-write scenario.

===== Test Case =====

4K object WriteFull operations, repeated 10,000 (1w) times.

===== Test Method =====

Insert the following static probes into the code when running the tests, to measure the time span between enqueue and dequeue in the OpWQ.

Start: PG::enqueue_op, just before the osd->op_wq enqueue call
End: entry of OSD::dequeue_op

===== Test Result =====

Traditional OpWQ: 109 us (avg), 40 us (min)
ShardedOpWQ: 97 us (avg), 32 us (min)

===== Test Conclusion =====

No remarkable improvement in latency.


--
Dong Yuan
Email:yuandong1222@gmail.com

* Re: Latency Improvement Report for ShardedOpWQ
  2014-09-28  5:46 ` Somnath Roy
@ 2014-09-28  7:19   ` Dong Yuan
  2014-09-28  9:01     ` Somnath Roy
  0 siblings, 1 reply; 6+ messages in thread
From: Dong Yuan @ 2014-09-28  7:19 UTC (permalink / raw)
  To: Somnath Roy; +Cc: ceph-devel

Hi Somnath,

I totally agree with you.

I read the code for the sharded TP and the new OSD OpWQ. In the new
implementation there is no longer a single lock for all PGs; instead,
each lock covers a subset of PGs (am I right?). That is very useful
for reducing lock contention and thus increasing parallelism. It is
awesome work!
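
A minimal sketch of the sharding idea as I understand it (this is only
my illustration, not the actual ShardedOpWQ code): PGs are hashed onto a
fixed number of shards, and each shard has its own lock and queue, so
worker threads on different shards no longer contend on one global mutex.

#include <cstddef>
#include <cstdint>
#include <deque>
#include <mutex>
#include <utility>
#include <vector>

struct ShardedQueueSketch {
  struct Shard {
    std::mutex lock;                              // per-shard lock, not global
    std::deque<std::pair<uint64_t, void*>> ops;   // (pg_id, op) pairs
  };
  std::vector<Shard> shards;

  explicit ShardedQueueSketch(std::size_t n) : shards(n) {}

  void enqueue(uint64_t pg_id, void *op) {
    Shard &s = shards[pg_id % shards.size()];     // PG -> shard mapping
    std::lock_guard<std::mutex> g(s.lock);        // contention only within shard
    s.ops.emplace_back(pg_id, op);
  }
};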

While I am working on the latency of a single IO (mainly 4K random
writes), I noticed that the OpWQ spends about 100+ us transferring an
IO from the msg dispatcher to an OpWQ worker thread. Do you have any
ideas for reducing this time span?

Thanks for your help.
Dong.

On 28 September 2014 13:46, Somnath Roy <Somnath.Roy@sandisk.com> wrote:
> Hi Dong,
> I don't think there is much benefit in a single-client scenario; a single client is itself a limitation. The benefit of the sharded TP is that a single OSD scales much better as the number of clients increases, because it increases parallelism (by reducing lock contention) at the filestore level. A quick check could look like this:
>
> 1. Create a single-node, single-OSD cluster and apply load with an increasing number of clients, e.g. 1, 3, 5, 8, 10. A small workload served from memory should be ideal.
> 2. Compare the code with the sharded TP against, say, firefly. You should see that firefly does not scale with an increasing number of clients.
> 3. Try top -H in the two cases; you should see more threads working in parallel with the sharded TP than with firefly.
>
> Also, I am sure this latency result will not hold under high load; there you should see more contention and, as a result, more latency.
>
> Thanks & Regards
> Somnath
>
> -----Original Message-----
> From: ceph-devel-owner@vger.kernel.org [mailto:ceph-devel-owner@vger.kernel.org] On Behalf Of Dong Yuan
> Sent: Saturday, September 27, 2014 8:45 PM
> To: ceph-devel
> Subject: Latency Improvement Report for ShardedOpWQ
>
> ===== Test Purpose =====
>
> Measure whether, and by how much, the Sharded OpWQ outperforms the traditional OpWQ in a random-write scenario.
>
> ===== Test Case =====
>
> 4K object WriteFull operations, repeated 10,000 (1w) times.
>
> ===== Test Method =====
>
> Insert the following static probes into the code when running the tests, to measure the time span between enqueue and dequeue in the OpWQ.
>
> Start: PG::enqueue_op, just before the osd->op_wq enqueue call
> End: entry of OSD::dequeue_op
>
> ===== Test Result =====
>
> Traditional OpWQ: 109 us (avg), 40 us (min)
> ShardedOpWQ: 97 us (avg), 32 us (min)
>
> ===== Test Conclusion =====
>
> No remarkable improvement in latency.
>
>
> --
> Dong Yuan
> Email:yuandong1222@gmail.com



-- 
Dong Yuan
Email:yuandong1222@gmail.com


* RE: Latency Improvement Report for ShardedOpWQ
  2014-09-28  7:19   ` Dong Yuan
@ 2014-09-28  9:01     ` Somnath Roy
  2014-09-28  9:18       ` Ma, Jianpeng
  0 siblings, 1 reply; 6+ messages in thread
From: Somnath Roy @ 2014-09-28  9:01 UTC (permalink / raw)
  To: Dong Yuan; +Cc: ceph-devel

Dong,
This may mostly be because of lock contention.
You can tweak the number of shards in the sharded WQ to see whether it improves this number.
There is still one global lock: it protects pg_for_processing(), and we can't get rid of it since we need to maintain op order within a PG. This could be increasing latency as well. I would suggest measuring this number at different stages within ShardedOpWQ::_process(), e.g. after dequeuing from the pqueue and after taking the PG lock and popping the ops from pg_for_processing().
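
To make that concrete, here is a rough sketch of the kind of stage
timestamps I mean (the names below are mine for illustration, not actual
Ceph symbols):

#include <chrono>
#include <cstdio>

using Clock = std::chrono::steady_clock;

struct StageStamps {
  Clock::time_point enqueued;                  // op entered the shard queue
  Clock::time_point dequeued_from_pqueue;      // worker popped it from the pqueue
  Clock::time_point got_pg_lock;               // pg->lock() acquired
  Clock::time_point popped_pg_for_processing;  // op taken from pg_for_processing
};

inline void report(const StageStamps &s) {
  auto us = [](Clock::duration d) {
    return static_cast<long long>(
        std::chrono::duration_cast<std::chrono::microseconds>(d).count());
  };
  std::printf("queue wait: %lld us, pg lock wait: %lld us, pop: %lld us\n",
              us(s.dequeued_from_pqueue - s.enqueued),
              us(s.got_pg_lock - s.dequeued_from_pqueue),
              us(s.popped_pg_for_processing - s.got_pg_lock));
}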

Also, keep in mind that context switches are happening, and these can be expensive depending on data copies etc. It may be worth repeating the experiment with the OSD pinned to actual physical cores.
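
For the pinning experiment, one generic way to pin a thread on Linux
(glibc) is shown below; running the whole OSD under taskset is an
external alternative. This is just a sketch, not a Ceph configuration
option:

#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <pthread.h>
#include <sched.h>
#include <cstdio>

// Pin the calling thread to one logical CPU; returns 0 on success.
static int pin_current_thread_to_core(int core) {
  cpu_set_t set;
  CPU_ZERO(&set);
  CPU_SET(core, &set);
  int rc = pthread_setaffinity_np(pthread_self(), sizeof(set), &set);
  if (rc != 0)
    std::fprintf(stderr, "pthread_setaffinity_np failed: %d\n", rc);
  return rc;
}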

Thanks & Regards
Somnath

-----Original Message-----
From: Dong Yuan [mailto:yuandong1222@gmail.com] 
Sent: Sunday, September 28, 2014 12:19 AM
To: Somnath Roy
Cc: ceph-devel
Subject: Re: Latency Improvement Report for ShardedOpWQ

Hi Somnath,

I totally agree with you.

I read the code for the sharded TP and the new OSD OpWQ. In the new
implementation there is no longer a single lock for all PGs; instead,
each lock covers a subset of PGs (am I right?). That is very useful
for reducing lock contention and thus increasing parallelism. It is
awesome work!

While I am working on the latency of a single IO (mainly 4K random writes), I noticed that the OpWQ spends about 100+ us transferring an IO from the msg dispatcher to an OpWQ worker thread. Do you have any ideas for reducing this time span?

Thanks for your help.
Dong.

On 28 September 2014 13:46, Somnath Roy <Somnath.Roy@sandisk.com> wrote:
> Hi Dong,
> I don't think there is much benefit in a single-client scenario; a single client is itself a limitation. The benefit of the sharded TP is that a single OSD scales much better as the number of clients increases, because it increases parallelism (by reducing lock contention) at the filestore level. A quick check could look like this:
>
> 1. Create a single-node, single-OSD cluster and apply load with an increasing number of clients, e.g. 1, 3, 5, 8, 10. A small workload served from memory should be ideal.
> 2. Compare the code with the sharded TP against, say, firefly. You should see that firefly does not scale with an increasing number of clients.
> 3. Try top -H in the two cases; you should see more threads working in parallel with the sharded TP than with firefly.
>
> Also, I am sure this latency result will not hold under high load; there you should see more contention and, as a result, more latency.
>
> Thanks & Regards
> Somnath
>
> -----Original Message-----
> From: ceph-devel-owner@vger.kernel.org 
> [mailto:ceph-devel-owner@vger.kernel.org] On Behalf Of Dong Yuan
> Sent: Saturday, September 27, 2014 8:45 PM
> To: ceph-devel
> Subject: Latency Improvement Report for ShardedOpWQ
>
> ===== Test Purpose =====
>
> Measure whether, and by how much, the Sharded OpWQ outperforms the traditional OpWQ in a random-write scenario.
>
> ===== Test Case =====
>
> 4K object WriteFull operations, repeated 10,000 (1w) times.
>
> ===== Test Method =====
>
> Insert the following static probes into the code when running the tests, to measure the time span between enqueue and dequeue in the OpWQ.
>
> Start: PG::enqueue_op, just before the osd->op_wq enqueue call
> End: entry of OSD::dequeue_op
>
> ===== Test Result =====
>
> Traditional OpWQ: 109 us (avg), 40 us (min)
> ShardedOpWQ: 97 us (avg), 32 us (min)
>
> ===== Test Conclusion =====
>
> No remarkable improvement in latency.
>
>
> --
> Dong Yuan
> Email:yuandong1222@gmail.com



--
Dong Yuan
Email:yuandong1222@gmail.com


* RE: Latency Improvement Report for ShardedOpWQ
  2014-09-28  9:01     ` Somnath Roy
@ 2014-09-28  9:18       ` Ma, Jianpeng
  2014-09-28  9:44         ` Somnath Roy
  0 siblings, 1 reply; 6+ messages in thread
From: Ma, Jianpeng @ 2014-09-28  9:18 UTC (permalink / raw)
  To: Somnath Roy, Dong Yuan; +Cc: ceph-devel

Hi Somnath:
You mentioned: there is still one global lock; it protects pg_for_processing(), and we can't get rid of it since we need to maintain op order within a PG.

But for most object operations we only need to maintain ordering per object. Why do we need to maintain op order within a PG?
Can you explain in detail?

> -----Original Message-----
> From: ceph-devel-owner@vger.kernel.org
> [mailto:ceph-devel-owner@vger.kernel.org] On Behalf Of Somnath Roy
> Sent: Sunday, September 28, 2014 5:02 PM
> To: Dong Yuan
> Cc: ceph-devel
> Subject: RE: Latency Improvement Report for ShardedOpWQ
> 
> Dong,
> This may mostly be because of lock contention.
> You can tweak the number of shards in the sharded WQ to see whether it is
> improving this number.
> There is still one global lock: it protects pg_for_processing(), and we can't
> get rid of it since we need to maintain op order within a PG. This could be
> increasing latency as well. I would suggest measuring this number at different
> stages within ShardedOpWQ::_process(), e.g. after dequeuing from the pqueue
> and after taking the PG lock and popping the ops from pg_for_processing().
> 
> Also, keep in mind that context switches are happening, and these can be
> expensive depending on data copies etc. It may be worth repeating the
> experiment with the OSD pinned to actual physical cores.
> 
> Thanks & Regards
> Somnath
> 
> -----Original Message-----
> From: Dong Yuan [mailto:yuandong1222@gmail.com]
> Sent: Sunday, September 28, 2014 12:19 AM
> To: Somnath Roy
> Cc: ceph-devel
> Subject: Re: Latency Improvement Report for ShardedOpWQ
> 
> Hi Somnath,
> 
> I totally agree with you.
> 
> I read the code for the sharded TP and the new OSD OpWQ. In the new
> implementation there is no longer a single lock for all PGs; instead, each
> lock covers a subset of PGs (am I right?). That is very useful for reducing
> lock contention and thus increasing parallelism. It is awesome work!
> 
> While I am working on the latency of a single IO (mainly 4K random writes),
> I noticed that the OpWQ spends about 100+ us transferring an IO from the msg
> dispatcher to an OpWQ worker thread. Do you have any ideas for reducing this
> time span?
> 
> Thanks for your help.
> Dong.
> 
> On 28 September 2014 13:46, Somnath Roy <Somnath.Roy@sandisk.com>
> wrote:
> > Hi Dong,
> > I don't think there is much benefit in a single-client scenario; a single
> > client is itself a limitation. The benefit of the sharded TP is that a
> > single OSD scales much better as the number of clients increases, because
> > it increases parallelism (by reducing lock contention) at the filestore
> > level. A quick check could look like this:
> >
> > 1. Create a single-node, single-OSD cluster and apply load with an
> > increasing number of clients, e.g. 1, 3, 5, 8, 10. A small workload served
> > from memory should be ideal.
> > 2. Compare the code with the sharded TP against, say, firefly. You should
> > see that firefly does not scale with an increasing number of clients.
> > 3. Try top -H in the two cases; you should see more threads working in
> > parallel with the sharded TP than with firefly.
> >
> > Also, I am sure this latency result will not hold under high load; there
> > you should see more contention and, as a result, more latency.
> >
> > Thanks & Regards
> > Somnath
> >
> > -----Original Message-----
> > From: ceph-devel-owner@vger.kernel.org
> > [mailto:ceph-devel-owner@vger.kernel.org] On Behalf Of Dong Yuan
> > Sent: Saturday, September 27, 2014 8:45 PM
> > To: ceph-devel
> > Subject: Latency Improvement Report for ShardedOpWQ
> >
> > ===== Test Purpose =====
> >
> > Measure whether, and by how much, the Sharded OpWQ outperforms the
> > traditional OpWQ in a random-write scenario.
> >
> > ===== Test Case =====
> >
> > 4K object WriteFull operations, repeated 10,000 (1w) times.
> >
> > ===== Test Method =====
> >
> > Insert the following static probes into the code when running the tests,
> > to measure the time span between enqueue and dequeue in the OpWQ.
> >
> > Start: PG::enqueue_op, just before the osd->op_wq enqueue call
> > End: entry of OSD::dequeue_op
> >
> > ===== Test Result =====
> >
> > Traditional OpWQ: 109 us (avg), 40 us (min)
> > ShardedOpWQ: 97 us (avg), 32 us (min)
> >
> > ===== Test Conclusion =====
> >
> > No remarkable improvement in latency.
> >
> >
> > --
> > Dong Yuan
> > Email:yuandong1222@gmail.com
> 
> 
> 
> --
> Dong Yuan
> Email:yuandong1222@gmail.com


* RE: Latency Improvement Report for ShardedOpWQ
  2014-09-28  9:18       ` Ma, Jianpeng
@ 2014-09-28  9:44         ` Somnath Roy
  0 siblings, 0 replies; 6+ messages in thread
From: Somnath Roy @ 2014-09-28  9:44 UTC (permalink / raw)
  To: Ma, Jianpeng, Dong Yuan; +Cc: ceph-devel

Actually, I was wrong! I have sharded that pg_for_processing lock as well. But it is still worth capturing the timings at the stages I mentioned.

<< But for most object operations we only need to maintain ordering per object. Why do we need to maintain op order within a PG?

Yes, but the OSD (the op queue part) still needs to keep these ops in order within a PG, since they are handed off across a context switch. One solution would be to hold the lock guarding the queue until pg->lock() is acquired. But that would cause a deadlock in the Ceph code, since an op can be requeued while pg->lock() is held! So we need to release the queue lock before acquiring pg->lock(), and this can break the order of ops.
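
A toy illustration of that locking constraint (the names are mine, this
is not Ceph code): the worker has to drop the queue lock before taking
pg->lock(), because the requeue path already holds pg->lock() when it
takes the queue lock again; keeping both would invert the lock order and
deadlock. The moment the queue lock is dropped, another worker can
overtake the op, which is what threatens per-PG ordering.

#include <mutex>

std::mutex queue_lock;  // stands in for the (sharded) op queue lock
std::mutex pg_lock;     // stands in for pg->lock()

void worker_dequeue_path() {
  std::unique_lock<std::mutex> q(queue_lock);
  // ... pop an op for some PG off the queue ...
  q.unlock();  // must drop before pg_lock to avoid deadlock with the requeue
               // path below, but another worker can now overtake this op
  std::lock_guard<std::mutex> p(pg_lock);
  // ... process the op; per-PG ordering now relies on extra bookkeeping ...
}

void requeue_path() {
  std::lock_guard<std::mutex> p(pg_lock);    // this path already holds pg->lock()
  std::lock_guard<std::mutex> q(queue_lock); // and then queues the op again
}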

Thanks & Regards
Somnath

-----Original Message-----
From: Ma, Jianpeng [mailto:jianpeng.ma@intel.com]
Sent: Sunday, September 28, 2014 2:19 AM
To: Somnath Roy; Dong Yuan
Cc: ceph-devel
Subject: RE: Latency Improvement Report for ShardedOpWQ

Hi Somnath:
You mentioned: there is still one global lock; it protects pg_for_processing(), and we can't get rid of it since we need to maintain op order within a PG.

But for most object operations we only need to maintain ordering per object. Why do we need to maintain op order within a PG?
Can you explain in detail?

> -----Original Message-----
> From: ceph-devel-owner@vger.kernel.org
> [mailto:ceph-devel-owner@vger.kernel.org] On Behalf Of Somnath Roy
> Sent: Sunday, September 28, 2014 5:02 PM
> To: Dong Yuan
> Cc: ceph-devel
> Subject: RE: Latency Improvement Report for ShardedOpWQ
>
> Dong,
> This may mostly be because of lock contention.
> You can tweak the number of shards in the sharded WQ to see whether it
> improves this number.
> There is still one global lock: it protects pg_for_processing(), and we
> can't get rid of it since we need to maintain op order within a PG. This
> could be increasing latency as well. I would suggest measuring this number
> at different stages within ShardedOpWQ::_process(), e.g. after dequeuing
> from the pqueue and after taking the PG lock and popping the ops from
> pg_for_processing().
>
> Also, keep in mind that context switches are happening, and these can be
> expensive depending on data copies etc. It may be worth repeating the
> experiment with the OSD pinned to actual physical cores.
>
> Thanks & Regards
> Somnath
>
> -----Original Message-----
> From: Dong Yuan [mailto:yuandong1222@gmail.com]
> Sent: Sunday, September 28, 2014 12:19 AM
> To: Somnath Roy
> Cc: ceph-devel
> Subject: Re: Latency Improvement Report for ShardedOpWQ
>
> Hi Somnath,
>
> I totally agree with you.
>
> I read the code for the sharded TP and the new OSD OpWQ. In the new
> implementation there is no longer a single lock for all PGs; instead,
> each lock covers a subset of PGs (am I right?). That is very useful for
> reducing lock contention and thus increasing parallelism. It is awesome work!
>
> While I am working on the latency of a single IO (mainly 4K random
> writes), I noticed that the OpWQ spends about 100+ us transferring an IO
> from the msg dispatcher to an OpWQ worker thread. Do you have any ideas
> for reducing this time span?
>
> Thanks for your help.
> Dong.
>
> On 28 September 2014 13:46, Somnath Roy <Somnath.Roy@sandisk.com>
> wrote:
> > Hi Dong,
> > I don't think there is much benefit in a single-client scenario; a
> > single client is itself a limitation. The benefit of the sharded TP is
> > that a single OSD scales much better as the number of clients increases,
> > because it increases parallelism (by reducing lock contention) at the
> > filestore level. A quick check could look like this:
> >
> > 1. Create a single-node, single-OSD cluster and apply load with an
> > increasing number of clients, e.g. 1, 3, 5, 8, 10. A small workload
> > served from memory should be ideal.
> > 2. Compare the code with the sharded TP against, say, firefly. You
> > should see that firefly does not scale with an increasing number of
> > clients.
> > 3. Try top -H in the two cases; you should see more threads working in
> > parallel with the sharded TP than with firefly.
> >
> > Also, I am sure this latency result will not hold under high load;
> > there you should see more contention and, as a result, more latency.
> >
> > Thanks & Regards
> > Somnath
> >
> > -----Original Message-----
> > From: ceph-devel-owner@vger.kernel.org
> > [mailto:ceph-devel-owner@vger.kernel.org] On Behalf Of Dong Yuan
> > Sent: Saturday, September 27, 2014 8:45 PM
> > To: ceph-devel
> > Subject: Latency Improvement Report for ShardedOpWQ
> >
> > ===== Test Purpose =====
> >
> > Measure whether, and by how much, the Sharded OpWQ outperforms the
> > traditional OpWQ in a random-write scenario.
> >
> > ===== Test Case =====
> >
> > 4K object WriteFull operations, repeated 10,000 (1w) times.
> >
> > ===== Test Method =====
> >
> > Insert the following static probes into the code when running the
> > tests, to measure the time span between enqueue and dequeue in the OpWQ.
> >
> > Start: PG::enqueue_op, just before the osd->op_wq enqueue call
> > End: entry of OSD::dequeue_op
> >
> > ===== Test Result =====
> >
> > Traditional OpWQ: 109 us (avg), 40 us (min)
> > ShardedOpWQ: 97 us (avg), 32 us (min)
> >
> > ===== Test Conclusion =====
> >
> > No remarkable improvement in latency.
> >
> >
> > --
> > Dong Yuan
> > Email:yuandong1222@gmail.com
>
>
>
> --
> Dong Yuan
> Email:yuandong1222@gmail.com



end of thread

Thread overview: 6 messages
2014-09-28  3:45 Latency Improvement Report for ShardedOpWQ Dong Yuan
2014-09-28  5:46 ` Somnath Roy
2014-09-28  7:19   ` Dong Yuan
2014-09-28  9:01     ` Somnath Roy
2014-09-28  9:18       ` Ma, Jianpeng
2014-09-28  9:44         ` Somnath Roy
