* reduce networking latency
@ 2014-06-10 15:29 David Xu
  0 siblings, 0 replies; 6+ messages in thread
From: David Xu @ 2014-06-10 15:29 UTC
  To: kvm

Hi All,

I found this interesting project from KVM TODO website:

allow handling short packets from softirq or VCPU context
 Plan:
   We are going through the scheduler 3 times
   (could be up to 5 if softirqd is involved)
   Consider RX: host irq -> io thread -> VCPU thread ->
   guest irq -> guest thread.
   This adds a lot of latency.
   We can cut it by some 1.5x if we do a bit of work
   either in the VCPU or softirq context.
 Testing: netperf TCP RR - should be improved drastically
          netperf TCP STREAM guest to host - no regression
 Developer: MST
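
(The three scheduler passes are presumably the three wake-ups on the RX
path above, and the 1.5x figure roughly matches cutting three passes
down to two, since 3/2 = 1.5. Below is a minimal sketch of the current
handoff chain; the helper names are hypothetical, not actual vhost-net
code, and this is not compilable standalone.)

    /* Sketch of today's RX handoff through drivers/vhost; each
     * wake-up below is one pass through a scheduler. */
    static irqreturn_t host_nic_irq(int irq, void *dev)
    {
        /* pass 1: host irq wakes the vhost io thread */
        wake_up_process(vhost_worker_task);
        return IRQ_HANDLED;
    }

    static void vhost_io_thread_work(struct vhost_virtqueue *vq)
    {
        copy_rx_skb_to_vring(vq);        /* hypothetical helper */
        /* pass 2: signal the irqfd; KVM injects a guest irq,
         * waking the VCPU thread if it was halted */
        eventfd_signal(vq->call_ctx, 1);
    }

    /* pass 3: in the guest, the virtio-net irq handler wakes the
     * guest thread blocked in recv(). */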

 I am also tuning the vCPU scheduling of KVM. If someone could share
some details about the work done either in the vCPU or softirq
context, I would appreciate it very much. BTW, how were the evaluation
results obtained showing that this shortcut can improve performance by
up to 1.5x?
Thanks a lot!

Regards,
Cong


* Re: reduce networking latency
  2014-10-22 20:35     ` David Xu
@ 2014-10-23  4:15       ` Michael S. Tsirkin
  0 siblings, 0 replies; 6+ messages in thread
From: Michael S. Tsirkin @ 2014-10-23  4:15 UTC
  To: David Xu; +Cc: kvm

On Wed, Oct 22, 2014 at 04:35:07PM -0400, David Xu wrote:
> 2014-10-15 14:30 GMT-04:00 David Xu <davidxu06@gmail.com>:
> > 2014-09-29 5:04 GMT-04:00 Michael S. Tsirkin <mst@redhat.com>:
> >> On Wed, Sep 24, 2014 at 02:40:53PM -0400, David Xu wrote:
> >>> Hi Michael,
> >>>
> >>> I found this interesting project from KVM TODO website:
> >>>
> >>> allow handling short packets from softirq or VCPU context
> >>>  Plan:
> >>>    We are going through the scheduler 3 times
> >>>    (could be up to 5 if softirqd is involved)
> >>>    Consider RX: host irq -> io thread -> VCPU thread ->
> >>>    guest irq -> guest thread.
> >>>    This adds a lot of latency.
> >>>    We can cut it by some 1.5x if we do a bit of work
> >>>    either in the VCPU or softirq context.
> >>>  Testing: netperf TCP RR - should be improved drastically
> >>>           netperf TCP STREAM guest to host - no regression
> >>>
> >>> Would you mind saying more about the work either in the vCPU or
> >>> softirq context?
> >>
> >> For TX, we might be able to execute it directly from VCPU context.
> >> For RX, from softirq context.
> >>
> 
> Do you mean that for RX, we put data directly into a shared buffer
> accessed by the guest VM, bypassing the IO thread? And that for TX,
> network data is added to the shared buffer in vCPU context, and the
> host IRQ is kicked to send it?

Yes, that's the idea.
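
A rough sketch of the TX half of that idea (the threshold and helper
names here are hypothetical, not actual vhost code): the kick handler,
running in VCPU context, transmits short packets inline and falls back
to the io thread otherwise.

    static void handle_tx_kick_inline(struct vhost_virtqueue *vq)
    {
        /* hypothetical helper: pull the next TX buffer off the vring */
        struct sk_buff *skb = fetch_tx_skb_from_vring(vq);

        if (skb && skb->len <= SHORT_PKT_THRESHOLD)
            dev_queue_xmit(skb);         /* send directly in VCPU context */
        else
            vhost_poll_queue(&vq->poll); /* long packet: defer to io thread */
    }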

> >>> Why is it only for short packet handling?
> >>
> >> That's just one way to avoid doing too much work in these contexts.
> >>
> >> Doing too much work in VCPU context would break pipelining,
> >> likely degrading stream performance.
> >> Work in softirq context is not accounted against the correct
> >> cgroups, so doing a lot of work there would mean a guest can
> >> steal CPU from other guests.
> >>
> >>> Thanks a
> >>> lot!
> >>>
> >>>
> >>> Regards,
> >>>
> >>> Cong


* Re: reduce networking latency
  2014-10-15 18:30   ` David Xu
@ 2014-10-22 20:35     ` David Xu
  2014-10-23  4:15       ` Michael S. Tsirkin
  0 siblings, 1 reply; 6+ messages in thread
From: David Xu @ 2014-10-22 20:35 UTC
  To: Michael S. Tsirkin; +Cc: kvm

2014-10-15 14:30 GMT-04:00 David Xu <davidxu06@gmail.com>:
> 2014-09-29 5:04 GMT-04:00 Michael S. Tsirkin <mst@redhat.com>:
>> On Wed, Sep 24, 2014 at 02:40:53PM -0400, David Xu wrote:
>>> Hi Michael,
>>>
>>> I found this interesting project from KVM TODO website:
>>>
>>> allow handling short packets from softirq or VCPU context
>>>  Plan:
>>>    We are going through the scheduler 3 times
>>>    (could be up to 5 if softirqd is involved)
>>>    Consider RX: host irq -> io thread -> VCPU thread ->
>>>    guest irq -> guest thread.
>>>    This adds a lot of latency.
>>>    We can cut it by some 1.5x if we do a bit of work
>>>    either in the VCPU or softirq context.
>>>  Testing: netperf TCP RR - should be improved drastically
>>>           netperf TCP STREAM guest to host - no regression
>>>
>>> Would you mind saying more about the work either in the vCPU or
>>> softirq context?
>>
>> For TX, we might be able to execute it directly from VCPU context.
>> For RX, from softirq context.
>>

Do you mean that for RX, we put data directly into a shared buffer
accessed by the guest VM, bypassing the IO thread? And that for TX,
network data is added to the shared buffer in vCPU context, and the
host IRQ is kicked to send it?

>>> Why is it only for short packet handling?
>>
>> That's just one way to avoid doing too much work in these contexts.
>>
>> Doing too much work in VCPU context would break pipelining,
>> likely degrading stream performance.
>> Work in softirq context is not accounted against the correct
>> cgroups, so doing a lot of work there would mean a guest can
>> steal CPU from other guests.
>>
>>> Thanks a
>>> lot!
>>>
>>>
>>> Regards,
>>>
>>> Cong


* Re: reduce networking latency
  2014-09-29  9:04 ` Michael S. Tsirkin
@ 2014-10-15 18:30   ` David Xu
  2014-10-22 20:35     ` David Xu
  0 siblings, 1 reply; 6+ messages in thread
From: David Xu @ 2014-10-15 18:30 UTC
  To: Michael S. Tsirkin; +Cc: kvm

2014-09-29 5:04 GMT-04:00 Michael S. Tsirkin <mst@redhat.com>:
> On Wed, Sep 24, 2014 at 02:40:53PM -0400, David Xu wrote:
>> Hi Michael,
>>
>> I found this interesting project from KVM TODO website:
>>
>> allow handling short packets from softirq or VCPU context
>>  Plan:
>>    We are going through the scheduler 3 times
>>    (could be up to 5 if softirqd is involved)
>>    Consider RX: host irq -> io thread -> VCPU thread ->
>>    guest irq -> guest thread.
>>    This adds a lot of latency.
>>    We can cut it by some 1.5x if we do a bit of work
>>    either in the VCPU or softirq context.
>>  Testing: netperf TCP RR - should be improved drastically
>>           netperf TCP STREAM guest to host - no regression
>>
>> Would you mind saying more about the work either in the vCPU or
>> softirq context?
>
> For TX, we might be able to execute it directly from VCPU context.
> For RX, from softirq context.
>

Which step is removed for TX and RX compared with vanilla? Or what's
the new path?

TX: guest thread -> host irq?
RX: host irq -> ?

Thanks.

>> Why is it only for short packet handling?
>
> That's just one way to avoid doing too much work in these contexts.
>
> Doing too much work in VCPU context would break pipelining,
> likely degrading stream performance.
> Work in softirq context is not accounted against the correct
> cgroups, so doing a lot of work there would mean a guest can
> steal CPU from other guests.
>
>> Thanks a
>> lot!
>>
>>
>> Regards,
>>
>> Cong


* Re: reduce networking latency
  2014-09-24 18:40 David Xu
@ 2014-09-29  9:04 ` Michael S. Tsirkin
  2014-10-15 18:30   ` David Xu
  0 siblings, 1 reply; 6+ messages in thread
From: Michael S. Tsirkin @ 2014-09-29  9:04 UTC
  To: David Xu; +Cc: kvm

On Wed, Sep 24, 2014 at 02:40:53PM -0400, David Xu wrote:
> Hi Michael,
> 
> I found this interesting project from KVM TODO website:
> 
> allow handling short packets from softirq or VCPU context
>  Plan:
>    We are going through the scheduler 3 times
>    (could be up to 5 if softirqd is involved)
>    Consider RX: host irq -> io thread -> VCPU thread ->
>    guest irq -> guest thread.
>    This adds a lot of latency.
>    We can cut it by some 1.5x if we do a bit of work
>    either in the VCPU or softirq context.
>  Testing: netperf TCP RR - should be improved drastically
>           netperf TCP STREAM guest to host - no regression
> 
> Would you mind saying more about the work either in the vCPU or
> softirq context?

For TX, we might be able to execute it directly from VCPU context.
For RX, from softirq context.
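
A minimal sketch of what the RX side might look like (the threshold
and copy helper are hypothetical, not actual vhost code): complete
short packets directly in softirq context, skipping the io thread
wake-up, and keep the existing path for everything else.

    static void vhost_net_rx_from_softirq(struct vhost_virtqueue *vq,
                                          struct sk_buff *skb)
    {
        if (skb->len <= SHORT_PKT_THRESHOLD) {
            copy_rx_skb_to_vring(vq, skb);   /* hypothetical helper */
            eventfd_signal(vq->call_ctx, 1); /* inject guest irq via irqfd */
        } else {
            vhost_poll_queue(&vq->poll);     /* unchanged io thread path */
        }
    }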

> Why is it only for short packet handling?

That's just one way to avoid doing too much work in these contexts.

Doing too much work in VCPU context would break pipelining,
likely degrading stream performance.
Work in softirq context is not accounted against the correct
cgroups, so doing a lot of work there would mean a guest can
steal CPU from other guests.

> Thanks a
> lot!
> 
> 
> Regards,
> 
> Cong


* reduce networking latency
@ 2014-09-24 18:40 David Xu
  2014-09-29  9:04 ` Michael S. Tsirkin
  0 siblings, 1 reply; 6+ messages in thread
From: David Xu @ 2014-09-24 18:40 UTC
  To: kvm; +Cc: mst

Hi Michael,

I found this interesting project from KVM TODO website:

allow handling short packets from softirq or VCPU context
 Plan:
   We are going through the scheduler 3 times
   (could be up to 5 if softirqd is involved)
   Consider RX: host irq -> io thread -> VCPU thread ->
   guest irq -> guest thread.
   This adds a lot of latency.
   We can cut it by some 1.5x if we do a bit of work
   either in the VCPU or softirq context.
 Testing: netperf TCP RR - should be improved drastically
          netperf TCP STREAM guest to host - no regression

Would you mind saying more about the work either in the vCPU or
softirq context? Why is it only for short packet handling? Thanks a
lot!


Regards,

Cong


Thread overview: 6+ messages
2014-06-10 15:29 reduce networking latency David Xu
2014-09-24 18:40 David Xu
2014-09-29  9:04 ` Michael S. Tsirkin
2014-10-15 18:30   ` David Xu
2014-10-22 20:35     ` David Xu
2014-10-23  4:15       ` Michael S. Tsirkin
