* PHC device sharing between PCI functions
@ 2013-07-01 16:56 Ben Hutchings
  2013-07-02 14:24 ` Richard Cochran
  0 siblings, 1 reply; 12+ messages in thread
From: Ben Hutchings @ 2013-07-01 16:56 UTC
  To: Richard Cochran; +Cc: linux-net-drivers, netdev

Future Solarflare NICs may allow multiple PCI functions to make use of a
PTP hardware clock, but without a separate clock per function (probably
only one per controller).

I understand that shared PTP hardware clocks already exist, but they
usually have an independent existence as a separate PCI or platform
device.  In this case the clock would be accessible through any of the
PCI functions that also have a net device.

Options I see are:

1. Instantiate a clock only for the first function.  But that would
preclude making the clock available within multiple VMs and their host.

2. Keep track of controllers in the driver, and instantiate a clock
device for each function we see that is part of a controller we haven't
yet seen.  However, if that first function is subsequently passed
through to a VM from its host, the host loses the clock (it can't be
reparented).

3. Keep track of controllers in the driver, instantiate a 'platform
device' for each of them, and instantiate a clock for each of them.
This is a little weird, as it wouldn't have any obvious association to
the PCI device hierarchy.  But it would let us control the lifetime of
the clock devices independently of any one function.

I prefer option 3 as I dislike introducing special cases, but I would be
interested to hear your (or other people's) opinion on this.
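
The bookkeeping behind option 3 can be sketched roughly as follows. This
is only a user-space Python model of the idea, not driver code: the
names ControllerRegistry, probe(), remove() and the "phc-<id>" clock
naming are all hypothetical. The point it illustrates is that the clock
belongs to the controller and survives the removal of any one function.

```python
from dataclasses import dataclass, field


@dataclass
class Controller:
    """Hypothetical per-controller state that owns the one shared clock."""
    controller_id: str
    clock_name: str
    functions: set = field(default_factory=set)


class ControllerRegistry:
    """Tracks controllers; a clock lives as long as any function is bound."""

    def __init__(self):
        self._controllers = {}

    def probe(self, controller_id, function):
        """Bind a function; create the controller's clock on first sight."""
        ctrl = self._controllers.get(controller_id)
        if ctrl is None:
            # First function of this controller: instantiate the clock
            # under the controller, not under the function itself.
            ctrl = Controller(controller_id, f"phc-{controller_id}")
            self._controllers[controller_id] = ctrl
        ctrl.functions.add(function)
        return ctrl.clock_name

    def remove(self, controller_id, function):
        """Unbind a function; drop the clock only when none are left."""
        ctrl = self._controllers[controller_id]
        ctrl.functions.discard(function)
        if not ctrl.functions:
            del self._controllers[controller_id]

    def clock_for(self, controller_id):
        """Return the clock name for a controller, or None if it is gone."""
        ctrl = self._controllers.get(controller_id)
        return ctrl.clock_name if ctrl else None
```

In this model, passing the first function through to a VM corresponds
to remove() on that function only, and the host keeps its clock as long
as another function stays bound.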

Ben.

-- 
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.


* Re: PHC device sharing between PCI functions
  2013-07-01 16:56 PHC device sharing between PCI functions Ben Hutchings
@ 2013-07-02 14:24 ` Richard Cochran
  2013-07-02 15:17   ` Ben Hutchings
  0 siblings, 1 reply; 12+ messages in thread
From: Richard Cochran @ 2013-07-02 14:24 UTC
  To: Ben Hutchings; +Cc: linux-net-drivers, netdev

Ben,

I really don't understand what the use case is...

On Mon, Jul 01, 2013 at 05:56:08PM +0100, Ben Hutchings wrote:
> Future Solarflare NICs may allow multiple PCI functions to make use of a
> PTP hardware clock, but without a separate clock per function (probably
> only one per controller).
> 
> I understand that shared PTP hardware clocks already exist, but they
> usually have an independent existence as a separate PCI or platform
> device.  In this case the clock would be accessible through any of the
> PCI functions that also have a net device.
> 
> Options I see are:
> 
> 1. Instantiate a clock only for the first function.  But that would
> preclude making the clock available within multiple VMs and their host.

So, I guess PCI functions on one card may be divided up among the
guests in a VM environment?

Even if you did make your one clock visible to multiple guests, still
only one would be able to adjust the clock, right?

And if so, then how will the multiple, read-only MAC clocks help other
guests? Seems kinda useless to me.
 
> 3. Keep track of controllers in the driver, instantiate a 'platform
> device' for each of them, and instantiate a clock for each of them.
> This is a little weird, as it wouldn't have any obvious association to
> the PCI device hierarchy.  But it would let us control the lifetime of
> the clock devices independently of any one function.

I think clock and MAC must go hand in hand. Does one card appear as a
MAC in more than one VM?

> I prefer option 3 as I dislike introducing special cases, but I would be
> interested to hear your (or other people's) opinion on this.

Sorry, my brain just isn't letting any of this in.

Thanks,
Richard


* Re: PHC device sharing between PCI functions
  2013-07-02 14:24 ` Richard Cochran
@ 2013-07-02 15:17   ` Ben Hutchings
  2013-07-03 18:30     ` Richard Cochran
  0 siblings, 1 reply; 12+ messages in thread
From: Ben Hutchings @ 2013-07-02 15:17 UTC
  To: Richard Cochran; +Cc: linux-net-drivers, netdev

On Tue, 2013-07-02 at 16:24 +0200, Richard Cochran wrote:
> Ben,
> 
> I really don't understand what the use case is...
> 
> On Mon, Jul 01, 2013 at 05:56:08PM +0100, Ben Hutchings wrote:
> > Future Solarflare NICs may allow multiple PCI functions to make use of a
> > PTP hardware clock, but without a separate clock per function (probably
> > only one per controller).
> > 
> > I understand that shared PTP hardware clocks already exist, but they
> > usually have an independent existence as a separate PCI or platform
> > device.  In this case the clock would be accessible through any of the
> > PCI functions that also have a net device.
> > 
> > Options I see are:
> > 
> > 1. Instantiate a clock only for the first function.  But that would
> > preclude making the clock available within multiple VMs and their host.
> 
> So, I guess PCI functions on one card may be divided up among the
> guests in a VM environment?

Yes.  I don't know whether that's actually useful given the jitter that
virtualisation tends to introduce, but I wouldn't want to close off the
possibility.

> Even if you did make your one clock visible to multiple guests, still
> only one would be able to adjust the clock, right?

Yes.

> And if so, then how will the multiple, read-only MAC clocks help other
> guests? Seems kinda useless to me.

It would allow them to convert hardware timestamps or sync system time
to NIC time.
 
> > 3. Keep track of controllers in the driver, instantiate a 'platform
> > device' for each of them, and instantiate a clock for each of them.
> > This is a little weird, as it wouldn't have any obvious association to
> > the PCI device hierarchy.  But it would let us control the lifetime of
> > the clock devices independently of any one function.
> 
> I think clock and MAC must go hand in hand. Does one card appear as a
> MAC in more than one VM?

Yes, SR-IOV allows for up to 256 PCI functions (or even more) on a
single endpoint (i.e. a single controller).  The hardware will
filter/steer packets to and from the multiple functions based on the
packet header.  In the current chip, hardware timestamping is limited to
a single port and function, but the next generation should be less
constrained.

Ben.

> > I prefer option 3 as I dislike introducing special cases, but I would be
> > interested to hear your (or other people's) opinion on this.
> 
> Sorry, my brain just isn't letting any of this in.



* Re: PHC device sharing between PCI functions
  2013-07-02 15:17   ` Ben Hutchings
@ 2013-07-03 18:30     ` Richard Cochran
  2013-07-03 19:52       ` Ben Hutchings
  0 siblings, 1 reply; 12+ messages in thread
From: Richard Cochran @ 2013-07-03 18:30 UTC
  To: Ben Hutchings; +Cc: linux-net-drivers, netdev

On Tue, Jul 02, 2013 at 04:17:42PM +0100, Ben Hutchings wrote:
> On Tue, 2013-07-02 at 16:24 +0200, Richard Cochran wrote:
> 
> > And if so, then how will the multiple, read-only MAC clocks help other
> > guests? Seems kinda useless to me.
> 
> It would allow them to convert hardware timestamps or sync system time
> to NIC time.

Okay, so let's talk about HW time stamps. That is a separate issue
from the PHC. You would think that you could offer HW time stamping to
each VM guest using the MAC. But wait, how will the guests enable it?

They will have to call the HWTSTAMP ioctl. Unfortunately, your driver
is one of those that offers fine-grained choices (as opposed to all
packets or none), and that means that the guests will be potentially
spoiling each other's settings (unless you implement per-function
filters).

WRT the PHC, I guess you could offer:

- gettime	to all functions
- set/adjtime	works in one, throws error in others
- pps hook	to all functions

You just need to decide how to determine which function will have the
writable clock.
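
A rough model of that split, as a sketch only: every function gets its
own alias of the one physical clock, gettime works through all of them,
and settime/adjtime succeed through exactly one. The class names and the
choice of PermissionError are illustrative assumptions; which error a
real driver would return is not settled here.

```python
class SharedClock:
    """The single physical clock, modelled as a nanosecond counter."""

    def __init__(self, time_ns=0):
        self.time_ns = time_ns


class ClockAlias:
    """Per-function view of the shared clock (hypothetical model)."""

    def __init__(self, clock, writable):
        self._clock = clock
        self.writable = writable

    def gettime(self):
        # Reading is allowed through every alias.
        return self._clock.time_ns

    def settime(self, time_ns):
        self._require_writable()
        self._clock.time_ns = time_ns

    def adjtime(self, delta_ns):
        self._require_writable()
        self._clock.time_ns += delta_ns

    def _require_writable(self):
        # Only one alias may change the clock; the others get an error.
        if not self.writable:
            raise PermissionError("this clock alias is read-only")


def make_aliases(clock, n_functions, writable_index=0):
    """Create one alias per function; only writable_index can adjust."""
    return [ClockAlias(clock, i == writable_index) for i in range(n_functions)]
```

An adjustment made through the writable alias is immediately visible
through all the read-only ones, since they share the same underlying
state.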

Also, it might be worth thinking about how well the pps interrupt will
work. When there are many guests, will the card produce multiple MSI
interrupts every second, on the second? That won't work too well, I
think.

As an alternative to the pps interrupt, the phc2sys program (from
linuxptp) can periodically read out the system and PHC times in a
tight loop in order to discipline the system clock. This works quite
well in practice, but, again, what happens when multiple guests all try
to read the PHC time over PCIe simultaneously?
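
For reference, the kind of read-out loop meant here can be sketched as
below. This is a simplified illustration and not the phc2sys source: the
function name, the injected clock readers and the sample count are
assumptions. Each PHC read is bracketed by two system clock reads, and
the sample with the narrowest bracket is trusted most, because a slow
or contended PCIe read widens the bracket.

```python
def estimate_offset(read_sys_ns, read_phc_ns, n_samples=5):
    """Estimate (system - PHC) offset in ns from bracketed reads.

    read_sys_ns and read_phc_ns are caller-supplied callables returning
    integer nanoseconds. Returns (offset_ns, window_ns) for the sample
    whose system-clock bracket was the shortest.
    """
    best = None
    for _ in range(n_samples):
        t1 = read_sys_ns()
        phc = read_phc_ns()
        t2 = read_sys_ns()
        window = t2 - t1
        if best is None or window < best[0]:
            # Assume the PHC was read halfway through the bracket.
            midpoint = t1 + window // 2
            best = (window, midpoint - phc)
    return best[1], best[0]
```

With many guests reading at once, the brackets would presumably get
wider and the midpoint assumption weaker, which is the concern raised
above.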

Thanks,
Richard

BTW, I am working on adding Tx time stamping to the tuntap driver. My
motivation is to be able to conveniently test some of the PTP aspects
(like Best Master Clock selection) over a virtual switch. I also
wonder whether this could be used to distribute the host's system time
to VM guests, and how well it would work.


* Re: PHC device sharing between PCI functions
  2013-07-03 18:30     ` Richard Cochran
@ 2013-07-03 19:52       ` Ben Hutchings
  2013-07-04  5:36         ` Richard Cochran
  0 siblings, 1 reply; 12+ messages in thread
From: Ben Hutchings @ 2013-07-03 19:52 UTC
  To: Richard Cochran; +Cc: linux-net-drivers, netdev, Laurence Evans

On Wed, 2013-07-03 at 20:30 +0200, Richard Cochran wrote:
> On Tue, Jul 02, 2013 at 04:17:42PM +0100, Ben Hutchings wrote:
> > On Tue, 2013-07-02 at 16:24 +0200, Richard Cochran wrote:
> > 
> > > And if so, then how will the multiple, read-only MAC clocks help other
> > > guests? Seems kinda useless to me.
> > 
> > It would allow them to convert hardware timestamps or sync system time
> > to NIC time.
> 
> Okay, so let's talk about HW time stamps. That is a separate issue
> from the PHC. You would think that you could offer HW time stamping to
> each VM guest using the MAC. But wait, how will the guests enable it?
> 
> They will have to call the HWTSTAMP ioctl. Unfortunately, your driver
> is one of those that offers fine-grained choices (as opposed to all
> packets or none), and that means that the guests will be potentially
> spoiling each other's settings (unless you implement per-function
> filters).

The SFC9000 family doesn't really support timestamping at all.  It is
implemented on some boards using the management controller, debug probe
signals and an FPGA.  It therefore takes a relatively long time for the
hardware to process each packet, so this has to be limited to only PTP
packets.

For future controllers with integrated timestamping, the driver should
be able to provide more flexibility.

> WRT the PHC, I guess you could offer:
> 
> - gettime	to all functions
> - set/adjtime	works in one, throws error in others
> - pps hook	to all functions
> 
> You just need to decide how to determine which function will have the
> writable clock.

So you think each function should have its own clock device, but it's
only writable on one?  I think that would work, but I thought it would
be undesirable to have multiple aliases for the same physical clock.

> Also, it might be worth thinking about how well the pps interrupt will
> work. When there are many guests, will the card produce multiple MSI
> interrupts every second, on the second? That won't work too well, I
> think.
>
> As an alternative to the pps interrupt, the phc2sys program (from
> linuxptp) can periodically read out the system and PHC times in a
> tight loop in order to discipline the system clock. This works quite
> well in practice, but, again, what happens when multiple guests all try
> to read the PHC time over PCIe simultaneously?

Clock registers are currently only accessible through the management
controller, with high and unpredictable latency.  Forwarding of PPS
events may also be delayed by packet processing etc.  So when the driver
handles a PPS event, the driver and firmware perform this kind of synch
loop for a short while, and then the driver adjusts the PPS event back
to the top of the second with pps_sub_ts().

Running that kind of synch loop from userland probably wouldn't work
nearly as well.

I don't think clock registers will be exposed to the host in future
either, so the driver will still have to handle this in a similar way.
If PPS events are enabled in multiple functions, the driver may spend
some more time spinning but it will do this in process context (work
item) and it won't write anything over the PCIe link while waiting for
the firmware to service its request.

> BTW, I am working on adding Tx time stamping to the tuntap driver. My
> motivation is to be able to conveniently test some of the PTP aspects
> (like Best Master Clock selection) over a virtual switch. I also
> wonder whether this could be used to distribute the host's system time
> to VM guests, and how well it would work.

No idea.

Ben.



* Re: PHC device sharing between PCI functions
  2013-07-03 19:52       ` Ben Hutchings
@ 2013-07-04  5:36         ` Richard Cochran
  2013-07-04 14:34           ` Ben Hutchings
  0 siblings, 1 reply; 12+ messages in thread
From: Richard Cochran @ 2013-07-04  5:36 UTC
  To: Ben Hutchings; +Cc: linux-net-drivers, netdev, Laurence Evans

On Wed, Jul 03, 2013 at 08:52:33PM +0100, Ben Hutchings wrote:
> 
> So you think each function should have its own clock device, but it's
> only writable on one?  I think that would work, but I thought it would
> be undesirable to have multiple aliases for the same physical clock.

The aliases would not bother me, as long as the ethtool interface-to-phc
association works properly. Of course, if there is a way to suppress
the aliases in the non-VM case, that would be ideal.

Thanks,
Richard


* Re: PHC device sharing between PCI functions
  2013-07-04  5:36         ` Richard Cochran
@ 2013-07-04 14:34           ` Ben Hutchings
  2013-07-04 15:53             ` Richard Cochran
  0 siblings, 1 reply; 12+ messages in thread
From: Ben Hutchings @ 2013-07-04 14:34 UTC
  To: Richard Cochran; +Cc: linux-net-drivers, netdev, Laurence Evans

On Thu, 2013-07-04 at 07:36 +0200, Richard Cochran wrote:
> On Wed, Jul 03, 2013 at 08:52:33PM +0100, Ben Hutchings wrote:
> > 
> > So you think each function should have its own clock device, but it's
> > only writable on one?  I think that would work, but I thought it would
> > be undesirable to have multiple aliases for the same physical clock.
> 
> The aliases would not bother me, as long as the ethtool interface-to-phc
> association works properly.

Well what would be 'properly' in this case?

> Of course, if there is a way to suppress
> the aliases in the non-VM case, that would be ideal.

Ben.



* Re: PHC device sharing between PCI functions
  2013-07-04 14:34           ` Ben Hutchings
@ 2013-07-04 15:53             ` Richard Cochran
  2013-07-04 16:21               ` Ben Hutchings
  0 siblings, 1 reply; 12+ messages in thread
From: Richard Cochran @ 2013-07-04 15:53 UTC
  To: Ben Hutchings; +Cc: linux-net-drivers, netdev, Laurence Evans

On Thu, Jul 04, 2013 at 03:34:26PM +0100, Ben Hutchings wrote:
> On Thu, 2013-07-04 at 07:36 +0200, Richard Cochran wrote:
> > 
> > The aliases would not bother me, as long as the ethtool interface-to-phc
> > association works properly.
> 
> Well what would be 'properly' in this case?

If a PCIe card provides one interface eth0 and four PHC devices
/dev/ptp0-3, then doing 'ethtool -T eth0' should show PHC index 0.

Thanks,
Richard


* Re: PHC device sharing between PCI functions
  2013-07-04 15:53             ` Richard Cochran
@ 2013-07-04 16:21               ` Ben Hutchings
  2013-07-04 17:39                 ` Richard Cochran
  0 siblings, 1 reply; 12+ messages in thread
From: Ben Hutchings @ 2013-07-04 16:21 UTC
  To: Richard Cochran; +Cc: linux-net-drivers, netdev, Laurence Evans

On Thu, 2013-07-04 at 17:53 +0200, Richard Cochran wrote:
> On Thu, Jul 04, 2013 at 03:34:26PM +0100, Ben Hutchings wrote:
> > On Thu, 2013-07-04 at 07:36 +0200, Richard Cochran wrote:
> > > 
> > > The aliases would not bother me, as long as the ethtool interface-to-phc
> > > association works properly.
> > 
> > Well what would be 'properly' in this case?
> 
> If a PCIe card provides one interface eth0 and four PHC devices
> /dev/ptp0-3, then doing 'ethtool -T eth0' should show PHC index 0.

But that is the opposite of what we're talking about.  Say the card has
16 functions resulting in net devices eth0-eth15, each of which can
access the same physical clock.  You said it's OK to have read-only
aliases for a clock, so then there might be a writable /dev/ptp0 and
read-only /dev/ptp1-ptp15.  What is the proper association between net
devices and clock devices?

Ben.



* Re: PHC device sharing between PCI functions
  2013-07-04 16:21               ` Ben Hutchings
@ 2013-07-04 17:39                 ` Richard Cochran
  2013-07-04 18:19                   ` Ben Hutchings
  0 siblings, 1 reply; 12+ messages in thread
From: Richard Cochran @ 2013-07-04 17:39 UTC
  To: Ben Hutchings; +Cc: linux-net-drivers, netdev, Laurence Evans

On Thu, Jul 04, 2013 at 05:21:18PM +0100, Ben Hutchings wrote:
> 
> But that is the opposite of what we're talking about.  Say the card has
> 16 functions resulting in net devices eth0-eth15, each of which can
> access the same physical clock.  You said it's OK to have read-only
> aliases for a clock, so then there might be a writable /dev/ptp0 and
> read-only /dev/ptp1-ptp15.  What is the proper association between net
> devices and clock devices?

But the card doesn't have 16 plugs, or does it?

I understood you to have meant that the card has one physical plug,
shared between one host and fifteen guests. Or maybe I am totally
confused. No, I am surely confused.

Anyhow, if a host is multihomed, then ideally all of the interfaces
share one and the same clock. But, if the only purpose of the multiple
interfaces in the VM host is to assign them to VM guests, then I guess
it doesn't matter very much how they appear in the VM host.

Richard


* Re: PHC device sharing between PCI functions
  2013-07-04 17:39                 ` Richard Cochran
@ 2013-07-04 18:19                   ` Ben Hutchings
  2013-07-05  5:25                     ` Richard Cochran
  0 siblings, 1 reply; 12+ messages in thread
From: Ben Hutchings @ 2013-07-04 18:19 UTC
  To: Richard Cochran; +Cc: linux-net-drivers, netdev, Laurence Evans

On Thu, 2013-07-04 at 19:39 +0200, Richard Cochran wrote:
> On Thu, Jul 04, 2013 at 05:21:18PM +0100, Ben Hutchings wrote:
> > 
> > But that is the opposite of what we're talking about.  Say the card has
> > 16 functions resulting in net devices eth0-eth15, each of which can
> > access the same physical clock.  You said it's OK to have read-only
> > aliases for a clock, so then there might be a writable /dev/ptp0 and
> > read-only /dev/ptp1-ptp15.  What is the proper association between net
> > devices and clock devices?
> 
> But the card doesn't have 16 plugs, or does it?

No, but there can be multiple ports, and multiple functions per port,
all sharing the same clock.

> I understood you to have meant that the card has one physical plug,
> shared between one host and fifteen guests. Or maybe I am totally
> confused. No, I am surely confused.
> 
> Anyhow, if a host is multihomed, then ideally all of the interfaces
> share one and the same clock. But, if the only purpose of the multiple
> interfaces in the VM host is to assign them to VM guests, then I guess
> it doesn't matter very much how they appear in the VM host.

The functions that are intended to be assigned to guests will still
appear and may have a driver bound to them in the host before they are
assigned to guests.

Ben.



* Re: PHC device sharing between PCI functions
  2013-07-04 18:19                   ` Ben Hutchings
@ 2013-07-05  5:25                     ` Richard Cochran
  0 siblings, 0 replies; 12+ messages in thread
From: Richard Cochran @ 2013-07-05  5:25 UTC
  To: Ben Hutchings; +Cc: linux-net-drivers, netdev, Laurence Evans

On Thu, Jul 04, 2013 at 07:19:05PM +0100, Ben Hutchings wrote:

> The functions that are intended to be assigned to guests will still
> appear and may have a driver bound to them in the host before they are
> assigned to guests.

Okay, then to answer your question...

> On Thu, 2013-07-04 at 19:39 +0200, Richard Cochran wrote:
> > On Thu, Jul 04, 2013 at 05:21:18PM +0100, Ben Hutchings wrote:
> > > 
> > > But that is the opposite of what we're talking about.  Say the card has
> > > 16 functions resulting in net devices eth0-eth15, each of which can
> > > access the same physical clock.  You said it's OK to have read-only
> > > aliases for a clock, so then there might be a writable /dev/ptp0 and
> > > read-only /dev/ptp1-ptp15.  What is the proper association between net
> > > devices and clock devices?

I would say simply:

	ethtool -T eth0  -> phc index 0  RDWR
	ethtool -T eth1  -> phc index 1  RDONLY
	...
	ethtool -T eth15 -> phc index 15 RDONLY

The numbering might come out differently, but it should be 1:1.

The userland PTP stack will just have to be configured to make sense
of this. For example, the host can just use eth0/ptp0 normally. For
the guests, in ptp4l we have a "free_running" option that leaves the
clock alone and just calculates phase and frequency offset.

Local applications can use this information to synchronize to the
remote master. (BTW this method is specified by 802.1AS). Also, you
can discipline the system clock by measuring the sys-phc offset and
adding the phc-master offset.
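
The arithmetic is simple enough to write down. The sketch below assumes
one sign convention (every offset is "local minus reference", in
nanoseconds) and uses hypothetical function names; a real setup would
have to check which convention its tools report.

```python
def sys_to_master_offset_ns(sys_minus_phc_ns, phc_minus_master_ns):
    """Combine the two measured offsets into (system - master).

    (sys - phc) + (phc - master) = sys - master, so the free-running
    PHC drops out and never has to be adjusted by the guest.
    """
    return sys_minus_phc_ns + phc_minus_master_ns


def phc_timestamp_to_master_ns(phc_timestamp_ns, phc_minus_master_ns):
    """Convert a hardware time stamp from PHC time to master time."""
    return phc_timestamp_ns - phc_minus_master_ns
```

For example, if the system clock is 1500 ns ahead of the PHC and the
PHC is 200 ns behind the master, the system clock is 1300 ns ahead of
the master, and that is the error to steer out.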

If the clocks set their .max_adj to zero, then the fact that they are
read-only is discoverable (max_adj is reported to user space by the
PTP_CLOCK_GETCAPS ioctl).

Thanks,
Richard

