From: Pankaj Chauhan <pankaj.chauhan@nxp.com>
To: "Tan, Jianfeng" <jianfeng.tan@intel.com>, <dev@dpdk.org>
Cc: <hemant.agrawal@nxp.com>, <yuanhan.liu@linux.intel.com>,
	<maxime.coquelin@redhat.com>
Subject: Re: [RFC][PATCH V2 1/3] examples/vhost: Add vswitch (generic switch) framework
Date: Thu, 15 Sep 2016 14:30:55 +0530
Message-ID: <23bebfde-54ad-04fd-64cb-c99d7a4a78e9@nxp.com>
In-Reply-To: <ccbbdec5-29b5-9c20-8305-1346a5c39c14@intel.com>

On 9/13/2016 12:21 PM, Tan, Jianfeng wrote:
Hi Jianfeng,

>>>
>>> On 9/5/2016 6:54 PM, Pankaj Chauhan wrote:
>>>> Introduce support for a generic framework that handles switching
>>>> between physical and vhost devices. The vswitch framework introduces
>>>> the following concepts:
>>>>
>>>> 1. vswitch_dev: A vswitch device is a logical switch which can have
>>>> physical and virtio devices. The devices are operated using the
>>>> standard rte_eth API for physical devices and the lib_vhost API for
>>>> vhost devices; the PMD API is not used for vhost devices.
>>>>
>>>> 2. vswitch_port: Any physical or virtio device that is added to the
>>>> vswitch. The port can have its own tx/rx functions for data transfer,
>>>> which are exposed to the framework through generic function pointers
>>>> (vs_port->do_tx/do_rx). This way the generic code can do tx/rx without
>>>> knowing the type of device (physical or virtio). Similarly, each port
>>>> has its own functions to select tx/rx queues. The framework provides
>>>> default tx/rx queue selection functions when a port is added (for both
>>>> physical and vhost devices), but it allows the switch implementation
>>>> to override them (vs_port->get_txq/rxq) if required.
>>>>
>>>> 3. vswitch_ops: A set of function pointers used to perform operations
>>>> such as learning, unlearning, add/delete port, and lookup_and_forward.
>>>> The user of the vswitch framework (vhost/main.[c,h]) calls these
>>>> function pointers, and thus remains agnostic of the underlying
>>>> implementation.
>>>>
>>>> Different switching logics can implement their own vswitch_dev and
>>>> vswitch_ops and register them with the framework. This makes the
>>>> vhost-switch application scalable in terms of:
>>>>
>>>> 1. Different switching logics (one of them is VMDQ, vhost/vmdq.[c,h]).
>>>> 2. Number of ports.
>>>> 3. Policies for selecting ports for rx and tx.
>>>>
>>>> Signed-off-by: Pankaj Chauhan <pankaj.chauhan@nxp.com>
>>>
>>> After this patch set, what is the mapping relationship between cores and
>>> vswitch_dev? The old vhost example does not have the concept of a switch,
>>> so each core is bound to some vdevs. Do we still keep the original logic?
>>>
>>> (a) If yes, given that a phys device may have no direct relationship
>>> with vdevs in other switching logics, we may need to bind those physical
>>> devices to cores too. In that case, switch_worker() will receive pkts
>>> from all devices (phys or virtual) and dispatch them, which looks like:
>>>
>>> switch_worker() {
>>>     FOR each port binding to this core {
>>>          rx(port, pkts[], count);
>>>          vs_lookup_n_fwd( information_needed );
>>>     }
>>> }
>>
>> Since we support only one switch device running at a time, we bind the
>> ports of the switch_dev to the cores. But the switch might have its own
>> logic to bind ports to cores. For example, VMDQ supports only one
>> physical port, while other switches can support more than one. To take
>> care of that I have added two ops in vswitch_ops:
>>
>> 1. vs_sched_rx_port:
>>
>> struct vswitch_port *vs_sched_rx_port(struct vswitch_dev *vs_dev,
>>                                       enum vswitch_port_type ptype,
>>                                       uint16_t core_id);
>>
>> 2. vs_sched_tx_port:
>>
>> struct vswitch_port *vs_sched_tx_port(struct vswitch_dev *vs_dev,
>>                                       enum vswitch_port_type ptype,
>>                                       uint16_t core_id);
>>
>> The idea behind these functions is that vhost/main asks the underlying
>> switch implementation to schedule a port for rx or tx; the current
>> core_id is also passed as a parameter. The switch can then decide which
>> port to use for rx or tx based on the core id, and possibly some other
>> custom policy.
>>
>> For example, VMDQ always returns its single physical port in response
>> to these functions, no matter which core calls them. Another switch
>> could return the ports bound to the calling core.
>>
>> Similarly, there are two port operations (vs_port->get_rxq and
>> vs_port->get_txq); here too we pass core_id as a parameter so that the
>> underlying switch implementation can schedule the rx or tx queue based
>> on the core_id.
>>
>> The above mentioned ops are currently used in drain_eth_rx() and
>> do_drain_mbuf_table() (main.c), and they leave the binding of physical
>> ports and queues to the underlying implementation. This way we can
>> accommodate VMDQ, which uses only one physical port with VMDQ-based rx
>> queues, or any other switch which uses multiple physical ports and some
>> other rx/tx queue scheduling logic.
>>
>> Please let me know whether this way of scheduling ports and tx/rx
>> queues is fine or not.
>
> According to the above explanation, in the VMDQ switch we cannot schedule
> two queues (belonging to the same port) on the same core, right?
>

Yes. I believe this would be a limitation that will not be acceptable.
The method that you suggested further down in the email (treat port +
queue_id as a vs_port) can solve this.

>
>>>
>>> (b) If no, we may bind each core to n switches? If so, switch_worker()
>>> also needs to be reworked to keep receiving pkts from all ports of each
>>> switch, and dispatch.
>>>
>>> switch_worker() {
>>>     FOR each switch binding to this core {
>>>          FOR each port of switch {
>>>              rx(port, pkts[], count);
>>>              vs_lookup_n_fwd( information_needed );
>>>         }
>>>     }
>>> }
>> Since we currently support only one switch_dev in a running instance of
>> vhost-switch, we can bind ports to cores as you suggested in (a).
>
> OK.
>
>>
>>>
>>> In all, (1) we'd better not use vdev to find phys dev in switch_worker
>>> any more;
>>
>> I agree with you. I have tried to do the following in drain_eth_rx():
>>
>>
>>         core_id = rte_lcore_id();
>>         rx_port = vs_sched_rx_port(vswitch_dev_g, VSWITCH_PTYPE_PHYS,
>>                                    core_id);
>>         if (unlikely(!rx_port))
>>                 goto out;
>>
>>         rxq = rx_port->get_rxq(rx_port, vdev, core_id);
>>         rx_count = rx_port->do_rx(rx_port, rxq, NULL, pkts,
>>                                   MAX_PKT_BURST);
>>
>> Here we don't assume any relation between the vdev and the physical
>> device or rxq, but pass the vdev to the underlying switch so that it can
>> choose the rxq for this vdev. In the case of VMDQ it will return the
>> VMDQ queue number for this vdev; any other switch implementation can
>> have its own logic to decide the rxq.
>
> Thanks for explanation.
>
>>
>>> (2) we'd better not differentiate phys devices and virtual devices in
>>> the generic framework (it's just an attribute of vswitch_port).
>>>
>>> What do you think?
>>
>> I agree with you that, given the current API in this patchset, we should
>> aim to make switch_worker agnostic of the port type. Ideally it should
>> look something like this:
>>
>> switch_worker() {
>>
>>         rx_port_mask = VSWITCH_PTYPE_PHYS | VSWITCH_PTYPE_VIRTIO;
>>
>>         rx_port = vs_sched_rx_port(vswitch_dev_g, rx_port_mask, core_id);
>>         rxq = rx_port->get_rxq(rx_port, vdev, core_id);
>>         rx_count = rx_port->do_rx(rx_port, rxq, NULL, pkts, MAX_PKT_BURST);
>
> Can we hide queues inside struct vswitch_port? I mean:
> For the VMDQ switch, treat (port_id, queue_id) as a vswitch_port. You
> already store "struct vhost_dev *" in vswitch_port.priv when it's a
> virtual port; how about storing queue_id in vswitch_port.priv when it's
> a physical port?
> For the arp_learning switch, treat (port_id, all_enabled_queues) as a
> vswitch_port.
> To summarize the two: we treat (port_id, all_enabled_queues[]) as a
> vswitch_port.
>
> How about it?

Thanks for the wonderful suggestion! Yes, it should be possible: instead
of having one vs_port per physical device, we could create one vs_port
for each phys device + queue_id combination.

Then we can bind the vs_ports to the cores, and do away with binding 
vdevs to the cores.
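
To make sure I read the suggestion right, here is a minimal sketch of a
physical (port, queue) vs_port keeping its queue id in priv, so that
do_rx() no longer needs the vdev. The member names (port_id, priv) and
the do_rx signature below just follow this discussion; the real
definitions in the patch may differ:

#include <stdint.h>
#include <rte_common.h>
#include <rte_ethdev.h>

/* Stand-in for the RFC's vswitch_port; only the members used below. */
struct vswitch_port {
	uint16_t port_id;   /* underlying rte_eth device */
	void *priv;         /* phys port: rx queue id; vhost port: struct vhost_dev * */
	uint16_t (*do_rx)(struct vswitch_port *port, uint16_t rxq, void *arg,
			  struct rte_mbuf **pkts, uint16_t count);
};

/*
 * do_rx for a (physical port, queue id) vs_port: the queue is implicit in
 * the port itself, so the caller never has to know about VMDQ queues.
 */
static uint16_t
vmdq_phys_do_rx(struct vswitch_port *port, uint16_t rxq __rte_unused,
		void *arg __rte_unused, struct rte_mbuf **pkts, uint16_t count)
{
	uint16_t queue_id = (uint16_t)(uintptr_t)port->priv;

	return rte_eth_rx_burst(port->port_id, queue_id, pkts, count);
}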

In order to add the extra vs_ports (port + queue_id) there are two methods:

1. vhost/main.c (or vswitch_common.c) queries the underlying switch,
through a new accessor, about how many rx queues to support. VMDQ would
return the number of VMDQ queues in this case. After getting the number
of rx queues, vhost/main.c creates the extra ports, i.e. one vs_port for
each physical port and queue id combination (a rough sketch follows
below).

2. The switch implementation, for example vmdq.c, creates the extra
ports (port + queue_id) itself as part of the vs_port->add_port
operation for the physical port.

Which method do you think we should follow? I think #1 should be done so
that the same logic can be used for other switches as well.
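
A rough sketch of what #1 could look like; the accessor
vs_get_phys_rxq_count() and the helper vs_port_add_phys() are only
illustrative names, not part of the posted RFC:

#include <stdint.h>

struct vswitch_dev;	/* defined by the vswitch framework in the RFC */

/* Assumed accessor implemented by each switch; VMDQ would return the
 * number of VMDQ queues it drives. */
uint16_t vs_get_phys_rxq_count(struct vswitch_dev *vs_dev);

/* Assumed helper that allocates and registers one vswitch_port for the
 * given (physical port, rx queue) pair. */
int vs_port_add_phys(struct vswitch_dev *vs_dev, uint16_t eth_port_id,
		     uint16_t queue_id);

/* Called from vhost/main.c (or vswitch_common.c) for every physical port. */
static int
vs_add_phys_ports(struct vswitch_dev *vs_dev, uint16_t eth_port_id)
{
	uint16_t q, nb_rxq = vs_get_phys_rxq_count(vs_dev);
	int ret;

	for (q = 0; q < nb_rxq; q++) {
		ret = vs_port_add_phys(vs_dev, eth_port_id, q);
		if (ret < 0)
			return ret;
	}
	return 0;
}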
>
>>
>>         vs_lookup_n_fwd(rx_port, pkts, rx_count, rxq);
>>
>> }
>>
>> Here I have two problems in achieving the switch_worker mentioned above:
>>
>> 1. While doing rx from the physical port, we need the associated vdev
>> to know the VMDQ rx queue. That was the reason I kept the current logic
>> of keeping vdevs bound to the core in switch_worker. If we don't keep a
>> list of vdevs assigned to the core, how will we get the rxq on the
>> physical device in the case of the VMDQ switch implementation?
>
> I think my proposal above can solve this problem.

Yes thanks!
>
>>
>> 2. drain_mbuf_table() currently assumes that there is only one physical
>> device on which we have to transmit. We may have to rethink where to
>> keep the tx queues (&lcore_tx_queue[core_id]); in case we have multiple
>> physical ports we might have to move the tx queues to each port.
>
> This function, drain_mbuf_table(), and its related data structures are
> used as a cache on the tx direction. But it's useful not only for
> physical ports but also for virtual ports (vhost_dev). I suggest making
> such a cache a per-vswitch_port field. And again, all vswitch_ports are
> bound to a core, and each core will periodically drain those
> vswitch_ports.
>

Okay. So we add similar tx_q structures (one per core) to each
vswitch_port, correct?
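
Something along these lines is what I have in mind (sketch only: struct
mbuf_table mirrors the existing tx cache in main.c, while the tx_cache
field and the do_tx signature are my assumptions):

#include <stdint.h>
#include <rte_lcore.h>
#include <rte_mbuf.h>

#define MAX_PKT_BURST 32

/* Per-core tx cache, same idea as the existing one in main.c. */
struct mbuf_table {
	uint16_t len;                             /* mbufs currently cached */
	struct rte_mbuf *m_table[MAX_PKT_BURST];
};

struct vswitch_port {
	/* ... other fields from the RFC (type, priv, do_rx, get_txq, ...) ... */
	uint16_t (*do_tx)(struct vswitch_port *port, uint16_t txq, void *arg,
			  struct rte_mbuf **pkts, uint16_t count);
	struct mbuf_table tx_cache[RTE_MAX_LCORE];  /* one cache per core */
};

/* Each core periodically flushes the caches of the vs_ports bound to it. */
static void
vs_flush_tx_caches(struct vswitch_port **ports, uint16_t nb_ports)
{
	unsigned int core_id = rte_lcore_id();
	uint16_t i;

	for (i = 0; i < nb_ports; i++) {
		struct mbuf_table *txc = &ports[i]->tx_cache[core_id];

		if (txc->len == 0)
			continue;
		/* txq 0 for brevity; real code would pick it via get_txq(). */
		ports[i]->do_tx(ports[i], 0, NULL, txc->m_table, txc->len);
		txc->len = 0;
	}
}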

Thanks,
Pankaj
