From: "Burakov, Anatoly" <anatoly.burakov@intel.com>
To: Jianfeng Tan <jianfeng.tan@intel.com>, dev@dpdk.org
Cc: bruce.richardson@intel.com, konstantin.ananyev@intel.com,
	thomas@monjalon.net
Subject: Re: [PATCH 1/3] eal: add channel for multi-process communication
Date: Mon, 11 Dec 2017 11:04:33 +0000
Message-ID: <150073b1-50be-4264-9a30-8c4aa62e3078@intel.com>
In-Reply-To: <1512067450-59203-2-git-send-email-jianfeng.tan@intel.com>

On 30-Nov-17 6:44 PM, Jianfeng Tan wrote:
> Previouly, there are three channels for multi-process
> (i.e., primary/secondary) communication.
>    1. Config-file based channel, in which, the primary process writes
>       info into a pre-defined config file, and the secondary process
>       reads info out.
>    2. vfio submodule has its own channel based on unix socket for the
>       secondary process to get container fd and group fd from the
>       primary process.
>    3. pdump submodule also has its own channel based on unix socket for
>       packet dump.
> 
> It'll be good to have a generic communication channel for multi-process
> communication to accomodate the requirements including:
>    a. Secondary wants to send info to primary, for example, secondary
>       would like to send request (about some specific vdev to primary).
>    b. Sending info at any time, instead of just initialization time.
>    c. Share FDs with the other side, for vdev like vhost, related FDs
>       (memory region, kick) should be shared.
>    d. A send message request needs the other side to response immediately.
> 
> This patch proposes to create a communication channel, as an unix
> socket connection, for above requirements. Primary will listen on
> the unix socket; secondary will connect this socket to talk.
> 
> Three new APIs are added:
> 
>    1. rte_eal_mp_action_register is used to register an action,
>       indexed by a string; if the calling component wants to
>       response the messages from the corresponding component in
>       its primary process or secondary processes.
>    2. rte_eal_mp_action_unregister is used to unregister the action
>       if the calling component does not want to response the messages.
>    3. rte_eal_mp_sendmsg is used to send a message.
> 
> Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
> ---

<...snip...>

> +
> +int
> +rte_eal_mp_action_register(const char *action_name, rte_eal_mp_t action)
> +{
> +	struct action_entry *entry = malloc(sizeof(struct action_entry));
> +
> +	if (entry == NULL)
> +		return -ENOMEM;
> +
> +	if (find_action_entry_by_name(action_name) != NULL)
> +		return -EEXIST;

This path leaks entry. It should probably do a free(entry) before 
returning, or do the lookup before the malloc.
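
A rough sketch of one way to avoid the leak: do the lookup before the 
malloc, so the -EEXIST path has nothing to free. This is a standalone 
illustration, not the patch code. mp_action_t, mp_action_register and the 
value of MAX_ACTION_NAME_LEN are stand-ins made up here, and the list 
helpers are re-created only so that it compiles on its own. It also 
terminates the name explicitly, since strncpy does not do that when the 
source fills the buffer.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <sys/queue.h>

/* Assumed value; the real constant lives in the snipped part of the patch. */
#define MAX_ACTION_NAME_LEN 64

/* Hypothetical stand-in for rte_eal_mp_t. */
typedef int (*mp_action_t)(const void *params, int len,
			   int fds[], int fds_num);

struct action_entry {
	TAILQ_ENTRY(action_entry) next;
	char action_name[MAX_ACTION_NAME_LEN];
	mp_action_t action;
};

static TAILQ_HEAD(, action_entry) action_entry_list =
	TAILQ_HEAD_INITIALIZER(action_entry_list);

static struct action_entry *
find_action_entry_by_name(const char *name)
{
	struct action_entry *entry;

	TAILQ_FOREACH(entry, &action_entry_list, next) {
		/* Compare only what can actually be stored in an entry. */
		if (strncmp(entry->action_name, name,
			    MAX_ACTION_NAME_LEN - 1) == 0)
			return entry;
	}
	return NULL;
}

static int
mp_action_register(const char *action_name, mp_action_t action)
{
	struct action_entry *entry;

	assert(action_name != NULL);

	/* Duplicate check first: nothing is allocated yet, nothing to leak. */
	if (find_action_entry_by_name(action_name) != NULL)
		return -EEXIST;

	entry = malloc(sizeof(*entry));
	if (entry == NULL)
		return -ENOMEM;

	/* Copy at most LEN - 1 bytes and terminate explicitly. */
	strncpy(entry->action_name, action_name, MAX_ACTION_NAME_LEN - 1);
	entry->action_name[MAX_ACTION_NAME_LEN - 1] = '\0';
	entry->action = action;
	TAILQ_INSERT_TAIL(&action_entry_list, entry, next);
	return 0;
}
```

Keeping the malloc first and adding free(entry) on the -EEXIST path would 
work just as well; reordering simply leaves one less exit path to get wrong.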

> +
> +	strncpy(entry->action_name, action_name, MAX_ACTION_NAME_LEN);
> +	entry->action = action;
> +	TAILQ_INSERT_TAIL(&action_entry_list, entry, next);
> +	return 0;
> +}
> +

<...snip...>

> +
> +static int
> +add_secondary(void)
> +{
> +	int fd;
> +	struct epoll_event ev;
> +
> +	while (1) {
> +		fd = accept(mp_fds.listen, NULL, NULL);
> +		if (fd < 0 && errno == EAGAIN)
> +			break;
> +		else if (fd < 0) {
> +			RTE_LOG(ERR, EAL, "primary failed to accept: %s\n",
> +				strerror(errno));
> +			return -1;
> +		}
> +
> +		ev.events = EPOLLIN | EPOLLRDHUP;
> +		ev.data.fd = fd;
> +		if (epoll_ctl(mp_fds.efd, EPOLL_CTL_ADD, fd, &ev) < 0) {
> +			RTE_LOG(ERR, EAL, "failed to add secondary: %s\n",
> +				strerror(errno));
> +			break;
> +		}
> +		if (add_sec_proc(fd) < 0) {
> +			RTE_LOG(ERR, EAL, "too many secondary processes\n");
> +			close(fd);
> +			break;
> +		}
> +	}
> +
> +	return 0;
> +}
> +
> +static void *
> +mp_handler(void *arg __rte_unused)
> +{
> +	int fd;
> +	int i, n;
> +	struct epoll_event ev;
> +	struct epoll_event *events;
> +	int is_primary = rte_eal_process_type() == RTE_PROC_PRIMARY;
> +
> +	ev.events = EPOLLIN | EPOLLRDHUP;
> +	ev.data.fd = (is_primary) ? mp_fds.listen : mp_fds.primary;
> +	if (epoll_ctl(mp_fds.efd, EPOLL_CTL_ADD, ev.data.fd, &ev) < 0) {
> +		RTE_LOG(ERR, EAL, "failed to epoll_ctl: %s\n",
> +			strerror(errno));
> +		exit(EXIT_FAILURE);

Should this be rte_exit() rather than exit()?

> +	}
> +
> +	events = calloc(20, sizeof ev);
> +
> +	while (1) {
> +		n = epoll_wait(mp_fds.efd, events, 20, -1);
> +		for (i = 0; i < n; i++) {
> +			if (is_primary && events[i].data.fd == mp_fds.listen) {
> +				if (events[i].events != EPOLLIN) {
> +					RTE_LOG(ERR, EAL, "what happens?\n");

More descriptive error message would be nice :)

> +					exit(EXIT_FAILURE);

rte_exit?

> +				}
> +
> +				if (add_secondary() < 0)
> +					break;

Doing epoll_ctl in multiple different places hurts readability IMO. 
Might be a good idea to refactor add_secondary and mp_handler in a way 
that keeps all epoll handling in one place.

> +
> +				continue;
> +			}
> +
> +			fd = events[i].data.fd;
> +
> +			if ((events[i].events & EPOLLIN)) {
> +				if (process_msg(fd) < 0) {
> +					RTE_LOG(ERR, EAL,
> +						"failed to process msg\n");
> +					if (!is_primary)
> +						exit(EXIT_FAILURE);

rte_exit()?

> +				}
> +				continue;
> +			}
> +
> +			/* EPOLLERR, EPOLLHUP, etc */
> +			if (is_primary) {
> +				RTE_LOG(ERR, EAL, "secondary exit: %d\n", fd);
> +				epoll_ctl(mp_fds.efd, EPOLL_CTL_DEL, fd, NULL);
> +				del_sec_proc(fd);
> +				close(fd);
> +			} else {
> +				RTE_LOG(ERR, EAL, "primary exits, so do I\n");
> +				/* Exit secondary when primary exits? */
> +				exit(EXIT_FAILURE);

This changes previous behavior. I don't think exiting the secondary when 
the primary exits is something we want to do, so I would just print an 
error, but not exit the process.
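
To illustrate both this point and the earlier one about keeping the epoll 
handling in one place, here is a minimal sketch that separates "what 
happened" from "what to do about it". The enum and mp_classify_event() are 
hypothetical names, not part of the patch. The idea is that mp_handler would 
switch on the result, and the MP_PRIMARY_GONE case would log an error and 
drop the fd from the epoll set instead of calling exit().

```c
#include <stdint.h>
#include <sys/epoll.h>

/* Hypothetical classification of one epoll event. */
enum mp_event_action {
	MP_ACCEPT,       /* primary: new secondary on the listen socket */
	MP_READ_MSG,     /* data is available, go process a message */
	MP_DROP_PEER,    /* primary: a secondary went away */
	MP_PRIMARY_GONE  /* secondary: primary went away, log and carry on */
};

static enum mp_event_action
mp_classify_event(int is_primary, int fd, int listen_fd, uint32_t events)
{
	if (is_primary && fd == listen_fd)
		return MP_ACCEPT;

	/* Like the patch, prefer reading when EPOLLIN is set at all. */
	if (events & EPOLLIN)
		return MP_READ_MSG;

	/* EPOLLERR, EPOLLHUP, EPOLLRDHUP without readable data. */
	return is_primary ? MP_DROP_PEER : MP_PRIMARY_GONE;
}
```

With something like this, every epoll_ctl() call could sit in the single 
switch inside the handler loop, and add_secondary() would only need to 
return the accepted fd.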

> +			}
> +		}
> +	}
> +
> +	return NULL;
> +}
> +
> +int
> +rte_eal_mp_channel_init(void)
> +{
> +	int i, fd, ret;
> +	const char *path;
> +	struct sockaddr_un un;
> +	pthread_t tid;
> +	char thread_name[RTE_MAX_THREAD_NAME_LEN];
> +
> +	mp_fds.efd = epoll_create1(0);
> +	if (mp_fds.efd < 0) {
> +		RTE_LOG(ERR, EAL, "epoll_create1 failed\n");
> +		return -1;
> +	}
> +
> +	fd = socket(AF_UNIX, SOCK_STREAM, 0);
> +	if (fd < 0) {
> +		RTE_LOG(ERR, EAL, "Failed to create unix socket\n");
> +		return -1;
> +	}
> +
> +	memset(&un, 0, sizeof(un));
> +	un.sun_family = AF_UNIX;
> +	path = eal_mp_unix_path();
> +	strncpy(un.sun_path, path, sizeof(un.sun_path));
> +	un.sun_path[sizeof(un.sun_path) - 1] = '\0';
> +
> +	if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
> +		for (i = 0; i < MAX_SECONDARY_PROCS; ++i)
> +			mp_fds.secondaries[i] = -1;
> +
> +		if (fcntl(fd, F_SETFL, O_NONBLOCK) < 0) {
> +			RTE_LOG(ERR, EAL, "cannot set nonblocking mode\n");
> +			close(fd);
> +			return -1;
> +		}
> +
> +		/* The file still exists since last run */
> +		unlink(path);
> +
> +		ret = bind(fd, (struct sockaddr *)&un, sizeof(un));
> +		if (ret < 0) {
> +			RTE_LOG(ERR, EAL, "failed to bind to %s: %s\n",
> +				path, strerror(errno));
> +			close(fd);
> +			return -1;
> +		}
> +		RTE_LOG(INFO, EAL, "primary bind to %s\n", path);
> +
> +		ret = listen(fd, 1024);
> +		if (ret < 0) {
> +			RTE_LOG(ERR, EAL, "failed to listen: %s\n",
> +				strerror(errno));
> +			close(fd);
> +			return -1;
> +		}
> +		mp_fds.listen = fd;
> +	} else {
> +		ret = connect(fd, (struct sockaddr *)&un, sizeof(un));
> +		if (ret < 0) {
> +			RTE_LOG(ERR, EAL, "failed to connect primary\n");
> +			return -1;

Do we want to prevent secondary from launching if it can't connect to 
primary? Some use cases might rely on previous behavior. Maybe instead 
add some checks in handling functions to ensure that we have a valid 
connection to the primary before doing anything?
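
A sketch of what such a check could look like, assuming the fd is left at -1 
when the connect fails. mp_primary_fd and mp_send_to_primary() are 
hypothetical stand-ins (for mp_fds.primary and for whatever the send path 
ends up being called), and plain errno is used here only to keep the example 
standalone.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical stand-in for mp_fds.primary; -1 means "not connected". */
static int mp_primary_fd = -1;

static int
mp_send_to_primary(const void *buf, size_t len)
{
	ssize_t n;

	/* No channel: fail this call only, the process keeps running. */
	if (mp_primary_fd < 0) {
		errno = ENOTCONN;
		return -1;
	}

	/* MSG_NOSIGNAL: report EPIPE instead of raising SIGPIPE. */
	n = send(mp_primary_fd, buf, len, MSG_NOSIGNAL);
	if (n < 0)
		return -1; /* errno already set by send() */

	if ((size_t)n != len) {
		errno = EIO; /* short write on the stream socket */
		return -1;
	}
	return 0;
}
```

That way a secondary that never uses the channel still starts as before, and 
one that does use it gets an error at the point of use.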

> +		}
> +		mp_fds.primary = fd;
> +	}
> +
> +	ret = pthread_create(&tid, NULL, mp_handler, NULL);
> +	if (ret < 0) {
> +		RTE_LOG(ERR, EAL, "failed to create thead: %s\n",
> +			strerror(errno));
> +		close(fd);
> +		close(mp_fds.efd);
> +		return -1;
> +	}

<...snip...>

> +	if (fds_num > SCM_MAX_FD) {
> +		RTE_LOG(ERR, EAL,
> +			"Cannot send more than %d FDs\n", SCM_MAX_FD);
> +		return -E2BIG;
> +	}
> +
> +	len_msg = sizeof(struct msg_hdr) + len_params;
> +	if (len_msg > MAX_MESSAGE_LENGTH) {
> +		RTE_LOG(ERR, EAL, "Message is too long\n");
> +		return -ENOMEM;

Nitpicking, but is this really -ENOMEM? Shouldn't this be -EINVAL or 
-E2BIG? Also, this is external API - maybe return -1 and set rte_errno?
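
For example, the checks could look roughly like this. It is a standalone 
sketch: plain errno stands in for rte_errno, and MAX_FDS, the value of 
MAX_MESSAGE_LENGTH and the layout of struct msg_hdr are assumptions made 
only so that it compiles.

```c
#include <errno.h>
#include <stddef.h>

#define MAX_MESSAGE_LENGTH 1024 /* assumed value */
#define MAX_FDS 253             /* assumed stand-in for SCM_MAX_FD */

/* Hypothetical header layout; the real one is in the snipped code. */
struct msg_hdr {
	char action_name[64];
	int fds_num;
	int len_params;
};

/* Returns 0 if the request is acceptable, -1 with errno set otherwise. */
static int
mp_check_msg(size_t len_params, int fds_num)
{
	if (fds_num < 0) {
		errno = EINVAL;
		return -1;
	}
	if (fds_num > MAX_FDS) {
		errno = E2BIG;
		return -1;
	}
	/* Written as a subtraction so the size sum cannot overflow. */
	if (len_params > MAX_MESSAGE_LENGTH - sizeof(struct msg_hdr)) {
		errno = E2BIG;
		return -1;
	}
	return 0;
}
```

-ENOMEM would then be left for the one case where malloc actually fails.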

> +	}
> +
> +	RTE_LOG(INFO, EAL, "send msg: %s, %d\n", action_name, len_msg);

Do we want this as INFO, not DEBUG?

> +
> +	msg = malloc(len_msg);
> +	if (!msg) {
> +		RTE_LOG(ERR, EAL, "Cannot alloc memory for msg\n");
> +		return -ENOMEM;
> +	}

<...snip...>

>   
>   /**
> + * Action function typedef used by other components.
> + *
> + * As we create unix socket channel for primary/secondary communication, use
> + * this function typedef to register action for coming messages.
> + */
> +typedef int (*rte_eal_mp_t)(const void *params, int len,
> +			    int fds[], int fds_num);

Nitpicking, but probably needs newlines before comments, here and after 
next function definition.

> +/**
> + * Register an action function for primary/secondary communication.
> + *
> + * Call this function to register an action, if the calling component wants
> + * to response the messages from the corresponding component in its primary
> + * process or secondary processes.
> + *
> + * @param action_name
> + *   The action_name argument plays as the nonredundant key to find the action.
> + *
> + * @param action
> + *   The action argument is the function pointer to the action function.
> + *
> + * @return
> + *  - 0 on success.
> + *  - (<0) on failure.
> + */

<...snip...>

> diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
> index 229eec9..a84eab4 100644
> --- a/lib/librte_eal/linuxapp/eal/eal.c
> +++ b/lib/librte_eal/linuxapp/eal/eal.c
> @@ -896,6 +896,15 @@ rte_eal_init(int argc, char **argv)
>   
>   	eal_check_mem_on_local_socket();
>   
> +	if (rte_eal_mp_channel_init() < 0) {
> +		rte_eal_init_alert("failed to init mp channel\n");
> +		rte_errno = EFAULT;
> +		return -1;
> +	}

As noted above, maybe only fail if it's the primary process?

> +
> +	if (eal_plugins_init() < 0)
> +		rte_eal_init_alert("Cannot init plugins\n");

This is probably a leftover of some other patch?

> +
>   	eal_thread_init_master(rte_config.master_lcore);
>   
>   	ret = eal_thread_dump_affinity(cpuset, RTE_CPU_AFFINITY_STR_LEN);
> diff --git a/lib/librte_eal/rte_eal_version.map b/lib/librte_eal/rte_eal_version.map
> index f4f46c1..6762397 100644
> --- a/lib/librte_eal/rte_eal_version.map
> +++ b/lib/librte_eal/rte_eal_version.map
> @@ -235,4 +235,26 @@ EXPERIMENTAL {
>   	rte_service_set_stats_enable;
>   	rte_service_start_with_defaults;
>   
> +} DPDK_17.08;
> +
> +DPDK_17.11 {
> +	global:
> +
> +	rte_bus_get_iommu_class;
> +	rte_eal_iova_mode;
> +	rte_eal_mbuf_default_mempool_ops;
> +	rte_lcore_has_role;
> +	rte_memcpy_ptr;
> +	rte_pci_get_iommu_class;
> +	rte_pci_match;
> +
> +} DPDK_17.08;
> +

Same here, this looks like leftovers of rebase.

> +DPDK_18.02 {
> +	global:
> +
> +	rte_eal_mp_action_register;
> +	rte_eal_mp_action_unregister;
> +	rte_eal_mp_sendmsg;
> +
>   } DPDK_17.11;
> 


-- 
Thanks,
Anatoly
