[PATCH v2 00/16] nvme-cli: add "nvme monitor" subcommand

From: mwilck @ 2021-03-06  0:36 UTC
  To: Sagi Grimberg, Hannes Reinecke, Keith Busch
  Cc: Chaitanya Kulkarni, linux-nvme, Enzo Matsumiya, Martin Wilck

From: Martin Wilck <mwilck@suse.com>

This patch set adds a new subcommand "nvme monitor". In this mode,
nvme-cli runs continuously, monitors various events relevant for discovery,
and autoconnects to newly discovered subsystems.

This series is based on, and requires, my previously submitted patch series
"Some minor fixes/additions for nvme-cli".

The monitor mode is suitable for running as a systemd service. An appropriate
unit file is provided. As such, "nvme monitor" can be used as an alternative
to the current auto-connection mechanism based on udev rules and systemd
template units.

This method for discovery and autodetection has some advantages over the
current udev-rule based approach:

 * The monitor creates persistent discovery controllers if possible,
   and monitors them for AENs.

 * The monitor tracks "/etc/nvme/discovery.conf" changes using inotify.

 * The monitor keeps a record of existing NVMe transport connections and
   associated discovery controllers (if any). Thus it can avoid recreating
   discovery controllers if a persistent discovery controller is already
   present on a given transport address, without having to search sysfs
   for a matching controller.

 * The monitor is aware of ongoing discoveries (insofar as it has started
   them itself) and can queue up additional processes without risking
   missed events. Events can be missed with the current systemd-based
   activation of NVMe discovery.

 * I expect slightly lower resource usage compared to the current udev-rule
   based discovery, as fewer fork()/exec() operations are required. The effect
   will probably be small though, and I have no numbers.

 * The monitor will be able to support network discovery too, and react
   to mDNS records being published in the network. This functionality
   will be implemented using libavahi; Enzo Matsumiya is working on it.
   Once finished, "nvme monitor" will be able to track discovery events
   for every NVMeoF transport.
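
To illustrate the inotify point above, here is a minimal sketch. It is not
the code from this series. It assumes that the directory containing the
configuration file is watched rather than the file itself, because editors
often replace a file via rename(), which would invalidate a watch on the
file. The function names conf_watch_open() and conf_watch_changed() are made
up for this example.

```c
#include <assert.h>
#include <poll.h>
#include <stdio.h>
#include <string.h>
#include <sys/inotify.h>
#include <unistd.h>

/* Create an inotify instance watching "dir" (hypothetical helper).
 * Returns the inotify fd, or -1 on error. */
static int conf_watch_open(const char *dir)
{
	int fd;

	assert(dir != NULL);
	fd = inotify_init1(IN_CLOEXEC);
	if (fd < 0) {
		perror("inotify_init1");
		return -1;
	}
	/* File rewritten in place, replaced via rename(), or deleted */
	if (inotify_add_watch(fd, dir,
			      IN_CLOSE_WRITE | IN_MOVED_TO | IN_DELETE) < 0) {
		perror("inotify_add_watch");
		close(fd);
		return -1;
	}
	return fd;
}

/* Wait up to timeout_ms for events (hypothetical helper).
 * Returns 1 if an event for "name" was read, 0 on timeout or if only
 * other files in the directory changed, -1 on error. */
static int conf_watch_changed(int fd, const char *name, int timeout_ms)
{
	char buf[4096]
		__attribute__((aligned(__alignof__(struct inotify_event))));
	struct pollfd pfd = { .fd = fd, .events = POLLIN };
	ssize_t len;
	char *p;
	int rc, found = 0;

	rc = poll(&pfd, 1, timeout_ms);
	if (rc <= 0)
		return rc;
	len = read(fd, buf, sizeof(buf));
	if (len < 0)
		return -1;
	/* A single read() may return several variable-length events */
	p = buf;
	while (p < buf + len) {
		const struct inotify_event *ev =
			(const struct inotify_event *)p;

		if (ev->len > 0 && strcmp(ev->name, name) == 0)
			found = 1;
		p += sizeof(*ev) + ev->len;
	}
	return found;
}
```

A caller would open the watch on "/etc/nvme" once, call
conf_watch_changed(fd, "discovery.conf", timeout) from its main loop, and
re-read the file whenever the function returns 1.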

I've tested `fc_udev_device` handling for NVMeoFC with an Ontap target, and
AEN handling for RDMA using a Linux nvmet target.

# Changes wrt "RFC: add "nvme monitor" subcommand" patch series

A lot.

 * Separated out those changes that are not directly related to the monitor
   into a separate series, as requested by Sagi (see above). The part
   that changes some symbols in fabrics.c from static to global is still
   part of the "monitor" series though, as it doesn't make sense to do
   this without the monitor.

 * Reorganized the patches into fewer, bigger chunks, as requested by Hannes.

 * Changed the behavior of the monitor:

   - Autoconnect by default, and allow opting out with "-n/--no-connect".

   - Always create persistent discovery connections (Sagi): it makes no sense
     to use temporary discovery controllers if the monitor is running.

   - Don't try to create discovery controllers on every transport connection
     found. Sagi had pointed out that this behavior in the RFC was wrong.
     Instead, run discoveries from /etc/nvme/discovery.conf on startup.

   - Don't automatically disable 70-nvmf-autoconnect.rules (Hannes).
     I have moved the disabling into the systemd service file for now, because
     I think it makes no sense to run the monitor as a systemd service and run
     the discovery via udev rules at the same time. If this is also unwanted,
     I can remove it entirely of course.

 * Moved the event handling into a separate "library". This was motivated
   by the additional events monitored in the v2 series, and by the prospect
   of adding more (and network-related ones, where timeout handling will
   become important) when the mDNS support is merged. Most of my work
   actually went into this part: stabilizing the API, creating tests and
   fixing issues. I have published this separately at
   https://github.com/mwilck/minivent,
   together with the unit tests that I didn't want to add to the nvme-cli
   patch set at this time.

 * Added new features:

   - /etc/nvme/discovery.conf: Parse it on startup, and monitor changes with
     inotify.

   - parent/child messaging: allow children running discovery to communicate
     with the parent monitor process via a Unix socket. Without this, the
     discovery of newly created discovery controllers by the parent is
     fragile, because the monitor has no way to figure out whether a given
     controller was created by its own child or by another process. Also,
     it wasn't possible to pass existing discovery controller devices to
     children running discovery from the conf file, or for referrals. This
     had the effect that children would create a temporary discovery controller
     even though a persistent controller for the same connection existed
     already.

 * Use the "udev" udev monitor socket by default rather than "kernel".
   When I made the first submission, I was unaware that filtering on "kernel"
   netlink sockets is much less efficient than on "udev" sockets. Thus
   "kernel" is only used if udevd is not available.

 * Lots of bugs and minor issues fixed.
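
The parent/child messaging described above can be sketched as follows. This
is only an illustration under assumptions: it uses socketpair() with
SOCK_SEQPACKET, and struct notify_msg, child_notify() and spawn_and_wait()
are hypothetical names. The series defines its own socket setup and message
format. The point shown is that the parent can tell that a controller was
created by its own child, because the child reports it directly.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical message layout, for illustration only */
struct notify_msg {
	pid_t pid;		/* sender, lets the parent match its child */
	char device[32];	/* controller the child created, e.g. "nvme3" */
};

/* Child side: report the newly created discovery controller */
static int child_notify(int sock, const char *device)
{
	struct notify_msg msg = { .pid = getpid() };

	assert(device != NULL);
	/* msg is zero-initialized, so device stays NUL-terminated */
	strncpy(msg.device, device, sizeof(msg.device) - 1);
	if (send(sock, &msg, sizeof(msg), 0) != (ssize_t)sizeof(msg))
		return -1;
	return 0;
}

/* Parent side: fork a child and wait for its notification.
 * Returns 0 and fills *msg if a complete message from the forked
 * child was received, -1 otherwise. */
static int spawn_and_wait(const char *device, struct notify_msg *msg)
{
	int sv[2], status;
	pid_t pid;
	ssize_t n;

	if (socketpair(AF_UNIX, SOCK_SEQPACKET, 0, sv) < 0)
		return -1;
	pid = fork();
	if (pid < 0) {
		close(sv[0]);
		close(sv[1]);
		return -1;
	}
	if (pid == 0) {
		close(sv[0]);
		/* A real child would run the discovery here */
		_exit(child_notify(sv[1], device) ? 1 : 0);
	}
	close(sv[1]);
	/* Returns 0 (end of file) if the child exits without sending */
	n = recv(sv[0], msg, sizeof(*msg), 0);
	close(sv[0]);
	waitpid(pid, &status, 0);
	if (n != (ssize_t)sizeof(*msg) || msg->pid != pid)
		return -1;
	return 0;
}
```

Calling spawn_and_wait("nvme3", &msg) in the parent returns 0 with
msg.device set to "nvme3" (an arbitrary example name). SOCK_SEQPACKET was
chosen for the sketch because it preserves message boundaries and signals
end of file when the child exits.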

# Todo

 * Implement support for RDMA and TCP protocols. As noted above, Enzo
   Matsumiya has been working on this, and we are cooperating to merge
   our efforts.

Reviews and comments welcome.
Thanks,

PS: I've pushed both this series and the "minor fixes" series to
    https://github.com/linux-nvme/nvme-cli/pull/877. The CI fails
    because I don't know how to resolve the dependency of libudev
    in the Ubuntu / powerpc cross-compilation environment used there.
    Help would be appreciated.    

Martin Wilck (16):
  fabrics: export symbols required for monitor functionality
  nvme-cli: add code for event and timeout handling
  monitor: add basic "nvme monitor" functionality
  monitor: implement uevent handling
  conn-db: add simple connection registry
  monitor: monitor_discovery(): try to reuse existing controllers
  monitor: kill running discovery tasks on exit
  monitor: add option --cleanup / -C
  monitor: handling of add/remove uevents for nvme controllers
  monitor: discover from conf file on startup
  monitor: watch discovery.conf with inotify
  monitor: add parent/child messaging and "notify" message exchange
  monitor: add "query device" message exchange
  completions: add completions for nvme monitor
  nvmf-autoconnect: add unit file for nvme-monitor.service
  nvme-monitor(1): add man page for nvme-monitor

 .github/workflows/c-cpp.yml                   |    4 +
 Documentation/cmds-main.txt                   |    4 +
 Documentation/nvme-monitor.1                  |  180 +++
 Documentation/nvme-monitor.html               | 1018 ++++++++++++
 Documentation/nvme-monitor.txt                |  144 ++
 Makefile                                      |   21 +-
 common.h                                      |   17 +
 completions/bash-nvme-completion.sh           |    6 +-
 conn-db.c                                     |  425 +++++
 conn-db.h                                     |  171 ++
 event/event.c                                 |  481 ++++++
 event/event.h                                 |  460 ++++++
 event/timeout.c                               |  373 +++++
 event/timeout.h                               |  110 ++
 event/ts-util.c                               |  107 ++
 event/ts-util.h                               |  129 ++
 fabrics.c                                     |  436 +++---
 fabrics.h                                     |   52 +
 list.h                                        |  349 +++++
 monitor.c                                     | 1370 +++++++++++++++++
 monitor.h                                     |   14 +
 nvme-builtin.h                                |    1 +
 nvme.c                                        |   13 +
 nvmf-autoconnect/systemd/nvme-monitor.service |   18 +
 util/cleanup.c                                |    2 +
 util/cleanup.h                                |    1 +
 26 files changed, 5676 insertions(+), 230 deletions(-)
 create mode 100644 Documentation/nvme-monitor.1
 create mode 100644 Documentation/nvme-monitor.html
 create mode 100644 Documentation/nvme-monitor.txt
 create mode 100644 conn-db.c
 create mode 100644 conn-db.h
 create mode 100644 event/event.c
 create mode 100644 event/event.h
 create mode 100644 event/timeout.c
 create mode 100644 event/timeout.h
 create mode 100644 event/ts-util.c
 create mode 100644 event/ts-util.h
 create mode 100644 list.h
 create mode 100644 monitor.c
 create mode 100644 monitor.h
 create mode 100644 nvmf-autoconnect/systemd/nvme-monitor.service

-- 
2.29.2
