All of lore.kernel.org
 help / color / mirror / Atom feed
From: Roberto Sassu <roberto.sassu@huaweicloud.com>
To: corbet@lwn.net, paul@paul-moore.com, jmorris@namei.org,
	serge@hallyn.com, akpm@linux-foundation.org, shuah@kernel.org,
	mcoquelin.stm32@gmail.com, alexandre.torgue@foss.st.com,
	mic@digikod.net
Cc: linux-security-module@vger.kernel.org, linux-doc@vger.kernel.org,
	linux-kernel@vger.kernel.org, linux-kselftest@vger.kernel.org,
	bpf@vger.kernel.org, zohar@linux.ibm.com,
	dmitry.kasatkin@gmail.com, linux-integrity@vger.kernel.org,
	wufan@linux.microsoft.com, pbrobinson@gmail.com,
	zbyszek@in.waw.pl, hch@lst.de, mjg59@srcf.ucam.org,
	pmatilai@redhat.com, jannh@google.com, dhowells@redhat.com,
	jikos@kernel.org, mkoutny@suse.com, ppavlu@suse.com,
	petr.vorel@gmail.com, mzerqung@0pointer.de, kgold@linux.ibm.com,
	Roberto Sassu <roberto.sassu@huawei.com>
Subject: [PATCH v4 14/14] docs: Add documentation of the digest_cache LSM
Date: Mon, 15 Apr 2024 16:24:36 +0200	[thread overview]
Message-ID: <20240415142436.2545003-15-roberto.sassu@huaweicloud.com> (raw)
In-Reply-To: <20240415142436.2545003-1-roberto.sassu@huaweicloud.com>

From: Roberto Sassu <roberto.sassu@huawei.com>

Add the documentation of the digest_cache LSM in Documentation/security.

Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
---
 Documentation/security/digest_cache.rst | 763 ++++++++++++++++++++++++
 Documentation/security/index.rst        |   1 +
 MAINTAINERS                             |   1 +
 3 files changed, 765 insertions(+)
 create mode 100644 Documentation/security/digest_cache.rst

diff --git a/Documentation/security/digest_cache.rst b/Documentation/security/digest_cache.rst
new file mode 100644
index 000000000000..f7c2b1bcf25b
--- /dev/null
+++ b/Documentation/security/digest_cache.rst
@@ -0,0 +1,763 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+================
+Digest_cache LSM
+================
+
+Introduction
+============
+
+Integrity detection and protection has long been a desirable feature, to
+reach a large user base and mitigate the risk of flaws in the software and
+attacks.
+
+However, while solutions exist, they struggle to reach the large user base,
+due to requiring higher than desired constraints on performance,
+flexibility and configurability, that only security conscious people are
+willing to accept.
+
+This is where the new digest_cache LSM comes into play, it offers
+additional support for new and existing integrity solutions, to make them
+faster and easier to deploy.
+
+
+Motivation
+==========
+
+The digest_cache LSM helps to address two important shortcomings of the
+Integrity Measurement Architecture (IMA): predictability of the Platform
+Configuration Registers (PCRs), and the provisioning of reference values to
+compare the calculated file digest against.
+
+Remote attestation, according to Trusted Computing Group (TCG)
+specifications, is done by replicating the PCR extend operation in
+software with the digests in the event log (in this case the IMA
+measurement list), and by comparing the obtained value with the PCR value
+signed by the TPM with the quote operation.
+
+Due to how the extend operation is performed, if measurements are done in
+a different order, the final PCR value will be different. That means that
+if measurements are done in parallel, there is no way to predict what the
+final PCR value will be, making impossible to seal data to a PCR value. If
+the PCR value was predictable, a system could for example prove its
+integrity by unsealing and using its private key, without sending every
+time the full list of measurements.
+
+Provisioning reference values for file digests is also a difficult task.
+The solution so far was to add file signatures to RPM packages, and
+possibly to DEB packages, so that IMA can verify them. While this undoubtly
+works, it also requires Linux distribution vendors to support the feature
+by rebuilding all their packages, and eventually extending their PKI to
+perform the additional signatures. It could also require developers extra
+work to deal with the additional data.
+
+On the other hand, since often packages carry the file digests themselves,
+it won't be actually needed to add file signatures. If the kernel was able
+to extract the file digests by itself, all the tasks mentioned above for
+the Linux distribution vendors won't be needed too. All current and past
+Linux distributions can be easily retrofitted to enable IMA appraisal with
+the file digests from the packages.
+
+Narrowing down the scope of a package parser to only extract specific
+information makes it small enough to accurately verify that it cannot harm
+the kernel. In fact, the parsers included with the digest_cache LSM have
+been verified with the formal verification tool Frama-C, albeit with a
+limited buffer size (the verification time grows considerably with bigger
+buffer sizes). The parsers with the Frama-C assertions are available here:
+
+https://github.com/robertosassu/rpm-formal/
+
+Frama-C asserts that the parsers don't read beyond their assigned buffer
+for any byte combination.
+
+An additional mitigation against corrupted digest lists consists in
+verifying the signature of the package first, before attempting to extract
+the file digests.
+
+
+Solution
+========
+
+The digest_cache LSM can help IMA to extend a PCR in a deterministic way.
+If IMA knows that a file comes from a Linux distribution, it can measure
+files in a different way: measure the list of digests coming from the
+distribution (e.g. RPM package headers), and subsequently measure a file if
+it is not found in that list.
+
+If the system executes known files, it does not matter in which order they
+are executed, because the PCR is not extended. That however means that the
+lists of digests must be measured in a deterministic way. The digest_cache
+LSM has a prefetching mechanism to make this happen, consisting in
+sequentially reading digest lists in a directory until it finds the
+requested one.
+
+The resulting IMA measurement list however has a disadvantage: it does not
+tell to remote verifiers whether files with digest in the measured digest
+lists have been accessed or not and when. Also the IMA measurement list
+would change after a software update.
+
+The digest_cache LSM can also help IMA for appraisal. Currently, IMA has
+to evaluate the signature of each file individually, and expects that the
+Linux vendors include those signatures together with the files in the
+packages.
+
+With the digest_cache LSM, IMA can simply lookup in the list of digests
+extracted from package headers, once the signature of those headers has
+been verified. The same approach can be followed by other LSMs, such as
+Integrity Policy Enforcement (IPE).
+
+
+Design
+======
+
+Digest cache
+------------
+
+Main idea
+~~~~~~~~~
+
+The digest_cache LSM extracts digests from a file, referred to as a digest
+list, and stores them in kernel memory in a structure named digest_cache.
+
+The digest_cache structure contains a set of per algorithm hash tables,
+where digests are stored, the digest list pathname, a reference counter,
+the integrity state of the digest list, and the inodes for which the digest
+cache is used.
+
+If a digest cache is created from a directory, its hash tables are empty
+and instead it contains a snapshot of the directory entries discovered with
+iterate_dir().
+
+The integrity state of digest caches created from regular files is
+evaluated independently by other LSMs, for example by verifying the
+signature of the digest list, and is provided to the digest_cache LSM
+through a dedicated API.
+
+The extracted digests can be used as reference values initially for
+integrity verification of file data and at a later stage for integrity
+verification of file metadata.
+
+The digest_cache LSM can extract digests from a digest list, only if it has
+a parser for its format. Currently, it supports a TLV-based and the RPM
+package header formats, and can support more in the future.
+
+
+Digest list lookup
+~~~~~~~~~~~~~~~~~~
+
+In order to build a digest cache and return it to the caller for performing
+a query, the digest_cache LSM must know which digest list to use. There are
+a few alternatives.
+
+(1) There is only one digest list and its path is specified as default
+location at build-time in the kernel configuration or at run-time through
+securityfs. The digest_cache LSM builds a single digest cache from that
+digest list and returns it to the caller.
+
+(2) The default location is a directory containing multiple digest lists.
+Unlike (1), the digest_cache LSM does not know which digest list to select,
+and creates an iterator with a snapshot of the directory entries. During a
+query, the digest_cache LSMs iteratively creates a digest cache for each
+directory entry and searches for the digest until there is a match.
+
+(3) Same as (2), but the digest list file name is stored as value of the
+new security.digest_list xattr in the inode for which the digest cache is
+requested. The digest_cache LSM can directly retrieve the digest list using
+the default directory as the base path and the xattr value as last path
+component.
+
+(4) Similar to (3), but the digest_cache LSM still creates a directory
+iterator like in (2). It reads digest lists with a file name that does not
+match the security.digest_list xattr, to trigger a measurement, and creates
+a digest cache from the matching one. This is also known as the prefetching
+mechanism, introduced later.
+
+
+Digest cache creation
+~~~~~~~~~~~~~~~~~~~~~
+
+Digest list naming convention
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Once the digest_cache LSM selected the digest list to use, it reads the
+file and calls the appropriate parser to extract the digests, based on the
+filename prefix.
+
+The expected digest list file name format is::
+
+ <digest list format>-<file name>
+
+where format can be for example ``tlv`` or ``rpm``, which make the
+digest_cache LSM call respectively the TLV or RPM parser.
+
+Alternatively, also the following format is supported::
+
+ <seq num>-<digest list format>-<file name>
+
+``<seq num>-`` defines how directory entries should be ordered in the
+directory iterator.
+
+Digest cache create API
+^^^^^^^^^^^^^^^^^^^^^^^
+
+The digest_cache LSM offers an API for parsers to initialize and add
+digests to the digest cache hash tables.
+
+It exposes digest_cache_htable_init() to initialize a hash table for a
+given algorithm, and to size it depending on the number of digests to add,
+normally known by the parsers before adding digests.
+
+The number of hash table slots is determined by dividing the number of
+digests to add by the desired average collision depth. The latter can be
+changed in the kernel configuration, to have a different tradeoff between
+digest lookup speed and memory occupation.
+
+It also exposes digest_cache_htable_add(), to let parsers add extracted
+digests to the new hash table. If parsers need to add digests created with
+different algorithms, they can create as many hash tables as they need.
+
+Finally, parsers can also call digest_cache_htable_lookup() to lookup a
+digest in the passed digest_cache.
+
+Digest cache caching on create
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Since the same digest cache can be requested multiple times for
+verification of different inodes (e.g. installed files belonging to the
+same software package), a pointer to the newly created digest cache (named
+dig_owner) is stored in the inode security blob of the digest list.
+
+Dig_owner check and assignment is protected by the dig_owner_mutex, also
+stored in the inode security blob. The first requestor instantiates and
+populates the new digest cache. The other lock contenders wait until the
+lock is released and until the first requestor clears the INIT_IN_PROGRESS
+bit in the digest cache bit mask. The latter is needed to avoid lock
+inversion with the code tracking changes on digest lists/default directory.
+
+
+Digest cache request
+~~~~~~~~~~~~~~~~~~~~
+
+Users of the digest_cache LSM can request a digest cache by calling
+digest_cache_get(), passing the inode for which they need a digest cache,
+and can release it with digest_cache_put() once they are done. As mentioned
+above, the digest_cache LSM determines which digest list the digest cache
+should be built/retrieved from.
+
+Digest cache caching on request
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+To avoid having to find the digest list inode for every digest_cache_get()
+call, also a pointer to the retrieved digest cache (named dig_user) is
+stored in the security blob of the inode for which the digest cache is
+requested.
+
+Dig_user is also protected by its own dig_user_mutex (stored in the same
+inode security blob) for check and assignment. Multiple requestors of a
+digest cache for the same inode have to wait until the first requestor
+finds the digest list inode and obtains the digest cache.
+
+Digest cache reference count
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Since digest cache pointers are stored in the inode security blobs and
+returned to the digest_cache_get() callers, the digest_cache LSM must track
+how many pointers are around, to avoid freeing a digest cache while it is
+still in use.
+
+The digest_cache LSM records the number of such pointers in a per digest
+cache reference count, and increments it every time the pointer is stored
+in a new inode security blob (either dig_owner or dig_user), or returned by
+digest_cache_get(), and decrements it when an inode is evicted from memory
+or a caller of digest_cache_get() calls digest_cache_put().
+
+
+Digest lookup
+~~~~~~~~~~~~~
+
+After a caller of digest_cache_get() obtains the desired digest cache, it
+can perform operations on it. The most important operation is querying for
+a digest, which can be performed by calling digest_cache_lookup().
+
+digest_cache_lookup() returns a numeric reference (digest_cache_found_t
+type), representing the digest cache containing the queried digest. It is
+not a pointer, to avoid it being accidentally passed to digest_cache_put().
+
+If the digest_cache LSM took the option (2) (multiple digest lists in the
+default directory but which one to use is unknown), digest_cache_get()
+returns an iterator instead, to be passed to digest_cache_lookup().
+
+Only the latter finally returns the digest cache containing the searched
+digest. If the digest is not found, digest_cache_lookup() returns zero.
+
+
+Verification data
+~~~~~~~~~~~~~~~~~
+
+Until now, the caller of the digest_cache LSM is assumed to always trust
+the returned digest cache from being created from authentic data. Or, there
+are security measures in place but not able to correlate reading a digest
+list with building a digest cache from it.
+
+The digest_cache LSM introduces a new mechanism for integrity providers to
+store verification data, i.e. their evaluation result of a digest list. It
+also allows callers of digest_cache_get() to later retrieve that
+information and decide whether or not they should use that digest cache.
+
+It achieves that by reserving space in the file descriptor security blob,
+and by setting the digest cache pointer in the digest list file descriptor.
+
+The digest_cache LSM supports multiple integrity providers at the same
+time, since multiple LSMs can implement the kernel_post_read_file LSM hook.
+Each provider is expected to choose an unique ID, so that the verification
+data can be given back through the same ID.
+
+Those integrity providers should implement the kernel_post_read_file LSM
+hook and call digest_cache_verif_set(), passing the digest list file
+descriptor, the unique ID and their evaluation result of the digest list.
+
+Callers of digest_cache_get() can call digest_cache_verif_get() to get
+the verification data, passing the returned digest cache pointer and the
+desired integrity provider ID. However, if the digest cache returned was an
+iterator, that call results in a NULL pointer, since the iterator is
+not populated with any digest list.
+
+In that case, those callers have to call digest_cache_lookup() to get the
+numeric reference of the digest cache containing the digest (thus populated
+from a digest list), and pass it to digest_cache_verif_get() after
+conversion to a digest cache pointer (with digest_cache_from_found_t()).
+
+
+Tracking changes
+~~~~~~~~~~~~~~~~
+
+After a digest cache has been built and its pointer has been set in the
+inode security blob, it might happen that there are changes in the digest
+lists, in the default directory and in the value of the
+security.digest_list xattr.
+
+All these changes may influence which digest cache is returned to callers
+of digest_cache_get() and which digests in the digest cache might be
+searched.
+
+The digest_cache LSM monitors such changes by registering to multiple LSM
+hooks (path_truncate, file_release, inode_unlink, inode_rename,
+inode_post_setxattr and inode_post_removexattr). Except for the last two,
+it accesses the dig_owner pointer in the affected inode security blob and
+sets the RESET bit.
+
+The next time that digest cache is requested, both dig_user and dig_owner
+are passed to digest_cache_put() and cleared. A new digest cache is
+created, as if there wasn't one in the first place.
+
+For the last two hooks, the RESET_USER bit is set instead, to limit
+clearing dig_user, since only retrieval of the digest list could change
+after modifying the security.digest_list xattr, and not the digest cache
+itself.
+
+Nothing changes for callers of digest_cache_get(), since they still hold
+the old digest cache pointer, despite that has been replaced in the inode
+security blobs. The old digest cache pointer will not be freed until those
+callers also call digest_cache_put() and the reference count reaches zero.
+
+Notify changes
+~~~~~~~~~~~~~~
+
+While new calls to digest_cache_get() result in a new digest cache to be
+returned, resetting the previous digest cache does not reflect in a reset
+of possibly cached security decisions based on that digest cache.
+
+IMA for example, would not be able to recheck a file digest against a
+modified digest cache, since it is not aware of the reset in the first
+place.
+
+Introduce a subscription-based notification mechanism, that dispatches to
+the interested parties events which include the type of event (e.g. reset)
+and the digest cache and inodes affected. A user of the digest_cache LSM
+can become a subscriber by calling digest_cache_register_notifier() and can
+unsubscribe by calling digest_cache_unregister_notifier().
+
+During a digest_cache_get(), add the inode for which the digest cache was
+requested to a notification list of the same digest cache. When the RESET
+bit is set, emit a event for each inode in that notification list, so that
+IMA and the other integrity providers can eventually invalidate their
+cached security decision on that inode.
+
+On a file digest cache reset, notify also users of the parent directory
+digest cache, since they might have looked up digests through that digest
+cache. Those users will see changes by performing another lookup.
+
+When the RESET_USER bit is set, emit a notification just for the inode
+signalled by the LSM hook, since the operation causing a reset
+(set/removexattr) only affects the link between the inode and the digest
+cache, and not the digest cache itself.
+
+Prefetching mechanism
+~~~~~~~~~~~~~~~~~~~~~
+
+One of the objectives of the digest_cache LSM is to make a TPM PCR
+predictable, by having digest lists measured in a deterministic order.
+Without the prefetching mechanism, digest lists are measured in a
+non-deterministic order, since the inodes for which a digest cache can be
+requested are accessed in a non-deterministic order too.
+
+The prefetching mechanism, when enabled by setting the new
+security.dig_prefetch xattr to 1, forces digest lists to be looked up by
+their file name in the list of the directory entries of the iterator
+created for the default directory.
+
+The predictability of the PCR is ensured by reading both matching and
+non-matching digest lists during the search, so that integrity providers
+can measure them, and by only creating a digest cache for the matching one.
+In this way, it does not matter if a digest list later in the list of
+directory entries is requested before a earlier one, since all digest lists
+until that point are measured anyway.
+
+However, while this mechanism ensures predictability of the PCR, it could
+also introduce significant latencies, especially if the matching digest
+list is very late in the list of directory entries. Before a digest cache
+is returned from that digest list, hundreds or thousands of digest lists
+could have to be read first.
+
+Then, the ``[<seq num>-]`` prefix in the digest list file name comes at
+hand, since it determines the order of directory entries in the iterator
+(entries with lower seq nums are before entries with higher seq nums).
+Digest lists without that prefix are added at the end of iterator list,
+in the same order as iterate_dir() shows them.
+
+With ``[<seq num>-]``, the latency of digest cache creation when the
+prefetching mechanism is enabled can be significantly reduced for example
+by ordering digest lists by their appearance in the IMA measurement list,
+since that list reflects the order in which digest lists are requested at
+boot.
+
+While digest lists can be requested in a slightly different order due to
+the non-deterministic access to inodes, the differences should be minimal,
+causing only fewer extra digest lists to be read before the right one is
+found.
+
+Ordering directory entries can also improve digest queries requiring
+iteration on all digest lists in the default directory. If directory
+entries are ordered by their appearance in the IMA measurement list, a
+digest is found faster because most likely it is searched in the same
+order as when the IMA measurement list was recorded, and thus its
+digest list comes earlier than the others in the list of the directory
+entries of the iterator.
+
+
+Data structures and API
+=======================
+
+Data structures
+---------------
+
+These are the data structures defined and used internally by the
+digest_cache LSM.
+
+.. kernel-doc:: security/digest_cache/internal.h
+
+
+Public API
+----------
+
+This API is meant to be used by users of the digest_cache LSM.
+
+.. kernel-doc:: include/linux/digest_cache.h
+		:identifiers: digest_cache_found_t
+		              digest_cache_from_found_t
+
+.. kernel-doc:: security/digest_cache/main.c
+		:identifiers: digest_cache_get digest_cache_put
+
+.. kernel-doc:: security/digest_cache/htable.c
+		:identifiers: digest_cache_lookup
+
+.. kernel-doc:: security/digest_cache/verif.c
+		:identifiers: digest_cache_verif_set digest_cache_verif_get
+
+.. kernel-doc:: security/digest_cache/notifier.c
+		:identifiers: digest_cache_register_notifier
+			      digest_cache_unregister_notifier
+
+
+Parser API
+----------
+
+This API is meant to be used by digest list parsers.
+
+.. kernel-doc:: security/digest_cache/htable.c
+		:identifiers: digest_cache_htable_init
+		              digest_cache_htable_add
+			      digest_cache_htable_lookup
+
+
+Digest List Formats
+===================
+
+tlv
+---
+
+The Type-Length-Value (TLV) format was chosen for its extensibility.
+Additional fields can be added without breaking compatibility with old
+versions of the parser.
+
+The layout of a tlv digest list is the following::
+
+ [header: DIGEST_LIST_FILE, num fields, total len]
+ [field: DIGEST_LIST_ALGO, length, value]
+ [field: DIGEST_LIST_ENTRY#1, length, value (below)]
+  |- [header: DIGEST_LIST_ENTRY_DATA, num fields, total len]
+  |- [DIGEST_LIST_ENTRY_DIGEST#1, length, file digest]
+  |- [DIGEST_LIST_ENTRY_PATH#1, length, file path]
+ [field: DIGEST_LIST_ENTRY#N, length, value (below)]
+  |- [header: DIGEST_LIST_ENTRY_DATA, num fields, total len]
+  |- [DIGEST_LIST_ENTRY_DIGEST#N, length, file digest]
+  |- [DIGEST_LIST_ENTRY_PATH#N, length, file path]
+
+DIGEST_LIST_ALGO is a field to specify the algorithm of the file digest.
+DIGEST_LIST_ENTRY is a nested TLV structure with the following fields:
+DIGEST_LIST_ENTRY_DIGEST contains the file digest; DIGEST_LIST_ENTRY_PATH
+contains the file path.
+
+
+rpm
+---
+
+The rpm digest list is basically a subset of the RPM package header.
+Its format is::
+
+ [RPM magic number]
+ [RPMTAG_IMMUTABLE]
+
+RPMTAG_IMMUTABLE is a section of the full RPM header containing the part
+of the header that was signed, and whose signature is stored in the
+RPMTAG_RSAHEADER section.
+
+
+Appended Signature
+------------------
+
+Digest lists can have a module-style appended signature, that can be used
+for appraisal with IMA. The signature type can be PKCS#7, as for kernel
+modules, or a different type.
+
+
+History
+=======
+
+The original name of this work was IMA Digest Lists, which was somehow
+considered too invasive. The code was moved to a separate component named
+DIGLIM (DIGest Lists Integrity Module), with the purpose of removing the
+complexity away of IMA, and also adding the possibility of using it with
+other kernel components (e.g. Integrity Policy Enforcement, or IPE).
+
+The design changed significantly, so DIGLIM was renamed to digest_cache
+LSM, as the name better reflects what the new component does.
+
+Since it was originally proposed, in 2017, this work grew up a lot thanks
+to various comments/suggestions. It became integrally part of the openEuler
+distribution since end of 2020.
+
+The most important difference between the old the current version is moving
+from a centralized repository of file digests to a per-package repository.
+This significantly reduces the memory pressure, since digest lists are
+loaded into kernel memory only when they are actually needed. Also, file
+digests are automatically unloaded from kernel memory at the same time
+inodes are evicted from memory during reclamation.
+
+
+Performance
+===========
+
+System specification
+--------------------
+
+The tests have been performed on a Fedora 38 virtual machine with 4 cores
+(AMD EPYC-Rome, no hyperthreading), 16 GB of RAM, no TPM/TPM passthrough/
+emulated. The QEMU process has been pinned to 4 real CPU cores and its
+priority was set to -20.
+
+
+Benchmark tool
+--------------
+
+The digest_cache LSM has been tested with an ad-hoc benchmark tool that
+creates 20000 files with a random size up to 100 bytes and randomly adds
+their digest to one of 303 digest lists. The number of digest lists has
+been derived from the ratio (66) digests/packages (124174/1883) found in
+the testing virtual machine (hence, 20000/66 = 303). IMA signatures have
+been done with ECDSA NIST P-384.
+
+The benchmark tool then creates a list of 20000 files to be accessed,
+randomly chosen (there can be duplicates). This is necessary to make the
+results reproducible across reboots (by always replaying the same
+operations). The benchmark reads (sequentially and in parallel) the files
+from the list 2 times, flushing the kernel caches before each read.
+
+Each test has been performed 5 times, and the average value is taken.
+
+
+Purpose of the benchmark
+------------------------
+
+The purpose of the benchmark is to show the performance difference of IMA
+between the current behavior, and by using the digest_cache LSM.
+
+
+IMA measurement policy: no cache
+--------------------------------
+
+.. code-block:: bash
+
+ measure func=FILE_CHECK fowner=2001 pcr=12
+
+
+IMA measurement policy: cache
+-----------------------------
+
+.. code-block:: bash
+
+ measure func=DIGEST_LIST_CHECK pcr=12
+ measure func=FILE_CHECK fowner=2001 digest_cache=data pcr=12
+
+
+IMA Measurement Results
+-----------------------
+
+Sequential
+~~~~~~~~~~
+
+This test was performed reading files sequentially, and waiting for the
+current read to terminate before beginning a new one.
+
+::
+
+                      +-------+------------------------+-----------+
+                      | meas. | time no/p/vTPM (sec.)  | slab (KB) |
+ +--------------------+-------+------------------------+-----------+
+ | no cache           | 12313 | 33.65 / 102.51 / 47.13 |   84170   |
+ +--------------------+-------+------------------------+-----------+
+ | cache, no prefetch |   304 | 34.04 / 33.32 / 33.09  |   81159   |
+ +--------------------+-------+------------------------+-----------+
+ | cache, prefetch    |   304 | 34.02 / 33.31 / 33.15  |   81122   |
+ +--------------------+-------+------------------------+-----------+
+
+The table shows that 12313 measurements (boot_aggregate + files) have been
+made without the digest cache, and 304 with the digest cache
+(boot_aggregate + digest lists). Consequently, the memory occupation
+without the cache is higher due to the higher number of measurements.
+
+Not surprisingly, for the same reason, also the test time is significantly
+higher without the digest cache when the physical or virtual TPM is used.
+
+In terms of pure performance, first number in the third column, it can be
+seen that there are not really performance differences between using or not
+using the digest cache.
+
+Prefetching does not add overhead, also because digest lists were ordered
+according to their appearance in the IMA measurement list (which minimize
+the digest lists to prefetch).
+
+
+Parallel
+~~~~~~~~
+
+This test was performed reading files in parallel, not waiting for the
+current read to terminate.
+
+::
+
+                      +-------+-----------------------+-----------+
+                      | meas. | time no/p/vTPM (sec.) | slab (KB) |
+ +--------------------+-------+-----------------------+-----------+
+ | no cache           | 12313 | 14.08 / 79.09 / 22.70 |   85138   |
+ +--------------------+-------+-----------------------+-----------+
+ | cache, no prefetch |   304 | 14.44 / 15.11 / 14.96 |   85777   |
+ +--------------------+-------+-----------------------+-----------+
+ | cache, prefetch    |   304 | 14.30 / 15.41 / 14.40 |   83294   |
+ +--------------------+-------+-----------------------+-----------+
+
+Also in this case, the physical TPM causes the biggest delay especially
+without digest cache, where a higher number of measurements need to be
+extended in the TPM.
+
+The digest_cache LSM does not introduce a noticeable overhead in all
+scenarios.
+
+
+IMA appraisal policy: no cache
+------------------------------
+
+.. code-block:: bash
+
+ appraise func=FILE_CHECK fowner=2001
+
+
+IMA appraisal policy: cache
+---------------------------
+
+.. code-block:: bash
+
+ appraise func=DIGEST_LIST_CHECK
+ appraise func=FILE_CHECK fowner=2001 digest_cache=data
+
+
+IMA Appraisal Results
+---------------------
+
+Sequential
+~~~~~~~~~~
+
+This test was performed reading files sequentially, and waiting for the
+current read to terminate before beginning a new one.
+
+::
+
+                              +-------------+-------------+-----------+
+                              |    files    | time (sec.) | slab (KB) |
+ +----------------------------+-------------+-------------+-----------+
+ | appraise (ECDSA sig)       |    12312    |    96.74    |   78827   |
+ +----------------------------+-------------+-------------+-----------+
+ | appraise (cache)           | 12312 + 303 |    33.09    |   80854   |
+ +----------------------------+-------------+-------------+-----------+
+ | appraise (cache, prefetch) | 12312 + 303 |    33.42    |   81050   |
+ +----------------------------+-------------+-------------+-----------+
+
+This test shows a huge performance difference from verifying the signature
+of 12312 files as opposed to just verifying the signature of 303 digest
+lists, and looking up the digest of the files being read.
+
+There are some differences in terms of memory occupation, which is quite
+expected due to the fact that we have to take into account the digest
+caches loaded in memory, while with the standard appraisal they don't
+exist.
+
+
+Parallel
+~~~~~~~~
+
+This test was performed reading files in parallel, not waiting for the
+current read to terminate.
+
+::
+
+                              +-------------+-------------+-----------+
+                              |    files    | time (sec.) | slab (KB) |
+ +----------------------------+-------------+-------------+-----------+
+ | appraise (ECDSA sig)       |    12312    |    27.68    |   80596   |
+ +----------------------------+-------------+-------------+-----------+
+ | appraise (cache)           | 12313 + 303 |    14.96    |   80778   |
+ +----------------------------+-------------+-------------+-----------+
+ | appraise (cache, prefetch) | 12313 + 303 |    14.78    |   83354   |
+ +----------------------------+-------------+-------------+-----------+
+
+The difference is less marked when performing the read in parallel. Also,
+more memory seems to be occupied in the prefetch case.
+
+
+How to Test
+===========
+
+Please follow the instructions here:
+
+https://github.com/linux-integrity/digest-cache-tools
diff --git a/Documentation/security/index.rst b/Documentation/security/index.rst
index 59f8fc106cb0..34933e13c509 100644
--- a/Documentation/security/index.rst
+++ b/Documentation/security/index.rst
@@ -19,3 +19,4 @@ Security Documentation
    digsig
    landlock
    secrets/index
+   digest_cache
diff --git a/MAINTAINERS b/MAINTAINERS
index d7f700da009e..67b1fb3ab0ac 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -6197,6 +6197,7 @@ DIGEST_CACHE LSM
 M:	Roberto Sassu <roberto.sassu@huawei.com>
 L:	linux-security-module@vger.kernel.org
 S:	Maintained
+F:	Documentation/security/digest_cache.rst
 F:	security/digest_cache/
 F:	tools/testing/selftests/digest_cache/
 
-- 
2.34.1


  parent reply	other threads:[~2024-04-15 14:30 UTC|newest]

Thread overview: 42+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-04-15 14:24 [PATCH v4 00/14] security: digest_cache LSM Roberto Sassu
2024-04-15 14:24 ` [PATCH v4 01/14] lib: Add TLV parser Roberto Sassu
2024-04-15 19:19   ` Jarkko Sakkinen
2024-04-15 21:07     ` Randy Dunlap
2024-04-16 14:23       ` Jarkko Sakkinen
2024-04-15 14:24 ` [PATCH v4 02/14] security: Introduce the digest_cache LSM Roberto Sassu
2024-04-15 19:31   ` Jarkko Sakkinen
2024-04-16  7:09     ` Roberto Sassu
2024-04-16 14:33       ` Jarkko Sakkinen
2024-04-17 17:00         ` Roberto Sassu
2024-04-15 14:24 ` [PATCH v4 03/14] digest_cache: Add securityfs interface Roberto Sassu
2024-04-15 19:32   ` Jarkko Sakkinen
2024-04-16 10:15     ` Roberto Sassu
2024-04-16 14:38       ` Jarkko Sakkinen
2024-04-15 14:24 ` [PATCH v4 04/14] digest_cache: Add hash tables and operations Roberto Sassu
2024-04-15 19:36   ` Jarkko Sakkinen
2024-04-16 10:28     ` Roberto Sassu
2024-04-15 14:24 ` [PATCH v4 05/14] digest_cache: Populate the digest cache from a digest list Roberto Sassu
2024-04-15 14:24 ` [PATCH v4 06/14] digest_cache: Parse tlv digest lists Roberto Sassu
2024-04-15 14:24 ` [PATCH v4 07/14] digest_cache: Parse rpm " Roberto Sassu
2024-04-15 14:24 ` [PATCH v4 08/14] digest_cache: Add management of verification data Roberto Sassu
2024-04-15 14:24 ` [PATCH v4 09/14] digest_cache: Add support for directories Roberto Sassu
2024-04-15 19:39   ` Jarkko Sakkinen
2024-04-16 10:30     ` Roberto Sassu
2024-04-15 14:24 ` [PATCH v4 10/14] digest cache: Prefetch digest lists if requested Roberto Sassu
2024-04-15 19:42   ` Jarkko Sakkinen
2024-04-16 10:34     ` Roberto Sassu
2024-04-16 14:47       ` Jarkko Sakkinen
2024-04-15 14:24 ` [PATCH v4 11/14] digest_cache: Reset digest cache on file/directory change Roberto Sassu
2024-04-15 19:44   ` Jarkko Sakkinen
2024-04-16 10:37     ` Roberto Sassu
2024-04-15 14:24 ` [PATCH v4 12/14] digest_cache: Notify digest cache events Roberto Sassu
2024-04-15 14:24 ` [PATCH v4 13/14] selftests/digest_cache: Add selftests for digest_cache LSM Roberto Sassu
2024-04-15 19:47   ` Jarkko Sakkinen
2024-04-16 10:39     ` Roberto Sassu
2024-04-15 14:24 ` Roberto Sassu [this message]
2024-04-15 19:18 ` [PATCH v4 00/14] security: " Jarkko Sakkinen
2024-04-16  6:56   ` Roberto Sassu
2024-04-16  4:49 ` Bagas Sanjaya
     [not found]   ` <66201cd2.df0a0220.a8ad5.6fbaSMTPIN_ADDED_BROKEN@mx.google.com>
2024-04-19 11:18     ` Bagas Sanjaya
2024-04-19 20:05       ` Jarkko Sakkinen
2024-04-19 23:29       ` Roberto Sassu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20240415142436.2545003-15-roberto.sassu@huaweicloud.com \
    --to=roberto.sassu@huaweicloud.com \
    --cc=akpm@linux-foundation.org \
    --cc=alexandre.torgue@foss.st.com \
    --cc=bpf@vger.kernel.org \
    --cc=corbet@lwn.net \
    --cc=dhowells@redhat.com \
    --cc=dmitry.kasatkin@gmail.com \
    --cc=hch@lst.de \
    --cc=jannh@google.com \
    --cc=jikos@kernel.org \
    --cc=jmorris@namei.org \
    --cc=kgold@linux.ibm.com \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-integrity@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-kselftest@vger.kernel.org \
    --cc=linux-security-module@vger.kernel.org \
    --cc=mcoquelin.stm32@gmail.com \
    --cc=mic@digikod.net \
    --cc=mjg59@srcf.ucam.org \
    --cc=mkoutny@suse.com \
    --cc=mzerqung@0pointer.de \
    --cc=paul@paul-moore.com \
    --cc=pbrobinson@gmail.com \
    --cc=petr.vorel@gmail.com \
    --cc=pmatilai@redhat.com \
    --cc=ppavlu@suse.com \
    --cc=roberto.sassu@huawei.com \
    --cc=serge@hallyn.com \
    --cc=shuah@kernel.org \
    --cc=wufan@linux.microsoft.com \
    --cc=zbyszek@in.waw.pl \
    --cc=zohar@linux.ibm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.