From: Richard Weinberger <richard@nod.at>
To: bfields <bfields@fieldses.org>
Cc: linux-nfs <linux-nfs@vger.kernel.org>,
david <david@sigma-star.at>,
luis turcitu <luis.turcitu@appsbroker.com>,
david young <david.young@appsbroker.com>,
david oberhollenzer <david.oberhollenzer@sigma-star.at>,
trond myklebust <trond.myklebust@hammerspace.com>,
anna schumaker <anna.schumaker@netapp.com>,
chris chilvers <chris.chilvers@appsbroker.com>
Subject: Re: [RFC PATCH 1/6] Implement reexport helper library
Date: Wed, 9 Mar 2022 10:43:34 +0100 (CET) [thread overview]
Message-ID: <692661836.127800.1646819014252.JavaMail.zimbra@nod.at> (raw)
In-Reply-To: <20220308214437.GB22644@fieldses.org>
Bruce,
----- Ursprüngliche Mail -----
> Von: "bfields" <bfields@fieldses.org>
> On Thu, Feb 17, 2022 at 02:15:26PM +0100, Richard Weinberger wrote:
>> +#define REEXPDB_SHM_NAME "/nfs_reexport_db_lock"
>> +#define REEXPDB_SHM_SZ 4096
>> +#define REEXPDB_INIT_LOCK NFS_STATEDIR "/reexpdb_init.lock"
>> +#define REEXPDB_DBFILE NFS_STATEDIR "/reexpdb.sqlite3"
>
> I don't know much about sqlite--why do we need to do our own file
> locking? If we do need to do it ourself, could we lock the database
> file instead instead of using a separate lock file?
Concurrent access to the database is synchronized using a shared rwlock (on shared memory).
reexpdb_init.lock is used to make sure that creating and initializing the shared memory/lock
happens once.
>> +static const char initdb_sql[] = "CREATE TABLE IF NOT EXISTS fsidnums (num
>> INTEGER PRIMARY KEY CHECK (num > 0 AND num < 4294967296), path TEXT UNIQUE);
>> CREATE TABLE IF NOT EXISTS subvolumes (path TEXT PRIMARY KEY); CREATE INDEX IF
>> NOT EXISTS idx_ids_path ON fsidnums (path);";
>
> I'd personally find it easier to read if these were defined in the place
> where they're used. (And, honestly, if this is just used once, maybe
> the definition is unnecessary.)
Ok.
> What are the two tables used for? Naively I'd've thought the
> "subvolumes" table was redundant.
fsidnums is used to store generated and predefined fsid numbers.
It is only used in reexport modes auto-fsidnum and predefined-fsidnum.
subvolumes contains a list of subvolumes which a are likely in use by
a client. Up start all these paths will get touched such that they can
be exported.
>> +/*
>> + * This query is a little tricky. We use SQL to find and claim the smallest
>> free fsid number.
>
> Yes, that is a little tricky. Is it necessary? My SQL Is rusty, but
> the database should be able to pick a unique value for us, shouldn't it?
SQLite can generate a unique value, but we cannot select the range.
It will give a value between 0 and 2^64.
We need an id between 1 and 2^32.
>> + * To find a free fsid the fsidnums is left joined to itself but with an offset
>> of 1.
>> + * Everything after the UNION statement is to handle the corner case where
>> fsidnums
>> + * is empty. In this case we want 1 as first fsid number.
>> + */
>> +static const char new_fsidnum_by_path_sql[] = "INSERT INTO fsidnums VALUES
>> ((SELECT ids1.num + 1 FROM fsidnums AS ids1 LEFT JOIN fsidnums AS ids2 ON
>> ids2.num = ids1.num + 1 WHERE ids2.num IS NULL UNION SELECT 1 WHERE NOT EXISTS
>> (SELECT NULL FROM fsidnums WHERE num = 1) LIMIT 1), ?1) RETURNING num;";
>> +static const char fsidnum_by_path_sql[] = "SELECT num FROM fsidnums WHERE path
>> = ?1;";
>> +static const char add_crossed_volume_sql[] = "REPLACE INTO subvolumes
>> VALUES(?1);";
>> +static const char drop_crossed_volume_sql[] = "DELETE FROM subvolumes WHERE
>> path = ?1;";
>> +static const char get_crossed_volumes_sql[] = "SELECT path from subvolumes;";
> ...
>> +/*
>> + * reexpdb_init - Initialize reexport database
>> + *
>> + * Setup shared lock (database is concurrently used by multiple processes),
>
> So, this should all work when rpc.mountd is run with --num_threads > 1?
Yes, that's why we need the shared rwlock.
Thanks,
//richard
next prev parent reply other threads:[~2022-03-09 9:43 UTC|newest]
Thread overview: 24+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-02-17 13:15 [RFC PATCH 0/6] nfs-utils: Improving NFS re-exports Richard Weinberger
2022-02-17 13:15 ` [RFC PATCH 1/6] Implement reexport helper library Richard Weinberger
2022-03-08 21:44 ` J. Bruce Fields
2022-03-09 9:43 ` Richard Weinberger [this message]
2022-03-09 14:19 ` bfields
2022-03-09 15:02 ` Richard Weinberger
2022-03-09 15:28 ` bfields
2022-02-17 13:15 ` [RFC PATCH 2/6] exports: Implement new export option reexport= Richard Weinberger
2022-03-08 22:10 ` J. Bruce Fields
2022-03-09 9:43 ` Richard Weinberger
2022-02-17 13:15 ` [RFC PATCH 3/6] export: Implement logic behind reexport= Richard Weinberger
2022-02-17 13:15 ` [RFC PATCH 4/6] export: Record mounted volumes Richard Weinberger
2022-02-17 13:15 ` [RFC PATCH 5/6] nfsd: statfs() every known subvolume upon start Richard Weinberger
2022-02-17 13:15 ` [RFC PATCH 6/6] export: Garbage collect orphaned subvolumes " Richard Weinberger
2022-02-17 16:33 ` [RFC PATCH 0/6] nfs-utils: Improving NFS re-exports J. Bruce Fields
2022-02-17 17:27 ` Richard Weinberger
2022-02-17 19:27 ` bfields
2022-02-17 20:15 ` Richard Weinberger
2022-02-17 20:18 ` bfields
2022-02-17 20:29 ` Richard Weinberger
2022-03-07 9:25 ` Richard Weinberger
2022-03-07 22:29 ` bfields
2022-04-19 20:20 ` Steve Dickson
2022-04-19 20:31 ` Richard Weinberger
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=692661836.127800.1646819014252.JavaMail.zimbra@nod.at \
--to=richard@nod.at \
--cc=anna.schumaker@netapp.com \
--cc=bfields@fieldses.org \
--cc=chris.chilvers@appsbroker.com \
--cc=david.oberhollenzer@sigma-star.at \
--cc=david.young@appsbroker.com \
--cc=david@sigma-star.at \
--cc=linux-nfs@vger.kernel.org \
--cc=luis.turcitu@appsbroker.com \
--cc=trond.myklebust@hammerspace.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).