linux-nfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Richard Weinberger <richard@nod.at>
To: bfields <bfields@fieldses.org>
Cc: linux-nfs <linux-nfs@vger.kernel.org>,
	david <david@sigma-star.at>,
	luis turcitu <luis.turcitu@appsbroker.com>,
	david young <david.young@appsbroker.com>,
	david oberhollenzer <david.oberhollenzer@sigma-star.at>,
	trond myklebust <trond.myklebust@hammerspace.com>,
	anna schumaker <anna.schumaker@netapp.com>,
	chris chilvers <chris.chilvers@appsbroker.com>
Subject: Re: [RFC PATCH 1/6] Implement reexport helper library
Date: Wed, 9 Mar 2022 10:43:34 +0100 (CET)	[thread overview]
Message-ID: <692661836.127800.1646819014252.JavaMail.zimbra@nod.at> (raw)
In-Reply-To: <20220308214437.GB22644@fieldses.org>

Bruce,

----- Ursprüngliche Mail -----
> Von: "bfields" <bfields@fieldses.org>
> On Thu, Feb 17, 2022 at 02:15:26PM +0100, Richard Weinberger wrote:
>> +#define REEXPDB_SHM_NAME "/nfs_reexport_db_lock"
>> +#define REEXPDB_SHM_SZ 4096
>> +#define REEXPDB_INIT_LOCK NFS_STATEDIR "/reexpdb_init.lock"
>> +#define REEXPDB_DBFILE NFS_STATEDIR "/reexpdb.sqlite3"
> 
> I don't know much about sqlite--why do we need to do our own file
> locking?  If we do need to do it ourself, could we lock the database
> file instead instead of using a separate lock file?

Concurrent access to the database is synchronized using a shared rwlock (on shared memory).
reexpdb_init.lock is used to make sure that creating and initializing the shared memory/lock
happens once.
 
>> +static const char initdb_sql[] = "CREATE TABLE IF NOT EXISTS fsidnums (num
>> INTEGER PRIMARY KEY CHECK (num > 0 AND num < 4294967296), path TEXT UNIQUE);
>> CREATE TABLE IF NOT EXISTS subvolumes (path TEXT PRIMARY KEY); CREATE INDEX IF
>> NOT EXISTS idx_ids_path ON fsidnums (path);";
> 
> I'd personally find it easier to read if these were defined in the place
> where they're used.  (And, honestly, if this is just used once, maybe
> the definition is unnecessary.)

Ok.
 
> What are the two tables used for?  Naively I'd've thought the
> "subvolumes" table was redundant.

fsidnums is used to store generated and predefined fsid numbers.
It is only used in reexport modes auto-fsidnum and predefined-fsidnum.

subvolumes contains a list of subvolumes which a are likely in use by
a client. Up start all these paths will get touched such that they can
be exported.

>> +/*
>> + * This query is a little tricky. We use SQL to find and claim the smallest
>> free fsid number.
> 
> Yes, that is a little tricky.  Is it necessary?  My SQL Is rusty, but
> the database should be able to pick a unique value for us, shouldn't it?

SQLite can generate a unique value, but we cannot select the range.
It will give a value between 0 and 2^64.
We need an id between 1 and 2^32. 
 
>> + * To find a free fsid the fsidnums is left joined to itself but with an offset
>> of 1.
>> + * Everything after the UNION statement is to handle the corner case where
>> fsidnums
>> + * is empty. In this case we want 1 as first fsid number.
>> + */
>> +static const char new_fsidnum_by_path_sql[] = "INSERT INTO fsidnums VALUES
>> ((SELECT ids1.num + 1 FROM fsidnums AS ids1 LEFT JOIN fsidnums AS ids2 ON
>> ids2.num = ids1.num + 1 WHERE ids2.num IS NULL UNION SELECT 1 WHERE NOT EXISTS
>> (SELECT NULL FROM fsidnums WHERE num = 1) LIMIT 1), ?1) RETURNING num;";
>> +static const char fsidnum_by_path_sql[] = "SELECT num FROM fsidnums WHERE path
>> = ?1;";
>> +static const char add_crossed_volume_sql[] = "REPLACE INTO subvolumes
>> VALUES(?1);";
>> +static const char drop_crossed_volume_sql[] = "DELETE FROM subvolumes WHERE
>> path = ?1;";
>> +static const char get_crossed_volumes_sql[] = "SELECT path from subvolumes;";
> ...
>> +/*
>> + * reexpdb_init - Initialize reexport database
>> + *
>> + * Setup shared lock (database is concurrently used by multiple processes),
> 
> So, this should all work when rpc.mountd is run with --num_threads > 1?

Yes, that's why we need the shared rwlock.

Thanks,
//richard

  reply	other threads:[~2022-03-09  9:43 UTC|newest]

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-02-17 13:15 [RFC PATCH 0/6] nfs-utils: Improving NFS re-exports Richard Weinberger
2022-02-17 13:15 ` [RFC PATCH 1/6] Implement reexport helper library Richard Weinberger
2022-03-08 21:44   ` J. Bruce Fields
2022-03-09  9:43     ` Richard Weinberger [this message]
2022-03-09 14:19       ` bfields
2022-03-09 15:02         ` Richard Weinberger
2022-03-09 15:28           ` bfields
2022-02-17 13:15 ` [RFC PATCH 2/6] exports: Implement new export option reexport= Richard Weinberger
2022-03-08 22:10   ` J. Bruce Fields
2022-03-09  9:43     ` Richard Weinberger
2022-02-17 13:15 ` [RFC PATCH 3/6] export: Implement logic behind reexport= Richard Weinberger
2022-02-17 13:15 ` [RFC PATCH 4/6] export: Record mounted volumes Richard Weinberger
2022-02-17 13:15 ` [RFC PATCH 5/6] nfsd: statfs() every known subvolume upon start Richard Weinberger
2022-02-17 13:15 ` [RFC PATCH 6/6] export: Garbage collect orphaned subvolumes " Richard Weinberger
2022-02-17 16:33 ` [RFC PATCH 0/6] nfs-utils: Improving NFS re-exports J. Bruce Fields
2022-02-17 17:27   ` Richard Weinberger
2022-02-17 19:27     ` bfields
2022-02-17 20:15       ` Richard Weinberger
2022-02-17 20:18         ` bfields
2022-02-17 20:29           ` Richard Weinberger
2022-03-07  9:25   ` Richard Weinberger
2022-03-07 22:29     ` bfields
2022-04-19 20:20       ` Steve Dickson
2022-04-19 20:31         ` Richard Weinberger

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=692661836.127800.1646819014252.JavaMail.zimbra@nod.at \
    --to=richard@nod.at \
    --cc=anna.schumaker@netapp.com \
    --cc=bfields@fieldses.org \
    --cc=chris.chilvers@appsbroker.com \
    --cc=david.oberhollenzer@sigma-star.at \
    --cc=david.young@appsbroker.com \
    --cc=david@sigma-star.at \
    --cc=linux-nfs@vger.kernel.org \
    --cc=luis.turcitu@appsbroker.com \
    --cc=trond.myklebust@hammerspace.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).