From: Eric Wong <email@example.com>
To: Bjorn Helgaas <firstname.lastname@example.org>
Cc: Joey Pabalinas <email@example.com>, firstname.lastname@example.org,
    email@example.com, Linus Torvalds <firstname.lastname@example.org>,
    Greg Kroah-Hartman <email@example.com>,
    Konstantin Ryabitsev <firstname.lastname@example.org>,
    Eric Biederman <email@example.com>,
    Jasper Spaans <firstname.lastname@example.org>
Subject: Re: [RFC] LKML Archive in Maildir Format
Date: Tue, 5 Mar 2019 23:26:00 +0000
Message-ID: <20190305232600.GA12110@dcvr>
In-Reply-To: <CAErSpo5a2oO=5byEuA5AouS=kBmj7ihw2EVYAvJcdti29Tf1HQ@mail.gmail.com>

Bjorn Helgaas <email@example.com> wrote:
> OK, so I understand how to clone archives from lore.kernel.org and how
> to convert a git archive to a maildir (thanks, Konstantin!)
>
> What I *don't* understand is how to effectively read this locally.
> Ideally I'd like to run mutt, possibly with notmuch for indexing. But
> a maildir with 3M files seems impractical. I did actually try it
> (without notmuch), but it takes mutt about 5 minutes to start up. And
> the maildir is about 23G, compared with 7.5G for the git archive.

Right, relying on Maildir for long-term storage of giant archives
is not a usable solution with any general purpose FSes I know
about. git itself had the same problem with loose object
scalability in the old days and packs were invented as a result.

> Any pointers? I guess there's no mutt backend that can read a
> public-inbox archive directly?

There are mutt patches to support reading over NNTP, so that works:

    mutt -f news://$INBOX_HOST/$INBOX_NEWSGROUP

I don't think mutt handles mboxrd 100% correctly, but it's close
enough that you can download the gzipped mboxrd of a search
query and open it via "mutt -f /path/to/downloaded/mbox.gz"

    curl -XPOST -OJ "$INBOX_URL/?q=$SEARCH_QUERY&x=m"

POST is required(*), and -OJ lets it use the Content-Disposition:
header for a meaningful server-generated name, but you can also
redirect the result to whatever you want.

For all messages since March 1, you could use:

    SEARCH_QUERY=d:20190301..

All the supported search queries are documented in
$INBOX_URL/_/text/help/ and the search prefixes (e.g. "d:", "s:",
"b:") are modeled after what's in mairix. You'll need to escape
the queries for URIs (e.g. " " => "+", and so on).

Xapian requires date ranges to be denoted with ".." whereas
mairix uses "-" for ranges.

The main thing public-inbox search misses from mairix is support
for "-t", which grabs non-matching messages from the same thread.
I would like to support that someday, but don't have enough time
(or funding) to make it happen at the moment.

(*) to reliably avoid wasting resources from spiders/prefetchers
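
The URI escaping and POST download described above can be sketched in Python, as a rough equivalent of the curl command. This is only a sketch under stated assumptions: build_search_url and download_mbox are hypothetical helper names, and https://example.org/inbox is a placeholder for $INBOX_URL, not a real inbox. Only the q= and x=m parameters come from the message above.

```python
import shutil
import urllib.parse
import urllib.request


def build_search_url(inbox_url: str, query: str) -> str:
    """Return the URL that asks for a gzipped mboxrd of messages matching `query`.

    Hypothetical helper: it only assembles the "?q=...&x=m" form shown above.
    """
    # quote_plus turns " " into "+" and percent-encodes other reserved
    # characters; ":" is kept literal so prefixes like "d:" stay readable.
    escaped = urllib.parse.quote_plus(query, safe=":")
    return f"{inbox_url.rstrip('/')}/?q={escaped}&x=m"


def download_mbox(inbox_url: str, query: str, dest: str) -> None:
    """POST the search and write the gzipped mboxrd response to `dest`."""
    # data=b"" makes urllib issue a POST with an empty body,
    # roughly what `curl -XPOST` does without any -d option.
    request = urllib.request.Request(
        build_search_url(inbox_url, query), data=b"", method="POST"
    )
    with urllib.request.urlopen(request) as response, open(dest, "wb") as out:
        shutil.copyfileobj(response, out)
```

For example, download_mbox("https://example.org/inbox", "d:20190301..", "since-march.mbox.gz") would save the result under a name of your choosing, which "mutt -f" can then open as described above. Unlike "curl -OJ", this sketch ignores the Content-Disposition: header and always writes to the path it is given.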