Hi Joey,

On Sun, Dec 16, 2018 at 09:21:35AM -1000, Joey Pabalinas wrote:
> > > I spent a lot of time trying to find an LKML archive in Maildir format
> > > that I could use for local searches with notmuch or something, but all
> > > the links I was able to find were all dead.
> >
> > You might instead use
> >
> > https://www.kernel.org/lore.html
> > https://git.kernel.org/pub/scm/public-inbox/vger.kernel.org/git.git/
>
> That was my first attempt, but the documentation for the public-inbox
> format is sort of terrible, and after a few hours trying to convert it
> to Maildir I just gave up.
>
> I ended up just slowly scraping lkml.org for a couple weeks so I
> wouldn't disrupt anything and it worked fairly well. Just looking for
> advice on where to host this now so others might be able to use it.

Now you've caught my attention; first of all, there are more than 3M
messages stored in the lkml.org database, so I guess you've missed some
messages or something is really broken.

Besides, unless you figured out how to get to the raw data, you've just
scraped a rendering, which discards things like PGP signatures and has
very incomplete headers. Unless you don't care about those, of course :)

Note that I've also been toying with the lore dataset, and wrote a tiny
tool to get Maildir-like data out of it; this code is a bit of a
single-use jig, so you'll need to do some coding if you really want to
use it. Attached anyway.

All the best and enjoy,
Jasper
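
For reference, the general shape of a public-inbox to Maildir
conversion can be sketched roughly as below. This is only an
illustration, not the attached tool. It assumes a repository layout in
which every commit carries the raw message in a single blob named "m"
(my understanding of the v2 layout; older repositories use different
paths, so blob_name would need adjusting). The function names
iter_raw_messages and write_maildir are made up for this example.

```python
import mailbox
import subprocess
from typing import Iterable, Iterator


def iter_raw_messages(repo: str, blob_name: str = "m") -> Iterator[bytes]:
    """Yield raw messages from a public-inbox git repository, oldest first.

    Assumption: each commit stores one message in a blob called
    `blob_name`. Commits that lack that blob are silently skipped.
    """
    commits = subprocess.run(
        ["git", "-C", repo, "rev-list", "--reverse", "HEAD"],
        check=True, capture_output=True, text=True,
    ).stdout.split()
    for commit in commits:
        shown = subprocess.run(
            ["git", "-C", repo, "show", f"{commit}:{blob_name}"],
            capture_output=True,
        )
        if shown.returncode == 0:
            yield shown.stdout


def write_maildir(messages: Iterable[bytes], path: str) -> int:
    """Store each raw message in a Maildir at `path`; return the count."""
    box = mailbox.Maildir(path, create=True)
    count = 0
    for raw in messages:
        box.add(raw)  # bytes are written unchanged, so all headers survive
        count += 1
    box.close()
    return count
```

Something like write_maildir(iter_raw_messages("git.git"), "lkml-maildir")
would then produce a directory that notmuch can index. Spawning one git
process per message is slow for millions of messages; a real tool would
probably stream through "git cat-file --batch" instead.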