* [mlmmj] mlmmj patches from distributions: Gentoo
@ 2022-12-08 1:13 Robin H. Johnson
2022-12-08 8:01 ` Baptiste Daroussin
` (5 more replies)
0 siblings, 6 replies; 7+ messages in thread
From: Robin H. Johnson @ 2022-12-08 1:13 UTC (permalink / raw)
To: mlmmj
[-- Attachment #1.1: Type: text/plain, Size: 1071 bytes --]
It's great to see more active development again.
Gentoo Linux still runs mlmmj for all our lists, and we've got some
patches that would be great to land upstream.
mlmmj-1.3.0-gcc-10.patch - GCC 10 fix
mlmmj-1.2.19.0-listcontrol-customheaders.patch
Include the customheaders file content in list control messages.
mlmmj-1.3.0-logging.patch - this is a brand new patch, not tested yet.
When dropping messages from non-subscribers, we wanted a better trail
about it. Ideally we'd
1) log the message-id
2) give a per-message SMTP-time rejection (need postfix filter stuff)
There's one scaling discussion we need to have, but the handling needs
to include how to incrementally get there:
How to have millions of mails in /archive/!
Our most active list is now approaching 1.5M emails in that directory,
and it worries me.
--
Robin Hugh Johnson
Gentoo Linux: Dev, Infra Lead, Foundation Treasurer
E-Mail : robbat2@gentoo.org
GnuPG FP : 11ACBA4F 4778E3F6 E4EDF38E B27B944E 34884E85
GnuPG FP : 7D0B3CEB E9B85B1F 825BCECF EE05E6F6 A48F6136
[-- Attachment #1.2: mlmmj-1.3.0-logging.patch --]
[-- Type: text/plain, Size: 2632 bytes --]
On a high-mail system, it's hard to link errors back to specific mails.
Log the list address, poster address and envelope to aid that.
Better work here would be capturing the message-id and logging that, but it's
not presently captured, so that is a more invasive change.
Signed-off-by: Robin H. Johnson <robbat2@gentoo.org>
--- mlmmj-1.3.0.orig/src/mlmmj-process.c 2022-11-24 16:09:30.839848253 -0800
+++ mlmmj-1.3.0/src/mlmmj-process.c 2022-11-24 16:52:32.311365699 -0800
@@ -193,7 +193,8 @@ static void newmoderated(const char *lis
if (notifymod) {
childpid = fork();
if(childpid < 0)
- log_error(LOG_ARGS, "Could not fork; poster not notified");
+ log_error(LOG_ARGS, "Could not fork; poster not notified"
+ "; list=%s poster=%s envelope=%s", listaddr, posteraddr, efromsender);
} else
childpid = -1;
@@ -919,8 +920,10 @@ int main(int argc, char **argv)
log_error(LOG_ARGS, "Discarding %s because list"
" address was not in To: or Cc:,"
" and From: was the list or"
- " notoccdenymails was set",
- mailfile);
+ " notoccdenymails was set"
+ "; list=%s poster=%s envelope=%s",
+ mailfile,
+ listaddr, posteraddr, efrom);
myfree(listaddr);
unlink(donemailname);
myfree(donemailname);
@@ -971,8 +974,10 @@ int main(int argc, char **argv)
" it was denied by an access"
" rule, and From: was the list"
" address or noaccessdenymails"
- " was set",
- mailfile);
+ " was set"
+ "; list=%s poster=%s envelope=%s",
+ mailfile,
+ listaddr, posteraddr, efrom);
myfree(listaddr);
unlink(donemailname);
myfree(donemailname);
@@ -1046,8 +1051,10 @@ int main(int argc, char **argv)
if (strcasecmp(listaddr, posteraddr) == 0) {
log_error(LOG_ARGS, "Discarding %s because"
" there are sender restrictions but"
- " From: was the list address",
- mailfile);
+ " From: was the list address"
+ "; list=%s poster=%s envelope=%s",
+ mailfile,
+ listaddr, posteraddr, efrom);
myfree(listaddr);
unlink(donemailname);
myfree(donemailname);
@@ -1072,8 +1079,10 @@ int main(int argc, char **argv)
(modonlypost &&
statctrl(listdir, "nomodonlydenymails"))) {
log_error(LOG_ARGS, "Discarding %s because"
- " no{sub|mod}onlydenymails was set",
- mailfile);
+ " no{sub|mod}onlydenymails was set"
+ "; list=%s poster=%s envelope=%s",
+ mailfile,
+ listaddr, posteraddr, efrom);
myfree(listaddr);
unlink(donemailname);
myfree(donemailname);
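The patch description above notes that the message-id is not captured yet. A minimal sketch of how it could be pulled out of a header block for logging follows. `find_message_id` is a hypothetical helper, not an existing mlmmj function, and it assumes the headers are already in a NUL-terminated buffer with no folded (continuation) lines.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <strings.h>

/* Hypothetical helper, not part of mlmmj: copy the value of the first
 * Message-ID header found in `headers` into `out`.
 * Returns 0 on success, -1 if the header is absent or `out` is too small.
 * Assumes unfolded headers, one per line, ending at the first empty line. */
static int find_message_id(const char *headers, char *out, size_t outlen)
{
	static const char name[] = "Message-ID:";
	const size_t namelen = sizeof(name) - 1;
	const char *line = headers;

	assert(headers != NULL && out != NULL);

	/* Stop at end of input or at the empty line that ends the headers. */
	while (*line != '\0' && *line != '\n' && *line != '\r') {
		const char *eol = strchr(line, '\n');
		size_t linelen = eol ? (size_t)(eol - line) : strlen(line);

		/* Header names are case-insensitive. */
		if (linelen >= namelen && strncasecmp(line, name, namelen) == 0) {
			const char *val = line + namelen;
			size_t vallen = linelen - namelen;

			/* Trim leading blanks, then trailing CR and blanks. */
			while (vallen > 0 && (*val == ' ' || *val == '\t')) {
				val++;
				vallen--;
			}
			while (vallen > 0 && (val[vallen - 1] == '\r' ||
			    val[vallen - 1] == ' ' || val[vallen - 1] == '\t'))
				vallen--;

			if (vallen + 1 > outlen)
				return -1;
			memcpy(out, val, vallen);
			out[vallen] = '\0';
			return 0;
		}
		if (eol == NULL)
			break;
		line = eol + 1;
	}
	return -1;
}
```

The result could then be appended to the log lines next to `list=`, `poster=` and `envelope=`; where the header buffer would come from inside mlmmj-process is left open here.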
[-- Attachment #1.3: mlmmj-1.3.0-gcc-10.patch --]
[-- Type: text/plain, Size: 656 bytes --]
--- a/include/mlmmj.h
+++ b/include/mlmmj.h
@@ -81,7 +81,7 @@ enum subtype {
SUB_NONE /* For when an address is not subscribed at all */
};
-char *subtype_strs[7]; /* count matches enum above; defined in subscriberfuncs.c */
+extern char *subtype_strs[7]; /* count matches enum above; defined in subscriberfuncs.c */
enum subreason {
SUB_REQUEST,
@@ -92,7 +92,7 @@ enum subreason {
SUB_SWITCH
};
-char * subreason_strs[6]; /* count matches enum above; defined in subscriberfuncs.c */
+extern char * subreason_strs[6]; /* count matches enum above; defined in subscriberfuncs.c */
void print_version(const char *prg);
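For context on why `extern` is needed: GCC 10 switched its default to `-fno-common`, so an array defined without `extern` in a header that several .c files include becomes a multiple-definition error at link time. The pattern can be sketched in a single file as below; the enum and strings are hypothetical stand-ins, not the mlmmj ones.

```c
#include <assert.h>
#include <string.h>

/* --- what would live in the header, included by many .c files --- */
enum color { COLOR_RED, COLOR_GREEN, COLOR_BLUE };	/* hypothetical enum */

/* Declaration only: `extern` means no storage is allocated here. */
extern const char *color_strs[3];

/* --- what would live in exactly one .c file --- */
const char *color_strs[3] = { "red", "green", "blue" };

/* Look up the printable name for an enum value. */
static const char *color_name(enum color c)
{
	assert((int)c >= 0 && (int)c < 3);
	return color_strs[c];
}
```

With the declaration and the single definition separated like this, the result no longer depends on whether the compiler defaults to `-fcommon` or `-fno-common`.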
[-- Attachment #1.4: mlmmj-1.2.19.0-listcontrol-customheaders.patch --]
[-- Type: text/plain, Size: 1231 bytes --]
List control emails do not include customheaders, and can lead to RBL issues
for forged senders.
Signed-off-by: Robin H. Johnson <robbat2@gentoo.org>
diff -Nuar --exclude '*~' mlmmj-1.2.19.0.orig/src/mlmmj-process.c mlmmj-1.2.19.0/src/mlmmj-process.c
--- mlmmj-1.2.19.0.orig/src/mlmmj-process.c 2014-03-23 17:57:24.000000000 -0700
+++ mlmmj-1.2.19.0/src/mlmmj-process.c 2016-05-04 13:50:26.034174788 -0700
@@ -702,8 +702,19 @@
"output mail file");
exit(EXIT_FAILURE);
}
- if(do_all_the_voodoo_here(rawmailfd, donemailfd, -1,
- -1, delheaders,
+ /* hdrfd is checked in do_all_the_voodoo_here(), because the
+ * customheaders file might not exist */
+ headerfilename = concatstr(2, listdir, "/control/customheaders");
+ hdrfd = open(headerfilename, O_RDONLY);
+ myfree(headerfilename);
+
+ /* footfd is checked in do_all_the_voodoo_here(), see above */
+ footerfilename = concatstr(2, listdir, "/control/footer");
+ footfd = open(footerfilename, O_RDONLY);
+ myfree(footerfilename);
+
+ if(do_all_the_voodoo_here(rawmailfd, donemailfd, hdrfd,
+ footfd, delheaders,
NULL, &allheaders, NULL) < 0) {
log_error(LOG_ARGS, "do_all_the_voodoo_here");
exit(EXIT_FAILURE);
* Re: [mlmmj] mlmmj patches from distributions: Gentoo
2022-12-08 1:13 [mlmmj] mlmmj patches from distributions: Gentoo Robin H. Johnson
@ 2022-12-08 8:01 ` Baptiste Daroussin
2022-12-08 12:49 ` Franky Van Liedekerke
` (4 subsequent siblings)
5 siblings, 0 replies; 7+ messages in thread
From: Baptiste Daroussin @ 2022-12-08 8:01 UTC (permalink / raw)
To: mlmmj
On Thu, Dec 08, 2022 at 01:13:26AM +0000, Robin H. Johnson wrote:
> It's great to see more active development again.
> Gentoo Linux still runs mlmmj for all our lists, and we've got some
> patches that would be great to land upstream.
>
> mlmmj-1.3.0-gcc-10.patch - GCC 10 fix
This is already in.
>
> mlmmj-1.2.19.0-listcontrol-customheaders.patch
> Include the customheaders file content in list control messages.
I will look at it and see about incorporating it.
>
> mlmmj-1.3.0-logging.patch - this is a brand new patch, not tested yet.
> When dropping messages from non-subscribers, we wanted a better trail
> about it. Ideally we'd
> 1) log the message-id
> 2) give a per-message SMTP-time rejection (need postfix filter stuff)
I will look into it as well.
>
> There's one scaling discussion we need to have, but the handling needs
> to include how to incrementally get there:
>
> How to have millions of mails in /archive/!
>
> Our most active list is now approaching 1.5M emails in that directory,
> and it worries me.
Why is it worrying you? A directory can hold way more than that.
On FreeBSD, one thing we do when creating the public archives via
https://fossil.nours.eu/mlmmj-archiver/doc/trunk/README.md (this may move to
Codeberg) is append the archives to an mbox, so I can clean up whatever is in the
archive directory if I need to (I have not needed to so far).
DISCLAIMER: mlmmj-archiver is not portable at all yet, and is not even in alpha
state :D (I am happily accepting portability patches, but this is not my focus
yet).
Best regards,
Bapt
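As a generic illustration of the mbox approach mentioned above (not the actual mlmmj-archiver code, which is not shown in this thread), appending one archived message to an mbox could look roughly like this. `mbox_append` is a hypothetical helper, and the escaping is the simple "mboxo" convention of prefixing body lines that start with "From " with `>`.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: append one message read from `in` to the mbox `out`.
 * `sender` and `date` go into the "From " separator line; `date` is expected
 * in asctime() style without the trailing newline.
 * Returns 0 on success, -1 on write error. */
static int mbox_append(FILE *out, FILE *in, const char *sender, const char *date)
{
	char buf[1024];
	int at_line_start = 1;

	assert(out != NULL && in != NULL && sender != NULL && date != NULL);

	if (fprintf(out, "From %s %s\n", sender, date) < 0)
		return -1;

	while (fgets(buf, sizeof(buf), in) != NULL) {
		size_t len = strlen(buf);

		/* Escape only real line starts, not the middle of a long line. */
		if (at_line_start && strncmp(buf, "From ", 5) == 0 &&
		    fputc('>', out) == EOF)
			return -1;
		if (fputs(buf, out) == EOF)
			return -1;
		at_line_start = (len > 0 && buf[len - 1] == '\n');
	}
	/* Make sure the message ends with a newline. */
	if (!at_line_start && fputc('\n', out) == EOF)
		return -1;
	/* A blank line separates this message from the next one. */
	return fputc('\n', out) == EOF ? -1 : 0;
}
```

Once a message is in the mbox, the per-message file is only still needed for +get-NNNN, which is the trade-off discussed in the replies.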
* Re: [mlmmj] mlmmj patches from distributions: Gentoo
2022-12-08 1:13 [mlmmj] mlmmj patches from distributions: Gentoo Robin H. Johnson
2022-12-08 8:01 ` Baptiste Daroussin
@ 2022-12-08 12:49 ` Franky Van Liedekerke
2022-12-08 23:18 ` Robin H. Johnson
` (3 subsequent siblings)
5 siblings, 0 replies; 7+ messages in thread
From: Franky Van Liedekerke @ 2022-12-08 12:49 UTC (permalink / raw)
To: mlmmj
[-- Attachment #1: Type: text/plain, Size: 1314 bytes --]
On Thu, 2022-12-08 at 09:01 +0100, Baptiste Daroussin wrote:
> On Thu, Dec 08, 2022 at 01:13:26AM +0000, Robin H. Johnson wrote:
> > There's one scaling discussion we need to have, but the handling
> > needs
> > to include how to incrementally get there:
> >
> > How to have millions of mails in /archive/!
> >
> > Our most active list is now approaching 1.5M emails in that
> > directory,
> > and it worries me.
>
> Why is it worrying you? a directory can hold way more than that.
>
> On FreeBSD, one thing we are doing is when creating the public
> archives via:
> https://fossil.nours.eu/mlmmj-archiver/doc/trunk/README.md (this may
> move to
> codeberg), I append the archives to a mbox, so I can cleanup what
> ever is in the
> archives directory if I need (I did not need up to now).
>
> DISCLAIMER: mlmmj-archiver is not portable at all yet, and is not
> even in alpha
> state :D (I am happily accepting portability patches, but this is not
> my focus
> yet).
I can share some stuff I did based on mhonarc to create a public
archive (also based on mlmmj-webarchiver in fact), but in the end:
- mlmmj archives are already in subdirectories archive/1 ... archive/15
- the mhonarc archive (public) is in per-month subdirectories, so I'm
totally not worried
Franky
* Re: [mlmmj] mlmmj patches from distributions: Gentoo
2022-12-08 1:13 [mlmmj] mlmmj patches from distributions: Gentoo Robin H. Johnson
2022-12-08 8:01 ` Baptiste Daroussin
2022-12-08 12:49 ` Franky Van Liedekerke
@ 2022-12-08 23:18 ` Robin H. Johnson
2022-12-09 5:25 ` Baptiste Daroussin
` (2 subsequent siblings)
5 siblings, 0 replies; 7+ messages in thread
From: Robin H. Johnson @ 2022-12-08 23:18 UTC (permalink / raw)
To: mlmmj
On Thu, Dec 08, 2022 at 09:01:29AM +0100, Baptiste Daroussin wrote:
> > There's one scaling discussion we need to have, but the handling needs
> > to include how to incrementally get there:
> >
> > How to have millions of mails in /archive/!
> >
> > Our most active list is now approaching 1.5M emails in that directory,
> > and it worries me.
>
> Why is it worrying you? a directory can hold way more than that.
>
> On FreeBSD, one thing we are doing is when creating the public archives via:
> https://fossil.nours.eu/mlmmj-archiver/doc/trunk/README.md (this may move to
> codeberg), I append the archives to a mbox, so I can cleanup what ever is in the
> archives directory if I need (I did not need up to now).
Cleaning up the archives directory will break the +get-NNNN
functionality.
Having that many files makes the directory take extremely long to scan during backups.
I think it should be partitioned more, but any +get-NNNN code will need
to support both partitioned & non-partitioned.
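A minimal sketch of what "support both" could mean for the lookup: try the existing flat path first, then a partitioned one. Everything here is an assumption for illustration: the function names, the partition size of 100000, and the `<k>.d` directory naming (chosen so a partition directory can never collide with a message file named by its number).

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

#define ARCHIVE_PART_SIZE 100000UL	/* assumed partition size */

/* Return 1 if `path` exists and is a regular file, else 0. */
static int is_regular_file(const char *path)
{
	struct stat st;

	return stat(path, &st) == 0 && S_ISREG(st.st_mode);
}

/* Hypothetical layout: <listdir>/archive/<index / ARCHIVE_PART_SIZE>.d/<index>.
 * Returns 0 on success, -1 if the path does not fit in `out`. */
static int partitioned_path(char *out, size_t outlen, const char *listdir,
    unsigned long index)
{
	int n = snprintf(out, outlen, "%s/archive/%lu.d/%lu", listdir,
	    index / ARCHIVE_PART_SIZE, index);

	return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}

/* Resolve message `index` for a +get request: prefer the flat layout
 * (<listdir>/archive/<index>), then fall back to the partitioned one.
 * Returns 0 and fills `out` if a regular file exists, -1 otherwise. */
static int find_archived_mail(char *out, size_t outlen, const char *listdir,
    unsigned long index)
{
	int n;

	assert(out != NULL && listdir != NULL);

	n = snprintf(out, outlen, "%s/archive/%lu", listdir, index);
	if (n >= 0 && (size_t)n < outlen && is_regular_file(out))
		return 0;
	if (partitioned_path(out, outlen, listdir, index) == 0 &&
	    is_regular_file(out))
		return 0;
	return -1;
}
```

Checking both locations on every request costs one extra `stat()` at most, and it means lists can be migrated incrementally instead of all at once.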
mbox support for +get would be nice, but that's much more work than just
partitioning the directory.
We do have our own archives site:
https://archives.gentoo.org/
The older versions used to use mhonarc, but also ran into scaling
concerns, so we built something more custom (earlier versions used
ElasticSearch as a DB; I don't recall what the current version uses
underneath, but we can re-ingest everything if we need to).
--
Robin Hugh Johnson
Gentoo Linux: Dev, Infra Lead, Foundation Treasurer
E-Mail : robbat2@gentoo.org
GnuPG FP : 11ACBA4F 4778E3F6 E4EDF38E B27B944E 34884E85
GnuPG FP : 7D0B3CEB E9B85B1F 825BCECF EE05E6F6 A48F6136
* Re: [mlmmj] mlmmj patches from distributions: Gentoo
2022-12-08 1:13 [mlmmj] mlmmj patches from distributions: Gentoo Robin H. Johnson
` (2 preceding siblings ...)
2022-12-08 23:18 ` Robin H. Johnson
@ 2022-12-09 5:25 ` Baptiste Daroussin
2022-12-09 5:36 ` Robin H. Johnson
2022-12-09 5:56 ` Baptiste Daroussin
5 siblings, 0 replies; 7+ messages in thread
From: Baptiste Daroussin @ 2022-12-09 5:25 UTC (permalink / raw)
To: mlmmj
On 9 December 2022 00:18:46 GMT+01:00, "Robin H. Johnson" <robbat2@gentoo.org> wrote:
>On Thu, Dec 08, 2022 at 09:01:29AM +0100, Baptiste Daroussin wrote:
>> > There's one scaling discussion we need to have, but the handling needs
>> > to include how to incrementally get there:
>> >
>> > How to have millions of mails in /archive/!
>> >
>> > Our most active list is now approaching 1.5M emails in that directory,
>> > and it worries me.
>>
>> Why is it worrying you? a directory can hold way more than that.
>>
>> On FreeBSD, one thing we are doing is when creating the public archives via:
>> https://fossil.nours.eu/mlmmj-archiver/doc/trunk/README.md (this may move to
>> codeberg), I append the archives to a mbox, so I can cleanup what ever is in the
>> archives directory if I need (I did not need up to now).
>Cleaning up the archives directory will break the +get-NNNN
>functionality.
>
>It makes the directory take extremely long to scan during backups.
>I think it should be partitioned more, but any +get-NNNN code will need
>to support both partitioned & non-partitioned.
>
Good point and easy to implement, I'll look into it.
>mbox support for +get would be nice, but that's much more work than just
>partitioning the directory.
>
>We do have our own archives site:
>https://archives.gentoo.org/
>
>The older versions used to use mhonarc, but also ran into scaling
>concerns, so we built something more custom (earlier versions used
>ElasticSearch as a DB; I don't recall what the current version uses
>underneath, but we can re-ingest everything if we need to).
>
If this is public I am interested in it; I wrote mlmmj-archiver because of scaling issues as well.
Best regards,
Bapt
* Re: [mlmmj] mlmmj patches from distributions: Gentoo
2022-12-08 1:13 [mlmmj] mlmmj patches from distributions: Gentoo Robin H. Johnson
` (3 preceding siblings ...)
2022-12-09 5:25 ` Baptiste Daroussin
@ 2022-12-09 5:36 ` Robin H. Johnson
2022-12-09 5:56 ` Baptiste Daroussin
5 siblings, 0 replies; 7+ messages in thread
From: Robin H. Johnson @ 2022-12-09 5:36 UTC (permalink / raw)
To: mlmmj
On Fri, Dec 09, 2022 at 06:25:58AM +0100, Baptiste Daroussin wrote:
> >Cleaning up the archives directory will break the +get-NNNN
> >functionality.
> >
> >It makes the directory take extremely long to scan during backups.
> >I think it should be partitioned more, but any +get-NNNN code will need
> >to support both partitioned & non-partitioned.
> Good point and easy to implement, I'll look into it.
Great to hear. Please do consider how to migrate existing /archive/
directories into it.
> >mbox support for +get would be nice, but that's much more work than just
> >partitioning the directory.
> >
> >We do have our own archives site:
> >https://archives.gentoo.org/
> >
> >The older versions used to use mhonarc, but also ran into scaling
> >concerns, so we built something more custom (earlier versions used
> >ElasticSearch as a DB; I don't recall what the current version uses
> >underneath, but we can re-ingest everything if we need to).
> If this is public I am interested in it; I wrote mlmmj-archiver because of
> scaling issues as well.
Yes, it's public (and still running ES as it turns out):
https://gitweb.gentoo.org/sites/archives/frontend.git/
https://gitweb.gentoo.org/sites/archives/backend.git/
Some crappy instructions here:
https://wiki.gentoo.org/wiki/Project:Infrastructure/Service_Catalog/Archives
This is the 4th or 5th rewrite of that service in the last 20 years.
--
Robin Hugh Johnson
Gentoo Linux: Dev, Infra Lead, Foundation Treasurer
E-Mail : robbat2@gentoo.org
GnuPG FP : 11ACBA4F 4778E3F6 E4EDF38E B27B944E 34884E85
GnuPG FP : 7D0B3CEB E9B85B1F 825BCECF EE05E6F6 A48F6136
* Re: [mlmmj] mlmmj patches from distributions: Gentoo
2022-12-08 1:13 [mlmmj] mlmmj patches from distributions: Gentoo Robin H. Johnson
` (4 preceding siblings ...)
2022-12-09 5:36 ` Robin H. Johnson
@ 2022-12-09 5:56 ` Baptiste Daroussin
5 siblings, 0 replies; 7+ messages in thread
From: Baptiste Daroussin @ 2022-12-09 5:56 UTC (permalink / raw)
To: mlmmj
On 9 December 2022 06:36:54 GMT+01:00, "Robin H. Johnson" <robbat2@gentoo.org> wrote:
>On Fri, Dec 09, 2022 at 06:25:58AM +0100, Baptiste Daroussin wrote:
>> >Cleaning up the archives directory will break the +get-NNNN
>> >functionality.
>> >
>> >It makes the directory take extremely long to scan during backups.
>> >I think it should be partitioned more, but any +get-NNNN code will need
>> >to support both partitioned & non-partitioned.
>> Good point and easy to implement, I'll look into it.
>Great to hear. Please do consider how to migrate existing /archive/
>directories into it.
>
My current plan is pretty simple: make 100k partitions via mlmmj-maintd, which scans the archive and moves mails around, so the rest of the code remains the same.
+get will look in two places: the root of the archive and the partitioned directory.
This should (famous last words) be easy and not really intrusive.
Best regards
Bapt
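The plan above could be sketched roughly as the following maintenance pass. This is an illustration, not the mlmmj-maintd implementation: `partition_archive` and the `<k>.d` directory naming are hypothetical, and "100k" is read here as 100000 mails per partition.

```c
#include <assert.h>
#include <ctype.h>
#include <dirent.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>

#define ARCHIVE_PART_SIZE 100000UL	/* assumed from the "100k" in the plan */

/* Return 1 if `s` is a non-empty string of decimal digits. */
static int all_digits(const char *s)
{
	if (*s == '\0')
		return 0;
	for (; *s != '\0'; s++)
		if (!isdigit((unsigned char)*s))
			return 0;
	return 1;
}

/* Hypothetical maintenance pass: move every flat <archivedir>/<N> into
 * <archivedir>/<N / ARCHIVE_PART_SIZE>.d/<N>. Only the top level is
 * scanned, so already partitioned mails are left alone and the pass can
 * be re-run. Returns the number of files moved, or -1 on error. */
static long partition_archive(const char *archivedir)
{
	char from[4096], dir[4096], to[4096];
	struct dirent *de;
	struct stat st;
	long moved = 0;
	DIR *d;

	assert(archivedir != NULL);
	if ((d = opendir(archivedir)) == NULL)
		return -1;
	while ((de = readdir(d)) != NULL) {
		unsigned long idx;
		int n1, n2, n3;

		if (!all_digits(de->d_name))
			continue;	/* skips ".", ".." and the "<k>.d" dirs */
		idx = strtoul(de->d_name, NULL, 10);
		n1 = snprintf(from, sizeof(from), "%s/%s", archivedir, de->d_name);
		n2 = snprintf(dir, sizeof(dir), "%s/%lu.d", archivedir,
		    idx / ARCHIVE_PART_SIZE);
		n3 = snprintf(to, sizeof(to), "%s/%lu.d/%s", archivedir,
		    idx / ARCHIVE_PART_SIZE, de->d_name);
		if (n1 < 0 || n2 < 0 || n3 < 0 || (size_t)n3 >= sizeof(to))
			continue;	/* path too long, leave the file in place */
		/* Entries already moved or not regular files are skipped. */
		if (stat(from, &st) != 0 || !S_ISREG(st.st_mode))
			continue;
		if ((mkdir(dir, 0755) != 0 && errno != EEXIST) ||
		    rename(from, to) != 0) {
			closedir(d);
			return -1;
		}
		moved++;
	}
	closedir(d);
	return moved;
}
```

Because `rename()` within one filesystem is atomic, a +get that checks both the root and the partitioned directory would find each mail in one place or the other while the pass is running.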