All of lore.kernel.org
* Toy/demo: using ChatGPT to summarize lengthy LKML threads (b4 integration)
@ 2024-02-27 22:32 Konstantin Ryabitsev
  2024-02-27 23:35 ` Junio C Hamano
                   ` (4 more replies)
  0 siblings, 5 replies; 23+ messages in thread
From: Konstantin Ryabitsev @ 2024-02-27 22:32 UTC (permalink / raw)
  To: users, tools, workflows

Hi, all:

I was playing with shell-gpt and wrote a quickie integration that would allow
retrieving (slimmed-down) threads from lore, feeding them to ChatGPT, and
asking it to provide some basic analysis of the thread contents. Here's a
recorded demo session:

https://asciinema.org/a/643435

A few notes:

1. This is obviously not a replacement for actually reading email, but can
   potentially be a useful asset for a busy maintainer who just wants a quick
   summary of a lengthy thread before they look at it in detail.
2. This is not free or cheap! To digest a lengthy thread, you can expect
   ChatGPT to generate enough tokens to cost you $1 or more in API usage fees.
   I know it's nothing compared to how expensive some of y'all's time is, and
   you can probably easily get that expensed by your employers, but for many
   others it's a pretty expensive toy. I managed to make it a bit cheaper by
   doing some surgery on the threads before feeding them to chatgpt (like
   removing most of the message headers and throwing out some of the quoted
   content), but there's a limit to how much we can throw out before the
   analysis becomes dramatically less useful.
3. This only works with ChatGPT-4, as most threads are too long for
   ChatGPT-3.5 to even process.

So, the question is -- is this useful at all? Am I wasting time poking in this
direction, or is this something that would be of benefit to any of you? If the
latter, I will document how to set this up and commit the thread minimization
code I hacked together to make it cheaper.
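
For illustration, a minimal Python sketch of the kind of thread slimming
described in note 2 above might look like this (not the actual
b4/shell-gpt integration; the helper names, the t.mbox.gz URL pattern
and the crude "drop deeply quoted lines" rule are assumptions made for
the example):

  import gzip, mailbox, tempfile, urllib.request

  KEEP_HEADERS = ("From", "Date", "Subject")

  def fetch_thread(msgid):
      # lore serves a whole thread as a gzipped mbox at .../<msgid>/t.mbox.gz
      url = f"https://lore.kernel.org/all/{msgid}/t.mbox.gz"
      raw = gzip.decompress(urllib.request.urlopen(url).read())
      tmp = tempfile.NamedTemporaryFile(suffix=".mbox", delete=False)
      tmp.write(raw)
      tmp.close()
      return mailbox.mbox(tmp.name)

  def slim(msg):
      # keep a handful of headers, drop 2nd-level (and deeper) quotes
      head = "\n".join(f"{h}: {msg.get(h, '')}" for h in KEEP_HEADERS)
      raw = msg.get_payload(decode=True) or b""   # multipart handling omitted
      body = raw.decode(errors="replace")
      kept = [l for l in body.splitlines()
              if not l.lstrip().startswith((">>", "> >"))]
      return head + "\n\n" + "\n".join(kept)

  def slim_thread(msgid):
      return "\n\n---\n\n".join(slim(m) for m in fetch_thread(msgid))

The slimmed text is then what gets sent to the model as a single prompt.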

Best regards,
-K

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Toy/demo: using ChatGPT to summarize lengthy LKML threads (b4 integration)
  2024-02-27 22:32 Toy/demo: using ChatGPT to summarize lengthy LKML threads (b4 integration) Konstantin Ryabitsev
@ 2024-02-27 23:35 ` Junio C Hamano
  2024-02-28  0:43 ` Linus Torvalds
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 23+ messages in thread
From: Junio C Hamano @ 2024-02-27 23:35 UTC (permalink / raw)
  To: Konstantin Ryabitsev; +Cc: users, tools, workflows

As somebody who (1) did not read the actual thread and only read the
text in the demo, (2) would probably not understand the details being
discussed even if he had read the actual thread, and hence (3) would not
know if ChatGPT in the demo were merely hallucinating, I found it a very
nice demonstration. Of course, you are interested in knowing whether
people find it "useful", on which I unfortunately cannot offer much
useful input.

On Tue, Feb 27, 2024 at 2:32 PM Konstantin Ryabitsev
<konstantin@linuxfoundation.org> wrote:
>
> Hi, all:
>
> I was playing with shell-gpt and wrote a quickie integration that would allow
> retrieving (slimmed-down) threads from lore, feeding them to ChatGPT, and
> asking it to provide some basic analysis of the thread contents. Here's a
> recorded demo session:
>
> https://asciinema.org/a/643435
>
> A few notes:
>
> 1. This is obviously not a replacement for actually reading email, but can
>    potentially be a useful asset for a busy maintainer who just wants a quick
>    summary of a lengthy thread before they look at it in detail.
> 2. This is not free or cheap! To digest a lengthy thread, you can expect
>    ChatGPT to generate enough tokens to cost you $1 or more in API usage fees.
>    I know it's nothing compared to how expensive some of y'all's time is, and
>    you can probably easily get that expensed by your employers, but for many
>    others it's a pretty expensive toy. I managed to make it a bit cheaper by
>    doing some surgery on the threads before feeding them to chatgpt (like
>    removing most of the message headers and throwing out some of the quoted
>    content), but there's a limit to how much we can throw out before the
>    analysis becomes dramatically less useful.
> 3. This only works with ChatGPT-4, as most threads are too long for
>    ChatGPT-3.5 to even process.
>
> So, the question is -- is this useful at all? Am I wasting time poking in this
> direction, or is this something that would be of benefit to any of you? If the
> latter, I will document how to set this up and commit the thread minimization
> code I hacked together to make it cheaper.
>
> Best regards,
> -K
>

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Toy/demo: using ChatGPT to summarize lengthy LKML threads (b4 integration)
  2024-02-27 22:32 Toy/demo: using ChatGPT to summarize lengthy LKML threads (b4 integration) Konstantin Ryabitsev
  2024-02-27 23:35 ` Junio C Hamano
@ 2024-02-28  0:43 ` Linus Torvalds
  2024-02-28 20:46   ` Shuah Khan
  2024-02-29  0:33   ` James Bottomley
  2024-02-28  5:00 ` Willy Tarreau
                   ` (2 subsequent siblings)
  4 siblings, 2 replies; 23+ messages in thread
From: Linus Torvalds @ 2024-02-28  0:43 UTC (permalink / raw)
  To: Konstantin Ryabitsev; +Cc: users, tools, workflows

On Tue, 27 Feb 2024 at 14:32, Konstantin Ryabitsev
<konstantin@linuxfoundation.org> wrote:
>
> So, the question is -- is this useful at all?

I find the demo intriguing. Useful? Who knows, but the potential is
certainly there.

The summary didn't seem to take Rafael's last (fairly strongly
negative) email into account as much as maybe it should have (my
first reaction was that maybe Rafael's last email hadn't made it into
the summary, but going back to it I think it was there, just not
nearly as forcefully as Rafael's email read to me).

I personally would find it more intriguing still if it actually
reacted to the code instead of just the discussion, but I suspect
that's still some way away.

IOW - interesting. But hard to tell how useful and how maintainers
would actually end up using it.

Would it be something where when you bring a new developer into a long
discussion, you'd introduce him with a summary of what has gone
before?  I'm not sure what the use cases would be in practice.

           Linus

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Toy/demo: using ChatGPT to summarize lengthy LKML threads (b4 integration)
  2024-02-27 22:32 Toy/demo: using ChatGPT to summarize lengthy LKML threads (b4 integration) Konstantin Ryabitsev
  2024-02-27 23:35 ` Junio C Hamano
  2024-02-28  0:43 ` Linus Torvalds
@ 2024-02-28  5:00 ` Willy Tarreau
  2024-02-28 14:03   ` Mark Brown
  2024-02-28 15:04   ` Hannes Reinecke
  2024-02-28 18:55 ` Bart Van Assche
  2024-02-28 19:32 ` Luis Chamberlain
  4 siblings, 2 replies; 23+ messages in thread
From: Willy Tarreau @ 2024-02-28  5:00 UTC (permalink / raw)
  To: Konstantin Ryabitsev; +Cc: users, tools, workflows

Hi Konstantin,

On Tue, Feb 27, 2024 at 05:32:34PM -0500, Konstantin Ryabitsev wrote:
> Hi, all:
> 
> I was playing with shell-gpt and wrote a quickie integration that would allow
> retrieving (slimmed-down) threads from lore, feeding them to ChatGPT, and
> asking it to provide some basic analysis of the thread contents. Here's a
> recorded demo session:
> 
> https://asciinema.org/a/643435
> 
> A few notes:
> 
> 1. This is obviously not a replacement for actually reading email, but can
>    potentially be a useful asset for a busy maintainer who just wants a quick
>    summary of a lengthy thread before they look at it in detail.
> 2. This is not free or cheap! To digest a lengthy thread, you can expect
>    ChatGPT to generate enough tokens to cost you $1 or more in API usage fees.
>    I know it's nothing compared to how expensive some of y'all's time is, and
>    you can probably easily get that expensed by your employers, but for many
>    others it's a pretty expensive toy. I managed to make it a bit cheaper by
>    doing some surgery on the threads before feeding them to chatgpt (like
>    removing most of the message headers and throwing out some of the quoted
>    content), but there's a limit to how much we can throw out before the
>    analysis becomes dramatically less useful.
> 3. This only works with ChatGPT-4, as most threads are too long for
>    ChatGPT-3.5 to even process.
> 
> So, the question is -- is this useful at all? Am I wasting time poking in this
> direction, or is this something that would be of benefit to any of you? If the
> latter, I will document how to set this up and commit the thread minimization
> code I hacked together to make it cheaper.

Amusingly, I've run experiments on something comparable with my own
e-mails (I'd like to get a few-line summary before reading them), and
thought about being able to summarize long LKML threads so as to still
know what is currently going on without having to spend a lot of time
on all of them.

I found a number of shortcomings with this: I suspect that those
most interested in such output are either, a bit like me, not very
active in kernel development, or focused on a specific area and mostly
wanting to stay aware of ongoing changes in other areas they're really
not familiar with.

And because of this I couldn't figure out on what boundaries to cut the
analysis. If it's "since the last time I read my email", it can only be
done locally and will be per-user. If it's a summary of a finished
thread, it's not super interesting and it's better explained (IMHO) on
LWN, where the hot topics are summarized and developed. If it's the list
of threads of the day, I suspect that there are so many that it's
unlikely I'd read all of them every evening or every morning. I've been
wondering if an interesting approach would be to only summarize long
threads, since most short ones are a patch, a review and an ACK and do
not need to be summarized, but I think that most of us, seeing a subject
repeat over many e-mails, will just look at a few exchanges there to get
an idea of what's going on.

Ideally, having a link in each thread to a place where a summary is
kept could be nice, except that it's not how such tools work.
You certainly don't want to re-run the analysis on a whole thread
every time it grows by a few messages, due to processing time and
cost.

Also, regarding processing costs, I've had extremely good results using
the Mixtral-8x7B LLM in instruct mode running locally. It has a 32k context
like GPT-4. And if that's not enough, given that most of a long thread's
contents is in fact quoted text, it could be sufficient to drop multiple
indent levels to preserve a response and its context while dropping most
of the repetition (it cuts your example thread roughly in half). But this
still takes quite a bit of processing time: processing the 14 mails from
the thread above took 13 minutes on an 80-core Ampere Altra system (no
GPU involved here). That's roughly 1 minute per e-mail, which is a lot
per day, not counting the time needed to tune the prompt to get the best
results!

Overall, while I think that some people might find "something like this"
useful, most of them would want it "slightly different" to be useful to
them.

Just my two cents,
Willy

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Toy/demo: using ChatGPT to summarize lengthy LKML threads (b4 integration)
  2024-02-28  5:00 ` Willy Tarreau
@ 2024-02-28 14:03   ` Mark Brown
  2024-02-28 14:39     ` Willy Tarreau
  2024-02-28 15:22     ` Konstantin Ryabitsev
  2024-02-28 15:04   ` Hannes Reinecke
  1 sibling, 2 replies; 23+ messages in thread
From: Mark Brown @ 2024-02-28 14:03 UTC (permalink / raw)
  To: Willy Tarreau; +Cc: Konstantin Ryabitsev, users, tools, workflows

[-- Attachment #1: Type: text/plain, Size: 2808 bytes --]

On Wed, Feb 28, 2024 at 06:00:07AM +0100, Willy Tarreau wrote:
> On Tue, Feb 27, 2024 at 05:32:34PM -0500, Konstantin Ryabitsev wrote:

> > So, the question is -- is this useful at all? Am I wasting time poking in this
> > direction, or is this something that would be of benefit to any of you? If the
> > latter, I will document how to set this up and commit the thread minimization
> > code I hacked together to make it cheaper.

> I figured a number of shortcomings about this: I suspect that those
> most interested in such output are either, a bit like me, not much
> active on kernel development, or focus on a specific area and mostly
> want to stay aware of ongoing changes in other areas they're really
> not familiar with.

I can imagine using this sort of thing for the case where I get to my
inbox in the morning and there's some enormous thread that appeared
overnight with people arguing, and I'm trying to get a handle on what
the various subthreads are all about.  The demo didn't cover exactly
that case but it looked like it might be able to give some sort of
useful steer.

> And because of this I didn't find on what boundaries to cut the analysis,
> If it's "since last time I read my email", it can only be done locally
> and will be per-user. If it's a summary of a final thread, it's not
> super interesting and it's better explained (IMHO) on LWN where the
> hot topics are summarized and developed. If it's the list of threads
> of the day, I've suspected that there are so many that it's unlikely
> I'd read all of them every evening or every morning. I've been wondering
> if an interesting approach would be to only summarize long threads,
> since most short ones are a patch, a review and an ACK and do not need
> to be summarized, but I think that most of us seeing a subject repeat
> over many e-mails will just look at a few exchanges there to get an
> idea of what's going on.

For the above case it'd be an on demand thing which I'd look for
occasionally.

> Also regarding processing costs, I've had extremely good results using
> the Mixtral-8x7B LLM in instruct mode running locally. It has a 32k context
> like GPT4. And if not enough, given that most of a long thread's contents
> is in fact quoted text, it could be sufficient to drop multiple indents
> to preserve a response and its context while dropping most of the repeat
> (it cuts your example thread in roughly half). But this still takes quite
> a bit of processing time: processing the 14 mails from the thread above
> took 13 minutes on a 80-core Ampere Altra system (no GPU involved here).
> This roughly costs 1 minute per e-mail, that's a lot per day, not counting
> the time needed to tune the prompt to get the best results!

That actually sounds potentially viable for my case, at least while I'm
at home.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Toy/demo: using ChatGPT to summarize lengthy LKML threads (b4 integration)
  2024-02-28 14:03   ` Mark Brown
@ 2024-02-28 14:39     ` Willy Tarreau
  2024-02-28 15:22     ` Konstantin Ryabitsev
  1 sibling, 0 replies; 23+ messages in thread
From: Willy Tarreau @ 2024-02-28 14:39 UTC (permalink / raw)
  To: Mark Brown; +Cc: Konstantin Ryabitsev, users, tools, workflows

Hi Mark,

On Wed, Feb 28, 2024 at 02:03:31PM +0000, Mark Brown wrote:
> On Wed, Feb 28, 2024 at 06:00:07AM +0100, Willy Tarreau wrote:
> > On Tue, Feb 27, 2024 at 05:32:34PM -0500, Konstantin Ryabitsev wrote:
> 
> > > So, the question is -- is this useful at all? Am I wasting time poking in this
> > > direction, or is this something that would be of benefit to any of you? If the
> > > latter, I will document how to set this up and commit the thread minimization
> > > code I hacked together to make it cheaper.
> 
> > I figured a number of shortcomings about this: I suspect that those
> > most interested in such output are either, a bit like me, not much
> > active on kernel development, or focus on a specific area and mostly
> > want to stay aware of ongoing changes in other areas they're really
> > not familiar with.
> 
> I can imagine using this sort of thing for the case where I get to my
> inbox in the morning and there's some enormous thread appeared overnight
> with people arguing and I'm trying to get a handle on what the various
> subthreads are all about.

So maybe I'm not the only one purging my mailbox during coffee time
after all :-)

> The demo didn't cover exactly that case but
> it looked like it might be able to give some sort of useful steer.

Please see below for an example of what I got locally.

> > Also regarding processing costs, I've had extremely good results using
> > the Mixtral-8x7B LLM in instruct mode running locally. It has a 32k context
> > like GPT4. And if not enough, given that most of a long thread's contents
> > is in fact quoted text, it could be sufficient to drop multiple indents
> > to preserve a response and its context while dropping most of the repeat
> > (it cuts your example thread in roughly half). But this still takes quite
> > a bit of processing time: processing the 14 mails from the thread above
> > took 13 minutes on a 80-core Ampere Altra system (no GPU involved here).
> > This roughly costs 1 minute per e-mail, that's a lot per day, not counting
> > the time needed to tune the prompt to get the best results!
> 
> That actually sounds potentially viable for my case, at least while I'm
> at home.

OK, if you're interested in giving it a try at home, here's what I've been
using:
  - github.com/ggerganov/llama.cpp 
  - the mixtral LLM from:
    https://huggingface.co/TheBloke/Mixtral-8x7B-Instruct-v0.1-GGUF/tree/main
    I'm used to Q5_K_M quantization which generally provides the best
    compromise of accuracy/performance/speed, but for e-mail summaries,
    maybe using a smaller one would give good enough results faster.
  - ~35G of available RAM for the model above and plenty of cores (80 in
    my case)

I downloaded the whole thread in mbox format from lore (message ID
20240223143833.1509961-1-guanyulin@google.com), passed it through
formail to drop useless headers that take a lot of context space and
time, and through grep to drop 2nd-level replies:

  $ (echo "<|im_start|>data"; zcat t.mbox.gz | \
    formail -I X- -I Received -I ARC- -I Authentication- -I DKIM- \
            -I List- -I Precedence -I Mime- -I Message-ID -s | \
    grep -v '^>[ ]*>'; echo "<|im_end|>") > q1.txt

The mbox roughly shrank in half (~4k words).

Then I ran the attempt below with the following command (the model
understands different prompt formats, this one works fine enough):

  $ ./main -c 0 --temp 0.3 -n -1 --threads 80 -tbd 40 \
  -m /mnt/models/mixtral-8x7b-instruct-v0.1.Q5_K_M.gguf \
  -p "<|im_start|>system
    You're an e-mail summarization bot, please follow the instructions below.<|im_end|>
   <|im_start|>user
   Please summarize the conversation in the DATA section below, focusing on
   the important points, objections and decisions taken, in three to four
   paragraphs. Please do not forget to mention who thinks what, and if
   anyone agrees or objects.<|im_end|>
   " -f q1.txt

Note that you can remove the debugging pollution using --log-disable.
Also, 80 and 40 are the number of CPUs to use when parsing and responding
respectively; they're optimal on this machine, but normally you'd just use
--threads $(nproc) and not specify -tbd.

This time, with the trimmed 2nd-level responses, it took 9m45s and
produced the following, which I found reasonably accurate based on some
visual checks I did in this thread:

| The conversation in the provided data section is centered around a patch
| submitted by Guan-Yu Lin to the Linux kernel mailing list. The patch
| introduces a new sysfs entry called "conditional_skip" that allows the
| system to selectively skip certain device power management state
| transitions. This feature is intended to address conflicts that can
| arise in systems with a main processor and a co-processor, where
| asynchronous controller management can lead to issues such as the main
| processor attempting to suspend a device while the co-processor is
| actively using it.
| 
| Andy Shevchenko raises concerns about the use of an int for the ".event"
| member in struct pm_message, suggesting that it might give subtle errors
| related to promoted signdness. Guan-Yu Lin agrees and asks if they
| should refrain from using bitwise operation here or just change the type
| to u32.
| 
| Florian Fainelli questions the use of a sysfs entry for this feature,
| suggesting that it would be more appropriate to use firmware's
| description for this purpose. Guan-Yu Lin explains that they anticipate
| that control of the peripheral will be shared between operating system
| kernels and that they need to store information about whether another
| operating system is actively using this peripheral. Florian Fainelli
| suggests unbinding and rebinding the device from its driver when it
| needs to be taken over by another operating system.
| 
| Rafael J. Wysocki expresses concerns about the idea of conditionally
| skipping system-wide power management transitions for a device,
| questioning why a device would be skipped in one transition but not
| another and expressing doubts about the overall usefulness of this
| feature. Guan-Yu Lin attempts to clarify their reasoning, but Rafael J.
| Wysocki remains unconvinced.
| 
| Overall, the conversation revolves around the design and implementation
| of a new feature for the Linux kernel that allows for more fine-grained
| control over device power management state transitions in systems with
| multiple processors or co-processors. There are concerns about the use
| of a sysfs entry for this purpose and doubts about the overall
| usefulness of the feature, but Guan-Yu Lin attempts to address these
| concerns and clarify their reasoning throughout the conversation.
| [end of text]

Not bad at all if you want my opinion, and from what I've seen, it
managed to connect questions and comments with the quoted parts. The
speed can be adjusted by trying a smaller quantization (e.g. Q3), which
will use less memory bandwidth and sometimes perform more SIMD ops at
once. In addition, it can be worth trying smaller models such as its
cousin Mistral-7B-v0.2, which is very good as well, but they generally
have a smaller window. It can be stretched to 32k using -c 32768, but
the quality normally degrades with larger windows. That remains to
be attempted. Some even newer 1.6-3B models exist as well and seem
to summarize some short texts well, but I don't know how they fare
when stretching their context, nor whether they're at ease with mbox
formats. Their real benefit would be to run on more common hardware
at acceptable speeds.

Regards,
Willy

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Toy/demo: using ChatGPT to summarize lengthy LKML threads (b4 integration)
  2024-02-28  5:00 ` Willy Tarreau
  2024-02-28 14:03   ` Mark Brown
@ 2024-02-28 15:04   ` Hannes Reinecke
  2024-02-28 15:15     ` Willy Tarreau
  2024-02-28 17:43     ` Jonathan Corbet
  1 sibling, 2 replies; 23+ messages in thread
From: Hannes Reinecke @ 2024-02-28 15:04 UTC (permalink / raw)
  To: Willy Tarreau, Konstantin Ryabitsev; +Cc: users, tools, workflows, Jon Corbet

On 2/28/24 06:00, Willy Tarreau wrote:
> Hi Konstantin,
> 
[ .. ]
> And because of this I didn't find on what boundaries to cut the analysis,
> If it's "since last time I read my email", it can only be done locally
> and will be per-user. If it's a summary of a final thread, it's not
> super interesting and it's better explained (IMHO) on LWN where the
> hot topics are summarized and developed. If it's the list of threads
> of the day, I've suspected that there are so many that it's unlikely
> I'd read all of them every evening or every morning. I've been wondering
> if an interesting approach would be to only summarize long threads,
> since most short ones are a patch, a review and an ACK and do not need
> to be summarized, but I think that most of us seeing a subject repeat
> over many e-mails will just look at a few exchanges there to get an
> idea of what's going on.
> 
[ snip ]

Out of personal interest I really have to chime in here.
You are aware, though, that by following this line of thought
and improving the underlying model (which really shouldn't
be too hard given the vast body of training data that is LKML)
we would actually be putting LWN and its editors out of business.

I would be _VERY_ interested in having a discussion here with the
relevant parties (Hello Jon!) to see which direction we as a
community want to go. Maybe Plumbers and/or Kernel Summit in Vienna
would be a good place to start.

Cheers,

Hannes


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Toy/demo: using ChatGPT to summarize lengthy LKML threads (b4 integration)
  2024-02-28 15:04   ` Hannes Reinecke
@ 2024-02-28 15:15     ` Willy Tarreau
  2024-02-28 17:43     ` Jonathan Corbet
  1 sibling, 0 replies; 23+ messages in thread
From: Willy Tarreau @ 2024-02-28 15:15 UTC (permalink / raw)
  To: Hannes Reinecke; +Cc: Konstantin Ryabitsev, users, tools, workflows, Jon Corbet

On Wed, Feb 28, 2024 at 04:04:28PM +0100, Hannes Reinecke wrote:
> On 2/28/24 06:00, Willy Tarreau wrote:
> > Hi Konstantin,
> > 
> [ .. ]
> > And because of this I didn't find on what boundaries to cut the analysis,
> > If it's "since last time I read my email", it can only be done locally
> > and will be per-user. If it's a summary of a final thread, it's not
> > super interesting and it's better explained (IMHO) on LWN where the
> > hot topics are summarized and developed. If it's the list of threads
> > of the day, I've suspected that there are so many that it's unlikely
> > I'd read all of them every evening or every morning. I've been wondering
> > if an interesting approach would be to only summarize long threads,
> > since most short ones are a patch, a review and an ACK and do not need
> > to be summarized, but I think that most of us seeing a subject repeat
> > over many e-mails will just look at a few exchanges there to get an
> > idea of what's going on.
> > 
> [ snip ]
> 
> Out of personal interest I really have to chime in here.
> You are aware, though, that following this line of thought
> and improving the underlying model (which really shouldn't
> be too hard given the vast body of training data which is LKML)
> we will actually putting LWN and its editors out of business.

No, really, and I mentioned it in the first message: LWN provides good
explanations, links to articles, personal opinions, references, etc. I
wouldn't put LWN and a chatbot on the same level at all! One has skilled
writers who know their topic and explain how stuff works and why
certain choices are made; the other one is basically the chatty coworker
that everyone prefers not to hear from for too long.

Here it's "just" a rough summary of a conversation, and it's not capable
of dealing with too large a context. I really love to read LWN to *learn*
some very interesting stuff. A summary like the ones Konstantin and I
shared is more aimed at letting you decide if you're interested in going
deeper into that thread. And BTW, my initial experiments were clearly
about this: "am I interested in reading that thread or should I skip it".

> I would be _VERY_ interesting having a discussion here with the
> relevant parties (Hello Jon!) to see which direction we as a
> community will want to go. Maybe Plumbers and/or Kernel Summit in Vienna
> would be a good place to start.

Most likely, yes.

Willy

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Toy/demo: using ChatGPT to summarize lengthy LKML threads (b4 integration)
  2024-02-28 14:03   ` Mark Brown
  2024-02-28 14:39     ` Willy Tarreau
@ 2024-02-28 15:22     ` Konstantin Ryabitsev
  2024-02-28 15:29       ` Willy Tarreau
  1 sibling, 1 reply; 23+ messages in thread
From: Konstantin Ryabitsev @ 2024-02-28 15:22 UTC (permalink / raw)
  To: Mark Brown; +Cc: Willy Tarreau, users, tools, workflows

On Wed, Feb 28, 2024 at 02:03:31PM +0000, Mark Brown wrote:
> > I figured a number of shortcomings about this: I suspect that those
> > most interested in such output are either, a bit like me, not much
> > active on kernel development, or focus on a specific area and mostly
> > want to stay aware of ongoing changes in other areas they're really
> > not familiar with.
> 
> I can imagine using this sort of thing for the case where I get to my
> inbox in the morning and there's some enormous thread appeared overnight
> with people arguing and I'm trying to get a handle on what the various
> subthreads are all about.  The demo didn't cover exactly that case but
> it looked like it might be able to give some sort of useful steer.

Another use for this that I could think of is a way to summarize digests.
Currently, if you choose a digest subscription, you will receive a single
email with message subjects and all the new messages as individual
attachments. It would be interesting to see if we can send out a "here's
what's new" summary with links to threads instead.

The challenge would be to do it in a way that doesn't bankrupt LFIT in the
process. :)
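
A rough sketch of what assembling such a digest could look like, purely
for illustration (the summarize() hook is a placeholder for whatever
model, local or hosted, ends up doing the work, and the
lore.kernel.org/r/ link form is an assumption):

  from email.utils import parseaddr

  def digest_entry(msg, summarize):
      # one "here's what's new" entry per thread-starting message
      msgid = (msg.get("Message-ID") or "").strip().strip("<>")
      author = parseaddr(msg.get("From", ""))[1]
      return (f"* {msg.get('Subject', '(no subject)')}\n"
              f"  from {author}\n"
              f"  https://lore.kernel.org/r/{msgid}/\n"
              f"  {summarize(msg)}\n")          # 2-3 sentences per thread

  def build_digest(thread_starters, summarize):
      return ("What's new on the list:\n\n"
              + "\n".join(digest_entry(m, summarize) for m in thread_starters))

Per-thread summaries could also be cached, so a digest only pays for
threads that actually changed since the last run.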

-K

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Toy/demo: using ChatGPT to summarize lengthy LKML threads (b4 integration)
  2024-02-28 15:22     ` Konstantin Ryabitsev
@ 2024-02-28 15:29       ` Willy Tarreau
  2024-02-28 17:52         ` Konstantin Ryabitsev
  0 siblings, 1 reply; 23+ messages in thread
From: Willy Tarreau @ 2024-02-28 15:29 UTC (permalink / raw)
  To: Konstantin Ryabitsev; +Cc: Mark Brown, users, tools, workflows

On Wed, Feb 28, 2024 at 10:22:41AM -0500, Konstantin Ryabitsev wrote:
> On Wed, Feb 28, 2024 at 02:03:31PM +0000, Mark Brown wrote:
> > > I figured a number of shortcomings about this: I suspect that those
> > > most interested in such output are either, a bit like me, not much
> > > active on kernel development, or focus on a specific area and mostly
> > > want to stay aware of ongoing changes in other areas they're really
> > > not familiar with.
> > 
> > I can imagine using this sort of thing for the case where I get to my
> > inbox in the morning and there's some enormous thread appeared overnight
> > with people arguing and I'm trying to get a handle on what the various
> > subthreads are all about.  The demo didn't cover exactly that case but
> > it looked like it might be able to give some sort of useful steer.
> 
> Another use for this that I could think is a way to summarize digests.
> Currently, if you choose a digest subscription, you will receive a single
> email with message subjects and all the new messages as individual
> attachments. It would be interesting to see if we can send out a "here's
> what's new" summary with links to threads instead.

Indeed!

> The challenge would be to do it in a way that doesn't bankrupt LFIT in the
> process. :)

That's exactly why it would make sense to invest in one large machine
and let it operate locally while "only" paying the power bill.

Willy

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Toy/demo: using ChatGPT to summarize lengthy LKML threads (b4 integration)
  2024-02-28 15:04   ` Hannes Reinecke
  2024-02-28 15:15     ` Willy Tarreau
@ 2024-02-28 17:43     ` Jonathan Corbet
  2024-02-28 18:52       ` Alex Elder
  1 sibling, 1 reply; 23+ messages in thread
From: Jonathan Corbet @ 2024-02-28 17:43 UTC (permalink / raw)
  To: Hannes Reinecke, Willy Tarreau, Konstantin Ryabitsev
  Cc: users, tools, workflows

Hannes Reinecke <hare@suse.de> writes:

> Out of personal interest I really have to chime in here.
> You are aware, though, that following this line of thought
> and improving the underlying model (which really shouldn't
> be too hard given the vast body of training data which is LKML)
> we will actually putting LWN and its editors out of business.
>
> I would be _VERY_ interesting having a discussion here with the
> relevant parties (Hello Jon!) to see which direction we as a
> community will want to go. Maybe Plumbers and/or Kernel Summit in Vienna 
> would be a good place to start.

Trust me, if you're in a business that is in any way associated with
publishing, you're aware of such things.

Whether the kernel community goes in for a tool like this will
ultimately have little effect on LWN's fate - others will certainly pick
this stuff up regardless.  There are, after all, certain other sites out
there reporting on the kernel that have looked like bad LLM output for
years anyway...:)

I worry about some of this stuff in general.  It seems there's a
definite risk of creating a checkpatchGPT that is allowed to push
developers around far more than would be warranted, for example.  But
the tools are out there and getting cheaper; if they are useful I guess
we should use them.

It would be nicer to base them on an open-source (or something close to
it) model, of course.

Meanwhile, in the short term I suspect LWN can find a place for itself.
After that I hope to be retired.

Thanks,

jon

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Toy/demo: using ChatGPT to summarize lengthy LKML threads (b4 integration)
  2024-02-28 15:29       ` Willy Tarreau
@ 2024-02-28 17:52         ` Konstantin Ryabitsev
  2024-02-28 17:58           ` Willy Tarreau
  0 siblings, 1 reply; 23+ messages in thread
From: Konstantin Ryabitsev @ 2024-02-28 17:52 UTC (permalink / raw)
  To: Willy Tarreau; +Cc: Mark Brown, users, tools, workflows

On Wed, Feb 28, 2024 at 04:29:53PM +0100, Willy Tarreau wrote:
> > Another use for this that I could think is a way to summarize digests.
> > Currently, if you choose a digest subscription, you will receive a single
> > email with message subjects and all the new messages as individual
> > attachments. It would be interesting to see if we can send out a "here's
> > what's new" summary with links to threads instead.
> 
> Indeed!
> 
> > The challenge would be to do it in a way that doesn't bankrupt LFIT in the
> > process. :)
> 
> That's exactly why it would make sense to invest in one large machine
> and let it operate locally while "only" paying the power bill.

I'm not sure how realistic this is, if it takes 10 minutes to process a single
4000-word thread. :) With ChatGPT it would probably cost thousands of dollars
daily if we did this for large lists (and it doesn't really make sense to do
this on small lists anyway, as the whole purpose behind the idea is to
summarize lists with lots of traffic).

For the moment, I will document how I got this working and maybe look into
further shrinking the amount of data that needs to be sent to the
LLM. I will definitely need to make it easy to use a local model, since
relying on a proprietary service (of questionable repute in the eyes of many)
would not be in the true spirit of what we are all trying to do here. As I
said, I was mostly toying around with $25 worth of credits that I had with
OpenAI.
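
As a minimal sketch of what "easy to use a local model" could look like
(assuming an OpenAI-compatible endpoint such as the one llama.cpp's
server example exposes; the base_url, model name and prompt are
illustrative placeholders, not what b4 will necessarily ship):

  from openai import OpenAI

  def summarize(slimmed_thread, local=True):
      if local:
          # e.g. llama.cpp's server example, which speaks the OpenAI API
          client = OpenAI(base_url="http://localhost:8080/v1",
                          api_key="unused")
      else:
          client = OpenAI()   # uses OPENAI_API_KEY from the environment
      resp = client.chat.completions.create(
          model="local-model" if local else "gpt-4",
          messages=[
              {"role": "system",
               "content": "You summarize kernel mailing list threads."},
              {"role": "user",
               "content": "Summarize this thread, noting who objected "
                          "and what was decided:\n\n" + slimmed_thread},
          ],
      )
      return resp.choices[0].message.content

The split into system/user messages and the prompt wording are just one
possible arrangement; the point is that hosted and local paths sit
behind the same call.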

-K

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Toy/demo: using ChatGPT to summarize lengthy LKML threads (b4 integration)
  2024-02-28 17:52         ` Konstantin Ryabitsev
@ 2024-02-28 17:58           ` Willy Tarreau
  2024-02-28 19:16             ` Konstantin Ryabitsev
  0 siblings, 1 reply; 23+ messages in thread
From: Willy Tarreau @ 2024-02-28 17:58 UTC (permalink / raw)
  To: Konstantin Ryabitsev; +Cc: Mark Brown, users, tools, workflows

On Wed, Feb 28, 2024 at 12:52:43PM -0500, Konstantin Ryabitsev wrote:
> On Wed, Feb 28, 2024 at 04:29:53PM +0100, Willy Tarreau wrote:
> > > Another use for this that I could think is a way to summarize digests.
> > > Currently, if you choose a digest subscription, you will receive a single
> > > email with message subjects and all the new messages as individual
> > > attachments. It would be interesting to see if we can send out a "here's
> > > what's new" summary with links to threads instead.
> > 
> > Indeed!
> > 
> > > The challenge would be to do it in a way that doesn't bankrupt LFIT in the
> > > process. :)
> > 
> > That's exactly why it would make sense to invest in one large machine
> > and let it operate locally while "only" paying the power bill.
> 
> I'm not sure how realistic this is, if it takes 10 minutes to process a single
> 4000-word thread. :)

I know. People are getting much better performance with GPUs, as well as
on Macs in particular. I have not investigated such options at all; I'm
only relying on commodity hardware. I shared the commands so that those
interested, and with the hardware, can attempt it as well. I don't know
how far we can shrink that time.

> With ChatGPT it would probably cost thousands of dollars
> daily if we did this for large lists (and it doesn't really make sense to do
> this on small lists anyway, as the whole purpose behind the idea is to
> summarize lists with lots of traffic).

Sure.

> For the moment, I will document how I got this working and maybe look into
> further shrinking the amount of data that would be needed to be sent to the
> LLM. I will definitely need to make it easy to use a local model, since
> relying on a proprietary service (of questionable repute in the eyes of many)
> would not be in the true spirit of what we are all trying to do here.

I tend to think that these solutions will evolve very quickly both hosted
and local, and it's prudent not to stick to a single approach anyway.

> As I
> said, I was mostly toying around with $25 worth credits that I had with
> OpenAI.

And that was a great experience showing really interesting results!

Cheers,
Willy

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Toy/demo: using ChatGPT to summarize lengthy LKML threads (b4 integration)
  2024-02-28 17:43     ` Jonathan Corbet
@ 2024-02-28 18:52       ` Alex Elder
  0 siblings, 0 replies; 23+ messages in thread
From: Alex Elder @ 2024-02-28 18:52 UTC (permalink / raw)
  To: Jonathan Corbet, Hannes Reinecke, Willy Tarreau, Konstantin Ryabitsev
  Cc: users, tools, workflows

On 2/28/24 11:43 AM, Jonathan Corbet wrote:
> Meanwhile, in the short term I suspect LWN can find a place for itself.

I am certain that ChatGPT and the like will *not* replace LWN
any time soon--not at *all*.  The human analysis and ownership
provide enormous value.

> After that I hope to be retired.

Nooo!!!  But yeah, we're all getting older.

					-Alex

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Toy/demo: using ChatGPT to summarize lengthy LKML threads (b4 integration)
  2024-02-27 22:32 Toy/demo: using ChatGPT to summarize lengthy LKML threads (b4 integration) Konstantin Ryabitsev
                   ` (2 preceding siblings ...)
  2024-02-28  5:00 ` Willy Tarreau
@ 2024-02-28 18:55 ` Bart Van Assche
  2024-02-29  7:18   ` Hannes Reinecke
  2024-02-29  9:30   ` James Bottomley
  2024-02-28 19:32 ` Luis Chamberlain
  4 siblings, 2 replies; 23+ messages in thread
From: Bart Van Assche @ 2024-02-28 18:55 UTC (permalink / raw)
  To: Konstantin Ryabitsev, users, tools, workflows

On 2/27/24 14:32, Konstantin Ryabitsev wrote:
> I was playing with shell-gpt and wrote a quickie integration that would allow
> retrieving (slimmed-down) threads from lore, feeding them to ChatGPT, and
> asking it to provide some basic analysis of the thread contents. Here's a
> recorded demo session:
> 
> https://asciinema.org/a/643435
> 
> A few notes:
> 
> 1. This is obviously not a replacement for actually reading email, but can
>     potentially be a useful asset for a busy maintainer who just wants a quick
>     summary of a lengthy thread before they look at it in detail.
> 2. This is not free or cheap! To digest a lengthy thread, you can expect
>     ChatGPT to generate enough tokens to cost you $1 or more in API usage fees.
>     I know it's nothing compared to how expensive some of y'all's time is, and
>     you can probably easily get that expensed by your employers, but for many
>     others it's a pretty expensive toy. I managed to make it a bit cheaper by
>     doing some surgery on the threads before feeding them to chatgpt (like
>     removing most of the message headers and throwing out some of the quoted
>     content), but there's a limit to how much we can throw out before the
>     analysis becomes dramatically less useful.
> 3. This only works with ChatGPT-4, as most threads are too long for
>     ChatGPT-3.5 to even process.
> 
> So, the question is -- is this useful at all? Am I wasting time poking in this
> direction, or is this something that would be of benefit to any of you? If the
> latter, I will document how to set this up and commit the thread minimization
> code I hacked together to make it cheaper.

Please do not publish the summaries generated by ChatGPT on the web. If
these summaries were published on the world wide web, ChatGPT or
other LLMs would probably use these summaries as input data. If there
were any mistakes in these summaries, then these mistakes would end
up being used as input data by multiple LLMs.

Thanks,

Bart.


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Toy/demo: using ChatGPT to summarize lengthy LKML threads (b4 integration)
  2024-02-28 17:58           ` Willy Tarreau
@ 2024-02-28 19:16             ` Konstantin Ryabitsev
  0 siblings, 0 replies; 23+ messages in thread
From: Konstantin Ryabitsev @ 2024-02-28 19:16 UTC (permalink / raw)
  To: Willy Tarreau; +Cc: Mark Brown, users, tools, workflows

On Wed, Feb 28, 2024 at 06:58:05PM +0100, Willy Tarreau wrote:
> > For the moment, I will document how I got this working and maybe look into
> > further shrinking the amount of data that would be needed to be sent to the
> > LLM. I will definitely need to make it easy to use a local model, since
> > relying on a proprietary service (of questionable repute in the eyes of many)
> > would not be in the true spirit of what we are all trying to do here.
> 
> I tend to think that these solutions will evolve very quickly both hosted
> and local, and it's prudent not to stick to a single approach anyway.

Well, in theory, and given time and budget, we can totally imagine a service
for maintainers that is trained on the entirety of lore and continuously
updated. It would be like lei on steroids.

Imagine a session like the following, typed by an imaginary maintainer:

    >>> Are there any new patches posted for me?

    Using your previously specified criteria, the following relevant series have
    been submitted since your last inquiry:

     1. [PATCH 0/5] foodrv: fixes to match the kernel coding style
     2. [PATCH v2] foodrv: initial support for foodev-alpha3000
     3. [PATCH v3 000/255] mm: remove the FLUB allocator
     4. [PATCH RFC v17 00/23] foodrv: rewrite in rust

    >>> Remove any series that received negative reviews and any threads that I already replied to.

    After removing the patch series that received negative reviews from automated
    testing tools and the series containing your reply, the following threads
    remain:

     1. [PATCH 0/5] foodrv: fixes to match the kernel coding style
     2. [PATCH v2] foodrv: initial support for foodev-alpha3000

    >>> Apply these series to my tree in individual branches and queue tests.

     1. Applied as review/foodrv-fixes-match-kernel-coding-style
     2. Applied as review/v2-foodrv-initial-support-foodev-alpha3000
     3. Pushed to ci.corpforge.example.com for automated testing

    >>> Any new relevant discussions?

    Using your previously specified criteria, the following new discussions are
    relevant to you:

     1. CVE-5555-1212: denial of service in foodrv device drivers
     2. [Bug 5551212] foodrv melted my keyboard!
     3. Is anyone else getting weird results running test t/barbaz_123?

    >>> Summarize the bug report.

    The bug report, submitted by Krazy Person <krazy@example.com>, claims that the
    version of foodrv shipping with stable kernel 6.6.18 melted their keyboard due
    to sonar feedback loop between their tinfoil hat and the CIA monitoring
    devices implanted in their USB cables.

    >>> Close this bug report as invalid. Politely explain that this is not possible.

    Posted a new comment from Busy Maintainer:

    Hello, thank you for your bug report. At this time, we do not believe that
    interaction between foodrv and the keyboard would lead to a thermal event
    of this magnitude.

    Bug 5551212 closed as RESOLVED/INVALID.

    >>> Send the other two threads to my inbox and include any future updates.

    lei: sending to your inbox:

     1. CVE-5555-1212: denial of service in foodrv device drivers (5 messages)
     2. Is anyone else getting weird results running test t/barbaz_123? (72 messages)

    lei: configured to retrieve any further thread updates

    >>> (etc)

Large parts of this can be accomplished today already, some with minor
integration functions (lei, bugtracker, git, etc.). We should, of course,
worry about the LLM hallucinating or just being plain buggy, but we can
certainly imagine a scenario where this is a usable workflow alongside
the more traditional reading and replying to mail, and it's not even in
the realm of some Trek-inspired fantasy.

-K

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Toy/demo: using ChatGPT to summarize lengthy LKML threads (b4 integration)
  2024-02-27 22:32 Toy/demo: using ChatGPT to summarize lengthy LKML threads (b4 integration) Konstantin Ryabitsev
                   ` (3 preceding siblings ...)
  2024-02-28 18:55 ` Bart Van Assche
@ 2024-02-28 19:32 ` Luis Chamberlain
  4 siblings, 0 replies; 23+ messages in thread
From: Luis Chamberlain @ 2024-02-28 19:32 UTC (permalink / raw)
  To: Konstantin Ryabitsev, Sasha Levin; +Cc: users, tools, workflows

On Tue, Feb 27, 2024 at 05:32:34PM -0500, Konstantin Ryabitsev wrote:
> Hi, all:
> 
> I was playing with shell-gpt and wrote a quickie integration that would allow
> retrieving (slimmed-down) threads from lore, feeding them to ChatGPT, and
> asking it to provide some basic analysis of the thread contents. Here's a
> recorded demo session:
> 
> https://asciinema.org/a/643435
> 
> A few notes:
> 
> 1. This is obviously not a replacement for actually reading email, but can
>    potentially be a useful asset for a busy maintainer who just wants a quick
>    summary of a lengthy thread before they look at it in detail.
> 2. This is not free or cheap! To digest a lengthy thread, you can expect
>    ChatGPT to generate enough tokens to cost you $1 or more in API usage fees.
>    I know it's nothing compared to how expensive some of y'all's time is, and
>    you can probably easily get that expensed by your employers, but for many
>    others it's a pretty expensive toy. I managed to make it a bit cheaper by
>    doing some surgery on the threads before feeding them to chatgpt (like
>    removing most of the message headers and throwing out some of the quoted
>    content), but there's a limit to how much we can throw out before the
>    analysis becomes dramatically less useful.
> 3. This only works with ChatGPT-4, as most threads are too long for
>    ChatGPT-3.5 to even process.
> 
> So, the question is -- is this useful at all? Am I wasting time poking in this
> direction, or is this something that would be of benefit to any of you? If the
> latter, I will document how to set this up and commit the thread minimization
> code I hacked together to make it cheaper.

While I probably wouldn't use it day to day, I expect younger
generations might use this more than us older generations to be more
productive, even if they get hallucinations.

An LLM trained with more data relevant to patches might be more
suitable, which is why I wanted the tooling for stable candidate patches
to be opened up, so as to enable more exploration in areas like this.

A use case example might be training for identifying subsystems with
more memory safety issues.

Another might be to help summarize further pull requests in one or
two sentences, or optionally a few bullets. So for instance, I try to
document major module changes as a bullet list here:

https://kernelnewbies.org/KernelProjects/modules

so that it is easier to track them / go down memory lane. Doing this
automatically would let a tool take care of it.

  Luis

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Toy/demo: using ChatGPT to summarize lengthy LKML threads (b4 integration)
  2024-02-28  0:43 ` Linus Torvalds
@ 2024-02-28 20:46   ` Shuah Khan
  2024-02-29  0:33   ` James Bottomley
  1 sibling, 0 replies; 23+ messages in thread
From: Shuah Khan @ 2024-02-28 20:46 UTC (permalink / raw)
  To: Linus Torvalds, Konstantin Ryabitsev; +Cc: users, tools, workflows, Shuah Khan

On 2/27/24 17:43, Linus Torvalds wrote:
> On Tue, 27 Feb 2024 at 14:32, Konstantin Ryabitsev
> <konstantin@linuxfoundation.org> wrote:
>>
>> So, the question is -- is this useful at all?
> 
> I find the demo intriguing. Useful? Who knows, but the potential is
> certainly there.
> 
> The summary didn't seem to take Rafael's last (fairly strongly
> negative) email into account as much as maybe it should have been (my
> first reaction was that maybe Rafael's last email hadn't made it into
> the summary, but going back to it I think it was there, just not
> nearly as forcefully as Rafael's email read to me).
> 
> I personally would find it more intriguing still if it actually
> reacted to the code instead of just the discussion, but I suspect
> that's still some way away.
> 
> IOW - interesting. But hard to tell how useful and how maintainers
> would actually end up using it.
> 
> Would it be something where when you bring a new developer into a long
> discussion, you'd introduce him with a summary of what has gone
> before?  I'm not sure what the use cases would be in practice.
> 

Konstantin,

As I was watching the demo, the same thought popped into my head that
this could be useful to a new developer. I am starting a mentoring
session on March 1st with 34 new developers.

Probably something we could experiment with to see if this helps them?

thanks,
-- Shuah



^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Toy/demo: using ChatGPT to summarize lengthy LKML threads (b4 integration)
  2024-02-28  0:43 ` Linus Torvalds
  2024-02-28 20:46   ` Shuah Khan
@ 2024-02-29  0:33   ` James Bottomley
  1 sibling, 0 replies; 23+ messages in thread
From: James Bottomley @ 2024-02-29  0:33 UTC (permalink / raw)
  To: Linus Torvalds, Konstantin Ryabitsev; +Cc: users, tools, workflows

On Tue, 2024-02-27 at 16:43 -0800, Linus Torvalds wrote:
> I personally would find it more intriguing still if it actually
> reacted to the code instead of just the discussion, but I suspect
> that's still some way away.

So this bit may be closer than you think.  The Software Heritage Org
announced an LLM trained on all FOSS code in the archive:

https://www.softwareheritage.org/2023/10/19/swh-statement-on-llm-for-code/

So it should be possible for Konstantin to plug into it.  The question,
as with all static checkers, would be whether it produces a decent
enough signal-to-noise ratio for us to find it useful.  I bet in the
early days it won't, so someone will have to help curate it and improve
the model as a full-time job, but I'm sure the SW Heritage folks would
love to help.

James


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Toy/demo: using ChatGPT to summarize lengthy LKML threads (b4 integration)
  2024-02-28 18:55 ` Bart Van Assche
@ 2024-02-29  7:18   ` Hannes Reinecke
  2024-02-29  8:37     ` Theodore Ts'o
  2024-03-01  1:13     ` Bart Van Assche
  2024-02-29  9:30   ` James Bottomley
  1 sibling, 2 replies; 23+ messages in thread
From: Hannes Reinecke @ 2024-02-29  7:18 UTC (permalink / raw)
  To: Bart Van Assche, Konstantin Ryabitsev, users, tools, workflows

On 2/28/24 19:55, Bart Van Assche wrote:
> On 2/27/24 14:32, Konstantin Ryabitsev wrote:
>> I was playing with shell-gpt and wrote a quickie integration that 
>> would allow
>> retrieving (slimmed-down) threads from lore, feeding them to ChatGPT, and
>> asking it to provide some basic analysis of the thread contents. Here's a
>> recorded demo session:
>>
>> https://asciinema.org/a/643435
>>
>> A few notes:
>>
>> 1. This is obviously not a replacement for actually reading email, but 
>> can
>>     potentially be a useful asset for a busy maintainer who just wants 
>> a quick
>>     summary of a lengthy thread before they look at it in detail.
>> 2. This is not free or cheap! To digest a lengthy thread, you can expect
>>     ChatGPT to generate enough tokens to cost you $1 or more in API 
>> usage fees.
>>     I know it's nothing compared to how expensive some of y'all's time 
>> is, and
>>     you can probably easily get that expensed by your employers, but 
>> for many
>>     others it's a pretty expensive toy. I managed to make it a bit 
>> cheaper by
>>     doing some surgery on the threads before feeding them to chatgpt 
>> (like
>>     removing most of the message headers and throwing out some of the 
>> quoted
>>     content), but there's a limit to how much we can throw out before the
>>     analysis becomes dramatically less useful.
>> 3. This only works with ChatGPT-4, as most threads are too long for
>>     ChatGPT-3.5 to even process.
>>
>> So, the question is -- is this useful at all? Am I wasting time poking 
>> in this
>> direction, or is this something that would be of benefit to any of 
>> you? If the
>> latter, I will document how to set this up and commit the thread 
>> minimization
>> code I hacked together to make it cheaper.
> 
> Please do not publish the summaries generated by ChatGPT on the web. If
> these summaries would be published on the world wide web, ChatGPT or
> other LLMs probably would use these summaries as input data. If there
> would be any mistakes in these summaries, then these mistakes would end
> up being used as input data by multiple LLMs.
> 
Now there's a thought. Maybe we should do exactly the opposite, and
post _more_ ChatGPT-generated content on the web?
Sending them into a deadly self-reinforcing feedback loop?

But that's probably beside the point.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                  Kernel Storage Architect
hare@suse.de                                +49 911 74053 688
SUSE Software Solutions GmbH, Frankenstr. 146, 90461 Nürnberg
HRB 36809 (AG Nürnberg), GF: I. Totev, A. McDonald, W. Knoblich


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Toy/demo: using ChatGPT to summarize lengthy LKML threads (b4 integration)
  2024-02-29  7:18   ` Hannes Reinecke
@ 2024-02-29  8:37     ` Theodore Ts'o
  2024-03-01  1:13     ` Bart Van Assche
  1 sibling, 0 replies; 23+ messages in thread
From: Theodore Ts'o @ 2024-02-29  8:37 UTC (permalink / raw)
  To: Hannes Reinecke
  Cc: Bart Van Assche, Konstantin Ryabitsev, users, tools, workflows

On Thu, Feb 29, 2024 at 08:18:43AM +0100, Hannes Reinecke wrote:
> On 2/28/24 19:55, Bart Van Assche wrote:
> > On 2/27/24 14:32, Konstantin Ryabitsev wrote:
> > Please do not publish the summaries generated by ChatGPT on the web. If
> > these summaries would be published on the world wide web, ChatGPT or
> > other LLMs probably would use these summaries as input data. If there
> > would be any mistakes in these summaries, then these mistakes would end
> > up being used as input data by multiple LLMs.
> > 
> Now there's a thought. Maybe we should do exactly the opposite, and posting
> _more_ ChatGPT generated content on the web?
> Sending them into a deadly self-enforcing feedback loop?

Well, I'll note that last July, when a number of AI companies,
including Amazon, Anthropic, Google, Inflection, Meta, Microsoft, and
OpenAI, met with President Biden at the White House, they made a
commitment to develop watermarking standards to allow AI-generated
content to be detected[1].  Obviously, it's a lot easier to do this
with images, and Google was the first company to release a
watermarking system for AI-generated images[2].  However, there is
ongoing research on how to add watermarking to text[3].

[1] https://www.whitehouse.gov/briefing-room/statements-releases/2023/07/21/fact-sheet-biden-harris-administration-secures-voluntary-commitments-from-leading-artificial-intelligence-companies-to-manage-the-risks-posed-by-ai/
[2] https://www.technologyreview.com/2023/08/29/1078620/google-deepmind-has-launched-a-watermarking-tool-for-ai-generated-images/
[3] https://www.nytimes.com/interactive/2023/02/17/business/ai-text-detection.html

I doubt whether anything we do is going to make a huge difference; one
of the largest uses of OpenAI's ChatGPT is to generate text to enable
Search Engine Optimization spam[4].  Another major use of LLMs is to
lay off journalists by creating text explaining why a particular stock
went up by X% when the market went up or down by Y%.  After all, why
have a human making up stories explaining stock moves, when
you can have an AI model hallucinate them instead?  :-)

[4] https://www.opace.co.uk/blog/blog/how-openai-gpt-3-enhances-ai-chat-text-generation-for-seo

The bottom line is that there is a vast amount of AI-generated text
that has been put out on the web *already*.  This is going to be
poisoning future LLM training, even before we start generating
summaries of LKML traffic and making them available on the web.  It
also means that companies who are doing AI work have a large, vested
interest in developing standardized ways of watermarking AI-generated
content --- not just because they made a promise to some politicians,
but because, if all the companies can use some common watermarking
standard, they can hopefully all avoid this self-poisoning feedback loop.

Cheers,

					- Ted

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Toy/demo: using ChatGPT to summarize lengthy LKML threads (b4 integration)
  2024-02-28 18:55 ` Bart Van Assche
  2024-02-29  7:18   ` Hannes Reinecke
@ 2024-02-29  9:30   ` James Bottomley
  1 sibling, 0 replies; 23+ messages in thread
From: James Bottomley @ 2024-02-29  9:30 UTC (permalink / raw)
  To: Bart Van Assche, Konstantin Ryabitsev, users, tools, workflows

On Wed, 2024-02-28 at 10:55 -0800, Bart Van Assche wrote:
> On 2/27/24 14:32, Konstantin Ryabitsev wrote:
> > I was playing with shell-gpt and wrote a quickie integration that
> > would allow
> > retrieving (slimmed-down) threads from lore, feeding them to
> > ChatGPT, and
> > asking it to provide some basic analysis of the thread contents.
> > Here's a
> > recorded demo session:
> > 
> > https://asciinema.org/a/643435
> > 
> > A few notes:
> > 
> > 1. This is obviously not a replacement for actually reading email,
> > but can
> >     potentially be a useful asset for a busy maintainer who just
> > wants a quick
> >     summary of a lengthy thread before they look at it in detail.
> > 2. This is not free or cheap! To digest a lengthy thread, you can
> > expect
> >     ChatGPT to generate enough tokens to cost you $1 or more in API
> > usage fees.
> >     I know it's nothing compared to how expensive some of y'all's
> > time is, and
> >     you can probably easily get that expensed by your employers,
> > but for many
> >     others it's a pretty expensive toy. I managed to make it a bit
> > cheaper by
> >     doing some surgery on the threads before feeding them to
> > chatgpt (like
> >     removing most of the message headers and throwing out some of
> > the quoted
> >     content), but there's a limit to how much we can throw out
> > before the
> >     analysis becomes dramatically less useful.
> > 3. This only works with ChatGPT-4, as most threads are too long for
> >     ChatGPT-3.5 to even process.
> > 
> > So, the question is -- is this useful at all? Am I wasting time
> > poking in this direction, or is this something that would be of
> > benefit to any of you? If the latter, I will document how to set
> > this up and commit the thread minimization code I hacked together
> > to make it cheaper.
> 
> Please do not publish the summaries generated by ChatGPT on the web.
> If these summaries would be published on the world wide web, ChatGPT
> or other LLMs probably would use these summaries as input data. If
> there would be any mistakes in these summaries, then these mistakes
> would end up being used as input data by multiple LLMs.

I don't believe this is true: any output from an LLM trained on the web
will only add a neutral bias to the existing web content (it won't
push a learning model one way or another because it's the output
summary of the current learning).  Or, to put it another way, if mistakes
are made in the summary because of the training, training a model on
the mistaken output won't increase (or decrease) the number of mistakes
it makes next time.  Now, if the model was only partially trained it
will bias towards the partial training, but most models try to be fully
trained.

James


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Toy/demo: using ChatGPT to summarize lengthy LKML threads (b4 integration)
  2024-02-29  7:18   ` Hannes Reinecke
  2024-02-29  8:37     ` Theodore Ts'o
@ 2024-03-01  1:13     ` Bart Van Assche
  1 sibling, 0 replies; 23+ messages in thread
From: Bart Van Assche @ 2024-03-01  1:13 UTC (permalink / raw)
  To: Hannes Reinecke, Konstantin Ryabitsev, users, tools, workflows

On 2/28/24 23:18, Hannes Reinecke wrote:
> Now there's a thought. Maybe we should do exactly the opposite, and 
> posting _more_ ChatGPT generated content on the web? Sending them
> into a deadly self-enforcing feedback loop?

That's a rather negative view. Besides concerns about the energy spent
on training LLMs, in my opinion LLMs can be very useful. If I need to
look up information about a kernel subsystem that I'm not very familiar
with, I type the question into the user interface of an LLM. While it
can be very hard to look up information in mailing list archives with
a traditional web search engine, LLMs can give accurate answers quickly.
Here is an example of such a question: "Why does the Linux kernel cgroup
implementation propagate statistics to parent cgroups periodically
instead of immediately?" Both ChatGPT and Gemini produce reasonable
answers to this question, but the web search engines that I tried did not.

Bart.

^ permalink raw reply	[flat|nested] 23+ messages in thread

end of thread, other threads:[~2024-03-01  1:13 UTC | newest]

Thread overview: 23+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-02-27 22:32 Toy/demo: using ChatGPT to summarize lengthy LKML threads (b4 integration) Konstantin Ryabitsev
2024-02-27 23:35 ` Junio C Hamano
2024-02-28  0:43 ` Linus Torvalds
2024-02-28 20:46   ` Shuah Khan
2024-02-29  0:33   ` James Bottomley
2024-02-28  5:00 ` Willy Tarreau
2024-02-28 14:03   ` Mark Brown
2024-02-28 14:39     ` Willy Tarreau
2024-02-28 15:22     ` Konstantin Ryabitsev
2024-02-28 15:29       ` Willy Tarreau
2024-02-28 17:52         ` Konstantin Ryabitsev
2024-02-28 17:58           ` Willy Tarreau
2024-02-28 19:16             ` Konstantin Ryabitsev
2024-02-28 15:04   ` Hannes Reinecke
2024-02-28 15:15     ` Willy Tarreau
2024-02-28 17:43     ` Jonathan Corbet
2024-02-28 18:52       ` Alex Elder
2024-02-28 18:55 ` Bart Van Assche
2024-02-29  7:18   ` Hannes Reinecke
2024-02-29  8:37     ` Theodore Ts'o
2024-03-01  1:13     ` Bart Van Assche
2024-02-29  9:30   ` James Bottomley
2024-02-28 19:32 ` Luis Chamberlain
