* Stats on Xen tarball downloads
@ 2024-02-19 10:01 George Dunlap
  2024-02-19 10:31 ` Roger Pau Monné
  2024-02-19 15:34 ` Elliott Mitchell
  0 siblings, 2 replies; 9+ messages in thread
From: George Dunlap @ 2024-02-19 10:01 UTC (permalink / raw)
  To: Xen-devel; +Cc: committers, Kelly Choi

[-- Attachment #1: Type: text/plain, Size: 3282 bytes --]

Hey all,

One of the questions we had with respect to changing our release
practice (for instance, making the process more light-weight so that
we could do a point release after every XSA) was, "How many people are
actually using the tarballs?"  I finally got access (again) to
downloads.xenproject.org, and took a look at the logs.  It appears
that we only keep about 2 weeks of logs there.

Short answer: It's pretty clear from looking at the logs that there
are large numbers of automated build systems building various versions
of Xen from tarballs.  It *looks* like there are over 300 people a
week downloading 4.18.0 specifically from various web browsers.

Attached is a report generated by goaccess on the log after filtering
for xen-4.18.0.tar.gz (which includes the .sig), as well as a report
filtering on any Xen release (again including .sig files).

Between 6 and 19 Feb, the xen-4.18.0 tarball was downloaded by 704
"visitors" (unique by IP address, user agent, and date).  There are a
handful of IPs that had more than one "visit", but the vast majority
had only one "visit".  (The "Visitor Hostname and IPs" tab is somewhat
confusing on this point; it lists 800+ visits, but only 366 unique IP
addresses.  I verified independently that there were around 800 unique
IP addresses in the logs, so this may just be a limit as to how much
data is included in the report.)
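
For anyone who wants to cross-check the goaccess numbers by hand, here
is a minimal sketch of the "visitor" definition used above (unique by
IP address, user agent, and date).  It assumes the server writes the
common "combined" access-log format; the function name and the default
filter string are only illustrative, not how goaccess itself works.

```python
import re

# Combined Log Format (assumed):
# ip ident user [day:time zone] "METHOD path PROTO" status bytes "referer" "agent"
LOG_RE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<day>[^:]+):[^\]]+\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) \S+ '
    r'"[^"]*" "(?P<agent>[^"]*)"'
)


def count_visitors(lines, needle="xen-4.18.0.tar.gz"):
    """Return (visitors, unique_ips) for requests whose path contains needle.

    A "visitor" is a distinct (IP, user agent, calendar day) triple, so a
    .tar.gz fetch and its .sig from the same client on one day count once.
    """
    visitors = set()
    ips = set()
    for line in lines:
        match = LOG_RE.match(line)
        if not match or needle not in match.group("path"):
            continue  # unparseable line, or a different file
        visitors.add((match.group("ip"), match.group("agent"), match.group("day")))
        ips.add(match.group("ip"))
    return len(visitors), len(ips)
```

Calling count_visitors() on an open log file gives both figures at
once, which makes a mismatch like the 800-vs-366 one easy to spot.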

Of the user agents, only 48 of these visits are classified as
"crawlers"; the vast majority are (in order) Chrome, Firefox, Edge,
Opera, or Safari.  There are a handful (<20) of downloads from
non-browser, non-bot/crawler agents (all wget).

Looking at the *non*-4.18 downloads, nearly all of them have user
agents that make it clear they're part of automated build systems:
user agents like curl and wget, but also "Go-http-client", "libfetch",
and "ansible-http".  There are several references to package managers
as well (xbps, pkgmon, slackrepo).  What is *not* significantly
represented are user-agent strings that look like web browsers; there
are intermittent ones, but not very many.
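
As a rough illustration of the split described above, the following
sketch buckets user-agent strings into "automation", "browser", and
"other".  The hint lists are taken from the agents named in this mail;
the substring matching is my own simplification and is only a
heuristic, since a user agent says nothing certain about whether a
human typed the command.

```python
from collections import Counter

# Substrings (lower-case) seen in the non-4.18 downloads.
AUTOMATION_HINTS = (
    "curl", "wget", "go-http-client", "libfetch", "ansible-http",
    "xbps", "pkgmon", "slackrepo",
)
# Substrings that typically appear in desktop browser user agents.
BROWSER_HINTS = ("chrome", "firefox", "edg/", "opr/", "opera", "safari")


def classify_agent(agent):
    """Return 'automation', 'browser', or 'other' for one user-agent string."""
    lowered = agent.lower()
    if any(hint in lowered for hint in AUTOMATION_HINTS):
        return "automation"
    if any(hint in lowered for hint in BROWSER_HINTS):
        return "browser"
    return "other"


def tally_agents(agents):
    """Count how many user-agent strings fall into each bucket."""
    return Counter(classify_agent(agent) for agent in agents)
```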

It's not really clear to me why we'd be getting 300-ish people
downloading the Xen 4.18.0 tarball, 2/3 of which are on Windows.  But
then I'm also not sure why someone would *fake* hundreds of downloads
a week from unique IP addresses; and in particular, if you were going
to fake hundreds of downloads a week, I'm not sure why you'd only fake
the most recent release.

I think we can fairly confidently conclude that there are regular
users of older versions of the Xen tarballs, as part of automated
build systems which could be disrupted by any significant change to
the way the tarballs worked.

I think we can *tentatively* conclude that there are hundreds of
people per week downloading the most recent release tarball via the
website.  It would be interesting to see if we could determine some
way of trying to evaluate how many of those resulted in a build (e.g.,
by looking at "extfiles" downloads or something).

More conclusions than that (e.g., whether it's worth changing the
tarball layout to make it more automate-able, whether it's worth
investing time making the build-from-tarball experience better, and/or
pointing people to better ways to get Xen) I haven't considered yet.

 -George

[-- Attachment #2: 4-18-report.html --]
[-- Type: text/html, Size: 656135 bytes --]

[-- Attachment #3: report.html --]
[-- Type: text/html, Size: 701205 bytes --]


* Re: Stats on Xen tarball downloads
  2024-02-19 10:01 Stats on Xen tarball downloads George Dunlap
@ 2024-02-19 10:31 ` Roger Pau Monné
  2024-02-19 10:38   ` Jan Beulich
  2024-02-19 15:34 ` Elliott Mitchell
  1 sibling, 1 reply; 9+ messages in thread
From: Roger Pau Monné @ 2024-02-19 10:31 UTC (permalink / raw)
  To: George Dunlap; +Cc: Xen-devel, committers, Kelly Choi

On Mon, Feb 19, 2024 at 06:01:54PM +0800, George Dunlap wrote:
> Hey all,
> 
> One of the questions we had with respect to changing our release
> practice (for instance, making the process more light-weight so that
> we could do a point release after every XSA) was, "How many people are
> actually using the tarballs?"

What would this more lightweight process involve from a downstream
PoV?  IOW: how would the contents of the tarball change compared
to the current releases?

> I finally got access (again) to
> downloads.xenproject.org, and took a look at the logs.  It appears
> that we only keep about 2 weeks of logs there.
> 
> Short answer: It's pretty clear from looking at the logs that there
> are large numbers of automated build systems building various versions
> of Xen from tarballs.  It *looks* like there are over 300 people a
> week downloading 4.18.0 specifically from various web browsers.

As someone who packages Xen for FreeBSD, I've recently switched the
build to use the git sources directly, as otherwise keeping up with
XSAs tends to be a pain, especially when XSAs happen to depend on the
context of some of the backports that happened between the point
release and the XSA disclosure.

Overall as a consumer of Xen it would be helpful if we could make a
release for each (batch of) XSAs, as that would possibly make me
switch to building from the release tarballs instead of git.

I don't think it would be much of a disruption if such a change to
generate more lightweight tarballs is made starting from a major
release (i.e. 4.19), while minor releases of previous versions (4.18.x)
keep using the non-lightweight process.

Thanks, Roger.



* Re: Stats on Xen tarball downloads
  2024-02-19 10:31 ` Roger Pau Monné
@ 2024-02-19 10:38   ` Jan Beulich
  2024-02-21  2:55     ` George Dunlap
  0 siblings, 1 reply; 9+ messages in thread
From: Jan Beulich @ 2024-02-19 10:38 UTC (permalink / raw)
  To: Roger Pau Monné, George Dunlap; +Cc: Xen-devel, committers, Kelly Choi

On 19.02.2024 11:31, Roger Pau Monné wrote:
> On Mon, Feb 19, 2024 at 06:01:54PM +0800, George Dunlap wrote:
>> One of the questions we had with respect to changing our release
>> practice (for instance, making the process more light-weight so that
>> we could do a point release after every XSA) was, "How many people are
>> actually using the tarballs?"
> 
> What would this more lightweight process involve from a downstream
> PoV?  IOW: in what would the contents of the tarball change compared
> to the current releases?

From all prior discussion my conclusion was "no tarball at all".

Jan



* Re: Stats on Xen tarball downloads
  2024-02-19 10:01 Stats on Xen tarball downloads George Dunlap
  2024-02-19 10:31 ` Roger Pau Monné
@ 2024-02-19 15:34 ` Elliott Mitchell
  2024-02-21  2:52   ` George Dunlap
  1 sibling, 1 reply; 9+ messages in thread
From: Elliott Mitchell @ 2024-02-19 15:34 UTC (permalink / raw)
  To: George Dunlap; +Cc: Xen-devel, committers, Kelly Choi

On Mon, Feb 19, 2024 at 06:01:54PM +0800, George Dunlap wrote:
> 
> Looking at the *non*-4.18 downloads, nearly all of them have user
> agents that make it clear they're part of automated build systems:
> user agents like curl and wget, but also "Go-http-client", "libfetch",
                   ^^^^     ^^^^

I reject this claim.  `curl` or `wget` could be part of an interactive
operation.  Telling a browser to copy a URL into the paste buffer, then
using `wget`/`curl` is entirely possible.  I may be the outlier, but I
routinely do this.

I don't know whether Gentoo's `emerge` uses `wget`/`curl`, but that could
be semi-interactive.


> It's not really clear to me why we'd be getting 300-ish people
> downloading the Xen 4.18.0 tarball, 2/3 of which are on Windows.  But
> then I'm also not sure why someone would *fake* hundreds of downloads
> a week from unique IP addresses; and in particular, if you were going
> to fake hundreds of downloads a week, I'm not sure why you'd only fake
> the most recent release.

Remember the browser wars?  At one point many sites were looking for
IE/Windows and sending back error messages without those.  Getting the
tarball on Windows doesn't seem too likely; faking the browser was
pretty common for a while.


-- 
(\___(\___(\______          --=> 8-) EHM <=--          ______/)___/)___/)
 \BS (    |         ehem+sigmsg@m5p.com  PGP 87145445         |    )   /
  \_CS\   |  _____  -O #include <stddisclaimer.h> O-   _____  |   /  _/
8A19\___\_|_/58D2 7E3D DDF4 7BA6 <-PGP-> 41D1 B375 37D0 8714\_|_/___/5445





* Re: Stats on Xen tarball downloads
  2024-02-19 15:34 ` Elliott Mitchell
@ 2024-02-21  2:52   ` George Dunlap
  0 siblings, 0 replies; 9+ messages in thread
From: George Dunlap @ 2024-02-21  2:52 UTC (permalink / raw)
  To: Elliott Mitchell; +Cc: Xen-devel, committers, Kelly Choi

On Mon, Feb 19, 2024 at 11:34 PM Elliott Mitchell <ehem+xen@m5p.com> wrote:
>
> On Mon, Feb 19, 2024 at 06:01:54PM +0800, George Dunlap wrote:
> >
> > Looking at the *non*-4.18 downloads, nearly all of them have user
> > agents that make it clear they're part of automated build systems:
> > user agents like curl and wget, but also "Go-http-client", "libfetch",
>                    ^^^^     ^^^^
>
> I reject this claim.  `curl` or `wget` could be part of an interactive
> operation.  Telling a browser to copy a URL into the paste buffer, then
> using `wget`/`curl` is entirely possible.  I may be the outlier, but I
> routinely do this.

It's not just the user agent; there are certain statistical
regularities that make me think it's automated: e.g., a specific
version of curl always downloading a specific version of the tarball,
or the tar.gz and the tar.gz.sig always being downloaded exactly the
same interval apart.  There certainly *are* manual wget / curl invocations,
but the majority of them look to me like they're part of automated
systems.
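
To make the "same interval apart" check concrete, here is a small
sketch of one way to test for it.  It is not the analysis I actually
ran; the event tuple layout, the function names, and the thresholds
(three samples, one second of spread) are assumptions chosen for
illustration.

```python
from statistics import pstdev


def sig_gaps(events):
    """Map each client to the delays between a .tar.gz and its .sig.

    events is an iterable of (client, path, unix_time) tuples; a client
    could be an IP address or an (IP, user agent) pair.
    """
    pending = {}  # (client, tarball path) -> time the tarball was fetched
    gaps = {}
    for client, path, when in sorted(events, key=lambda event: event[2]):
        if path.endswith(".tar.gz.sig"):
            start = pending.pop((client, path[: -len(".sig")]), None)
            if start is not None:
                gaps.setdefault(client, []).append(when - start)
        elif path.endswith(".tar.gz"):
            pending[(client, path)] = when
    return gaps


def looks_scripted(gap_list, min_samples=3, max_spread=1.0):
    """True if there are enough gaps and they barely vary (in seconds)."""
    return len(gap_list) >= min_samples and pstdev(gap_list) <= max_spread
```

A human pasting URLs by hand would be expected to show widely varying
gaps, whereas a build script tends to produce nearly identical ones.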

(And the "Go-http-client" instances are kind of fascinating to me --
someone wrote something in Go to download the Xen tarball?  And
always 4.15.1?  And it's being run from both NH, USA and from Finland,
and a handful of other places that seem unrelated?  What project is
this?)

> > It's not really clear to me why we'd be getting 300-ish people
> > downloading the Xen 4.18.0 tarball, 2/3 of which are on Windows.  But
> > then I'm also not sure why someone would *fake* hundreds of downloads
> > a week from unique IP addresses; and in particular, if you were going
> > to fake hundreds of downloads a week, I'm not sure why you'd only fake
> > the most recent release.
>
> Remember the browser wars?  At one point many sites were looking for
> IE/Windows and sending back error messages without those.  Getting the
> tarball on Windows doesn't seem too likely, faking the browser was
> pretty common for a while.

Right, which is why I wanted to look more into the rest of the data to
see if I could get a feel for it.  There are very few Windows user
agents for the other versions; the handful of browser agents for
non-4.18.0 tarballs look very normal and unix-y.  So the question is,
why would you fake loads of downloads for Chrome / Firefox / Edge on
Windows *only* for 4.18.0?

I agree that none of the current explanations make a lot of sense; but
I continue to believe that the "We have loads of actual humans
downloading the 4.18.0 tarball via browsers, even on Windows" is the
least-bad fit.  (Feel free to propose others, though.)

 -George



* Re: Stats on Xen tarball downloads
  2024-02-19 10:38   ` Jan Beulich
@ 2024-02-21  2:55     ` George Dunlap
  2024-02-21 22:53       ` Julien Grall
  0 siblings, 1 reply; 9+ messages in thread
From: George Dunlap @ 2024-02-21  2:55 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Roger Pau Monné, Xen-devel, committers, Kelly Choi

On Mon, Feb 19, 2024 at 6:38 PM Jan Beulich <jbeulich@suse.com> wrote:
>
> On 19.02.2024 11:31, Roger Pau Monné wrote:
> > On Mon, Feb 19, 2024 at 06:01:54PM +0800, George Dunlap wrote:
> >> One of the questions we had with respect to changing our release
> >> practice (for instance, making the process more light-weight so that
> >> we could do a point release after every XSA) was, "How many people are
> >> actually using the tarballs?"
> >
> > What would this more lightweight process involve from a downstream
> > PoV?  IOW: in what would the contents of the tarball change compared
> > to the current releases?
>
> From all prior discussion my conclusion was "no tarball at all".

Or at the very least, the tarball would be a simple `git archive` of a
release tag.  Right now the tarball creation has a number of
annoyingly manual parts about it.
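
For concreteness, a minimal sketch of what "a simple `git archive` of a
release tag" could look like when scripted.  The tag name in the usage
note (RELEASE-4.18.0), the xen-<version>/ prefix, and the helper names
are assumptions for illustration, not an agreed process.  Note that
`git archive` only packs what is tracked in the repository, so
submodule contents and separately downloaded sources are not included.

```python
import subprocess


def git_archive_command(tag, version):
    """Build the argv for archiving one tag as xen-<version>.tar.gz."""
    name = f"xen-{version}"  # hypothetical naming, mirroring current tarballs
    return [
        "git", "archive",
        "--format=tar.gz",
        f"--prefix={name}/",      # top-level directory inside the tarball
        "-o", f"{name}.tar.gz",   # output file
        tag,
    ]


def make_tarball(repo_dir, tag, version):
    """Run git archive inside repo_dir; raises CalledProcessError on failure."""
    subprocess.run(git_archive_command(tag, version), cwd=repo_dir, check=True)
```

Something like make_tarball("xen.git", "RELEASE-4.18.0", "4.18.0")
would then write xen-4.18.0.tar.gz into the repository directory.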


 -George



* Re: Stats on Xen tarball downloads
  2024-02-21  2:55     ` George Dunlap
@ 2024-02-21 22:53       ` Julien Grall
  2024-02-22  9:49         ` Roger Pau Monné
  0 siblings, 1 reply; 9+ messages in thread
From: Julien Grall @ 2024-02-21 22:53 UTC (permalink / raw)
  To: George Dunlap, Jan Beulich
  Cc: Roger Pau Monné, Xen-devel, committers, Kelly Choi

Hi George,

On 21/02/2024 02:55, George Dunlap wrote:
> On Mon, Feb 19, 2024 at 6:38 PM Jan Beulich <jbeulich@suse.com> wrote:
>>
>> On 19.02.2024 11:31, Roger Pau Monné wrote:
>>> On Mon, Feb 19, 2024 at 06:01:54PM +0800, George Dunlap wrote:
>>>> One of the questions we had with respect to changing our release
>>>> practice (for instance, making the process more light-weight so that
>>>> we could do a point release after every XSA) was, "How many people are
>>>> actually using the tarballs?"
>>>
>>> What would this more lightweight process involve from a downstream
>>> PoV?  IOW: in what would the contents of the tarball change compared
>>> to the current releases?
>>
>>  From all prior discussion my conclusion was "no tarball at all".
> 
> Or at very least, the tarball would be a simple `git archive` of a
> release tag.   Right now the tarball creation has a number of
> annoyingly manual parts about it.
At the moment we have the following steps:

1) Checkout tag
2) Create the tarball
3) Check the source tarball can build
4) Sign the tarball
5) Upload it

I managed to script it so I have only two commands to execute (mostly 
because I build and sign on a different host).

AFAIU, your command 'git archive' will only replace 2. Am I correct? If 
so, it is not entirely clear how your proposal is going to make it better.
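
To illustrate the shape of the five steps above (this is not my actual
script), here is a sketch of a runner that executes them in order and
stops at the first failure.  The step names come from the list; the
commands themselves are deliberately left to the caller, because the
real checkout, build, signing, and upload commands are not spelled out
in this thread.

```python
import subprocess

# The five steps, in the order they must run.
RELEASE_STEPS = (
    "checkout tag",
    "create tarball",
    "check tarball builds",
    "sign tarball",
    "upload",
)


def run_release(commands, runner=subprocess.run):
    """Run one argv list per step; return the names of the steps that passed.

    commands maps each name in RELEASE_STEPS to an argv list.  Execution
    stops at the first step that exits non-zero, so a tarball that fails
    to build is never signed or uploaded.
    """
    done = []
    for step in RELEASE_STEPS:
        result = runner(commands[step])
        if result.returncode != 0:
            break
        done.append(step)
    return done
```

Swapping only the "create tarball" entry for a `git archive` call
leaves the other four steps untouched, which is the point of my
question.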

Cheers,

-- 
Julien Grall



* Re: Stats on Xen tarball downloads
  2024-02-21 22:53       ` Julien Grall
@ 2024-02-22  9:49         ` Roger Pau Monné
  2024-02-22  9:56           ` Juergen Gross
  0 siblings, 1 reply; 9+ messages in thread
From: Roger Pau Monné @ 2024-02-22  9:49 UTC (permalink / raw)
  To: Julien Grall
  Cc: George Dunlap, Jan Beulich, Xen-devel, committers, Kelly Choi

On Wed, Feb 21, 2024 at 10:53:49PM +0000, Julien Grall wrote:
> Hi George,
> 
> On 21/02/2024 02:55, George Dunlap wrote:
> > On Mon, Feb 19, 2024 at 6:38 PM Jan Beulich <jbeulich@suse.com> wrote:
> > > 
> > > On 19.02.2024 11:31, Roger Pau Monné wrote:
> > > > On Mon, Feb 19, 2024 at 06:01:54PM +0800, George Dunlap wrote:
> > > > > One of the questions we had with respect to changing our release
> > > > > practice (for instance, making the process more light-weight so that
> > > > > we could do a point release after every XSA) was, "How many people are
> > > > > actually using the tarballs?"
> > > > 
> > > > What would this more lightweight process involve from a downstream
> > > > PoV?  IOW: in what would the contents of the tarball change compared
> > > > to the current releases?
> > > 
> > >  From all prior discussion my conclusion was "no tarball at all".
> > 
> > Or at very least, the tarball would be a simple `git archive` of a
> > release tag.   Right now the tarball creation has a number of
> > annoyingly manual parts about it.
> At the moment we have the following steps:
> 
> 1) Checkout tag
> 2) Create the tarball
> 3) Check the source tarball can build
> 4) Sign the tarball
> 5) Upload it
> 
> I managed to script it so I have only two commands to execute (mostly
> because I build and sign on a different host).
> 
> AFAIU, your command 'git archive' will only replace 2. Am I correct? If so,
> it is not entirely clear how your proposal is going to make it better.

IMO building from release tarballs is easier than from a git checkout
(or archive).  It's a bit annoying to have to pre-download the
external project sources, now even more as QEMU is using git
submodules.

Most distro binary builders have infrastructure to deal with all this,
but it requires a bit more logic in the recipe than simply fetching a
tarball and building from it.

Thanks, Roger.



* Re: Stats on Xen tarball downloads
  2024-02-22  9:49         ` Roger Pau Monné
@ 2024-02-22  9:56           ` Juergen Gross
  0 siblings, 0 replies; 9+ messages in thread
From: Juergen Gross @ 2024-02-22  9:56 UTC (permalink / raw)
  To: Roger Pau Monné, Julien Grall
  Cc: George Dunlap, Jan Beulich, Xen-devel, committers, Kelly Choi


[-- Attachment #1.1.1: Type: text/plain, Size: 2428 bytes --]

On 22.02.24 10:49, Roger Pau Monné wrote:
> On Wed, Feb 21, 2024 at 10:53:49PM +0000, Julien Grall wrote:
>> Hi George,
>>
>> On 21/02/2024 02:55, George Dunlap wrote:
>>> On Mon, Feb 19, 2024 at 6:38 PM Jan Beulich <jbeulich@suse.com> wrote:
>>>>
>>>> On 19.02.2024 11:31, Roger Pau Monné wrote:
>>>>> On Mon, Feb 19, 2024 at 06:01:54PM +0800, George Dunlap wrote:
>>>>>> One of the questions we had with respect to changing our release
>>>>>> practice (for instance, making the process more light-weight so that
>>>>>> we could do a point release after every XSA) was, "How many people are
>>>>>> actually using the tarballs?"
>>>>>
>>>>> What would this more lightweight process involve from a downstream
>>>>> PoV?  IOW: in what would the contents of the tarball change compared
>>>>> to the current releases?
>>>>
>>>>   From all prior discussion my conclusion was "no tarball at all".
>>>
>>> Or at very least, the tarball would be a simple `git archive` of a
>>> release tag.   Right now the tarball creation has a number of
>>> annoyingly manual parts about it.
>> At the moment we have the following steps:
>>
>> 1) Checkout tag
>> 2) Create the tarball
>> 3) Check the source tarball can build
>> 4) Sign the tarball
>> 5) Upload it
>>
>> I managed to script it so I have only two commands to execute (mostly
>> because I build and sign on a different host).
>>
>> AFAIU, your command 'git archive' will only replace 2. Am I correct? If so,
>> it is not entirely clear how your proposal is going to make it better.
> 
> IMO building for release tarballs is easier than from a git checkout
> (or archive).  It's a bit annoying to have to pre-download the
> external project sources, now even more as QEMU is using git
> submodules.
> 
> Most distro binary builders have infrastructure to deal with all this,
> but requires a bit more logic in the recipe than a plain just fetch a
> tarball and build from it.

I have an unfinished patch series lying around doing the download steps
_before_ starting the build. This includes make targets for downloading
the required components, or all components if configure should be called
afterwards.

Creating the tarball after having downloaded all components is trivial.

There are a few bugs in the series I haven't had time to fix yet. If someone
is interested in working on it, I can post the series.


Juergen


[-- Attachment #1.1.2: OpenPGP public key --]
[-- Type: application/pgp-keys, Size: 3743 bytes --]

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 495 bytes --]
