Re: Circumstances in which ChangeLog format is no longer useful
From: Joseph Myers
Subject: Re: Circumstances in which ChangeLog format is no longer useful
Date: Mon, 31 Jul 2017 12:55:25 +0000
User-agent: Alpine 2.20 (DEB 67 2015-01-07)
On Fri, 28 Jul 2017, Alfred M. Szmidt wrote:
> > 1. The package has a public version control system.
> >
> > (Rationale: this ensures people can see what changed, just as with
> > ChangeLogs, but can see *exactly* what changed rather than just the
> > brief descriptions.)
> >
> > I think that rationale is incorrect, just because you have a public
> > version control system does not mean that you can see what actually
> > changed. Going through multiple megabytes of diffs is not feasible,
> > and searching for when something was renamed, added, removed, etc is
> > something no tool is capable of providing.
>
> That's a function of a busy project and is the same whether you look at
> commit logs, diffs or ChangeLog messages.
>
> That doesn't address how you go through a diff to see when a function
> was added/removed/renamed/moved/..., neither diff nor annotate do
> those things -- nor can they in the general case. Think non-C
Check for appropriate regular expressions in the diffs. In the simple
sorts of cases that the ChangeLog format can readily describe, such
regular expressions will also work reasonably reliably. In more
complicated cases, they may not, but the more complicated cases also
aren't well-described in terms of individual named entities as required in
ChangeLog format.
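As a rough illustration of the regular-expression approach, here is a minimal Python sketch that scans unified diff text for added or removed lines that look like a function definition. It assumes GNU-style C formatting, where the function name starts in the first column of its definition; the function name `frob`, the file paths and the helper `definition_changes` are all hypothetical. On a real repository, `git log -G<regex>` applies a similar filter directly to the history.

```python
import re

def definition_changes(diff_text, name):
    """Find added/removed lines that look like a definition of `name`.

    Assumes GNU-style C, where a definition puts the function name in
    column 0, so uses such as "x = frob (1);" are not reported.
    """
    pattern = re.compile(r"^([+-])" + re.escape(name) + r"\s*\(")
    current_file = None
    hits = []
    for line in diff_text.splitlines():
        # File headers look like "--- a/path" and "+++ b/path".
        if line.startswith("--- ") or line.startswith("+++ "):
            path = line[4:]
            if path != "/dev/null":
                # Drop the leading "a/" or "b/" component.
                current_file = path.split("/", 1)[-1]
            continue
        match = pattern.match(line)
        if match:
            kind = "added" if match.group(1) == "+" else "removed"
            hits.append((kind, current_file, line[1:]))
    return hits

# Hypothetical diff in which frob moves from one file to another.
SAMPLE_DIFF = """\
--- a/string/old.c
+++ b/string/old.c
@@ -1,5 +0,0 @@
-int
-frob (int x)
-{
-  return x + 1;
-}
--- a/string/new.c
+++ b/string/new.c
@@ -0,0 +1,6 @@
+int
+frob (int x)
+{
+  return x + 1;
+}
+int y = frob (1);
"""

for kind, path, text in definition_changes(SAMPLE_DIFF, "frob"):
    print(kind, path, text)
```

A removal in one file paired with an addition in another is the signature of a move. The pattern is deliberately simple and would need adjusting for other coding styles or languages.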
> language, weird configuration files, etc. And they are still not
Non-C languages are also a case that often doesn't work well with
ChangeLog format - consider e.g. a C++ member function where identifying
it for the ChangeLog requires a long fully-qualified name, complete with
argument types to identify which overload is being modified.
> available in binary packages or tarball releases.
Binary releases likely don't include the ChangeLogs anyway. E.g. Ubuntu
ships only the NEWS file with binary releases, not the ChangeLogs;
likewise a CentOS system I have to hand.
I think of tarball releases as just being one output of the development
process. They are an essential output - to define immutably the contents
of a particular version number, in the same way for everyone, so everyone
can get that version and reproducibly get the same sources, do a
reproducible build and get the same binaries - but I don't think we should
expect much visibility into the development process from them.
Understanding the development process requires many other sources of
information, such as the version control history, the mailing list
archives, the issue tracker, ....
> I wouldn't object to shipping the version control history in tarballs, if
> necessary to stop having to write in the ChangeLog format (or having
> tarballs with and without the version control history).
>
> That would balloon the tarball so much that it would be unacceptable,
> emacs has a .git directory that is around 1.8G on my machine, glibc
> would be 400M with .git and all source files.
A freshly packed glibc checkout (git remote prune origin; git reflog
expire --expire=now --all; git gc --prune=all --aggressive) takes about
130 MB for the .git directory, and I end up with a .tar.xz of about 135
MB. Of course that includes all branches, and with just the history in
the ancestry of the release in question it would be a bit smaller. And
the -with-history.tar.xz could be separate from the normal version, since
it's just a backup of the VCS data in case the VCS data is otherwise
somehow lost (which is a lot less likely with a distributed VCS than a
centralized one).
> But I believe that people wanting to look at the history are going
> to check out the repository rather than attempting to get it from
> tarballs.
>
> I can only speak from my own experience, but I always peruse the
> ChangeLog file first. Only when a project is badly maintained do I go
> for the VCS.
I look at the VCS first, at least when a distributed VCS is in use so I
don't need to wait for the VCS server for each log inspection. I used to
look at ChangeLogs, but that's now a pretty old-fashioned, GNU-specific
approach.
> I am having a hard time taking the "completely useless", "waste of
> time" argument seriously when the ChangeLog files (be it in VCS, or a
> file) are in fact used to do exactly that: to understand how code moves
> in a project. If they were so useless and wasteful then most GNU
> projects would have abandoned them many many years ago, and yet we
> still use them very actively, gcc and glibc being just two
> examples.
They're created in GCC and glibc because the GNU Coding Standards require
them, not because (in the context of having VCS history, mailing list
archives, issue trackers, etc. to provide a much richer understanding of
the development history relevant to any particular issue) they are useful
for something not covered by those other sources of information.
> This is what I would have done, you have only 8 specific changes
> touching multiple files, there is no need to repeat them several times
> and one could even reduce this a bit further by merging the file lines
> into one. You can even skip the "Likewise." part completely.
I don't think this is any better. If anything it obscures the essential
nature of the change, which is very much "take each relevant file, apply
this class of fixes to it", where links between identifiers of the same
name in different files are entirely incidental.
> Writing accurate, and descriptive ChangeLogs is just like writing
> accurate documentation, both can be wrong, but so can code. This
> falls onto the maintainer to see that all things are good. Just using
> VCS won't solve that.
The VCS history is automatically a completely accurate record of the
history of the code in a way that the ChangeLogs aren't.
> The point of the VCS is to be able to undo changes. ChangeLog files, and
> the form of change description therein, are in no way a substitute for the
> VCS, and are essentially obsoleted by it.
>
> The ChangeLog is for human consumption, to understand how the code was
> changed, VCS does not solve this, not everything is prettily managed
That understanding effectively requires the tools such as VCS, mailing
lists, issue trackers etc. that give a rich structure to the history
information.
For users, we have the NEWS file. For people looking at how things
changed at the development level, a natural process is: look at the VCS
logs (describing changes logically rather than physically), then
potentially delve into diffs, list archives, issue trackers, etc. for
individual changes that seem of interest.
> Why do you think that the ChangeLog can't mention the above? Or maybe
The problem isn't that it can't mention the logical nature of the change.
The problem is that given the logical description, in the VCS log, and
given the diffs themselves, in the VCS history, and given the mailing list
archives, bug tracker, etc. that also form part of the development
process, writing a second description of the change, decomposed into
descriptions at the level of individual named entities in individual
files, has net negative utility; any benefit where someone is interested
in that very specific level of information (for a change that doesn't map
well onto that level of information) is outweighed by the extra work
involved in writing that description of extremely niche use, by the time
it takes away from substantive development and writing descriptions at the
logical level, and by putting off free software developers because of the
need to jump through this hoop not needed for non-GNU projects.
> better yet, since this is a bug fix in a BUGS file or similar.
glibc has automatically-generated lists of fixed bugs in each release
created from Bugzilla data just before the release and inserted in the
NEWS file.
> Neither of those are available in tarballs, nor might they be
As I said, I think tarballs are just one output of the development
process. It's not appropriate to expect that the rich data of the
interactions involved on mailing lists, issue trackers, patch review
systems, version control, etc. can be adequately serialized into or
understood through a ChangeLog; a proper understanding of the development
process requires using those other tools.
> available in distribution packages where putting up a copy of
> ChangeLog can be very useful as well even if you do not have access to
In practice distribution packages likely do not include ChangeLogs.
> source code. The NEWS file does not describe code changes, only
> user-visible changes, so it is not very useful if you are in fact
> looking for a problem in a new release.
The expectation is to look at VCS logs and use tools such as "git bisect".
(Which could be used to look for e.g. when a function moved, if desired,
not just to test properties of the compiled code.)
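To make the bisection idea concrete, here is a minimal Python sketch of the binary search involved, simplified to a linear history (real "git bisect" also handles branching and merged history). The revision names, the `HISTORY` data and the helper `first_revision_where` are hypothetical; the predicate plays the role of the test run at each step, here "is the function already in its new file?".

```python
def first_revision_where(revisions, predicate):
    """Return the first revision for which predicate is true.

    Assumes a linear history, oldest first, in which the predicate is
    false for every revision before the change and true from the
    change onwards. Returns None if it is never true.
    """
    if not revisions or not predicate(revisions[-1]):
        return None
    lo, hi = 0, len(revisions) - 1
    while lo < hi:
        mid = (lo + hi) // 2
        if predicate(revisions[mid]):
            hi = mid          # change happened at mid or earlier
        else:
            lo = mid + 1      # change happened after mid
    return revisions[lo]

# Hypothetical history: (revision id, file that defines frob).
HISTORY = [
    ("r1", "string/old.c"),
    ("r2", "string/old.c"),
    ("r3", "string/old.c"),
    ("r4", "string/new.c"),
    ("r5", "string/new.c"),
    ("r6", "string/new.c"),
]

def moved(revision):
    # Stand-in for checking out the revision and grepping the tree.
    return revision[1] == "string/new.c"

print(first_revision_where(HISTORY, moved)[0])
```

The search needs only about log2(N) checks for N revisions, which is why bisecting stays practical even on long histories.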
--
Joseph S. Myers
address@hidden