Re: Script to generate ChangeLogs automatically
From: Joseph Myers
Subject: Re: Script to generate ChangeLogs automatically
Date: Mon, 26 Nov 2018 23:09:48 +0000
User-agent: Alpine 2.21 (DEB 202 2017-01-01)

On Mon, 26 Nov 2018, Richard Stallman wrote:
> > I think "in all cases" is not realistic; there are cases of structural
> > changes where any description in terms of named entities will be a mess,
>
> I doubt that. Whatever the changes were, it is possible to list
> the entities that were changed, the entities that were deleted, and
> the entities that were added.

And when the actual change is a rearrangement of the contents of the file,
or a rearrangement between multiple files, or a change to the surrounding
#if conditionals rather than the individual functions, such a list is a
mess and useless for actually understanding the change. These are not
rare kinds of changes; they are common in glibc.

Furthermore, you have unnamed entities - for example, in GCC machine
descriptions, unnamed define_split constructs. If you want to find past
changes to such a construct, you have to use tools like "git blame", as
there is no possible name to search for.

Furthermore, as I noted in January, there are cases in glibc where, while
there is arguably a name, it's not a very helpful one - in makefiles, it
can be something like

  $(addprefix $(objpfx),$(filter-out $(tests-static) $(libm-vec-tests),$(tests)))

(being something that appeared on the left hand side of ':' in a makefile
rule, where the change in question was modifying the name by changing
$(libm-vec-tests) to $(libm-tests-vector)). In the makefile it appeared
with backslash-newlines in the name. Nobody can realistically search for
such a name in a ChangeLog; they'd need to guess exactly how whitespace
was inserted / removed for the line continuations in the makefile, and
how the name was then line-wrapped in the ChangeLog, before they had
something that would match the text in the ChangeLog.
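
A minimal sketch of that matching problem, assuming a reader wanted to compare a name as written in the makefile with the same name as written in a ChangeLog. The helpers normalize_name and same_name are hypothetical and are not part of glibc or of any ChangeLog tooling; a plain text search does none of this normalization, which is the difficulty.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy SRC to DST, treating each backslash-newline
   pair as whitespace and collapsing every run of whitespace to a single
   space.  Leading and trailing whitespace is dropped.  DST must have
   room for strlen (SRC) + 1 bytes.  */
static void
normalize_name (const char *src, char *dst)
{
  int pending_space = 0;
  char *out = dst;

  for (; *src != '\0'; src++)
    {
      if (src[0] == '\\' && src[1] == '\n')
        {
          /* A makefile line continuation.  */
          pending_space = 1;
          src++;
        }
      else if (isspace ((unsigned char) *src))
        pending_space = 1;
      else
        {
          if (pending_space && out != dst)
            *out++ = ' ';
          pending_space = 0;
          *out++ = *src;
        }
    }
  *out = '\0';
}

/* Hypothetical helper: return nonzero if A and B are the same name up
   to whitespace and line continuations.  Names of 512 bytes or more
   are rejected rather than truncated.  */
static int
same_name (const char *a, const char *b)
{
  char na[512], nb[512];

  if (strlen (a) >= sizeof na || strlen (b) >= sizeof nb)
    return 0;
  normalize_name (a, na);
  normalize_name (b, nb);
  return strcmp (na, nb) == 0;
}
```

With these helpers, a name split across lines with backslash-newline compares equal to its single-line form, whereas a direct strcmp of the two spellings does not.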
It's these sorts of cases, where there is a mismatch between the nature of
the change and the ChangeLog concept of changes that split into subchanges
to well-defined entities with well-defined short names, where writing the
ChangeLog entries can be the most work, *and* any ChangeLog entry is the
least useful for understanding the change, *and* a script is going to make
the most mess of describing the changes. I don't think expecting scripts
to do anything sensible in such cases is useful, because even a
human-written ChangeLog entry is extremely unhelpful for understanding
such changes.

> > from the use of macros to
> > generate function definitions that makes it hard to identify relevant
> > entities
>
> Indeed, the script to do this needs to be able to handle any nonstandard
> entity-defining constructs used in the package at hand. But I expect
> that not to be very hard.
>
> How many such constructs are used in glibc? Could you post a list
> of what they are and what they look like?

For a package developed by many different people over 30 years, and with
code taken from a range of third-party sources (BSD etc.), and various
different languages in use, and about 17000 source files, naturally we
can't identify all places with such peculiarities. But for example you
can have function names generated by macros, e.g.

  FLOAT
  INTERNAL (STRTOF) (const STRING_TYPE *nptr, STRING_TYPE **endptr, int group)

(you could say the function is INTERNAL (STRTOF)), or

  CFLOAT
  M_DECL_FUNC (__cacos) (CFLOAT x)

and then the same file as the latter has, after the function definition,

  declare_mgen_alias (__cacos, cacos);

where the precise set of function aliases created by declare_mgen_alias
depends on details of the glibc configuration, so it's hardly clear what
entity name should be used for any change to the declare_mgen_alias call
(or for calls to other such alias-creating macros). Or in
tst-strtod-nan-locale-main.c we have

  #define TEST_STRTOD(FSUF, FTYPE, FTOSTR, LSUF, CSUF) \
  static int \
  test_strto ## FSUF (const char * loc, CHAR * s) \
  { \
    CHAR *ep; \
    FTYPE val = FNX (FSUF) (s, &ep); \
    if (isnan (val) && *ep == 0) \
      printf ("PASS: %s: " FNPFXS #FSUF " (" SFMT ")\n", loc, s); \
    else \
      { \
        printf ("FAIL: %s: " FNPFXS #FSUF " (" SFMT ")\n", loc, s); \
        return 1; \
      } \
    return 0; \
  }
  GEN_TEST_STRTOD_FOREACH (TEST_STRTOD)

where GEN_TEST_STRTOD_FOREACH generates multiple calls to the macro whose
name is passed as an argument, for different floating-point types - in
this case, that means generating multiple function definitions. If you
change the TEST_STRTOD macro you can say TEST_STRTOD is the named entity
changed - but if you change the call to GEN_TEST_STRTOD_FOREACH, it's much
less clear how that relates to any one named entity.
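
The same pattern can be reduced to a small self-contained sketch. All of the names here (SKETCH_FOREACH, DEFINE_TWICE, twice_f, twice_d, twice_l) are invented for illustration and do not appear in glibc; the sketch only assumes standard C preprocessor behaviour.

```c
/* Hypothetical illustration only; none of these names exist in glibc.  */

/* Invoke the macro named by M once per floating-point type.  */
#define SKETCH_FOREACH(M) \
  M (f, float) \
  M (d, double) \
  M (l, long double)

/* Define one function per invocation.  The function name is built by
   token pasting, so it never appears literally in the source.  */
#define DEFINE_TWICE(SUF, TYPE) \
  static TYPE \
  twice_ ## SUF (TYPE x) \
  { \
    return x + x; \
  }

/* This single line defines twice_f, twice_d and twice_l.  */
SKETCH_FOREACH (DEFINE_TWICE)
```

A text search for twice_d finds no definition in this file, and a change to the final SKETCH_FOREACH line affects all three functions at once without touching any one of them by name.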
In all of these cases, there is no obstacle to using "git blame", or "git
log -L <start-regex>,<end-regex>" with appropriate regular expressions, to
track changes to the code in question - whereas even if you invent an
answer to what the canonical entity name should be in each of those cases,
you can't expect subsequent readers to come up with the same entity name
when looking for changes. That's not a single command to search for
changes to the entity, independent of what the entity is - you need to
understand the git tools in question and select an appropriate command for
the code you're looking at - but use of those tools is a much more
reliable way of finding changes in such cases than attempting to generate
a canonical entity name that can then be searched for in a ChangeLog
(whether automatically generated or manually written).

If the entity name is a single C identifier, not generated through macros,
it's clear enough what the name is and people might search for it. If
it's generated through macros, or some other construct as in the makefile
example, or if the name involves qualification by a class or namespace
name, or by argument types as in a C++ overloaded function, the right name
becomes much less clear, and so searching by name becomes much less
helpful. (You could e.g. search by name and find changes in *all* of the
many different overloaded functions with the same name but different
argument types - or you could use "git blame" to look at just the changes
to the particular implementation of interest.)
--
Joseph S. Myers
address@hidden