bug-gnulib

Re: bugs in dirname module


From: Eric Blake
Subject: Re: bugs in dirname module
Date: Fri, 11 Nov 2005 06:32:01 -0700
User-agent: Mozilla Thunderbird 1.0.2 (Windows/20050317)

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

According to Paul Eggert on 11/10/2005 10:57 PM:
> 
> We needn't conform strictly to the POSIX spec as far as slashes go;
> otherwise we'd be forced to treat "a/b" as not having a directory
> separator.

I think you meant 'a\b' in that comment.  But I agree: since Cygwin is
not strictly POSIX, base_name on Cygwin should match the intent of POSIX
rather than its strict semantics.

> 
> The point is that dir_name(FOO) should return the directory that FOO
> is in, and base_name(FOO) should return a file name that identifies
> FOO if the working directory is dirname(FOO).  For this example on
> Cygwin, it appears that dir_name("./a:b") should be "." (since that's
> the directory that the file is in) and base_name("./a:b") should be
> "./a:b" (since "a:b" won't do).  For all the examples you've given so
> far, it is the case that base_name(FOO) can return a suffix of FOO,
> no?

No - the filename '/blah/a:b' can be valid, but there is no suffix of that
name (short of the full '/blah/a:b') that will name the same file if cwd
is /blah.  Even worse, if 'blah/a:b' is valid in /, there is no suffix of
the original string that is also valid in /blah.  Currently, base_name has
the nice property that it neither modifies the string nor allocates
memory, and changing that semantic would have a much bigger impact.

> 
> Similarly, dir_name("//") should be "//" and base_name("//") should be
> "//" on Cygwin (I guess -- I keep forgetting how Cygwin works).

My earlier point was that we can make the semantic choice of whether
base_name("//") is "//" or "", whichever makes other code paths easier.
I'm somewhat leaning towards "", because then base_name can give an
indication that the name passed in was either empty or a root; from there,
code that needs a non-empty filename can reuse the original name.  But in
terms of POSIX, basename("//") must be "//".
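
A minimal sketch of the caller-side pattern described above, assuming
POSIX-style '/' separators only and no drive prefixes.  The names
last_component_or_empty and usable_name are hypothetical and are not
gnulib functions; the first stands in for a base_name that returns ""
for an empty name or a root.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for the proposed base_name: return a pointer
   into NAME at the start of its last component, or at the terminating
   NUL (i.e. "") when NAME is empty or consists only of slashes.  */
const char *last_component_or_empty(const char *name)
{
    assert(name != NULL);
    const char *base = name;
    for (const char *p = name; *p; p++)
        if (*p == '/' && p[1] != '\0' && p[1] != '/')
            base = p + 1;       /* a new component starts after this slash */
    if (*base == '/')
        base += strlen(base);   /* nothing but slashes: treat as a root */
    return base;
}

/* Caller that needs a non-empty file name: fall back to the original
   string when the base is empty (NAME was empty or a root).  */
const char *usable_name(const char *name)
{
    const char *base = last_component_or_empty(name);
    return *base ? base : name;
}
```

With these definitions, usable_name("//") hands back the original "//",
while usable_name("/usr/lib") yields "lib".  Neither function modifies
the string or allocates memory.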

>>I don't think it is worth complicating base_name for this extreme
>>corner case
> 
> Hmm, I'm not sure I agree.  As long as we're trying to handle these
> weird DOS names, we might as well handle all the cases.  It's not that
> hard, is it?

OK, how about this approach: keep the current rule that base_name always
returns a pointer into the original string, without modifying it.  On
systems where drive prefixes exist, have base_name("a:") return "" (it
is a root), but base_name("foo/a:b") return "a:b".

Then provide a second function, abase_name, which malloc()s the memory
for storing the last component.  Normally it will just strdup() the
base_name of the original string, but it does the right thing in the two
awkward cases: where base_name is empty (a root, so abase_name("a:")
gives "a:" instead of ""), and where the base_name of a multi-component
name looks like it starts with a drive prefix (so abase_name("foo/a:b")
gives "./a:b").

Existing uses of base_name() that feed file_name_concat() are still
valid.  It makes no sense to concat a root onto a dirname, so having
base_name return "" on roots makes file_name_concat a no-op; and where
base_name returns something that looks like a drive letter from one
multi-component name, it will still be appended to a directory name,
giving another multi-component name.  Meanwhile, usages like coreutils
basename(1) would need to switch to abase_name()/free().

base_name() would no longer be idempotent, but my audit didn't show
anyone trying to do base_name(base_name(foo)), so I doubt that matters.
Since an earlier mail of mine audited all uses of base_name in gnulib,
coreutils, findutils, and tar, it should be possible to decide quickly
which of those usages need to change to abase_name/free.
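
The proposal can be sketched roughly as follows.  This is an
illustration under stated assumptions, not the gnulib implementation:
the names sketch_base_name and sketch_abase_name are hypothetical, a
drive prefix is assumed to be a single ASCII letter followed by ':',
both '/' and '\' are treated as separators, and the handling of "a:b"
(skip the prefix, giving "b") is an assumption that the proposal does
not spell out.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

#define ISSLASH(c) ((c) == '/' || (c) == '\\')

/* Length of a leading drive prefix such as "a:", else 0 (assumed form). */
size_t drive_prefix_len(const char *name)
{
    return (isalpha((unsigned char) name[0]) && name[1] == ':') ? 2 : 0;
}

/* Return a pointer into NAME, never modifying it: the start of the last
   component, or "" (the terminating NUL) if NAME is empty or a root.  */
const char *sketch_base_name(const char *name)
{
    assert(name != NULL);
    const char *base = name + drive_prefix_len(name);
    for (const char *p = base; *p; p++)
        if (ISSLASH(*p) && p[1] != '\0' && !ISSLASH(p[1]))
            base = p + 1;       /* a new component starts here */
    if (ISSLASH(*base))
        base += strlen(base);   /* only slashes are left: a root */
    return base;
}

/* Allocate a copy of PREFIX followed by S; NULL on allocation failure. */
static char *concat_copy(const char *prefix, const char *s)
{
    size_t plen = strlen(prefix), slen = strlen(s);
    char *result = malloc(plen + slen + 1);
    if (result != NULL) {
        memcpy(result, prefix, plen);
        memcpy(result + plen, s, slen + 1);
    }
    return result;
}

/* Allocating variant: the caller must free() the result.  */
char *sketch_abase_name(const char *name)
{
    const char *base = sketch_base_name(name);
    if (*base == '\0')
        return concat_copy("", name);    /* empty or root: keep whole name */
    if (drive_prefix_len(base) > 0)
        return concat_copy("./", base);  /* would be misread as a drive */
    return concat_copy("", base);        /* ordinary case: plain copy */
}
```

Under these assumptions, sketch_base_name("a:") is "" and
sketch_base_name("foo/a:b") is "a:b", while sketch_abase_name("a:")
gives "a:" and sketch_abase_name("foo/a:b") gives "./a:b".  Note that
sketch_base_name("a:b") is "b", which shows the loss of idempotence
mentioned above when applied to the result "a:b".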

Nailing down the semantics we desire is essential before I can complete my
patch; and the use of my proposed tests/test-dirname.c will play an
important role in that.

- --
Life is short - so eat dessert first!

Eric Blake             address@hidden
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.1 (Cygwin)
Comment: Public key at home.comcast.net/~ericblake/eblake.gpg
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQFDdJ1Q84KuGfSFAYARAmZLAJ4/QoLae3UWNpgW/ThF5AUHpheIwACgwky5
K7Ko3zJ9CUSrReRbFsrac+s=
=SYrJ
-----END PGP SIGNATURE-----



