bug-guile

bug#38398: non-obvious SCM_EOF_VAL rationale

From:	Zefram
Subject:	bug#38398: non-obvious SCM_EOF_VAL rationale
Date:	Wed, 27 Nov 2019 12:05:34 +0000

John Cowan wrote:
>On the contrary:  the EOF object is not a character, but it *can* be
>returned by read-char.

Bother.  Of course I meant "can't be returned by read-char in a non-EOF
situation".  I was alluding precisely to it being distinguishable from
characters for the purposes of that return convention.

>                                  However, section 4.1 says that Guile is
>fully compliant with R5RS.

And yet, as I noted, it's actually non-compliant, in a way that's directly
relevant to this issue.

>Why do you believe it to be a poor design?

Because it makes it impossible to distinguish between reaching EOF and
reading a value that is otherwise a perfectly good one.  Or, from the
other point of view, because it requires that read syntax be crippled
specifically to prevent this one value ever being a genuine result
of reading.  read-char is free to use a distinguished return value
for EOF because the things it can read in a non-EOF situation form an
obviously-constrained subset of values.  The nature of the read function,
however, is that it can read basically any value, so there is no obvious
place for a distinguished value for EOF.

Although the RnRS read syntax doesn't cover absolutely all values,
when extending the read syntax it's quite easy, even unintentionally,
to make it capable of reading types of object that RnRS doesn't imagine
being readable.  Indeed, not only does Guile have the occasionally-useful
"#.", which makes absolutely all values readable, it's also got the
read-hash-extend system, which invites casual extension, and does nothing
to prevent user extensions returning the EOF object.
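
The failure mode described above can be sketched with a toy reader. This is a language-neutral model written in Python, not Guile's reader: the names EOF, extensions, read_datum and read_all are all hypothetical, and the extension table is only loosely analogous to read-hash-extend. It shows what happens to the usual "read until EOF" loop once an extension is allowed to return the EOF object as an ordinary datum.

```python
# Toy model only: this is not Guile's reader, and every name is hypothetical.
EOF = object()  # stands in for the distinguished EOF object

# Extension table, loosely analogous to read-hash-extend: maps the text
# after "#" to a procedure that produces the value the reader returns.
extensions = {}

def read_datum(tokens):
    """Return the next datum, or EOF when the token stream is exhausted."""
    try:
        tok = next(tokens)
    except StopIteration:
        return EOF
    if tok.startswith("#") and tok[1:] in extensions:
        return extensions[tok[1:]]()
    return tok

def read_all(text):
    """The usual loop: keep reading until the reader returns EOF."""
    tokens = iter(text.split())
    out = []
    while True:
        datum = read_datum(tokens)
        if datum is EOF:
            return out
        out.append(datum)

# Nothing stops a user extension from returning the EOF object itself.
extensions["eof"] = lambda: EOF

print(read_all("a b c"))     # all three data are read
print(read_all("a #eof c"))  # the loop stops at "#eof" and never sees "c"
```

In this model the caller has no way to tell the second case from a genuinely short input, which is the ambiguity at issue.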

So it makes much more sense to embrace the ability of read to read
any value whatsoever, and to use some other mechanism to signal EOF.
Common Lisp, for example, which has "#." as standard, specifies that
read is to signal an error by default if it's at EOF.
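
For contrast, here is a minimal sketch of the out-of-band approach, again as a Python toy model rather than Common Lisp's or Guile's actual interface. The names EOF, EndOfInput, read_datum and read_all are hypothetical. End of input is reported by raising an exception, so the reader can return any value whatsoever, including the object that would otherwise have been reserved as the EOF marker.

```python
# Toy model only: names are hypothetical, not any real reader's API.
EOF = object()  # an ordinary value here; the reader gives it no special role

class EndOfInput(Exception):
    """Raised when a datum is requested and none remains."""

def read_datum(items):
    """Return the next datum, whatever it is; raise EndOfInput at the end."""
    try:
        return next(items)
    except StopIteration:
        raise EndOfInput from None

def read_all(data):
    """Collect every datum; only the exception ends the loop."""
    items = iter(data)
    out = []
    while True:
        try:
            out.append(read_datum(items))
        except EndOfInput:
            return out

result = read_all(["a", EOF, "c"])
print(len(result))       # 3: nothing was dropped
print(result[1] is EOF)  # True: the EOF object came through as plain data
```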

>                                            It seems quite appropriate to
>me for the EOF object not to be a datum value, for the same reason that it
>should not be a character.  You nowhere state what purpose such a read
>syntax would serve.

You're making a bit of a leap here, if there's meant to be some causal
connection between these two sentences.  By "such a read syntax" you seem
to be referring to my "#eof" suggestion, but the case against the RnRS
design of read doesn't depend at all on whether there's a read syntax
specifically for that object.

The use of a distinguished EOF return value from read, and the consequent
rationale for not having a specific read syntax for the EOF object, is
founded on the idea that read can't return the EOF object *at all* in a
non-EOF situation.  This is undermined for Guile by the already-existing
"#." and read-hash-extend, without any need to invent new syntax.

To answer the second sentence in isolation: it would serve about the
same use as "#nil", making it easier to reference this useful object,
and extending the scope within which write-read round-tripping works.
I don't have strong feelings about having a specific read syntax,
it's just that this kind of distinguished object usually does have
specific syntax ("()", "#t", "#nil").  However, not every other object
like this has a read syntax; Guile's `unspecified' value is another one
that doesn't.  (Tangent: the unspecified value could equally well do
with a read syntax, but through testing with "#.*unspecified*" I note
that at present weird behaviour results from actually reading it.)

>                     Do you wish to be able to use read to input a list of
>EOF objects, for instance?  What would you do with them?

In code, I can imagine using a quoted EOF object in order to return
it from a function that's following something like read-char's return
convention, or to pass it to a function that expects values following
a similar convention.  Also to pass it to something like memq, for the
purposes of testing a value that could be the EOF object.  (A quoted
EOF object currently works in the interpreter but not in the compiler.)
In data, I imagine the EOF object would appear because of much the
same situations: it got returned from something like read-char, or it's
going to be fed to something that expects to occasionally receive the
EOF object.  Stick them in a list?  Sure, a list of values on its way
from A to B could well include an EOF object.

But please don't get sidetracked.  This wasn't a feature request for
"#eof"; that's just an idea that idly arose from consideration of the
rationale in question.  The issue that I'm seeking to get resolved is
that the documentation says the reason for the EOF object having no
specific read syntax is obvious, when in context it's really not.

-zefram
