mit-scheme-devel

Re: [MIT-Scheme-devel] a MIME body structure parser for non-IMAP folders


From: Taylor Campbell
Subject: Re: [MIT-Scheme-devel] a MIME body structure parser for non-IMAP folders in IMAIL
Date: Tue, 29 Nov 2005 04:41:46 +0000 (UTC)
User-agent: IMAIL/1.21; Edwin/3.116; MIT-Scheme/7.7.90.+

   Date: Mon, 28 Nov 2005 20:36:23 -0500
   From: Chris Hanson <address@hidden>

   On 11/28/05, Taylor Campbell <address@hidden> wrote:

   > If I wish to include it in MIT Scheme, would it be most convenient to
   > assign the copyright to MIT?  I'm uninterested in being swamped in
   > legal matters, so I would opt for the simplest path in that regard;

   The simplest thing is for you to retain copyright on it.  Then there
   are no legalities to deal with; just add a copyright line to the file.

Sorry, I meant to add one other thing to the question: would it have
to be GPL'd as well?

   > also, a bit of the code is very similar to existing code in Edwin,
   > such as a slightly extended RFC822 header field value tokenizer.

   Perhaps it would be better to just extend the existing tokenizer. 
   What kind of extensions?

The MIME lexical syntax uses a slightly different set of special
characters, so my extension just adds a parameter for that character
set; also, it has a parameter for specifying whether or not to include
whitespace in the first place, making RFC822:STRIP-WHITESPACE! (which
isn't exported by (EDWIN RFC822)) unnecessary.  Since my extension is
compatible, I'll just add it to rfc822.scm.
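
As a rough illustration of the extension described above, here is a
Python sketch (not the actual rfc822.scm code).  The names `tokenize`,
`RFC822_SPECIALS`, and `MIME_TSPECIALS` are hypothetical; the two
character sets are the `specials` of RFC 822 and the `tspecials` of
RFC 2045.  Comments and domain literals are not handled, so this is a
simplification.

```python
# Special characters as defined by RFC 822 ("specials") and RFC 2045 ("tspecials").
RFC822_SPECIALS = set('()<>@,;:\\".[]')
MIME_TSPECIALS = set('()<>@,;:\\"/[]?=')
WHITESPACE = " \t\r\n"


def tokenize(text, specials=RFC822_SPECIALS, keep_whitespace=False):
    """Split a header field value into atoms, quoted strings and specials.

    `specials` selects the lexical syntax; `keep_whitespace` decides whether
    runs of whitespace are emitted as tokens or dropped.
    """
    tokens = []
    i, n = 0, len(text)
    while i < n:
        ch = text[i]
        if ch in WHITESPACE:
            j = i
            while j < n and text[j] in WHITESPACE:
                j += 1
            if keep_whitespace:
                tokens.append(text[i:j])
        elif ch == '"':
            # Quoted string: scan to the closing quote, skipping escaped chars.
            j = i + 1
            while j < n and text[j] != '"':
                j += 2 if text[j] == "\\" else 1
            j = min(j + 1, n)
            tokens.append(text[i:j])
        elif ch in specials:
            j = i + 1
            tokens.append(ch)
        else:
            # Atom: everything up to the next special or whitespace character.
            j = i
            while j < n and text[j] not in specials and text[j] not in WHITESPACE:
                j += 1
            tokens.append(text[i:j])
        i = j
    return tokens
```

With the default set, `text/plain` is a single atom; with
`MIME_TSPECIALS`, the `/` becomes a token of its own, which is what a
Content-Type parser wants.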

   > Is there any way to cache the Boyer-Moore string search table and then
   > to reuse it for multiple searches?  Judging by the source of its
   > implementation in runtime/string.scm, this is not possible, but it
   > would seem to be a very useful operation.

   It doesn't seem that it would be hard to make this change.  Do you
   want to try?  I'll review it (or advise) if you're not sure.

OK, I'll implement it, then.
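
To make the idea concrete, here is a minimal Python sketch of a
reusable search table.  It uses the simpler Boyer-Moore-Horspool
variant (bad-character shifts only), not the algorithm in
runtime/string.scm, and the class name `CompiledPattern` and its
`search` method are hypothetical.

```python
class CompiledPattern:
    """A search pattern whose shift table is built once and reused."""

    def __init__(self, pattern):
        if not pattern:
            raise ValueError("pattern must be non-empty")
        self.pattern = pattern
        m = len(pattern)
        # Horspool shift table: for each character in pattern[:-1], the
        # distance from its rightmost occurrence to the last position.
        # Characters not in the table shift by the full pattern length.
        self.shift = {ch: m - 1 - i for i, ch in enumerate(pattern[:-1])}

    def search(self, text, start=0):
        """Return the index of the first match at or after `start`, or -1."""
        pattern, shift = self.pattern, self.shift
        m, n = len(pattern), len(text)
        i = start
        while i + m <= n:
            if text[i:i + m] == pattern:
                return i
            # Shift according to the text character aligned with the
            # last position of the pattern.
            i += shift.get(text[i + m - 1], m)
        return -1
```

The table is computed in the constructor, so a caller that searches
many strings for the same pattern (a MIME boundary, say) pays for it
only once:

    delimiter = CompiledPattern("--boundary")
    delimiter.search(body)
    delimiter.search(body, next_start)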

   I assume you're using this for the parser?  Did you consider using the
   parser language?  It's well suited to this application, and takes care
   of things like backtracking.

Actually, I initially thought that I'd make extensive use of the
parser language before I started implementing the MIME parser, but
then I found that the only text parsing I needed was either header
field parsing, which is already implemented in IMAIL, or MIME
multipart boundary searching, for which I need nothing more than
SUBSTRING-SEARCH-FORWARD.  The only part that would have been nicer to
have a parser language for was a minor bit of RFC822 token stream
parsing, which *PARSER doesn't really help with, since it operates on
the character level, not the level of arbitrary tokens.
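
For illustration, boundary searching with nothing more than a forward
substring search might look like the following Python sketch.  The
function name `split_multipart` is hypothetical, `str.find` stands in
for SUBSTRING-SEARCH-FORWARD, and the body is assumed to use bare
newlines as line separators.  It does not check what follows the
boundary on a delimiter line, and a final part with no close delimiter
is dropped, so it is a simplification of the RFC 2046 rules.

```python
def split_multipart(body, boundary):
    """Return the parts of a multipart body, without preamble or epilogue."""
    delimiter = "--" + boundary
    parts = []
    part_start = None  # start of the current part; None while in the preamble
    pos = 0
    while True:
        pos = body.find(delimiter, pos)
        if pos == -1:
            break
        if pos != 0 and body[pos - 1] != "\n":
            pos += 1  # not at the start of a line, so keep looking
            continue
        if part_start is not None:
            # The newline just before the delimiter belongs to the delimiter.
            parts.append(body[part_start:max(part_start, pos - 1)])
        after = pos + len(delimiter)
        if body.startswith("--", after):
            return parts  # close delimiter: the rest is the epilogue
        eol = body.find("\n", after)
        if eol == -1:
            return parts
        part_start = eol + 1
        pos = part_start
    return parts
```

Each returned part still carries its own header fields, which can then
be handed to the existing header field parser.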

   > I'm unclear on how line endings are managed in the new I/O system and
   > IMAIL's infrastructure.  String input/output ports don't seem to
   > translate line endings, regardless of PORT/SET-LINE-ENDING.  Is it
   > reliable that IMAIL messages will contain only #\NEWLINE characters,
   > not CRLFs, if their contents are taken by passing a string output port
   > to WRITE-MESSAGE-BODY?  (The MIME parser currently relies on this.)
   > If not, what can I rely on?

   Yes, this is reliable.  Internal representations of messages always
   use newline for the line separator.

OK, thanks.  (Is it a bug that string ports disregard line ending
normalization settings?)



