Re: sort feature (can't believe this hasn't been implemented yet)
From: Edward Peschko
Subject: Re: sort feature (can't believe this hasn't been implemented yet)
Date: Fri, 21 May 2004 16:11:24 -0700
User-agent: Mutt/1.4.1i
On Fri, May 21, 2004 at 04:06:04PM -0700, Paul Eggert wrote:
> Edward Peschko <address@hidden> writes:
>
> > ok, well I'll endeavor to add it - although I can't believe I'm the
> > first person to try to use sort to sort records from a relational
> > database (its pretty common practice to use multiple line delimiters
> > because of the possibility of embedded newlines or other such in the
> > data)
>
> But what if your data contains backslash-newline? Then
> backslash-newline isn't safe either. Better might be a safe escaping
> scheme (e.g., use "\n" to represent newline and "\\" to represent
> backslash). Of course the resulting data won't sort correctly, but
> that's true of any escaping scheme.
We preprocess the data: wherever a backslash is followed by a newline, we
replace the '\' with a space. It's always give and take - you really can't
live without newlines in text fields or nulls in bitfields, and the above
substitution is linear time. If there is a \<newline> combination in a
bitfield, we flag it. So far there hasn't been one.
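To make the substitution concrete, here is a minimal sketch in Python. It assumes that backslash-newline is the record delimiter and that fields are handled as byte strings. The names RECORD_DELIM, scrub_text_field and binary_field_is_safe are hypothetical, chosen only for illustration; this is not our actual preprocessing code.

```python
# Assumption: records are terminated by backslash followed by newline.
RECORD_DELIM = b"\\\n"


def scrub_text_field(field: bytes) -> bytes:
    """Replace the backslash of every backslash-newline pair with a space.

    bytes.replace makes a single left-to-right pass, so this is linear in
    the length of the field. The inserted space separates the two bytes,
    so the result can never contain the record delimiter.
    """
    return field.replace(RECORD_DELIM, b" \n")


def binary_field_is_safe(field: bytes) -> bool:
    """Return False if a binary field contains the delimiter and must be flagged."""
    return RECORD_DELIM not in field
```

For example, scrub_text_field(b"a\\\nb") gives b"a \nb": the embedded newline survives, but it no longer looks like the end of a record.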
And sqlldr (and other loading programs) handle the above well, whereas you'd
need special processing in these programs if you had a safe-escaping scheme.
We'd have to write a custom loading program to do what you suggest.
Come to think of it, it'd probably be good to have the option for
multi-character field delimiters, too.
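As a rough illustration of the behaviour I'm after, here is a sketch in Python rather than a patch to sort. It assumes backslash-newline as the record delimiter and "|~|" as a made-up multi-character field delimiter. The function sort_records and its parameters are hypothetical, and it reads everything into memory, which a real implementation couldn't do for large dumps.

```python
def sort_records(data: bytes,
                 record_delim: bytes = b"\\\n",
                 field_delim: bytes = b"|~|",
                 key_field: int = 0) -> bytes:
    """Sort delimiter-terminated records by one field (bytewise order)."""
    records = data.split(record_delim)
    # A trailing delimiter leaves one empty piece at the end; drop it.
    if records and records[-1] == b"":
        records.pop()

    def key(record: bytes) -> bytes:
        fields = record.split(field_delim)
        # Records with too few fields sort as if the key were empty.
        return fields[key_field] if key_field < len(fields) else b""

    records.sort(key=key)
    if not records:
        return b""
    return record_delim.join(records) + record_delim
```

A plain newline inside a field stays with its record, because only the full delimiter sequence ends a record.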
> > Is this a trivial change to make?
>
> It shouldn't be that hard, but it probably won't be trivial either.
Ok, I'll give it a try.
Ed