Re: [Help-bash] Is there a way to read empty string with `read`?
From: Eduardo Bustamante
Subject: Re: [Help-bash] Is there a way to read empty string with `read`?
Date: Mon, 23 May 2016 22:03:52 -0500
I stand corrected: my definition of a "text file" was wrong. The proper
definition is given here:
http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap03.html
Still, control characters are rarely seen in "text files" in practice.
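It shouldn't cause any issues with UTF-8: every byte of a multi-byte
UTF-8 sequence is 0x80 or above, so the byte \002 can never appear
inside an encoded non-ASCII character.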
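
For anyone who wants to check the UTF-8 case, here is a minimal sketch
rather than a definitive implementation. The helper name split_tsv_line,
the array name fields, and the sample line are my own assumptions and
are not from the thread. The non-ASCII text is written as octal escapes,
so the script itself is plain ASCII.

```shell
#!/usr/bin/env bash
# Sketch: split one TSV line into fields, keeping empty fields.
# split_tsv_line and the global array "fields" are hypothetical names.
split_tsv_line() {
    local line=$1
    # Turn each TAB into STX (octal 002), which is not IFS whitespace,
    # so consecutive delimiters are not merged by read.
    IFS=$'\002' read -ra fields < <(printf '%s\n' "$line" | tr '\t' '\002')
}

# Sample line: "cafe" with an accented e, an empty field, then two CJK
# characters, all given as UTF-8 bytes in octal.
split_tsv_line $'caf\303\251\t\t\346\227\245\346\234\254'
declare -p fields
```

The array should end up with three elements, the middle one empty, and
the multi-byte characters untouched.
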
On Mon, May 23, 2016 at 9:46 PM, Peng Yu <address@hidden> wrote:
> Would it cause any problems with UTF-8 encoded TSV input?
>
> On Mon, May 23, 2016 at 8:01 PM, Eduardo Bustamante <address@hidden> wrote:
>> By definition, a "text file" should not contain control characters, so
>> you're pretty safe with this hack.
>>
>> On Mon, May 23, 2016 at 7:12 PM, Peng Yu <address@hidden> wrote:
>>> On Mon, May 23, 2016 at 8:45 AM, Greg Wooledge <address@hidden> wrote:
>>>> On Mon, May 23, 2016 at 07:18:16AM -0500, Peng Yu wrote:
>>>>> Hi, The following code shows that an empty string between two TABs
>>>>> cannot be captured. Is there a way to let bash read empty strings
>>>>> between TABs?
>>>>
>>>> Convert the tab characters to some other character that is not treated
>>>> as whitespace by IFS, and which also does not appear in the input data.
>>>>
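>>>> # Sample input: the third field is empty.
>>>> echo $'one\ttwo\t\tfour' > testfile
>>>> # STX (\002) is not IFS whitespace, so empty fields are kept.
>>>> while IFS=$'\002' read -ra array; do
>>>>     declare -p array
>>>> done < <(tr \\t \\002 < testfile)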
>>>
>>> I have never encountered \002 in a TSV file, so this should be a
>>> reasonable workaround. The ASCII documentation describes \002 (STX),
>>> but I am not sure whether it is of much relevance today. Do you have
>>> any common real-world examples in which these control characters are
>>> used in a text file?
>>>
>>>> Multiple consecutive whitespace characters are treated as a single
>>>> delimiter by IFS. This is by design, as it's what you want 99% of the
>>>> time, for input files that have fields padded by whitespace.
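
The collapsing behavior described above can be seen with a short sketch.
The array names collapsed and kept, and the comma-separated sample, are
illustrative assumptions and not part of the original thread.

```shell
#!/usr/bin/env bash
# TAB is an IFS whitespace character, so the two adjacent TABs act as
# one delimiter and the empty third field disappears.
IFS=$'\t' read -ra collapsed <<< $'one\ttwo\t\tfour'
declare -p collapsed

# A comma is not IFS whitespace, so each comma delimits a field and the
# empty field between the two adjacent commas is preserved.
IFS=, read -ra kept <<< 'one,two,,four'
declare -p kept
```

The first array should hold three elements and the second four, with an
empty string at index 2.
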
>>>
>>> --
>>> Regards,
>>> Peng
>>>
>
>
>
> --
> Regards,
> Peng