bug-coreutils

Possible bug in ``cut -c --output-delimiter''


From: David Krider
Subject: Possible bug in ``cut -c --output-delimiter''
Date: Wed, 19 May 2004 21:57:39 -0500

I have a script that I originally wrote under SuSE 8.2. I had no
problems using it on SuSE 9.0. Now that I've upgraded to SuSE 9.1, I'm
getting some problems. My version of coreutils is now 5.2.1.

I have a hierarchical database dump with 3 tables munged together. I
need to un-munge them and prep them for loading into a relational
database. The part information lives in the first 69 characters. The
first 10 lines look like this:

> cat dump.txt | cut -c 1-69
 10165301        GM CPC         STABAR, FR CORVETTE 14105128
 4656102       RICHRYSLER       STABAR, FR 01 PT    4656102
 4656102       RICHRYSLER       STABAR, FR 01 PT    4656102
 4656102       RICHRYSLER       STABAR, FR 01 PT    4656102
 4694750         CHRYSLER       STABAR, FR NS       MVX-2513-H
 52088126        CHRYSLER       COIL FR TJ          52088119AB
 52088127        CHRYSLER       COIL FR TJ          52088119AB
 52088128        CHRYSLER       COIL FR TJ          52088119AB
 52088129        CHRYSLER       COIL FR TJ          52088119AB
 F65A-5B326-CA   FORD           T-BAR, FR           FMTB-001

From here, I need to ``sort -u'' to get rid of the redundant lines which
exist because of information in the other tables. This isn't central to
my issue, but I mention it because it took me a couple of hours to
realize that my sorts were messed up because of my locale information.
Anyway.
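
For the locale problem mentioned above, forcing the C locale on the sort
is one common way to get byte-wise ordering regardless of the
environment. A minimal sketch, assuming the dump.txt layout from this
message; the function name dedup_parts is made up for illustration:

```shell
# Keep the first 69 columns of the given file and drop duplicate lines.
# LC_ALL=C makes sort compare raw bytes, so the result does not depend
# on the locale settings of the calling environment.
dedup_parts() {
    cut -c 1-69 "$1" | LC_ALL=C sort -u
}
```

Calling ``dedup_parts dump.txt'' would then stand in for the
``cat | cut | sort -u'' part of the pipelines in this message.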

The big trick I want to do is to insert |'s (pipes) into the data so
that I can easily load it into my new database. Until now, I have been
getting this:

> cat dump.txt|cut -c 1-69|sort -u|cut --output-delimiter=\| -c 1-17,18-32,33-52,53-
 10165301        |GM CPC         |STABAR, FR CORVETTE |14105128
 4656102       RI|CHRYSLER       |STABAR, FR 01 PT    |4656102
 4694750         |CHRYSLER       |STABAR, FR NS       |MVX-2513-H
 52088126        |CHRYSLER       |COIL FR TJ          |52088119AB
 52088127        |CHRYSLER       |COIL FR TJ          |52088119AB
 52088128        |CHRYSLER       |COIL FR TJ          |52088119AB
 52088129        |CHRYSLER       |COIL FR TJ          |52088119AB
 F65A-5B326-CA   |FORD           |T-BAR, FR           |FMTB-001

I find now that I'm getting this:

 10165301        GM CPC         STABAR, FR CORVETTE |14105128
 4656102       RICHRYSLER       STABAR, FR 01 PT    |4656102
 4694750         CHRYSLER       STABAR, FR NS       |MVX-2513-H
 52088126        CHRYSLER       COIL FR TJ          |52088119AB
 52088127        CHRYSLER       COIL FR TJ          |52088119AB
 52088128        CHRYSLER       COIL FR TJ          |52088119AB
 52088129        CHRYSLER       COIL FR TJ          |52088119AB
 F65A-5B326-CA   FORD           T-BAR, FR           |FMTB-001

As you can see, I only get the output delimiter between the last and
second-to-last fields. Now, from the way I read the man page,
"--output-delimiter" is technically only supposed to work in "-f"
mode, but, as I said, I had been getting the former behavior for a
while. I find that if I skip a column between the character ranges,
then I do get the output delimiters, but I lose data:

> cat dump.txt|cut -c 1-69|sort -u|cut --output-delimiter=\| -c 1-17,19-32,34-52,54-
 10165301        |M CPC         |TABAR, FR CORVETTE |4105128
 4656102       RI|HRYSLER       |TABAR, FR 01 PT    |656102
 4694750         |HRYSLER       |TABAR, FR NS       |VX-2513-H
 52088126        |HRYSLER       |OIL FR TJ          |2088119AB
 52088127        |HRYSLER       |OIL FR TJ          |2088119AB
 52088128        |HRYSLER       |OIL FR TJ          |2088119AB
 52088129        |HRYSLER       |OIL FR TJ          |2088119AB
 F65A-5B326-CA   |ORD           |-BAR, FR           |MTB-001
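
As a stopgap that does not depend on how cut treats adjacent ranges, the
same column boundaries can be cut with awk's substr() and rejoined with
a literal pipe. This is only a sketch of a workaround, not a statement
about what cut is supposed to do; the function name pipe_columns is made
up for illustration, and the widths are derived from the ranges 1-17,
18-32, 33-52, 53- used above:

```shell
# Split each input line at fixed column boundaries and join the pieces
# with "|". No characters are dropped between the pieces.
#   columns  1-17 -> substr($0,  1, 17)
#   columns 18-32 -> substr($0, 18, 15)
#   columns 33-52 -> substr($0, 33, 20)
#   columns 53-   -> substr($0, 53)
# Reads the named files, or standard input when no file is given.
pipe_columns() {
    awk '{
        printf "%s|%s|%s|%s\n",
            substr($0, 1, 17), substr($0, 18, 15),
            substr($0, 33, 20), substr($0, 53)
    }' "$@"
}
```

It would slot in as the last stage of the pipeline, for example
``cut -c 1-69 dump.txt | sort -u | pipe_columns''.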

Is this operating as designed, or have I stumbled across a bug?

Thanks,
dk





