Subject: Possible bug in ``cut -c --output-delimiter''
From: David Krider
Date: Wed, 19 May 2004 21:57:39 -0500
I have a script that I originally wrote under SuSE 8.2. I had no
problems using it on SuSE 9.0. Now that I've upgraded to SuSE 9.1, I'm
getting some problems. My version of coreutils is now 5.2.1.
I have a hierarchical database dump with 3 tables munged together. I
need to un-munge them and prep them for loading into a relational
database. The part information lives in the first 69 characters. The
first 10 lines look like this:
> cat dump.txt | cut -c 1-69
10165301 GM CPC STABAR, FR CORVETTE 14105128
4656102 RICHRYSLER STABAR, FR 01 PT 4656102
4656102 RICHRYSLER STABAR, FR 01 PT 4656102
4656102 RICHRYSLER STABAR, FR 01 PT 4656102
4694750 CHRYSLER STABAR, FR NS MVX-2513-H
52088126 CHRYSLER COIL FR TJ 52088119AB
52088127 CHRYSLER COIL FR TJ 52088119AB
52088128 CHRYSLER COIL FR TJ 52088119AB
52088129 CHRYSLER COIL FR TJ 52088119AB
F65A-5B326-CA FORD T-BAR, FR FMTB-001
From here, I need to ``sort -u'' to get rid of the redundant lines, which
exist because of information in the other tables. This isn't central to
my issue, but I mention it because it took me a couple of hours to
realize that my sorts were messed up because of my locale settings.
Anyway.
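For anyone hitting the same sorting surprise: one common way around it is to force the C locale for the sort command alone, so lines are compared byte by byte. This is only a sketch, and it assumes byte-order collation is acceptable for the data; the function name sort_bytes is made up for illustration.

```shell
# Hypothetical helper: sort stdin in plain byte order and drop duplicate lines.
# LC_ALL=C applies to this one sort invocation only, not to the whole shell.
sort_bytes() {
    LC_ALL=C sort -u
}

# Demo on synthetic input: in the C locale uppercase sorts before lowercase.
printf '%s\n' b a B a | sort_bytes
```

With the real file this would read something like ``cut -c 1-69 dump.txt | sort_bytes''.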
The big trick I want to do is to insert |'s (pipes) into the data so
that I can easily load it into my new database. Until the upgrade, I was
getting this:
> cat dump.txt|cut -c 1-69|sort -u|cut --output-delimiter=\| -c 1-17,18-32,33-52,53-
10165301 |GM CPC |STABAR, FR CORVETTE |14105128
4656102 RI|CHRYSLER |STABAR, FR 01 PT |4656102
4694750 |CHRYSLER |STABAR, FR NS |MVX-2513-H
52088126 |CHRYSLER |COIL FR TJ |52088119AB
52088127 |CHRYSLER |COIL FR TJ |52088119AB
52088128 |CHRYSLER |COIL FR TJ |52088119AB
52088129 |CHRYSLER |COIL FR TJ |52088119AB
F65A-5B326-CA |FORD |T-BAR, FR |FMTB-001
Since the upgrade, the same command gives me this:
10165301 GM CPC STABAR, FR CORVETTE |14105128
4656102 RICHRYSLER STABAR, FR 01 PT |4656102
4694750 CHRYSLER STABAR, FR NS |MVX-2513-H
52088126 CHRYSLER COIL FR TJ |52088119AB
52088127 CHRYSLER COIL FR TJ |52088119AB
52088128 CHRYSLER COIL FR TJ |52088119AB
52088129 CHRYSLER COIL FR TJ |52088119AB
F65A-5B326-CA FORD T-BAR, FR |FMTB-001
As you can see, I only get the output delimiter between the last and
second-to-last fields. Now, from the way I read the man page,
"--output-delimiter" is technically only supposed to work when using
"-f" mode, but, as I say, I've been getting the former behavior for
a while now. I find that if I skip a column between the character ranges,
I get the output delimiters back, but then I lose the skipped characters:
> cat dump.txt|cut -c 1-69|sort -u|cut --output-delimiter=\| -c 1-17,19-32,34-52,54-
10165301 |M CPC |TABAR, FR CORVETTE |4105128
4656102 RI|HRYSLER |TABAR, FR 01 PT |656102
4694750 |HRYSLER |TABAR, FR NS |VX-2513-H
52088126 |HRYSLER |OIL FR TJ |2088119AB
52088127 |HRYSLER |OIL FR TJ |2088119AB
52088128 |HRYSLER |OIL FR TJ |2088119AB
52088129 |HRYSLER |OIL FR TJ |2088119AB
F65A-5B326-CA |ORD |-BAR, FR |MTB-001
Is this operating as designed, or have I stumbled across a bug?
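Whatever the answer, the delimiters can be inserted without dropping characters by slicing the same column ranges with awk's substr, which does not depend on how cut treats adjacent ranges. This is a minimal sketch rather than a tested replacement for the script; the function name split_fixed is hypothetical, and the column widths are simply those of the ranges 1-17, 18-32, 33-52 and 53- used above.

```shell
# Hypothetical helper: split each input line at the column ranges used above
# (1-17, 18-32, 33-52, 53-end) and join the four pieces with "|".
# substr(s, start, length) takes a 1-based start; omitting length runs to end of line.
split_fixed() {
    awk -v OFS='|' '{
        print substr($0, 1, 17), substr($0, 18, 15), substr($0, 33, 20), substr($0, 53)
    }'
}

# Demo on a synthetic 55-character line (not real data from the dump).
printf '%017d%015d%020d%s\n' 1 2 3 END | split_fixed
```

With the real file this would read something like ``cut -c 1-69 dump.txt | sort -u | split_fixed''.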
Thanks,
dk