bug-coreutils

Re: Feature request for cut and/or sort


From: The Wanderer
Subject: Re: Feature request for cut and/or sort
Date: Sun, 22 Jul 2007 20:31:35 -0400
User-agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.12) Gecko/20050922

(&*%&*!@ incorrect reply behaviour... I *hate* having to type out the
address by hand in every post.)

Bob Proulx wrote:

The Wanderer wrote:

It happens not infrequently that I want to sort or cut on the final
field of a sequence of lines which do not all have the same number
of fields. The usual case is a list of files with full path, where
I am interested only in the filename.

There are good alternatives to cut for this.

How about for sort?
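
For the sort case, one possible workaround is the "decorate, sort, undecorate" pattern: copy the final field to the front of each line as a temporary key, sort on that key, then strip it off again. The following is only a minimal sketch. The name `sort_by_last_field` is a hypothetical helper, not a coreutils feature, and it assumes the input lines contain no tab characters.

```shell
# Hypothetical helper: sort slash-delimited lines by their final field.
# Assumes no line contains a tab, since a tab is used as the temporary separator.
sort_by_last_field() {
    tab=$(printf '\t')
    # 1. decorate: prepend the last field plus a tab
    # 2. sort on the prepended key only
    # 3. undecorate: drop the key, keeping the original line
    awk -F/ '{ print $NF "\t" $0 }' |
        sort -t "$tab" -k1,1 |
        cut -f2-
}

printf '%s\n' /usr/lib/zeta /a/b/c/alpha /tmp/mid | sort_by_last_field
```

The original lines come back unchanged, ordered by filename rather than by full path.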

The intuitive thing to do is to treat the slash as the field
delimiter and (taking cut as the example) cut out all but the final
field of each line. The first part is trivial, but there does not
appear to be any way to request the second; the field-selection
syntax described in the cut manual invariably starts counting
fields from the beginning of the line.

That is correct.  But cut is really not the best tool for the job.

In that case, it probably is not the best tool for the job of cutting
from the beginning of the line, either. Are there any other than
historical reasons ("it's become standard, and people expect it to be
there") for not removing it entirely?

(Naturally, I don't want it removed. The point is that the fact that
there are better tools available for a given task does not inherently
mean it is not worthwhile to have "worse" tools capable of performing
the same task.)

Instead I recommend and use awk for these types of things.

  echo /path/to/somefile | awk -F/ '{print$NF}'

I've never had occasion (or, beyond the general availability Somewhere
of documentation, opportunity) to learn awk. In any case a program to
which I have to pass an esoteric incantation is less convenient than one
to which I can simply pass arguments to simple options.

But there are ways to do this in the shell too.

  p=/path/to/somefile
  echo "${p##*/}"

And lastly I probably should mention that the coreutils 'basename'
program does this too.

  basename /path/to/somefile

Or for as many from stdin as you want.

  find /tmp -type f -print0 | xargs -r0 -n1 basename

How well will these work in cases where there is other, extraneous data
on the line before the path begins?

(Regardless, the 'find' solution will not work in most of the cases I
see, because the program which outputs the data I need to parse does not
and cannot be made to null-terminate.)
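
For newline-terminated input that may carry extra data before the path, a greedy sed substitution is one option: it deletes everything up to and including the last slash on each line, whatever comes before the path. This is a sketch only. The name `last_component` is hypothetical, and the approach assumes the filename itself contains neither a slash nor a newline.

```shell
# Hypothetical helper: keep only the text after the final slash on each line.
# Lines without any slash pass through unchanged.
last_component() {
    sed 's|.*/||'
}

printf '%s\n' '1024 Jul 22 /home/user/notes.txt' 'plainfile' | last_component
```

Because `.*` is greedy, the leading size and date columns in the first sample line are discarded along with the directory part.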

You say that you want the second from the end?  Subtract the number
from the end.

  $ echo /one/two/three/four | awk -F/ '{print$(NF-1)}'
  three

How about the final two, including their separator?

(I kind of expect a "RTFM awk" here. The point, however, is that this is
much less convenient and intuitive to *find out about* - much less to
use - than are the options to cut.)
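
For reference, the final two fields with their separator can be obtained by extending the same awk idiom. A minimal sketch, with `last_two_fields` as a hypothetical helper name; lines with fewer than two fields are passed through as they are, which is an assumption about the desired behaviour.

```shell
# Hypothetical helper: print the last two slash-delimited fields, rejoined with "/".
last_two_fields() {
    awk -F/ '{
        if (NF >= 2)
            print $(NF-1) "/" $NF   # second-to-last field, separator, last field
        else
            print $0                # fewer than two fields: leave the line alone
    }'
}

echo /one/two/three/four | last_two_fields
```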

Awk is best because it is a standard utility and very stable. But
perl and ruby are similar. Identical to each other, amazingly.

  echo /one/two/three/four | perl -F/ -lane 'print $F[-1]'
  echo /one/two/three/four | perl -F/ -lane 'print $F[3]'

  echo /one/two/three/four | ruby -F/ -lane 'print $F[-1]'
  echo /one/two/three/four | ruby -F/ -lane 'print $F[3]'

Noted. Thank you.

This does not by itself address doing the same thing with programs other
than cut, although the much more terse message from Andreas Schwab seems
to indicate that there are ways to accomplish it generically. It does,
however, provide at least minimal incentive for me to attempt to learn awk.
(I'm having enough trouble with sed and bash, and haven't improved at C
in much of a decade - attempting to gain reflexive-recall mastery of
another language is not a pleasant prospect...)

I would like to have a way to tell cut and sort, and for that
matter anything else which likewise deals with fields, to count
them beginning from the end of the line.
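
Counting fields from the end can be approximated with existing tools by reversing each line, cutting from the (new) beginning, and reversing the result back. This is a sketch under assumptions: `rev` is part of util-linux rather than coreutils and may not be installed everywhere, and `cut_from_end` is a hypothetical helper name.

```shell
# Hypothetical helper: select slash-delimited fields counted from the END of the line.
# $1 is a cut-style field list, e.g. 1 (last field) or 1-2 (last two fields).
cut_from_end() {
    # Reverse each line so the last field becomes the first, cut, then restore order.
    rev | cut -d/ -f"$1" | rev
}

echo /one/two/three/four | cut_from_end 1
echo /one/two/three/four | cut_from_end 1-2
```

The first call prints `four` and the second prints `three/four`.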

Nah...  Just use awk.  It is standard, portable and already does what
you ask along with many more features.  The syntax of doing this
with awk is quite obvious.  It is short and quick to type when doing
it on the command line.

The same argument could, presumably, be provided for the functions for
which cut does provide options. The advantages of having a separate
utility appear to be that it is more convenient to use quickly and is
easier to discover and learn.

I do consider myself a comparatively advanced user (having built my own
system from parts and administered it more or less independently for
years), although I am nowhere near anything like mastery yet, and I
still find the (reputedly both complex and powerful) languages whose use
seems to be the standard response to requests for enhancement to these
tools to be intimidating. Expecting a more basic user - who may have
been lucky in stumbling across e.g. cut or sort at all - to spend the
time to learn them, when these capabilities seem natural extensions of
the abilities the tools already have, seems to me like at best a dubious
and at worst a damagingly elitist position.

--
      The Wanderer

Warning: Simply because I argue an issue does not mean I agree with any
side of it.

Secrecy is the beginning of tyranny.



