
[Findutils-patches] new predicate


From: Konrad Eisele
Subject: [Findutils-patches] new predicate
Date: Thu, 27 May 2010 22:04:44 +0200

I would like to submit a patch that is quite short and
intended more as a feature request. It adds the predicate
"-dtype <regex>" (dtype meaning datatype). The dtype
predicate uses libmagic from the "file" command to get
the *content datatype* of the file being examined, then matches a regex
against it. For example, "echo abc>f.txt; file f.txt" yields "ASCII text".
Therefore "find f.txt -dtype '.*text.*'" would match the regex ".*text.*"
against "ASCII text" (and succeed).
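
The matching rule described above can be approximated in shell with the
existing file(1) tool. This is only a sketch of the semantics, not the code
in fu.diff: dtype_match is a hypothetical helper name, and requiring the
regex to match the whole description (grep -x) is an assumption.

```shell
# dtype_match FILE REGEX  (hypothetical helper, not part of the patch)
# Succeeds if file(1)'s description of FILE matches REGEX.
dtype_match() {
    # file -b: print only the description, without the "name:" prefix
    # grep -x: assumption, the regex must match the whole description
    file -b -- "$1" | grep -Eqx -- "$2"
}
```

For example, "dtype_match f.txt '.*text.*' && echo f.txt" would print f.txt
when file(1) reports something like "ASCII text".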

The problem this patch addresses is the following:
I have several source project directories with several million
files in them. I want to make a backup, but I only want
to back up text files (Makefiles, shell scripts, .c and
.h files, etc.). Currently I do something like this:
(for f in `find <srcdir> -type f`; do if (file "$f" | cut -d: -f2 | grep text &> /dev/null); then echo "$f"; fi; done) > file.list
Then I use file.list to create a tar.
But this pipeline is very slow (I run it overnight, so it works, but...).
With the above patch I can do:
find <srcdir> -dtype '.*text.*' >file.list
This version is an order of magnitude faster...
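
For comparison, part of the cost of the shell loop above probably comes from
starting a separate file(1), cut and grep process for every path. A batched
variant is possible with stock tools. This is a sketch, not a replacement
for the patch: list_text_files is a hypothetical name, and the parsing
assumes that no path contains ": " or a newline.

```shell
# list_text_files DIR  (hypothetical helper)
# Prints every regular file under DIR whose file(1) description contains "text".
list_text_files() {
    # "-exec ... {} +" hands many paths to each file(1) invocation
    find "$1" -type f -exec file -- {} + |
        # file(1) prints "path: description"; select lines whose description
        # mentions "text", then strip everything from the first ": " onward
        sed -n '/:[[:space:]].*text/s/:[[:space:]].*$//p'
}
```

It would be used as "list_text_files <srcdir> > file.list".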

As I'm not really familiar with patch submissions, I am
sending this short and easy-to-understand patch so that
a developer can integrate it directly. I think having a
content datatype match would be quite useful...
 
To make it compile you need to have libmagic (from the "file"
command distribution) installed. I then added LIBS=-lmagic to the
configure call. I guess in reality there should be a --with-magic
configure option or something similar.

-- Greetings Konrad Eisele








Attachment: fu.diff
Description: Binary data

