This patch corrects highly non-optimal memory allocation by
canonicalize_filename_mode(), which got exposed with:

2009-08-07  Sergey Poznyakoff  [...]

	* src/misc.c: Include canonicalize.h
	(zap_slashes, normalize_filename): New functions.

On a specific test case (a tree with around 3500 sub-directories with a
total of around 58,000 files), this reduces tar's memory usage from
32 MB to 4.5 MB for the initial full backup run ("-g" enabled), and from
19 MB to 5.5 MB for a subsequent incremental run.

On a real-world system with around 370,000 directories and 2.3 million
files where the problem was first spotted, an incremental run of a
32-bit build of tar 1.22.90 bumped into the 3 GB process address space
limit and failed, whereas a build with this patch applied uses around
400 MB during incremental runs and around 300 MB during initial full
backup runs.

--- tar-1.22.90/gnu/canonicalize.c.orig 2009-08-07 11:55:47 +0000
+++ tar-1.22.90/gnu/canonicalize.c 2009-12-08 15:50:40 +0000
@@ -161,6 +161,7 @@ canonicalize_filename_mode (const char *
   char const *end;
   char const *rname_limit;
   size_t extra_len = 0;
+  size_t actual_size;
   Hash_table *ht = NULL;
 
   if (name == NULL)
@@ -325,6 +326,10 @@ canonicalize_filename_mode (const char *
     --dest;
   *dest = '\0';
 
+  actual_size = strlen(rname) + 1;
+  if (rname_limit - rname > actual_size)
+    rname = xrealloc (rname, actual_size);
+
   free (extra_buf);
   if (ht)
     hash_free (ht);