[Monotone-devel] Killing off boost::fs


From: Nathaniel Smith
Subject: [Monotone-devel] Killing off boost::fs
Date: Fri, 29 Jul 2005 20:49:33 -0700
User-agent: Mutt/1.5.9i

Boost::fs is really, really slow for our uses, and has regularly been
the culprit in performance bugs.  Fortunately, I think we can get away
with something much simpler, and faster too.

Some thoughts on how to do this:
  - right now we have types "file_path" (a versionable path),
    "local_path" (a path within the working dir -- the difference from
    file_path is that local_paths can begin with "MT/"), and fs::path
    (a generic, points-anywhere path).  All are stored as strings, and
    have constructors that silently normalize.
  - I suggest three slightly different types -- file_path (same as
    now), system_path (same as old fs::path, which is a boost type
    used directly), and bookkeeping_path (which is a path that _does_
    begin with "MT/").  I.e., there is no reason for local_path to be
    a superset of file_paths.
  - one should not be able to instantiate a file_path directly.
    Instead there should be two constructors; one that takes (and
    requires) an already-normalized path, and one that takes an
    unnormalized path and normalizes it.  Doing this allows us to be
    more secure, and probably also faster, when dealing with
    important contexts (e.g., parsing revisions) where we _know_ that
    the path should be normalized already.
  - it might also make sense to have these constructors take care of
    the prefixing required to deal with calling monotone in a
    subdirectory
  - the bookkeeping_path constructor should require an
    already-normalized path that begins "MT/", since we always
    generate these internally anyway.
  - system_path (maybe needs a better name?) should always be
    absolutified and tilde-expanded, and otherwise does not need to be
    verified.
  - all this code needs to be done very, very carefully, because bugs
    in path normalization/checking can easily lead to security exploits.
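
To make the two-constructor idea concrete, here is a rough sketch of what
file_path and bookkeeping_path might look like.  Everything named here
(split_raw, all_clean, from_internal, from_external, as_internal) is made
up for illustration, and the normalization rules are assumptions, not a
settled design: empty and "." components are dropped, ".." and absolute
paths are rejected outright, and the empty path is rejected too.

```cpp
#include <stdexcept>
#include <string>
#include <vector>

// Split on '/' keeping empty components, so "a//b" yields {"a", "", "b"}.
// Always returns at least one element.
inline std::vector<std::string> split_raw(const std::string& s) {
    std::vector<std::string> parts;
    std::string::size_type start = 0;
    for (;;) {
        std::string::size_type end = s.find('/', start);
        if (end == std::string::npos) {
            parts.push_back(s.substr(start));
            return parts;
        }
        parts.push_back(s.substr(start, end - start));
        start = end + 1;
    }
}

// True if no component is empty, "." or ".." (assumed normal form).
inline bool all_clean(const std::vector<std::string>& parts) {
    for (const std::string& p : parts)
        if (p.empty() || p == "." || p == "..") return false;
    return true;
}

class file_path {
public:
    // Trusted input (e.g. a path parsed from a revision): verify only,
    // never rewrite.
    static file_path from_internal(const std::string& s) {
        std::vector<std::string> parts = split_raw(s);
        if (!all_clean(parts))
            throw std::invalid_argument("not normalized: " + s);
        if (parts[0] == "MT")
            throw std::invalid_argument("bookkeeping path: " + s);
        return file_path(s);
    }

    // Untrusted input (e.g. the command line): clean up, then verify.
    static file_path from_external(const std::string& s) {
        if (!s.empty() && s[0] == '/')
            throw std::invalid_argument("absolute path: " + s);
        std::string clean;
        for (const std::string& p : split_raw(s)) {
            if (p.empty() || p == ".") continue;  // drop "//" and "./"
            if (!clean.empty()) clean += '/';
            clean += p;
        }
        return from_internal(clean);  // still rejects "..", "MT/...", ""
    }

    const std::string& as_internal() const { return path_; }

private:
    // Private, so a file_path can only come from one of the two
    // checked entry points above.
    explicit file_path(const std::string& s) : path_(s) {}
    std::string path_;
};

class bookkeeping_path {
public:
    // Requires an already-normalized path that begins with "MT/".
    explicit bookkeeping_path(const std::string& s) : path_(s) {
        if (s.compare(0, 3, "MT/") != 0 || !all_clean(split_raw(s)))
            throw std::invalid_argument("bad bookkeeping path: " + s);
    }
    const std::string& as_internal() const { return path_; }

private:
    std::string path_;
};
```

With this shape, file_path::from_external("./src//main.cc") would give
"src/main.cc", while file_path::from_internal("src//main.cc") throws
rather than quietly fixing the input.  The subdirectory prefixing
mentioned above is left out of the sketch.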

We don't need many actual operations on these:
  - write file
  - read file
  - rename file, dir (needs to do somewhat odd shenanigans on win32)
  - delete file
  - delete dir (recursively)
  - make dir (recursively)
  - walk tree
  - file_path: split into components (might be able to optimize this
    if we have our own library, e.g. do it during the normalization
    pass?)
  - get system pathname (for inodeprints code to pass to stat(), for
    instance, or to pass into crypto library unbuffered hash code,
    etc.)
These should probably take care of i18n implicitly (filenames are
represented as UTF-8 when normalized, but may not be when entered on
the command line or in the filesystem).
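
On the "split into components during the normalization pass" idea, a
minimal sketch of how that could work is below.  The names split_path and
normalize_and_split are hypothetical, and the rules (drop empty and "."
components, reject ".." and absolute paths) are assumptions rather than
a decided policy.  It ignores i18n entirely.

```cpp
#include <stdexcept>
#include <string>
#include <vector>

struct split_path {
    std::string text;                     // normalized form, joined by '/'
    std::vector<std::string> components;  // same path, one entry per piece
};

// One scan over the input: the component list is built while the
// normalized string is being produced, so no second split is needed.
inline split_path normalize_and_split(const std::string& raw) {
    if (!raw.empty() && raw[0] == '/')
        throw std::invalid_argument("absolute path: " + raw);
    split_path out;
    std::string current;
    for (std::string::size_type i = 0; i <= raw.size(); ++i) {
        if (i < raw.size() && raw[i] != '/') {
            current += raw[i];
            continue;
        }
        // End of a component: either a '/' or the end of the string.
        if (current == "..")
            throw std::invalid_argument("'..' not allowed: " + raw);
        if (!current.empty() && current != ".") {
            if (!out.text.empty()) out.text += '/';
            out.text += current;
            out.components.push_back(current);
        }
        current.clear();
    }
    return out;
}
```

A path type could keep both fields around, so that later callers asking
for components pay nothing extra; whether the memory cost is worth it
would need measuring.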

Anything missing here?

So, those are some thoughts on what we should, perhaps, do.  Any
thoughts?

-- Nathaniel

-- 
"...these, like all words, have single, decontextualized meanings: everyone
knows what each of these words means, everyone knows what constitutes an
instance of each of their referents.  Language is fixed.  Meaning is
certain.  Santa Claus comes down the chimney at midnight on December 24."
  -- The Language War, Robin Lakoff



