Jon Smirl wrote:
>> I'll check how monotone cvs_import counts. I didn't write that portion
>> of the code myself. But certainly these numbers should match (or we
>> should know why they don't.)
>
> I'm rerunning a pass 1 cvs2svn on mozcvs as I type. I will let you
> know the exact counts.
Okay, thanks. But that probably won't help that much. We need to figure
out where the difference comes from.
> I've noticed that most of the memory consumption in programs like this
> comes from tracking CVS file names and revision numbers. In my git
> importer I write the revision into the git data store the first time
> it is parsed out of the ,v. Now I can simply track the sha1 through
> all of the sorts, no need to keep CVS file names and rev numbers
> around.
monotone's cvs_import also stores the deltas in its database, before
all of the toposort stuff. Only blobs and their dependencies are kept in
memory. The filenames as well as the revision strings are only stored once.
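
To make the idea concrete, here is a rough Python sketch of what we are both
describing. It is not monotone's code, nor your git importer's. BlobStore,
Interner and the sample path are names I made up for illustration, a dict
stands in for the real database, and the key is a plain SHA-1 of the content
rather than git's actual object format.

```python
import hashlib


class BlobStore:
    """Toy content-addressed store; a dict stands in for the real database."""

    def __init__(self):
        self._blobs = {}

    def put(self, data: bytes) -> str:
        # Key the content by its SHA-1 so later passes only carry the digest.
        digest = hashlib.sha1(data).hexdigest()
        self._blobs.setdefault(digest, data)
        return digest

    def get(self, digest: str) -> bytes:
        return self._blobs[digest]


class Interner:
    """Keep each distinct string once and hand out small integer ids."""

    def __init__(self):
        self._ids = {}
        self._strings = []

    def intern(self, s: str) -> int:
        if s not in self._ids:
            self._ids[s] = len(self._strings)
            self._strings.append(s)
        return self._ids[s]

    def lookup(self, ident: int) -> str:
        return self._strings[ident]


store = BlobStore()
names = Interner()

# Hypothetical revision content, written to the store as soon as it is parsed.
key_a = store.put(b"int main() { return 0; }\n")
key_b = store.put(b"int main() { return 0; }\n")  # same content, same key

# Hypothetical ,v path, stored once and referred to by id afterwards.
path_id = names.intern("mozilla/README,v")
```

With something like this, the sorting passes only need to shuffle digests and
small integers around, while the bulky strings and file contents live in one
place.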