[Monotone-devel] Improving the performance of annotate

monotone-devel

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[Monotone-devel] Improving the performance of annotate

From:	Eric Anderson
Subject:	[Monotone-devel] Improving the performance of annotate
Date:	Tue, 18 Jul 2006 14:43:15 -0700

(resending with patch attached as text/plain)

I've been working on improving the performance of annotate.  I have
found a solution that drops the time for mtn annotate Makefile.am from
about 175 seconds down to 9 seconds (detailed cpu and memory
statistics at the bottom).

I've attached the patch so that people can play with it, but it needs
significant work before it could be applied.  I just want to use it as
a place to discuss where the improvements came from, and let people
try it to verify that the improvements work for others.

In the end there were two main improvements: 1) only parsing the
portion of the roster that was relevant to the file being annotated.
2) skipping the version hash check in database.cc

Initially looking at the oprofile results, time seemed to be spread
out across a whole bunch of functions, with ~15-25% in the parsing
routines.  Investigation of the other functions showed that they were
all being called as part of constructing the roster datastructure.  It
turns out that the annotation code only needs the part of the roster
that deals with the file being annotated, so an ugly chunk of code was
written that basically fast-forwards through the roster to the entry
that is needed, and then only parses that entry.  Applying this piece
of the fix gets the system down to about 20 user cpu seconds, with
~75% of the time being spent in the SHA1 hash.

I had previously removed the SHA1 hash check from the code, and making
that configurable shows that removing the check gets the run down to
about 6 user cpu seconds.  This is not surprising, running the
annotate ends up reading in ~4000 rosters around 300k each, so in
order to perform an annotate, we are hashing over 1 GB of data.  Two
thoughts occur to me on this part of the improvement 1) the check
could be made optional, defaulting to on for write operations and off
for read only operations; 2) the Botan hash could be replaced by the
optimized assembly from openssl, although there are licencing issues
associated with this option.

With both of those improvements applied, the system is spending >60%
of it's time in parsing, with most of the time in the function that
skips to the beginning of the next roster record.  Therefore I predict
that if there was an index at the beginning of the roster (say hex
encoded id, offset pairs), the user time for the annotate could drop
to about 3 seconds.  Alternately, the roster was stored as a number of
separate records in the database, then only the needed roster could be
read.

With all of the patches applied, annotating almost all of the .cc
files in monotone is fast, three files continue to be slow:
commands.cc, database.cc, netsync.cc; most of the time for those
annotations is spent in extend_path_if_not_cycle and
piecewise_applicator::copy.  I have not bothered to investigate
whether those could be made faster.

Thoughts on what is the right way to make this improvement?
        -Eric


Detailed performance of mtn annotate Makefile.am 
  for 341e4a18c594cec49896fa97bd4e74de7bee5827:
    User time: 167.194s, System time: 3.080s, Elapsed time: 174.636s
    Max Size: 37.80 MiB, Max Resident: 20.08 MiB
    Mean Size: 34.60 MiB, Mean Resident: 18.40 MiB

    User time: 171.163s, System time: 3.032s, Elapsed time: 179.597s
    Max Size: 37.85 MiB, Max Resident: 20.08 MiB
    Mean Size: 34.60 MiB, Mean Resident: 18.40 MiB

    User time: 168.611s, System time: 2.932s, Elapsed time: 173.784s
    Max Size: 37.73 MiB, Max Resident: 20.08 MiB
    Mean Size: 34.60 MiB, Mean Resident: 18.40 MiB

  with prototype patch:
    User time: 6.116s, System time: 2.332s, Elapsed time: 9.241s
    Max Size: 34.59 MiB, Max Resident: 16.43 MiB
    Mean Size: 30.59 MiB, Mean Resident: 14.37 MiB

    User time: 5.892s, System time: 2.552s, Elapsed time: 9.081s
    Max Size: 34.59 MiB, Max Resident: 16.43 MiB
    Mean Size: 30.60 MiB, Mean Resident: 14.37 MiB

    User time: 6.028s, System time: 2.392s, Elapsed time: 9.081s
    Max Size: 34.59 MiB, Max Resident: 16.43 MiB
    Mean Size: 30.64 MiB, Mean Resident: 14.41 MiB

# 
# old_revision [341e4a18c594cec49896fa97bd4e74de7bee5827]
# 
# patch "annotate.cc"
#  from [7bc1472f79bea8bc5b5b2370e35d63b676dd6569]
#    to [aa14cf2e6a2400086ec42453e0c0d04f2f7921b0]
# 
# patch "basic_io.cc"
#  from [a15ef15ce8808a455ee8f9a7edeeb2ba23eddc67]
#    to [2ae5b2e0b5c66b99efb7eb3586e8bca073b9945e]
# 
# patch "basic_io.hh"
#  from [a690f850ad0a49bd2309216008f8bf931bd0b017]
#    to [9d10c0fb2eae613507b13afd008fcfcf87c68dff]
# 
# patch "database.cc"
#  from [2722674ada7b591827618ca848ea6f76d4326c19]
#    to [f953667f1da998183b199dc2083e40b6778b45f7]
# 
# patch "monotone.cc"
#  from [f975c3c25c7731b7eb727854d9ec4e5ae964fca4]
#    to [0e7576fb57495f79f8ef7b68d2a1438074c02950]
# 
# patch "roster.cc"
#  from [76aa75696d056c98a890b0b0bb6c1164363a919e]
#    to [71a51993420f179145dddaefed35857f6215b1bd]
# 
# patch "vocab.cc"
#  from [b770a2970961886842d185bb6aa627ca20300994]
#    to [3b4e297acee0e4dedd899c2a22fc29b779998cb9]
# 
============================================================
--- annotate.cc 7bc1472f79bea8bc5b5b2370e35d63b676dd6569
+++ annotate.cc aa14cf2e6a2400086ec42453e0c0d04f2f7921b0
@@ -699,7 +699,66 @@
     }
 }
 
+///ERIC
+bool
+roster_get_revision_fid_info(const roster_data &dat,
+                             node_id const &file_id,
+                             file_t &file,
+                             marking_t &marks);
+static bool
+get_revision_fid_info(database &db,
+                      revision_id const & rev_id, 
+                      node_id const &file_id,
+                      file_id &file_content,
+                      marking_t &marks)
+{
+  if (rev_id.inner()().empty()) {
+    return false;
+  }
 
+  file_t direct_file;
+  marking_t direct_marks;
+  bool direct_found;
+  {
+    roster_data dat;
+    roster_id ident;
+    
+    db.get_roster_id_for_revision(rev_id, ident);
+    db.get_roster_version(ident, dat);
+    direct_found = roster_get_revision_fid_info(dat, file_id, direct_file, 
direct_marks);
+  }
+  
+  static const bool regression_test = false;
+  if (!regression_test) {
+    if (!direct_found) {
+      return false;
+    }
+    file_content = direct_file->content;
+    marks = direct_marks;
+    return true;
+  } else {
+    marking_map markmap;
+    roster_t roster;
+    db.get_roster(rev_id, roster, markmap);
+
+    if (!roster.has_node(file_id)) {
+      I(!direct_found);
+      return false;
+    }
+    I(direct_found);
+    map<node_id, marking_t>::const_iterator mmi =
+      markmap.find(file_id);
+    I(mmi != markmap.end());
+    file_t file = downcast_to_file_t(roster.get_node(file_id));
+    file_content = file->content;
+    marks = mmi->second;
+    I(direct_file->content == file->content);
+    I(direct_marks == marks);
+    return true;
+  }
+}
+
+
 static void
 do_annotate_node
 (annotate_node_work const & work_unit,
@@ -713,15 +772,22 @@
   I(nodes_complete.find(work_unit.revision) == nodes_complete.end());
   // nodes_seen.insert(make_pair(work_unit.revision, work_unit.lineage));
 
-  roster_t roster;
-  marking_map markmap;
-  app.db.get_roster(work_unit.revision, roster, markmap);
+  file_id file_in_child_content;
   marking_t marks;
+  bool found = get_revision_fid_info(app.db,
+                                     work_unit.revision, work_unit.fid,
+                                     file_in_child_content, marks);
+  I(found);
 
-  map<node_id, marking_t>::const_iterator mmi =
-    markmap.find(work_unit.fid);
-  I(mmi != markmap.end());
-  marks = mmi->second;
+//  marking_map markmap;
+//  roster_t roster;
+//  app.db.get_roster(work_unit.revision, roster, markmap);
+//  marking_t marks;
+//
+//  map<node_id, marking_t>::const_iterator mmi =
+//    markmap.find(work_unit.fid);
+//  I(mmi != markmap.end());
+//  marks = mmi->second;
 
   if (marks.file_content.size() == 0)
     {
@@ -760,15 +826,22 @@
     {
       revision_id parent_revision = *i;
 
-      roster_t parent_roster;
-      marking_map parent_marks;
       L(FL("do_annotate_node processing edge from parent %s to child %s")
         % parent_revision % work_unit.revision);
 
       I(!(work_unit.revision == parent_revision));
-      app.db.get_roster(parent_revision, parent_roster, parent_marks);
 
-      if (!parent_roster.has_node(work_unit.fid))
+      file_id file_in_parent_content;
+      marking_t marks_in_parent;
+      bool in_parent = get_revision_fid_info(app.db,
+                                             parent_revision, work_unit.fid,
+                                             file_in_parent_content, 
marks_in_parent);
+//      roster_t parent_roster;
+//      marking_map parent_marks;
+//      app.db.get_roster(parent_revision, parent_roster, parent_marks);
+
+      //      if (!parent_roster.has_node(work_unit.fid))
+      if (!in_parent) 
         {
           L(FL("file added in %s, continuing") % work_unit.revision);
           added_in_parent_count++;
@@ -776,15 +849,13 @@
         }
 
       // The node was live in the parent, so this represents a delta.
-      file_t file_in_child =
-       downcast_to_file_t(roster.get_node(work_unit.fid));
 
-      file_t file_in_parent =
-       downcast_to_file_t(parent_roster.get_node(work_unit.fid));
+//      file_t file_in_parent =
+//        downcast_to_file_t(parent_roster.get_node(work_unit.fid));
 
       shared_ptr<annotate_lineage_mapping> parent_lineage;
 
-      if (file_in_parent->content == file_in_child->content)
+      if (file_in_parent_content == file_in_child_content)
         {
           L(FL("parent file identical, "
               "set copied all mapped and copy lineage\n"));
@@ -794,9 +865,9 @@
       else
         {
           file_data data;
-          app.db.get_file_version(file_in_parent->content, data);
+          app.db.get_file_version(file_in_parent_content, data);
           L(FL("building parent lineage for parent file %s")
-           % file_in_parent->content);
+           % file_in_parent_content);
           parent_lineage
            = work_unit.lineage->build_parent_lineage(work_unit.annotations,
                                                      parent_revision,
============================================================
--- basic_io.cc a15ef15ce8808a455ee8f9a7edeeb2ba23eddc67
+++ basic_io.cc 2ae5b2e0b5c66b99efb7eb3586e8bca073b9945e
@@ -169,6 +169,13 @@
   return "TOK_UNKNOWN";
 }
 
+void
+basic_io::parser::roster_skip_to_record(const std::string &first_line)
+{
+  // Assume we are starting at the beginning of a record
+  
+}
+
 // Local Variables:
 // mode: C++
 // fill-column: 76
============================================================
--- basic_io.hh a690f850ad0a49bd2309216008f8bf931bd0b017
+++ basic_io.hh 9d10c0fb2eae613507b13afd008fcfcf87c68dff
@@ -304,6 +304,7 @@
                : (std::string(" with value ") + token)));
       advance();
     }
+    void roster_skip_to_record(const std::string &first_line);
   };
 
 }
============================================================
--- database.cc 2722674ada7b591827618ca848ea6f76d4326c19
+++ database.cc f953667f1da998183b199dc2083e40b6778b45f7
@@ -1229,9 +1229,12 @@
       app->finish(tmp);
       dat = data(tmp);
 
-      hexenc<id> final;
-      calculate_ident(dat, final);
-      I(final == ident);
+      extern bool global_slow_assertions_version_check;
+      if (global_slow_assertions_version_check) {
+        hexenc<id> final;
+        calculate_ident(dat, final);
+        I(final == ident);
+      }
     }
   vcache.insert(ident, dat);
 }
============================================================
--- monotone.cc f975c3c25c7731b7eb727854d9ec4e5ae964fca4
+++ monotone.cc 0e7576fb57495f79f8ef7b68d2a1438074c02950
@@ -252,6 +252,21 @@
   return string();
 }
 
+// commands.cc is still slow to do the annotation on; spends time in
+// extend_path_if_not_cycle; database.cc also a little slow for a while;
+// same with netsync.cc (spends time in epinc also, and
+// piecewise_applicator::copy)
+// otherwise spread over most of the files, it's about 65% time in the parsing
+// code, mostly in the skip to the next record chunk of the code.
+
+// Base 20.10s, 20.30, 20.15 (all slow assertiosn true)
+bool global_slow_assertions_version_check = false; // 5.94s, 5.96s, 5.87s
+bool global_slow_assertions_roster_check = false; // 20.22, 20.19
+bool global_slow_assertions_hexenc_verify = false; // 20.14s, 20.11s
+// All slow assertions false: 5.88, 5.74, 5.86
+
+bool global_disable_atomic_unique = false;
+
 int
 cpp_main(int argc, char ** argv)
 {
@@ -705,3 +720,4 @@
 // indent-tabs-mode: nil
 // End:
 // vim: et:sw=2:sts=2:ts=2:cino=>2s,{s,\:s,+s,t0,g0,^-2,e-2,n-2,p2s,(0,=s:
+
============================================================
--- roster.cc   76aa75696d056c98a890b0b0bb6c1164363a919e
+++ roster.cc   71a51993420f179145dddaefed35857f6215b1bd
@@ -808,9 +808,14 @@
   out = oss.str();
 }
 
+extern bool global_slow_assertions_roster_check;
+
 void
 roster_t::check_sane(bool temp_nodes_ok) const
 {
+  if (global_slow_assertions_roster_check == false) {
+    return;
+  }
   I(has_root());
   node_map::const_iterator ri;
 
@@ -2290,7 +2295,6 @@
   }
 }
 
-
 static void
 push_marking(basic_io::stanza & st,
              node_t curr,
@@ -2457,6 +2461,173 @@
   return n;
 }
 
+static void
+skip_to_next_record(basic_io::input_source &src)
+{
+  src.col = 1;
+  // records start at two newlines in a row
+  while(LIKELY(src.curr != src.in.end())) {
+    if (*src.curr == '\n') {
+      ++src.line;
+      if ((src.curr+1) != src.in.end() && *(src.curr+1) == '\n') {
+        ++src.curr;
+        ++src.curr;
+        ++src.line;
+        break;
+      }
+    }
+    ++src.curr;
+  }
+}
+
+static void
+skip_to_record_key(basic_io::input_source &src)
+{
+  while(LIKELY(src.curr != src.in.end())) {
+    if (*src.curr == ' ') {
+      // skip
+    } else if (isalpha(*src.curr)) {
+      break;
+    } else {
+      I(false);
+    }
+    ++src.curr;
+  }
+}
+
+static bool
+next_str_is(basic_io::input_source &src, const string &str)
+{
+  std::string::const_iterator i = src.curr;
+  std::string::const_iterator j = str.begin();
+  while(j != str.end()) {
+    if (i == src.in.end() || *i != *j) {
+      return false;
+    } 
+    ++i;
+    ++j;
+  }
+  src.curr = i;
+  return true;
+}
+
+static void
+skip_to_next_line(basic_io::input_source &src)
+{
+  while(LIKELY(src.curr != src.in.end())) {
+    if (*src.curr == '\n') {
+      ++src.curr;
+      break;
+    }
+    ++src.curr;
+  }
+  ++src.line;
+}
+
+
+bool
+roster_get_revision_fid_info(const roster_data &dat,
+                             node_id const &file_id_int,
+                             file_t &file,
+                             marking_t &marks)
+{
+  // there has to be a better way
+  char buf[30];
+  sprintf(buf,"\"%u\"",file_id_int);
+  string file_id_token(buf);
+
+  basic_io::input_source src(dat.inner()(), "roster");
+  basic_io::tokenizer tok(src);
+  basic_io::parser pars(tok);
+  {
+    pars.esym(syms::format_version);
+    string vers;
+    pars.str(vers);
+    I(vers == "1");
+  }
+
+  // Make sure we're at the beginning we expect to be at...
+  I(pars.token == "dir");
+  I(src.lookahead == ' ');
+  src.advance();
+  I(src.lookahead == '"');
+  src.advance();
+  I(src.lookahead == '"');
+  src.advance();
+  I(src.lookahead == '\n');
+  skip_to_next_record(src);
+
+  while(src.curr != src.in.end()) {
+    std::string::const_iterator save_record_start = src.curr;
+    skip_to_record_key(src);
+    I(src.curr != src.in.end());
+    if (next_str_is(src,"dir ")) {
+      // skip
+    } else if (next_str_is(src,"file ")) {
+      skip_to_next_line(src);
+      skip_to_record_key(src);
+      bool ok = next_str_is(src,"content ");
+      I(ok);
+      skip_to_next_line(src);
+      skip_to_record_key(src);
+      ok = next_str_is(src,"ident ");
+      I(ok);
+      if (next_str_is(src,file_id_token)) {
+        src.curr = save_record_start;
+        src.line -= 2; // backup by file, content
+        pars.advance();
+
+        file_t n;
+        if (pars.symp(syms::file))
+          {
+            string content, pth, ident;
+            pars.sym();
+            pars.str(pth);
+            pars.esym(syms::content);
+            pars.hex(content);
+            pars.esym(syms::ident);
+            pars.str(ident);
+            n = file_t(new file_node(read_num(ident),
+                                     file_id(content)));
+          } 
+        else 
+          {
+            I(false);
+          }
+        I(static_cast<bool>(n));
+        while(pars.symp(syms::attr))
+          {
+            pars.sym();
+            string k, v;
+            pars.str(k);
+            pars.str(v);
+            safe_insert(n->attrs, make_pair(attr_key(k),
+                                            make_pair(true, attr_value(v))));
+          }
+
+        // Dormant attrs
+        while(pars.symp(syms::dormant_attr))
+          {
+            pars.sym();
+            string k;
+            pars.str(k);
+            safe_insert(n->attrs, make_pair(attr_key(k),
+                                            make_pair(false, attr_value())));
+          }
+        parse_marking(pars, n, marks);
+        file = n;
+        return true;
+      }
+    } else {
+      I(false);
+    }
+    skip_to_next_record(src);
+  }
+  I(src.curr == src.in.end());
+  return false;
+}
+
+
 void
 roster_t::parse_from(basic_io::parser & pa,
                      marking_map & mm)
============================================================
--- vocab.cc    b770a2970961886842d185bb6aa627ca20300994
+++ vocab.cc    3b4e297acee0e4dedd899c2a22fc29b779998cb9
@@ -52,9 +52,15 @@
   val.ok = true;
 }
 
+extern bool global_slow_assertions_hexenc_verify;
+
 inline void
 verify(hexenc<id> & val)
 {
+  if (global_slow_assertions_hexenc_verify == false) {
+    return;
+  }
+
   if (val.ok)
     return;
 
@@ -161,12 +167,12 @@
   val.ok = true;
 }
 
-
+extern bool global_disable_atomic_unique;
 // Note that ATOMIC types each keep a static symbol-table object and a
 // counter of activations, and when there is an activation, the
 // members of the ATOMIC type initialize their internal string using a
 // copy of the string found in the symtab. Since some (all?) C++
-// string implementations are copy-on-write, this has the affect
+// string implementations are copy-on-write, this has the effect
 // of making the ATOMIC(foo) values constructed within a symbol table
 // scope share string storage.
 struct
@@ -178,6 +184,9 @@
   void clear() { vals.clear(); }
   string const & unique(string const & in)
   {
+    if (global_disable_atomic_unique) {
+      return in;
+    }
     // This produces a pair <iter,bool> where iter points to an
     // element of the table; the bool indicates whether the element is
     // new, but we don't actually care. We just want the iter.

[Prev in Thread]

Current Thread

[Next in Thread]

[Monotone-devel] Improving the performance of annotate, Eric Anderson, 2006/07/18
- Re: [Monotone-devel] Improving the performance of annotate, Daniel Carosone, 2006/07/18
  - [Monotone-devel] Re: Improving the performance of annotate, Graydon Hoare, 2006/07/19
    - Re: [Monotone-devel] Re: Improving the performance of annotate, Nathaniel Smith, 2006/07/19
    - [Monotone-devel] Re: Improving the performance of annotate, Graydon Hoare, 2006/07/19
- [Monotone-devel] Improving the performance of annotate, Eric Anderson <=
  - Re: [Monotone-devel] Improving the performance of annotate, Nathaniel Smith, 2006/07/19
- Re: [Monotone-devel] Improving the performance of annotate, Eric Anderson, 2006/07/19
  - [Monotone-devel] Re: Improving the performance of annotate, Graydon Hoare, 2006/07/19
- Re: [Monotone-devel] Improving the performance of annotate, Eric Anderson, 2006/07/19

Prev by Date: [Monotone-devel] Improving the performance of annotate
Next by Date: Re: [Monotone-devel] Patch to add Composite Benchmarks and some simple scalable add/commit tests
Previous by thread: [Monotone-devel] Re: Improving the performance of annotate
Next by thread: Re: [Monotone-devel] Improving the performance of annotate
Index(es):
- Date
- Thread