monotone-devel

Re: [Monotone-devel] serialization format


From: Markus Wanner
Subject: Re: [Monotone-devel] serialization format
Date: Tue, 5 Apr 2016 18:25:21 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.6.0

On 04/04/2016 10:02 PM, Ludovic Brenta wrote:
> No but they might care about performance.  How much of monotone's time
> is actually spent translating between binary and hex?  Is this really a
> major performance bottleneck?

Well, no, not the conversion between hex and binary itself. What matters
is the effect the serialization format has on hashing.

Let's have a look at some perf samples gathered during a functional test
run:

> #
> # Overhead  Shared Object          Symbol
> # ........  .....................  ..........................................
> #
>      6.80%  libbotan-1.10.so.1.10  [.] _ZN5Botan12SHA_160_SSE210compress_nEPKhm
>      3.74%  libc-2.21.so           [.] _int_free
>      2.60%  libstdc++.so.6.0.21    [.] _ZSt18_Rb_tree_incrementPKSt18_Rb_tree_node_base
>      2.24%  libstdc++.so.6.0.21    [.] _ZSt29_Rb_tree_insert_and_rebalancebPSt18_Rb_tree_node_baseS0_RS_
>      1.85%  libc-2.21.so           [.] malloc
>      1.85%  mtn                    [.] _ZNSt8_Rb_treeIN6option6optionI7optionsEES3_St9_IdentityIS3_ESt4lessIS3_ESaIS3_EE7_M_copyINS9_20_Reuse_or_alloc
>      1.73%  mtn                    [.] _ZSt11__set_unionISt23_Rb_tree_const_iteratorIN6option6optionI7optionsEEES5_St15insert_iteratorISt3setIS4_St4le
>      1.66%  ld-2.21.so             [.] do_lookup_x
>      1.57%  libcrypto.so.1.0.0     [.] DES_encrypt2
>      1.36%  libc-2.21.so           [.] __memcmp_sse4_1
>      1.17%  mtn                    [.] _ZNSt8_Rb_treeIN6option6optionI7optionsEES3_St9_IdentityIS3_ESt4lessIS3_ESaIS3_EE8_M_eraseEPSt13_Rb_tree_nodeIS
>      1.04%  libc-2.21.so           [.] free
>      1.03%  libc-2.21.so           [.] malloc_consolidate
>      0.98%  [unknown]              [k] 0xffffffff817f4ca0
>      0.75%  mtn                    [.] _ZNSt17_Function_handlerIFvP7optionsNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEEEMS0_FvS7_EE10_M_manage
>      0.71%  ld-2.21.so             [.] _dl_lookup_symbol_x
>      0.71%  mtn                    [.] _ZNSt17_Function_handlerIFvP7optionsEMS0_FvvEE10_M_managerERSt9_Any_dataRKS6_St18_Manager_operation
>      0.67%  libcrypto.so.1.0.0     [.] DES_encrypt1
>      0.64%  libbotan-1.10.so.1.10  [.] _ZN5Botan16MDx_HashFunction12final_resultEPh
>      0.64%  libgmp.so.10.2.0       [.] __gmpn_redc_1
>      0.62%  [unknown]              [k] 0xffffffff811b24fa
>      0.58%  libc-2.21.so           [.] strlen
>      0.58%  libbotan-1.10.so.1.10  [.] _ZN5Botan16MDx_HashFunction8add_dataEPKhm
>      0.57%  [unknown]              [k] 0xffffffff813d3417
...
>      0.06%  libbotan-1.10.so.1.10  [.] _ZN5Botan10hex_decodeEPhPKcmRmb
...
>      0.02%  libbotan-1.10.so.1.10  [.] _ZN5Botan10hex_encodeEPcPKhmb


Hashing is probably the single most time-consuming operation here, at
about 8% of the total: 6.80% in compress_n, plus 0.64% in final_result
and 0.58% in add_data, both of which are within the top 25 as well.

The CPU time used for the actual hex encoding and decoding is
vanishingly small, below 0.1% (0.06% for hex_decode plus 0.02% for
hex_encode).


Now, I'm clearly not into micro-optimizations. If anything, I'd rather
consider changes like using base58 instead of hex for hashes presented
to the user, an encoding that's certain to consume more CPU time, though
I'm not sure how much more.
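
To make the base58 comparison concrete, here is a minimal sketch in Python (not monotone code). It assumes the Bitcoin-style base58 alphabet, which nobody in this thread has specified, and the helper names are made up for illustration. It shows why base58 costs more CPU than hex: hex maps each byte independently, while base58 needs repeated big-integer division over the whole hash.

```python
import hashlib

# Assumption: Bitcoin-style alphabet (no 0, O, I, l); the thread names none.
B58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def base58_encode(data: bytes) -> str:
    """Encode bytes as base58; leading zero bytes become leading '1's."""
    n = int.from_bytes(data, "big")
    digits = []
    while n > 0:
        # Each step divides the whole big integer by 58.
        n, rem = divmod(n, 58)
        digits.append(B58_ALPHABET[rem])
    pad = len(data) - len(data.lstrip(b"\x00"))
    return B58_ALPHABET[0] * pad + "".join(reversed(digits))


def base58_decode(text: str) -> bytes:
    """Inverse of base58_encode."""
    n = 0
    for ch in text:
        n = n * 58 + B58_ALPHABET.index(ch)
    pad = len(text) - len(text.lstrip(B58_ALPHABET[0]))
    body = n.to_bytes((n.bit_length() + 7) // 8, "big")
    return b"\x00" * pad + body


if __name__ == "__main__":
    digest = hashlib.sha1(b"example revision").digest()  # 20 raw bytes
    print(len(digest.hex()), digest.hex())
    print(len(base58_encode(digest)), base58_encode(digest))
```

A 20-byte SHA-1 hash takes 40 characters in hex and at most 28 in base58, so what the user sees gets shorter at the price of a more expensive conversion.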

However, reducing the amount of data to be hashed, cached and moved
around (in memory, over the network, etc.) sounds like a generally good
idea to me, performance-wise. It's equally clearly a bad idea from a
usability perspective, so there's a balance to strike. That's why I
started this thread.
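
As a rough illustration of the size argument, the following Python sketch serializes the same list of SHA-1 ids once as hex text and once as raw bytes, then hashes both. The two record layouts are invented for the example and are not monotone's actual formats; the only point is that hex roughly doubles the number of bytes the hash function has to process.

```python
import hashlib


def serialize_hex(ids):
    """Hypothetical text layout: one 40-character hex id per line."""
    return b"".join(i.hex().encode("ascii") + b"\n" for i in ids)


def serialize_binary(ids):
    """Hypothetical binary layout: fixed-width 20-byte ids, no separators."""
    return b"".join(ids)


if __name__ == "__main__":
    # 1000 made-up ids standing in for file or revision hashes.
    ids = [hashlib.sha1(str(i).encode("ascii")).digest() for i in range(1000)]
    for name, blob in (("hex", serialize_hex(ids)),
                       ("binary", serialize_binary(ids))):
        # The hash input size is what drives time spent in the compress step.
        print(name, len(blob), hashlib.sha1(blob).hexdigest())
```

With 1000 ids, the hex form is 41000 bytes and the binary form 20000 bytes, so the SHA-1 compression function sees about half the input in the binary case.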

Given the arguments so far I tend towards a binary encoding, as I think
developers should be able to handle binary data. And if users really
don't care...

Regards

Markus Wanner

