[Libunwind-devel] Re: [patch] Fix for race in dwarf_find_save_locs
From: Paul Pluzhnikov
Subject: [Libunwind-devel] Re: [patch] Fix for race in dwarf_find_save_locs
Date: Tue, 24 Nov 2009 11:50:53 -0800
On Tue, Nov 24, 2009 at 10:54 AM, Arun Sharma <address@hidden> wrote:
> Executing apply_reg_state with the lock held is a problem only for
> UNW_CACHE_GLOBAL.
With lock *not* held. Correct.
> How does the performance of UNW_CACHE_PER_THREAD compare
> in your tests?
In google3 tests? I don't have a good way to measure that.
I tried setting UNW_CACHE_PER_THREAD while I was debugging this
race, but that caused crashes I didn't understand; perhaps I should
revisit that.
Hmm, I don't see how it could work at all in current code :(
Doesn't using UNW_CACHE_PER_THREAD require that unw_local_addr_space
in x86*/Ginit.c be made a per-thread variable? Otherwise, all threads
will share that global, but will not lock it.
For Gperf-simple, there is no discernible difference (data below),
but it only uses one thread.
> I'm inclined to apply the more conservative fix #1 until we have more data
> on the cost of the memcpy vs using UNW_CACHE_PER_THREAD.
My concern with fix #1 is that it reduces concurrency in a hot function
(apply_reg_state) -- we have CPUs to burn!
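
To make the trade-off concrete, here is a minimal sketch of the two locking
shapes being compared. It is not the actual patch: struct reg_state,
cache_lock, cached_rs and apply_state() are hypothetical stand-ins, and I am
assuming (from the memcpy remark above) that fix #2 is the copy-based variant.

```c
#include <pthread.h>
#include <string.h>

/* Hypothetical stand-in for a cached register-state entry. */
#define NUM_REGS 4
struct reg_state {
    long reg[NUM_REGS];
};

/* Hypothetical shared cache slot and the lock that guards it. */
static pthread_mutex_t cache_lock = PTHREAD_MUTEX_INITIALIZER;
static struct reg_state cached_rs = { { 1, 2, 3, 4 } };

/* Hypothetical stand-in for the hot step (apply_reg_state in this thread). */
static long apply_state(const struct reg_state *rs)
{
    long sum = 0;
    for (int i = 0; i < NUM_REGS; i++)
        sum += rs->reg[i];
    return sum;
}

/* Shape A: keep the lock held while the state is applied.
 * Simple, but every caller serializes on the hot step. */
long apply_holding_lock(void)
{
    pthread_mutex_lock(&cache_lock);
    long result = apply_state(&cached_rs);
    pthread_mutex_unlock(&cache_lock);
    return result;
}

/* Shape B: copy the entry under the lock, then apply the private copy.
 * Costs one memcpy, but the hot step runs without the lock. */
long apply_from_copy(void)
{
    struct reg_state local;

    pthread_mutex_lock(&cache_lock);
    memcpy(&local, &cached_rs, sizeof local);
    pthread_mutex_unlock(&cache_lock);

    return apply_state(&local);
}
```

Both functions return the same value for the same cache contents; the
difference is only in how long cache_lock stays held.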
Data from running tests/Gperf-simple:
--- current ---
unw_getcontext : cold avg= 150.204 nsec, warm avg= 38.147 nsec
unw_init_local : cold avg= 259.876 nsec, warm avg= 50.068 nsec
no cache : unw_step : 1st= 1848.312 min= 1341.956 avg= 1413.348 nsec
global cache : unw_step : 1st= 390.552 min= 131.698 avg= 180.194 nsec
per-thread cache: unw_step : 1st= 390.552 min= 131.698 avg= 171.771 nsec
unw_getcontext : cold avg= 159.740 nsec, warm avg= 47.684 nsec
unw_init_local : cold avg= 278.950 nsec, warm avg= 50.068 nsec
no cache : unw_step : 1st= 1941.408 min= 1341.956 avg= 1424.901 nsec
global cache : unw_step : 1st= 342.869 min= 131.698 avg= 167.376 nsec
per-thread cache: unw_step : 1st= 304.268 min= 131.698 avg= 159.256 nsec
unw_getcontext : cold avg= 147.820 nsec, warm avg= 40.531 nsec
unw_init_local : cold avg= 259.876 nsec, warm avg= 50.068 nsec
no cache : unw_step : 1st= 1855.124 min= 1360.121 avg= 1414.406 nsec
global cache : unw_step : 1st= 429.153 min= 170.299 avg= 180.939 nsec
per-thread cache: unw_step : 1st= 304.268 min= 131.698 avg= 173.349 nsec
unw_getcontext : cold avg= 138.283 nsec, warm avg= 40.531 nsec
unw_init_local : cold avg= 259.876 nsec, warm avg= 50.068 nsec
no cache : unw_step : 1st= 1961.844 min= 1341.956 avg= 1367.185 nsec
global cache : unw_step : 1st= 390.552 min= 131.698 avg= 140.364 nsec
per-thread cache: unw_step : 1st= 267.937 min= 131.698 avg= 134.995 nsec
--- fix #1 ---
unw_getcontext : cold avg= 138.283 nsec, warm avg= 50.068 nsec
unw_init_local : cold avg= 278.950 nsec, warm avg= 50.068 nsec
no cache : unw_step : 1st= 1866.477 min= 1341.956 avg= 1409.588 nsec
global cache : unw_step : 1st= 361.034 min= 131.698 avg= 152.169 nsec
per-thread cache: unw_step : 1st= 417.800 min= 131.698 avg= 164.150 nsec
unw_getcontext : cold avg= 150.204 nsec, warm avg= 38.147 nsec
unw_init_local : cold avg= 288.486 nsec, warm avg= 50.068 nsec
no cache : unw_step : 1st= 1827.876 min= 1341.956 avg= 1396.537 nsec
global cache : unw_step : 1st= 342.869 min= 170.299 avg= 179.509 nsec
per-thread cache: unw_step : 1st= 295.185 min= 170.299 avg= 175.034 nsec
unw_getcontext : cold avg= 159.740 nsec, warm avg= 50.068 nsec
unw_init_local : cold avg= 290.871 nsec, warm avg= 50.068 nsec
no cache : unw_step : 1st= 1914.161 min= 1360.121 avg= 1415.378 nsec
global cache : unw_step : 1st= 283.832 min= 140.780 avg= 145.469 nsec
per-thread cache: unw_step : 1st= 286.102 min= 131.698 avg= 141.995 nsec
unw_getcontext : cold avg= 150.204 nsec, warm avg= 47.684 nsec
unw_init_local : cold avg= 271.797 nsec, warm avg= 47.684 nsec
no cache : unw_step : 1st= 1839.229 min= 1341.956 avg= 1428.347 nsec
global cache : unw_step : 1st= 295.185 min= 131.698 avg= 140.117 nsec
per-thread cache: unw_step : 1st= 286.102 min= 131.698 avg= 164.806 nsec
--- fix #2 ---
unw_getcontext : cold avg= 159.740 nsec, warm avg= 38.147 nsec
unw_init_local : cold avg= 278.950 nsec, warm avg= 50.068 nsec
no cache : unw_step : 1st= 2132.143 min= 1341.956 avg= 1433.247 nsec
global cache : unw_step : 1st= 381.470 min= 161.216 avg= 167.335 nsec
per-thread cache: unw_step : 1st= 351.951 min= 161.216 avg= 163.721 nsec
unw_getcontext : cold avg= 138.283 nsec, warm avg= 40.531 nsec
unw_init_local : cold avg= 259.876 nsec, warm avg= 50.068 nsec
no cache : unw_step : 1st= 1868.748 min= 1360.121 avg= 1410.736 nsec
global cache : unw_step : 1st= 351.951 min= 161.216 avg= 167.386 nsec
per-thread cache: unw_step : 1st= 304.268 min= 161.216 avg= 164.086 nsec
unw_getcontext : cold avg= 140.667 nsec, warm avg= 50.068 nsec
unw_init_local : cold avg= 269.413 nsec, warm avg= 50.068 nsec
no cache : unw_step : 1st= 1836.958 min= 1341.956 avg= 1377.523 nsec
global cache : unw_step : 1st= 390.552 min= 161.216 avg= 174.246 nsec
per-thread cache: unw_step : 1st= 447.319 min= 161.216 avg= 185.533 nsec
unw_getcontext : cold avg= 150.204 nsec, warm avg= 50.068 nsec
unw_init_local : cold avg= 290.871 nsec, warm avg= 50.068 nsec
no cache : unw_step : 1st= 1923.243 min= 1360.121 avg= 1430.753 nsec
global cache : unw_step : 1st= 324.703 min= 161.216 avg= 167.917 nsec
per-thread cache: unw_step : 1st= 283.832 min= 161.216 avg= 163.593 nsec
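
For anyone who wants to compare the three sections numerically, a small
sketch that averages the "global cache" unw_step avg= column over the four
runs in each section. The values are copied from the output above; the array
names and mean() are just illustrative helpers, not part of the test suite.

```c
#include <stddef.h>

/* "global cache : unw_step ... avg=" values, one per run, copied from
 * the three sections of this message. */
static const double avg_current[] = { 180.194, 167.376, 180.939, 140.364 };
static const double avg_fix1[]    = { 152.169, 179.509, 145.469, 140.117 };
static const double avg_fix2[]    = { 167.335, 167.386, 174.246, 167.917 };

/* Arithmetic mean of n samples; returns 0.0 for an empty array. */
double mean(const double *v, size_t n)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += v[i];
    return n ? sum / (double)n : 0.0;
}
```

Call it as mean(avg_current, 4), mean(avg_fix1, 4) and mean(avg_fix2, 4).
With only four single-threaded runs per section, and run-to-run spread within
a section of up to about 40 nsec, the differences between the means should
not be read as significant.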
--
Paul Pluzhnikov