One of the best open-source options is Google Benchmark.
You write simple wrappers around the code you want to benchmark and link against the benchmark library, either statically or dynamically. It is often useful to keep such micro-benchmarks next to your code and build them together with it. For inspiration, see this awesome presentation.
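#include <benchmark/benchmark.h>

// F and D are the two functions under test, defined elsewhere.
static void BM_F(benchmark::State& state) {
  // range(0) and range(1) are the two values passed via Args({...}) below.
  const auto input1 = state.range(0);
  const auto input2 = state.range(1);
  // Only the body of this loop is timed.
  for (auto _ : state) F(input1, input2);
}
static void BM_D(benchmark::State& state) {
  const auto input1 = state.range(0);
  const auto input2 = state.range(1);
  for (auto _ : state) D(input1, input2);
}
// Run each benchmark with three pairs of inputs.
BENCHMARK(BM_F)
    ->Args({1, 10})
    ->Args({10, 100})
    ->Args({100, 1000});
BENCHMARK(BM_D)
    ->Args({1, 10})
    ->Args({10, 100})
    ->Args({100, 1000});
// Generates main(), which runs all registered benchmarks.
BENCHMARK_MAIN();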
If you want to measure raw CPU cycles, your only option is to read a hardware counter directly with CPU instructions. On x86 you can use the Time Stamp Counter (TSC).
But be aware that such measurements are not robust against context switches performed by the OS or against the thread migrating between CPUs. In such situations your only option is to use an algorithm with a single flow of execution: remember the CPU ID and the TSC value before entering the function under test, and check the CPU ID again after it returns. If the CPU ID is unchanged, calculate the difference between the two TSC values; otherwise discard the sample. You can also set the CPU affinity of your process to pin it to a specific CPU.
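The procedure above can be sketched roughly as follows. This is a minimal illustration, not a definitive implementation: it assumes GCC or Clang on x86 Linux, the helper names `PinToCpu` and `MeasureTicks` are made up for this example, and it relies on the Linux convention that the auxiliary value returned by `__rdtscp` identifies the current CPU. Note also that on modern processors the TSC usually ticks at a constant reference rate, so the result is in TSC ticks rather than actual core clock cycles.

```cpp
#include <cstdint>
#include <optional>
#include <sched.h>      // sched_setaffinity, cpu_set_t (Linux/glibc)
#include <x86intrin.h>  // __rdtscp (GCC/Clang on x86)

// Hypothetical helper: pin the calling thread to a single CPU.
// Returns true on success.
inline bool PinToCpu(int cpu) {
  cpu_set_t set;
  CPU_ZERO(&set);
  CPU_SET(cpu, &set);
  return sched_setaffinity(0, sizeof(set), &set) == 0;
}

// Hypothetical helper: run fn once and return the elapsed TSC ticks,
// or std::nullopt if the thread appears to have moved to another CPU.
template <typename Fn>
std::optional<std::uint64_t> MeasureTicks(Fn&& fn) {
  unsigned int cpu_before = 0;
  unsigned int cpu_after = 0;
  // __rdtscp returns the TSC and stores IA32_TSC_AUX in its argument.
  // Assumption: the OS keeps a per-CPU identifier there (Linux does).
  const std::uint64_t start = __rdtscp(&cpu_before);
  fn();
  const std::uint64_t stop = __rdtscp(&cpu_after);
  if (cpu_before != cpu_after) return std::nullopt;  // migrated: discard
  return stop - start;
}
```

A caller would typically pin the thread first, for example with `PinToCpu(sched_getcpu())`, then call `MeasureTicks` many times and ignore the samples that come back empty. A context switch that returns to the same CPU is not detected by this check, so outliers still need to be filtered.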
Another option, specific to Linux, is to benchmark functions with the perf tool.
In any case, every measurement method adds some error of its own to the result.
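One way to get a feel for that error is to time an empty function with the same harness and compare it with the real measurement. The following is only a sketch under assumptions: it uses `std::chrono::steady_clock` instead of the TSC for portability, and the names `SampleNanos` and `Median` are invented for this example.

```cpp
#include <algorithm>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: time fn `runs` times and return the per-run
// durations in nanoseconds, sorted in ascending order.
template <typename Fn>
std::vector<std::int64_t> SampleNanos(Fn&& fn, std::size_t runs) {
  using Clock = std::chrono::steady_clock;
  std::vector<std::int64_t> samples;
  samples.reserve(runs);
  for (std::size_t i = 0; i < runs; ++i) {
    const auto start = Clock::now();
    fn();
    const auto stop = Clock::now();
    samples.push_back(
        std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start)
            .count());
  }
  std::sort(samples.begin(), samples.end());
  return samples;
}

// Hypothetical helper: middle element of a sorted, non-empty vector
// (the upper of the two middle elements when the size is even).
inline std::int64_t Median(const std::vector<std::int64_t>& sorted) {
  return sorted[sorted.size() / 2];
}
```

Calling `SampleNanos([] {}, 1000)` gives a rough estimate of the overhead of the harness itself. If the median for the function under test is not clearly larger than that, the function is too short to be measured this way, and it is better to time many calls in one loop, which is what Google Benchmark does.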