perf

Readup

Wiki.

General

A Linux kernel tool that uses advanced methods to monitor both kernel and userspace performance. Available through linuxPackages.perf in nixpkgs. Some commands require specially configured kernels and thus may not be available.

Events

Events are provided by various sources. They can contain sub-events and modifiers.

Software events

Software events are pure kernel counters, e.g. context-switches and minor-faults.

PMU Hardware events

Generated by the Performance Monitoring Unit (PMU), a part of the processor that measures events such as cycles, L1 misses, and retired instructions. The available events depend on the processor.

Hardware cache events

A handful of monikers for common processor events, provided by the perf_events interface.

Tracepoint events

Events provided by the kernel's ftrace infrastructure.

Multiplexing

When there are more events than hardware counters, the kernel cannot measure every event all the time. Instead, it measures each event in intervals and scales the count once the tool has finished running. In such situations perf therefore displays only an estimate of the number of events emitted. When a count has been scaled, perf prints next to it the percentage of the run during which the event was actually measured.
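The scaling step can be sketched in a few lines of Python. This is only an illustration, assuming the usual extrapolation of the raw count by the ratio of enabled time to running time; the names `scale_count` and `running_percent` and the sample numbers are hypothetical and not taken from perf itself.

```python
def scale_count(raw_count: int, time_enabled: float, time_running: float) -> float:
    """Extrapolate a multiplexed counter to the full measurement window.

    raw_count    -- events counted while the event held a hardware counter
    time_enabled -- total time the event was enabled
    time_running -- time the event was actually being counted
    """
    if time_running <= 0:
        raise ValueError("event was never scheduled on a counter")
    return raw_count * time_enabled / time_running


def running_percent(time_enabled: float, time_running: float) -> float:
    """Share of the enabled time during which the event was counted."""
    return 100.0 * time_running / time_enabled


# Hypothetical numbers: the event was counted for 5 out of 6 time slices.
estimate = scale_count(raw_count=1_000_000, time_enabled=6.0, time_running=5.0)
percent = running_percent(time_enabled=6.0, time_running=5.0)
print(f"{estimate:.0f} ({percent:.2f}%)")  # prints: 1200000 (83.33%)
```

The estimate is only as good as the assumption that the event rate during the unmeasured slices matches the measured ones.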

$ perf stat -B -e cycles,cycles,cycles,cycles,cycles,cycles stress --cpu 4 -t 1s
stress: info: [2448427] dispatching hogs: 4 cpu, 0 io, 0 vm, 0 hdd
stress: info: [2448427] successful run completed in 1s
 Performance counter stats for 'stress --cpu 4 -t 1s':
    14,965,192,076      cycles:u                                                                (83.33%)
    14,963,395,776      cycles:u                                                                (83.32%)
    14,964,864,137      cycles:u                                                                (83.31%)
    14,963,907,082      cycles:u                                                                (83.37%)
    14,964,520,486      cycles:u                                                                (83.34%)
    14,965,279,701      cycles:u                                                                (83.36%)
       1.001521732 seconds time elapsed
       3.963546000 seconds user
       0.000998000 seconds sys

The reported estimates seem to become progressively less accurate the more heavily scaling is used.
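To put a number on how far the six estimates above drift apart, the counter lines can be parsed and compared. The following is a rough sketch: the regular expression and the helper name `parse_counters` are assumptions fitted to this particular output layout, which may differ between perf versions.

```python
import re

# Counter lines copied from the perf stat output above.
SAMPLE = """\
    14,965,192,076      cycles:u        (83.33%)
    14,963,395,776      cycles:u        (83.32%)
    14,964,864,137      cycles:u        (83.31%)
    14,963,907,082      cycles:u        (83.37%)
    14,964,520,486      cycles:u        (83.34%)
    14,965,279,701      cycles:u        (83.36%)
"""

# count, event name, percentage of time the event was measured
LINE_RE = re.compile(r"^\s*([\d,]+)\s+(\S+)\s+\(([\d.]+)%\)\s*$")


def parse_counters(text):
    """Return (event, scaled_count, percent_measured) for each counter line."""
    rows = []
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match:
            count = int(match.group(1).replace(",", ""))
            rows.append((match.group(2), count, float(match.group(3))))
    return rows


rows = parse_counters(SAMPLE)
counts = [count for _, count, _ in rows]
spread = max(counts) - min(counts)
relative = 100.0 * spread / min(counts)
print(f"{len(rows)} estimates, spread {spread} events ({relative:.4f}% of the smallest)")
```

Because all six events are the same `cycles` counter, any difference between the rows reflects the estimation error introduced by multiplexing rather than a difference in the workload.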

Commands