Counters, MSRs (model-specific registers)
https://easyperf.net/blog/2018/06/01/PMU-counters-and-profiling-basics
Counting vs. Sampling
Counting
- disable counting
- set all the counters to 0
- configure the events that we want to measure
- enable counting
- run the application
- disable counting
- read the values of the counters
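The counting steps above can be sketched with a toy model. This is only an illustration: `ToyCounter` and `count_events` are hypothetical names, and a real PMU is programmed through MSRs or a kernel interface rather than a Python object.

```python
class ToyCounter:
    """Toy model of one programmable counter (not a real PMU interface)."""

    def __init__(self):
        self.value = 0
        self.event = None
        self.enabled = False

    def tick(self, event):
        # Increment only while enabled and only for the configured event.
        if self.enabled and event == self.event:
            self.value += 1


def count_events(counter, event, workload):
    counter.enabled = False   # disable counting
    counter.value = 0         # set the counter to 0
    counter.event = event     # configure the event we want to measure
    counter.enabled = True    # enable counting
    for ev in workload:       # run the application
        counter.tick(ev)
    counter.enabled = False   # disable counting
    return counter.value      # read the value of the counter


# Hypothetical event stream standing in for a running application.
workload = ["instruction", "cycle", "cycle", "instruction", "cycle"]
print(count_events(ToyCounter(), "instruction", workload))
print(count_events(ToyCounter(), "cycle", workload))
```

The result is a single total per event for the whole run, with no information about where in the program the events happened.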
Sampling
- set counter to 0
- enable counting
- wait for the overflow and disable counting when it happens
- inside the interrupt handler capture IP, registers state, etc.
- repeat the process
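The sampling loop above can be sketched the same way. This is a toy simulation under stated assumptions: `sample_ips` is a hypothetical helper, the counter is only 3 bits wide so that it overflows quickly, and the "interrupt handler" is just an append to a list.

```python
def sample_ips(trace, width_bits=3, initial=0):
    """Return the instruction pointers captured at each counter overflow.

    trace: sequence of instruction pointers, one per counted event.
    width_bits: assumed counter width (tiny here so overflow happens fast).
    initial: value the counter is set to before each sampling round.
    """
    limit = 1 << width_bits       # counter overflows when it reaches this
    samples = []
    counter = initial             # set counter (0 in the steps above)
    for ip in trace:              # counting enabled, one event per entry
        counter += 1
        if counter == limit:      # overflow detected
            samples.append(ip)    # "interrupt handler": capture the IP
            counter = initial     # reset the counter and repeat
    return samples


# Hypothetical trace of 20 consecutive instruction addresses.
trace = list(range(0x400000, 0x400000 + 20))
print([hex(ip) for ip in sample_ips(trace)])
print(len(sample_ips(trace, initial=4)))
```

Starting from a non-zero `initial` shortens the distance to overflow, which is how the sampling period is chosen.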
To control the sampling period, we set the counter's initial value to 0xFFFFFFFF - 0x19999999, where 0x19999999 represents the number of clock ticks in 100 ms. In this case we receive an interrupt after 0x19999999 clocks.
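The arithmetic behind that initial value can be checked directly. A minimal sketch, assuming a 32-bit counter as in the note; `initial_value` is a hypothetical helper name, and the clock frequency is derived from the note's numbers rather than stated by the source.

```python
COUNTER_MAX = 0xFFFFFFFF      # assumed 32-bit counter, as in the note
PERIOD_TICKS = 0x19999999     # clock ticks in one 100 ms sampling period
PERIOD_SECONDS = 0.100


def initial_value(period_ticks, counter_max=COUNTER_MAX):
    # Preload the counter so that it is `period_ticks` below its maximum.
    return counter_max - period_ticks


start = initial_value(PERIOD_TICKS)
print(hex(start))                       # value to program into the counter
print(PERIOD_TICKS)                     # the period as a decimal tick count
print(PERIOD_TICKS / PERIOD_SECONDS)    # clock rate implied by the period, in Hz
```

0x19999999 ticks per 100 ms implies a clock of roughly 4.29 GHz. Whether the interrupt fires exactly at 0x19999999 ticks or one tick later depends on whether overflow means reaching the maximum or wrapping past it, which this sketch does not model.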
CPL_CYCLES.RING0_TRANS = number of times the halt state was entered?
[Number of intervals between processor halts while thread is in ring 0.]
uops_retired.total_cycles = cpu_clk_unhalted.thread = cycles
uops_retired.stall_cycles = cycle_activity.cycles_no_execute
cycle_activity.cycles_no_execute
[This event increments by 1 for every cycle where there was no execute for this thread]
Summary:
CLKS = cpu_clk_unhalted.thread = cycles
[Per-thread actual clocks when the logical processor is active]
CPI = inst_retired.any, cycles
[Cycles Per Instruction (threaded)]
CPU_Utilization = cpu_clk_unhalted.ref_tsc, msr/tsc/
[Average CPU Utilization]
Instructions = inst_retired.any = instructions
[Total number of retired Instructions]
Kernel_Utilization = cpu_clk_unhalted.ref_tsc:u, cpu_clk_unhalted.ref_tsc
[Fraction of cycles spent in Kernel mode]
SMT_2T_Utilization = cpu_clk_thread_unhalted.one_thread_active, cpu_clk_thread_unhalted.ref_xclk_any
[Fraction of cycles where both hardware threads were active]
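Each Summary entry names a metric and the events it is built from. As a rough sketch of how two of them turn into numbers: CPI is cycles divided by retired instructions by definition, and CPU_Utilization is assumed here to be `cpu_clk_unhalted.ref_tsc` divided by the TSC delta. The function names and counter values below are hypothetical, not measurements.

```python
def cpi(cycles, inst_retired_any):
    # Cycles Per Instruction: cycles spent per retired instruction.
    return cycles / inst_retired_any


def cpu_utilization(ref_tsc_unhalted, tsc):
    # Assumed ratio: unhalted reference cycles over elapsed TSC ticks.
    return ref_tsc_unhalted / tsc


# Hypothetical counter readings for one measurement interval.
counts = {
    "cycles": 2_000_000,
    "inst_retired.any": 1_000_000,
    "cpu_clk_unhalted.ref_tsc": 1_500_000,
    "msr/tsc/": 3_000_000,
}

print(cpi(counts["cycles"], counts["inst_retired.any"]))
print(cpu_utilization(counts["cpu_clk_unhalted.ref_tsc"], counts["msr/tsc/"]))
```

The remaining metrics are left out because their exact expressions are not given in these notes.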
Note: HLT instruction != stall cycle. The HLT instruction puts the logical core into the halt state, whereas stall cycles count the bubbles that occur in the pipeline.
Note: The HLT instruction requires ring 0 privilege.
Note: Two instructions put the CPU into the halt state: HLT and MWAIT.
Counts the number of thread cycles while the thread is not in a halt state. The thread enters the halt state when it is running the HLT instruction. The core frequency may change from time to time due to power or thermal throttling.
https://www.masterraghu.com/subjects/np/introduction/unix_network_programming_v1.3/ch06lev1sec2.html