dcpivarg - A tool for examining value profile information on Alpha
21064/EV4 and 21164/EV5 systems
dcpivarg [options] [-p] [procedure-name] image-file
Dcpivarg is a multi-purpose tool for analyzing value profile data.
It performs one operation per invocation, chosen from the following list
and selected by specifying the corresponding flag:
-arg_full ; Print arguments to procedures
-arg_terse ; Tersely print arguments to procedures
-ret_full ; Print procedure return reg info
-ret_save ; Print first instruction that saves the ra register
-ret_terse ; [DEFAULT] Tersely print procedure call graph info
-inv_blocks ; Print value-invariant blocks
-top_loads ; Show loads that stall
Note that most of this information is also available from
dcpilist(1) with the -asm -vprof options.
Dcpivarg accepts all flags that
dcpivlst(1) accepts. Please see the dcpivlst documentation for
descriptions of those flags.
- -arg_full
- Operation: Tries to find and print value profile data for the
procedure argument registers (a0 to a5). The operation scans the first few
basic blocks of the procedure, namely those forming a straight-line segment
at the start of the procedure, for instructions that save the argument
registers. Any such instructions that are found are printed in the format
of dcpivlst(1). Sample output follows. The percentage before the procedure
name is the share of the image's total cycles that fall in this procedure.
5.45% power (../super.c):
cycles vtot thld nv
8 0x120099efc cpys $f16,$f16,$f4 3087 232.6 13
19593 0x120099f00 cpys $f17,$f17,$f3 4355 1.0 14 f3: (36.1% 8) ...
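The sample lines above can be post-processed with a small script. The following is a minimal sketch, not part of DCPI: the function name parse_listing_line and the regular expression are assumptions inferred only from the two sample lines, and the field names are simply the column headers shown in the sample. Real dcpivlst(1) output may contain variants this pattern does not cover.

```python
import re

# Assumed layout, inferred from the sample: cycles, address, instruction text,
# then the vtot, thld and nv columns, optionally followed by "reg: (pct% value)".
LINE_RE = re.compile(
    r"^\s*(?P<cycles>\d+)\s+(?P<addr>0x[0-9a-fA-F]+)\s+(?P<insn>.+?)\s+"
    r"(?P<vtot>\d+)\s+(?P<thld>\d+\.\d+)\s+(?P<nv>\d+)"
    r"(?:\s+(?P<reg>\w+):\s+\((?P<pct>\d+(?:\.\d+)?)%\s+(?P<value>[^\s)]+)\))?"
)


def parse_listing_line(line):
    """Return a dict for one instruction line, or None if the line does not match."""
    m = LINE_RE.match(line)
    if m is None:
        return None
    record = {
        "cycles": int(m.group("cycles")),
        "addr": int(m.group("addr"), 16),
        "insn": m.group("insn"),
        "vtot": int(m.group("vtot")),
        "thld": float(m.group("thld")),
        "nv": int(m.group("nv")),
        "top": None,  # first (register, percent, value) entry, if one is printed
    }
    if m.group("reg") is not None:
        # The value is kept as text because the samples show several notations.
        record["top"] = (m.group("reg"), float(m.group("pct")), m.group("value"))
    return record


if __name__ == "__main__":
    sample = "19593 0x120099f00 cpys $f17,$f17,$f3 4355 1.0 14 f3: (36.1% 8) ..."
    print(parse_listing_line(sample))
```

Header lines such as "cycles vtot thld nv" and the procedure summary line do not match the pattern, so the function returns None for them and a caller can skip them.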
- -arg_terse
- Operation: This is a one-line alternate output format for the
same information as computed by the -arg_full operation. For each
argument register only the percentage of the most commonly seen value
profile data value is shown, i.e., the first value in the sorted value/count
5.45% power (../super.c): 0.0% 36.1%
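To illustrate what "the percentage of the most commonly seen value" means, here is a minimal sketch. It assumes that the percentage is the top value's count divided by the sum of all counts for that register, which the text does not state explicitly. The function name top_value_percent and the counts in the example are hypothetical, chosen only so that the result is 36.1%.

```python
def top_value_percent(value_counts):
    """Percentage of samples taken by the most frequent value.

    value_counts is a list of (value, count) pairs; it need not be sorted.
    An empty profile yields 0.0.
    """
    total = sum(count for _, count in value_counts)
    if total == 0:
        return 0.0
    top = max(count for _, count in value_counts)
    return 100.0 * top / total


if __name__ == "__main__":
    # Hypothetical counts, not taken from any real profile.
    profile = [(8, 361), (3, 339), (1, 300)]
    print("%.1f%%" % top_value_percent(profile))
```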
- -ret_full
- Operation: This operation uses the same methods as -arg_full
to find value profile data for the return address register ra. The
output format is the same as in -arg_full, as shown below.
18.73% power (../super.c):
cycles vtot thld nv
109135 0x120087518 stq ra, 0(sp) 2668 1.0 6 ra: (26.4% 0x120087488) ...
- -ret_save
- Operation: This operation is an alternate output format for
-ret_full. It prints the procedure name and any instructions found to
save the return address register ra on a single line. Sample output
is shown below.
5.45% power (../super.c): 0x120099ee8 stq ra, 0(sp)
- -ret_terse
- Default operation: This operation uses the value profile data
for the return address register ra to approximate a call graph for
frequently used procedures. The call graph is computed using the values for
ra collected using the same methods as in -ret_full. Also see
the -ra_image flag below. Sample output is shown below.
power 24.1% from evaluate_superellipsoid(0x12000e460)
power 0.3% from Superellipsoid_Normal(0x120099f58)
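Lines in this format can be collected into a caller table with a few lines of script. The sketch below is an illustration only: the function name parse_call_edges and the regular expression are assumptions based on the two sample lines above, not on any documented output grammar.

```python
import re
from collections import defaultdict

# Assumed line shape: "<callee> <pct>% from <caller>(<hex address>)".
EDGE_RE = re.compile(
    r"^\s*(?P<callee>\S+)\s+(?P<pct>\d+(?:\.\d+)?)%\s+from\s+"
    r"(?P<caller>[^\s(]+)\((?P<site>0x[0-9a-fA-F]+)\)\s*$"
)


def parse_call_edges(text):
    """Map each callee to a list of (caller, percent, address) tuples."""
    callers = defaultdict(list)
    for line in text.splitlines():
        m = EDGE_RE.match(line)
        if m is None:
            continue  # skip anything that does not look like an edge line
        callers[m.group("callee")].append(
            (m.group("caller"), float(m.group("pct")), int(m.group("site"), 16))
        )
    return dict(callers)


if __name__ == "__main__":
    sample = (
        "power 24.1% from evaluate_superellipsoid(0x12000e460)\n"
        "power 0.3% from Superellipsoid_Normal(0x120099f58)\n"
    )
    for callee, edges in parse_call_edges(sample).items():
        for caller, pct, site in edges:
            print("%s <- %s (%.1f%%, %#x)" % (callee, caller, pct, site))
```

Because the graph is built from sampled values of ra, it is an approximation, as the description above notes.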
- -ra_image <F>
- This flag allows the -ret_terse operation to look up procedure
names in an image other than that whose value profile is being reported.
This allows correct reporting of call sites for images such as shared
- -inv_blocks
- Operation: This operation outputs
dcpivlst(1)-like listings for frequently executed basic blocks with
high value invariance. Two cutoffs decide whether a block is listed.
The frequency cutoff is a percentage of total program cycle samples;
it is set by -inv_ccut, with a default of 0.1%. The invariance cutoff
applies to the percentage of the most significant value in the value
profile data, averaged over the basic block; it is set by -inv_vcut,
with a default of 80%. Sample output is shown below. The percentage
before the procedure name is the share of total cycle samples, over the
execution of the program, that fall in that procedure, and the percentage
in parentheses is the share of total cycles that fall in this basic block.
5.45% power (../super.c): (0.7%)
cycles cpi vtot thld nv
126810 8.4cy 0x120099f38 ldq at, -30776(gp) 1822 1.0 1 at: (100.0% 0x12000e460)
46688 3.1cy 0x120099f3c s4addq s0, at, at 717 1.0 1 at: (100.0% 0x12000e464)
15942 1.1cy 0x120099f40 ldl at, 0(at) 330 1.0 1 at: (100.0% 0xffffffffe00793a8)
50115 3.3cy 0x120099f44 addq at, gp, at 728 1.0 1 at: (100.0% 0x120099f58)
14396 0.9cy 0x120099f48 jmp zero, (at), 0x120099f4c 381 1.0 1 at: (100.0% 0x120099f58)
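The two cutoffs can be pictured as a simple filter. The following is a sketch under stated assumptions and does not reproduce dcpivarg's implementation: the Block type and the is_listed function are hypothetical, the comparison is assumed to be inclusive, and the average is assumed to be taken over the instructions in the block that carry value profile data. The default values 0.1 and 80.0 are the documented defaults of -inv_ccut and -inv_vcut.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Block:
    cycles: int                  # cycle samples attributed to the block
    top_value_pcts: List[float]  # top-value percentage per profiled instruction


def is_listed(block, total_cycles, inv_ccut=0.1, inv_vcut=80.0):
    """Return True if the block passes both the cycle and invariance cutoffs."""
    if total_cycles <= 0 or not block.top_value_pcts:
        return False
    cycle_pct = 100.0 * block.cycles / total_cycles
    avg_top_pct = sum(block.top_value_pcts) / len(block.top_value_pcts)
    # Inclusive comparison is an assumption; the text does not specify it.
    return cycle_pct >= inv_ccut and avg_top_pct >= inv_vcut


if __name__ == "__main__":
    # Hypothetical numbers, for illustration only.
    hot_invariant = Block(cycles=7000, top_value_pcts=[100.0] * 5)
    hot_variable = Block(cycles=7000, top_value_pcts=[36.1, 50.0])
    cold_invariant = Block(cycles=5, top_value_pcts=[100.0])
    for b in (hot_invariant, hot_variable, cold_invariant):
        print(is_listed(b, total_cycles=1000000))
```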
- -top_loads
- Operation: Prints load instructions that stall and an
approximation of the stall cycles incurred by them. The output shows the
percentage of total program cycles and the percentage of procedure stall
cycles accounted for by the stall on the load. Below is sample output
showing a load whose stall accounts for 0.5% of total program cycles,
while the address is the same approximately 80% of the time.
9.22% polyeval (../polysolv.c):
total stall vtot thld nv
0.49685% 39.3% 0x12006ed74 ldt $f1, 24(a2) 1509 88.2 15 f1: (0.1% 5.95031) ...
a2: (80.0% 0x11fffd110) ...
- Prints the percentage of cycles accounted for by a procedure before
the name of that procedure. This allows -allprocs output to be
Dcpivarg works only on Alpha 21064/EV4 and 21164/EV5 processors. For Alpha
21264/EV6 and later processors, use
For more information, see the DCPI project home page
Hewlett-Packard Company. All rights reserved.