NAME
dcpi2pix - Translate DCPI profile data to pixie format
SYNOPSIS
dcpi2pix [<flags>] [-scale] [-version] image-file
DESCRIPTION
Dcpi2pix processes an image file and one or more DCPI profiles for that
image and produces a pixie-style output file (*.Counts) containing the
relative execution counts for each basic block in the image.
The execution counts are estimated from cycles samples using heuristics as
in dcpicalc(1). When a basic block's estimated execution count has too low
a confidence, dcpi2pix arbitrarily writes a count of 0 for the block in the
Counts file. Users will typically want to use the -conf_low option to include
low-confidence estimates.
Before running dcpi2pix, one must gather cycles samples using
dcpid(1) and construct an Addrs file using pixie.
Since DCPI is a sampling-based profiling system and pixie produces exact
basic block execution counts through instrumentation, profiles produced by
dcpi2pix might confuse downstream tools. In particular, if a downstream tool
depends on having exact counter values, with basic block counts that satisfy
the flow constraints of the control flow graph, then it might become
confused when confronted by a dcpi2pix-generated Counts file, which will
not have this property (see BUGS, below, for an example). If the tools use
the basic block counts only to determine which parts of the program were
frequently executed, however, the output produced by dcpi2pix should work.
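The flow-constraint issue can be illustrated with a small Python sketch. It checks one necessary condition that exact counts always satisfy: a non-entry block cannot run more often than all of its predecessors combined. The function name, the four-block graph, and the counts are hypothetical and exist only for this illustration; they do not come from dcpi2pix or pixie.

```python
def flow_violations(counts, edges, entry):
    """Return the blocks whose count exceeds the summed count of their predecessors.

    With exact (instrumented) counts, every execution of a non-entry block
    follows an execution of one of its predecessors, so the list is empty.
    Sampled estimates carry no such guarantee.
    """
    preds = {block: [] for block in counts}
    for src, dst in edges:
        preds[dst].append(src)
    return [
        block
        for block in counts
        if block != entry and counts[block] > sum(counts[p] for p in preds[block])
    ]


# Hypothetical diamond-shaped control flow graph: A branches to B or C, both join at D.
edges = [("A", "B"), ("A", "C"), ("B", "D"), ("C", "D")]

exact = {"A": 10, "B": 4, "C": 6, "D": 10}    # consistent, pixie-style counts
sampled = {"A": 9, "B": 0, "C": 7, "D": 11}   # made-up sampling-based estimates

print(flow_violations(exact, edges, "A"))
print(flow_violations(sampled, edges, "A"))
```

A tool that assumes the first result for every input may misbehave on data resembling the second.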
Pixie files contain an embedded CPU type in the profile data. Dcpi2pix
takes this CPU type from the machine on which the first specified sample
file was gathered.
FLAGS
- -scale
- By default, an estimated count of 1 denotes P executions, where P is
the average sampling period used to gather the cycles samples. With -scale,
the estimated basic-block counts are scaled by P.
- -version
- Print the dcpi2pix version string.
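The arithmetic behind -scale can be sketched in a few lines of Python. This assumes that "scaled by P" means each estimate is multiplied by P, which follows from a count of 1 denoting P executions. The function name, the sample counts, and the period value are hypothetical.

```python
def scale_counts(estimated_counts, sampling_period):
    """Multiply each estimated basic-block count by the average sampling period P."""
    return [count * sampling_period for count in estimated_counts]


# Hypothetical unscaled estimates for four basic blocks and a hypothetical period P.
unscaled = [0, 1, 3, 12]
P = 65536

print(scale_counts(unscaled, P))
```

Blocks with an estimate of 0 stay at 0 either way, so scaling does not recover blocks that were dropped for low confidence.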
PROFILE SELECTION FLAGS
By default, this command automatically finds all of the relevant profile
files. The following options can be used to guide the search for the profile
files:
- -db <directory name>
- Search for profile files in the specified profile database directory.
The directory name should be the same name as the one specified when
dcpid was started. If this option is not specified, the directory name
is obtained from the DCPIDB environment variable. If neither this
option nor the DCPIDB environment variable is set, the name of
the directory used by the last invocation of dcpid on this machine
is used. If none of these methods succeeds in finding the appropriate
directory, and no explicit set of profile files is provided via the
-profiles option, then the command fails.
- -epoch latest
- Search for profile files in the latest epoch. This is the default.
- -epoch latest-k
- Search for profile files in the (k+1)th most recent epoch. For example,
search in the third most recent epoch if -epoch latest-2 is specified.
- -epoch <name>
- Search for profile files in the named epoch. The epoch name should be
the name of a subdirectory corresponding to a single epoch within the
profile database directory. Epoch subdirectory names usually take the
form YYYYMMDDHHMM (year-month-day-hours-minutes). For example,
an epoch started on June 11, 2002 at 22:33 would be named
200206112233. If an epoch is given a symbolic name by creating a
symbolic link to the actual epoch directory, then the symbolic name can
also be used as an argument to the -epoch option.
- -epoch all
- Search for profile files in all epochs.
- -ihost <hostnames...> --
- Include just those profile files associated with the
specified host names. The list of host names must be
terminated either via -- or by the end of the option list.
The command prints an error message and fails if both the
-ihost and -ehost options are specified.
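- -ehost <hostnames...> --
- Exclude all profile files associated with the
specified host names. The list of host names must be
terminated either via -- or by the end of the option list.
The command prints an error message and fails if both the
-ihost and -ehost options are specified.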
- -label <label>
- Search for profile files with the specified label(s) (see dcpilabel(1)). This option can be repeated multiple times. If no labels are specified on the command line, profile file labels are ignored entirely. If any labels are specified on the command line, only profile files that have one of the specified labels are used.
- -profiles <file names...> --
- Use just the profile files named by the specified file names. The list of profile file names can be terminated either via -- or by the end of the option list. The command prints an error message and fails if the -profiles option is used in conjunction with any of the earlier automatic profile finding options. (Use the automatic profile lookup mechanism, or explicitly name the profile files with the -profiles option; but don't do both.)
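Two of the rules above lend themselves to a short Python sketch: the YYYYMMDDHHMM epoch naming convention, and the way the host and label options narrow the set of profile files. The function names and the (host, label, path) tuple layout are hypothetical. The -ehost behavior is assumed to be the mirror image of -ihost. None of this reflects how the DCPI tools are actually implemented.

```python
from datetime import datetime


def epoch_dir_name(start):
    """Format an epoch start time using the YYYYMMDDHHMM convention."""
    return start.strftime("%Y%m%d%H%M")


def select_profiles(profiles, labels=(), ihost=(), ehost=()):
    """Filter hypothetical (host, label, path) tuples and return the kept paths."""
    if ihost and ehost:
        # Mirrors the documented failure when both host options are given.
        raise ValueError("-ihost and -ehost cannot both be specified")
    selected = []
    for host, label, path in profiles:
        if ihost and host not in ihost:
            continue
        if ehost and host in ehost:
            continue
        # With no labels requested, labels are ignored entirely.
        if labels and label not in labels:
            continue
        selected.append(path)
    return selected


# The epoch from the text: June 11, 2002 at 22:33.
print(epoch_dir_name(datetime(2002, 6, 11, 22, 33)))

# Made-up profile records for two hosts.
profiles = [
    ("alpha1", "run1", "p1"),
    ("alpha2", "run2", "p2"),
    ("alpha1", None, "p3"),
]
print(select_profiles(profiles, labels=("run1",)))
print(select_profiles(profiles, ehost=("alpha1",)))
```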
STATISTIC SELECTION FLAGS
Different kinds of performance counter statistics are available on various
models of Alpha CPUs. Alpha 21064/EV4, 21164/EV5 and 21264/EV6 CPUs have
traditional aggregate event counters. Alpha 21264A/EV67 and later processors
have a mix of some traditional aggregate event counters and newer ProfileMe
counters which allow accurate and precise instruction execution profiles on
out-of-order processors. (See dcpiprofileme(1) for more information on
ProfileMe statistics.)
The default statistic selection on an aggregate counter machine is to
select all the aggregate events. The default on a ProfileMe machine is to
select ProfileMe retire delay, retire count, !retired (i.e. aborted) count,
!notrap (i.e. trap) count, and aggregate cycles.
The options below can be used to select various statistics when available.
Use -event for aggregate statistics and -pm for ProfileMe
statistics. Note: there can be multiple, mixed -event and -pm
specifications. You can also specify the ratio of two statistics (written as
- -pm pm_stat(+pm_stat)
- Select the specified ProfileMe statistic plus any added in by optional
+pm_stat specifications. For example, select various trap statistics by
specifying the option -pm trap+replays+ldstorder+mispredict.
- -pm default(+pm_stat)
- Select the default set of ProfileMe statistics plus those added in by
+pm_stat specifications. At least one additional statistic is mandatory;
-pm default without modifications is extraneous and not allowed. The
additional ProfileMe statistics will take the place of the aggregate cycles
statistic which is selected by default.
- -pm all(-pm_stat)
- Select all ProfileMe statistics less those subtracted out. You can
repeat the optional -pm_stat specification to deselect multiple
ProfileMe statistics. Note: there are a lot of ProfileMe statistics.
Unless you deselect a bunch of them, this will select more statistics
than are appropriate for human consumption.
- -event ag_stat(+ag_stat)
- Select the specified aggregate statistic plus any added in by
optional +ag_stat specifications. For example, select cycles,
icache misses, and data cache misses when the option -event
cycles+imiss+dmiss is specified.
- -event all(-ag_stat)
- Select all aggregate statistics less those subtracted out. You can
repeat the optional -ag_stat specification to deselect multiple
aggregate statistics.
- Select profile events corresponding to all event types, both
aggregate and ProfileMe. However, if there are ProfileMe events, this
will produce a large number of statistics, which in most cases will not
EXECUTION COUNT AND STALL ANALYSIS FLAGS
The following options can be used to control the heuristics for estimating
execution counts and identifying the causes of stalls.
- -conf_low
- Generate low, medium, and high confidence data.
- Generate medium and high confidence data. (default)
- Generate only high confidence data.
- -cross_procedure [optimistic | pessimistic | selective]
- Choose what assumption to make when a procedure call
boundary is encountered while looking for reasons to explain
dynamic stalls. A procedure call boundary is either a call
made by the procedure being analyzed or the beginning or end
of that procedure. With pessimistic, assume that
whatever happens outside the analyzed procedure can cause a
dynamic stall inside it. With optimistic, assume that
it cannot. With selective, the assumption is based on
standard procedure call convention. (The default is
- Use a (non-linear time) constraint solver to exploit
global flow constraints when estimating execution counts.
The estimates may still violate flow constraints.
- -tab foo.tab
- Get execution counts from output of
dcpix(1) instead of making estimates,
which may be inaccurate. Requires a .xct
which may be
- -xct_factor num
- Scales counts from .xct files by num. Useful when you run a program once under dcpix(1) but multiple (num) times under dcpid(1) to get more samples. Used in conjunction with -tab and -xct.
BUGS
Using pixstats(1) with profiles generated by dcpi2pix is known to give
incorrect procedure-level information. Pixstats relies on having exact
basic-block counts and only checks to see if a basic block starts a new
procedure when the basic block's count is bigger than 0. If the entry basic
block for a procedure does not accrue any samples during profiling using
DCPI, then the basic block's effects will be incorrectly attributed to a
previous procedure in the image.
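The misattribution can be reproduced with a toy model in Python. This is only a sketch of the behavior described above, not pixstats code: the function name and the block list are hypothetical, and each block is represented as a pair holding the name of the procedure it starts (or None) and its count, in address order.

```python
def attribute_to_procedures(blocks):
    """Sum block counts per procedure, looking for a procedure start only
    when a block's count is greater than 0."""
    totals = {}
    current = None
    for starts_procedure, count in blocks:
        if count > 0 and starts_procedure is not None:
            current = starts_procedure
        if current is not None:
            totals[current] = totals.get(current, 0) + count
    return totals


# Exact counts: the entry block of "bar" has a nonzero count, so "bar" is seen.
exact = [("foo", 5), (None, 5), ("bar", 3), (None, 3), (None, 3)]

# Sampled estimates: the entry block of "bar" accrued no samples.
sampled = [("foo", 5), (None, 5), ("bar", 0), (None, 3), (None, 3)]

print(attribute_to_procedures(exact))
print(attribute_to_procedures(sampled))
```

In the second case the counts of the later blocks in "bar" are credited to "foo", and "bar" does not appear at all.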
EXAMPLE
- pixie image
- Run pixie(1) over executable image, producing the file image.Addrs
containing the address of each basic block in image, and the pixified
executable file image.pixie. The image.pixie file can be
deleted since it is not needed by dcpi2pix.
- dcpid db; run image; dcpiflush
- Get cycles samples for image.
- dcpi2pix -conf_low image -db db
- Run dcpi2pix over executable image and cycles samples in
db, producing the pixie-format file image.Counts.
Note that dcpi2pix automatically loads the image.Addrs
file generated by pixie.
- pixstats image
- Run pixstats(1) over executable image and image.Counts,
generating a program execution analysis report.
For more information, see the DCPI project home page
Hewlett-Packard Company. All rights reserved.