dcpi2bb - Insert DCPI profile feedback data in an image file
dcpi2bb [-make_bbdb] [-counts] [-pm all] [-load_lat] [<flags>]
Dcpi2bb inserts profile feedback information into an optional part of the
executable image for subsequent use by compilers or post-link optimizers
like spike(1). If the image was
originally compiled with the -feedback option, the compiler will have
included an empty basic block data base in the executable. If the image was
not compiled for feedback, dcpi2bb can create a basic block data base before
inserting profile data.
One must gather profile samples using
dcpid(1) before using dcpi2bb. For best results, application training
runs should be representative of anticipated production runs. Note: Because
DCPI has little overhead, it is often practical to use actual production
runs to generate profile feedback information.
- -counts
- Populate the profile feedback data base with estimated execution
frequency counts for each basic block. Note: Because DCPI is a sampling
tool, there will be some statistical variation from run to run.
- Display a usage message.
- -load_lat
- Include load latency value samples (if available). These must
have been previously gathered by running
dcpid(1) in load latency value profiling mode. See
- -make_bbdb
- Create a basic block data base from the code structure and
relocations in the image.
- -scale
- By default, an estimated count of 1 denotes P executions,
where P is the average sampling period used to gather the
samples. With -scale, the estimated basic-block counts are
scaled by P.
- -trace flag
- Enable debug tracing for the specified
flag (e.g. alloc, change, check, combine,
esli, hash, io, leak, match, merge, name,
offset, profile, rank, reorder, weight).
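As a rough illustration of the -scale arithmetic described above, the following shell sketch converts an unscaled estimated count into an approximate execution count. The helper name scale_count and the sample values are hypothetical, and reading "scaled by P" as multiplication by P is an assumption; none of this is taken from dcpi2bb itself.

```sh
# scale_count COUNT PERIOD
# Hypothetical helper: multiply an unscaled estimated count by the
# average sampling period P (assumes "scaled by P" means COUNT * P).
scale_count() {
    echo $(( $1 * $2 ))
}

# Made-up values: a basic block with an estimated count of 3, sampled
# with an average period of 65536.
scale_count 3 65536    # prints 196608
```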
PROFILE SELECTION FLAGS
By default, this command automatically finds all of the relevant profile
files. The following options can be used to guide the search for the profile
files.
- -db <directory name>
- Search for profile files in the specified profile database directory.
The directory name should be the same name as the one specified when
dcpid was started. If this option is not specified, the directory name
is obtained from the DCPIDB environment variable. If neither this
option nor the DCPIDB environment variable is set, the name of
the directory used by the last invocation of dcpid on this machine
is used. If none of these methods succeeds in finding the appropriate
directory, and no explicit set of profile files is provided via the
-profiles option, then the command fails.
- -epoch latest
- Search for profile files in the latest epoch. This is the default.
- -epoch latest-k
- Search for profile files in the "k+1"th oldest epoch. For example,
search in the third oldest epoch if -epoch latest-2 is
specified.
- -epoch <name>
- Search for profile files in the named epoch. The epoch name should be
the name of a subdirectory corresponding to a single epoch within the
profile database directory. Epoch subdirectory names usually take the
form YYYYMMDDHHMM (year-month-day-hours-minutes). For example,
an epoch started on June 11, 2002 at 22:33 would be named
200206112233. If an epoch is given a symbolic name by creating a
symbolic link to the actual epoch directory, then the symbolic name can
also be used as an argument to the -epoch option.
- -epoch all
- Search for profile files in all epochs.
- -ihost <hostnames...> --
- Include just those profile files associated with the
specified host names. The list of host names must be
terminated either via -- or by the end of the option list.
The command prints an error message and fails if both the
-ihost and -ehost options are specified.
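- -ehost <hostnames...> --
- Exclude all profile files associated with the specified host
names. The list of host names must be terminated either via
-- or by the end of the option list. The command prints an
error message and fails if both the -ihost and -ehost options
are specified.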
- -label <label>
- Search for profile files with the specified label(s) (see dcpilabel(1)). This option can be repeated multiple times. If no labels are specified on the command line, profile file labels are ignored entirely. If any labels are specified on the command line, only profile files that have one of the specified labels are used.
- -profiles <file names...> --
- Use just the profile files named by the specified file names. The list of profile file names can be terminated either via --, or by the end of the option list. The command prints an error message and fails if the -profiles option is used in conjunction with any of the earlier automatic profile finding options. (Use the automatic profile lookup mechanism, or explicitly name the profile files with the -profiles option; but don't do both.)
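The search order for the profile database directory and the epoch naming scheme described above can be sketched in shell. This is only an illustration under stated assumptions: resolve_db and epoch_name are hypothetical helper names, the way dcpi2bb locates the directory of the last dcpid invocation is not described here and is therefore passed in as an argument, and the interaction with -profiles is ignored.

```sh
# resolve_db DB_FLAG_VALUE LAST_DCPID_DIR
# Hypothetical helper mirroring the documented order: the -db argument,
# then the DCPIDB environment variable, then the directory used by the
# last dcpid invocation (supplied by the caller in this sketch).
resolve_db() {
    if [ -n "${1:-}" ]; then
        echo "$1"
    elif [ -n "${DCPIDB:-}" ]; then
        echo "$DCPIDB"
    elif [ -n "${2:-}" ]; then
        echo "$2"
    else
        echo "no profile database directory found" >&2
        return 1
    fi
}

# epoch_name YEAR MONTH DAY HOUR MINUTE
# Format a start time as a YYYYMMDDHHMM epoch subdirectory name.
# Pass plain decimal numbers (6, not 06) so printf does not read
# them as octal.
epoch_name() {
    printf '%04d%02d%02d%02d%02d\n' "$1" "$2" "$3" "$4" "$5"
}

# The epoch from the example above: June 11, 2002 at 22:33.
epoch_name 2002 6 11 22 33    # prints 200206112233
```

The resulting name could then be passed to -epoch <name>, assuming an epoch with that name exists in the database directory.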
STATISTIC SELECTION FLAGS
Different kinds of performance counter statistics are available on various
models of Alpha CPUs. Alpha 21064/EV4, 21164/EV5 and 21264/EV6 CPUs have
traditional aggregate event counters. Alpha 21264A/EV67 and later processors
have a mix of some traditional aggregate event counters and newer ProfileMe
counters which allow accurate and precise instruction execution profiles on
out-of-order processors. (See
dcpiprofileme(1) for more information on ProfileMe statistics.)
The default statistic selection on an aggregate counter machine is to
select all the aggregate events. The default on a ProfileMe machine is to
select ProfileMe retire delay, retire count, !retired (i.e. aborted) count,
!notrap (i.e. trap) count, and aggregate cycles.
The options below can be used to select various statistics when available.
Use -event for aggregate statistics and -pm for ProfileMe
statistics. Note: there can be multiple, mixed -event and -pm
specifications. You can also specify the ratio of two statistics (written as
- -pm pm_stat(+pm_stat)
- Select the specified ProfileMe statistic plus any added in by optional
+pm_stat specifications. For example, select various trap statistics by
specifying the option -pm trap+replays+ldstorder+mispredict.
- -pm default(+pm_stat)
- Select the default set of ProfileMe statistics plus those added in by
+pm_stat specifications. At least one additional statistic is mandatory;
-pm default without modifications is extraneous and not allowed. The
additional ProfileMe statistics will take the place of the aggregate cycles
statistic which is selected by default.
- -pm all(-pm_stat)
- Select all ProfileMe statistics less those subtracted out. You can
repeat the optional -pm_stat specification to deselect multiple
ProfileMe statistics. Note: there are a lot of ProfileMe statistics.
Unless you deselect a bunch of them, this will select more statistics
than are appropriate for human consumption.
- -event ag_stat(+ag_stat)
- Select the specified aggregate statistic plus any added in by
optional +ag_stat specifications. For example, select cycles,
icache misses, and data cache misses when the option -event
cycles+imiss+dmiss is specified.
- -event all(-ag_stat)
- Select all aggregate statistics less those subtracted out. You can
repeat the optional -ag_stat specification to deselect multiple
aggregate statistics.
- Select profile events corresponding to all event types, both
aggregate and ProfileMe. However, if there are ProfileMe events, this
will produce a large number of statistics, which in most cases will not
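The -event and -pm arguments shown above are statistic names joined by +. When such a list is built in a script, a small helper can assemble the argument. The sketch below is a convenience for the caller only: join_stats is a hypothetical name, and it says nothing about how dcpi2bb itself parses the list.

```sh
# join_stats NAME...
# Hypothetical helper: join statistic names with '+' to form the
# argument of -event or -pm. The subshell keeps the IFS change local.
join_stats() {
    (IFS=+; printf '%s\n' "$*")
}

# The statistic names used in the examples above.
join_stats cycles imiss dmiss                     # prints cycles+imiss+dmiss
join_stats trap replays ldstorder mispredict      # prints trap+replays+ldstorder+mispredict
```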
Profiles gathered on an EV6 machine are not currently useful for profile
feedback. Because EV6 is an out-of-order machine without ProfileMe
performance counters, it is impractical to estimate basic block execution
frequencies. Use a 21264A/EV67 (or later) system to generate profile
feedback information.
- dcpid db; run image; dcpiflush
- Profile image so DCPI can collect profile feedback data.
- dcpi2bb -db db -make_bbdb -counts -pm all image
- Run dcpi2bb to insert profile feedback data into executable image.
- spike image -feedback image -o
- Run spike(1) to analyze
image based on the profile feedback data stored in image
and to create a new, optimized image.faster.
For more information, see the DCPI project home page
Hewlett-Packard Company. All rights reserved.