HP DCPI tool

dcpiprof(1)

NAME

dcpiprof - Analyze profile data collected by dcpid

SYNOPSIS

dcpiprof [-i] [-keep percentage] [-p image-file-name] [-m map-file-name] [-no_header] [image-names...]

DESCRIPTION

Dcpiprof summarizes a set of profile files by printing a histogram of the number of samples per image or per procedure. The output is sorted by decreasing number of samples found within that image or procedure. Each entry in the listing is annotated with the number of samples, the percentage of samples that belong to this entry, and a cumulative percentage value.
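As an illustration of how each entry is annotated, the following Python sketch computes the per-entry percentage and the cumulative percentage from raw sample counts. It is not dcpiprof's implementation; the function name histogram and the counts in the usage lines are invented for the example.

```python
def histogram(samples):
    """Return (name, count, percent, cumulative percent) rows, largest count first.

    `samples` maps an image or procedure name to its sample count.
    """
    total = sum(samples.values())
    rows = []
    running = 0
    # Sort by decreasing sample count, as the listing does.
    for name, count in sorted(samples.items(), key=lambda kv: kv[1], reverse=True):
        running += count
        percent = 100.0 * count / total if total else 0.0
        cumulative = 100.0 * running / total if total else 0.0
        rows.append((name, count, percent, cumulative))
    return rows


if __name__ == "__main__":
    # Hypothetical sample counts, for illustration only.
    for name, count, percent, cumulative in histogram({"a": 600, "b": 300, "c": 100}):
        print(f"{count:>8} {percent:6.2f}% {cumulative:6.2f}% {name}")
```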

If one or more image names are specified on the command line, then only the profile files corresponding to the named images are used to generate the output. The output will be per procedure unless the -i option is specified. If no image names are listed on the command line, then dcpiprof reads all profile files found in the profile database and behaves as if the -i option had been specified, producing output sorted by the number of samples found within each image.

Dcpiprof sometimes reports that it could not open some image files. In such cases, you can help dcpiprof locate the appropriate image files either by using the -p option to specify the name of an image file of interest, or by using the -m option to supply an image map generated by dcpiscan(1).

FLAGS

-i
Lists samples collected per image instead of samples per procedure.

-keep p
Lists just enough top routines to account for the top p percent of the samples of the event type used to sort the profile output. The value p must be a floating point number in the range [0..100].

-p image-file-name
Use the specified image file as a candidate when associating profiles with named image files. This option can be repeated, allowing several image names to be specified on the command line.

-m map-file-name
Use the specified map file generated by dcpiscan(1) for associating profiles with named images. At install time, a default map is constructed for binaries found in the usual locations. Specifying additional maps allows dcpiprof to find site-specific binaries. This option can be repeated, allowing several map files to be specified; information from all of the supplied map files is merged. The -m option works like the -p option, except that instead of specifying one image at a time, a whole set of images can be entered into a map file via dcpiscan(1) and the entire set can be specified with one command line option.

-cnt_bops | -cnt_fops | -cnt_iops | -cnt_mops | -cnt_nops | -cnt_pops
Attempt to add a column tallying the { branch | floating | integer | memory | NOP | PALcode } operations in each procedure. Tallying the operations requires ProfileMe retire counts to be present in the listing (as they normally are by default). If aggregate cycle counts are also present, this flag adds a line to the heading showing the operations-per-second rate. Note: For large images, this option runs slowly. Also, it currently does not work across multiple images.

-no_header
Do not print any header in the output. This option may be useful for programs that parse the output of dcpiprof.
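The cutoff applied by the -keep flag can be sketched in Python as follows. This is a minimal illustration, not dcpiprof's code: the function name keep_top is hypothetical, and the behavior at the exact boundary (stopping as soon as the running total reaches p percent) is an assumption.

```python
def keep_top(counts, p):
    """Return just enough of the largest counts to account for p percent of the total."""
    if not 0.0 <= p <= 100.0:
        raise ValueError("p must be in the range [0..100]")
    total = sum(counts)
    kept = []
    running = 0
    for count in sorted(counts, reverse=True):
        # Assumption: stop once the entries kept so far already cover p percent.
        if total == 0 or 100.0 * running / total >= p:
            break
        kept.append(count)
        running += count
    return kept


if __name__ == "__main__":
    # Hypothetical per-procedure sample counts.
    print(keep_top([50, 30, 20], 60))
```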

PROFILE SELECTION FLAGS

By default, this command automatically finds all of the relevant profile files. The following options can be used to guide the search for the profile files.

-db <directory name>
Search for profile files in the specified profile database directory. The directory name should be the same as the one specified when dcpid was started. If this option is not specified, the directory name is obtained from the DCPIDB environment variable. If neither this option nor the DCPIDB environment variable is set, the name of the directory used by the last invocation of dcpid on this machine is used. If none of these methods succeeds in finding the appropriate directory, and no explicit set of profile files is provided via the -profiles option, then the command fails.

-epoch latest
Search for profile files in the latest epoch. This is the default.

-epoch latest-k
Search for profile files in the epoch that is k epochs before the latest one, that is, the "k+1"th most recent epoch. For example, -epoch latest-2 searches the third most recent epoch.

-epoch <name>
Search for profile files in the named epoch. The epoch name should be the name of a subdirectory corresponding to a single epoch within the profile database directory. Epoch subdirectory names usually take the form YYYYMMDDHHMM (year-month-day-hours-minutes). For example, an epoch started on June 11, 2002 at 22:33 would be named 200206112233. If an epoch is given a symbolic name by creating a symbolic link to the actual epoch directory, then the symbolic name can also be used as an argument to the -epoch option.

-epoch all
Search for profile files in all epochs.

-ihost <hostnames...> --
Include just those profile files associated with the specified host names. The list of host names must be terminated either via -- or by the end of the option list. The command prints an error message and fails if both the -ihost and -ehost options are specified.

-ehost <hostnames...> --
Exclude any profile files associated with the specified host names. The list of host names must be terminated either via -- or by the end of the option list. The command prints an error message and fails if both the -ihost and -ehost options are specified.

-label <label>
Search for profile files with the specified label(s) (see dcpilabel(1)). This option can be repeated multiple times. If no labels are specified on the command line, profile file labels are ignored entirely. If any labels are specified on the command line, only profile files that have one of the specified labels are used.

-profiles <file names...> --
Use just the profile files named by the specified file names. The list of profile file names can be terminated either via -- or by the end of the option list. The command prints an error message and fails if the -profiles option is used in conjunction with any of the earlier automatic profile finding options. (Use the automatic profile lookup mechanism, or explicitly name the profile files with the -profiles option, but do not do both.)
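The lookup rules for -db and -epoch can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names and the last_dcpid_dir parameter are hypothetical (this page does not say how the directory of the last dcpid run is recorded), every epoch name is assumed to be in YYYYMMDDHHMM form, and latest-k is read as counting back k epochs from the most recent one.

```python
import os
import re


def resolve_db_dir(db_option=None, environ=None, last_dcpid_dir=None):
    """Pick the profile database directory in the documented order of precedence."""
    environ = os.environ if environ is None else environ
    # 1. -db option, 2. DCPIDB environment variable,
    # 3. directory used by the last dcpid invocation (hypothetical parameter).
    for candidate in (db_option, environ.get("DCPIDB"), last_dcpid_dir):
        if candidate:
            return candidate
    raise LookupError("no profile database directory could be determined")


def select_epochs(selector, epoch_names):
    """Resolve an -epoch argument against a list of epoch subdirectory names."""
    # Assumption: YYYYMMDDHHMM names, which sort chronologically as strings.
    ordered = sorted(epoch_names)
    if selector == "all":
        return ordered
    match = re.fullmatch(r"latest(?:-(\d+))?", selector)
    if match:
        k = int(match.group(1) or 0)
        if k >= len(ordered):
            raise LookupError(f"fewer than {k + 1} epochs available")
        # Assumption: latest-k means k epochs before the most recent one.
        return [ordered[-1 - k]]
    if selector in ordered:
        return [selector]
    raise LookupError(f"unknown epoch {selector!r}")


if __name__ == "__main__":
    # Only 200206112233 comes from this page; the other names are made up.
    epochs = ["200206112233", "200206100900", "200206120101"]
    print(select_epochs("latest-2", epochs))
```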

STATISTIC SELECTION FLAGS

Different kinds of performance counter statistics are available on various models of Alpha CPUs. Alpha 21064/EV4, 21164/EV5 and 21264/EV6 CPUs have traditional aggregate event counters. Alpha 21264A/EV67 and later processors have a mix of some traditional aggregate event counters and newer ProfileMe counters which allow accurate and precise instruction execution profiles on out-of-order processors. (See dcpiprofileme(1) for more information on ProfileMe statistics.)

The default statistic selection on an aggregate counter machine is to select all the aggregate events. The default on a ProfileMe machine is to select ProfileMe retire delay, retire count, !retired (i.e. aborted) count, !notrap (i.e. trap) count, and aggregate cycles.

The options below can be used to select various statistics when available. Use -event for aggregate statistics and -pm for ProfileMe statistics. Note: there can be multiple, mixed -event and -pm specifications. You can also specify the ratio of two statistics (written as stat1::stat2).

-pm pm_stat(+pm_stat)
Select the specified ProfileMe statistic plus any added in by optional +pm_stat specifications. For example, select various trap statistics by specifying the option -pm trap+replays+ldstorder+mispredict.

-pm default(+pm_stat)
Select the default set of ProfileMe statistics plus those added in by +pm_stat specifications. At least one additional statistic is mandatory; -pm default without modifications is redundant and not allowed. The additional ProfileMe statistics take the place of the aggregate cycles statistic, which is selected by default.

-pm all(-pm_stat)
Select all ProfileMe statistics less those subtracted out. You can repeat the optional -pm_stat specification to deselect multiple ProfileMe statistics. Note: there are many ProfileMe statistics. Unless you deselect most of them, this option selects more statistics than a person can usefully read.

-event ag_stat(+ag_stat)
Select the specified aggregate statistic plus any added in by optional +ag_stat specifications. For example, select cycles, icache misses, and data cache misses when the option -event cycles+imiss+dmiss is specified.

-event all(-ag_stat)
Select all aggregate statistics less those subtracted out. You can repeat the optional -ag_stat specification to deselect multiple aggregate statistics.

-allevents
Select profile events corresponding to all event types, both aggregate and ProfileMe. However, if there are ProfileMe events, this will produce a large number of statistics, which in most cases will not be useful.
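To make the shape of the -event and -pm arguments concrete, here is a small Python sketch that splits a "+"-joined specification such as cycles+imiss+dmiss into statistic names and splits a stat1::stat2 ratio into its two parts. The function names are hypothetical, the sketch does not check names against any real list of statistics, and it deliberately leaves out the all and default forms.

```python
def parse_stat_spec(spec):
    """Split 'a+b+c' into ['a', 'b', 'c'], rejecting empty names."""
    names = spec.split("+")
    if any(not name for name in names):
        raise ValueError(f"empty statistic name in {spec!r}")
    return names


def parse_ratio(spec):
    """Split 'stat1::stat2' into the pair ('stat1', 'stat2')."""
    parts = spec.split("::")
    if len(parts) != 2 or not all(parts):
        raise ValueError(f"expected stat1::stat2, got {spec!r}")
    return parts[0], parts[1]


if __name__ == "__main__":
    print(parse_stat_spec("cycles+imiss+dmiss"))
    print(parse_stat_spec("trap+replays+ldstorder+mispredict"))
    print(parse_ratio("imiss::cycles"))
```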

PROFILE SORTING FLAGS

[-s | -st | -sp] statistic
The named statistic is used to sort the profile output. For -s or -st, the statistic is assumed to be an aggregate event; for -sp it is assumed to be a ProfileMe statistic. If this option is not specified, the output is sorted so that the procedure or image that accounts for the most cycles is listed first; if the database contains ProfileMe statistics, the sort key is valid:retdelay, and otherwise the sort key is cycles. If neither of these statistics appears in the output, the first column in the output is used as the sort key. If this option is specified, the statistic specified as the sort key does not need to be listed explicitly in a -event or -pm specification; it will be included automatically.
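The choice of sort key described above can be sketched in Python as follows. This is one reading of the rules, not the tool's actual logic: the function name and parameters are hypothetical, and the fallback is simplified to "use the first output column when the preferred statistic is not in the output".

```python
def choose_sort_key(requested, has_profileme, output_columns):
    """Return the statistic used to sort the listing.

    requested       -- statistic named with -s, -st or -sp, or None
    has_profileme   -- True if the database contains ProfileMe statistics
    output_columns  -- statistics that appear in the output, in column order
    """
    if requested is not None:
        # An explicit sort key is included in the output automatically.
        return requested
    preferred = "valid:retdelay" if has_profileme else "cycles"
    if preferred in output_columns:
        return preferred
    # Assumption: otherwise fall back to the first column of the output.
    return output_columns[0]


if __name__ == "__main__":
    print(choose_sort_key(None, False, ["cycles", "imiss"]))
    print(choose_sort_key(None, False, ["imiss", "dmiss"]))
```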

EXAMPLE USAGE

dcpiprof
Use dcpiprof to analyze the breakdown of CPU time across all images that contribute to the contents of the profile database.

dcpiprof <image names...>
Use dcpiprof to analyze the breakdown of CPU time across all procedures for the specified images.

dcpiprof -keep 99.99 ...
Stop the output after accounting for 99.99% of the samples.

INTERPRETING THE OUTPUT

Dcpiprof prints a header, followed by a number of lines of output. If per-image profiles are being produced, there is a line per image, and the last column in the line is the name of the image. Otherwise there is a line per procedure and the last two columns contain the name of the procedure and the image to which the procedure belongs.

For example, consider the following output:

  Event            Samples  Period
  -----            -------  ------
  cycles       21761024463   63488
  imiss         1943063555    4096
  
  The counts given below are the number of samples for each
  listed event type.
  ==========================================================
      cycles       %    cum%      imiss       % procedure       image
  9479311336  43.56%  43.56%   94570129   4.87% idle_thread     /vmunix
  3093399786  14.22%  57.78%  359058745  18.48% _XentInt        /vmunix
  2982861812  13.71%  71.48%   32386524   1.67% gh_zero_memory  /vmunix
  ...

This provides information on two different types of events: cycles events and imiss events (i.e., instruction cache misses).

The header gives the list of reported events, the total number of samples recorded per event type, and the sampling period for each event. (For example, 1943063555 samples of type imiss were recorded, with each sample accounting for 4096 imisses on average.)
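Taking the header description literally, an approximate event total can be obtained by multiplying the sample count by the sampling period. The Python sketch below does this for the two header rows of the example; the function name is hypothetical, and the result is only an estimate because the period is an average.

```python
def estimated_events(samples, period):
    """Approximate number of events represented by `samples` samples."""
    # Each sample stands for `period` events on average.
    return samples * period


# (samples, period) pairs copied from the example header.
HEADER = {
    "cycles": (21761024463, 63488),
    "imiss": (1943063555, 4096),
}

if __name__ == "__main__":
    for event, (samples, period) in HEADER.items():
        print(event, estimated_events(samples, period))
```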

The first three columns in each line describe the event used to sort the dcpiprof output (cycles by default). The first of these columns lists the number of event samples that fell within this image or procedure (e.g., 9479311336 within idle_thread). The second column lists these samples as a percentage of the total number of samples of this event type listed in dcpiprof's output (e.g., 14.22% of all cycles samples in dcpiprof's output fell within _XentInt). The third column gives the cumulative percentage for this line and all lines above it (e.g., the top three procedures in the example account for 71.48% of the cycles samples).

The remaining columns report the number of samples of other, secondary event types. There are two such columns per secondary event type. The first column lists the number of samples of that type (e.g., 94570129 imiss samples for idle_thread). The second column lists this number as a percentage of the total number of samples of that type listed in dcpiprof's output (e.g., 18.48% of all imiss samples in dcpiprof's output fell within _XentInt).
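For programs that consume this listing, the column layout described above can be read with a short Python sketch such as the following. It handles only per-procedure lines, and it assumes that fields are separated by whitespace and that procedure and image names contain no spaces; the function and key names are hypothetical.

```python
def _percent(field):
    """Convert a field such as '43.56%' to the float 43.56."""
    return float(field.rstrip("%"))


def parse_procedure_row(line):
    """Parse one per-procedure line of dcpiprof output into a dict."""
    fields = line.split()
    # The last two columns are the procedure name and its image.
    procedure, image = fields[-2], fields[-1]
    numbers = fields[:-2]
    if len(numbers) < 3 or (len(numbers) - 3) % 2 != 0:
        raise ValueError(f"unexpected column layout: {line!r}")
    # After the three sort-key columns come (count, percent) pairs,
    # one pair per secondary event type.
    secondary = [
        (int(numbers[i]), _percent(numbers[i + 1]))
        for i in range(3, len(numbers), 2)
    ]
    return {
        "samples": int(numbers[0]),
        "percent": _percent(numbers[1]),
        "cum_percent": _percent(numbers[2]),
        "secondary": secondary,
        "procedure": procedure,
        "image": image,
    }


if __name__ == "__main__":
    row = "  3093399786  14.22%  57.78%  359058745  18.48% _XentInt        /vmunix"
    print(parse_procedure_row(row))
```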

SEE ALSO

dcpi(1), dcpi2bb(1), dcpi2pix(1), dcpi2ps(1), dcpicalc(1), dcpicat(1), dcpicc(1), dcpicoverage(1), dcpictl(1), dcpid(1), dcpidiff(1), dcpidis(1), dcpiepoch(1), dcpiflow(1), dcpiflush(1), dcpikdiff(1), dcpilabel(1), dcpildlatency(1), dcpilist(1), dcpiprofileme(1), dcpiquit(1), dcpiscan(1), dcpisource(1), dcpistats(1), dcpisumxct(1), dcpitar(1), dcpitopcounts(1), dcpitopstalls(1), dcpiuninstall(1), dcpiupcalls(1), dcpivarg(1), dcpivcat(1), dcpiversion(1), dcpivlst(1), dcpivprofiler(1), dcpiwhatcg(1), dcpix(1), dcpiformat(4), dcpiexclusions(4)

For more information, see the DCPI project home page http://h30097.www3.hp.com/dcpi.

COPYRIGHT

Copyright 1996-2004, Hewlett-Packard Company. All rights reserved.