HP DCPI tool








The DCPI system consists of a set of tools that provides low-overhead continuous profiling of all executables, including the kernel. It is based on periodic sampling using the Alpha performance counter hardware. Profiles containing samples for each executable image (including shared libraries) are stored in a user-specified directory.

Tools are provided to display profiles and produce a breakdown of all cpu time by image, and by procedure within images. In addition, detailed information can be produced showing the time spent executing each source line and each instruction in a procedure.

Support is provided for some automated analysis for Alpha 21064/EV4 and Alpha 21164/EV5 based systems, including the presentation of possible reasons for static and dynamic stalls.

For more information, see the DCPI home page http://h30097.www3.hp.com/dcpi/.


dcpiscan(1)

Scans filesystem directories to find executables and associate them with filesystem pathnames. If you have significant executables in unusual directories, you should create a map of those image pathnames.

dcpid(1)

Continuous profiling daemon. Extracts raw samples from the kernel device driver, associates them with executable images, and stores them in profiles on disk.

dcpiprof(1)

Displays profile data collected by dcpid. Produces a breakdown of CPU time by image, or by procedure within images.

dcpilist(1)

Lists the contents of a procedure and annotates the listing with samples collected during profiling via dcpid. The listing can contain source lines, machine instructions, or both. When possible, the average number of cycles required to execute each instruction or source line is also produced.

dcpitopcounts(1)

Produces a sorted list of the instructions (and their source line numbers) accounting for the greatest number of samples of a specified event type.

dcpistats(1)

Compares multiple sets of raw sample counts and prints various statistics about them. dcpistats is useful for comparing variations across multiple runs of the same program, or for comparing differences between slightly different versions of a program.

Converts DCPI profile data to a profile feedback file which is stored in a given executable. This can be used for compilation with feedback or by post-link optimizers like spike(1).

dcpictl(1)

Controls the operation of dcpid. This subsumes dcpiepoch, dcpiflush, and dcpiquit (which are still provided for backward compatibility). Includes the ability to notify the daemon about specific images loaded into processes when necessary (e.g., when an image is loaded via mmap).

dcpiepoch(1)

Starts a new profiling epoch. All samples are associated with a time interval called an epoch. The analysis tools typically operate on a set of profiles from a single epoch.

dcpiflush(1)

Flushes all unsaved samples from dcpid to profiles on disk.

dcpiquit(1)

Terminates the dcpid daemon, flushing all unsaved samples to disk.

dcpitar(1)

Bundles up a profile database directory and associated hot images for examination on a different system.

dcpiversion(1)

Prints the version string and creation date of the installed DCPI release.

dcpiuninstall(1)

Uninstalls DCPI binaries, libraries, and man pages.

dcpitopstalls(1)

Produces a sorted list of the instructions accounting for the most stall cycles. This analysis tool works only on Alpha 21064/EV4 and 21164/EV5 systems.

dcpicalc(1)

Annotates each instruction in a procedure's basic-block graph with the average number of cycles for that instruction, and computes the overall average cycles per instruction for that procedure. This analysis tool works only on Alpha 21064/EV4 and 21164/EV5 systems.

dcpiwhatcg(1)

Produces, for one or more images, a summary breakdown of where time has been spent (percent of cycles spent in, e.g., memory delays, static stalls, branch mispredicts, and useful execution). This analysis tool works only on Alpha 21064/EV4 and 21164/EV5 systems.

dcpix(1)

Measures execution counts for basic blocks and control-flow edges directly; produces output that the stall-analysis tools (dcpicalc, dcpiwhatcg, dcpitopstalls) can use to produce more accurate information. Without output from dcpix, these tools estimate execution counts.

dcpisumxct(1)

Aggregates execution counts measured using dcpix from multiple runs of an instrumented program. This makes it possible for stall-analysis tools to analyze counts from multiple runs of a program.

dcpidiff(1)

Compares two sets of profiles for a procedure, highlighting basic blocks or source lines with the largest differences. This analysis tool works only on Alpha 21064/EV4 and 21164/EV5 systems.

dcpisource(1)

Augments a basic-block graph generated by dcpicalc(1) with source code.

dcpicc(1)

Compiles C programs to produce object code that helps dcpisource identify which source token each instruction corresponds to.

dcpi2ps(1)

Formats a basic-block graph into PostScript.

dcpi2pix(1)

Converts DCPI profile data to pixie format.

dcpikdiff(1)

Creates a new image based on both vmunix and kmem(7) that captures the true running kernel image after HP Tru64 UNIX dynamically patches itself using self-modifying code.

dcpicat(1)

Prints the contents of one or more profile files in an ASCII format.

dcpi2bb(1)

Generates a basic-block graph for a procedure annotated with samples collected during profiling via dcpid. The functionality of this program has been subsumed by dcpicalc for Alpha 21064/EV4 and 21164/EV5.


Installation and Setup

See the README file from the kit, or the DCPI installation page http://h30097.www3.hp.com/dcpi/installation.html, for details of how to install the device driver, binaries, and man pages for the profiling system. Once the system is installed, dcpiscan(1) should be run and a profile-database directory should be created.

dcpiscan directories > map.local

Create an image map for site-specific executables and shared libraries in the specified directories and their descendants. Although this step is technically optional, a map of local executables allows dcpid to identify binaries stored in site-specific directories more accurately. Run dcpiscan(1) once during system installation; rerun it only to scan other directories or newly installed executables.
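
When the list of site-specific directories varies between hosts, it can help to pass dcpiscan only the directories that actually exist. The following is a minimal sketch in POSIX shell; the helper name existing_dirs and the example paths are hypothetical, and the dcpiscan line is left commented out because it follows the invocation form shown above rather than a verified option list.

```shell
#!/bin/sh
# Hypothetical helper (not part of DCPI): print only those arguments
# that name existing directories, one per line.
existing_dirs() {
    for d in "$@"; do
        if [ -d "$d" ]; then
            printf '%s\n' "$d"
        fi
    done
}

# Example site layout; substitute the directories used at your site.
# Word splitting of the command substitution is intentional, so the
# paths must not contain whitespace.
# dcpiscan $(existing_dirs /usr/local/bin /opt/site/bin) > map.local
```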

mkdir db

Make a directory to store profiles.

The profiles written in the directory are owned by the user who invokes dcpid, so the directory must be writable by that user. If the directory is shared across hosts, its permission should be set appropriately to allow write-sharing by the users running dcpid. The directory must also be in a partition with a reasonable amount of available space (20 MB or so should be more than adequate).
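
The requirements above (a directory writable by the user running dcpid, on a partition with roughly 20 MB free) can be checked before the daemon is started. The following is a minimal sketch in POSIX shell; the function name check_db_dir and the 20480 KB default are illustrative choices, not part of DCPI.

```shell
#!/bin/sh
# Hypothetical helper (not part of DCPI): create a profile database
# directory and check that it is writable and has enough free space.
# Usage: check_db_dir DIR [MIN_FREE_KB]
check_db_dir() {
    dir=$1
    min_kb=${2:-20480}   # about 20 MB, the figure suggested in the text

    mkdir -p "$dir" || return 1

    if [ ! -w "$dir" ]; then
        echo "check_db_dir: $dir is not writable" >&2
        return 1
    fi

    # df -Pk prints a header plus one POSIX-format line for the
    # filesystem; field 4 is the available space in 1024-byte blocks.
    free_kb=$(df -Pk "$dir" | awk 'NR == 2 { print $4 }')
    case $free_kb in
        ''|*[!0-9]*)
            echo "check_db_dir: cannot determine free space for $dir" >&2
            return 1
            ;;
    esac

    if [ "$free_kb" -lt "$min_kb" ]; then
        echo "check_db_dir: only $free_kb KB free, want $min_kb KB" >&2
        return 1
    fi
    return 0
}
```

For example, `check_db_dir db && dcpid -m map.local db` would start the daemon only when the directory passes both checks.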

Data Collection

After installation and setup are complete, data is collected by running dcpid and the tools that control it:

dcpid -m map.local db

Start the dcpid process. (If dcpid is not installed setuid-root, this command must be run as root.) The optional argument -m map.local names a file containing the mapping from executables to pathnames previously produced by dcpiscan(1). db is the database directory created during setup above.


dcpiepoch

Terminate the current epoch and start a new one, ensuring that profiles for the terminated epoch are flushed to disk and will not change.


dcpiflush

Flush buffered samples in the current epoch to the on-disk database. This is typically unnecessary unless you want to see profiles immediately after running a program: buffered samples are flushed to disk whenever an epoch is terminated and when dcpid is terminated. In addition, buffered samples are flushed periodically (at intervals that can be controlled with command-line arguments to dcpid).


dcpiquit

Terminate the current epoch, flushing all buffered samples to disk, and exit dcpid. This turns off all performance-counter interrupts and frees all memory used by the profiling system.
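
The collection steps above can be combined into a single session around one workload. The following is a sketch only, under several assumptions: dcpid, dcpiflush, and dcpiquit are on PATH; dcpid detaches and returns once it has started; and the control tools need no arguments. The names run and profile_session are hypothetical. Setting DRY_RUN=1 prints the commands instead of executing them, which allows the sequence to be inspected on a machine without DCPI.

```shell
#!/bin/sh
# Print the command when DRY_RUN is set, otherwise execute it.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Hypothetical wrapper (not part of DCPI).
# Usage: profile_session DB MAP WORKLOAD [ARGS...]
profile_session() {
    db=$1
    map=$2
    shift 2

    run dcpid -m "$map" "$db" || return 1   # start sampling
    run "$@"                                # the program being profiled
    status=$?
    run dcpiflush   # optional: write buffered samples to disk now
    run dcpiquit    # end the epoch, flush, and stop dcpid
    return $status
}
```

With DRY_RUN=1, `profile_session db map.local ./myprog` prints the four commands in order: the dcpid invocation, the workload, dcpiflush, and dcpiquit. The workload's exit status is returned so that the wrapper can be used in scripts that check it.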

As dcpid runs, it creates subdirectories of db, one for each epoch. Each epoch directory further contains subdirectories, one for each platform sharing the same db. The platform names default to the local hostname on each machine running dcpid, so by default profiles collected on different machines are stored separately (though their epochs are synchronized). However, the file hosts in directory db may also be edited to contain a mapping from hostnames to arbitrary platform names, allowing samples from several hosts to be aggregated in the same profile database.
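
The layout described above (db, then one directory per epoch, then one directory per platform) can be walked with ordinary shell globbing. The following sketch relies only on that two-level structure; the function name list_platform_dirs is hypothetical, and directory names are treated as opaque because this page does not specify how epochs are named.

```shell
#!/bin/sh
# Hypothetical helper (not part of DCPI): print "EPOCH PLATFORM" for
# every DB/<epoch>/<platform>/ directory. Plain files such as DB/hosts
# are skipped because the trailing slash in the glob matches only
# directories.
list_platform_dirs() {
    db=$1
    for epoch_dir in "$db"/*/; do
        [ -d "$epoch_dir" ] || continue      # glob matched nothing
        for platform_dir in "$epoch_dir"*/; do
            [ -d "$platform_dir" ] || continue
            epoch=$(basename "$epoch_dir")
            platform=$(basename "$platform_dir")
            echo "$epoch $platform"
        done
    done
}
```

Running `list_platform_dirs db` after collecting on several hosts gives a quick view of which platforms contributed samples to each epoch.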

Data Analysis

After an epoch is terminated, the profile data for the epoch can be analyzed using a number of tools. By default, the analysis tools find the relevant profile files automatically. There are also a number of options that can be used to guide the search for profile files when the default rules are not appropriate; see the man pages for the individual tools for details.

setenv DCPIDB db

Set an environment variable that tells the downstream tools where the profile database is located.
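
The setenv command above is csh syntax. For Bourne-style shells, a rough equivalent is sketched below. The function name use_dcpidb is hypothetical, and converting the directory to an absolute path is a precaution taken in this example, not a requirement stated by DCPI.

```shell
#!/bin/sh
# Hypothetical helper (not part of DCPI): export DCPIDB as the absolute
# path of an existing profile database directory.
use_dcpidb() {
    if [ ! -d "$1" ]; then
        echo "use_dcpidb: no such directory: $1" >&2
        return 1
    fi
    # The cd happens inside the command substitution's subshell, so the
    # caller's working directory is unchanged.
    DCPIDB=$(cd "$1" && pwd) && export DCPIDB
}
```

After `use_dcpidb db`, the analysis tools can be run from any working directory in the same shell session.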


dcpiprof

Use dcpiprof with no arguments to show the breakdown of CPU time, by image, across all executables that ran during the epoch.

dcpiprof image

Use dcpiprof to analyze the breakdown of cpu time across all procedures in the image file image.

dcpitopcounts image

Identify the hot spots in image, listing the most expensive instructions in order, along with their source line numbers if available.

dcpilist -asm proc image

Disassemble procedure proc in the image file image, and annotate the disassembly with samples extracted from the profile database and the average cycle time required for executing each instruction.

dcpilist -source proc image

Generate a source code listing of procedure proc in the image file image, and annotate the listing with samples extracted from the profile database, and the average cycle time required for executing each source line.

dcpicalc proc image | dcpisource -f image.c | dcpi2ps -o proc.ps

Produce a basic-block graph for procedure proc in image file image, annotated with the average cycles for each instruction; augment the graph with source lines from image.c; and store the resulting PostScript in proc.ps. The dcpicalc analysis tool works only on Alpha 21064/EV4 and 21164/EV5 systems.
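
The analysis commands in this section can be collected into one report for a single image and procedure. The following is a sketch under stated assumptions: it uses only the invocation forms shown above, the function name hot_report and the output file names are arbitrary choices made for this example, and the dcpicalc pipeline is omitted because it applies only to EV4 and EV5 systems.

```shell
#!/bin/sh
# Hypothetical wrapper (not part of DCPI).
# Usage: hot_report IMAGE PROC OUTDIR
hot_report() {
    image=$1
    proc=$2
    out=$3

    # The tools locate the profile database through DCPIDB.
    if [ -z "${DCPIDB:-}" ]; then
        echo "hot_report: DCPIDB is not set" >&2
        return 2
    fi

    # Fail early if any required tool is missing.
    for tool in dcpiprof dcpitopcounts dcpilist; do
        command -v "$tool" >/dev/null 2>&1 || {
            echo "hot_report: $tool not found on PATH" >&2
            return 127
        }
    done

    mkdir -p "$out" || return 1
    dcpiprof                          > "$out/by-image.txt"
    dcpiprof "$image"                 > "$out/by-procedure.txt"
    dcpitopcounts "$image"            > "$out/top-instructions.txt"
    dcpilist -asm "$proc" "$image"    > "$out/$proc.asm.txt"
    dcpilist -source "$proc" "$image" > "$out/$proc.source.txt"
}
```

Reading the files in the order they are written follows the usual narrowing from images, to procedures, to individual instructions and source lines.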


Installation instructions are in the kit README file, or at the DCPI installation page http://h30097.www3.hp.com/dcpi/installation.html.


For processes that use the exec() system call (or its variants), PC samples are sometimes charged to the wrong image. Thus, it is possible to get samples for unexecuted instructions. Specifically, the problem is that samples gathered prior to an exec() call may be charged to an image that is running after exec() returns. This problem is not serious in practice for the common case of processes that call exec() once soon after being created: since there are only a few samples gathered prior to the exec(), only a few samples can be charged to the wrong image.


dcpi2bb(1), dcpi2pix(1), dcpi2ps(1), dcpicalc(1), dcpicat(1), dcpicc(1), dcpicoverage(1), dcpictl(1), dcpid(1), dcpidiff(1), dcpidis(1), dcpiepoch(1), dcpiflow(1), dcpiflush(1), dcpikdiff(1), dcpilabel(1), dcpildlatency(1), dcpilist(1), dcpiprof(1), dcpiprofileme(1), dcpiquit(1), dcpiscan(1), dcpisource(1), dcpistats(1), dcpisumxct(1), dcpitar(1), dcpitopcounts(1), dcpitopstalls(1), dcpiuninstall(1), dcpiupcalls(1), dcpivarg(1), dcpivcat(1), dcpiversion(1), dcpivlst(1), dcpivprofiler(1), dcpiwhatcg(1), dcpix(1), dcpiformat(4), dcpiexclusions(4)

For more information, see the DCPI project home page http://h30097.www3.hp.com/dcpi.


Copyright 1996-2004, Hewlett-Packard Company. All rights reserved.