What are "unknown" samples?
What are the causes of unknown samples?
How can unknown samples be identified?
A. The dcpid
daemon will classify a sample as unknown when it cannot be
associated with any known executable image (such as an application,
shared library, or kernel module). The mapping of samples to images
has improved with each release of dcpid, but there are
still some situations that will result in unknown samples. In most
cases, the total number of unknown samples will be negligible. If a
large number of samples (e.g., significantly more than 1%)
are classified as unknown, it may be due to one of the following causes:
- On DIGITAL Unix, processes that are created by programs
that fork(2) without subsequently invoking
exec(2). A recent enhancement to dcpid helps
classify such samples, at least for relatively long-lived
processes. See the -forkid option described in the
dcpid man page for additional information.
- Execution of dynamically-generated code, such as code
emitted at runtime by "just-in-time" (JIT) compilers.
Samples in such code cannot be attributed to an executable
image without manual intervention. In many cases, there is
no non-volatile executable image with which to associate
samples. See the register option described in the
dcpictl man page for additional information.
- An internal buffer overflow. When processes are created at
an exceptionally high rate, it is possible for a kernel
image name translation buffer to overflow. Check the dcpid
log file for messages of the form dcpid: pcount: dropped
N image name translations.
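As a rough illustration of the log check mentioned in the last item above, the following Python sketch totals the dropped translations reported in a set of log lines. The regular expression is derived only from the message form quoted above, and the sample lines and counts are hypothetical; the real log file location and exact formatting are not specified here.

```python
import re

# Pattern assumed from the message form quoted in the FAQ text;
# the exact spacing in a real dcpid log may differ.
DROPPED = re.compile(r"dcpid: pcount: dropped (\d+) image name translations")


def count_dropped_translations(lines):
    """Return the total number of dropped translations reported in lines."""
    total = 0
    for line in lines:
        match = DROPPED.search(line)
        if match:
            total += int(match.group(1))
    return total


# Hypothetical log contents, for illustration only.
sample_log = [
    "dcpid: pcount: dropped 12 image name translations",
    "placeholder for an unrelated log line",
    "dcpid: pcount: dropped 3 image name translations",
]

print(count_dropped_translations(sample_log))
```

A nonzero total suggests that the buffer overflow described above may account for some of the unknown samples.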
By default, unknown samples are aggregated by event type, and
a single count is stored for each event in a profile named with
the prefix unknown@host, where host is the
local machine name. The -unknown option to dcpid
can be used to help identify the source of unknown samples.
When this option is used, unknown samples are stored in separate
profiles associated with 1MB regions of each process address
space. The resulting profiles are given names with the prefix
hostPID@address, where host is the local machine
name, PID is the process identifier associated with the
sample, and address is the starting address for the
region containing the sample. In some cases, this information
will make it possible to manually identify the source of the
unknown samples.
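To make the 1MB grouping concrete, here is a minimal Python sketch that maps a sample address to the starting address of its region. It assumes that regions are aligned on 1MB boundaries, which the text above does not state explicitly, and the sample address is hypothetical.

```python
REGION_SIZE = 1 << 20  # 1MB regions, as described in the answer above


def region_start(address, region_size=REGION_SIZE):
    """Return the starting address of the region containing address.

    Assumption: regions begin at multiples of region_size.
    """
    return address - (address % region_size)


pc = 0x120034A8C  # hypothetical sample address, for illustration only
print(hex(region_start(pc)))
```

Under that assumption, all samples whose addresses share the same region start would be stored in the same profile for a given process.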
Q. Why can't I run
dcpid at the same time as other tools such as uprofile,
kprofile, and iprobe?
A. All of these
tools use the Alpha hardware performance counters, but each requires
exclusive control of the counters and uses a different device driver
to access them.
Q. How can
I obtain SCL?
A. SCL is
currently available by download from this web site. You will need to
agree to the license terms and register your details in order to
obtain the software.
In the future SCL may be packaged with other developer tools for
Why does dcpiflow or dcpicalc say "could not
compute jump table targets"?
A. This error
message means that the program could not automatically determine all
of the targets of a computed jump instruction. See the documentation
for dcpicalc to find out how to fix this problem. (Note: dcpiflow
has been subsumed into dcpicalc.)
Why do I get the error message "dcpisource: perl not found"?
A. You need to
have perl installed on your system in order to run dcpisource.
Perl source code is available on the web.
Why do dcpiprof and dcpicalc disagree on stall information?
A. Dcpiprof simply
lists the number of event samples accrued per procedure for each
event type monitored by dcpid. Dcpicalc uses a detailed machine
model to compute the actual number of stall cycles attributable to a
given machine event. Therefore, when dcpiprof lists 17% imiss, it
means that 17% of the imiss samples in dcpiprof's output fell within
a particular procedure. When dcpicalc gives a percentage for
"I-cache" activity, it means that this percentage of all cycles spent
in the procedure was attributable to I-cache misses.
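The difference comes down to which denominator each percentage uses. The Python sketch below illustrates this with hypothetical numbers; the function names are invented for illustration and are not part of either tool, and the stall-cycle figure stands in for whatever the dcpicalc machine model would compute.

```python
def dcpiprof_share(proc_event_samples, total_event_samples):
    """Percent of all samples of one event type that fell in a procedure."""
    return 100.0 * proc_event_samples / total_event_samples


def dcpicalc_share(stall_cycles, proc_cycles):
    """Percent of one procedure's cycles attributed to a single cause."""
    return 100.0 * stall_cycles / proc_cycles


# Hypothetical figures, for illustration only.
imiss_in_proc, imiss_total = 170, 1000          # imiss event samples
icache_stall_cycles, cycles_in_proc = 400, 10000  # cycles in the procedure

print(dcpiprof_share(imiss_in_proc, imiss_total))
print(dcpicalc_share(icache_stall_cycles, cycles_in_proc))
```

With these figures the procedure holds 17% of all imiss samples, yet only 4% of its own cycles are attributed to I-cache misses, so the two numbers need not agree.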
What does "unexplained gain" mean?
Why is it negative?
Is it ever possible to see "over-100%" execution?
A. Unexplained gain occurs at places in the code where instructions take fewer
cycles to execute than even our ideal assumption suggests. For
example, our analysis assumes that an Alpha 21164 processor ideally
can issue 2 instructions in each cycle. If in fact 3 instructions
are issued in a single cycle, 0.5 cycle would be attributed to
unexplained gain, since
3 instructions are expected to take at least 1.5 cycles. This
scenario is possible because the processor can actually issue 2
instructions to the integer pipeline and 2 to the floating-point
pipeline every cycle. (Assuming quad-issue as the ideal case would
leave all integer code with under-50% execution. This may confuse
many users, as would using different assumptions under different
circumstances.) If this occurs consistently enough throughout the
code being analyzed, the percentage of cycles spent on instruction
execution can exceed 100%, which is sometimes the case for code with
many floating-point operations.
Unexplained gain is shown as a negative number
because all other numbers in the same table
represent cycles that are in some sense "lost"
(e.g., to D-cache misses) or "spent" (e.g., on
executing instructions). These positive numbers and
the unexplained gain always add up to 100%, with the
gain shown as negative to indicate that its
contribution to the sum is in a direction opposite
that of the others.
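The arithmetic in the example above can be sketched in Python as follows. This is only an illustration of the dual-issue assumption described in this answer, not the actual dcpicalc computation, and the percentage breakdown at the end uses hypothetical values.

```python
IDEAL_ISSUE_WIDTH = 2  # instructions per cycle assumed as the ideal case


def unexplained_gain(instructions, actual_cycles, width=IDEAL_ISSUE_WIDTH):
    """Cycles saved relative to the ideal dual-issue expectation."""
    expected_cycles = instructions / width
    return expected_cycles - actual_cycles


# 3 instructions issued in 1 cycle: expected 1.5 cycles, so 0.5 cycle of gain.
print(unexplained_gain(3, 1))

# Hypothetical breakdown for floating-point-heavy code. The gain is stored
# as a negative number so that all entries add up to 100%.
breakdown = {
    "execution": 108,
    "D-cache miss": 4,
    "unexplained gain": -12,
}
print(sum(breakdown.values()))
```

In this hypothetical breakdown the execution entry exceeds 100%, and the negative gain entry is what brings the total back to 100%.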
For more details, see "INTERPRETING OUTPUT" in the
man page for dcpicalc.
Tru64 UNIX specific questions
Why does dcpid not produce any profiles?
Why does the entire system hang when dcpid is
A. A kernel bug
exists in Tru64 UNIX that can occasionally cause dcpid to
crash, and can even crash the kernel in rare cases. Running
dcpid with the -b flag prevents dcpid from
doing its initial scan of the system process tables and hence from
triggering the bug. Note that this will also prevent dcpid
from determining what images are loaded in processes that are
already running when it starts up.
Unlike earlier versions of dcpid, which performed
frequent scans of system tables to identify statically linked
executables, the current version only performs a single scan
during initialization. Thus, it is extremely unlikely that this
problem will be encountered. It is generally worth the risk of
performing the initial scan in order to obtain useful
information about the processes that were already executing when
dcpid was started.