1. Executable Editing:
Ted Romer, Geoff Voelker, Dennis Lee, Alec Wolman, Wayne Wong, Hank Levy,
Brian Bershad, and Brad Chen, Instrumentation and Optimization of Win32/Intel
Executables Using Etch, The USENIX Windows NT Workshop Proceedings,
Seattle, Washington, August, 1997.
James R. Larus, Eric Schnarr: EEL: Machine-Independent Executable Editing.
PLDI 1995: 291-300.
L.S. Wilson, C.A. Neth, M.J. Rickabaugh, "Delivering binary object
modification tools for program analysis and optimization," Digital Technical
Journal, 8(1):18-31, 19
Han B. Lee and Benjamin G. Zorn.
BIT: A tool for instrumenting Java bytecodes. In Proceedings of the 1997
USENIX Symposium on Internet Technologies and Systems (USITS97), pages
73-82, Monterey, CA, December 1997.
R. J. Hookway and M. A. Herdeg, "Digital FX!32: Combining Emulation and
Binary Translation," Digital Technical Journal, 9(1):3-12, 1997.
David W. Wall. "Systems for Late Code Modification." In Robert Giegerich and
Susan L. Graham, eds., Code Generation – Concepts, Tools, Techniques,
pages 275-293, Springer-Verlag, 1992.
Reed Hastings and Bob Joyce. "Purify: Fast Detection of Memory Leaks and
Access Errors." Proceedings of the Winter USENIX Conference, pages
125-136, January 1992.
D. Goodwin, "Interprocedural dataflow analysis in an executable Optimizer,"
Proc. ACM SIGPLAN Conf. on Programming Language Design and Implementation '97,
pp 122-133, Las Vegas, Nevada, June 1997.
Alan Eustace, Amitabh Srivastava: ATOM: A Flexible Interface for Building
High Performance Program Analysis Tools. USENIX Winter 1995: 303-314
Amitabh Srivastava, Alan Eustace: ATOM - A System for Building Customized
Program Analysis Tools. PLDI 1994: 196-205
James R. Larus, Thomas Ball: Rewriting Executable Files to Measure Program
Behavior. SP&E 24(2): 197-218 (1994)
MIPS Computer Systems. "UMIPS-V Reference Manual (pixie and pixstats)."
Sunnyvale, CA, 1990.
Amitabh Srivastava and David Wall, "A Practical System for Intermodule Code
Optimization at Link-Time." Journal of Programming Languages, vol 1, no
1, pages 1-18, March 1993.
Robert S. Cohn, David W. Goodwin, and P. Geoffrey Lowney, "Optimizing Alpha
Executables on Windows NT with Spike." Digital Technical Journal, vol 9,
no 4, http://www.digital.com/DTJS00/index.html.
Amitabh Srivastava, David W. Wall: Link-Time Optimization of Address
Calculation on a 64-bit Architecture. PLDI 1994: 49-60
Amitabh Srivastava: Unreachable Procedures in Object-Oriented Programming.
LOPLAS 1(4): 355-364 (1992)
W. J. Schmidt, R. R. Roediger, C. S. Mestad, B. Mendelson, I. Shavit-Lottem
and V. Bortnikov-Sitnitsky, Profile-directed restructuring of operating system
code, IBM Systems Journal, Vol. 37, No. 2, 1998.
Jennifer M. Anderson, Lance M. Berc, Jeffrey Dean, Sanjay Ghemawat, Monika R.
Henzinger, Shun-Tak A. Leung, Richard L. Sites, Mark T. Vandevoorde, Carl A.
Waldspurger, and William E. Weihl. Continuous Profiling: Where Have All the
Cycles Gone? ACM Transactions on Computer Systems 15(4): 357-390 (1997). An
earlier version appears in Proc. of the 16th ACM Symposium on Operating
Systems Principles, St. Malo, France, October 1997, pages 1-14.
Thomas Ball, James R. Larus: Optimally Profiling and Tracing Programs. TOPLAS
16(4): 1319-1360 (1994)
Thomas Ball, James R. Larus: Branch Prediction For Free. PLDI 1993: 300-313
Aaron J. Goldberg and John L. Hennessy. MTOOL: An integrated system for
performance debugging shared memory multiprocessor applications. IEEE Trans. on
Parallel and Distributed Systems, pages 28-40, January 1993.
VTune: Intel's visual tuning environment.
VTune(tm) home page.
David W. Wall: Predicting Program Behavior Using Real or Estimated Profiles.
PLDI 1991: 59-70
M. Zagha et al. Performance analysis using the MIPS R10000 performance
counters. In Proc. of Supercomputing, November 1996, Pittsburgh,
X. Zhang et al. Operating system support for automated profiling &
Proc. of the 16th ACM Symposium on Operating Systems Principles, St.
Malo, France, Oct 1997.
3. Profile-based optimization:
P. G. Lowney et al. "The Multiflow Trace Scheduling Compiler," The Journal
of Supercomputing, 7(1/2):51-142, 1993.
S. McFarling, "Program optimization for instruction caches," in ASPLOS III
Proc., pp. 183-193, Boston, MA, April 1989.
X. Zhang, et al, "System Support for Automatic Profiling and Optimization,"
in Proc. of the Sixteenth ACM Symposium on Operating System Principles,
pp. 15-26, Saint-Malo, France, Oct. 1997.
K. Pettis and R.C. Hansen, "Profile Guided Code Positioning," in Proc. ACM
SIGPLAN Conf. on Programming Language Design and Implementation '90, pp.
16-27, White Plains, NY, June 1990.
Pohua P. Chang, Scott A. Mahlke, Wen-mei W. Hwu: Using Profile Information to
Assist Classic Code Optimizations. SP&E 21(12): 1301-1321 (1991)
W.W. Hwu and P.P. Chang, "Achieving high instruction cache performance with
an optimizing compiler," in Proc. 16th Annual Intl. Symp. on Computer
Architecture, Jerusalem, Israel, June 1989.
R. Cohn and P. G. Lowney, "Hot Cold Optimization of Large Windows/NT
Applications," MICRO-29, pp. 80-89, Paris, France, December 1996.
B. Calder, D. Grunwald, and A. Srivastava, "The predictability of branches in
libraries," in Proc. of the 28th Annual Intl. Symp. on Microarchitecture,
pp. 24-34, Ann Arbor, MI, Nov. 1995.
J. A. Fisher, "Trace scheduling: A technique for global microcode
compaction," IEEE Transactions on Computers, C-30(7): 478-490, July 1981.
Pohua P. Chang, Scott A. Mahlke, William Y. Chen, Wen-mei W. Hwu:
Profile-guided Automatic Inline Expansion for C Programs. SP&E 22(5): 349-369