
VTune Performance Analyzer 4.0
Release Notes
Contents
1.0 Overview
2.0 System Requirements
3.0 License Definitions
4.0 Installation
5.0 What's New in Version 4.0
6.0 Getting Started
7.0 Example Programs
8.0 Known Limitations in Version 4.0
9.0 Technical Support and Feedback
1.0 Overview
The VTune Performance Analyzer is designed to provide an integrated tuning environment for Microsoft* Windows* 95, Windows 98, and Windows NT* systems.
Additional information on this software, as well as other Intel® software performance products, is available at http://developer.intel.com/vtune.
2.0 System Requirements
Minimum hardware requirements:
- A Pentium® processor-based system and 32 MBytes of RAM. Be sure to use at least the minimum amount of virtual memory that Windows recommends or the VTune analyzer will experience unexpected failures.
- Pentium Pro, Pentium II, Pentium II Xeon, Intel Celeron, or Pentium III processor-based system for event-based sampling.
- Installation requires 40 MBytes of space on the hard disk you specify to install the VTune analyzer on. This may be a local hard disk or a network drive.
- Additionally, 18 MBytes of disk space is required for system files on the drive containing the system directory (assumed in this discussion to be C:). The additional hard disk space on C: is needed for updating and installing the DLLs and OCXs that the VTune analyzer requires to be in the system directory. Even if you install the VTune analyzer on a hard disk other than C:, please make sure that you have 18 MBytes on C:.
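The disk space figures above can be checked before running setup. The following is a minimal Java sketch and is not part of the VTune analyzer. The class name InstallSpaceCheck, the default drive letters, and the reading of one MByte as 1024 x 1024 bytes are assumptions made for illustration.

```java
import java.io.File;

// Sketch: compare free space on the install and system drives with the
// minimums quoted in the requirements list.
final class InstallSpaceCheck {
    // Assumption: one MByte is taken as 1024 * 1024 bytes.
    static final long MBYTE = 1024L * 1024L;
    static final long INSTALL_MBYTES = 40; // VTune analyzer files
    static final long SYSTEM_MBYTES = 18;  // DLLs and OCXs in the system directory

    // True when freeBytes covers a requirement given in MBytes.
    static boolean hasRoom(long freeBytes, long requiredMBytes) {
        return freeBytes >= requiredMBytes * MBYTE;
    }

    // Space needed on the install drive; both amounts add up when the
    // install drive is also the system drive.
    static long requiredMBytes(boolean sameDrive) {
        return sameDrive ? INSTALL_MBYTES + SYSTEM_MBYTES : INSTALL_MBYTES;
    }

    static void report(File drive, long mbytes) {
        boolean ok = hasRoom(drive.getUsableSpace(), mbytes);
        System.out.println(drive + ": need " + mbytes + " MBytes, "
                + (ok ? "OK" : "not enough free space"));
    }

    public static void main(String[] args) {
        // Hypothetical defaults; pass the real drives as arguments.
        File installDrive = new File(args.length > 0 ? args[0] : "D:\\");
        File systemDrive = new File(args.length > 1 ? args[1] : "C:\\");
        if (installDrive.equals(systemDrive)) {
            report(installDrive, requiredMBytes(true));
        } else {
            report(installDrive, requiredMBytes(false));
            report(systemDrive, SYSTEM_MBYTES);
        }
    }
}
```

The sketch only reads the free space reported by the operating system; it does not account for virtual memory settings.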
Minimum software requirements: Microsoft Windows 95, Windows 98, or Windows NT 4.0 operating system with Service Pack 3 or later. The VTune analyzer has also been tested with Windows 2000 Beta 3 (build 1946.1).
3.0 License Definitions
The VTune analyzer license (LICENSE.WRI) refers to developer tools that are found in the VTune analyzer installation directory.
4.0 Installation
To install on Windows 95:
Insert the CD into the drive and auto-execution will invoke setup.exe. Make sure that the VTune analyzer is selected to be installed.
Caveats:
When looking at system modules on Windows 95, you can display symbolic information for these modules if you install the symbol files that are distributed with the SDK, DDK, and/or OS distribution diskettes/CDs.
Install all symbol files and individual VXDs that are part of VMM32.VXD and that are included in the Windows 95 DDK. For example, to view the hotspots for the module VMM, install the symbols and the debug version of the VXDs from the DDK, and then associate the files VMM.VXD and VMM.SYM with the VMM module in the Hotspots or Static Code Analysis window. Note however, that the debug version of VMM32 does not have the same performance characteristics as the retail version.
To install on Windows NT:
Installation Notes:
- The VTune analyzer requires Windows NT Version 4.0 with Service Pack 3 or later to be installed.
- Administrator rights are required.
Insert the CD into the drive. Auto-execution on Windows NT will invoke setup.exe. Make sure that the VTune analyzer is selected to be installed.
- If the User ID of the person using the VTune analyzer is different from the User ID of the person who installed the VTune analyzer, the person using the VTune analyzer must use the Update User’s Registry utility in the program group.
- To do sampling on Windows NT, you must have administrator rights or have the "Profile system performance" right assigned to your user account by the administrator. Otherwise, the VTune analyzer will not be able to collect samples. To assign this right, the administrator must do the following:
- Open the Programs\Administrative Tools (Common)\User Manager.
- Single click the user who needs the right assigned.
- Select the Policies\User Rights menu item. In this window, check the "Show Advanced User Rights" checkbox at the bottom of the window.
- Click on the right pull-down menu and select "Profile system performance".
- Select the Add button and specify the user.
- Click Add, then OK.
- Click OK in the User Rights Policy dialog box to finish.
Caveats:
- When sampling on Windows NT, the output directory must be a local drive. The VTune analyzer may not be able to record samples on a network or a mapped drive.
- During sampling, the VTune analyzer can track the modules belonging to your application when you use the "Program Name" or "Application to Activate" modes specified on the Automation options page of the Options dialog box. When using "None," all modules that get loaded after the VTune analyzer starts sampling will be attributed to OTHER32 in the Modules report.
- In order for the VTune analyzer to correctly attribute the samples on Windows NT to the Win32* Subsystem Server process, please use regedit to set bit 0x10000 in the registry variable:
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Session Manager\GlobalFlag
Setting this bit in GlobalFlag will allow samples to be correctly attributed to the DLLs that are opened by the Subsystem Server process, for example the Windows NT Video DLL. If this bit in GlobalFlag is not set, samples collected in modules opened by the Subsystem Server process will be attributed to OTHER32. Be sure to not inadvertently change the other bit settings in GlobalFlag when setting bit 0x10000.
- To use an alternate symbol path for DBG files, define the environment variable: _NT_ALT_SYMBOL_PATH. For instance, if your files are under D:\support\debug\i386\symbols\dll\*.dbg, you should set:
_NT_ALT_SYMBOL_PATH = d:\support\debug\i386
- When looking at system modules on Windows NT, you can display symbolic information for these modules if you install the symbol files that are distributed with the SDK, DDK, and/or OS distribution diskettes/CDs.
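For the GlobalFlag caveat above, the new value can be worked out before typing it into regedit. The following Java sketch only does the bit arithmetic and does not read or write the registry. The class name and the sample starting value are hypothetical.

```java
// Sketch: compute a GlobalFlag value with bit 0x10000 set while leaving
// every other bit as it was.
final class GlobalFlagBits {
    // The bit named in the GlobalFlag caveat.
    static final int SUBSYSTEM_BIT = 0x10000;

    // OR-ing sets the bit and preserves all other bits.
    static int withBitSet(int globalFlag) {
        return globalFlag | SUBSYSTEM_BIT;
    }

    static boolean isBitSet(int globalFlag) {
        return (globalFlag & SUBSYSTEM_BIT) != 0;
    }

    public static void main(String[] args) {
        int current = 0x00000400; // hypothetical existing GlobalFlag value
        int updated = withBitSet(current);
        System.out.printf("old=0x%08X new=0x%08X%n", current, updated);
    }
}
```

Using a bitwise OR rather than overwriting the value is what keeps the other GlobalFlag settings unchanged.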
To Uninstall the VTune Analyzer:
When the VTune analyzer is uninstalled using the UnInstall VTune Analyzer icon in the program group, only files installed by the VTune analyzer are deleted. Files that were created after installation, such as the database files (*.ldb and *.mdb) created at the end of the sampling session in the output directory and the binary files instrumented for a call graph in the cache directory, are not deleted. Delete the remaining files and directories yourself.
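To find the database files that uninstalling leaves behind, a small listing helper can be used. This is a minimal Java sketch, not a tool shipped with the VTune analyzer. The class name is hypothetical and the directory to scan is whatever output directory you used for sampling. The sketch lists files and deletes nothing.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Sketch: list *.ldb and *.mdb files in one directory (no recursion).
final class LeftoverFiles {
    static List<Path> findDatabaseFiles(Path outputDir) throws IOException {
        List<Path> found = new ArrayList<>();
        // The glob matches file names ending in .ldb or .mdb.
        try (DirectoryStream<Path> stream =
                 Files.newDirectoryStream(outputDir, "*.{ldb,mdb}")) {
            for (Path p : stream) {
                found.add(p);
            }
        }
        return found;
    }

    public static void main(String[] args) throws IOException {
        // Pass the sampling output directory; defaults to the current directory.
        Path dir = Paths.get(args.length > 0 ? args[0] : ".");
        for (Path p : findDatabaseFiles(dir)) {
            System.out.println(p);
        }
    }
}
```

Review the list before deleting anything, since other applications also use the .mdb and .ldb extensions.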
5.0 What’s New in the VTune Performance Analyzer Version 4.0
The following new features are introduced in this version. See "What's New" in the online help for more details on these features.
Pentium III processor support. You can use sampling, call graph profiling, or assembly analysis to analyze the performance of your application on the Pentium III processor and find potential performance bottlenecks. You can select a Pentium III processor-specific event and run a sampling session based on this event.
You can use the coach to analyze performance bottlenecks and get tuning advice specific to the Pentium III processor. The coach analyzes your code and, where appropriate, suggests intrinsics corresponding to the Streaming SIMD Extensions to boost the performance of your program.
From the assembly view, you can display context-sensitive online help descriptions of the Pentium III processor instructions.
Improved counter data collection using chronologies. The VTune analyzer now reports counter data separately for each instance of an object. The GUI for the collection and display of the data has also been improved.
Bytecode Accelerator. The VTune analyzer allows you to identify critical time-intensive Java* methods in the Hotspots report of your Java application and compile them to native Intel architecture (IA) code for faster performance.
Source View Enhancement. In the source view, you can now view all the events collected in a session at once and switch between sessions. You can open the source view from the hotspot, static code analysis, or call graph views and see any collected event or call graph information there.
Event ratios. Event ratios provide detailed performance information based on Pentium Pro, Pentium II, and Pentium III processor performance-related events monitored and collected in an event-based analysis session. The information is displayed in the Source, Mixed Source/Assembly, and Assembly views.
Call site information. Call graph profiling now includes collecting and analyzing call-site information and displaying the results in the call list and Source views.
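As an illustration of the event ratios described above, the following Java sketch assumes that a ratio is simply one event count divided by another. The class name, the choice of events, and the counts are hypothetical and do not come from the VTune analyzer.

```java
// Sketch: derive a ratio from two event counts collected in one session.
final class EventRatio {
    // Returns numerator / denominator, or NaN when the denominator is zero.
    static double ratio(long numeratorCount, long denominatorCount) {
        if (denominatorCount == 0) {
            return Double.NaN;
        }
        return (double) numeratorCount / denominatorCount;
    }

    public static void main(String[] args) {
        long cacheMisses = 1_500;     // hypothetical event count
        long instructions = 600_000;  // hypothetical event count
        System.out.println("misses per instruction: "
                + ratio(cacheMisses, instructions));
    }
}
```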
6.0 Getting Started
All information about the VTune analyzer is contained in the online help. The context-sensitive online help can be activated in several ways:
- Click the right-mouse button to invoke online help for the active item on the screen.
- Press the <F1> key to invoke online help and display the overview help topic corresponding to the active VTune analyzer window. From the overview help topic, you can navigate to related help topics.
- Click on the buttons of the Assistant bar on the right inside the VTune analyzer main window to invoke context-sensitive How To, Tips, and Troubleshooting topics.
The product package also includes reference cards that provide a quick tour of all of the main features of the VTune analyzer.
The VTune analyzer supports applications generated with the following compilers and development environments when correct debug symbols are available:
Compiler Environment | Source View | Call Graph Support
Microsoft Visual C++* 5.0/6.0 | Yes | Yes
Intel C/C++ Compiler 4.0 | Yes | Yes
Microsoft Visual Basic* 5.0/6.0 | Yes | Yes
Intel Fortran Compiler 2.0 | Yes | Yes
Borland* C++Builder* 4.0 | Yes | Yes
Borland JBuilder* 2 | Yes | Yes
Borland Delphi* 4.0 | Yes |
Watcom* 11.0 | Yes |
Microsoft Visual J++* 2.0 | Yes | Yes
7.0 Example Programs
A set of example programs is provided in the examples directory.
Demo Program: The demo program example contains fragments of code, which can be used to view the advice given by the code coach. The example program includes chronologies. However, the option to collect chronologies is turned off by default. If you want to collect chronology data, check the Collect Chronology Data option checkbox on the Advanced options page of the Options dialog box before running a session.
Manual Session: The Manual Session example allows you to start PC sampling without starting an application.
Simulation Demo: The Simulation Session example shows Pentium II processor simulation results such as partial stalls, MOB stalls, BTB misses, and data cache misses. After running dynamic analysis, open the online help for the reported stalls for a complete explanation.
Search: The Search example performs different search mechanisms on a text file.
MS Java Example: The Java example can be used to perform Java call graph profiling.
Other examples are available from the VTune analyzer website, http://developer.intel.com/vtune, which is accessible from the Help menu.
8.0 Known Limitations In Version 4.0
Installation
- Sometimes InstallShield* reports that it cannot install the software and displays a cryptic error message. This may be caused by low disk space or a full %TEMP% directory. Delete the files in your %TEMP% directory, run scandisk or chkdsk on your drive, and then try to reinstall the software.
Registry Corruption
- The VTune analyzer registry sometimes gets corrupted if the VTune analyzer hangs. Use the "Update User's Registry" program from the program group to restore the registry.
Options
- The VTune analyzer may hang if you open the Options dialog box, by clicking on the Configure Options button or using the Configure menu/Options command, while a static code analysis or dynamic code analysis is in progress.
- If you have NuMega* SoftIce* running while the VTune analyzer is running, SoftIce will report a fault if you open the VTune analyzer's Options dialog box. To work around this issue, turn faults off while setting the options in the VTune analyzer.
- The Source View options may not work correctly if more than one Source view is open. Close all open Source views and open only the desired one.
Sampling
- Multiple instances of the VTune analyzer can be invoked; however, only one instance of the VTune analyzer will be able to perform sampling. Running multiple instances of the VTune analyzer together is not recommended.
- Sometimes the VTune analyzer fails to load an application because it cannot find some of the DLLs.
- Application names to be sampled cannot be longer than 64 characters and cannot contain non-alphanumeric characters such as '!'.
- Drilling down on a single sample may cause the VTune analyzer to fail. Drill down over multiple samples instead.
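The application name limits listed above can be checked before starting a sampling session. The following Java sketch is an illustration only. The class name is hypothetical, and because the notes give only '!' as an example of a disallowed character, the sketch assumes that ASCII letters, digits, and the '.' before the extension are acceptable and that everything else is not.

```java
// Sketch: check an application file name against the sampling limits.
final class AppNameCheck {
    static final int MAX_LENGTH = 64; // limit stated in the Sampling list

    // Returns null when the name passes, otherwise a short reason.
    static String problem(String name) {
        if (name.length() > MAX_LENGTH) {
            return "longer than 64 characters";
        }
        for (int i = 0; i < name.length(); i++) {
            char c = name.charAt(i);
            boolean asciiAlnum = (c >= 'a' && c <= 'z')
                    || (c >= 'A' && c <= 'Z')
                    || (c >= '0' && c <= '9');
            // Assumption: '.' is tolerated so that "name.exe" passes.
            if (!asciiAlnum && c != '.') {
                return "contains '" + c + "'";
            }
        }
        return null;
    }

    public static void main(String[] args) {
        String name = args.length > 0 ? args[0] : "my!app.exe"; // hypothetical name
        String reason = problem(name);
        System.out.println(name + ": " + (reason == null ? "OK" : reason));
    }
}
```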
Call Graph
- You must delete the project files (.vts) generated by older versions of the VTune analyzer and create new project files for them in order to generate correct call graph information.
- If you encounter an error during instrumentation, then try clearing the cache (using the Clear Cache button in the Win32* Call Graph Setup dialog box) and re-instrumenting.
- If you get results that seem too large, they may be a side effect of the run-time instrumentation (LoadLibrary). Try re-running the application, without re-instrumentation. In order to improve accuracy, you can add, in advance, the dynamically loaded module to the module list by using the Add Module button.
- In order to get call graph information for executables produced by Microsoft Visual Basic 6.0, you need to define the environment variable, LINKER=/fixed:no, before building the application. In Windows NT, it should be added in the Control Panel/System/Environment. In Windows 95 or Windows 98, it should be added in the autoexec.bat file. This will cause generation of base relocations in the image which are needed in order to instrument it.
- In order to instrument an application that consists of more than one process (e.g. client-server), the other process should be added to the module list by using the Add Module button in the Win32* Call Graph Setup dialog box.
When instrumenting more than one process, make sure that:
-- the other process isn't currently running in the system, and
-- the instrumented server is killed after completing the run
Currently, there isn't a way to view the call graph for the other process.
- In Windows 95 and Windows 98, some user DLLs may be loaded by the system and won't be instrumented. In order to instrument these DLLs, use the Add Module button in the Win32* Call Graph Setup dialog box and use the original name as the Instrumented Module Name (the path remains in the cache).
- If an executable contains a Shared Section which it uses to communicate with another process, the other process must be instrumented (in order to force both processes to use the same DLL with the shared section).
- It is recommended not to use in-place instrumentation (same name, same directory).
- Some functions cannot be profiled. These are rare extreme cases. In this case, it will seem as if the caller of the non-instrumented function called its callees.
- In Windows 95 and Windows 98, inter-DLL calls (e.g., user32 calling kernel32) will not appear in the call graph.
- The new Microsoft Delayed-Load feature is treated like a dynamically loaded library (run-time vs. instrumentation-time).
- For the Win32 call graph, exit the application being profiled normally to end the call graph session. Do not use the Stop Session button to end the session.
- When first moving the Zoom slider on the call graph view, the graph scale is reset to the smallest possible scale. After that, the slider works normally.
- You may get an error stating that the VTune analyzer cannot find the file of the Java program when you try to view its source file. This happens when you run a Win32 call graph profiling session on a Win32 project beforehand without exiting and reinvoking the VTune analyzer. The call graph will display the names of the Java methods incorrectly; therefore, the VTune analyzer cannot find the associated source file.
- Call site support has not been implemented for Java call graph.
- For Java applications, call graph data in the Source view appears only for the current method. To get call graph data for another method, reenter the Source view from the call graph of the desired method.
- Transitioning from any place in the Source view to the Call Graph view for Java code will always bring you to the previously selected method.
- The VTune analyzer may get confused when there are .prf files in the working directory which are not created by the VTune analyzer.
- Renaming .prf files manually may result in a loss of consistency between call graph and sampling data.
- The RVA field is sometimes wrong in the Call List window in the view by call sites.
- Call graph instrumentation problems might occur on Windows 2000.
- Running Java call graph with the Automatically increment option turned off (in the Sampling options page of the Options dialog box) will delete previously collected sampling data for the session. To get both sampling and call graph data for the same session, run the call graph first then run the sampling session. Take note that if you run call graph again, without incrementing the session ID, it will delete the previously collected sampling data.
- For Java applications, sampling and call graph data cannot be displayed simultaneously in the source view. When coming to the source view from the call graph view, only the call graph data will be displayed. When coming from the hotspots view, only sampling data will be displayed. When coming from the Static Code Analysis view, none of these data will be displayed.
- The function names of Intel C/C++ compiler-generated Pentium III processor 128-bit stack alignment functions, with multiple entry points, will appear at least twice in the call graph grid (with the same name) if the entry points are called during execution.
Code Coach
- Makefile processing for the code coach will run properly only when a compiler is set up to run. For instance, if you requested that the compiler run from the CD-ROM, makefile processing for code coach will run only if the compiler CD-ROM is present in the CD-ROM drive.
- If you used the Microsoft Visual C++ IDE to generate a makefile for the code coach, your source directories specified under Tools/Options/Directories are not contained in the makefile. If the code coach fails to find the include files residing in these directories, you will need to add these directories to the INCLUDE environment variable (before starting the VTune analyzer), or specify the directories in the make option INCLUDE="...".
- The code coach will generate an "Invalid combination of type specifiers" error when it encounters the following line in a Watcom header file: typedef long char wchar_t. The workaround is to specify -D_WCHAR_T_DEFINED -Dwchar_t=short as an option in the code coach.
Dynamic Analysis
- No support is available for the Intel Celeron processor and the Pentium II Xeon processor.
- No support is available for Java code.
- Laptop users running Windows 95 should avoid selecting the Uniform Sampling in the Dynamic Analysis Setup dialog box. This may cause Windows 95 to crash.
Static Analysis
- Static code analysis does not recognize symbolic information in the csm.vxd file.
- In the case of large source files (3000 lines of code or more), the VTune analyzer minimizes the graphic display in the Static Assembly Analysis view. The graphic display for the pairing, penalty, warning, instruction cache, and decoder group columns is provided only for the assembly code lines that correspond to the first line of the selected source code. The mnemonic information corresponding to these graphics is displayed in their respective columns even though the graphics are disabled.
Mixed Source and Assembly View
- Loading and displaying large source files (7000 lines of code or more) in the Mixed Source and Assembly view can be very slow.
- Labels containing more than one "::" are parsed incorrectly in MIX/ASM views.
- You might run into problems displaying source and assembly for virtual functions compiled with Watcom* 11.0.
Bytecode Accelerator
- To use the bytecode accelerator under any environment, include the VTune analyzer installation directory in the PATH, CLASSPATH and LIB environment variables.
- To use the bytecode accelerator under the Microsoft environment, you must install the Microsoft Java Development Kit. It is available from www.microsoft.com/java. After installing the Microsoft JDK, you must add the JDK_root_install_path\lib\i386 directory to your LIB environment variable. The bytecode accelerator must be able to find the msjava.lib library in order to create native binaries for the Microsoft Java environment. This note does not apply to Borland JBuilder 2 users.
- The bytecode accelerator does not generate correct code for the following Java syntax:
i = a[(a=b)[n]];
The workaround is to delay the assignment to the array variable a until after the array reference:
i = a[b[n]];
a = b;
- The bytecode accelerator may eliminate a statement that compares a float or double variable with itself for equality, and this may change the outcome of the program. For example, when a loop contains the statement "if (y == y) s++", the optimizer considers the comparison of a variable with itself to be always equal, treats the if statement as dead code, and removes it. Because of this optimization, the compiled code produces a different answer than the JIT code.
- Floating-point expressions may result in different precisions and a floating-point comparison of equality may fail because of the precision difference.
- The bytecode accelerator gives different results than the JIT code for the following Java syntax:
a = "abc"
if (a == "abc")
The compiler treats the first abc as a different object than the second abc, so this string object comparison yields FALSE. If the intention is to compare the value of the string variable a with abc, use the predefined comparison method of class String.
- Currently the bytecode accelerator does not support overloaded methods in the Microsoft VM environment.
- Currently the bytecode accelerator does not support the JNI interface in the Microsoft VM environment.
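The Java constructs discussed in the limitations above can be written in forms that do not depend on the behavior in question. The following Java sketch shows what standard Java is expected to do in each case; it says nothing about the code the bytecode accelerator actually emits. The class name, method names, sample values, and the tolerance are hypothetical. Two assumptions are labeled in the comments: that the y == y case matters because of NaN, and that String.equals is the predefined comparison method the notes refer to.

```java
// Sketch: portable forms of the constructs named in the limitations list.
final class JavaSemanticsDemo {
    // Original form. In standard Java the array reference a is evaluated
    // before the index expression, so the element is read from the array
    // that a referred to before the embedded assignment.
    static int embeddedAssignment(int[] a, int[] b, int n) {
        return a[(a = b)[n]];
    }

    // Workaround form from the notes: read first, assign afterwards.
    static int delayedAssignment(int[] a, int[] b, int n) {
        int i = a[b[n]];
        a = b;
        return i;
    }

    // Counts values for which y == y holds. Assumption: NaN is the case
    // that matters here, since in Java NaN is not equal to itself.
    static int countSelfEqual(double[] values) {
        int s = 0;
        for (double y : values) {
            if (y == y) {
                s++;
            }
        }
        return s;
    }

    // Compares floating-point values within a caller-chosen tolerance
    // instead of testing exact equality.
    static boolean nearlyEqual(double x, double y, double tolerance) {
        return Math.abs(x - y) <= tolerance;
    }

    // Compares string contents rather than object identity.
    static boolean sameText(String a, String expected) {
        return expected.equals(a);
    }

    public static void main(String[] args) {
        int[] a = {10, 20, 30};
        int[] b = {2, 0, 1};
        System.out.println(embeddedAssignment(a, b, 0));
        System.out.println(delayedAssignment(a, b, 0));
        System.out.println(countSelfEqual(new double[] {1.0, Double.NaN, 2.5}));
        System.out.println(nearlyEqual(0.1 + 0.2, 0.3, 1e-9));
        String built = new String("abc"); // distinct object, same contents
        System.out.println(sameText(built, "abc"));
    }
}
```

Writing Double.isNaN(y) instead of relying on y == y, and equals instead of ==, states the intent explicitly and avoids depending on how a particular compiler treats these comparisons.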
Intel740 Graphics Accelerator
- The Intel740 Graphics Accelerator performance counters are not available in Windows 2000.
Compiler Intrinsics
- You will find more up-to-date compiler intrinsics information at http://developer.intel.com. The online help will be updated to reflect the new intrinsic names in a later release.
9.0 Technical Support and Feedback
Your feedback on Intel software is very important to us. We will strive to provide you with answers or solutions to problems you might encounter with the software. To give feedback or report any problems with installation or use, you can do the following:
- call Intel Developer Support Hotline: (800) 628-8686
- send email to developer_support@intel.com
When submitting a problem description by e-mail, please include your name, company, phone number, product name and version, compiler name and version, OS/service pack, and information about your development environment.
You can find up-to-date information on the product and software updates at http://developer.intel.com/vtune.
Intel, the Intel logo, and Pentium are registered trademarks of Intel Corporation.
Intel386, Intel486, MMX, and VTune are trademarks of Intel Corporation.
*Other brands and names are the property of their respective owners.
Copyright © 1998-1999, Intel Corporation, All Rights Reserved.
Reference: readme.htm