Intel® VTune™ Amplifier XE 2017


  • Intuitive CPU & GPU performance tuning, multi-core scalability, bandwidth and more
  • Quick performance insight with advanced data visualization
  • Automate regression tests and collect data remotely

Overview

Simplified Serial and Parallel Performance Optimization

Whether you are tuning for the first time or doing advanced performance optimization, Intel® VTune™ Amplifier provides rich performance insight into hotspots, threading, locks & waits, OpenCL, bandwidth and more. Use powerful analysis to sort, filter and visualize results on the timeline and on your source.

VTune Amplifier comes in two versions:

  • VTune™ Amplifier XE for performance profiling of Windows* and Linux* applications.
  • VTune™ Amplifier for Systems for profiling embedded target platforms including energy profiling.


What’s hogging all the cycles?…Quickly locate code taking a lot of CPU/GPU time

Details Fig 01

Hotspots analysis gives you a sorted list of the functions that use the most CPU time. This is where tuning will give you the biggest benefit. Click [+] for the call stacks. Double-click to see the source.
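To make the idea of a sorted hotspot list concrete, here is a minimal sketch of how periodic stack samples could be rolled up into per-function CPU time. It is an illustration only, not how VTune Amplifier is implemented; the function names, the sample data and the 10 ms interval are all assumptions.

```python
from collections import Counter

# Assumed sampling interval; a real collector picks its own value.
SAMPLE_INTERVAL_MS = 10

# Hypothetical samples: the call stack (outermost first) seen at each tick.
samples = [
    ("main", "render", "shade_pixel"),
    ("main", "render", "shade_pixel"),
    ("main", "render", "intersect"),
    ("main", "load_scene"),
    ("main", "render", "shade_pixel"),
]

def hotspots(stacks, interval_ms=SAMPLE_INTERVAL_MS):
    """Return (function, estimated self time in ms), hottest first."""
    # The last frame of each stack is the function that was running.
    self_counts = Counter(stack[-1] for stack in stacks)
    return [(fn, n * interval_ms) for fn, n in self_counts.most_common()]

for fn, ms in hotspots(samples):
    print(f"{fn:12s} {ms:4d} ms")
```

The function at the top of the list is the first candidate for tuning; the stored stacks are what a call-stack view would expand.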

Show me where…See the results on your source

Details Fig 02

A double click from the function list takes you to the hottest spot in the function.

Why wait?…Locks and Waits Analysis

VTune Fig 03

Quickly find a common cause of slow performance in parallel programs: waiting too long on a lock while the cores are underutilized during the wait. Profiles like “basic hotspots” and “locks & waits” use a software collector that works on both Intel and compatible processors.

Mine the Data with Timeline Filtering

VTune Fig 04

Select a time range in the timeline to filter out data (e.g., application startup) that masks the information you need. When you select and filter in the timeline, the grid that lists functions using a lot of CPU time updates to show the list filtered for the selected time.
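The effect of timeline filtering can be sketched in a few lines: keep only the samples inside the selected time range, then rebuild the function list from what is left. The sample data and function names below are hypothetical, and this is a simplified model rather than the tool's actual logic.

```python
from collections import Counter

# Hypothetical samples: (timestamp in seconds, function that was running).
samples = [
    (0.2, "load_config"),
    (0.6, "load_config"),
    (1.4, "solve"),
    (2.1, "solve"),
    (2.8, "write_output"),
    (3.3, "solve"),
]

def filter_by_time(samples, start_s, end_s):
    """Keep only samples whose timestamp falls inside [start_s, end_s)."""
    return [(t, fn) for t, fn in samples if start_s <= t < end_s]

def function_counts(samples):
    """Count samples per function, hottest first."""
    return Counter(fn for _, fn in samples).most_common()

# Whole run, application startup included.
print(function_counts(samples))
# Startup (the first second) filtered out.
print(function_counts(filter_by_time(samples, 1.0, 4.0)))
```

With the first second excluded, `load_config` drops out of the list and the remaining functions are ranked on steady-state work only.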

Profile Remote Systems

Easy profiling of remote systems.

VTune Fig 05

Low Overhead / High Resolution Hardware Profiling

VTune Fig 06

Intel® processors have an on-chip Performance Monitoring Unit (PMU). In addition to “basic hotspots” analysis, which works on both Intel and compatible processors, VTune Amplifier XE has “advanced hotspots” analysis that uses the PMU on Intel processors to collect data with very low overhead. System-wide analysis lets you analyze drivers. Increased resolution (~1 ms vs. ~10 ms) can find hot spots in small functions that run quickly.

Advanced Analysis Like Bandwidth

Preset profiles provide an easy “point and shoot” set-up. No memorizing complex event names. Advanced profiles like bandwidth analysis, cache analysis and branch mispredictions find tuning opportunities.

VTune Fig 07

Opportunities Highlighted

VTune Fig 08

The cell is highlighted in pink when there is a potential tuning opportunity. Hover to get suggestions.

OpenMP Scalability Analysis

VTune Fig 09

See the potential gain for each parallel region. See what is serial, what is balanced and what is imbalanced. Tuning opportunities like excessive spin time are highlighted in pink.
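As a rough illustration of what "potential gain" and "imbalanced" mean for a parallel region, the sketch below compares the time of the slowest thread with a perfectly even split of the same work. The formula, the 10% threshold and the region names are assumptions made for this example; the metric definitions used by VTune Amplifier may differ.

```python
def region_summary(thread_busy_s, imbalance_threshold=0.10):
    """Summarize one parallel region from per-thread busy times (seconds)."""
    elapsed = max(thread_busy_s)                     # region ends with the slowest thread
    ideal = sum(thread_busy_s) / len(thread_busy_s)  # same work, split evenly
    gain = elapsed - ideal
    return {
        "elapsed_s": elapsed,
        "ideal_s": ideal,
        "potential_gain_s": gain,
        # Assumed rule of thumb: flag regions that lose more than 10% to imbalance.
        "imbalanced": gain / elapsed > imbalance_threshold,
    }

# Hypothetical per-thread busy times for two regions on four threads.
regions = {
    "assemble": [2.0, 2.0, 2.0, 2.0],
    "solve": [1.0, 1.0, 1.0, 5.0],
}

for name, times in regions.items():
    print(name, region_summary(times))
```

In this made-up data, `assemble` is balanced, while `solve` spends most of its elapsed time waiting for one overloaded thread.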

Tune OpenCL™

VTune Fig 10

On newer processors, optionally collect GPU data for tuning OpenCL applications. Correlate GPU and CPU activities. (Windows* only.)

No special builds
Use a production build with symbols from your normal compiler.

Low overhead
Accurate results you can count on.

Command line
Automate regression analysis. Simple remote collection.
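A regression script might wrap the command-line collector along these lines. This is a sketch under assumptions: it assumes the `amplxe-cl` tool is on the PATH and that the `-collect`, `-report`, `-result-dir` and `-format` options behave as described in the tool's own help output; the application name, its arguments and the result directory name are hypothetical. Check `amplxe-cl -help` for the options available in your installation.

```python
import shlex
import subprocess

def collect_cmd(app_args, result_dir, analysis="hotspots"):
    """Build the command that profiles an application into result_dir."""
    return ["amplxe-cl", "-collect", analysis,
            "-result-dir", result_dir, "--", *app_args]

def report_cmd(result_dir, report="hotspots"):
    """Build the command that prints a report from an existing result."""
    return ["amplxe-cl", "-report", report,
            "-result-dir", result_dir, "-format", "csv"]

def run_profile(app_args, result_dir):
    """Collect, then return the report text (requires VTune to be installed)."""
    subprocess.run(collect_cmd(app_args, result_dir), check=True)
    done = subprocess.run(report_cmd(result_dir), check=True,
                          capture_output=True, text=True)
    return done.stdout

# Only print the commands here; nothing is executed.
print(shlex.join(collect_cmd(["./my_app", "--input", "data.bin"], "r001hs")))
print(shlex.join(report_cmd("r001hs")))
```

Saving the report text from each nightly run makes it possible to compare the top functions between builds and flag regressions.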

System Wide Analysis
Tune drivers, kernel modules and multi-process apps.

Tune Inlining with Call Counts
When a function is called frequently it may make sense to “inline” the code and eliminate the overhead of the function call. VTune Amplifier XE now provides statistical call count data to help you make better inlining decisions. It also displays profile results on the source code, even if the code is inlined, making it easier to interpret profile results.

Auto Detect Microsoft DirectX* Frames
Got a slow spot in your Windows* game play? You don’t just want to know where you are spending a lot of time; you want to know where you are spending it while the frame rate is slow. VTune Amplifier can now automatically detect Microsoft DirectX* frames and filter results to show you what is happening in slow frames. Not using DirectX*? Just define the critical region using the API, and frame analysis becomes a powerful tool for analyzing latency.
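The idea behind filtering on slow frames can be sketched as follows: pick the frames that miss a frame-rate target, then look only at the samples taken inside them. The frame records, sample data, function names and the 30 fps target are hypothetical, and the code models the concept rather than the product's implementation.

```python
from collections import Counter

# Hypothetical frame records: (frame id, start time s, end time s).
frames = [
    (1, 0.000, 0.016),
    (2, 0.016, 0.033),
    (3, 0.033, 0.095),
    (4, 0.095, 0.111),
]

# Hypothetical samples: (timestamp in seconds, function that was running).
samples = [
    (0.005, "draw"), (0.020, "draw"), (0.040, "physics"),
    (0.060, "physics"), (0.080, "draw"), (0.100, "draw"),
]

def slow_frames(frames, min_fps=30.0):
    """Return frames that took longer than the per-frame budget for min_fps."""
    budget_s = 1.0 / min_fps
    return [f for f in frames if f[2] - f[1] > budget_s]

def samples_in_frames(samples, frames):
    """Keep samples whose timestamp lies inside any of the given frames."""
    return [(t, fn) for t, fn in samples
            if any(start <= t < end for _, start, end in frames)]

slow = slow_frames(frames)
hot = Counter(fn for _, fn in samples_in_frames(samples, slow)).most_common()
print([f[0] for f in slow], hot)
```

Over the whole run `draw` has the most samples, but inside the one slow frame `physics` dominates, which is the kind of difference frame filtering is meant to expose.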

Intel® Threading Building Blocks, OpenMP 4.0, Intel® Cilk™ Plus support
Built-in understanding of parallel programming models means profiling data is described using familiar terms from the source, not with cryptic internal runtime labels.

Low Overhead Java* Profiling
Analyze Java or mixed Java and native code. Results are mapped to the original Java source. Unlike some Java profilers that instrument the code, VTune Amplifier uses low-overhead statistical sampling with either a hardware or software collector. Hardware collection has extremely low overhead because it uses the on-chip performance monitoring hardware.

Analyze User Tasks
The task annotation API is used to annotate your source so VTune Amplifier XE can display which tasks are executing. For example, if you label the stages of your pipeline, they will be marked in the timeline, and hovering over them will reveal details. This makes profiling data much easier to understand.

Tune for Intel® Xeon Phi™ Products
Hardware profiling is supported for Intel® Xeon Phi™ products and can be launched from the graphical user interface. It can collect advanced hotspots and advanced event data and has time markers for correlation of data across multiple cards. Software collection (e.g., locks and waits analysis) is not supported on Intel® Xeon Phi™ products.

“Hot keys” Start and Stop Analysis
Add a shortcut to quickly launch performance analysis whenever you see your app running slowly. Program hot keys to start and stop the collection of performance data.

Tune MPI Applications
Analyze hybrid applications using MPI and OpenMP. Install on a cluster.

Visit “What’s New?” for a more complete list of the new features in this release. Check back occasionally, as we constantly add new features in product updates. One year of updates is included with your initial purchase or support renewal.

 
