Java performance check using jstack:
jstack is a JDK command-line tool that generates thread dumps for a running JVM. We can use it for lightweight profiling of a running application, to identify where the performance bottlenecks are.
1. Use jps to determine the process ID of your server.
2. Execute a use case on your server that is known to perform slowly.
3. While the use case is executing, use the jstack command to create a thread dump for the process. We redirect the output to a file to inspect it more easily, for example: jstack <pid> > jstack-output-1.txt
This will generate a file called jstack-output-1.txt. Alternatively, kill -3 <pid> can be used on Linux, or Ctrl + Break in the console on Windows; both write the thread dump to the JVM's standard output rather than to a separate file.
4. Repeat step 3 a number of times, changing the number of the generated file each time, to create a sequence of thread dumps. I suggest aiming to create thread dumps at intervals of one or two seconds, for a period of 30 seconds, or the amount of time that it takes to execute your slow use case.
5. Open the output files in your favorite text editor.
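Steps 3 and 4 can also be scripted rather than repeated by hand. The following is a minimal sketch, assuming that jstack is on the PATH and that 15 dumps at two-second intervals cover the slow use case. The class ThreadDumpSeries and its method names are hypothetical helpers introduced for illustration; they are not part of the JDK.

```java
import java.io.File;
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

class ThreadDumpSeries {
    // Builds the jstack invocation for one dump (assumes jstack is on the PATH).
    static List<String> command(long pid) {
        return Arrays.asList("jstack", Long.toString(pid));
    }

    // Follows the file naming from step 3: jstack-output-1.txt, jstack-output-2.txt, ...
    static String outputFileName(int sequence) {
        return "jstack-output-" + sequence + ".txt";
    }

    // Takes `count` dumps of process `pid`, pausing `intervalMillis` between them.
    static void capture(long pid, int count, long intervalMillis, File directory)
            throws IOException, InterruptedException {
        for (int i = 1; i <= count; i++) {
            Process process = new ProcessBuilder(command(pid))
                    .redirectErrorStream(true)
                    .redirectOutput(new File(directory, outputFileName(i)))
                    .start();
            process.waitFor();
            if (i < count) {
                Thread.sleep(intervalMillis);
            }
        }
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 1) {
            System.err.println("usage: java ThreadDumpSeries <pid>");
            return;
        }
        long pid = Long.parseLong(args[0]); // process ID found with jps in step 1
        capture(pid, 15, 2000, new File("."));
    }
}
```

Start it just before triggering the slow use case, so that the dumps span the whole operation.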
To identify the poorly performing code, we are looking for thread stacks that appear frequently across the files. These will either be operations that are executed multiple times, or operations that are executed once but take a long time to complete.
A thread dump is a snapshot of what every thread in the application is doing at a particular moment in time. By taking multiple thread dumps over a period of time, we can build up a picture of where the application is spending its time. This approach is essentially how a sampling profiler works, so we are effectively doing manual profiling of the application.
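The counting that a sampling profiler does can be approximated over the saved dump files. The sketch below is one possible way to do it, assuming the usual HotSpot jstack text layout (a thread header line in double quotes, followed by frame lines that start with "at"). It counts how often each method is the top frame of a thread; FrameSampler is a hypothetical name, not an existing tool.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class FrameSampler {
    // Counts how often each method is the top frame of a thread across all dump texts.
    static Map<String, Integer> topFrameCounts(Iterable<String> dumpTexts) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String dump : dumpTexts) {
            boolean expectTopFrame = false;
            for (String line : dump.split("\\R")) {
                String trimmed = line.trim();
                if (trimmed.startsWith("\"")) {
                    expectTopFrame = true;   // thread header: the next "at" line is the top frame
                } else if (expectTopFrame && trimmed.startsWith("at ")) {
                    String frame = trimmed.substring(3);
                    int paren = frame.indexOf('(');
                    String method = paren >= 0 ? frame.substring(0, paren) : frame;
                    counts.merge(method, 1, Integer::sum);
                    expectTopFrame = false;
                } else if (trimmed.isEmpty()) {
                    expectTopFrame = false;  // a blank line ends the current thread entry
                }
            }
        }
        return counts;
    }

    public static void main(String[] args) throws IOException {
        List<String> dumps = new ArrayList<>();
        for (String file : args) {
            dumps.add(new String(Files.readAllBytes(Paths.get(file))));
        }
        // Print the ten most frequently sampled top frames.
        topFrameCounts(dumps).entrySet().stream()
                .sorted((a, b) -> b.getValue() - a.getValue())
                .limit(10)
                .forEach(e -> System.out.println(e.getValue() + "  " + e.getKey()));
    }
}
```

Methods that top the list are either called very often or slow to return, which are the two cases described above.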
It is common to find that the cause of poor performance is waiting on data from external systems, such as a database or web service. In this case, what we will see in our thread dumps is that there will be many instances where a thread is in JVM code that is reading data from a socket. This is the thread waiting on a response from the remote system, and we know that we should target the external system in order to improve the performance.
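Spotting threads that are blocked on a socket read can be automated in the same spirit. This is a rough sketch only: the marker frames listed in DEFAULT_MARKERS are assumptions based on common HotSpot stack traces, and the exact method names differ between JVM versions and vendors, so check them against your own dumps. SocketWaitCounter is a hypothetical helper.

```java
import java.util.Arrays;
import java.util.List;

class SocketWaitCounter {
    // Assumed marker frames; adjust these to match what your JVM prints.
    static final List<String> DEFAULT_MARKERS = Arrays.asList(
            "java.net.SocketInputStream.socketRead0",
            "sun.nio.ch.SocketDispatcher.read0");

    // Counts thread entries whose stack contains at least one of the marker frames.
    static int countThreadsReadingSocket(String dumpText, List<String> markers) {
        int count = 0;
        // Thread entries in a jstack dump are separated by blank lines.
        for (String block : dumpText.split("\\R\\s*\\R")) {
            if (!block.trim().startsWith("\"")) {
                continue; // not a thread entry (for example, the dump header)
            }
            for (String marker : markers) {
                if (block.contains("at " + marker + "(")) {
                    count++;
                    break; // count each thread once
                }
            }
        }
        return count;
    }
}
```

A count that stays high across the sequence of dumps suggests that the time is going to the remote system rather than to the application code.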
We can quickly look through large numbers of stacks by noting that, in general, the threads of interest to us are those that are running application code. These threads generally have two properties: they are executing code from the packages used in our application, and they have stack traces that are longer than those of threads that are currently idle.
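Both properties can be turned into a simple filter. The sketch below assumes the jstack text layout and takes the application package and the minimum stack depth as parameters; the class name ThreadsOfInterest is hypothetical, and the values com.example and 20 mentioned afterwards are placeholders.

```java
import java.util.ArrayList;
import java.util.List;

class ThreadsOfInterest {
    // Returns the names of threads whose stack has at least `minFrames` frames
    // and at least one frame in the given application package.
    static List<String> find(String dumpText, String appPackage, int minFrames) {
        List<String> names = new ArrayList<>();
        for (String block : dumpText.split("\\R\\s*\\R")) {
            String entry = block.trim();
            if (!entry.startsWith("\"")) {
                continue; // not a thread entry (for example, the dump header)
            }
            int frames = 0;
            boolean inAppCode = false;
            for (String line : entry.split("\\R")) {
                String trimmed = line.trim();
                if (trimmed.startsWith("at ")) {
                    frames++;
                    if (trimmed.startsWith("at " + appPackage + ".")) {
                        inAppCode = true;
                    }
                }
            }
            if (inAppCode && frames >= minFrames) {
                int closingQuote = entry.indexOf('"', 1);
                names.add(closingQuote > 0 ? entry.substring(1, closingQuote) : entry);
            }
        }
        return names;
    }
}
```

For example, find(dumpText, "com.example", 20) would list the threads that are at least 20 frames deep and running code under com.example.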
It is worth noting that WebLogic can report Java threads as “stuck” if they run for a longer period of time than a configurable timeout. This itself is not indicative of a problem, as long-running activities (such as database or file polling) can be valid things for application components to do. Observing thread dumps will tell us definitively whether we need to take further action.
Generating heap dumps is also a good technique for investigating performance issues, and free tools such as the Eclipse Memory Analyzer Tool (MAT) can generate reports on object allocation and help detect memory leaks.
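One way to produce a heap dump from code is the HotSpot-specific HotSpotDiagnosticMXBean. Note that this is a different route from dumping an external server process: the sketch below dumps the JVM that runs it, so it would have to be invoked inside the application, and it does not apply to JRockit. The class name HeapDumpSketch and the generated file name are placeholders.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.lang.management.ManagementFactory;
import java.nio.file.Path;
import java.nio.file.Paths;

class HeapDumpSketch {
    // Writes a heap dump of the current JVM into `directory` and returns its path.
    // With liveObjectsOnly set, only reachable objects are written.
    static Path dumpHeap(Path directory, boolean liveObjectsOnly) {
        // The target file must not exist yet, so a timestamp keeps the name unique.
        Path target = directory.resolve("heap-" + System.nanoTime() + ".hprof");
        HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        try {
            bean.dumpHeap(target.toString(), liveObjectsOnly);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return target;
    }

    public static void main(String[] args) {
        System.out.println("Heap dump written to " + dumpHeap(Paths.get("."), true));
    }
}
```

The resulting .hprof file can then be opened in a heap analysis tool.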
Java performance check using VisualVM on HotSpot:
VisualVM is a powerful graphical tool that comes with the HotSpot JDK. It has a number of views that can be useful for diagnosing performance problems.
1. Use jps to determine the process ID of your server.
2. Start VisualVM by running the jvisualvm command.
If this is the first time you have run VisualVM, it will first perform calibration on your server. Once the calibration is completed, the VisualVM homepage will open.
3. Select the WebLogic process that has the process ID you identified in step 1. Either double-click on it, or right-click and select Open.
4. VisualVM will open the Overview page for the WebLogic Server instance. You should check that the server name matches the one you were expecting.
5. Switch to the Monitor tab, which provides an overview of the performance of the application.
6. There are a number of things that we want to check on this tab:
— The CPU usage should be relatively low (below 30 percent, ideally) and the amount of CPU usage spent on the GC activity should also be low.
— When garbage collection occurs, Used Heap should drop by a significant amount.
7. VisualVM contains two profilers that can be used to identify performance problems. We prefer to use the sampling profiler, which is available on the Sampler tab.
8. Click on the CPU button to begin profiling your application. The table will show which Java methods are taking the most time.
VisualVM hooks into the JVM to read the statistics about memory usage, garbage collection, and other internals, and displays them graphically for the user. Many of these metrics are useful in identifying where a performance problem lies. An application with high CPU usage may be CPU bound, but before just putting it on a host with more or faster CPUs, we need to identify what it is that is using the CPU. Garbage collection is a common culprit, so the graph showing the amount of CPU time spent running the garbage collector is particularly useful. If garbage collection is not the culprit, then the sampling profiler will show us where the CPU time is being spent.
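Similar figures are exposed to code through the standard java.lang.management MXBeans. The sketch below reads the accumulated garbage collection time and the current heap usage of the JVM it runs in. It reports the share of wall-clock uptime spent in collections, which is only a rough stand-in for the GC activity graph, not the same measurement. JvmSnapshot and its method names are hypothetical.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

class JvmSnapshot {
    // Total time spent in garbage collection since JVM start, in milliseconds.
    // Collectors that do not report a time return -1 and are skipped.
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime();
            if (millis > 0) {
                total += millis;
            }
        }
        return total;
    }

    // Share of elapsed time spent in garbage collection, as a percentage.
    static double gcPercent(long gcMillis, long uptimeMillis) {
        return uptimeMillis <= 0 ? 0.0 : 100.0 * gcMillis / uptimeMillis;
    }

    // Used heap as a percentage of the maximum heap (or of the committed heap
    // when no maximum is defined).
    static double usedHeapPercent() {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        long limit = heap.getMax() > 0 ? heap.getMax() : heap.getCommitted();
        return 100.0 * heap.getUsed() / limit;
    }

    public static void main(String[] args) {
        long uptime = ManagementFactory.getRuntimeMXBean().getUptime();
        System.out.printf("GC share of uptime: %.1f%%%n", gcPercent(totalGcMillis(), uptime));
        System.out.printf("Used heap: %.1f%%%n", usedHeapPercent());
    }
}
```

A GC share that is a large fraction of uptime points at garbage collection as the consumer of CPU, as described above.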
Additional plugins can be installed for VisualVM to display the information in more convenient ways, or to display additional information. The Profiler can also be used to profile your application, but it imposes a much higher overhead, and so is more likely to affect the results.
Java performance check using JRMC on JRockit:
JRockit Mission Control is a graphical management console for the JRockit JVM. The tool jrmc is included with the JRockit JVM.
1. Use jps to determine the process ID of your server.
2. Launch JRockit Mission Control by executing the command jrmc from the JVM bin directory.
If this is the first time you have run jrmc, it will open the JRockit Mission Control Welcome page.
3. Click on X next to the Welcome tab in the top-left to close the welcome screen. This will display the bird's-eye view, which can be selected at any time from the Window menu.
4. Select the WebLogic instance that you want to connect to, based on the process ID that you established in step 1. Right-click on the instance and select Start Console.
5. This will display the JRockit Mission Control console for the selected WebLogic instance.
6. There are a number of things that we are looking for on this page:
— JVM CPU Usage should be low, not higher than 30 percent ideally.
— Used Java Heap (%) should increase as memory is used, but should come down again when garbage collection occurs.
JRockit Mission Control gathers management metrics made available by JRockit at runtime and displays them graphically. The Mission Control console shows the CPU and memory usage of the application, with graphs of historical data. We can use this overview to determine whether applications are having memory- or CPU-related performance problems.
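The heap check can be expressed as simple arithmetic over a series of Used Java Heap (%) readings. The sketch below is illustrative only: it assumes the readings have been noted down by hand from the graph, and both the class name HeapTrend and any floor value passed to stuckNearTop are hypothetical choices, not thresholds defined by JRockit.

```java
class HeapTrend {
    // Largest fall between two consecutive samples, in percentage points.
    // A healthy heap shows a large fall each time garbage collection runs.
    static double largestDrop(double[] usedHeapPercent) {
        double largest = 0.0;
        for (int i = 1; i < usedHeapPercent.length; i++) {
            largest = Math.max(largest, usedHeapPercent[i - 1] - usedHeapPercent[i]);
        }
        return largest;
    }

    // True when no sample ever falls below `floorPercent`, meaning that
    // garbage collection never frees a meaningful part of the heap.
    static boolean stuckNearTop(double[] usedHeapPercent, double floorPercent) {
        if (usedHeapPercent.length == 0) {
            return false;
        }
        for (double sample : usedHeapPercent) {
            if (sample < floorPercent) {
                return false;
            }
        }
        return true;
    }
}
```

Readings such as 40, 60, 80, 35, 55 show a healthy saw-tooth pattern, whereas 88, 92, 90, 93, 91 never leave the top of the range and suggest memory pressure.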
Mission Control can also monitor JVMs remotely; refer to the JRockit documentation for guidelines on achieving this at http://www.oracle.com/technetwork/middleware/jrockit/documentation/index.html
JRockit Flight Recorder to identify Java performance problems:
JRockit Flight Recorder is a component of JRockit Mission Control that can be used to record detailed statistics about your application, to be viewed later.
1. Start JRockit Mission Control by following steps 1 through 3 of the previous section.
2. Select the WebLogic instance that you want to connect to, based on the process ID that you established in step 1. Right-click on the instance and select Start Flight Recording. This will display the flight recording settings dialog box.
3. Accept the default Template and Filename settings, and set Name to be something descriptive. Recording Time should be long enough to encapsulate a use case that is considered to be slow. Once you are happy with the settings, click on OK and the recording will begin.
4. While recording is in progress, you can see the remaining time from the Flight Recorder Control view at the bottom of the screen.
5. Once the flight recorder finishes recording, it will open the Overview screen.
This screen displays the overview of the recorded data. The event strip at the top of the screen can be used to select a specific part of the timeline to view. Clicking on Synchronize Selection will keep the selected time range the same across all the viewed screens.
6. Select the Memory tab; this tab shows us a number of interesting statistics about the JVM memory during the runtime. Notice that for more detailed information, you can select one of the subtabs at the bottom of the screen. On this page, we want to check a number of things:
— That the average and maximum Main Pause times are reasonable; for example, below 10 s
— That the heap usage comes down by a reasonable amount when garbage collection occurs, and that it is not just “bouncing around near the top of the graph”
7. Select the CPU/Threads tab. This tab shows the statistics regarding CPU usage and threads. We want to check that the Application + JVM CPU usage does not make up the majority of the total CPU usage, and that the total CPU usage is low (ideally at 30 percent or below).
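The checks in steps 6 and 7 reduce to comparing a few numbers against limits. The sketch below assumes that the pause times and the CPU figure have been read off the recording by hand; it does not parse a flight recording file. RecordingChecks is a hypothetical helper, and the limits are passed in as parameters so that the 10 s and 30 percent guidelines above can be adjusted.

```java
class RecordingChecks {
    // Average of the recorded garbage collection pause times, in milliseconds.
    static double averageMillis(long[] pauseMillis) {
        if (pauseMillis.length == 0) {
            return 0.0;
        }
        long sum = 0;
        for (long pause : pauseMillis) {
            sum += pause;
        }
        return (double) sum / pauseMillis.length;
    }

    // Longest recorded pause, in milliseconds.
    static long maxMillis(long[] pauseMillis) {
        long max = 0;
        for (long pause : pauseMillis) {
            max = Math.max(max, pause);
        }
        return max;
    }

    // If the longest pause is below the limit, the average is as well.
    static boolean pausesReasonable(long[] pauseMillis, long limitMillis) {
        return maxMillis(pauseMillis) < limitMillis;
    }

    // Total CPU usage should ideally be at or below the limit (30 percent in the text).
    static boolean cpuReasonable(double totalCpuPercent, double limitPercent) {
        return totalCpuPercent <= limitPercent;
    }
}
```

For instance, pausesReasonable(pauses, 10_000) applies the 10 s guideline from step 6 to pause times given in milliseconds.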
JRockit Flight Recorder collects a large number of detailed statistics about an application's execution and stores them so that they can be analyzed later. This allows a more detailed analysis than is possible when observing the statistics in real time, because you can go back and “replay” the gathered statistics to focus on certain time periods. Flight Recorder can be run with a number of profiles; the Normal profile has the lowest overhead, so it is the most suitable for use in production environments. The other profiles gather more information but have a higher overhead, so they are more suitable when you can reproduce the problem in a test environment. One of the advertised features of JRockit is that the Normal profile adds virtually no overhead to the JVM execution time, because the metrics it collects are already used internally by JRockit for runtime optimization.
It can be helpful to take Flight Recorder profiles of healthy systems, so that if a future release introduces a performance problem, you have a baseline to compare it to.