In this post I’d like to review a feature called RAT (or Real Application Testing). This feature is an Enterprise Edition database option and is not new at all, it was introduced in 11gR1.
In Oracle, there are different tools that allow the DBA to perform complex tasks very easily and efficiently. RAT is exactly one of these tools, its goal is to validate any change we make to our database. It has a very good OEM integration (Grid Control 11g or Could Control 12c), but it can be used independently using packages supplied in the database itself.
Over the years I configured and upgraded quite a few environments. From real upgrades (e.g. from 9i to 10g or 11g), through OS changes (e.g. Oracle on HP-UX to Linux), to parameter and optimizer changes (memory allocations, move from Rule Based Optimizer to Cost Base Optimizer and more).
In many cases, we had issues and problems on the production environment regardless the amount of time and effort we put in testing. After upgrade, the common issues were low performance (often due to optimizer changes, but sometime even because of memory management changes or other reasons), other issues may include small Oracle bugs (such as bugs causing queries to never complete or get ORA-600). Even with smaller changes such as moving from 32-bit to 64-bit database, or parameter or optimizer changes, we encountered performance issues, memory problems or other unexpected issues.
Of course, that in most cases (as I mentioned above), we tried to test as best as we could. In this type of testing, customers usually start a new database containing the changes (upgrade, new parameters, etc.) and run the application and sometimes manual queries on it. This method has several problems:
- You can’t really run all the application processes. In complex systems we sometimes don’t even know what is running. We know the main processes or processes or users that overload the database or have issues with it, but there can be reports, or other processes that we are not even familiar with and can’t reproduce.
- It doesn’t matter how much we try, we can’t cause the same load on the test environment as the production’s.
RAT comes to solve this exact problem. This features runs the production workload on a test environment and identify changes and problems. It does it using 2 components, which are actually building the RAT feature.
SQL Performance Analyzer (SPA)
SPA is the tool we use to perform SQL statements “unit testing”. The SPA will run all the statements from the production sequentially once by one and will compare execution plan, execution statistics, etc. Using the SPA we will be able to identify SQL statements that behave differently in all kind of ways after the change we have made. For example:
- Statements that read more or less data from the Buffer Cache.
- Statements performing more or less I/O.
- Number of rows the statement returns.
- Execution time.
Using the SPA we are able to identify issues with specific statements, such as problems with optimizer or parameters, and even wrong results issues.
The second tool of RAT completes the SPA and is called Database Replay. Its goal is to execute the workload from the production on the test environment and identify concurrency and scalability issues. Database Replay will try to run the exact workload and then compare statistics and AWR between the environments. With DB Replay we can identify behavior changes of the entire database such as:
- SGA load
- Oracle memory management
- Locking issues
The main advantage of this component is that we can evaluate how the new environment will look like and behave.
When Should We Use RAT
Generally, all the time. We always want to make change and update our environment but we are also afraid of them. Upgrades, parameter changes, and system changes are something that is happening on every production environment. But beyond these massive changes, many production systems are not updated (even though security patches or patches for bug fix are important). The fear from degradation after upgrade, patch installation or any change prevents us from doing these changes.
With RAT we can be much safer. We can run the actual production workload and see how the database behaves after the change from different perspectives, from the single SQL statement to the entire database.
Usually, new features come with new versions, therefore it might take a while until we can implement them. The nice thing about RAT is that it was backport to older versions so the feature will be usable immediately.
RAT was first introduced with Oracle 11gR1 and from this version on it exists in the database. Besides that, it exists in 220.127.116.11 and 10.2.0.4 and up. In some other versions of 9i and 10g it can be installed as a patch.
The matrix of the supported operations in the different versions is a bit complex. Everything is described in MOS note# 560977.1
Hope you enjoyed,