In February ’17 I participated in Mike Dietrich’s upgrade workshop and it was great! I don’t want to repeat stuff that he said there, you can read everything on his blog. This workshop made me think about upgrades I did in the past (and I did quite a few) and important things to think about before and after upgrading a database.
Before we even start planning the upgrade, there are a couple of questions we should ask
- Does the application support the new version? (a very important question especially with 3rd party applications)
- Can we expect problems with the new version?
Answering the first question is relatively easy. The second one is more difficult, but is very important. New versions introduce new behaviors and also new bugs, and these might affect the database after the upgrade. For example, let’s assume we work with AIX and use bigfiles, and would like to upgrade to 220.127.116.11. There might be a known issue with bigfiles on AIX (there isn’t, it’s just a hypothetical example). If there is such an issue, we might decide not to upgrade until it is fixed. In these cases, knowing this stuff in advance might be critical. Oracle publishes known issues in specific database versions, so we should review this information before planning the upgrade.
you can find a lot of information about that in Mike’s blog and presentations, in the documentation, and MOS.
After we’ve decided to upgrade, the next step is to plan the process. The planning is a critical part of the upgrade as a good planning will probably result in an easy and smooth upgrade. Some of the planning steps are quite obvious, like choose the upgrade method (direct, using standby database, export/import, etc.). I’m not going to write about that, but I do would like to mention several things that the planning should include:
- Write a very details procedure. The real upgrade should be very familiar and should not include anything new or unknown.
- Use scripts and prepare commands as much as possible. When we actually perform the upgrade, we don’t want to mistype commands or make mistakes. The procedure should include all the commands that need to be executed, and we will just “copy” and “paste” them.
- Many times the upgrade process requires steps that generate log files. It is very important to have a step in the upgrade procedure to check these log files, so we won’t forget that.
- Have a fallback plan. While we don’t want to get to the place where we need to revert the upgrade and go back to the old version, we have to know how to do it if needed. Another thing here is to decide when this revert decision should be taken. For example, if the system has to be up and running by 8am and the fallback is a full database restore, we need to know how much time the restore takes (and have the commands or script ready). If the restore takes 2 hours, we have to decide if the upgrade is successful or not by no later than 5:30am (to allow some buffer time).
- Test, test, and test some more. Everything in the procedure should be tested. This includes the upgrade itself, the database after the upgrade, and even the fallback plan. Don’t forget to add steps to the procedure as needed (if you decided to change an initialization parameter, for example). I remember a 20TB database upgrade that we tested about 4-5 times before actually upgrading the database. Note that the test should be on an identical environment as the production, because different environments can have different behavior.
Don’t forget that upgrades usually happen at night. When the upgrade is long, we are very tired, and tired people can easily miss stuff. This is why everything in the procedure should be clear and simple. As long as we don’t have any unexpected problem, the procedure should be followed step by step without any “thinking” (all the thinking should happen while preparing the procedure and testing).
After the Upgrade
In the upgrade procedure, don’t forget to add steps to do after the upgrade. This might include updating backups, fixing scripts, updating monitoring tools and so on. There is a new ORACLE_HOME, sometimes a new server, maybe a new SID and it’s very important to update everything to the new environment.
Working With the New Version
The days after the upgrade are risky. It is very common to have issues with the new database, mainly because we can’t check everything on the test environment. We can hit bugs, performance issues, and other things that we didn’t think about (or didn’t happen in test). Performance problems are very common, and I have a method to check optimizer changes. Other issues can be fixed by being creative or open an SR. We should be very responsive and be prepared to handle any issue that comes up. In extreme cases, a revert might be the only option, even after a few days. Specify in the procedure how long after the upgrade this is an option and make sure you are able to revert.
Here are a few examples from my experience:
When upgrading a database from 8.1.7 to 9.2 (without enough testing, I came to do only the real upgrade itself), suddenly packages became invalid and couldn’t be compiled. The reason was that in 8.1.7 there could be a difference in default values between the procedure declaration (in the package header) and the implementation (in the body), while in 9.2 it had to be identical. This broke many packages and the developers worked hard to fix all the packages.
In one of the upgrades I’ve been involved with we did a lot of testing. However, after the real upgrade (18.104.22.168 to 10.2.0.2) we started having shared pool issues. With the help of Oracle support engineer on site, we found out that this is because of shared pool subpools. Setting an underscore parameter fixed it immediately.
Even patchsets can change the database behavior. In GI 22.214.171.124, Oracle introduced a new multicast behavior when communicating between the nodes. I know a few upgrades that simply hung because of this. The upgrade of the database here was fine, but the GI is the first thing to upgrade. This is why it’s important to have an identical test environment (including RAC, ASM, Data Guard or any other feature we use on the production).
Upgrades are complex tasks. In order to perform them well we need to try them, and prepare a good procedure. Good preparation will assist in successful upgrade with as little problems as possible. I tried to give some tips here to help planning better the upgrade procedure, I hope it helps.