Every dba's
nightmare is facing an Oracle disaster recovery situation. So when the
worst has happened and disaster has struck, what do you do?
Well, first of all, you take a deep breath and you don't panic! Next,
you buy some time to think by asking for the details. What
happened? When? It's probably best to avoid asking who,
though, just in
case it was caused by something you did or didn't do!
Then you have to come up with a plan.
Planning The Recovery
How you handle your Oracle disaster recovery and how quickly you can
recover depends on the nature of the disaster, so let's run through a
few common scenarios and see what the options are.
Application Error
Problem:
The monthly batch job has run but the wrong figures were loaded and now
the users want to revert everything to how it was before the batch job
ran.
Solution:
With Oracle 10g and above, recovery from this sort of disaster is much
easier than in the past. The options available to the dba include
flashback query, flashback table, flashback transaction, flashback
database and incomplete recovery.
Let's look at these options in a little more detail.
Oracle
Tips
& Tricks to
SKYROCKET Your
Career!
If
you're not already a subscriber to Oracle Secrets Revealed,
you're missing out on a myriad of tips and techniques to help
you work better, faster and
smarter. Subscribe now and ignite your
career.
Flashback
query
This was introduced with Oracle 9i and let's you query a
table as it was at an earlier point in time or scn. If only a few
tables are
involved you could use this to determine which tables were affected by
the batch job and then restore them.
Flashback
table
This was a new feature with Oracle 10g and enables you to
restore a table (the whole table) to its state at
an earlier time/scn/restore
point. One very useful feature of this is that you can go
backwards and
forwards in time (or scn) as many times as you like to find the exact
time or scn to which to restore the table.
Again, this
is practical only if a small number of tables are involved.
Also the undo needs to be available.
In this situation Oracle disaster recovery is achieved by use of the
following SQL command:
FLASHBACK
TABLE TO <object_type>
<scn/timestamp/restore_point_name>;
where <object_type> is one of
SCN,
TIMESTAMP, RESTORE POINT.
Flashback
database
If a large number of tables are involved and the batch job was the only
thing running at the time, you can rewind the whole database with one
simple command in RMAN or SQL*Plus:
FLASHBACK
DATABASE TO <object_type>
<scn/time/restore_point_name>;
where <object_type> is one of
SCN,
TIME, RESTORE POINT.
The prerequisites for being able to
use
FLASHBACK
DATABASE are:
- The
database must be in
ARCHIVELOG
mode -
archived logs are needed for the Flashback Database operation not to
mention most of the other methods of Oracle disaster recovery.
- You must
have a flash recovery area enabled - flashback logs can only be stored
in the flash recovery area.
- For Real
Application Clusters databases, the flash recovery area must be stored
in a clustered file system or in ASM.
Flashback transaction
This can help with your Oracle disaster recovery by reversing
transactions
and optionally dependent transactions (by use of undo records). This
is performed
by use of the
procedure
DBMS_FLASHBACK.TRANSACTION_BACKOUT specify
either the ids or the names
of transactions to be reversed.
Incomplete recovery
This
is the most complex and time-consuming method of Oracle disaster
recovery and is really your last resort if the other options are not
suitable for any reason.
Incomplete recovery can be
performed with the help of Oracle's Recovery Manager tool RMAN or by
using o/s utilities for the restore and then SQL commands for the
recovery. Recovery in this scenario comprises the following steps:
- Shutdown the database (SHUTDOWN
IMMEDIATE)
- Start the instance and mount the
datbase (STARTUP
MOUNT)
- restore
all the data files and archived logs (as well as the control
file
if the database structure has changed) using o/s commands or the restore
database command in RMAN
- recover the database (by use of the recover
database command in SQL or RMAN) until just before the
start of the batch job (specify the time or scn)
- finally, open the database and reset the logs
Your database will then be in a usable state with the data as it was
before the batch job started.
User Error
Another
scenario requiring Oracle disaster recovery is a user dropping a table,
a whole tablespace or deleting a data file. In this scenario,
a
user is not necessarily a business user - it could be a developer or
even a dba. Never the less you still have to deal with it.
In
this scenario, recovery is the first step, the 2nd step is to
see
if you need to change procedures or security policies to prevent this
happening again! But that's outside this discussion.
So what options are available here?
FLASHBACK
TABLE
Let's
start with the easiest one - a dropped table. A feature introduced with
Oracle 10g is the concept of a recycle bin for dropped tables (and
dependent objects). To recover a dropped table, it is simply a case of
issuing the following SQL command:
FLASHBACK
TABLE emp TO BEFORE DROP;
Note however that dependent objects such as
indexes will have to be renamed, but it's certainly another quick and
easy recovery option!
Recovering a data file/tablespace
Oracle disaster recovery
resulting from the loss of a tablespace or a datafile involves four
steps that can all be done with the database open
for general use - as long as the
SYSTEM
and
UNDO
tablespaces have not been affected.
- Set the affected tablespace offline (alter
tablespace <ts> offline temporary)
- Restore the datafile from backup (using o/s
commands or RMAN)
- Recover the datafile i.e. roll forward using
archived and online redo logs (recover
datafile <filename>)
- Set the affected tablespace online (alter
tablespace <ts> online)
If either the
SYSTEM
or
UNDO
tablespace has been affected the Oracle database will shut down
automatically and so the first and last steps are not required. Again
the whole process is a lot easier if RMAN is used as RMAN keeps track
of where the backup sets are stored and optimises the restoration of
the data files.
To recover a
whole tablespace it's just a case of subsituting the word tablespace
for datafile and specifiying the tablespace name instead of the
datafile name.
Media (hardware) failure
This is probably
the least common of the Oracle disaster recovery scenarios we've looked
at but it is important to know how to deal with this should
the
need arise.
The first step is to determine whether the failure
is permanent or temporary and how wide-spread it is. For a permanent
failure, lost data files need to be restored to a different
location, your Oracle database made aware of the changes and
then
the files recovered as per user errors. With a temporary failure files
can be restored to their original location if they've been lost or
damaged
Assuming a permanent media failure, these are the steps required for
Oracle disaster recovery:
- Shutdown the database (SHUTDOWN
IMMEDIATE)
- Restore the datafiles from backup (using o/s
commands or RMAN)
- Re-start and mount the database (STARTUP
MOUNT)
- Rename each of the affected datafiles (ALTER
DATABASE RENAME FILE <old_location>
TO <new_location>)
- Recover the datafiles i.e. roll forward using
archived and online redo logs (RECOVER
DATAFILE <filename>)
- Re-open the database (ALTER
DATABASE OPEN) and make it available for general use
It
is possible to do this recovery with the database open but it could
well be quicker and easier to shutdown the database rather than having
end-users complain that parts of the application are not working.
See the Oracle documentation for full
details on these commands and Oracle disaster recovery options.