Thursday 12 July 2012

SQL*Net Wires Crossed

I was trying to set up Data Guard between two servers, and I kept getting connection errors when connecting from RMAN on the primary to the standby:
RMAN>  connect auxiliary sys/syspassword@dgtest_stdby

RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-04006: error from auxiliary database: ORA-01017: invalid username/password; logon denied
I tried to validate everything and it looked okay ("tnsping" worked), but I could not connect through SQL*Net at all, whether from RMAN or SQL*Plus. Local direct connections worked fine though. Eventually I managed to work out what was wrong, so here it is in case anyone else runs into the same problem.

What I had originally was:
  • A newly created database (using CREATE DATABASE the old-fashioned way)
  • An Oracle password file created using orapwd with the SYS password
  • A listener.ora on each server defining a static service name for the database on that server
  • A tnsnames.ora on each server defining the mappings of connection names to servers and services
The problem was that I had never explicitly set a password for the SYS user in the newly created database. I could still connect locally, which I assume is because I was in the "dba" group in the operating system. But I could not connect remotely no matter what password I used. So the solution was:
  • ALTER USER SYS IDENTIFIED BY NEW_PASSWORD
  • This automatically updates the $ORACLE_HOME/dbs/orapw$ORACLE_SID password file on the server
  • Copy (ftp) this file over to the second, standby server
Now I could connect over SQL*Net to either database.
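
The steps above could be scripted roughly as follows. This is only a sketch under some assumptions: the helper names `pwfile_path` and `copy_pwfile`, the host name `standby-host` and the SIDs are placeholders I made up for illustration, the ORACLE_HOME path is assumed to be the same on both servers, and scp stands in for the ftp copy mentioned above.

```shell
#!/bin/sh
# Sketch only: helper names, host names and SIDs are hypothetical.

# Print the default Unix location of the password file for a given
# ORACLE_HOME ($1) and ORACLE_SID ($2): <home>/dbs/orapw<SID>
pwfile_path() {
    printf '%s/dbs/orapw%s\n' "$1" "$2"
}

# Copy the primary's password file to the standby host, naming it
# after the standby SID. Assumes the same ORACLE_HOME on both servers.
#   $1 = ORACLE_HOME, $2 = primary SID, $3 = standby SID, $4 = standby host
copy_pwfile() {
    scp "$(pwfile_path "$1" "$2")" "$4:$(pwfile_path "$1" "$3")"
}

# Typical sequence, run as the oracle OS user on the primary:
#   sqlplus / as sysdba
#   SQL> ALTER USER SYS IDENTIFIED BY new_password;
#   SQL> SELECT username, sysdba FROM v$pwfile_users;
#   SQL> exit
#   copy_pwfile "$ORACLE_HOME" dgtest dgtest dgtest standby-host
```

Querying v$pwfile_users after the ALTER USER is a quick way to confirm that SYS is actually listed in the password file before copying it across.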

I would say that the error message is not particularly helpful, as all manner of different error conditions seem to give rise to the same ORA-01017. And there is almost no way of investigating this other than trial and error, which I was getting sick of after 2 hours of getting nowhere. I knew that the underlying SQL*Net worked, as tnsping worked, and I knew that the Listener configurations were correct. Eventually I simply decided to change the SYS password to something completely different and, lo and behold, I could now connect to the primary database remotely. It then took another few minutes before I realised that I needed to copy the Oracle password file to the standby server again, because it had changed.
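
One check that might have saved me those last few minutes is comparing the password file on the two servers. A minimal sketch, assuming the standard cksum utility is available on both; the function name `pwfile_sum` is just something I made up for this example:

```shell
#!/bin/sh
# Sketch only: print a checksum and byte count for a password file so the
# values from the primary and the standby can be compared.
#   $1 = full path to the password file
pwfile_sum() {
    # Reading from stdin keeps the file name out of the output,
    # so the two results can be compared directly.
    cksum < "$1"
}

# Run on each server and compare the two lines of output:
#   pwfile_sum "$ORACLE_HOME/dbs/orapw$ORACLE_SID"
```

If the two outputs differ, the standby still has a stale copy of the file and needs it copied over again.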
