Sunday, 22 July 2012

Oracle 11gR2 on Arch Linux

The Oracle Database software is only "supported" on Red Hat Enterprise Linux and Oracle Enterprise Linux I believe. But it can be made to work on many other Linux distributions one way or another, as a lot of the stuff is the same or equally available on other Linux distributions. While this is not supported or recommended in any way by Oracle, it can be useful if you have a different Linux distribution installed on a test system for one reason or another and cannot change it. Here I will present the recipe for how I got Oracle 11g Release 2 installed on Arch Linux. Remember that none of this should be used in any kind of production like environment as it is unsupported.

Arch Linux

Arch Linux is a Linux distribution based on a rolling release model. This primarily means that there is only ever one version of Arch, being the "latest version", and whenever new releases of software packages are made by the developer they are then added to the Arch repositories after a small delay. This means that you can always get the latest version of any available software package on Arch, and that there is never a big upgrade or reinstall hassle with Arch every 6 or 12 months. Instead you regularly do a system upgrade which pulls down the latest versions of all installed packages from the repository, so that your system is always "up to date".

Another thing about Arch is that it is highly customisable to whatever combination of Linux software applications you want. It does this by only installing a "bare bones" set of Linux packages that gives you a text only console with a "login" prompt (classic old school UNIX). From this you then install all the extra packages you want, to arrive at your customised Linux, just as you want it. No bloat from other applications you never wanted in the first place. This means installing and configuring Arch is more than just booting from a Live CD and clicking on "Install". But Arch has an excellent Wiki including an "Installation Guide" that explains everything.

Arch has its own package manager (pacman) that handles all of the downloading, resolving dependencies and installation of the packages. So if a package is available you can just install it and everything it needs through one simple command.

I'm assuming for this article that you already have Arch Linux installed, and know enough about how to install it. And that you have installed Arch plus X Windows plus a full Desktop Environment (e.g. Gnome or Xfce) plus a Display Manager to handle the initial user login (e.g. GDM). If you don't use a full blown Desktop Environment then you will need the equivalent combination of Window Manager, Application Menu, etc., which is also possible in Arch.

Extra packages

Oracle 11g requires "gcc" which is in the "base-devel" package along with other development tools, such as "make". Install this, plus some other needed packages:
pacman -S base-devel elfutils libaio unixodbc sysstat pdksh icu gdb
pacman -S libstdc++5
The last one - "libstdc++5" - is needed because the make files provided with Oracle make explicit reference to this. Since Oracle 11g was originally compiled on the current at that time version of Red Hat Linux the standard C library has moved on to version 6, but Oracle still wants something called version 5 when installing.

Symbolic links

Oracle expects some commands to exist in the /bin directory when they are now in the /usr/bin directory as standard. You can work around this by creating symbolic links from one location to the other for these files:
ln -s /usr/bin/basename /bin/basename
ln -s /usr/bin/tr /bin/tr
ln -s /usr/lib/libgcc_s.so.1 /lib/libgcc_s.so.1
Other symbolic links are mentioned by other people for other Linux distributions, but for Arch these are the only ones I needed to make - other links already existed for some other files. Also bash is the default shell on Arch, which is the same as on Red Hat.

Normal Oracle Pre-Installation

  • Configure the Linux kernel and other settings as stated in the Oracle installation instructions. This includes:
    • Settings in /etc/sysctl.conf
    • Settings in /etc/security/limits.conf
    • Adding a line to /etc/pam.d/login if not already present
  • Create the groups needed (oinstall and dba), and the "oracle" user account.
  • Create the directories where the software will be installed e.g. /apps/oracle and /apps/oraInventory
  • I create a ".profile" file for the "oracle" user and set ORACLE_BASE, ORACLE_HOME based on the installation directories. Also set ORACLE_SID, and add ORACLE_HOME/bin to PATH.
 

Install Oracle 11g

Get the Oracle 11g software distribution - downloadable from oracle.com - and unzip the files to another directory. Run the installer and make your choices from the screens it presents - I normally go for "Install Software Only" and "Single Instance".

You will get the following issues when running the Installer:
  • All prerequisite checks will fail. This is because Oracle uses "rpm" to find out what packages are installed, and Arch does not use "rpm", so Oracle thinks everything is missing.
    • Providing you have installed the packages listed earlier, simply click on the "Ignore All" check box on the top right of the screen, then click "Next".
    • Click "Yes" on the pop-up dialog box that warns you about ignoring prerequisites. Then click "Install" as normal on the final screen
  • You will get a failure message during installation about "target agent nmhs", from the make file "sysman/lib/ins_emagent.mk".
    • This only affects the Enterprise Manager Agent, which is generally not used, and so can be ignored.
    • It is due to a change in how "gcc" scans libraries to find referenced symbols in a program.
    • If you really want this Agent, a workaround is possible by editing the makefile - just Google it to find a solution.
    • Assuming you don't want this Agent, just click "Continue" to ignore it can carry on.
Eventually you will get the normal pop-up dialog box about running the "root.sh" scripts as root, and then the installation is complete.

From here you can go on and create a database using the Database Configuration Assistant (dbca) or the "CREATE DATABASE" SQL statement, whichever you are more familiar with.

Tuesday, 17 July 2012

Python for the DBA (1)

I really like the Python programming language for general purpose programming and quickly putting together little utilities. There are similarities to Perl in terms of being an interpreted language with powerful data processing capabilities, but there the direct comparisons end. Python is a more modern language than Perl, and has a relatively clean design (nothing is ever perfect though, and Python itself has evolved over the years). The main reasons I prefer Python over Perl are:
  • Very clean syntax - no need for obscure characters in front of variable names and so on
  • Very readable code - it is generally clear what a piece of code is doing
  • Minimal syntax dressing overhead - Python avoids the need for begin / end block statement markers
  • True support for functions
    • Perl just pushes all the arguments into a single list that you must disassemble
  • Strongly typed - Python supports multiple data types and checks at run time that operations are valid
  • Rich set of types - number, string, list, dictionaries, plus others e.g. set, tuple
  • Supports classic "function" based programming - just like 'C' does
    • Very simple to write code to directly do what you want, and modularise common code into functions
  • Also supports full "object oriented programming" - full support for Classes as first level objects
    • Perl does not truly do classes as first level objects
I'm not trying to say that Python is better than Perl in all cases, or other programming languages. I'm just saying that Python is a very good and usable programming language, and that I prefer it over Perl now. With Perl I find it gets very confusing when you try and use any level of complexity and the syntax is not obvious or consistent, whereas with Python I don't get any of this because it is such a cleanly designed language. With Python I can easily build a small application by just putting together the necessary source code while using functions for modularity (what I call "just build and run"), avoiding the overheads of defining classes with fully object oriented programming. However, if I have a more complex set of requirements then I can do a full blown object based solution using classes within Python (too complex to just build, so some initial design is needed, probably with test cases too).

Python is very useful for a variety of different scenarios:
  • building a small application or utility by just writing the code - no "compile and link" steps as it is interpreted
  • building something iteratively (top down design & development) as your code is always runnable (being interpreted)
  • prototyping object based code quicker and easier than compiled languages such as Java
  • ability to extend Python with your own libraries - it is written in C and can call your own compiled libraries
  • quickly prototyping real Java code that uses standard Java Classes in the JVM via Jython 
    • Jython is a Python interpreter written in Java
    • this also uses far fewer lines of source code than would be needed in Java, because of Python's fundamentally different design
Again, some of these capabilities are not unique to Python, but the combination of them all together make it a winner for me.

Python Overview

Python should be straightforward enough to understand on reading it. The only major difference to other languages is that Python does not have begin / end statement block markers and instead relies on statement indenting to determine which statements are part of the same block. You will see that statements with embedded statement blocks in them have a colon (':') at the end of the first line, and the following lines are indented. Initially it takes some getting used to, but it does end up with less typing on your part and a consistent code layout.

Although Python supports what I call direct coding (write some statements and then just run them), it is really fully object oriented behind the scenes. Thus you see many standard functions actually returning objects, against which you invoke one of their methods (see Database Example below). The syntax of "object.method (arguments)" occurs frequently, rather than the non-object way of "function (object, other-arguments)".

Python Development

You could edit and run Python programs from the command line ("python filename.py" - the ".py" extension is just a common convention), or you could use an Integrated Development Environment, such as Eclipse with the PyDev plugin. The nice thing about this is that it does syntax checking for you as you type, so you can easily spot various errors before you run your code. And you can run your Python application from within the development environment, so you don't have to leave the editor each time you want to test something.

Python Database Access

Like Perl, Python defines a standard API for database access. The intention is that developers can write code based on this standard API, and a variety of drivers can be provided for access to different databases. By changing the driver used you can connect to different databases, but the bulk of your code remains the same.

There are a bunch of drivers out there for Oracle, but the most common one seems to be cx_Oracle.

Python Database Examples

Using Python is relatively straightforward. I won't describe the database API in detail, as it is obvious when you use it. In summary, you call a "connect" library function with connection details and get back a connection handle (really an object). Using this you can then create a new statement handle, execute it, and fetch back any data or other results. One neat thing is that Python has a "fetchall" method that fetches back all data rows for a SELECT into a single list (actually a list of rows, each row being a list of columns). This then lets you traverse the data using Python logic, with no more fetch calls. I assume that there are efficiencies with the "fetchall" call over how the data is transferred back over the network, but I have not done any tests about this. It will of course end up using more memory within your Python program to hold all the results fetched.

Example

Connect to an Oracle database and execute a SELECT from a table, printing out the data fetched.
from cx_Oracle import connect

conn = connect ("scott", "tiger", "orcl")
# Or: conn = connect('scott/tiger@orcl')
curs = conn.cursor()
curs.execute("SELECT name, value FROM atable")
rows = curs.fetchall()
for i in range(len(rows)):
    print "Row", i, "name", rows[i][0], "value", rows[i][1]
Notes:
  • len is a built in function that returns the number of entries in a list
  • range is a built in function that produces a list of numbers from 0 to one less than the supplied value
  • List members can be accessed using array like notation

Thursday, 12 July 2012

SQL*Net Wires Crossed

I was trying to set up Data Guard between two servers and I kept getting connection errors from RMAN on the primary to the secondary:
RMAN>  connect auxiliary sys/syspassword@dgtest_stdby

RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-04006: error from auxiliary database: ORA-01017: invalid username/password; logon denied
I tried to validate everything and it looked okay ("tnsping" worked), but I could not connect through SQL*Net at all, whether from RMAN or SQL*Plus. Local direct connections worked fine though. Eventually I managed to work out what was wrong.  So in case anyone else runs into the same problem ...

What I had originally was:
  • A newly created database (using CREATE DATABASE the old fashioned way)
  • An Oracle password file created using orapwd with the SYS password
  • A listner.ora on each server defining a static service name for the database on that server
  • A tnsnames.ora on each server defining the mappings of connection names to servers and services
The problem was that I had never explicitly set a password for the SYS user in the newly created database. I could still connect locally, which I assume is because I was in the "dba" group in the operating system. But I could not connect remotely no matter what password I used. So the solution was:
  • ALTER USER SYS IDENTIFIED BY NEW_PASSWORD
  • This automatically updates the $ORACLE_HOME/dbs/ora$ORACLE_SID password file on the server
  • Copy (ftp) this file over to the second, standby server
Now I could connect over SQL*Net to either database.

I would say that the error message is not particularly helpful, as it seems that all manner of different error conditions can give rise to the same error message. And there is almost no way of investigating this, other than trial and error - which I was getting sick of after 2 hours of getting nowhere. I knew that the underlying SQL*Net worked, as tnsping worked, and I knew that the Listener configurations were correct. Eventually I simply decided to change the SYS password to be something completely different and lo and behold I could now connect to the primary database remotely. It then took another few minutes before I realised that I needed to copy the Oracle password file to the standby server again, because it had changed.

Monday, 2 July 2012

All Outer Joins Are Not Equal

When executing SQL involving ANSI syntax OUTER JOIN's Oracle first converts this to its own native syntax before executing the SQL, and sometimes this conversion is not totally transparent thereby affecting and restricting the execution plan produced.

This started when I had a long running query, and it happened to involve outer joins between most of the tables (but not all of them). From the execution plan it was clear that the main problem was a full table scan on a very large table - out of a total cost of 35,000, 28,000 was from this full table scan alone. So replacing the full table scan by a more efficient index access, if possible, could lead to a significant improvement.

After some analysis I created an index on all of the referenced columns in the table. This meant that all necessary data for the query could be obtained from the index, with no need to visit the data rows in the table at all. As the data row was quite wide (180 bytes), the index was much narrower and much smaller in size. Also, because the index stores data in sorted order by the leading columns, it meant that the rows the query wanted would all be stored together within the index structure, further reducing the number of disk I/Os needed to retrieve it all. But, the Oracle Optimizer would not use this index at all. It just seemed to ignore it, and always do a full table scan.

At this point I tried another of my techniques for understanding how a query gets executed - remove tables from the query one at a time, until you get a more "sensible" execution plan, and then build it back up again to the final query by adding a table at a time, monitoring the execution plan after each change.

What I noticed was that when I got down to 3 tables from the original 7, the Optimizer would now choose to use the new index (doing an Index Fast Full Scan, at a lower cost of 4,000). But when I added back in a fourth table, it went back to doing a Full Table Scan at a cost of 28,000.

I did a 10053 trace on the execution of each query (3 table and 4 table queries) and noted that in the SINGLE TABLE ACCESS PATH section of the 3 table query it was costing an Index Fast Full Scan (IndexFFS), whereas in the 4 table query it was not costing this access path, and only had the Full Table Scan costed.

At this point while I could see what was going on (in one case the Optimizer knows about the Index Fast Full Scan as an option, but in the other case it doesn't) I could not explain why it was happening, nor what to do about it. So I turned to the Oracle Forums and raised a question about Why Optimizer ignoring Index Fast Full Scan? While waiting for replies I also continued to tinker with the queries in the hope of gaining some extra understanding of what was going on.

The cause of the problem was identified by Dom Brooks as being the fact that Oracle internally rewrites ANSI syntax OUTER JOIN's to its own native syntax (using "(+)") before then optimizing and executing the query, and that this can result in some complex SQL that cannot be optimised as you might otherwise expect. Also see his earlier reply which gives more background information and links to other explanations about this.

When it rewrites the ANSI syntax JOIN's to Oracle native syntax, it wraps each join up in its own in-line view, to ensure that it is equivalent to the ANSI meaning. It then tries to merge these inline views back into the main query. For a single outer join, and other simple cases, this view merging happens smoothly. But in other cases it does not, and the Optimizer is left with some complex SQL with the joins being performed within the inline views. And this restricts the options available to it when building the final execution plan.

There is more information on this at the Oracle Optimizer Blog - Outer Joins in Oracle
"In Oracle, ANSI left and right outerjoins are internally expressed in terms of left outerjoined lateral views. In many cases, a left outerjoined lateral view can be merged and the ANSI left (or right) outerjoin can be expressed entirely in terms of Oracle native left outerjoin operator. (A lateral view is an inline view that contains correlation referring to other tables that precede it in the FROM clause.)"
So for a simple single outer join such as the following:
SELECT T1.d, T2.c
FROM T1 LEFT OUTER JOIN T2 ON (T1.x = T2.x);
it first becomes this on initial conversion to a lateral inline view:
SELECT T1.d, LV.c
FROM T1,  LATERAL (SELECT T2.C FROM T2 WHERE T1.x = T2.x)(+) LV;
which then becomes this when the view is merged:
SELECT T1.d, T2.c
FROM T1, T2
WHERE T1.x = T2.x (+) ;
However, if the joins are more complex, or there are other issues e.g. NULL values allowed within indexed columns, then it may not be possible to collapse the lateral views back into a single level query. In which case the options then open to the Optimizer are less than they would be otherwise, as it must treat the unmerged views on their own.

So if you have a query involving OUTER JOIN's that is not getting as good an execution plan as you think it could get, it may be due to you having used ANSI join syntax. And when Oracle rewrites this to its native syntax it is not able to merge back together the inline views for one reason or another. In my case it seems to have been due to the fact that almost all columns allowed NULL values to be stored in them (except the primary key columns), and NULL values are not stored within indexes on disk. As a result, Oracle was not able to utilise this extra index when more tables were added to the query that were also outer joined.