To show how easy it is to get data into a Python program I'll show a little utility to unload data from one table in a database to a flat, text file as one record per line with each data field separated by special delimiters.
Assuming the following:
- Your Python program already has a database connection
- This is relatively easy, but does vary by specific database driver being used
- And it has opened the output file to unload to i.e. an open file handle and not the file name
- And it has the name of the table, the field delimiter characters and the record terminator characters in local variables
def unload_table_to_file (db_conn, table_name, out_data_file, field_separator, record_terminator) :
# Preparation steps before doing the real work
# Replace literals of special characters in the separator and terminator
record_terminator = record_terminator.replace ('\\n', '\n')
field_separator = field_separator.replace ('\\t', '\t')
# Simplest SQL to get the data, assume column ordering is correct
select_SQL = "select * from " + table_name
# Now do the work - read in all records and output them
print ("=== Unloading . . .")
select_cursor = db_conn.cursor ()
select_cursor.execute (select_SQL)
for data_row in select_cursor.fetchall () :
# Convert Python native data types to simple string representations
output_fields = []
for data_field in data_row :
output_fields.append (str(data_field))
out_data_file.write (field_separator.join (output_fields) + record_terminator)
print ("=== Finished")
The main lines are:- First make sure that if any "escaped characters" are in the delimiters then they are the real escape characters
- "
\t
" means the TAB character, and "\n
" means the NEWLINE character
- "
- We just use a "
select * from table
" to get all the data into the Python code- The String concatenate operator is just the "
+
" operator i.e. "+
" is overloaded depending on data type
- The String concatenate operator is just the "
- And we execute this "select" via a Cursor created from the database connection passed in
- The real work happens in 2 nested loops - for each data row, for each data field in each data row
- The fields from one record are put into a list (essentially equivalent to an array)
- This is so we can use a single action at the end to concatenate them all together with each data field value separated by the "field separator"
- Then we "write" out this concatenated list of fields to the output file
- The "
str
" function will return the String representation of a data value e.g. convert an integer to its String representation- This assumes that there exists a data type specific "
str
" method for each possible data type
- This assumes that there exists a data type specific "
- The "
join
" method of a String variable iterates over a list of data, and returns one String of the list contents concatenated together with the value of the String variable between each one.
,
' as the field separator) or something more complicated to avoid issues where a comma may appear in the data itself e.g. "|__|
"
as the field separator (vertical bar, underscore, underscore, vertical
bar). And this flexibility is all handled by this Python code.Caveats
This is probably the most simple Python code to achieve the goal of reading all the data in a table and writing out to a flat, text file. There are several things it does not do at all, and other things that could be enhanced with it.- It is assumed that all data types being used can be cleanly converted to valid String representations using "
str()
"- This may not be true for some data types, or you may wish to handle some data types differently e.g. Dates
- Dates should probably be handled differently, to ensure the full date and time values are output
- You could execute an "
ALTER SESSION
" SQL statement to set the value ofNLS_DATE_FORMAT
to something suitable - Python has a native Date / Time data type, "
datetime
", which OracleDATE
columns are mapped to, and you could test for this
- You could execute an "
- You could add extra functionality such as debugging / tracing messages (use the "
logging
" module) or other progress messages- You may wish to add extra counters to count the number of records and fields per record
No comments:
Post a Comment