- 1.1 Schema Migration
- 1.2 Data Migration
- 1.3 Database professionals use the following methods of emulating ROWID in PostgreSQL:
This article contains practical recommendations for Oracle to Postgres migration based on the experience of an Intelligent Converters company that completed a wide range of related projects. The most common reason for Oracle to Postgres migration is reducing expenses on IT infrastructure by transferring from proprietary DBMS to an open-source equivalent with sufficient scalability, security, and customization capabilities. The suggested tips and tricks are valid for all Oracle versions starting from 11g and all versions of on-premises and cloud PostgreSQL.
The generic approach to Oracle to Postgres migration is designed according to the extract-transform-load (ETL) method and includes the following phases:
- Assessment and planning – Locate Oracle-specific features of storing and processing data with related scope of use.
- Schema migration – Find the automated solution for schema migration and implement it with a focus on correct types mapping and translation of default values, indexes, constraints, and relationships between tables.
- Data migration – Find the appropriate tool for data migration in view of Oracle system tolerance to downtime and overhead. Run the data migration focusing on all required transformations to make the data compatible with PostgreSQL.
- Migration of the database logic – Translate PL/SQL code to PostgreSQL dialect of the procedure language using special software tat can partially automate the conversion.
- Testing – Run performance and functional tests covering all key modules and workflows of the resulting database. Fine-tune all the issues and bottlenecks being found.
Those basic migration steps are explored in detail below
Schema Migration
Although Oracle data types are similar to PostgreSQL, there are some types that have no direct equivalent.
For example, the Oracle DATE type contains both date and time parts while PostgreSQL supports the following options:
- date – pure date without time part
- Time – Net time without date part with optional timezone details
- Timestamp – Date and time with optional timezone details
There are two approaches to conversion of Oracle dates to PostgreSQL: using TIMESTAMP or setting up the extension ‘oracfe’ that allows working with dates in Oracle style through the type oracle.
Another example of types requiring special attention is Oracle spatial type SDO_GEOMETRY. It must be converted into Postgres geography or geometry types. Also, PostgreSQL requires PostGIS extension to be installed in order to work with spatial data types.
Other Oracle types missing in PostgreSQL are listed in the table below:
Oracle | PostgreSQL |
BINARY_FLOAT | REAL |
BINARY_INTEGER | INTEGER |
BINARY_DOUBLE | DOUBLE PRECISION |
BLOB, RAW(n), LONG RAW | BYTEA (1GB limit) |
CLOB, LONG | TEXT (1GB limit) |
DATE | TIMESTAMP |
NUMBER, NUMBER(*) | DOUBLE PRECISION or BIGINT if it is a part of Primary Key |
NUMBER(n,0), NUMBER(n) | n<5 – SMALLINT 5<=n<9 – INT 9<=n<19 – BIGINT n>=19 – DECIMAL(n) |
NUMBER(p,s) | DECIMAL(p,s) |
REAL | DOUBLE PRECISION |
When converting numeric types, it is important to understand the semantics of usage. If accuracy is the top priority, the Oracle ‘number’ type must be mapped in PostgreSQL NUMERIC.
When focusing on calculation speed, the best mapping for numeric types would be Postgres REAL or DOUBLE PRECISION.
Data Migration
Data migration may take essential time and resources, especially for large databases. That is why choosing the right migration strategy and tools is extremely important. There are three most popular techniques of Oracle to Postgres data migration:
- snapshot – it provides that all data is migrated at one step
- piecewise snapshot – it provides that data is migrated by fragments in parallel threads or processes
- change data replication (CDR) – it provides tracking incremental changes for continuous data replication
The snapshot approach requires the largest downtime of the source database since it must be locked for the entire reading data to avoid data loss or corruption. Piecewise snapshot has much less downtime due to parallel reading the source data but it is still required.
CDR method has the least requirements for downtime. It has two major implementations having its own strengths and weaknesses. The first implementation is called trigger-based because it creates triggers in the source database on insert, updates, and delete for each table being migrated.
The purpose of those triggers is to track data changes and to store the related information in the dedicated service table.
Then the CDR tool replicates all changed data to the target database. The trigger-based approach may cause some overhead in the Oracle database due to writing changes into the service table for every data update.
The second CDR technique called ‘Transaction Log’ uses Oracle transaction logs to track changes and replicate it into the PostgreSQL database. Unlike the trigger-based CDR, this approach does not modify the source database.
However, Transaction Log CDR also has weaknesses as follows:
- Oracle DBMS permanently moves transaction log files into archives, so CDC software must constantly replicate changes from the transaction log before it is archived.
- If some error occurs while replicating changes from the transaction log (for example, losing
- connection to the target database), data may be lost or corrupted due to a lack of control over the transaction log.
The most essential challenge of Oracle to Postgres data migration is handling Oracle-specific data formats having no direct equivalent in PostgreSQL and the external data.
Earlier in this article we consider BYTEA as PostgreSQL data type that is most suitable for binary data. Unfortunately, this approach is not suitable for large binary data exceeding 10MB due to the algorithm of reading BYTEA. Postgres does not support piecewise or stream-like reading BYTEA, the data can only be extracted as a single fragment. Obviously, such a method of reading may cause essential RAM overhead on large data. The workaround for this issue is to use PostgreSQL LARGE OBJECT type. All LARGE OBJECT data is stored in the dedicated service table ‘pg_largeobject’ which can store up to 4 billion rows per database. Max allowed volume of LARGE OBJECT is 4TB and it is also acceptable for piecewise reading.
Another bottleneck of Oracle to Postgres data migration is the correct handling of ROWID which is a system pseudo-column to identify each record in a table. Although PostgreSQL offers the service field ‘ctid’ for similar purposes, this is not a direct equivalent of ROWID since it may be changed due to the vacuuming procedure.
Database professionals use the following methods of emulating ROWID in PostgreSQL:
- Use the primary key or unique index to identify rows
- Add a column of type serial (or smallserial/bigserial) with an auto-generated sequence of incremental values
If it is not possible to create a primary key over a single column, build a primary key or unique index on multiple columns that are unique value for every row.
Finally, Oracle provides an ‘external table’ feature to link external data that is stored outside the database to use it as a regular table. PostgreSQL offers a Foreign Data Wrapper library (‘file_fdw’ extension) for the same purpose. For example, it can be used to work with external CSV files.
Challenges of PL/SQL Migration
This section explores the most common issues of migration of packages, store procedures, functions, and triggers from Oracle to Postgres.
Packages. Since PostgreSQL does not support packages, this feature must be implemented by combining all package entries inside the schema with the same name. Package scope variables can replaced by named rows of the dedicated service table.
Empty strings. In Oracle, empty strings are equal to NULL, but in PostgreSQL, they are not. For example, the following Oracle query:
SQL> select NVL('','This is null') as null_test from dual; will produce the output:
NULL_TEST
————
This is null
The same query in PostgreSQL gives another output:
testdb=# select coalesce('','This is null') as null_test;
null_test
———–
(1 row)
String concatenation with NULL values also has different behavior in Oracle and Postgres. In Oracle the query:
SQL> Select 'String' || NULL as cs from dual; will produce the output:
CS
——
String
In PostgreSQL the same query gives another result as follows:
synctest=# Select 'String' || NULL as concat_str;
concat_str
————
(1 row)
Types casting. Unlike Oracle, Postgres implements strict types casting in stored procedures, functions,
operators or when INSERT and UPDATE statements. The workaround for passing arguments into functions or procedures could be using the special pseudo-type ‘anyelement’ as follows:
create or replace function my_concat(str1 anyelement, str2 anyelement)
returns varchar language plpgsql
as $$
begin
return str1::varchar || str2::varchar;
end;
$$;
There can be either only one anyelement argument in stored procedure/function or all parameters of the same type anyelement. In PostgreSQL queries, all types must be strictly cast.
Triggers. Oracle syntax of trigger requires the source code to be included in the CREATE TRIGGER statement. On the other hand, in Postgres trigger’s source code is supposed to be arranged as a standalone function referenced from the CREATE TRIGGER statement:
CREATE OR REPLACE FUNCTION before_update_employees_proc()
RETURNS TRIGGER LANGUAGE PLPGSQL
AS
$$
BEGIN
IF (NEW.first_name <> OLD.first_name) OR (NEW.last_name <>
OLD.last_name) OR (NEW.email <> OLD.email)
THEN
INSERT INTO changes_log(id,changed_on)
VALUES(OLD.id,now());
END IF;
RETURN NEW;
END;
$$
CREATE TRIGGER before_update_employees
BEFORE UPDATE
ON employees
FOR EACH ROW
EXECUTE PROCEDURE before_update_employees_proc();
Autonomous Transactions. Oracle offers autonomous transactions that allow a subprogram to commit or roll back atomic SQL statements without committing or rolling back the overall transaction. Let us illustrate working with autonomous transactions in the following example. Assume, some data is inserted into a table from insert-trigger and this operation must succeed even if insert-trigger is failed. To achieve the goal, the corresponding INSERT statement is arranged inside an autonomous transaction:
CREATE OR REPLACE PROCEDURE insert_critical_data(v_data varchar2)
IS
PRAGMA AUTONOMOUS_TRANSACTION;
BEGIN
INSERT INTO info_table (data) VALUES (v_data);
Commit;
END;
PostgreSQL does not support autonomous transactions directly, so the workaround could be to
refactor the source code to avoid the necessity of autonomous transactions. Another solution is to use PostgreSQL ‘dblink‘ extension. When a new connection is established with dblink, any query is immediately committed separately from the main transaction. For example, this function inserts a row into a table even if the calling transaction is rolled back:
CREATE OR REPLACE FUNCTION insert_critical_data(v_data TEXT)
RETURNS void
AS
$BODY$
DECLARE
v_sql text;
BEGIN
PERFORM dblink_connect('myconn',
'dbname=mydbname user=… password=… host=… port=…');
v_sql := format('INSERT INTO info_table (data)
VALUES (%L)', v_data);
PERFORM dblink_exec('myconn', v_sql);
PERFORM dblink_disconnect('myconn');
END;
$BODY$
LANGUAGE plpgsql;
Built-in Functions. Oracle and PostgreSQL expose rich capabilities through a rich set of built-in functions. Those functions are similar yes not equal, so it is important to convert them properly.
This is list of Oracle functions requiring careful conversion into PostgreSQL:
Oracle | PostgreSQL |
ADD_MONTH($date,$n_month) | $date + $n_month * interval ‘1 month’ |
DECODE($exp, $when, $then, …) | CASE $exp WHEN $when THEN $then … END |
INSTR($str1, $str2) | *POSITION($str2 in $str1) |
ROWNUM | **row_number() over () |
SYSDATE | CURRENT_DATE |
SYS_GUID() | uuid_generate_v1() |
Conversion of views, triggers, stored procedures, and functions can be partially automated using special software such as Oracle to PostgreSQL Code Converter. The product can convert Oracle built-in functions into PostgreSQL equivalents and process the most common PL/SQL syntax patterns into PostgreSQL equivalents. It is important to know that complex source code may require manual post-processing of the converter’s output.
Useful Modules for Oracle to PostgreSQL Migration
PostgreSQL offers useful modules or extensions to emulate some Oracle capabilities and test the results of migration.
One of the most important modules for Oracle to PostgreSQL migration is orafce since it implements specific Oracle types, functions, and operators.
PostgreSQL extensions ‘pgTAP’ and ‘pg_prove’ can be used for functional testing of stored procedures functions. After installing ‘pgTAP’, it creates a lot of stored functions used for writing tests.
PostgreSQL extension plpgsql_check helps to validate PL/pgSQL code. For example, if there is a missing variable declaration or misspelled column name inside the PostgreSQL function, this will definitely fail during execution. This extension is able to diagnose such issues before going to production through the following query:
select * from plpgsql_check_function_tb('{name of function}');
In order to analyze the performance of PL/pgSQL code in functions and stored procedures module plprofiler can be used.
Finally, PostgreSQL offers an extension ‘oracle_fdw’ to connect to external data sources (for example, to Oracle databases) through a technique known as Foreign Data Wrapper. Assume there is an Oracle database
‘OCRL’ running on the server ‘myserver.mydomain.com’. Then oracle_fdw must be configured as follows:
CREATE EXTENSION oracle_fdw;
CREATE SERVER OR SERVER FOREIGN DATA WRAPPER oracle_fdw OPTIONS (dbserver
'//myserver.mydomain.com:1521/ORCL');
ACCORD ON FOREIGN SERVER oraserver TO pguser;
CREATE USER MAPPING FOR pguser SERVER oraserver OPTIONS (user 'oracle user', password 'oracle password');
CREATE FOREIGN TABLE oratable ( { column definitions } ) SERVER oraserver OPTIONS (schema 'ORAUSER', table 'ORATAB');
Then you can use ‘oratable’ as a regular PostgreSQL table.