Tags

,

Or:  “why we should read the manual from time to time…and not blindly follow our intuition…”.

Basically I thought that I know the MERGE command by heart. So, no big deal to write a sql statement that handles changes in the source table and moves them to a target table.

Even removing rows from target, when they are logically deleted in the source should not be a problem as there is an update delete clause in the merge command. So everything went fine…but somehow it did not work the way I expected the merge command to work.


Test case:

CREATE TABLE empl (empno     NUMBER
                  ,ename     VARCHAR(30 CHAR)
                  ,hiredate  DATE
                  ,leavedate DATE)
/

CREATE TABLE empl_current (empno  NUMBER
                          ,ename  VARCHAR2(30 CHAR))
/

INSERT INTO empl VALUES (1,'MUELLER',DATE '2017-01-01', NULL);
INSERT INTO empl VALUES (2,'MEIER'  ,DATE '2017-02-01', NULL);
INSERT INTO empl VALUES (3,'HUBER'  ,DATE '2017-03-01', NULL);
INSERT INTO empl VALUES (4,'HOFER'  ,DATE '2017-04-01', NULL);
INSERT INTO empl VALUES (5,'SCHMIDT',DATE '2017-05-01', DATE '2017-05-31');

COMMIT;

Step 1:

Write a merge that transfers all employees from empl to empl_current if they have not left the company

MERGE INTO empl_current tar
     USING (SELECT empno
                  ,ename
              FROM empl
             WHERE leavedate IS NULL
                OR leavedate > SYSDATE) src ON (tar.empno = src.empno)
WHEN NOT MATCHED THEN INSERT (empno    ,ename)
                      VALUES (src.empno, src.ename)
WHEN     MATCHED THEN UPDATE SET tar.ename = src.ename;

Checking empl_current should now show employees 1 to 4 but not 5 as 5 has left the company already.

SELECT * FROM empl_current; -- should show 1 - 4

EMPNO  ENAME
------ ---------------------
     1 MUELLER
     2 MEIER
     3 HUBER
     4 HOFER

Ok, all fine so far.


Step 2:

Now doing some updates on the source table and check whether the MERGE does what it should do.

UPDATE empl
   SET ename = 'Meier'
 WHERE empno = 2; 

UPDATE empl
   SET leavedate = DATE '2017-06-30'
 WHERE empno = 3; 

COMMIT;

So the previous merge would now handle the changes on employee 2 but not the one of employee 3 as I do not have a delete clause yet.

So I need to add a delete clause to the merge command.

MERGE INTO empl_current tar
     USING (SELECT empno
                  ,ename
                  ,leavedate
              FROM empl
             WHERE leavedate IS NULL
                OR leavedate > SYSDATE) src ON (tar.empno = src.empno)
WHEN NOT MATCHED THEN INSERT (empno    ,ename)
                      VALUES (src.empno, src.ename)
WHEN     MATCHED THEN UPDATE SET tar.ename = src.ename
                      DELETE WHERE leavedate IS NOT NULL
                              AND leavedate <= sysdate;

SELECT * FROM empl_current; 

EMPNO ENAME
------ ---------------------
     1 MUELLER
     2 Meier     ---> changed 
     3 HUBER     ---> still here (?)
     4 HOFER

Now, the reason for HUBER (3) still being here is, that it cannot be deleted, as it is not a match as the row is not part of the source dataset due to the filter I applied….


Step 3:

The source data set should also contain rows, that should not be inserted but eventually deleted.

So changing the source filter.

MERGE INTO empl_current tar
     USING (SELECT empno
                  ,ename
                  ,leavedate
              FROM empl             
           --  WHERE leavedate IS NULL    /* remove this line */
           --     OR leavedate > SYSDATE  /* and this too */
           ) src ON (tar.empno = src.empno)
WHEN NOT MATCHED THEN INSERT (empno    ,ename)
                      VALUES (src.empno, src.ename)
WHEN     MATCHED THEN UPDATE SET tar.ename = src.ename
                      DELETE WHERE leavedate IS NOT NULL
                               AND leavedate <= sysdate;


SELECT * FROM empl_current; 


EMPNO ENAME
------ ---------------------
     1 MUELLER
     2 Meier
     4 HOFER
     5 SCHMIDT   ---> ???

Ok, Huber has gone but we have Schmidt now in the target table despite the fact that he has left the company…so what we need to add  is a filter on the “NOT MATCHED” insert statement to not add rows that we are not (or no longer) interested in…


Step 4:

Improve the filter on the insert statement in the “MATCHED” case. So no rows get inserted that we are not interested in.

MERGE INTO empl_current tar     
     USING (SELECT empno
                  ,ename
                  ,leavedate
              FROM empl) src ON (tar.empno = src.empno)
WHEN NOT MATCHED THEN INSERT (empno    ,ename)
                      VALUES (src.empno, src.ename)
                       WHERE leavedate IS NULL
                          OR leavedate > SYSDATE
WHEN     MATCHED THEN UPDATE SET tar.ename = src.ename
                      DELETE WHERE leavedate IS NOT NULL 
                              AND leavedate <= SYSDATE;

SELECT * FROM empl_current; 

EMPNO ENAME
------ ---------------------
     1 MUELLER
     2 Meier
     4 HOFER

This is, where I placed a trap…

I came this far with my first attempt … but I added an trap by just following my intuition and not reading the manual.

The reason for getting trapped was, that I asked myself the question:

“Why should I update a row, that I am going to delete anyway?” I think UPDATE and DELETE should exclude each other so we are only updating when not deleting and only deleting when not updating….

Thefore my MERGE looked like this:

MERGE INTO empl_current tar
     USING (SELECT empno
                  ,ename
                  ,CASE WHEN leavedate <= SYSDATE
                        THEN 'Y'
                        ELSE 'N'
                   END AS delete_flag
              FROM empl) src ON (tar.empno = src.empno)
WHEN NOT MATCHED THEN INSERT (empno    ,ename)
                      VALUES (src.empno, src.ename)
                       WHERE delete_flag = 'N'
WHEN     MATCHED THEN UPDATE SET tar.ename = src.ename
                       WHERE delete_flag = 'N'
                      DELETE WHERE delete_flag = 'Y';

I added a delete flag, which made it easier to insert only unmatched rows that have not yet been deleted and it was also not to complicated to distinguish between update an delete too.

So add another testcase to get rid of  empno 4…

UPDATE empl
   SET leavedate = DATE '2017-08-31'
 WHERE empno = 4; 

MERGE INTO empl_current tar
     USING (SELECT empno
                  ,ename
                  ,CASE WHEN leavedate <= SYSDATE
                        THEN 'Y'
                        ELSE 'N'
                   END AS delete_flag
              FROM empl) src ON (tar.empno = src.empno)
WHEN NOT MATCHED THEN INSERT (empno    ,ename)
                      VALUES (src.empno, src.ename)
                      WHERE delete_flag = 'N'
WHEN     MATCHED THEN UPDATE SET tar.ename = src.ename
                       WHERE delete_flag = 'N'
                      DELETE WHERE delete_flag = 'Y';

SELECT * FROM empl_current;                       

EMPNO ENAME
------ ---------------------
     1 MUELLER
     2 Meier
     4 HOFER  --- ????

Hmmm…

So, why is employee 4 still in the target table? I checked my statement over and over again and finally had a look to the documentation which says:

ORACLE Database SQL Language Reference

merge_update_clause

The merge_update_clause specifies the new column values of the target table. Oracle performs this update if the condition of the ON clause is true. If the update clause is executed, then all update triggers defined on the target table are activated.

Specify the where_clause if you want the database to execute the update operation only if the specified condition is true. The condition can refer to either the data source or the target table. If the condition is not true, then the database skips the update operation when merging the row into the table.

Specify the DELETE where_clause to clean up data in a table while populating or updating it. The only rows affected by this clause are those rows in the destination table that are updated by the merge operation. The DELETE WHERE condition evaluates the updated value, not the original value that was evaluated by the UPDATE SETWHERE condition. If a row of the destination table meets the DELETE condition but is not included in the join defined by the ON clause, then it is not deleted. Any delete triggers defined on the target table will be activated for each row deletion.

 

That was somehow surprising…but, after removing the WHERE clause on the update, everything worked again…

MERGE INTO empl_current tar
     USING (SELECT empno
                  ,ename
                  ,CASE WHEN leavedate <= SYSDATE
                        THEN 'Y'
                        ELSE 'N'
                   END AS delete_flag
              FROM empl) src ON (tar.empno = src.empno)
WHEN NOT MATCHED THEN INSERT (empno    ,ename)
                      VALUES (src.empno, src.ename)
                      WHERE delete_flag = 'N'
WHEN     MATCHED THEN UPDATE SET tar.ename = src.ename
                      DELETE WHERE delete_flag = 'Y';

SELECT * FROM empl_current; 

EMPNO ENAME
------ ---------------------
     1 MUELLER
     2 Meier

Conclusion

Testing is everything … from time to time you stumle upon things you thought you know by heart … and then you will be thankful for the great documentation job oracle did.