Tags

, ,

During my earlier work as a consultant and in my todays job as a technical supporter to the development team I encountered various SQL & PL/SQL antipatterns. I would like to share some of them with you.

The first antipattern is the “check existance by doing a complete count” pattern.


How do we check whether a particular value exists in a table?

E.g. we would like to know whether there is a person in UK or DE which earns at least 80% of the maximum salary defined by the job. If we found that kind of rows, we would have to call a given procedure.

An anti-pattern that can be often found to solve this kind of question, is reading way to many rows to take your decision.

...
   select count(*)
     into L_Counter
     from EMPLOYEES   e
     join JOBS        j on (j.JOB_ID        = e.JOB_ID)
     join DEPARTMENTS d on (d.DEPARTMENT_ID = e.DEPARTMENT_ID)
     join LOCATIONS   l on (l.LOCATION_ID   = d.LOCATION_ID)
    where e.SALARY > (j.MAX_SALARY * .8)
      and l.COUNTRY_ID in ('DE','UK');

   if L_Counter > 0 then
      procedure1();
   end if;
...

To reduce the number of rows to be read, we should add a rownum filter to stop searching as soon as we found a hit.

...
   select count(*)
     into L_Counter
     from EMPLOYEES   e
     join JOBS        j on (j.JOB_ID        = e.JOB_ID)
     join DEPARTMENTS d on (d.DEPARTMENT_ID = e.DEPARTMENT_ID)
     join LOCATIONS   l on (l.LOCATION_ID   = d.LOCATION_ID)
    where e.SALARY > (j.MAX_SALARY * .8)
      and l.COUNTRY_ID in ('DE','UK')
      and rownum = 1;

if L_Counter > 0 then
   procedure1();
end if;
...

A different approach would be to use a subquery to answer the question connecting it to dual using an EXIST clause. One of the main differences to the count/rownum approach is, that this kind of query may raise a no_data_found exception and we therefore do not have to check a counter variable. We can rely on a “normal continuation” if a row was found. If no row exists an exception will be raised.

...
   begin
      select 1
        into L_Counter
        from dual
       where exists (select 'Employee with > 80% salary in UK/DE found'
                       from EMPLOYEES   e
                       join JOBS        j on (j.JOB_ID        = e.JOB_ID)
                       join DEPARTMENTS d on (d.DEPARTMENT_ID = e.DEPARTMENT_ID)
                       join LOCATIONS   l on (l.LOCATION_ID   = d.LOCATION_ID)
                      where e.SALARY > (j.MAX_SALARY * .8)
                        and l.COUNTRY_ID in ('DE','UK'))

      procedure1();
   exception
      when no_data_found then
         ...
   end;
...


To update or not to update?

Nowadays the question whether to insert or update a row can easily be solved using the merge command. However I came accross situations where the insert had to be done using an API whilst the update was done directly.

Situations like that are another place where you can find the “check existance by count”-antipattern.

...
   select count(*)
     into L_Counter
     from EMPLOYEES   e
    where e.EMPLOYEE_ID = in_emprec.employee_id;

   if L_Counter > 0 then
      update EMPLOYEES e
         set ...
       where e.EMPLOYEE_ID = in_emprec.employee_id;
   else
      InsertEmployee(in_emprec => in_emprec);
   end if;
...

This solution has two problems:

  1. In case of an update we need to process the employees table twice (first for the count, second for the update)
  2. We ignore the fact, that between the count and the follow-up action the situation could change due to data manipulations done by other sessions.

As a solution for the insert or update question (if no merge is possible) my suggestion would be to try the update and do the API call if the update command ended in updating 0 rows.

...
   update EMPLOYEES e
      set ...
    where e.EMPLOYEE_ID = in_emprec.employee_id;
   
   if SQL%ROWCOUNT = 0 then
      InsertEmployee(in_emprec => in_emprec);
   end if;
...

Or (even better) thinking about handling update/insert in the API using a merge command.


Testing nested tables for existence of an element

A way of testing existance of elements I see quite often is the “SQL way”.  The collection is scanned with a SQL doing a count(*) (without stop key). Maybe this has been an evolutionary way a couple of years ago, but today …

...
   select count(*)
     into L_Count
     from table(L_NestedTable)
    where column_value = L_Value;
 
    if (L_COUNT > 0) then
...

Today the easiest way is to use MEMBER OF to check for existens of an certain element in a nested table.

...
   if L_value member of L_NestedTable then
...

Conclusion

Knowing bad habits is the best way to avoid them.

Most of the examples I found are based on:

  • copy/paste coding
  • sticking to things that worked in the past (without questioning them)
  • ignoring new features

This (and the other posts on the anti-patterns) should be a help to get over them.