Tags

,

With ORACLE 19c another gap in the listagg feature was filled. Now we are able to do a native distinct operation on the values in the list, so that the list no longer contains duplicates.

In prior releases we had to deduplicate the values before using them in the listagg, which was rather dreadful when doing other group functions in the same query.


LISTAGG Deduplication

Imagine we would have to list all product subcategories of the SH.Products table group by the product category.

select PROD_CATEGORY
     , listagg(PROD_SUBCATEGORY,' | ') as SubCategories
  from SH.PRODUCTS
 group by PROD_CATEGORY;

PROD_CATEGORY                 SUBCATEGORIES
----------------------------  -----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Photo                         Camcorders | Camera Batteries | Camera Batteries | Camera Batteries | Camera Batteries | Camera Batteries | Camera Media | Camera Media | Camera Media | Cameras
Electronics                   Game Consoles | Home Audio | Home Audio | Y Box Accessories | Y Box Accessories | Y Box Games | Y Box Games | Y Box Games | Y Box Games | Y Box Games | Y Box Games | Y Box Games | Y Box Games                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           
Software/Other                Accessories | Accessories | Accessories | Accessories | Accessories | Bulk Pack Diskettes | Bulk Pack Diskettes | Documentation | Documentation | Documentation | Documentation | Documentation | Documentation | Operating Systems | Recordable CDs | Recordable CDs | Recordable CDs | Recordable CDs | Recordable CDs | Recordable CDs | Recordable CDs | Recordable DVD Discs | Recordable DVD Discs | Recordable DVD Discs | Recordable DVD Discs | Recordable DVD Discs                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             
Peripherals and Accessories   Accessories | Accessories | Accessories | Accessories | Accessories | CD-ROM | CD-ROM | CD-ROM | CD-ROM | CD-ROM | CD-ROM | Memory | Memory | Modems/Fax | Modems/Fax | Monitors | Monitors | Printer Supplies | Printer Supplies | Printer Supplies | Printer Supplies                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   
Hardware                      Desktop PCs | Portable PCs

As we can see, there are duplicates in the subcategory lists. So we had to do a deduplication in the past.

select PROD_CATEGORY
     , listagg(PROD_SUBCATEGORY,' | ') as SubCategories
  from (select distinct PROD_CATEGORY, PROD_SUBCATEGORY
          from SH.PRODUCTS)
 group by PROD_CATEGORY;

PROD_CATEGORY                SUBCATEGORIES
---------------------------  -------------------------------------------------------------------------------------------------------------
Electronics                  Y Box Accessories | Game Consoles | Home Audio | Y Box Games
Software/Other               Bulk Pack Diskettes | Recordable CDs | Recordable DVD Discs | Accessories | Documentation | Operating Systems
Photo                        Camera Batteries | Camera Media | Camcorders | Cameras
Peripherals and Accessories  CD-ROM | Accessories | Modems/Fax | Monitors | Printer Supplies | Memory
Hardware                     Portable PCs | Desktop PCs

Now, imagine we would like to add the sum of the list_prices to this query … to get the data consistent, we had to do the sum on the list_prices in two steps. First per distinct prod_category/prod_subcategory to get a deduplicated list of subcategories and then on the category level using the pre-calculated sums.

select PROD_CATEGORY
     , listagg(PROD_SUBCATEGORY,' | ') as SubCategories
     , sum(SumProdListPrice)           as SumProdListPrice
  from (select PROD_CATEGORY
             , PROD_SUBCATEGORY
             , sum(PROD_LIST_PRICE) as SumProdListPrice
          from SH.PRODUCTS
         group by PROD_CATEGORY
                , PROD_SUBCATEGORY) p1
 group by PROD_CATEGORY;

Oracle 19c spelling

Now this got much easier with Oracle 19c, the only thing we have to do is adding distinct to the listagg function

select PROD_CATEGORY
     , listagg(distinct PROD_SUBCATEGORY,' | ') as SubCategories
     , sum(PROD_LIST_PRICE)                     as SumProdListPrice
  from SH.PRODUCTS
 group by PROD_CATEGORY;

If you need to have the list ordered, then you have to use the within group syntax.

select PROD_CATEGORY 
     , listagg(distinct PROD_SUBCATEGORY,' | ') within group (order by PROD_SUBCATEGORY) as SubCategories  
     , sum(PROD_LIST_PRICE) as SumProdListPrice
  from SH.PRODUCTS
 group by PROD_CATEGORY;

WITHIN GROUP clause

Starting with version 19c the WITHIN GROUP clause is optional and only needed if the list has to be ordered.


Conclusion

Adding the possibility to remove duplicates from the list using the distinct keyword looks like a tinyness. But as most of the time duplicates in the list are not wanted this tinyness is really useful.