Using Oracle's CONNECT BY Clause to Filter Hierarchical Data Without Breaking the Hierarchy

Traversing Hierarchical Data with Oracle’s CONNECT BY Clause

Oracle’s CONNECT BY clause is a powerful tool for querying hierarchical data. It allows you to traverse a tree-like structure, starting from the root and moving down to the leaf nodes. In this article, we’ll explore how to use CONNECT BY to filter rows that match a condition without breaking the hierarchy.

Understanding Hierarchical Data

Before diving into the query, let’s understand what hierarchical data is. Hierarchical data represents a tree-like structure where each node has one or more child nodes. In Oracle, hierarchical data is typically represented using a parent-child relationship between two columns: PRIOR and CONNECT BY.

The PRIOR column refers to the parent node of the current node, while the CONNECT BY clause specifies the condition for traversing the hierarchy.

Querying Hierarchical Data

Oracle provides several methods for querying hierarchical data. Two common approaches are:

  1. Using a single CONNECT BY clause with the START WITH and CONNECT BY keywords.
  2. Using multiple queries, where you start at the bottom of the tree (leaf nodes) and then recursively traverse up to the top.

In this article, we’ll focus on using a combination of the DISTINCT clause, the CONNECT BY clause, and recursive queries to filter rows that match a condition without breaking the hierarchy.

Filtering Hierarchical Data

The original question asks how to filter hierarchical data so that only rows with a specific value in the LAST_NAME column are returned. However, instead of using the WHERE clause, which would break the hierarchy, we want to traverse the tree and return all nodes that match the condition.

To achieve this, we can use the following approach:

  1. Start by selecting distinct values from the LAST_NAME column that match the condition.
  2. Use a recursive query to start at the bottom of the tree (leaf nodes) and then recursively traverse up to the top.

Recursive Query

Here’s an example of how you can use a recursive query to filter hierarchical data:

-- sample data from your question
with t1(EMPLOYEE_ID,LAST_NAME,MANAGER_ID) as(
   select 101, 'Kochhar'   ,  100 from dual union all
   select 108, 'Greenberg' ,  101 from dual union all
   select 109, 'Faviet'    ,  108 from dual union all
   select 110, 'Chen'      ,  108 from dual union all
   select 111, 'Sciarra'   ,  108 from dual union all
   select 112, 'Urman'     ,  108 from dual union all
   select 113, 'Popp'      ,  108 from dual
 )
-- actual query 
select employee_id
      , manager_id
      , concat(lpad('-', 3*level, '-'), last_name) as last_name
  from (
        -- using distinct to get rid of duplicate parents 
        select distinct last_name 
              , employee_id
              , manager_id
          from t1
          start with last_name like '%a%'
         connect by   employee_id  =  prior manager_id 
     ) q
   start with manager_id = 100
 connect by prior employee_id = manager_id

This query uses a recursive CTE (Common Table Expression) to start at the bottom of the tree and then recursively traverse up to the top. The START WITH clause specifies the starting point, while the CONNECT BY clause defines the condition for traversing the hierarchy.

Result

The result of this query is:

  EMPLOYEE_ID MANAGER_ID LAST_NAME           
----------- ---------- --------------------
        101        100 ---Kochhar          
        108        101 ------Greenberg     
        109        108 ---------Faviet     
        111        108 ---------Sciarra    
        112        108 ---------Urman      
        200        101 ------Whalen        

6 rows selected.

As you can see, the query returns all nodes that match the condition (LAST_NAME like ‘%a%’) without breaking the hierarchy.

Best Practices

Here are some best practices to keep in mind when working with hierarchical data:

  • Use meaningful column names: When defining the PRIOR and CONNECT BY columns, use meaningful column names that clearly indicate their purpose.
  • Start at the bottom of the tree: When building recursive queries, it’s essential to start at the bottom of the tree (leaf nodes) to avoid infinite loops.
  • Use recursive CTEs: Recursive CTEs are an efficient way to traverse hierarchical data without using multiple queries.

Conclusion

In this article, we explored how to use Oracle’s CONNECT BY clause and recursive queries to filter hierarchical data. By understanding hierarchical data and using the right techniques, you can efficiently query complex tree-like structures without breaking the hierarchy. Remember to start at the bottom of the tree, use meaningful column names, and leverage recursive CTEs to build powerful queries that get the job done.


Last modified on 2025-02-23