Tag Archives: sql

Writing efficient SQL

The other day I gave a presentation on ‘Efficient SQL’. It was the first of a number of presentations, so I started with explaining some basic concepts.

Maybe it will be interesting for other people too. So here is the summary of it.

 

MAIN RULES OF EFFICIENT SQL

 

The main rules of efficient SQL are:

·         Less is better

·         More is better

That seems easy, since everything is better. But let me explain.

Less is better: The less I/O generated the better your statement will perform. Even though somebody might be able to think of some exceptions. It is safe to keep this as a rule of thumb.

Avoid unnecessary work for your query. You can do that by selecting only the rows that you need. (That sounds obvious, but I’ll give an example soon). In complex queries, make sure that you select the smallest possible set in every part of your query. Rather than collecting a huge amount of data and then selecting what you need, select the smallest set possible before you join.

Also select only the columns that you need. One argument is that it might give Oracle a chance to skip the table access, and use an index-only access.

Think about the use of ‘select *’. Most often this is a complete waste of resources. In packaged software, it can easily lead to bugs when the table definition changes. In all situations, it will cost extra resources to collect the data, send them to the client and then filter out the data that is not needed.

 

More is better: This is of course not about I/O. It is about the information that you give Oracle about your data and your query. Add as many predicates as possible, since it can help the optimizer do a better job. When 2 tables are joined on an ID-column. But you know, from your knowledge of the data, that another column can also be used as a join-condition then use both join conditions.

It will help the optimizer work out the relationship between the tables, and select the optimal plan. If 2 columns have related data, and the predicate on one column means that the other column is restricted too, put a predicate on both columns.

 

Consider the following:

SQL> create table Xxx_inner

  2  as select * from dba_objects

  3  where object_type='TABLE';

 

Table created.

 

SQL> create index xxx_inner_n1 on xxx_inner(object_id,object_type);

 

Index created.

 

SQL> create index xxx_inner_n2 on xxx_inner(Object_type);

 

Index created.

 

SQL> create table xxx_outer

  2  as select * from dba_objects;

 

Table created.

 

SQL> insert into  xxx_outer

  2  select * from dba_objects;

 

71136 rows created.

 

SQL> create index xxx_outer_n1 on xxx_outer(Object_id,object_type);

 

Index created.

 

SQL> create index xxx_outer_n2 on xxx_outer(object_type);

 

Index created.

 

SQL> analyze table xxx_inner compute statistics;

 

Table analyzed.

 

SQL> analyze table xxx_outer compute statistics;

 

Table analyzed.

 

SQL> set autotrace traceonly;

SQL> select o.*

  2  from xxx_outer o

  3  where o.object_id in (select object_id from xxx_inner i)

  4  and o.object_type='PROCEDURE';

 

no rows selected

 

Statistics

———————————————————-

        113  consistent gets

       1064  bytes sent via SQL*Net to client

 

Consistent gets is the number of times data was read from the buffer cache into our session memory. It is therefore a good measure of the amount of work a session needed to do while executing a query.

SQL> select o.*

  2  from xxx_outer o

  3  where o.object_id in (select object_id

    from xxx_inner i

    where i.object_type=o.object_type)

  4  and o.object_type='PROCEDURE';

 

no rows selected

 

Statistics

———————————————————-

         15  consistent gets

       1064  bytes sent via SQL*Net to client

 

SQL> select o.object_id

  2  from xxx_outer o

  3  where o.object_id in (select object_id

    from xxx_inner i

    where i.object_type=o.object_type)

  4  and o.object_type='PROCEDURE';

 

no rows selected

 

Statistics

———————————————————-

         15  consistent gets

        254  bytes sent via SQL*Net to client

 

 

Here we see two tables created from ‘DBA_OBJECTS’. We join these tables on ‘OBJECT_ID’. In this case, we have some information that the Oracle Optimizer does not have. The object_id,object_type combination is the same in both tables. Unaware of this fact, Oracle has to search all object_id’s for ‘XXX_INNER’. When we tell Oracle that the object_type is the same, it can skip most of the records resulting in less I/O.

When we finally tell Oracle that we’re only interested in the object_id, we also reduce the network traffic by 75%.

 

The concepts

 

For every query, Oracle will have to collect some data. And in most cases, it will have to join one or more data-sets. Let’s see what options Oracle has for collecting data and joining the data together. In this presentation we only look at the basic options, not the more sophisticated features. So don’t expect this list to be complete. We will see how data can be retrieved from the database, and how tables / data-sets can be joined together. We will not yet go into the most efficient way to do it, because the most efficient way to retrieve data depends on many factors. Only when you have a basic understanding of the concepts, we can start thinking about the most efficient way to do things.

 

Collecting data

 

To gather data from a table, Oracle can use 3 different options, called ‘Access Paths’:

·         Full Table Scan

·         Index Scan / Table Access by Rowid

·         Index Scan

The Full Table Scan (FTS) is exactly as the name implies, a full scan of the table. All the records of the table are read into memory, and checked against the predicates in the query (where clause or Join condition). The blocks of the table don’t have to be read in any particular order. Oracle will just try to get the blocks of the table as quickly as possible.

When a sizable part of the table needs to be selected, this will be the most efficient access path. It will be the only one available, if there is no predicate available that matches (part of) an index.

The ‘Index Scan / Table Access by Rowid’, will use an index to decide which rows need to be read into memory. Then based on the rowid in the index, the correct row will be retrieved. The rowid refers directly to the position on disk where the row is located.

And the last option is the ‘Index Scan’. The difference with the previous option is that Oracle does not need to get the table data anymore. The data in the index is sufficient to answer the query.

 

So which one of these is the most efficient?

As usual, it depends. Many people think a FTS is less efficient than an Index Scan. But consider this:

SQL> create table xxx_access as select * from dba_objects;

 

Table created.

 

SQL> create index xxx_access_n1 on xxx_access (object_id);

 

Index created.

 

SQL> select object_id,object_name from xxx_access where object_id>0;

 

71139 rows selected.

 

Elapsed: 00:00:08.02

 

Execution Plan

———————————————————-

   0      SELECT STATEMENT Optimizer=ALL_ROWS (Cost=287 Card=78799 Bytes=6225121)

   1    0   TABLE ACCESS (FULL) OF 'XXX_ACCESS' (TABLE) (Cost=287 Card=78799 Bytes=6225121)

 

Statistics

———————————————————-

       5803  consistent gets

    2727442  bytes sent via SQL*Net to client

      71139  rows processed

 

SQL> select /*+ INDEX (xxx_access) */ object_id, object_name from xxx_access where object_id>0;

 

71139 rows selected.

 

Elapsed: 00:00:10.05

 

Execution Plan

———————————————————-

0     SELECT STATEMENT Optimizer=ALL_ROWS (Cost=1290 Card=78799 Bytes=6225121)

1   0   TABLE ACCESS (BY INDEX ROWID) OF 'XXX_ACCESS' (TABLE) (Cost=1290 Card=78799 By=6225121)

2   1     INDEX (RANGE SCAN) OF 'XXX_ACCESS_N1' (INDEX) (Cost=161 Card=78799)

 

Statistics

———————————————————-

      10771  consistent gets

    2727442  bytes sent via SQL*Net to client

      71139  rows processed

 

The FTS needs 5803 I/O operations. While the Index Scan takes almost double at 10771 I/O’s and 2 seconds more. (out of 10 sec’s!)

Imagine what will happen, when you try this with a multi-million row table.

 

But for sure an Index Scan will be the most efficient when it can be done? Again, it depends. Consider this:

 

SQL> create table test1 (l number, txt varchar2(500));

Table created.

 

SQL> create index test1_idx on test1(l,txt);

Index created.

 

SQL> Begin

  2  For I In (Select  Level L, Rpad('ABC',500,To_char(Level)) Txt

  3            From Dual Connect By Level <= 50000) Loop

  4    Insert Into Test1 Values (I.L,I.Txt);

  5    If Mod(I.L,5)!=0 Then

  6       Delete From Test1 Where L=I.L;

  7    End If;

  8  End Loop;

  9  Commit;

 10  end;

 11  /

PL/SQL procedure successfully completed.

 

SQL> analyze table test1 compute statistics;

Table analyzed.

 

SQL> set autotrace traceonly;

SQL> select l from test1 t where l>0;

10000 rows selected.

 

Execution Plan

———————————————————-

   0      SELECT STATEMENT Optimizer=ALL_ROWS (Cost=773 Card=10000 Bytes=40000)

   1    0   TABLE ACCESS (FULL) OF 'TEST1' (TABLE) (Cost=773 Card=10000 Bytes=40000)

 

Statistics

———————————————————-

       1393  consistent gets

     140549  bytes sent via SQL*Net to client

      10000  rows processed

 

SQL> select /*+ INDEX(t,test1_idx) */ l from test1 t where l>0;

10000 rows selected.

 

Execution Plan

———————————————————-

   0      SELECT STATEMENT Optimizer=ALL_ROWS (Cost=3338 Card=10000 Bytes=40000)

   1    0   INDEX (RANGE SCAN) OF 'TEST1_IDX' (INDEX) (Cost=3338 Card=10000 Bytes=40000)

Statistics

———————————————————-

       4003  consistent gets

     140549  bytes sent via SQL*Net to client

      10000  rows processed

 

 

 

 

How can the Full Table Scan be more efficient?

 

Good question. To answer it, we have to look at the amount of work that Oracle has to do to get the data. We start with the FTS:

As mentioned, a FTS needs to scan all the blocks in a table. In my system, the table XXX_ACCESS was created with 1152 blocks. So basically Oracle will read 1152 blocks. However Oracle has optimized this process. One of the most noticeable optimizations is the db_file_multiblock_readcount, which tells Oracle to read multiple blocks in one I/O operation. So instead of doing 1152 reads, Oracle reads 4 (on my system) blocks at a time in 288 reads. (Take a look at the explain plan above again, and notice the cost of the Full Table Scan!).

Now how much work does an Index Scan / Table Access rowid need to do?

To answer that question, you need to know the structure of an index. An index in Oracle is a B-Tree structure. It looks like an inverted tree, with the top being the ‘Root-block’. The lowest level contains the ‘Leaf blocks’, that hold the index keys and the matching rowid’s. The index keys in the leaf blocks are sorted. So the lowest value will be in the utter-left block, the highest in the far right block.

All upper level blocks show the ranges that the lower level blocks contain.

To find a range of data, Oracle starts at the root block, and follows the pointers to the first leaf block containing an index key within the range. From here Oracle can walk the leaf blocks from left to right (or right to left, if needed).

See the following picture:

That means that Oracle has to walk the ‘height’ of the index (Index level). Then read a number of leaf blocks. And for every entry in the leaf-block, it needs to retrieve the data from the table.

In the worst case scenario, the data is spread throughout the table. And for every index key, Oracle has to retrieve a different table block. In that case, the amount of work is: (index level – 1) + number of index leaf blocks + (Number of keys * Number of table blocks).

It will be obvious that is a lot more I/O than just reading the table once. The formula is not completely correct, because Oracle does not read the same table block twice, when the next rowid from the index is in the same block.

But the formula should make it clear that with more data being retrieved, the cost of the index access / Table Rowid, is increasing faster than the cost of the Full Table Scan.

 

What happened to Index Only Access?

 

The example showed that a Full Table Scan is still more efficient than using only an index. Even though Oracle would only have to walk through the index to retrieve all the data.

This is caused by a feature of the index structure and the way the data was entered into the table. This caused the index to grow bigger than the actual table:

SQL> select table_name,num_rows,blocks from dba_Tables where table_name='TEST1';

 

TABLE_NAME             NUM_ROWS     BLOCKS

——————– ———- ———-

TEST1                     10000        772

 

SQL> select index_name, distinct_keys, leaf_blocks from dba_indexes where index_name='TEST1_IDX';

 

INDEX_NAME           DISTINCT_KEYS LEAF_BLOCKS

——————– ————- ———–

TEST1_IDX                    10000        3334

 

This situation can occur when a lot of inserts and deletes are taking place in the same transaction. The space for the deleted table rows can be reused immediately. But the space for the deleted index keys only comes available after the commit. NOTE: The space in the index will be reusable after the commit!

Another situation where this can happen is with an index key based on a sequence. So new data is only inserted at the end of the range. When you delete a lot of values on the lower end of the range (but not all). Richard Foote has some excellent material on his website about Deleted Index Keys:

http://en.wordpress.com/tag/index-delete-operations/

 

Join Mechanisms

 

I hope you’re not yet in despair. Because so far we have only retrieved data from single tables. In most cases, we need to join tables together to get our data. Now we will look at 3 basic forms of joining data-sets together. Note that it is not necessarily only tables that we join. It can also be a result set from an earlier part of your query. For example we might collect some data from an Index Only Access and then join it to data from a Full Table Scan.

Oracle has 3 mechanisms to perform Joins: Nested Loops, Hash Join, Merge Join. Let’s take a look at them.

Nested Loops

 

The Nested Loop join loops over a smaller data set, and for every record in that set, it searches a matching record in the second data set. Visually, it looks like this:

 

You can see that the first set is fully scanned. This is the Outer or Driving Set. For every record in this set, we look up a matching record in the second set. Usually this will be done through an index, even though this is not mandatory.

The efficiency of this mechanism depends on the size of Set A, and the number of records we need to  retrieve from Set B.

We will see some situations where a Nested Loops join is more or less efficient.

The second join mechanism is the

Hash Join

 

For a Hash Join, Oracle builds a hash table from a (preferably) smaller data set (Build set), where the key is a hash value derived from the join columns. Then Oracle reads the second data set (Probe set), derives the hash value on the join columns and probes the hash table for a match.

Schematically, it looks like this:

For the Hash Join, it is not relevant if the data is retrieved through an index, or from a Table scan. Both data sets need to be read fully. One of the features of hashing is that collisions may occur. A collision means that 2 different values result in the same hash value. Therefore, the full data set is stored in the hash table. When a match is found on the hash values, Oracle double-checks the actual values of the join columns to see if they match.

There is a major drawback for this mechanism. When the hash table does not fit into the available memory (hash_area_size). Then Oracle will dump parts of the hash table to disk (Temp tablespace). A bitmap of all possible hash values is kept in memory with a bit indicating if a hash-value is used or not.

So while scanning the Probe set, Oracle can check if there is a possible match. If that part of the hash table is not in memory, Oracle will set aside the records from the probe table. When finished with the probe set, the next part of the hash table is loaded into memory, and the records that were set aside are tested again. This is called a ‘Multipass Hash Join’, instead of the ‘Onepass Hash Join’ where the entire hash table is held in memory.

You can imagine that the hash join can be very efficient even without index access. However it can deteriorate quickly when a ‘Multipass Hash Join’  is needed.

That brings us to the

Merge Join

 

A Merge Join is possible when 2 datasets are both ordered on their join columns. The datasets can then be read in an alternating way. You start reading the first dataset, then you read the second dataset until you find a matching value, or exceed the value. If it is a matching value, you can start building your result set. If you pass the join value, you continue reading the first dataset again.

Schematically, it looks like this:

 The main requirement is of course that both data sets are ordered, before the join takes place. If that is the case, this is probably the most efficient join mechanism. It can handle all kinds of join comparisons, including range comparisons.

Back to efficient SQL

 

Now it is time to go back to the main focus of this article. How to write efficient SQL.

We have seen how Oracle can access data and join it together. It is the job of the Oracle Optimizer to find out the most efficient way to do that, for a given query. It does this based on statistics on the tables and indexes. It is not in the scope of this article to discuss how to gather statistics. But we will see the use of several of the statistics. These statistics include (but are not limited to) the number of rows in the table, the number of blocks, the average row length, etc. For an index, they include (but are not limited to) the number of leaf blocks, the number of keys per leaf block, the number of datablocks in the table per key, etc.

In an ideal world, we would be able to trust the optimizer to do the right thing all the time.

However, we live in an imperfect world, and the Oracle Optimizer does not always have perfect information about the data or your query. Either the statistics might not be up to date, or they might not include some dependencies within the data. Also your query might not give the optimizer all the information that you have. (see the first example in this article, where we can give the optimizer extra information by adding a predicate).

Consider this query:

Select * From

Tiny  T, –7089 rec

Small S   — 71151 Rec

Where t.Object_id = s.Object_id;

 

Which explain plan is more efficient:

 

0 SELECT STATEMENT  

1  HASH JOIN        

2   TABLE ACCESS FULL  TINY

3   TABLE ACCESS FULL  SMALL

 

Or:

 

0 SELECT STATEMENT                      

1  TABLE ACCESS BY INDEX ROWID SMALL     

2   NESTED LOOPS                       

3    TABLE ACCESS FULL         TINY    

4    INDEX RANGE SCAN          SMALL_IDX

 

The first explain plan uses a Hash Join, with Tiny as the Build Table. The second uses a Nested Loops Join, with Tiny as the Inner Table.

I hope you have decided that you don’t know which one is the most efficient. Because you cannot know based on the information you have. If I selected 2 non-overlapping sets of data, the Nested Loop Join might be more efficient.  It would scan Tiny for 7089 records, then do 7089 index lookups on Small (without result), so it would need to read approximately 7089+7089=14178 blocks.

 

If the 2 sets would have a 1-n relationship, and every occurrence of object_id in Small is also in Tiny, both tables would need to be read fully and the Hash Join would be more efficient.

 

That means that to write efficient SQL, you’ll need to have an understanding of the data. And you must have considered the optimal execution plan for your query. When you write your query, keep in mind the mantra in first part of this article. Less is Better, More is Better. You want to give Oracle as much information as possible, and you want to get as small as possible result sets.

 

The last part that we will do in this part, is to look at the

Explain plan

 

To see what Oracle will do when executing a query, you can make an explain plan. Many tools have built-in options to show explain plans on queries. Alternatively, you can use the Oracle commands:

SQL> Explain plan for

  2  Select * From

  3  Tiny  T, –7089 rec

  4  Small S   — 71151 Rec

  5  Where t.Object_id = s.Object_id;

 

Explained.

 

SQL> select * from table( dbms_xplan.display() );

 

PLAN_TABLE_OUTPUT

——————————————————————————–

Plan hash value: 803478362

 

—————————————————————————-

| Id  | Operation          | Name  | Rows  | Bytes | Cost (%CPU)| Time     |

—————————————————————————-

|   0 | SELECT STATEMENT   |       |  7089 |  1287K|  1264   (1)| 00:00:16 |

|*  1 |  HASH JOIN         |       |  7089 |  1287K|  1264   (1)| 00:00:16 |

|   2 |   TABLE ACCESS FULL| TINY  |  7089 |   650K|   213   (0)| 00:00:03 |

|   3 |   TABLE ACCESS FULL| SMALL | 71151 |  6392K|  1049   (1)| 00:00:13 |

—————————————————————————-

 

 

PLAN_TABLE_OUTPUT

——————————————————————————–

Predicate Information (identified by operation id):

—————————————————

 

   1 – access("T"."OBJECT_ID"="S"."OBJECT_ID")

 

15 rows selected.

 

I took the previous query, and ran an explain plan on it. First I should tell you how I created tiny and small, so you can verify:

 

create table tiny as select * from dba_objects where mod(object_id,10)=0;

 

create table small as select * from dba_objects;

 

create index tiny_idx on tiny(object_id);

 

create index small_idx on small(object_id);

 

Analyze Table Tiny Compute Statistics;

 

analyze table small compute statistics;

 

You can see that both tables were created from dba_objects and tiny is a subset of small. Therefore a Hash Join makes most sense.

 

Let’s take a closer look at the explain plan now. Keep in mind that all the numbers are estimates from the optimizer. They are based on the statistics available, and the runtime numbers might be completely different.

The execution starts at the rows furthest right. And top down. In this case that means that we start with a Full Table Scan of Tiny. Then we do a Full Table Scan of Small, and finally we Hash join the 2 sets together.

The result of the Hash Join is returned, and becomes the result of the query (Select).

 

Let’s take a closer look at the Full Table Scans. After the name of the table, we see ‘Rows’,’Bytes’,’Cost’ and ‘Time’.

The ‘Time’ is an estimate of the amount of time it will take to complete this step. It is useful in query tuning  to see which step will take the most time.

The ‘Bytes’ are an estimate of the amount of data used to complete this step. I find it useful when a hash join is involved. Because when it exceeds my hash_area_size, Oracle will need a ‘Multi-Pass Hash Join’.

The ‘Cost’ is often used for query tuning, and people will try to ‘tune down’ the cost. This makes sense, because the cost is an estimate for the amount of I/O Oracle needs to do for this step. However, there is a logical trap in this.

Oracle has used the cost of different explain plans for the original query to decide the most efficient one. Now when we change the query, the cost can no longer be compared to the original query. After all, it is a different query. It will return a different result. Unless, of course there is some information about the data we are selecting, that we didn’t give the optimizer initially.

 

To me, it makes more sense to look at the ‘Rows’. This is the number of records Oracle expects from that step in the query. If this is very different from the number you are expecting, something is wrong. Either the statistics are outdated, or the Optimizer is missing some information that you have, or you are not selecting the data you are expecting.

 

In the query above, we see that Oracle has made the perfect assumptions. Tiny will return 7089 rows, Small will return 71151 rows and there is a 1-1 relationship  between them. So the join will return one record for each record in Tiny. No use looking for a more efficient plan here. Unless again, we did not query what we are looking for.

 

 

 

Selecting from custom types in (workflow) tables

 

After a long silence, it’s time to write another blog entry. I received a request to write about Oracle Reports. And I think that will be a new series (Even though I didn’t finish the workflow series yet). 

 

But the last few days, I’ve been working on cleaning up workflow tables. 

Most of these tables are very straightforward and you can find queries and descriptions in the workflow series. However, there are some more complex cases. There are the advanced queuing (aq) tables. And also some data hidden in wf_item_attribute_values for items started by the Business Event System (BES) (reminder to self: Write that article about BES too).

 

In the aq-tables, the payload of the message (i.e. the data transferred by the message) is stored in a custom type. The same goes for  the event_data in wf_item_attribute_values. In this article we’ll see how we can get the data from those ‘strange’ columns. 

 

Let us start with wf_item_attribute_values. Processes that are started from a business event, store the data from the originating business event in the column ‘EVENT_VALUE’. This has a type ‘WF_EVENT_T’. When you query it in a sql*plus session, you will see a huge column filled with something like: 

 

EVENT_VALUE(PRIORITY, SEND_DATE, RECEIVE_DATE, CORRELATION_ID, PARAMETER_LIST(NAME, VALUE), EVENT_NA
----------------------------------------------------------------------------------------------------
WF_EVENT_T(0, '17-SEP-09', NULL, NULL, WF_PARAMETER_LIST_T(WF_PARAMETER_T('TASK_ID', '16719879'), WF('ABORT_WORKFLOW', 'N'), WF_PARAMETER_T('SUB_GUID', '73BAB9A51BAF5307E04400144F687CA0')), 'oracle.ap, NULL, NULL)

 

The problem that many ebs-dba’s are facing is how to select the data inside this column. In this case, it would be the task_id that we are interested in. Oracle delivers several API's for use in PL/SQL. But sometimes you want plain SQL. 

One way to do it is to use a clever substr/instr construction or a regular expression. But the efficient way to do it, is to tell Oracle which info you want. 

Let’s take a look at the custom type. As mentioned, it is WF_EVENT_T. We can find the description in the DBA_TYPES table.

 

select owner,type_name,typecode,attributes,methods 
from dba_types 
where type_name='WF_EVENT_T';

OWNER                          TYPE_NAME                      TYPECODE                       ATTRIBUTES    METHODS
------------------------------ ------------------------------ ------------------------------ ---------- ----------
APPS                           WF_EVENT_T                     OBJECT                                 13         31

 

 

This tells us that the WF_EVENT_T is an object type. 

We can find its attributes in DBA_TYPE_ATTRS: 

select owner,type_name,attr_name,attr_type_owner,attr_type_name,length 
from dba_type_attrs 
where type_name=’WF_EVENT_T’;

OWNER		TYPE_NAME	ATTR_NAME         	ATTR_TYPE_OWNER		ATTR_TYPE_NAME              
---------	------------	-------------------	----------------	--------------
APPS   		WF_EVENT_T      CORRELATION_ID					VARCHAR2                    
APPS     	WF_EVENT_T      ERROR_MESSAGE       				VARCHAR2                    
APPS     	WF_EVENT_T      ERROR_STACK         				VARCHAR2                    
APPS     	WF_EVENT_T      ERROR_SUBSCRIPTION  				RAW                         
APPS     	WF_EVENT_T      EVENT_DATA          				CLOB                        
APPS     	WF_EVENT_T      EVENT_KEY					VARCHAR2                    
APPS     	WF_EVENT_T      EVENT_NAME          				VARCHAR2                    
APPS     	WF_EVENT_T      FROM_AGENT         	APPS			WF_AGENT_T
APPS     	WF_EVENT_T      PARAMETER_LIST    	APPS			WF_PARAMETER_LIST_T
APPS     	WF_EVENT_T      PRIORITY         				NUMBER
APPS     	WF_EVENT_T      RECEIVE_DATE     				DATE
APPS     	WF_EVENT_T     	SEND_DATE        				DATE
APPS     	WF_EVENT_T     	TO_AGENT         	APPS			WF_AGENT_T

You see that the attributes are defined including their datatype, which can be a seeded datatype (VARCHAR2) or a custom one (WF_PARAMETER_LIST_T). Now that we now the attributes of the type, we can select them directly. To select the ‘PRIORITY’, just use an extra qualifier: 

select v.event_value.priority
from   wf_item_attribute_values v 
where  ROWNUM = 1;

EVENT_VALUE.PRIORITY
--------------------
                   0

But how about the ‘PARAMETER_LIST’? That is where the task_id was stored. Let’s check the WF_PARAMETER_LIST_T definition: 

select owner,type_name,typecode,attributes,methods 
from   dba_types 
where  type_name ='WF_PARAMETER_LIST_T';

OWNER	  TYPE_NAME         	TYPECODE  		ATTRIBUTES  METHODS
--------- -------------------	-------------------	----------- -------	
APPS      WF_PARAMETER_LIST_T	COLLECTION       		  0       0

 

This time, the type is a collection. We can find more info about a collection with: 

select type_name,coll_type,elem_type_owner,elem_type_name 
from 	 dba_coll_types 
where  type_name='WF_PARAMETER_LIST_T';

TYPE_NAME                      COLL_TYPE                      ELEM_TYPE_OWNER                ELEM_TYPE_NAME
------------------------------ ------------------------------ ------------------------------ --------------
WF_PARAMETER_LIST_T            VARYING ARRAY                  APPS                           WF_PARAMETER_T

 

So the WF_PARAMETER_LIST_T is a Varray of WF_PARAMETER_T. Before we look at selecting from Varrays, we first check what WF_PARAMETER_T looks like:

select owner,type_name,typecode,attributes,methods 
from   dba_types 
where  type_name ='WF_PARAMETER_T';

OWNER	  TYPE_NAME           TYPECODE	ATTRIBUTES METHODS
--------- ------------------- ---------	---------- ----------
APPS  	  WF_PARAMETER_T      OBJECT             2          4

 

That is an object type again. So we select: 

select owner,type_name,attr_name,attr_type_owner,attr_type_name
from   dba_type_attrs 
where  type_name='WF_PARAMETER_T';

OWNER	TYPE_NAME             ATTR_NAME	ATTR_TYPE_OWNER	    ATTR_TYPE_NAME      
--------- ------------------- --------- ------------------- --------------
APPS	WF_PARAMETER_T        NAME			    VARCHAR2            
APPS	WF_PARAMETER_T	      VALUE			    VARCHAR2           

 

Ok. We now know the whole structure of the parameter list. Back to the Varray.  A Varray (Varying Array) is of course an array structure. Since this is similar to a table structure, you can cast the Varray into a table. Then you use the casted table to select your data. Let’s do that to get the names from the parameter list.

select t.name
from   wf_item_attribute_values v
,      table(v.event_value.parameter_list) t
Where  v.event_value IS NOT NULL 
And    ROWNUM = 1;

NAME
------------------------------
TASK_ID

 

Now that’s a neat trick. We can join our table to its own column! 

In our case, we only have a task_id parameter. We could do the same again, to get the value of the parameter from the value column.

But to join wf_item_attribute_values to itself is a very expensive operation. Take a look at the explain plan: 

PLAN_TABLE_OUTPUT
 
-----------------------------------------------------------------------------------------------------------------
| Id  | Operation                           |  Name                     | Rows  | Bytes | Cost  | Pstart| Pstop |
-----------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |                           |     1 |    62 |   432M|       |       |
|*  1 |  COUNT STOPKEY                      |                           |       |       |       |       |       |
|   2 |   NESTED LOOPS                      |                           |   207G|    11T|   432M|       |       |
|   3 |    PARTITION RANGE ALL              |                           |       |       |       |     1 |    77 |
|   4 |     PARTITION HASH ALL              |                           |       |       |       |     1 |     8 |
|*  5 |      TABLE ACCESS FULL              | WF_ITEM_ATTRIBUTE_VALUES  |    25M|  1453M|   948K|     1 |   616 |
|   6 |    COLLECTION ITERATOR PICKLER FETCH|                           |       |       |       |       |       |
-----------------------------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   1 - filter(ROWNUM=1)
   5 - filter(SYS_OP_NOEXPAND("V"."EVENT_VALUE") IS NOT NULL)
 
Note: cpu costing is off  	 

 

 

That can kind of hurt if it,were it not for the NL with a COUNT STOPKEY (Because of the rownum=1). The reason for the expensive plan, is because for every row from wf_item_attribute_values, Oracle needs to get the data from event_value. It basically does a carthesian join with itself.

So here is another way to get your data: 

select v.item_key
,      (select value
        from table(v.event_value.parameter_list) t
        where t.name='TASK_ID' ) task_id
from   wf_item_attribute_values v
where  v.event_value is not null
and    rownum=1;

ITEM_KEY						TASK_ID
-------------------------------------------------	--------
oracle.apps.jtf.cac.task.createTask-155563676		16719879

 

Now we use a scalar subquery, where Oracle will access only the event_value for the rows that will be returned to the user. You can see this in the explain plan by the collection iterator picklefetch (= the operation that collects the data from a collection type) being pushed up to just before the select. 

PLAN_TABLE_OUTPUT
 
---------------------------------------------------------------------------------------------------------------
| Id  | Operation                         |  Name                     | Rows  | Bytes | Cost  | Pstart| Pstop |
---------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                  |                           |     1 |    60 |   948K|       |       |
|*  1 |  COLLECTION ITERATOR PICKLER FETCH|                           |       |       |       |       |       |
|*  2 |  COUNT STOPKEY                    |                           |       |       |       |       |       |
|   3 |   PARTITION RANGE ALL             |                           |       |       |       |     1 |    77 |
|   4 |    PARTITION HASH ALL             |                           |       |       |       |     1 |     8 |
|*  5 |     TABLE ACCESS FULL             | WF_ITEM_ATTRIBUTE_VALUES  |    25M|  1453M|   948K|     1 |   616 |
---------------------------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   1 - filter(SYS_OP_ATG(VALUE(KOKBF$),1,2,2)='TASK_ID')
   2 - filter(ROWNUM=1)
   5 - filter(SYS_OP_NOEXPAND("SYS_ALIAS_1"."EVENT_VALUE") IS NOT NULL)
 
Note: cpu costing is off

 

Still expensive. But much better.

 

So that will let us peek into the  event_value column on wf_item_attribute values. Be aware that different processes will enter different parameters into the event_value column. But with the information above you should be able to understand how to select the data and use it in your own queries. 

 

Now let’s take a look at the AQ-tables? 

As I mentioned, they carry their payloads in custom types too. One that most ebs-dba’s will have seen is WF_NOTIFICATION_OUT. When the workflow engine generates a notification, it will be stored in ‘WF_NOTIFICATIONS’. However, a message will be posted on ‘WF_NOTIFICATION_OUT’ too. This message will be read by the workflow notification mailer, which will use it to get the relevant data from WF_NOTIFICATIONS.

 

If you haven’t run the notification mailer for a long time, you might want to clean up WF_NOTIFICATION_OUT a bit. There is a script available from Metalink to do the job. (There is also a quicker but unsupported way). But you might want to see the records in WF_NOTIFICATION_OUT related to WF_NOTIFICATIONS to decide if a cleanup is appropriate. 

 

Let’s look at WF_NOTIFICATION_OUT:

Name                    Null?    Type
----------------------- -------- ----------------
Q_NAME                           VARCHAR2(30)
MSGID                   NOT NULL RAW(16)
CORRID                           VARCHAR2(128)
PRIORITY                         NUMBER
STATE                            NUMBER
DELAY                            DATE
EXPIRATION                       NUMBER
TIME_MANAGER_INFO                DATE
LOCAL_ORDER_NO                   NUMBER
CHAIN_NO                         NUMBER
CSCN                             NUMBER
DSCN                             NUMBER
ENQ_TIME                         DATE
ENQ_UID                          NUMBER
ENQ_TID                          VARCHAR2(30)
DEQ_TIME                         DATE
DEQ_UID                          NUMBER
DEQ_TID                          VARCHAR2(30)
RETRY_COUNT                      NUMBER
EXCEPTION_QSCHEMA                VARCHAR2(30)
EXCEPTION_QUEUE                  VARCHAR2(30)
STEP_NO                          NUMBER
RECIPIENT_KEY                    NUMBER
DEQUEUE_MSGID                    RAW(16)
SENDER_NAME                      VARCHAR2(30)
SENDER_ADDRESS                   VARCHAR2(1024)
SENDER_PROTOCOL                  NUMBER
USER_DATA                        SYS.AQ$_JMS_TEXT _MESSAGE

 

The regular AQ-information is there. And our payload in USER_DATA. This time it’s an AQ-type. 

Let’s follow the same procedure: 

select type_name,typecode,attributes,methods
from   dba_types
where  type_name='AQ$_JMS_TEXT_MESSAGE';

TYPE_NAME                      TYPECODE                       ATTRIBUTES    METHODS
------------------------------ ------------------------------ ---------- ----------
AQ$_JMS_TEXT_MESSAGE           OBJECT                                  4         34

An object type. So let’s see its attributes: 

select type_name,attr_name,attr_type_owner,attr_type_name
from   dba_type_attrs 
where  type_name='AQ$_JMS_TEXT_MESSAGE';

TYPE_NAME                      ATTR_NAME                      ATTR_TYPE_OWNER                ATTR_TYPE_NAME      
------------------------------ ------------------------------ ------------------------------ --------------
AQ$_JMS_TEXT_MESSAGE           HEADER                         SYS                            AQ$_JMS_HEADER
AQ$_JMS_TEXT_MESSAGE           TEXT_LEN                                                      INTEGER
AQ$_JMS_TEXT_MESSAGE           TEXT_LOB                                                      CLOB                
AQ$_JMS_TEXT_MESSAGE           TEXT_VC                                                       VARCHAR2      

 

We can already read the text_len, text_lob and text_vc. The last 2 contain an XML with a reference to the notification. But the information that I want to show you is in the header. This is a type ‘AQ$_JMS_HEADER’. When we check this one, we see that it again is an object with these attributes:

select type_name,typecode,attributes,methods
from   dba_types
where  type_name='AQ$_JMS_HEADER';

TYPE_NAME                      TYPECODE                       ATTRIBUTES    METHODS
------------------------------ ------------------------------ ---------- ----------
AQ$_JMS_HEADER                 OBJECT                                  7         31

select type_name,attr_name,attr_type_owner,attr_type_name
from   dba_type_attrs 
where  type_name='AQ$_JMS_HEADER';

TYPE_NAME                      ATTR_NAME                      ATTR_TYPE_OWNER                ATTR_TYPE_NAME
------------------------------ ------------------------------ ------------------------------ -------
AQ$_JMS_HEADER                 APPID                                                         VARCHAR2
AQ$_JMS_HEADER                 GROUPID                                                       VARCHAR2
AQ$_JMS_HEADER                 GROUPSEQ                                                      INTEGER
AQ$_JMS_HEADER                 PROPERTIES                     SYS                            AQ$_JMS_USERPROPARRAY
AQ$_JMS_HEADER                 REPLYTO                        SYS                            AQ$_AGENT
AQ$_JMS_HEADER                 TYPE                                                          VARCHAR2
AQ$_JMS_HEADER                 USERID                                                        VARCHAR2

 

As you can imagine, we need to drill down into ‘PROPERTIES’. The other attributes might or might not contain any data, depending on the notification. properties is a Varray of ‘AQ$_JMS_USERPROPERTY’:

select type_name,typecode,attributes,methods
from   dba_types
where  type_name='AQ$_JMS_USERPROPARRAY';

TYPE_NAME                      TYPECODE                       ATTRIBUTES    METHODS
------------------------------ ------------------------------ ---------- ----------
AQ$_JMS_USERPROPARRAY          COLLECTION                              0          0

select type_name,coll_type,elem_type_owner,elem_type_name 
from 	 dba_coll_types 
where  type_name='AQ$_JMS_USERPROPARRAY';

TYPE_NAME                      COLL_TYPE                      ELEM_TYPE_OWNER                ELEM_TYPE_NAME
------------------------------ ------------------------------ ------------------------------ --------------------
AQ$_JMS_USERPROPARRAY          VARYING ARRAY                  SYS                            AQ$_JMS_USERPROPERTY

 

One more level to check: 

select type_name,typecode,attributes,methods
from   dba_types
where  type_name='AQ$_JMS_USERPROPERTY';

TYPE_NAME                      TYPECODE                       ATTRIBUTES    METHODS
------------------------------ ------------------------------ ---------- ----------
AQ$_JMS_USERPROPERTY           OBJECT                                  5          0

select type_name,attr_name,attr_type_owner,attr_type_name
from   dba_type_attrs 
where  type_name='AQ$_JMS_USERPROPERTY';

TYPE_NAME                      ATTR_NAME                      ATTR_TYPE_OWNER                ATTR_TYPE_NAME
------------------------------ ------------------------------ ------------------------------ -------
AQ$_JMS_USERPROPERTY           JAVA_TYPE                                                     INTEGER
AQ$_JMS_USERPROPERTY           NAME                                                          VARCHAR2
AQ$_JMS_USERPROPERTY           NUM_VALUE                                                     NUMBER
AQ$_JMS_USERPROPERTY           STR_VALUE                                                     VARCHAR2
AQ$_JMS_USERPROPERTY           TYPE                                                          INTEGER

 

So let’s see what properties we have. I just query the whole contents of properties for the first row in wf_notification_out (This particular system doesn’t have a WF-mailer running. If it is running, chances are that you won’t have any records in WF_NOTIFICATION_OUT).

select p.*
from   (select * 
        from wf_notification_out 
        where rownum=1) n
,      table(n.user_data.header.properties) p;

NAME	              	TYPE  STR_VALUE	                                   NUM_VALUE JAVA_TYPE                             
----------------------- ----- -------------------------------------------- --------- ---------
BES_EVENT_NAME	      	  100 oracle.apps.wf.notification.send                              27
BES_EVENT_KEY	      	  100                                               39388680        27
BES_PRIORITY	      	  200                                                     50        23
BES_SEND_DATE	       	  100 2009/03/06 01:12:23		                            27
BES_RECEIVE_DATE	  100 2009/03/06 01:12:34		                            27
BES_FROM_AGENT	      	  100 WF_NOTIFICATION_OUT@TESTDB.STIJF.COM                          27
BES_ERROR_SUBSCRIPTION    100 C10E7C2EF71253C1E0340800208D03E1		           	    27
NOTIFICATION_ID	      	  100 39388680		                                            27
ROLE		       	  100 FND_RESP535:21704		                                    27 
GROUP_ID		  100 39388680		                                            27
Q_CORRELATION_ID	  100 XDPWFSTD		                                            27

There you go. Among others, the ‘NOTIFICATION_ID’ is there. It maps to the notification_id on ‘WF_NOTIFICATIONS’. And by now, we have seen enough to select it directly in our queries:

select n.msgid,(select str_value 
                from   table(n.user_data.header.properties) 
		where  name='NOTIFICATION_ID') notification_id
from   wf_notification_out n
where  rownum=1;

MSGID			    	 NOTIFICATION_ID
-------------------------------- ---------------
64672D4985E57075E04400144F687CA0	39388680

That concludes this article. Based on the above, you will be able to select the data you want.