Monday, September 7, 2009

Using a pipelined table function as the UI API object

In my previous post I introduced you to an example WoD application page, and showed how the render flow could be built in a Helsinki manner using a ref-cursor. The UI technology stack would call a stored procedure, which in turn would return a ref-cursor. Inside the UI code stack all that needs to be done is fetch the rows from the ref-cursor (until %NOTFOUND), close the cursor, and display the retrieved rows in the UI. This approach uses a PL/SQL-based (or procedural) interface with the DBMS: the UI-code layer calls a stored procedure, which acts as the UI API object for that page.


Sometimes the technology used for the UI-layer might be more comfortable with a SQL-based contract with the DBMS. So instead of calling a stored procedure, we would like to submit a SQL statement; more specifically, in the example at hand, a query.

Using SQL based contract

In this post I'll demonstrate how the same page can be serviced by the DBMS via a pipelined table function. Pipelined table functions are a great enabler for pushing business logic into the DBMS tier, while at the same time giving the UI-layer a SQL-based contract with the DBMS.

Here's how we are going to set up the contract.


We'll use the application context feature inside the DBMS to first convey to the data-tier some context of the page. This will be exactly the same context as with the procedural contract: the username that's logged in, the EAN and the book (search) string. Then we'll develop a view that employs a pipelined table function that uses this context to produce the correct set of rows that is to be returned to the UI-layer.

So our contract with the UI-developer will be:
  1. call a stored procedure supplying the three context variables, and then
  2. query the rows by doing a select * from a view (without any where-clause).


Let's develop the context procedure first. This requires:
  • creating a (generic) package to populate name-value pairs inside a context
  • creating a context specifically for this page and attaching it to the package
  • creating a UI-interface object specifically for this page, using the package to enter the three name-value pairs into the context.
Here's the code for that.

The package which has a "setter" to enter a name-value pair into a context (a.k.a. namespace):
create or replace package x_ctx as
procedure p_set(p_namespace in varchar2
,p_attribute in varchar2
,p_value in varchar2);
end;
/
create or replace package body x_ctx as
procedure p_set(p_namespace in varchar2
,p_attribute in varchar2
,p_value in varchar2) as
begin
--
dbms_session.set_context(p_namespace,p_attribute,p_value);
--
end;
end;
/
The context that requires the package to populate it:
create context bookorder_ctx using x_ctx
/
And finally the UI-interface procedure:
create or replace procedure p_bookorder_ctx
(p_username in varchar2
,p_ean in varchar2
,p_book_string in varchar2) as
begin
--
-- Put info in v$context.
--
x_ctx.p_set('BOOKORDER_CTX','USERNAME',p_username);
x_ctx.p_set('BOOKORDER_CTX','EAN',p_ean);
x_ctx.p_set('BOOKORDER_CTX','BOOK_STRING',p_book_string);
--
end;
/
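A quick way to verify this piece of the contract is to set the context and read it back via sys_context (the values used here are just examples):
begin
p_bookorder_ctx('JSMITH',null,null);
end;
/
select sys_context('BOOKORDER_CTX','USERNAME') as username from dual;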
Now all that's left is to develop a view that uses a pipelined table function to retrieve the rows. A pipelined table function requires the creation of two types: a type that describes the structure of the rows that are being returned by that function, and on top of that, a table type. Here they are:
create type f_bookorder_t is object
(username varchar2(10)
,ean varchar2(9)
,title varchar2(95)
,amount number
,new_amount number);
/

create type f_bookorder_table_t as table of f_bookorder_t;
/
These types are then used in the code of the pipelined table function:
create or replace function tf_bookorder
(p_username in varchar2
,p_ean in varchar2
,p_book_string in varchar2) return f_bookorder_table_t pipelined as
--
type refcursor is ref cursor;
c1 refcursor;
r1 f_bookorder_t;
pl_where varchar2(2000);
begin
--
if p_username is null
then
return;
elsif
p_ean is not null
then
pl_where := 'and o.username = '''||p_username||''''||'
and b.ean = '''||p_ean||'''';
elsif
p_book_string is not null
then
pl_where := 'and o.username = '''||p_username||''''||'
and upper(o.ean||'' ''||b.author||'' ''||b.title||'' / ''||b.publisher) like ''%''||'''||upper(p_book_string)||'''||''%''';
else
pl_where := 'and o.username = '''||p_username||'''';
end if;
--
open c1 for
'select f_bookorder_t
('''||p_username||'''
,b.ean
,o.ean||'' ''||b.author||'' ''||b.title||'' / ''||b.publisher
,o.amount
,to_number(null)) as bookorder
from book b
,bookorder o
where b.ean = o.ean '
||pl_where;
--
fetch c1 into r1;
--
while c1%FOUND
loop
--
pipe row(r1);
--
fetch c1 into r1;
--
end loop;
--
close c1;
--
return;
--
exception when no_data_needed
then close c1;
return;
end;
/
Note that I'm again hardcoding binds into the SQL text of the ref-cursor, which is of course a bad idea; the purpose here is to show the pipelined table function concept. Tim Hall has a great article here that demonstrates how the values available inside the context can be readily reused as binds, and (this is what makes it great as far as I'm concerned) at the same time avoids having to code different 'open ref-cursor for ... using ...' statements per case.
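A minimal sketch of that idea, applied to tf_bookorder: reference the context from within the SQL text itself, so that the statement text stays constant per case and the context values behave like binds (no using-clause needed):
pl_where := 'and o.username = sys_context(''BOOKORDER_CTX'',''USERNAME'')';
if p_ean is not null
then
pl_where := pl_where||' and b.ean = sys_context(''BOOKORDER_CTX'',''EAN'')';
elsif p_book_string is not null
then
pl_where := pl_where||' and upper(o.ean||'' ''||b.author||'' ''||b.title||'' / ''||b.publisher)
like ''%''||upper(sys_context(''BOOKORDER_CTX'',''BOOK_STRING''))||''%''';
end if;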

And finally here's the view on top of this table function:
create or replace force view v_bookorder as
select o.username
,o.ean
,o.title
,o.amount
,o.new_amount
from table(tf_bookorder(sys_context('BOOKORDER_CTX','USERNAME')
,sys_context('BOOKORDER_CTX','EAN')
,sys_context('BOOKORDER_CTX','BOOK_STRING'))) o
/
We invoke the table function in the FROM clause while at the same time supplying it with the three values that have been set into the context.
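With all pieces in place, the UI-developer's side of the contract boils down to this (the values supplied are just examples):
begin
p_bookorder_ctx(p_username => 'JSMITH'
,p_ean => null
,p_book_string => 'DATABASE');
end;
/
select * from v_bookorder;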

Using XML in the contract

As announced in the previous post, we'll also quickly show a contract based on XML.

So let's suppose the technology used for coding the UI-layer likes it when it gets an XML document that holds the rows that are to be displayed. Here's the procedure for a contract that returns the XML as a Clob:
create or replace procedure get_xml
(p_username in varchar2
,p_ean in varchar2
,p_book_string in varchar2
,p_xml out CLOB) as
--
type order_t is record
(username varchar2(10)
,ean varchar2(9)
,title varchar2(95)
,amount number
,new_amount number);
r1 order_t;
type refcursor is ref cursor;
c1 refcursor;
pl_where varchar2(2000);
begin
--
if p_username is null
then
return;
elsif
p_ean is not null
then
pl_where := 'and o.username = '''||p_username||''''||'
and o.ean = '''||p_ean||'''';
elsif
p_book_string is not null
then
pl_where := 'and o.username = '''||p_username||''''||'
and upper(o.ean||'' ''||b.author||'' ''||b.title||'' / ''||b.publisher) like ''%''||'''||upper(p_book_string)||'''||''%''';
else
pl_where := 'and o.username = '''||p_username||'''';
end if;
--
open c1 for
'select o.username
,o.ean
,substr(o.ean||'' ''||b.author||'' ''||b.title||'' / ''||b.publisher,1,95) as title
,o.amount
,to_number(null) as new_amount
from book b
,bookorder o
where b.ean = o.ean '
||pl_where;
--
fetch c1 into r1;
--
p_xml := '<bookorders>'||chr(10);
--
while c1%FOUND
loop
--
dbms_lob.append(p_xml,'<bookorder>'||chr(10));
dbms_lob.append(p_xml,' <username>'||r1.username||'</username>'||chr(10));
dbms_lob.append(p_xml,' <ean>'||r1.ean||'</ean>'||chr(10));
dbms_lob.append(p_xml,' <title>'||r1.title||'</title>'||chr(10));
dbms_lob.append(p_xml,' <amount>'||r1.amount||'</amount>'||chr(10));
dbms_lob.append(p_xml,' <new_amount>'||r1.new_amount||'</new_amount>'||chr(10));
dbms_lob.append(p_xml,'</bookorder>'||chr(10));
--
fetch c1 into r1;
--
end loop;
--
close c1;
--
dbms_lob.append(p_xml,'</bookorders>');
--
end;
/
We could of course also return a true XMLType value, by simply casting the Clob to XMLType prior to returning it. Alternatively, we could have built an XMLType value straight from the start by using all the XML function support available to us inside the DBMS.
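For instance, here's a sketch of the latter approach using the SQL/XML functions (the element names are illustrative, and xmlforest skips null values such as new_amount):
select xmlelement("bookorders"
,xmlagg(xmlelement("bookorder"
,xmlforest(o.username
,b.ean
,o.ean||' '||b.author||' '||b.title||' / '||b.publisher as "title"
,o.amount)))) as xml
from book b
,bookorder o
where b.ean = o.ean
and o.username = sys_context('BOOKORDER_CTX','USERNAME');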

Tuesday, June 16, 2009

Continuing with part 2 of the Helsinki presentation

The second part of my two-hour Helsinki presentation brings the message that in order to avoid PL/SQL spaghetti when taking the "fat database" approach, one must employ a layered PL/SQL code architecture inside the DBMS. This means that UI-code, BL-code and DL-code should not be merged together in single PL/SQL code blocks, but instead remain completely separated. Now, for UI-code that's often easy, since this code is most of the time developed outside the DBMS (unless you are using APEX). But even then, no BL- or DL-code should be embedded inside the UI-technology du-jour that you use. Instead you need to introduce database UI-API's that act as services to the UI-layer: these services implement the data retrieval and data manipulation that the UI requires, and act as the "contract" that the UI has with the DBMS.

This "design by contract" is a very important aspect of the Helsinki way of building a WoD application. I'll give examples of these contracts in this and future posts.


Let's assume we have a simple database design of two tables: a BOOK table (which holds book titles) and an ORDERLINE table (which holds orders against titles for a user).


The EAN column is the primary key of the BOOK table, and the combination of columns USERNAME and EAN is the primary key of the ORDERLINE table. EAN is also a foreign key back to the BOOK table. All fairly simple so far.
There are also two "other" constraints involved in this two-table database design.
  • A user cannot have more than a (sum) total of fifty books in ORDERLINE.
  • A title cannot be ordered more than a (sum) total of eighty in ORDERLINE.
I use SQL assertions quite often nowadays to document constraints more formally. The two introduced for this small example can be specified as follows.
create assertion no_more_than_50_per_user as
check(not exists(select 'a user with more than 50 books'
from (select username,sum(amount) as total
from orderline
group by username)
where total > 50)
);

create assertion no_more_than_80_per_title as
check(not exists(select 'a title with more than 80 ordered'
from (select ean,sum(amount) as total
from orderline
group by ean)
where total > 80)
);

(By the way: had Oracle provided us with support for SQL assertions, the DL-code part would be done now.)
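Purely to illustrate what such DL-code would look like, here is a deliberately naive statement-level trigger sketch for the first constraint; it rechecks all users on every relevant DML statement and ignores the concurrency and efficiency issues that a serious implementation must address (more on those below):
create or replace trigger orl_max50_per_user_aius
after insert or update of username, amount on orderline
declare
l_user orderline.username%type;
begin
-- Naive: recheck the whole table, not just the users affected by this statement.
select max(username)
into l_user
from (select username,sum(amount) as total
from orderline
group by username)
where total > 50;
if l_user is not null
then
raise_application_error(-20000,'User '||l_user||' would exceed fifty books');
end if;
end;
/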

So that's our database design. And here is the UI-page that we need to build.


This page shows all current orderlines for a given user (shown at the top of the page). The user is given the possibility to change the amount of each orderline and press the "Save" button to store such changes. Orderlines can optionally be filtered: the user has an "EAN" item, a "Title string" item, and a "Search Orders" button to do so.

The full page flow is shown here.


In the Helsinki approach it is important to fully understand the "flows" that can happen for a page: it will help identify the UI-API's that are required (per flow). In this case,
  1. The page is entered via some other page / menu of the application.
  2. The page is then rendered with no additional search criteria: all orderlines of the user that's currently logged on are retrieved and displayed. We need a UI-API for this obviously.
  3. The user can supply additional search criteria and re-render the page by pressing the "Search Orders" button. For this flow we need to design some UI-API (which can preferably be combined with the one mentioned above).
  4. The user can also enter new values for the currently ordered amount per orderline and press the "Save" button. The changes are then processed and the page is re-rendered. For this flow we also need to design some UI-API.
  5. Finally the user can exit the page by pressing the "Exit" button.
Let's first take a short look at traditional approaches of UI-code communicating with the DBMS to retrieve and transact the data.

One way to implement flows 2 and 3 (i.e. the data retrieval) would be to code the SQL join that retrieves the necessary data. This SQL statement would be "parameterized" by three values: the user that's currently logged in, the value of the EAN item on the page, and the value of the Title string value on the page.


Not a very elegant query. It's basically a single query covering four cases.
  1. Both search items are empty.
  2. Only a value for EAN is supplied by the user.
  3. Only a value for Title string is supplied by the user.
  4. Values for both items are supplied by the user.
For the DBMS (read: the optimizer) it would be much better to have a dedicated query per case above.

Another (worse) approach typically found in mid-tier centric architected applications is this one.


This again illustrates that mid-tier centric applications, due to their single-table-only accesses, are very chatty: they communicate back and forth with the DBMS way more than is necessary.

A traditional approach to implement flow 4 (the Save button) goes as follows. The UI-code detects for which rows the user has entered a new amount. Per such row the UI-code then generates an update statement that writes the new amount in the row, using the PK-value to locate the row to be updated.


And, since we are changing data in this flow, the constraints will need to be validated: in this case two queries will be executed to verify our two constraints. By the way, I'm totally ignoring concurrency issues here now (will return to that in a future post).

Note: all SQL statements mentioned in the traditional approaches above will be buried away inside the UI-code layer. They are embedded in there, somewhere. They sit outside the DBMS, which, in Helsinki, is not good.

The Helsinki approach

The Helsinki approach to this example would be to:
  1. first implement all involved DL-code separately using database table triggers (or declarative constraints where possible), then
  2. design/build the UI-API's required for the various flows, and finally
  3. do "the rest": also known as, BL-code (of which there is not much in this particular example).
So let's start with number one: DL-code. Currently I see three options to do this. The first is that you Build the triggers on Your Own (BYO): I've demonstrated how this could be done, taking into account concurrency issues and efficiency, at Tom Kyte's blog a while ago (see this post for the specific url to Tom's blog). The other two options are to use a framework that generates the trigger code for you.

<Warning>
Shameless commercial advertisement about to happen...
</Warning>

As far as I know there are two database frameworks that can do this for you: CDM*RuleFrame (part of the Headstart offering of Oracle), or RuleGen (available from www.RuleGen.com, the company I work for). I just started running another blog where I intend to demonstrate RuleGen more often in the future. For now I've posted a high-level description of how the two constraints of our example here would be fed into RuleGen.



Once the DL-code is taken care of, you continue with step 2, being: design the database API for the page.


In this case the page has three items that determine what data (orderline rows) is to be shown:
  • the user that's logged in (on the top of the page, slightly to the left)
  • the value of the EAN enterable item
  • the value of the "Title String" enterable item


For flows 2 and 3 (first entry of this page, and user presses "Search Orders" button), we can design the UI-API in a couple of ways. We'll describe three alternatives: one using a ref-cursor, one using a pipelined-function view, and one using XML.

Here's the first one.


The UI-code layer calls a stored procedure and supplies the three (IN) values, user, ean, search-string, which the stored procedure then uses to build the query-text that is to be used for that case. This query-text is then returned as a ref-cursor in an OUT-parameter of the stored procedure. Note, we should now return a query-text *per case*, and not have one generic query-text (using nvl's). Here's a stored procedure get_rows_page_1 demonstrating this:
create or replace procedure get_rows_page_1
(p_username in varchar2
,p_ean in varchar2
,p_book_string in varchar2
,p_cursor out sys_refcursor) is
--
begin
--
if p_username is null
then
--
open p_cursor for
select bo.username
,bo.ean
,substr(bo.ean||' '||b.author||' '||b.title||' / '||b.publisher,1,95) as title
,bo.amount
,to_number(null) as new_amount
from book b
,bookorder bo
where b.ean = bo.ean
and 1 = 0; -- ***** NO USER INFO ************************
--
else
--
if p_ean is null and p_book_string is null
then
--
open p_cursor for
select bo.username
,b.ean
,substr(bo.ean||' '||b.author||' '||b.title||' / '||b.publisher,1,95) as title
,bo.amount
,to_number(null) as new_amount -- **** NOTE: ALSO RETURNING THE UPDATE-ITEM *********
from book b
,bookorder bo
where b.ean = bo.ean
and bo.username = p_username; -- ***** NO ADDITIONAL FILTERS ************************
--
elsif p_ean is not null and p_book_string is null
then
--
open p_cursor for
select bo.username
,b.ean
,substr(bo.ean||' '||b.author||' '||b.title||' / '||b.publisher,1,95) as title
,bo.amount
,to_number(null) as new_amount
from book b
,bookorder bo
where b.ean = bo.ean
and bo.username = p_username
and bo.ean = p_ean; -- **** ONLY EAN FILTER ************************
--
elsif p_ean is null and p_book_string is not null
then
--
open p_cursor for
select bo.username
,b.ean
,substr(bo.ean||' '||b.author||' '||b.title||' / '||b.publisher,1,95) as title
,bo.amount
,to_number(null) as new_amount
from book b
,bookorder bo
where b.ean = bo.ean
and bo.username = p_username
and(b.title like '%'||p_book_string||'%' -- **** ONLY TITLE STRING FILTER ************************
or b.author like '%'||p_book_string||'%'
or b.publisher like '%'||p_book_string||'%');
--
else /* both filters not null */
--
open p_cursor for
select bo.username
,b.ean
,substr(bo.ean||' '||b.author||' '||b.title||' / '||b.publisher,1,95) as title
,bo.amount
,to_number(null) as new_amount
from book b
,bookorder bo
where b.ean = bo.ean
and bo.username = p_username
and bo.ean = p_ean
and(b.title like '%'||p_book_string||'%' -- **** BOTH FILTERS ************************
or b.author like '%'||p_book_string||'%'
or b.publisher like '%'||p_book_string||'%');
--
end if;
--
end if;
--
end;
/


We've now successfully pushed all business logic into the database layer. All the UI-code now needs to do is: execute this procedure, fetch rows from the ref-cursor returned until %NOTFOUND, and then render the page.
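A quick smoke test of this contract from SQL*Plus could look like this (the username is just an example):
variable rc refcursor
exec get_rows_page_1('JSMITH',null,null,:rc)
print rc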

I'll show the other two alternatives (a pipelined-function view and using XML) in the next blogpost.

Thursday, June 11, 2009

The Helsinki Platform

Nice picture from "OraDude" showing the Fat Database, or Helsinki's first observation.

Saturday, June 6, 2009

Two weeks to go for ODTUG

ODTUG Kaleidoscope 2009 is approaching fast. I'll be giving two presentations.
  1. Fat Databases: a Layered Approach
    This will basically be the Helsinki Declaration talk, only crammed into just one hour. I'll probably skip the four observations and go straight to the WoD application and its code classification (DL, BL and UI-code). And close with a short demo by building a page (with Apex of course) the Helsinki way.
    The talk will be on Wednesday morning at 8 AM! So you'll need to get up early if you want to join. It's in the Regency Ballroom I (sounds like a big room...).
  2. Semantic Query Optimization
    I've given this talk a few times now. The first time was at Hotsos 2008. Semantic query optimization is about using integrity constraints in the optimization phase that generates the execution plan. A few years ago I investigated the capabilities of Oracle's optimizer (10G and 11G) in this area. The talk summarizes this investigation and draws a few conclusions.
    This talk will be right after my other talk, at 9:15 AM in the same location: Regency Ballroom I. So you can just remain seated and be entertained by me for two consecutive slots.
Hope to see you there!

Thursday, June 4, 2009

Rebound for database design?

My fellow OakTable member Robyn Sands posted something very much in line with the Helsinki declaration. Good comments too.

Thursday, May 28, 2009

Resuming transmission...

I've been busy lately: preparing papers and presentations for the upcoming ODTUG, presenting at Hotsos-revisited, and presenting at the Dutch DBA-symposium. All spare time went into these activities, and the Helsinki blog just had to wait in line. But now I intend to resume transmission over here again.

A couple of weeks ago I received following comment on this blog:
toon,

Suppose we have a registration form on our web site, and the web guys want to check if the username supplied in the form is already in use.

This would be a stored procedure in my UI-api, right ? (one in parameter and a return value to represent true or false)

The quick answer was, of course, yes. But then I started thinking about how I would implement this. We would like to have a robust solution, and for this there are a few things you need to be aware of.
  1. There is some time in between the ‘in-use check’ and the final ‘user registration is executed’ moment. Other users could be checking for the same username during this time.
  2. We are dealing with (stateless) browser users. They can just stop their “user registration” process by closing the browser or moving on to another url.
  3. You need to cater for concurrency.
After some careful thinking here is the outline of a “start” solution as I would have designed it.

Obviously there would be some users table that holds already registered (i.e. in-use) usernames.
create table web_user
(userid number not null
,username varchar2(30) not null check(username = upper(username))
,...more columns...
,primary key (userid)
,unique(username)
);
In addition to that I would introduce a table to hold “in-use check” calls.
create table web_user_inuse_check
(sessionid varchar2(...) not null
,username varchar2(30) not null check(username = upper(username))
,checktime date not null
,primary key (sessionid)
,unique (username)
);
The meaning of this table would be: "username [username] was found to be not in use when checked by session [sessionid] at [checktime]." The sessionid would likely map to the value of the http session cookie that is maintained on the middle tier for every connected browser.

This design would have an additional constraint as follows (specified as an SQL assertion):
create assertion no_overlap as
check(not exists (select username from web_user
intersect
select username from web_user_inuse_check));
Which would be implemented separately using triggers on both tables (not shown in this post).

Next, two procedures would be involved: one for the “inuse” check, and one for the actual creation of a “not-yet-used” username. (in pseudo code)
procedure p_check_already_used
(p_username in varchar2
,p_sessionid in varchar2
,p_inuse out varchar2) as
begin
--
-- Serialize calls to this procedure.
--
[appropriate dbms_lock call to serialize *per username*;
raises an exception if the lock could not be acquired]
--
-- Possible scenarios:
-- 1) username already in use.
-- 2) username not in use and also no inuse_check available.
-- 3) username not in use and inuse_check available by other session.
-- 3a) inuse_check is more than 15 minutes ago.
-- 3b) inuse_check is 15 minutes or less ago.
-- 4) username not in use and inuse_check available by same session.
--
if [scenario 1]
then p_inuse := 'YES';
elsif [scenario 2]
then create (using merge on sessionid) inuse_check row for this session;
p_inuse := 'NO';
elsif [scenario 3a]
then delete inuse_check row of other session;
create (merge) inuse_check row for this session;
p_inuse := 'NO';
elsif [scenario 3b]
then p_inuse := 'YES';
elsif [scenario 4]
then p_inuse := 'NO';
end if;
--
-- Commit transaction (also releases the dbms_lock lock).
--
commit;
--
exception when [lock could not be acquired]
then
p_inuse := 'YES';
--
end;
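For completeness: the "appropriate dbms_lock call" above could, for example, be implemented inside p_check_already_used with a nested block like this (the lock-name prefix and the three-second timeout are arbitrary choices):
declare
l_lockhandle varchar2(128);
l_result integer;
begin
-- One named lock per username: checks for different usernames don't block each other.
-- Note: allocate_unique performs a commit.
dbms_lock.allocate_unique('WEB_USER_CHECK#'||upper(p_username),l_lockhandle);
l_result := dbms_lock.request(lockhandle => l_lockhandle
,lockmode => dbms_lock.x_mode
,timeout => 3
,release_on_commit => true);
if l_result <> 0
then
raise_application_error(-20001,'Could not serialize in-use check for '||p_username);
end if;
end;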
And finally the user-creation procedure.
procedure p_create_user
(p_sessionid in varchar2
,p_username in varchar2
,[other user data]) as
begin
--
if [our (sessionid!) inuse_check row has disappeared]
then -- It took us more than 15 minutes to fill in the form, and
-- some other session has claimed this username now.
--
-- Quickly check if we can claim it again.
--
if [p_check_already_used says 'YES']
then [error]
end if;
end if;
--
-- Now register our user.
--
delete our inuse_check row;
insert web_user row;
commit;
--
end;
I know this may seem an overly complex solution. And I would probably check with the customer first whether dealing with the “concurrency scenarios” is really necessary or not.

Final remark: I would put both procedures inside a package (BL-code), and create two stub procedures on top of the package (UI API’s) which would be granted (execute) to the schema used by the yafet to connect to the DBMS.
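Such a stub could be as thin as this (the package, stub and connect-schema names are all illustrative):
create or replace procedure p_check_username_inuse
(p_username in varchar2
,p_sessionid in varchar2
,p_inuse out varchar2) as
begin
-- Delegate straight to the BL package; the stub merely embodies the UI API contract.
user_registration.p_check_already_used(p_username,p_sessionid,p_inuse);
end;
/
grant execute on p_check_username_inuse to yafet_account;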

Friday, April 17, 2009

Helsinki code layers in the DBMS

Ok, let's continue with the second part of "The Helsinki Declaration". That would be the part where I zoom in on the DBMS and show you how best to do this database centric thing.

We have seen that the DBMS is the most stable component in everybody's technology landscape. We have also concluded that the DBMS has been designed to handle WoD application BL-code and DL-code. And current DBMS's are really good at this, provided the underlying database design is a sound relational design and a good fit for the requirements of the WoD application. The latter is up to the qualities of the person that designs the database. The former is all about education.

That's why I repeat here, that in order for the Helsinki approach to succeed, you'll require:
  • Educated database designers
    I specifically emphasize educated. You need to educate yourself to prevent fud when designing databases. Database design is all about set theory and predicate logic. Again, this is what Ted Codd gave us. There are plenty of books by Chris Date to read and so educate yourself in this area. I can of course also recommend that you study the book Lex and I wrote.
  • Experienced plsql developers
    I specifically emphasize experienced since I see that many plsql developers are still stuck in the mid-nineties. They have not evolved their knowledge of the DBMS, which in the case of Oracle is now at its 11th major release. Compared to Oracle7 (of the mid-nineties) a lot has been added, and a lot has been improved. Many then-myths are no longer myths now.
With these two prerequisites in place, here's how you implement a WoD application the Helsinki way: you'll need to introduce layers of code inside the DBMS.



In this layered approach we materialize the WoD code classification that I have introduced: UI-logic (often outside the DBMS), BL-logic, and DL-logic. But we also introduce a separate layer, called UI API-objects, between the UI and BL code layers, and a separate layer between the BL code and the tables in the database design (database design API objects).


Layer 1: UI API objects

In the Helsinki declaration you need to introduce API objects inside the DBMS whose specific purpose is to be called by UI-code. These, and only these, objects will be the entry points, or API if you like, for UI-code to communicate with the DBMS. Per WoD-page you need to design these API objects. Every object in this layer is specific for some page of the WoD application. This layer is a very important layer. Why? Because these (and only these) objects might require re-engineering in case you switch to a new yafet. This layer will give you agility and enable you to easily take on new yafets every so many years. I'll give examples of how to design this layer in future posts.

Layer 2: Business logic

Nothing new here. This is the materialization of BL-code.

Layer 3: Database design API objects

In this layer you implement "data services" that are shared by multiple BL-code modules. Whenever you are writing BL, and you see yourself copy-pasting or rewriting a piece of rBL or wBL code that you have written before, you need to introduce a database design API object for it. Write it once, and reuse it in different BL-code sections. This layer is somewhat less important, but often found inside your software architecture.

Layer 4: Database design

This layer is sort of pro-forma. It only holds the tables (and indexes) that are part of the database design for the WoD application.

Layer 5: Data logic

This is the materialization of DL-code. Also a very important layer. The reason why data logic is visualized beneath the database design (layer 4), has to do with how Helsinki implements data logic. We'll use database triggers to implement data logic.

And now you say: "Oh no! Triggers? Toon, are you serious? Tom Kyte says all triggers are evil, therefore I cannot and will not use them."

Let's have a brief intermezzo on this topic.

The reason why Oracle rockstar Kyte is against triggers is (from what I understand) because they:
  1. make things happen automagically, and
  2. are rarely implemented correctly.
I understand his point, but would like to introduce some nuance around this ubiquitous opinion on the use of triggers. And I can give you the nuance because of the clear code classification that was introduced for WoD applications.

Listen.

Tom's reason 1 is due to people implementing triggers that hold wBL (write-BL) code. And I fully agree on this point: when you put wBL code in your table triggers, and you then perform dml against the table, things happen automagically. The wBL-code will in turn perform more dml that you, the issuer of the dml against the table, might not have expected. And if there are many tables with triggers of this kind, then a cascading effect will take place, making more magic happen (which is worse). wBL code should never be executed via triggers.

<related observation>
This automagic behaviour is also why I never use cascade-delete on foreign keys, nor the default clause on columns of tables, nor the fixed-length CHAR datatype. They introduce magic when parent rows are deleted, when a (partial) row is inserted, or a less-than-the-fixed-length value is provided.
</related observation>

Tom's reason 2 is due to people implementing DL-code in triggers, and thinking they have done it right. Implementing DL-code is an inherently complex matter. It is complex due to the concurrency issues that need to be solved, and due to the efficiency levels you would want to have (DL-code shouldn't validate a constraint when the transaction at hand doesn't necessitate it). In chapter 11 (of the book), I fully explain the issues that need to be solved when implementing DL-code. Also, a few years ago, I demonstrated an example constraint implementation using triggers on Tom's blog. You can find it here. The point I want to make is that just because implementing DL-code via triggers is complex, doesn't mean triggers are evil and it shouldn't be done. To me it means that you need to educate yourself so that you can use triggers correctly. By the way: implementing DL-code not via triggers, but embedded inside wBL-code sections, is complex too. Maybe slightly less complex, but still complex. Triggers have the big advantage that they offer a once-and-for-all type of implementation. Having to always go through a wBL-code section (TAPI's, VAPI's, or whatever you would like to call them) is just not good enough for me.

<related observation>
I'd like to give you one closing thought here. Suppose, just suppose, Oracle would have supported SQL's CREATE ASSERTION statement, thereby providing you with a means to implement all constraints declaratively. Just like FK's and PK's are declarative means of implementing a constraint. You would not have to write a single line of DL-code: Oracle accepts your assertion and deduces (computes) when and how the constraint is best validated. Would you use assertions? I think yes. You're using PK's and FK's now too, aren't you? And the way declarative constraints are validated "under the hood" is conceptually the same as you building triggers that hold DL-code: at the appropriate moments, per row or per statement, Oracle fires the necessary DL-code that's part of the DBMS to validate the constraint.
</related observation>

End of intermezzo.

I want to emphasize that layer 1 (UI API objects) and layer 5 (DL code) need to be separated from the BL and DB-design API layers. Again this all boils down to BL being "the rest". Implementing DL separately, and introducing the design-by-contract UI API layer, will ensure that "the rest" becomes maintainable (thus addressing the Helsinki adversaries' main argument).



Preferably I would use a separate schema inside the database to hold the UI-API objects: I often refer to this schema as the FE (FrontEnd) schema. The yafet would connect to the database with yet another schema: one that only has the CREATE SESSION system privilege and all necessary object privileges on the UI-API objects inside the dedicated FE-schema.

In the next post I will demonstrate the Helsinki code layers with an example database design, some constraints, and an example Window-on-Data page on top of that design.

Hopefully that will be posted in less than two weeks this time, but I do have a real life and a real job next to this blogging too :-).

Sunday, April 12, 2009

People ask the wrong question

People who know me, know that I am enthusiastic about Apex. But I am certainly not an Apex expert. By far not. The DBMS is where my knowledge is. But because they know of my enthusiasm, I often get the question whether Apex is mature enough for building a critical or large-scale WoD application.

I then (sigh and) reply by saying: "You are asking the wrong question."

Pay attention please.

In the Helsinki approach of building a WoD application the technology choice for UI-code is almost irrelevant. Why? Because you will use the yafet to implement UI only. All BL and DL will go into the DBMS. The development of a WoD application in the Helsinki way, will first and foremost be a DBMS project.


The majority of time spent developing a WoD application in the Helsinki way, will be spent on designing and developing the PLSQL and SQL inside the DBMS to implement all necessary BL and DL code.

So the question you should have asked, is:

"Is the DBMS mature enough to support this critical and large-scale WoD application project?"

To which the answer of course would be: 100% affirmative. This was the point of my first observation. The DBMS is mature.

So you see, the yafet that you use to implement UI-code is a second order question. It is still a relevant question of course, but for different reasons than usually intended by the one who poses the wrong question. Their intention usually is about the yafet's capabilities to do large scale BL and DL development. Which in Helsinki is irrelevant. The yafet should only be well capable of rendering UI. Most Yafets (even the sexy new ones like Apex) can do that, no problem. If you intend to also do BL and DL inside the yafet you're on your own. And you need to go ask a different person (I told you I'm not an Apex expert).

The reason why the question is not totally irrelevant has to do with the capabilities of the people involved in building the WoD application. Since choosing the yafet is of second order importance, you might as well choose one that those people are comfortable with.



Remember observation four, the investment required for developers to become productive? Helsinki helps you here: you would only need to learn how to use the UI features of the yafet. And forget about its BL and DL capabilities.

Adversaries of the Helsinki approach always bring up the issue of how to prevent all DBMS-hosted plsql and sql code (for the BL and DL part of the WoD application) from becoming one big unmaintainable bucket of spaghetti. The answer to that question is: you need to materialize the code classification and create layers of code inside the DBMS. I will start blogging about that in my next post.

By now I have reached the end of part 1 of my 2-hour presentation and break by leaving my audience with the following closing guideline.


A beautiful example of practicing this exact guideline is Oracle's Exadata storage server. Without going into the details of it, what Oracle does here is push bits of a SQL execution plan further down into the technology stack, which speeds up the execution of SQL by an order of magnitude.

I rest my case.

Tuesday, April 7, 2009

The Helsinki approaches to WoD application development

[continuing from my previous post]

In a very similar way as I did here for MVC, the Helsinki UI/BL/DL code classes can be mapped across the client, middle and data tiers too:


What I do differently here compared to the earlier display of MVC mapping across the tiers, is that whenever the M is distributed across two tiers, I divide the M into BL and DL. The guideline of how to split up the M, is now made explicit: we divide it into BL and DL. Btw. this only happens in option 5.
  • 1 TFT: UI, BL and DL all go into the middle tier.
  • 2 FFT: UI goes into the client tier, BL+DL go into the middle tier.
  • 3 FTT: All go into the client tier.
  • 4 FTF: BL+DL go into the data tier. UI into the client tier.
  • 5 TFF: DL goes into the data tier. Rest in the middle tier.
  • 6 TTF: BL+DL go into the data tier, and UI in the middle tier.
  • 7 FFF: BL and DL in every tier.
  • 8 TxF: Lo and behold! A new option. With Apex (Oracle Application Express) we can push UI into the DBMS. So all goes into the DBMS. The lowercase x denotes the absent mid tier in this scenario.
[I will talk about Apex in a separate post at some time, just thought it was worth mentioning here now]

In the Helsinki declaration we only allow the options that have BL and DL in the data tier: option 4 (if you require a fat, responsive client program), option 6 (html-frontend + you really want to do some Java) and option 8 (html-frontend + all you have is PL/SQL knowledge).


All other options have some object-oriented design as the foundation of the WoD application. Which is just plain wrong, and therefore prohibited by the Helsinki declaration. I discussed that here: BL and DL just do not benefit from object orientation: WoD applications work best when they sit on top of a relational database design.

Going back to the picture I've shown you before:


What this means is that we map the DBMS like this:


During the presentation I then mention that inside the Java realm only two classes should be allowed: a Row class and a TableOfRows class. In Helsinki, Java is used only to generate UI: it receives the row or table-of-rows to be displayed from the DBMS. All it needs to do is display the data in a user friendly way. Needless to say that this approach completely and utterly disarms the Java developer(s). Your relationship with them turns into a love/hate relationship:

  • They'll love you because you make their job infinitely more easy.
  • They'll hate you because their billable-hours go down the drain.
Of course, if you use Apex, there is no relationship at all...

Advantages of the Helsinki approach

It should not surprise you that this approach completely turns around the two main disadvantages I mentioned earlier of building WoD applications in a mid-tier centric way. Designing and programming a WoD application in the Helsinki way will greatly reduce the number of roundtrips to the DBMS.


I cannot repeat this enough: if the database design fits your WoD application needs, the Helsinki approach will always outperform any other architecture. Adversaries often bring up the argument that by pushing 'application logic' (which they never clearly define) into the DBMS layer, that layer will immediately become the bottleneck. I have two responses to this argument.

  1. You will actually put more stress on the DBMS in a mid-tier centric approach. I gave you the example of 70,000 calls to display twenty customers on a page. I know it was an extreme case. But decrease that to 700 calls: I'm willing to bet one monthly salary that that still causes more CPU cycles to be spent inside the data tier compared to a Helsinki approach. Even the other example I gave (GrandParent, Parent, Child report), which calls the DBMS in the order of 50 times, causes more activity inside the DBMS than would have been necessary in a Helsinki approach.
  2. By far the majority of the WoD applications built today will run just smoothly on today's powerful SMP servers. The percentage of (Helsinki) WoD application deployments that will cause the DBMS to become the bottleneck is very, very small. And, being an OakTable member, I probably should not say this, but one can always consider clustering the DBMS tier (Oracle RAC) to provide more CPU power in that tier.
The second one was about TCO. Building a WoD application the Helsinki way, will create the opportunity to easily migrate to the next sexy du-jour UI technology.


I have experienced this first-hand. One of my customers had built a WoD application around 2002 in a database-centric Helsinki manner. They used the then-hot JDeveloper release with its accompanying MVC frameworks: UIX, BC4J and Struts. Two years ago they were faced with a long-overdue upgrade of the JEE technology stack: an upgrade to the newest release of the application server, and the newest release of JDeveloper with the current versions of the accompanying MVC frameworks. After having spent six months trying to get this done, it was decided to just rebuild all pages in Application Express (the now sexy du-jour UI-technology). Since no DL and BL logic sat outside the data tier, this was achieved in just over two months by four people, without any change to the BL and DL code that sat inside the DBMS.

I think that could be called agile, not?

A colleague of mine back then coined the phrase: UI-code is throwaway code. I could not agree more. It is the top of the technology stack where all the volatility sits and continuous change happens. Use it thinly and you get agility.

Thursday, April 2, 2009

Issues with current trend

In my last post I have introduced the code classification used by the Helsinki declaration (as opposed to MVC used by JEE):
  • User Interface (UI) code: all code that creates UI and responds to events in the UI, the same as JEE's View and Control
  • Data Logic (DL) code: all code that maintains data integrity constraints, a well defined subset of JEE's Model
  • Business Logic (BL) code: all other code, the remaining part of JEE's Model, query and/or transaction composing and executing code
I would like to point out before continuing with this post that these three code classes may appear to you as somewhat 'conceptual'. But every WoD application somehow performs these functions. Together they are what a WoD application is all about. So you may have a WoD application that does not reflect these code classes at all, but the point that I make here is that somehow your code in your WoD application is doing these same three things: execute UI, BL and DL code. Show me a line of your code, and I can tell you in what Helsinki code class it falls.

I concluded my previous post by showing you how the three Helsinki code classes interact with each other.


Using this picture I can visualize the current trend of building WoD applications as follows:


We do not use the DBMS anymore. All code, in particular all BL and DL code, is implemented within a du-jour technology (XYZ above) that lives outside of the DBMS.

In this post I'd like to explain how this current trend of not using the DBMS (our first observation), combined with the Yafet technology explosion (our third observation), leads to what I consider two rather serious issues in today's WoD applications.

Let's start with the first issue.

Scalability and performance issues

Implementing BL and DL code outside the DBMS leads to chatty applications: these applications call the DBMS many times. Let's demonstrate this using that same picture again. I have drawn a vertical yellow line in it visualizing the current trend: everything to the left of this line is implemented outside the DBMS.


Say the end user has an order-entry page. He/she enters a few orderlines, and then presses the Save button. This generates one context switch (one call) from UI-code to BL-code. The BL-code then starts processing these orderlines, which causes an order of magnitude more than one (so, roughly ten) context switches from BL-code to the DBMS. Per orderline, however, a couple of constraints are likely to be involved. The majority of these constraints require DL-code that queries the current state of the DBMS. So we have another order of magnitude more calls to the DBMS.

So one event in the top of the WoD code-stack generates two orders of magnitude more events lower down in the WoD code-stack. Mind you, this is not WoD application software specific, but a general phenomenon in all software. The same is also true for instance in an operating system, which has many layers of code classes too.

In today's WoD applications it is often worse than just the two orders of magnitude described above:
  • If you put BL in the middle tier, you'll call the DBMS an order of magnitude more
  • If you also put DL in the middle tier, you'll call the DBMS two orders of magnitude more
  • But if you then also don't use SQL effectively, you'll call the DBMS three orders of magnitude more
I have given you an rBL (read-Business Logic) example of not using SQL effectively at the end of this post. Let me share another experience in this area. A few years ago I was asked to help analyze a performance issue in a very simple user-enters-search-criteria, user-hits-search-button, WoD-app-displays-rows scenario. This involved a 'search customers' page where the user would enter a leading string of the customer name. It took almost a full minute for the application to come up with the first set of twenty rows to be displayed. This WoD app obviously was built in the current trendy manner. So what do you do? You try to find out where time is spent. I performed a sqltrace of this action. And guess what: the trace revealed that the application was sending in the order of 70,000 SQL statements to the DBMS.

70,000...

Yes folks. 70,000 SQL select statements to come up with the first twenty customers. I am not kidding you. Unbelievable. Obviously the DBMS is not the problem here: it is servicing 70,000 queries in just less than 60 seconds! The problem is the architecture of the WoD application.

This particular scenario could be done with just three calls to the DBMS which would take a fraction of a second to execute on top of a good relational database design.
  1. Open cursor
  2. Perform array fetch of first twenty rows
  3. Close cursor
(actually with the current SQLNet protocol this might even be less than three roundtrips)
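In PL/SQL terms, a sketch of that three-call pattern (the table and column names are made up):
declare
type name_tab_t is table of varchar2(100);
l_names name_tab_t;
l_search varchar2(100) := 'KOP';
cursor c_cust is
select name
from customer
where name like l_search||'%'
order by name;
begin
open c_cust; -- call 1: open cursor
fetch c_cust bulk collect into l_names limit 20; -- call 2: array fetch of first twenty rows
close c_cust; -- call 3: close cursor
end;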

But using a 'black box' that is designed to instantaneously provide the data that is to be displayed would be so uncool. No, ... executing 70,000 queries and doing lots of application logic yourself is way more cool.

This chatty application behaviour leads to:
  • more latency hits: every time you go back and forth to the DBMS you will be hit with some latency. If this happens a lot, it starts impacting the performance.
  • more 'system' cpu: context switches are accompanied by 'fixed costs' incurred by the network, OS and DBMS software layers that are constantly creating and destroying contexts. System cpu is pure overhead. It's not adding any value to the business using the WoD application.
  • more data transfer out: since complex SQL processing (that is any SQL involving more than one table) is effectively performed outside the DBMS, you are sending more data out of the DBMS to that place where the SQL execution plan is now effectively implemented.
This all impacts the performance and scalability of the WoD application.

What this chatty application behaviour also implies, is that the required iron power to run these applications is absurd. I have seen deployments of not too big WoD applications that require farms of application servers, and n-way RAC configurations on top of expensive mid-end hardware configurations. And when I investigate the load coming from the end-user population I start thinking: had this application been architected differently, my bet is it would run just smoothly on a single, let's say, 2 quad-core Intel server that can be bought for about 10K Euro.

Let's move on to the second issue caused by the current trend of not using the DBMS and the ongoing Yafet technology explosion.

WoD application TCO (Total Cost of Ownership)

The second issue is about how the current trend of building WoD applications is impacting the TCO of such an application. Given the ongoing Yafet technology explosion, if you implement all BL and DL code in the volatile XYZ technology, then:
  • your application is legacy (technology-wise) within a year
  • it will be hard to find XYZ knowledgeable people
  • it will be very, very hard to migrate to the next sexy Yafet since it involves migrating BL and DL code (often the only option is to throw away, and code again)
I know of WoD applications built in the mid-nineties in a database-centric way, that today, almost 15 years later, still run the majority of the BL and DL code unchanged inside the (current version of the) PLSQL virtual machine. I have yet to see the first Java-centric WoD application that runs Java code unchanged in the current versions of the frameworks used to implement the M, the V and the C parts of that application. Often the MVC frameworks used 10 years ago are just not there anymore. So the WoD applications built 10 years ago often still run on those by now long-desupported versions of the frameworks. There is no easy migration possible. It always involves a major effort, impacting the TCO of that application.

So in conclusion:
  • Today's WoD applications suffer from performance issues. This is caused by not pushing down work that could have been taken care of lower in the technology stack, which in turn causes orders of magnitude more events (context switches) lower down in the technology stack.
  • Today's WoD applications have a high cost of ownership. This is due to the high technology change rate at the upper end of the technology stack, where all application logic is implemented (logic which itself changes a whole lot less fast). Businesses are faced with either having to rebuild major parts of the application using the newest Yafet, or having to pay high rates for scarce programmers still willing to work with the old-fashioned Yafet.
The opposite of DBMS independence is what we need!

The ugly state of affairs with regards to todays WoD applications is not in the least also caused by that other popular belief. The one that dictates applications should be built in a database independent manner. In the Helsinki declaration I cannot but conclude that quite the opposite is what our customers need. They do not need DBMS independence, by far not. They need Yafet independence. Yafets change all the time, they come and go. The DBMS is probably the most stable factor within the IT landscape of a business. So we need to architect their WoD applications in such a way that they (the applications) are as immune as possible for this ongoing technology change outside the DBMS.


So what this all leads to is that BL and DL code should be implemented inside the DBMS.

Before I talk more about that, the next post will first map the Helsinki code classes across the client, middle and data tiers (like I did with the MVC code classes here) and revisit the seven thin/fat-thin/fat-thin/fat alternatives again.

Tuesday, March 31, 2009

(slightly off topic) Chris Date event @Dallas

You may want to check out this unique event.

[will return to the declaration shortly]

Saturday, March 28, 2009

Window-on-Data applications

Up till now I have been focussing on technology. We have seen DBMS's evolve, the web and n-tier architectures come into existence, Yafets prosper, and developer productivity go down the drain. I also spent some time discussing the Java/JEE bandwagon. And used MVC to discuss various technical application architectures. Ohhh, if only there were just technology. Life as an application developer would be so wonderful.

But wait. We have end users too. Shoot...

Let's focus on our customers for a bit: the end users. What is it that they want? Well, I claim that they still want the same (as twenty years ago). In essence that is. They want application frontends that enable them to do their job efficiently. The look & feel of the frontend obviously has evolved (observation two) but they still want to be able to:
  • find a customer
  • look at the details of that customer
  • enter new orders for that customer
  • bill that customer
  • etc.
I personally know people working in organisations that today do in a Ajax-enabled web browser application, what they did five years ago in a poor html application, and ten years ago using a GUI fat client application, and fifteen years ago using a character mode client application, and twenty years ago using a blockmode device attached to a mainframe. These end users still want what I call: Window-on-Data (WoD) applications. These applications provide "windows" on top of data that is relevant for them. A window (or page, or screen, or form, or whatever) inside a WoD application enables an end user to do two things. First one being: to query data.


Typically the user would enter some search criteria on the page, then hit a search button. The server processes this request and returns data that's then displayed in the 'window'. Sounds familiar?

And the second thing a WoD application enables an end user to do: to transact data.


In this scenario the user navigates through already displayed data, marks some of it to be deleted, maybe changes some values, or enters new data. Then hits a save button. The server processes this request again, and comes back with a message indicating whether the transaction was successful.

I claim that the majority of the "database web applications" that are being built today (in more complex ways every five years), are still these type of applications: applications that enable users to query and/or transact data.

By the way... It might be interesting to note now that DBMS's were born to do exactly this. The number one design criterion for a DBMS is: to query and transact data. This is why DBMS's came about. Another interesting fact to note now is that data can be "modelled" in many ways. After hierarchical (triangular wheels) and network models (square wheels), E.F. Codd gave us the relational model of data. And guess what? The wheel doesn't get any rounder. The relational model of data is the end station. It is how we should develop database designs, now and in the far future. SQL DBMS's are made to deal with (query and transact) lots and lots of data in a manageable and scalable way. These DBMS's nowadays can do this really well, provided (big if here) the data sits in a sound relational database design. Put another way: a WoD application should have as its foundation a relational database design, for it to reap the benefits a SQL DBMS can provide.

Now don't tell me that we should model the real world in terms of objects... That to me is just going backwards to a network/hierarchical way of designing the data. Not just that, it also merges (hardwires) data and behavior back together again. History has shown that this creates an unproductive and inflexible status quo.

Let's now gradually start developing the Helsinki declaration (or Helsinki in short). The scope of Helsinki is WoD applications. Which doesn't bother me, as they represent by far the majority of the applications that are being built. By the way, technically a WoD application doesn't require end users. More than once I've been challenged with an application that surely couldn't be classified as a WoD application. But the application in those cases did provide a "window on data". Only the window was used by some other piece of software. That to me is still a WoD application. I may go into this in more detail, when I feel like talking about SOA.

Like JEE, Helsinki also has sort of an MVC classification for code. There are some subtle differences though.


I declare that every line of code that is written to implement a WoD application can be classified into one of these three classes:
  • User interface code
  • Business logic code
  • Data logic code
Let's discuss each of these three in some more detail.

Helsinki User Interface (UI) code

All code that you write to:
  • create user interface that displays row(s) of data, and/or
  • respond to events triggered by the end user using the user interface, which then typically changes the user interface, most likely after first calling business logic code,
is UI code.

And you know what? I can actually see how object-oriented programming concepts might help us here. For instance a button object that has a 'when-pushed' method, or a multi-record display grid object that has methods like 'sort on column', 'give next set of records', etc.

Compared to JEE's MVC, Helsinki UI-code equals the V and the C together.

In my presentation I then always skip the Business Logic class, and first tell the audience what Helsinki means by Data Logic (DL) code.

Helsinki DL code

As stated above, a WoD application should be based on a sound relational database design. As a crucial part of that database design we will typically have data integrity constraints. These are assertions that tell us what data is, and is not, allowed inside the table structures that together make up the database design. When discussing integrity constraints with other people, I find there are two tribes:
  • Those who think that all (and no more than) what can be dealt with declaratively falls within the realm of a constraint. In Oracle this would result in check constraints, primary keys, unique keys and foreign keys.
  • Those who know that there is a broader, more generic concept of a constraint, introduced in E.F. Codd's relational model of data. SQL implementations just happen to give us declarative constructs for the above common constraint types that appear in almost every database design.
Helsinki deals with the latter (broader) concept of a constraint. So for instance in an EMP-DEPT database design, statements (sometimes referred to as business rules) like "we do not allow two or more presidents in the EMP-table" or "a department that employs a manager should also employ at least one administrator" are considered to be data integrity constraints. WoD application database designs will have these types of constraints, sometimes many of them.


Actually the SQL standard has the concept of an assertion to implement these other constraints declaratively too. Theoretically what we are dealing with here is predicate logic: boolean expressions over the data held in the table structures that must evaluate to true at all times. By the way, this is a main theme of the book I wrote with Lex de Haan.
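
For illustration, here is how the "at most one president" rule could be stated declaratively in SQL-standard syntax. Bear in mind this is only a sketch: Oracle, like most SQL DBMS's, does not actually implement CREATE ASSERTION.

create assertion at_most_one_president
check ((select count(*)
        from   emp
        where  job = 'PRESIDENT') <= 1)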

In Helsinki, every WoD application code line that you write to validate (i.e. guarantee the continued truth of) your data integrity constraints is classified as a Data Logic code line. In a WoD application a lot of code will sit in this class, and not in the business logic class discussed hereafter.
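
For instance, lacking assertions, the "at most one president" rule could be validated with DL code along the following lines. This is a minimal sketch (a statement-level trigger on EMP); concurrency aspects, such as two sessions each inserting a president before either commits, are deliberately ignored here.

create or replace trigger emp_at_most_one_president
after insert or update of job on emp
declare
  l_count number;
begin
  --
  -- Validate: at most one president allowed in EMP.
  --
  select count(*)
  into   l_count
  from   emp
  where  job = 'PRESIDENT';
  --
  if l_count > 1 then
    raise_application_error(-20999
      ,'At most one president is allowed in EMP');
  end if;
end;
/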

Compared to JEE's MVC, Helsinki DL-code equals a well defined subset of M.

Ohh, and one final remark: I really fail to see how object orientation can help me implement DL code. This is all set theory and predicate logic: OO concepts just do not fit in here.

Helsinki Business Logic (BL) code

I left this one for last on purpose, because I always define business logic code by saying what it is not. In Helsinki, if you write a line of code for a WoD application, and that line of code does not classify as either UI-code or DL-code, then that line of code must be BL-code. In short, BL-code is the rest: if it ain't UI-code and it ain't DL-code, then it must be BL-code.

Now if you prune UI and DL code, you'll see that what is left is either:
  • code that composes and executes queries in a way the business of the end user requires it (I refer to this as read-BL code), or
  • code that composes and executes transactions in a way the business of the end user requires it (I refer to this as write-BL code).
This is pure procedural code, involving if-then-else, loop-endloop, etc. I'll show you this in future posts. Point being made: again, OO-concepts cannot help me here at all.
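
As a small foretaste, read-BL code could look something like the sketch below. It assumes a hypothetical ORDERS table (with STATUS, ORDER_ID, CUSTOMER_NAME and TOTAL_AMOUNT columns, purely for illustration): procedural code that composes the query the business requires and hands the UI-layer a ref-cursor.

create or replace procedure p_order_report
 (p_status in varchar2
 ,p_orders out sys_refcursor) as
begin
  --
  -- If-then-else determines which query serves the business need.
  --
  if p_status is null then
    open p_orders for
      select o.order_id
      ,      o.customer_name
      ,      o.total_amount
      from   orders o;
  else
    open p_orders for
      select o.order_id
      ,      o.customer_name
      ,      o.total_amount
      from   orders o
      where  o.status = p_status;
  end if;
end;
/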

Compared to JEE's MVC, Helsinki BL-code equals a well defined subset (the other, remaining part) of M.

Code classes interaction

I'll conclude today's post with a picture of how the three Helsinki code classes interact with each other.


This figure illustrates the following relations between the code classes (a small code sketch after the list ties a few of them together):
  • UI-code holds calls to rBL-code (queries) or wBL-code (transactions)
  • rBL-code holds embedded SELECT statements that read data from the database
  • wBL-code holds embedded DML statements that write to the database
  • wBL-code often also holds embedded queries reading the database
  • DL-code often requires the execution of SELECT statements
  • DL-code is typically called from wBL-code
  • DL-code can also be called by the DBMS via database triggers that fire as a result of the execution of DML-statements originating from wBL-code
  • Sometimes you'll see that UI-code calls DL-code directly; this is often done to create a responsive, more user friendly, user interface
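
To make a few of these relations concrete, here is a minimal (hypothetical) piece of wBL-code: it holds an embedded query, holds embedded DML, and via that DML causes the DBMS to fire DL-code such as the trigger sketched earlier.

create or replace procedure p_promote_to_president
 (p_empno in number) as
  l_job emp.job%type;
begin
  --
  -- wBL-code holding an embedded query reading the database.
  --
  select e.job
  into   l_job
  from   emp e
  where  e.empno = p_empno;
  --
  -- wBL-code holding embedded DML writing to the database.
  -- The update fires the DL trigger validating the president rule.
  --
  if l_job <> 'PRESIDENT' then
    update emp e
    set    e.job = 'PRESIDENT'
    where  e.empno = p_empno;
  end if;
end;
/
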
This post has introduced you to the main concepts used in the Helsinki declaration. Try to remember them, as I will refer back to these quite a lot.

Wednesday, March 25, 2009

JEE and traditional MVC (Part 2)

In the previous post I gave a high level introduction into the MVC design pattern. This pattern classifies all code that you write to implement a database web application, into three classes:
  • Model code
  • View code
  • Control code
I also showed that within the JEE architecture code can be deployed to many tiers. In this post I will talk about alternative MVC approaches by looking at the amount of application code that is deployed in each tier. A tier is considered fat if it holds lots of code, and thin if it holds little code. In this way we can generate eight alternatives:


Number eight, Thin-Thin-Thin, is irrelevant and therefore not shown. I'll discuss each of the above seven approaches in this post.

Let's start with number 1: Thin-Fat-Thin.


This is the typical JEE way of creating an HTML-based application, and I think it is by far the most popular approach. Control is implemented, for instance, with some control-framework (it used to be Struts, but I think it's Spring nowadays). Model is implemented with EJB or some model framework (in an Oracle environment: ADF-BC). View is implemented with JSP or some view framework (e.g. UIX, JSF, ...). In this approach the browser displays a (poor) HTML GUI (thin), and sends HTTP requests to the controller. The controller then coordinates the execution of EJB code or model framework code, inside which all business logic and data logic processing has been implemented (fat). Eventually simple (i.e. single-table) queries, or simple (i.e. single-row, primary-key based) DML statements are executed towards the database (thin). The controller then determines the next page to be sent back and initiates the View framework to do so. Both Control and View hold no business/data logic, since that would violate the MVC design pattern.


This is "client/server the JEE way". The client-tier runs a fat program (either Applet based, or Java-in-OS based). Control and View are implemented within this fat client-side GUI application. Model is implemented with EJB or some model framework. The client tier is in charge in this alternative (fat): it deals with creating the rich GUI and handling all UI-events (control) on behalf of the user. In this alternative the client delegates the execution of all business logic and data logic processing to EJB code or Model framework code, which is located centrally in the middle tier (fat). The database again only needs to serve simple queries and simple DML-statements (thin) initiated by the fat Model layer. In short, processing for V and C takes place in the Client tier and for M takes place in the Middle tier.


This is "client/server in the early days", pre Oracle7 (when stored pl/sql was not available yet). The application is again either Applet based or Java-in-OS based. The difference with the previous alternative is that all business and data logic is now also implemented within the rich GUI application running on the Client Tier (fat). So not only is this tier fat due to the rich GUI it offers, but also due to all the logic code it hosts. The fat client application will either communicate directly with the Data Tier via JDBC, or go through the Middle Tier via a thinly configured Model framework, i.e. it only offers database connectivity and does no additional logic processing in this case (thin). The database again needs to serve simple queries and simple DML-statements (thin) initiated by the fat Client tier. In short all dimensions M, V and C are located in the Client tier.


This I would call "client/server the right way". This is what client/server architecture evolved into at the end of the nineties. The M has moved from the Middle tier to the Data tier. The GUI is still rich (i.e. responsive) but all business and data logic is now taking place in the Data tier: the Middle tier (thin) only supports database connectivity from the Client to the Data tier. The database is now fully employed (fat) through the use of stored PL/SQL (functions, procedures, packages), triggers, complex views (updateable, possibly with Instead-of triggers). Also the complexity of data retrieval and manipulation is now dealt with by programming sophisticated SQL queries and DML statements. In short View and Control sit in the Client tier, and Model in the Data tier.


This is again an HTML-based application (thin Client tier). All dimensions of MVC are available in the Middle tier (fat). Business and data logic processing takes place not only within the Model framework, but also within the Data tier (fat). The big challenge in this alternative is: how do you divide this logic processing? What part do you implement within the Java realm, and what part within the PL/SQL realm of the DBMS? For this alternative to be manageable, you first need to establish a clear set of rules that prescribe how logic is to be divided across the two tiers.


This one is my personal favorite. It is essentially (the popular) alternative one, only the M has now moved from the Middle tier to the Data tier. All business and data logic processing takes place inside the database (fat). The Model framework is deployed only for database connectivity: no additional logic code runs within this framework (thin). Compared to the previous alternative (five), the division of the business and data logic processing has been made 100% in favour of the Data tier. The database is now fully employed (fat) through the use of stored PL/SQL (functions, procedures, packages), triggers, complex views (updateable, possibly with Instead-of triggers). Also the complexity of data retrieval and manipulation is now dealt with by programming sophisticated SQL queries and DML statements. In short View and Control sit in the Middle tier and Model in the Data tier. This alternative is actually what the Helsinki declaration is all about (and I will revisit this, obviously much more, in later posts).


This is what I call the "Run, run, run! As fast as you can" alternative. Business and data logic is distributed among all tiers: within the fat client, within the Model framework and/or EJB layer, and also within the database. The big challenge introduced in alternative five is even bigger here. Managing application development within this architecture must be a complete nightmare. You'll end up with this approach if you put a PL/SQL wizard, a Java evangelist and an RIA fan all in the same project team, without adding an architect...


Most Common Approaches

As said before, the most popular approach out there is Thin-Fat-Thin the JEE way. The second most popular would be Thin-Fat-Fat, which I see executed in two subtly different ways:
  • Thin-Fat-Fat where JEE is 100% in the lead. A class-hierarchy is modelled in the mid tier to support the business and data logic, and object classes are persisted one-on-one to tables in the data tier (this is what happens in Thin-Fat-Thin too). The relational DBMS ends up managing a hierarchical or network database design. Not a match made in heaven. Not by far.
    And due to "performance considerations" this approach gets opportunistic PL/SQL development on the side, typically (although not exclusively) in the area of batch. Of course the SQL (which loves a sound relational database design) written at this point has to deal with a non-relational database design. This typically leads to ugly SQL, and gives the middle tier programmers ample reasons to be negative about SQL and DBMS's in general.
The second Thin-Fat-Fat approach that I encounter is:
  • Thin-Fat-Fat where both sides are in the lead. By this I mean: database professionals get to create a relational database design that matches their perceived need of the application, and Java professionals get to create a class hierarchy that matches their perceived need of the application. Since the two designs are never the same, this leads to the infamous impedance mismatch. To connect the two designs, the database professionals are asked to develop API's that bridge the gap. The Java professionals then tweak their Model framework such that these API's are called.
This last approach also leads to inefficient use of the DBMS (for reasons other than not having a relational database design), which I would like to illustrate with a small example.

Bear with me please. This is fun.

Let's say we have a GrandParent-Parent-Child three-table design, and the end user needs to be given a report based on Parent rows. Per parent row (P) the result of some aggregation over its child rows (C) needs to be shown, and per parent row a few column values of its grandparent (GP) need to be shown. So in SQL something of the following structure needs to be developed.

SELECT P.[columns]
,[aggr.function](C.[column])
,GP.[columns]
FROM P, C, GP
WHERE [join conditions P and C]
AND [join conditions P and GP]
GROUP BY [P and GP columns]

Now what happens in this approach is that the following API's are requested to be developed in the DBMS:
  1. Give Parent rows
  2. For a given Parent row ID, give corresponding Child rows
  3. For a given Parent row ID, give the Grandparent row
The middle tier will then call the first API. Per Parent row it will call the 2nd API and perform the aggregation in the middle tier, and per Parent row it will call the 3rd API. Inside Java the results of calling these API's are then merged to form the set of rows required to be displayed on the page for the end user: joining rows is done by the middle tier code. Calling all these API's is necessary because the "object instances" of the class hierarchy need to be populated first in the middle tier.

Obviously calling the DBMS many times has a performance hit (which I will revisit in a future post).

How much easier would it be to just create a view using the query-text above, then query the view and display the result set straight onto the page, bypassing the object class hierarchy in the middle tier altogether?
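
For instance, with hypothetical tables GRANDPARENT, PARENT and CHILD (table and column names assumed purely for illustration), the view and its use could look like this:

create or replace view parent_report as
select p.p_id
,      p.p_name
,      gp.gp_name
,      sum(c.amount) as child_total
from   parent p
,      child c
,      grandparent gp
where  c.p_id = p.p_id
and    p.gp_id = gp.gp_id
group by p.p_id, p.p_name, gp.gp_name
/

The UI-layer then issues a single query, leaving the joining and aggregating to the DBMS:

select * from parent_report
/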

To be continued...