Wednesday, June 27, 2007

Considerations for creating a successful SDP

Hello

I’ve recently joined a BPO (Business Process Outsourcing) provider, which gives me an excellent opportunity to put my knowledge of SOA and SaaS into action. So I guess that’s what is going to shape my future posts here.
Well, here is one.
SDPs (Service Delivery Platforms) play much the same role in delivering Software as a Service (SaaS) as operating systems do in desktop application development and deployment. Rather than requiring each application to create the full stack of subsystems it needs to run, an operating system provides an infrastructure through which general-purpose services are reused. The following picture depicts the natural, ongoing process of extracting and generalizing functionality from applications into frameworks, and from there into core platform components, which improves economies of scale.


Figure 1: Borrowed from Microsoft's Architecture Journal

SDPs apply the same concept at various levels. Different factors can be used to gauge the level of success of an SDP. By Level of Success I mean the SDP’s effectiveness and scalability, and its ability to provide highly reusable services – for example through an SDK – that make the implementation and maintenance of SaaS-delivered applications less labor-intensive.
Observation of existing SDP offerings seems to indicate that the two most important factors are:
  • Services breadth: the completeness of the platform; in other words, the support for the different stages of the SaaS-delivered application life cycle (see the following picture)
  • Services depth: the degree of sophistication of the services it provides.
Figure 2: Borrowed from Microsoft's Architecture Journal

Hence there are two aspects that SDP implementers (mostly traditional hosters) and ISVs (Independent Software Vendors) who develop and deploy the services should take into consideration:
  • Different Application Archetypes; business applications can be classified into different archetypes based on their characteristics and requirements. Two examples of these archetypes are OLAP and OLTP. Each of these application families has its own constraints and characteristics: for example, OLTP systems optimize for low latency, whereas latency is not as important for OLAP systems. The infrastructure needed to implement and support each is significantly different.
    The point is that an SDP’s effectiveness depends heavily on the archetype served. The more knowledge of the application an SDP has, the greater its ability to increase the efficiency of running and operating it, and the greater the degree of sharing.
  • Patterns and Frameworks used in design and development; no matter what archetype an application is bound to, it can follow a pattern in design or development, or use a framework to implement some of its services. An example of a common, standard, and widely adopted application infrastructure framework is Microsoft’s Enterprise Library.
    I would say a valuable SDP provides an SDK – including documentation, samples, and even some basic tools – that enables ISVs to develop their software using known patterns and frameworks. This way the SDP has a much greater ability to automate common procedures and offer more advanced operational management capabilities; thus finer-grained tuning, customization, and troubleshooting become available.

    Additionally, hosters can offer a higher range of differentiated services with different monetization schemes. For instance, the hoster knows that all applications will log run-time exceptions. So basic run-time exception logging can be offered in the basic hosting package, and advanced logging, notification and escalation could become a premium offering. Notice that with this approach the ISV application doesn’t change, because all the logic resides on the SDP side.
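    To make the idea concrete, here is a small, purely hypothetical sketch (the class names and tier mechanics are mine, not from any actual SDP) of how a platform could swap logging tiers without the ISV application changing:

    ```python
    # Hypothetical sketch: the SDP owns the exception-handling logic, so the
    # hoster can upgrade a tenant from basic to premium logging purely by
    # swapping the handler on the platform side.

    class BasicExceptionLog:
        """Basic hosting package: just record the exception."""
        def handle(self, app, exc):
            return [f"[{app}] logged: {exc}"]

    class PremiumExceptionLog(BasicExceptionLog):
        """Premium offering: adds notification/escalation on top of logging."""
        def handle(self, app, exc):
            events = super().handle(app, exc)
            events.append(f"[{app}] notified on-call about: {exc}")
            return events

    # The ISV application only reports the exception; which tier applies is
    # platform configuration, so the application code never changes.
    def report(handler, app, exc):
        return handler.handle(app, exc)
    ```

    The point of the sketch is the last function: the call site in the ISV application is identical under either tier.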

Figure 3: Borrowed from Microsoft's Architecture Journal

Monday, June 04, 2007

MSA

You might've heard of the MSA (Master of Science in Analytics) by now.
It’s an intensive 10-month professional graduate degree program created by the Institute for Advanced Analytics at North Carolina State University. It focuses exclusively on the tools, methods, and applications of analytics, and is designed to educate professionals with the sophisticated technical skills necessary to navigate and analyze the masses of data that organizations are collecting.
The objectives of the program are:

  • Provide students with an understanding of basic concepts and methodologies in the analysis of massive data sets
  • Show how these methods are applied to a variety of complex problems facing organizations, using real-world problems
  • Give students a sense of the broader context, such as security, privacy and ethical issues in the use of personal and confidential data
What makes this program unique is its emphasis on real-world, business-focused analytics. Comparing this program with other business-related programs, you'll realize that its aim is to produce talent capable of leveraging world-class business intelligence systems. Typical MBA degrees, for example, include limited instruction in statistics, while advanced degrees in Data Mining don’t address critical contextual issues such as data quality and integration, privacy, security, and enterprise-wide decision making.
This reflects the course designers' belief that “Competing on analytics in corporations, government agencies and educational institutions is becoming a must”.

What has mostly caught my attention (and the reason I made this post) is that this program is about how to apply mathematics to get what you are looking for. Those who, like me, have studied applied mathematics, liked it, and dealt with pure-math professors know what I mean.

If you’d like to participate and be one of the first graduates of this program, you’d better hurry. For more information you can take a look at the program’s website at NCSU.

Saturday, April 21, 2007

What is your project's success driven by?

What would be your answer to the above question? Use-Case-Driven, Test-Driven, Scenario-Driven, or perhaps Feature-Driven?
People often talk about these drivers as the only forces steering projects and shaping project plans. In fact, these mechanisms are used for defining and managing a project's scope. I believe that without an Iterative and Incremental Development (IID) approach you won’t have the means to implement a practical solution that users and stakeholders can take advantage of. My main reason lies in the definition of Stakeholder and stakeholders’ role in the success of a project.

In the book “Use Case Modeling” by Kurt Bittner and Ian Spence, a stakeholder is defined as an individual who is materially affected by the outcome of the system or the project(s) producing the system. Hence, one could draw the conclusion that the best impetus for developing a system is its stakeholders’ feedback and their acceptance of the solution. They are the primary source of requirements, constraints, and risks for the project. They supply the funding and audience for the project, and they decide whether the project is worthwhile.

In my opinion, IID is the right approach to getting stakeholders involved. You need their approval at the end of each iteration to be able to move on to the next one. That empowers you to revise your plan and improve your development process. You can also embrace change requests – whose risks increase as you get closer to the end of the project – from the outset.
Utilizing IID, you can suppress the “Change Prevention Process” anti-pattern, the goal of which is to prevent new requirements from being added to the project or existing requirements from being expanded upon. In other words, sticking to the original plan and requirements and using them as an excuse to stop users from changing them. I would say that's a common issue in projects that avoid IID. Of course, to avoid falling into the "Never Ending Change Requests" pitfall, all fundamental changes need to be detected and addressed before the architecture is solidified.

To summarize, your project has to be Stakeholder Driven.

Friday, March 23, 2007

The lawyers who say NO!

A while ago I was reading an article from Scott Ambler (see his profile at IBM) in Dr. Dobb’s Journal, here. I suggest you read it as well if you haven’t yet.

It points out a hidden impediment that obstructs many software development teams trying to practice IID (Iterative and Incremental Development): the lawyers who say NO!
Before I continue I must remind you that what you read here is my personal opinion, and you might find it incorrect or disagree with it. Well, that’s what the comments are for. I believe that as a reader, it’s your obligation to share your opinion with the writer and other fellow readers.

“Lawyers who say NO!” is a metaphor for the people who approve a project’s funding (which can be the customer itself), those who audit the project’s compliance with what is planned business-wise, and the real lawyers who draw up contracts with customers. The message this article is trying to send is that no matter how much you, as a software specialist, try to refine your process and development methods in order to mitigate risks and achieve goals, there will be non-technical obstacles that can render your efforts worthless – unless everybody involved in the project has the same understanding of what has to be done and how (each with their own point of view and level of detail, of course).

Customers and project investors need to understand that close cooperation with the development team is the key role they can play in the success of a project, and ultimately in getting a better ROI from the product.
Moreover, having the right governance process – one the development methodology can be aligned with, and which can resolve potential clashes between technical and business views – is crucial. Because business success rests partially on the successful delivery of software products, business executives need to understand how their investments in information technology and software development are paying off. They demand visibility and accountability. That's where governance comes in. I say partially because user experience is more than a good-looking, bug-free software application.

So, as you can see, delivering a successful project requires the right collaboration process in place, in addition to a suitable Development Methodology and Governance Process.

Thursday, February 22, 2007

Tightly coupling .NET and Java components utilizing IIOP.NET

Loosely coupled styles of integration such as SOA are common now that business processes are becoming more dynamic and object-based development platforms are expanding. But I don’t think the era of tightly coupled systems is over yet, for reasons such as lower integration cost or the need for stateful distributed objects (a rare but valid requirement).

The objective of this post isn't to debate the pros and cons of such integration methods, but to introduce a .NET library with which .NET components can expose interfaces compliant with CORBA's IDL and thus, simply put, be integrated with any other component that understands IDL (e.g., Java components via RMI-IIOP).
I'm assuming you're not familiar with CORBA, so I'll start with a brief introduction to ORBs and CORBA.

The following picture depicts the basic concept behind an ORB (Object Request Broker). The general purpose of an ORB is to provide a means of communication between different components of a software application. The component providing a service is represented by an object that encapsulates the code.
A client can request a service from an object by sending a request through the ORB.

CORBA is the OMG’s vendor-independent architecture that defines true interoperability by specifying how ORBs from different vendors can communicate.
The following figure shows some of the finer-grained details of the CORBA model.
The shaded section between the application and the ORB infrastructure is the only part standardized by CORBA: the semantics. CORBA doesn't standardize the underlying mechanisms, so the mechanisms selected by different vendors may not be compatible.
To resolve this, an additional standard called the Internet Inter-ORB Protocol (IIOP) specifies how different ORB implementations can interoperate transparently.

IIOP.NET is a .NET Remoting channel based on IIOP's conventions. IIOP.NET acts as an ORB and converts .NET’s CTS (Common Type System) types to CORBA’s types and vice versa, making .NET objects accessible to Java components that obtain CORBA capabilities via RMI-IIOP (RMI over IIOP).

There have been other projects built around this idea, but IIOP.NET seems to be the most stable one.
To see a complete example, please refer to the following URL:
http://www.codeproject.com/csharp/dist_object_system.asp

I'm really keen to know whether anyone has hands-on experience with IIOP.NET. What issues did you face and how did you resolve them?

Saturday, January 20, 2007

Database Row Level Security - Part 3 - SQL Server (and others)

In part 1 of this series Row (Record) Level Security was introduced, and part 2 showed its implementation in an Oracle database.

I'd like to start the last part by answering one of the questions I was asked: "What's the point of doing this much configuration in an application where users never see the database directly? Essentially, application-layer control should suffice." To be pragmatic, I'd say not much, most of the time. But if you are dealing with sensitive information (e.g., medical records, payment cards, social security numbers) you shouldn't assume that all applications connecting to the database are bug-free. Studies show that most attacks exploit a weakness in the user interface (a few web development frameworks were created to address that issue). When that happens, your data is at the mercy of your application and the attacker.

Update in 2010:  For an example, please see this [PDF] report.

SQL Server, unlike Oracle, doesn’t have a built-in mechanism to provide RLS. It has to be done using a technique called Security Labeling. In fact, this technique can be used with any database (e.g., in Oracle when creating actual database users isn't an option).
A security label is a piece of information that describes the sensitivity of a data item (an object such as a table or an individual row). It is a string containing markings from one or more categories. Users (subjects) have permissions described with the same markings, and each subject has a label of their own. The subject’s label is compared against the label on the object to determine access to that object.
For example, the following table fragment (object) has rows annotated with security labels (the Classification column):

ID | File Name         | Classification
1  | Mission in zone 1 | SECRET
2  | Mission in zone 2 | TOP SECRET
3  | Mission in zone 3 | UNCLASSIFIED

And users have different access levels:

Amir: with "SECRET" clearance
Michael: with "UNCLASSIFIED" clearance (no clearance)


Each user's clearance level (expressed as a security label) determines which rows in the table they can access. If Amir issues a SELECT * FROM <tablename> against this table, he should get the following result:



ID | File Name         | Classification
1  | Mission in zone 1 | SECRET
3  | Mission in zone 3 | UNCLASSIFIED


And Michael, with the same query, should see a different result:


ID | File Name         | Classification
3  | Mission in zone 3 | UNCLASSIFIED

Access controls can get more complex than this: a security label may express more than one access criterion. For example, in addition to a classification level, a piece of data may be visible only to members of a certain project team. Assume this group is called PROJECT YUK, and consider the following example.



ID | File Name         | Classification
1  | Mission in zone 1 | SECRET, PROJECT YUK
2  | Mission in zone 2 | TOP SECRET
3  | Mission in zone 3 | UNCLASSIFIED


Let’s modify our user permissions as well.

Amir: with "SECRET, PROJECT YUK" clearance
Michael: with "UNCLASSIFIED" clearance (no clearance)

Charlie: with "TOP SECRET" clearance

We've added Charlie, a user with TOP SECRET clearance. We’ve also augmented Amir's label with the PROJECT YUK marking. Now, if Amir issues SELECT * FROM <tablename>, he should see the following results:


ID | File Name         | Classification
1  | Mission in zone 1 | SECRET, PROJECT YUK
3  | Mission in zone 3 | UNCLASSIFIED

And Charlie will see the following results:


ID | File Name         | Classification
2  | Mission in zone 2 | TOP SECRET
3  | Mission in zone 3 | UNCLASSIFIED

Although Charlie has a TOP SECRET clearance, he does not have the PROJECT YUK marking, so he can't see row 1. Amir's label, however, satisfies both the SECRET and PROJECT YUK markings, so he can see row 1. Row 2, which requires a TOP SECRET clearance, is visible to Charlie only.
This basic approach can be extended to additional markings. In some real-world scenarios, security labels can include several markings from different categories, and the number of possible label combinations can be quite large.

A subject can access an object if the subject label dominates the object label. Given two labels, A and B, label A is said to dominate label B if every category present in label B is satisfied by markings on label A. Whether the markings are satisfied depends on the attributes of each category. For our purposes, each category can be characterized by the following attributes:

  • Domain: The possible markings in the category. 
  • Hierarchical (yes or no): Whether or not the category is hierarchical. Hierarchical categories have an ordering among values. This order determines access. A marking can satisfy any marking at or below its level in the hierarchy. Nonhierarchical categories have no ordering among values. A marking is either present or not present.  
  • Cardinality: How many values from the domain can be applied to the object. 
  • Comparison Rule: Whether the subject must have any or all of the markings applied to the object from this category (referred to as the Any and All comparison rules, respectively). An alternative rule, InverseAll, can be used. This rule requires that each object must have all the markings held by the subject in order to be accessible.
Let me illustrate this with a few examples. Let's assume we have a security labeling scheme with two categories as shown in the following table:

Category       | Domain                                          | Hierarchical | Cardinality          | Comparison Rule
Classification | TOP SECRET, SECRET, CONFIDENTIAL, UNCLASSIFIED  | Yes          | 1..1 (exactly one)   | Any
Compartment    | YUK, ALB, BC                                    | No           | 0..* (0, 1, or many) | All

The question to ask is "does label A dominate label B?".

Example 1


Label A: SECRET, YUK
Label B: SECRET, YUK, ALB

To compare these labels, we must compare the markings in each category.
  • Classification: The SECRET marking in label A satisfies the SECRET marking in label B. 
  • Compartments: The YUK compartment in label A does not satisfy the YUK, ALB compartments in label B (under the All rule, every compartment on B must be present in A).
So, label A does not dominate label B.


Example 2
Label A: TOP SECRET, IRQ, AFG, BN
Label B: CONFIDENTIAL, IRQ, AFG

  • Classification: The TOP SECRET marking in label A satisfies the CONFIDENTIAL marking in label B. 
  • Compartments: The IRQ, AFG, BN compartments in label A satisfy the IRQ, AFG compartments in label B.
So, label A dominates label B.

Example 3
Label A: SECRET, IRQ, BN
Label B: CONFIDENTIAL


  • Classification: The SECRET marking in label A satisfies the CONFIDENTIAL marking in label B. 
  • Compartments: Label B has no compartments listed, which means there are no compartment requirements.
So, label A dominates label B.
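
The dominance check walked through in the three examples above can be sketched in a few lines of Python (a toy model: the category definitions follow the table above, but the label encoding as a (classification, compartments) pair is my own):

```python
# Sketch of the label-dominance rule: a hierarchical Classification category
# (rule: Any, exactly one marking) plus non-hierarchical Compartments
# (rule: All). A label is modeled as (classification, set_of_compartments).

LEVELS = {"UNCLASSIFIED": 0, "CONFIDENTIAL": 1, "SECRET": 2, "TOP SECRET": 3}

def dominates(label_a, label_b):
    """Return True if subject label A dominates object label B."""
    class_a, comps_a = label_a
    class_b, comps_b = label_b
    # Classification is hierarchical: A's level must be at or above B's.
    if LEVELS[class_a] < LEVELS[class_b]:
        return False
    # Compartments use the All rule: A must hold every compartment on B.
    return set(comps_b) <= set(comps_a)

# Example 1: (SECRET, YUK) vs (SECRET, YUK+ALB)      -> does not dominate
# Example 2: (TOP SECRET, IRQ+AFG+BN) vs (CONFIDENTIAL, IRQ+AFG) -> dominates
# Example 3: (SECRET, IRQ+BN) vs (CONFIDENTIAL, none)            -> dominates
```

Running the three examples through this function reproduces the conclusions reached above.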

To implement this, all the necessary logic is built into views. The intent is simply to wrap the base tables in views with nearly identical definitions; users (or applications) then query or update the views instead.

To achieve this:
  1. Create tables to store label categories and markings along with properties of each unique security label combination.
  2. Create tables to store roles and their associated marking values. 
  3. Create views.
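
As a rough sanity check of steps 1–3 (not actual SQL Server code), the same filtering can be mimicked with SQLite, reduced to the hierarchical Classification category only; the table names and numeric level encoding are my own:

```python
# Toy SQLite illustration: a labels table holds the properties of each
# security label, and the dominance check (subject level >= object level)
# becomes the WHERE clause that a real view definition would contain.
# Here a parameterised query stands in for the per-user view.
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE labels (label_id INTEGER PRIMARY KEY, level INTEGER);
    CREATE TABLE missions (id INTEGER, name TEXT, label_id INTEGER);
    -- levels: 0 = UNCLASSIFIED, 2 = SECRET, 3 = TOP SECRET
    INSERT INTO labels VALUES (1, 2), (2, 3), (3, 0);
    INSERT INTO missions VALUES
        (1, 'Mission in zone 1', 1),
        (2, 'Mission in zone 2', 2),
        (3, 'Mission in zone 3', 3);
""")

def visible_missions(clearance_level):
    rows = con.execute("""
        SELECT m.id FROM missions m
        JOIN labels l ON l.label_id = m.label_id
        WHERE l.level <= ?
        ORDER BY m.id""", (clearance_level,))
    return [r[0] for r in rows.fetchall()]

# Amir (SECRET = 2) sees rows 1 and 3; Michael (UNCLASSIFIED = 0) sees row 3.
```

The compartment markings would add a second join against the subject's marking set, but the shape of the solution stays the same: the label comparison lives in the view, not in the application.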

Friday, December 22, 2006

Database Row Level Security - Part 2 - Oracle

In the first part we briefly introduced Row Level Security. In this part I’m going to show you how to implement it in an Oracle 10g database.

Oracle 8i introduced a feature called VPD (Virtual Private Database), also known as Fine-Grained Access Control, which provides powerful row-level security capabilities.
VPD works by modifying SQL commands to present a partial view of the data to users, based on a set of predefined criteria. At runtime, WHERE clauses are appended to queries so that each user sees only the rows they are supposed to see.

Important: implementing row-level security using VPD requires each user to have an actual corresponding user object in the database, not just a record in a table. If that's not feasible, the technique in part 3 might be the way to go.

Here is the list of what we need to implement this:
  1. An Application Context.
  2. A procedure that sets a variable in the above-mentioned context and is called when a user logs in to the database.
  3. A secured procedure that builds the WHERE clause using the variable set in the context.
  4. An RLS policy that puts all of this together and tells the database how to filter queries.
I'll explain it using the doctors-and-patients example from part 1.
Let’s assume the tables look something like the following picture (the relationship could be actually many-to-many, but I simplified it).
Every time a doctor logs in to the system, a procedure is invoked that sets the value of a variable in the Application Context called, say, "logged_in_doc_id" to the doctor's doc_id, queried from the Doctors table.

When a doctor queries the list of patients, another procedure filters the data. In this case, we simply build a WHERE clause, something like: WHERE f_doc_id = logged_in_doc_id. This clause is concatenated onto the SELECT command, making it look like this (assuming the doctor's id in the database is 1): SELECT * FROM PATIENTS WHERE f_doc_id = 1.
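
Before the PL/SQL, here is a quick Python/SQLite simulation of that rewriting idea (a toy sketch with made-up data, not Oracle code): a policy function produces the predicate, and every query is rewritten before execution.

```python
# Toy simulation of VPD-style query rewriting. In real VPD the predicate
# references SYS_CONTEXT; here the session value is passed in directly.
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE patients (pat_id INTEGER, pat_name TEXT, f_doc_id INTEGER)")
con.executemany("INSERT INTO patients VALUES (?, ?, ?)",
                [(1, "Alice", 1), (2, "Bob", 2), (3, "Carol", 1)])

def policy_predicate(logged_in_doc_id):
    # Plays the role of the WHERE-clause-making function.
    return f"f_doc_id = {int(logged_in_doc_id)}"

def vpd_query(sql, logged_in_doc_id):
    # Plays the role of the RLS policy: append the predicate, then execute.
    rewritten = f"{sql} WHERE {policy_predicate(logged_in_doc_id)}"
    return con.execute(rewritten).fetchall()

# The doctor with doc_id = 1 sees only their own patients.
```

The application issues a plain SELECT; the filtering is transparent to it, which is exactly the appeal of VPD.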

Here is the code:

1 & 2 –
As can be seen below, I’ve named the Application Context "NMSPC", and I set the "logged_in_doc_id" variable to the value of the "doc_id" field of the record whose "doc_name" field matches the currently logged-in user’s name (USER).


CREATE OR REPLACE PACKAGE CONTEXT_SETUP AS
   PROCEDURE SET_SESSION_VAR;
END;

CREATE OR REPLACE PACKAGE BODY CONTEXT_SETUP AS
   PROCEDURE SET_SESSION_VAR IS
      t_id NUMBER(5);
   BEGIN
      SELECT doc_id INTO t_id
      FROM DOCTORS
      WHERE UPPER(doc_name) = UPPER(USER);
      DBMS_SESSION.SET_CONTEXT
           ('NMSPC', 'logged_in_doc_id', t_id);
   END;
END;

CREATE CONTEXT NMSPC USING CONTEXT_SETUP;

CREATE OR REPLACE TRIGGER EXEC_CONTEXT_SETUP
AFTER LOGON ON JIM.SCHEMA
BEGIN
   CONTEXT_SETUP.SET_SESSION_VAR();
END;
 
The EXEC_CONTEXT_SETUP trigger fires whenever a user (in this instance "Jim") logs in.

3 – We need a function that creates the predicate, i.e. the WHERE clause:



CREATE OR REPLACE PACKAGE CONTEXT_WHERE_MAKER AS
   FUNCTION WHERE_MAKER(obj_schema VARCHAR2,
                        obj_name VARCHAR2)
   RETURN VARCHAR2;
END;

CREATE OR REPLACE PACKAGE BODY CONTEXT_WHERE_MAKER AS
   FUNCTION WHERE_MAKER(obj_schema VARCHAR2,
                        obj_name VARCHAR2)
   RETURN VARCHAR2 IS
   BEGIN
      IF UPPER(obj_name) = 'PATIENTS' THEN
         RETURN 'f_doc_id = SYS_CONTEXT(''NMSPC'',
                              ''logged_in_doc_id'')';
      END IF;
      RETURN NULL;
   END;
END;
 
SYS_CONTEXT is a function that retrieves the value of a variable in a particular context.

4 – The last thing we need to create is an RLS policy that puts all of this together.

 
BEGIN
   DBMS_RLS.ADD_POLICY
   ('ALL_TABLES', 'Patients',
    'pat_policy',
    'JIM', 'CONTEXT_WHERE_MAKER.WHERE_MAKER',
    'SELECT');
END;
 
Please note that "ALL_TABLES" is the schema under which all our tables are created and to which all users have access.