Tuesday, June 19, 2012

DS ARCHITECTURE COMPONENTS

Data Integrator has a simple distributed architecture that is well documented in the technical manual. To understand the deployment best practices, let's break the components down into a few simple categories:

1) Server components
  - Job server: for running batch jobs. The DI engine is a process started by the Job server when you run a batch job, so it's NOT a deployable component.
  - Access server: for running real time jobs.
  - Web Administrator and Metadata reports: for DI administration and lineage analysis. These are web applications that can be deployed either on the Tomcat web server bundled with DI or on a different web server in your company.

2) Client components
  - Designer: this is the development tool for creating all batch/real time jobs.

3) Repositories:
  - Local repository: this is a database instance that stores all DI built-in metadata and custom code. There should be one per developer in DEV and QA, and only one in Production.
  - Central repository: you don't need 3 central repos. A central repo is nothing but a version control tool for DI. It keeps the version history of all code checked into it. One is good enough, and you may need a second one at most (I always advise my customers to use just one central repo to keep it simple).

So let's start with a simple, basic deployment. You can deploy all components (a standard installation) on each of 3 Windows 2003 servers: DEV, UAT, and PROD. Then you need to install only the client component (Designer) on each developer's workstation. On the DEV database server, you can create one local repo for each developer (for example, DEV_John), and a central repo for source control. On the UAT database server, you can create one local repo for each developer. On the PROD database server, just create one local repo. Of course, after these steps, you need to follow the technical manual to configure the job server and local repositories to talk to each other, along with the rest of the setup.
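
To make the repository layout concrete, here is a rough sketch that lists which repositories would be created on each database server for a given set of developers. It is only an illustration, not part of DI: the function name plan_repositories, the UAT_ prefix, and the names DI_CENTRAL and PROD_LOCAL are all made up for this example (only the DEV_John style of naming comes from the text above).

```python
def plan_repositories(developers, central_name="DI_CENTRAL", prod_name="PROD_LOCAL"):
    """Return a dict mapping each database server to the repos created on it.

    All repository names except the DEV_<name> pattern are hypothetical.
    """
    return {
        # DEV: one local repo per developer, plus the single central repo
        "DEV": [f"DEV_{dev}" for dev in developers] + [central_name],
        # UAT: one local repo per developer (the UAT_ prefix is an assumption)
        "UAT": [f"UAT_{dev}" for dev in developers],
        # PROD: exactly one local repo
        "PROD": [prod_name],
    }


if __name__ == "__main__":
    for server, repos in plan_repositories(["John", "Mary"]).items():
        print(server, repos)
```

With two developers this gives three repos on DEV (two local plus the central one), two on UAT, and one on PROD, which matches the "one central repo is enough" advice above.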
